* [PATCHSET] ptrace,signal: group stop / ptrace updates
@ 2011-01-28 15:08 Tejun Heo
  2011-01-28 15:08 ` [PATCH 01/10] signal: fix SIGCONT notification code Tejun Heo
                   ` (10 more replies)
  0 siblings, 11 replies; 160+ messages in thread
From: Tejun Heo @ 2011-01-28 15:08 UTC (permalink / raw)
  To: roland, oleg, jan.kratochvil, linux-kernel; +Cc: torvalds, akpm

Hello,

This is another posting of the ptrace and group stop interaction
update.  The last posting was split over two patchsets[1][2].  The
changes are:

* Rebased on top of v2.6.38-rc2

* 0010-ptrace-clean-transitions-between-TASK_STOPPED-and-TR.patch
  updated as per Oleg's comments - the TRACED/TRAPPING race condition
  closed and trapping clearing separated out from group_stop clearing.

0001-signal-fix-SIGCONT-notification-code.patch
0002-ptrace-remove-the-extra-wake_up_process-from-ptrace_.patch
0003-signal-remove-superflous-try_to_freeze-loop-in-do_si.patch
0004-ptrace-kill-tracehook_notify_jctl.patch
0005-ptrace-add-why-to-ptrace_stop.patch
0006-signal-fix-premature-completion-of-group-stop-when-i.patch
0007-signal-use-GROUP_STOP_PENDING-to-stop-once-for-a-sin.patch
0008-ptrace-participate-in-group-stop-from-ptrace_stop-if.patch
0009-ptrace-make-do_signal_stop-use-ptrace_stop-if-the-ta.patch
0010-ptrace-clean-transitions-between-TASK_STOPPED-and-TR.patch

0001-0004 are cleanup/bugfix patches.  0005-0010 improve group stop
handling.

Discussions are still on-going on the following points.

1. Removal of spurious wake_up_process() by 0002 may not be safe[3].

2. STOPPED -> RUNNING -> TRACED transition window may be visible to
   tasks which are not the tracer[4].  Tracee always entering TRACED
   also causes one ptrace test case to fail[5].

3. After immediately re-attaching to a detached task in stopped state,
   WNOHANG wait(2) may fail.

This patchset does change ptrace behavior but the changed aspects are
somewhere between awkward and outright buggy before the changes and
the changes are visible only through very convoluted use cases.
Regardless of future directions from here, I don't think the patches
posted in this patchset would be a problem.

The patchset is available in the following git tree.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git ptrace-review

Thank you.

 fs/exec.c                 |    1 
 include/linux/sched.h     |   11 ++
 include/linux/tracehook.h |   27 -----
 kernel/ptrace.c           |   51 ++++++++--
 kernel/signal.c           |  226 ++++++++++++++++++++++++++++++++++------------
 5 files changed, 225 insertions(+), 91 deletions(-)

--
tejun

[1] http://thread.gmane.org/gmane.linux.kernel/1079975
[2] http://thread.gmane.org/gmane.linux.kernel/1080700
[3] http://thread.gmane.org/gmane.linux.kernel/1079975/focus=1088490
[4] http://thread.gmane.org/gmane.linux.kernel/1080700/focus=1088538
[5] http://thread.gmane.org/gmane.linux.kernel/1080700/focus=1093056


* [PATCH 01/10] signal: fix SIGCONT notification code
  2011-01-28 15:08 [PATCHSET] ptrace,signal: group stop / ptrace updates Tejun Heo
@ 2011-01-28 15:08 ` Tejun Heo
  2011-01-28 15:08 ` [PATCH 02/10] ptrace: remove the extra wake_up_process() from ptrace_detach() Tejun Heo
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 160+ messages in thread
From: Tejun Heo @ 2011-01-28 15:08 UTC (permalink / raw)
  To: roland, oleg, jan.kratochvil, linux-kernel; +Cc: torvalds, akpm, Tejun Heo

After a task receives SIGCONT, its parent is notified via SIGCHLD with
its siginfo describing what the notified event is.  If SIGCONT is
received while the child process is stopped, the code should be
CLD_CONTINUED.  If SIGCONT is received while the child process is in
the process of being stopped, it should be CLD_STOPPED.  Which code to
use is determined in prepare_signal() and recorded in signal->flags
using SIGNAL_CLD_CONTINUED|STOP flags.

get_signal_to_deliver() should test these flags and then notify
accordingly; however, it incorrectly tested SIGNAL_STOP_CONTINUED
instead of SIGNAL_CLD_CONTINUED, thus incorrectly notifying
CLD_CONTINUED if the signal is delivered before the task is wait(2)ed
and CLD_STOPPED if the state was fetched already.

Fix it by testing SIGNAL_CLD_CONTINUED.  While at it, uncompress the
?: test into if/else clause for better readability.
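
The effect of testing the wrong flag can be illustrated with a small
userspace model.  This is only a sketch: the flag names follow the
patch, but the numeric values, the MODEL_CLD_* constants and the
why_old()/why_new() helpers are assumptions made for illustration and
are not the kernel's definitions.

```c
#include <assert.h>

/* Illustrative flag values; only the names are taken from the patch. */
#define SIGNAL_STOP_CONTINUED	0x04	/* SIGCONT seen, state not yet fetched by wait */
#define SIGNAL_CLD_STOPPED	0x10	/* parent should be told CLD_STOPPED */
#define SIGNAL_CLD_CONTINUED	0x20	/* parent should be told CLD_CONTINUED */
#define SIGNAL_CLD_MASK		(SIGNAL_CLD_STOPPED | SIGNAL_CLD_CONTINUED)

/* Hypothetical stand-ins for the si_code values reported to the parent. */
enum { MODEL_CLD_STOPPED = 1, MODEL_CLD_CONTINUED = 2 };

/* Old selection: keyed off the wait(2)-related STOP_CONTINUED bit. */
static int why_old(unsigned int flags)
{
	assert(flags & SIGNAL_CLD_MASK);
	return (flags & SIGNAL_STOP_CONTINUED) ? MODEL_CLD_CONTINUED
					       : MODEL_CLD_STOPPED;
}

/* Fixed selection: keyed off the bit recorded for the notification. */
static int why_new(unsigned int flags)
{
	assert(flags & SIGNAL_CLD_MASK);
	if (flags & SIGNAL_CLD_CONTINUED)
		return MODEL_CLD_CONTINUED;
	return MODEL_CLD_STOPPED;
}
```

In this model, a stop interrupted by SIGCONT before the state is
fetched (SIGNAL_CLD_STOPPED | SIGNAL_STOP_CONTINUED) is reported as
continued by why_old() and as stopped by why_new().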

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
---
 kernel/signal.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 4e3cff1..fe004b5 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1853,8 +1853,13 @@ relock:
 	 * the CLD_ si_code into SIGNAL_CLD_MASK bits.
 	 */
 	if (unlikely(signal->flags & SIGNAL_CLD_MASK)) {
-		int why = (signal->flags & SIGNAL_STOP_CONTINUED)
-				? CLD_CONTINUED : CLD_STOPPED;
+		int why;
+
+		if (signal->flags & SIGNAL_CLD_CONTINUED)
+			why = CLD_CONTINUED;
+		else
+			why = CLD_STOPPED;
+
 		signal->flags &= ~SIGNAL_CLD_MASK;
 
 		why = tracehook_notify_jctl(why, CLD_CONTINUED);
-- 
1.7.1



* [PATCH 02/10] ptrace: remove the extra wake_up_process() from ptrace_detach()
  2011-01-28 15:08 [PATCHSET] ptrace,signal: group stop / ptrace updates Tejun Heo
  2011-01-28 15:08 ` [PATCH 01/10] signal: fix SIGCONT notification code Tejun Heo
@ 2011-01-28 15:08 ` Tejun Heo
  2011-01-28 18:46   ` Roland McGrath
  2011-01-28 15:08 ` [PATCH 03/10] signal: remove superflous try_to_freeze() loop in do_signal_stop() Tejun Heo
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-01-28 15:08 UTC (permalink / raw)
  To: roland, oleg, jan.kratochvil, linux-kernel; +Cc: torvalds, akpm, Tejun Heo

This wake_up_process() has a turbulent history.  This is a remnant
from the ancient ptrace implementation and is patently wrong.  Commit
95a3540d (ptrace_detach: the wrong wakeup breaks the ERESTARTxxx
logic) removed it but the change was reverted later by commit edaba2c5
(ptrace: revert "ptrace_detach: the wrong wakeup breaks the
ERESTARTxxx logic ") citing compatibility breakage and general
brokenness of the whole group stop / ptrace interaction.

Digging through the mailing archives, the compatibility breakage
doesn't seem to be critical in the sense that the behavior isn't well
defined or reliable to begin with and it seems to have been agreed to
remove the wakeup with proper cleanup of the whole thing.

Now that the group stop and its interaction with ptrace are cleaned up
and well defined, it's high time to finally kill this silliness.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
---
 kernel/ptrace.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 99bbaa3..a8c9f26 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -312,8 +312,6 @@ int ptrace_detach(struct task_struct *child, unsigned int data)
 	if (child->ptrace) {
 		child->exit_code = data;
 		dead = __ptrace_detach(current, child);
-		if (!child->exit_state)
-			wake_up_process(child);
 	}
 	write_unlock_irq(&tasklist_lock);
 
-- 
1.7.1



* [PATCH 03/10] signal: remove superflous try_to_freeze() loop in do_signal_stop()
  2011-01-28 15:08 [PATCHSET] ptrace,signal: group stop / ptrace updates Tejun Heo
  2011-01-28 15:08 ` [PATCH 01/10] signal: fix SIGCONT notification code Tejun Heo
  2011-01-28 15:08 ` [PATCH 02/10] ptrace: remove the extra wake_up_process() from ptrace_detach() Tejun Heo
@ 2011-01-28 15:08 ` Tejun Heo
  2011-01-28 18:46   ` Roland McGrath
  2011-01-28 15:08 ` [PATCH 04/10] ptrace: kill tracehook_notify_jctl() Tejun Heo
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-01-28 15:08 UTC (permalink / raw)
  To: roland, oleg, jan.kratochvil, linux-kernel; +Cc: torvalds, akpm, Tejun Heo

do_signal_stop() is used only by get_signal_to_deliver() and after a
successful signal stop, it always calls try_to_freeze(), so the
try_to_freeze() loop around schedule() in do_signal_stop() is
superfluous and confusing.  Remove it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
---
 kernel/signal.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index fe004b5..0a6816a 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1781,9 +1781,7 @@ static int do_signal_stop(int signr)
 	}
 
 	/* Now we don't run again until woken by SIGCONT or SIGKILL */
-	do {
-		schedule();
-	} while (try_to_freeze());
+	schedule();
 
 	tracehook_finish_jctl();
 	current->exit_code = 0;
-- 
1.7.1



* [PATCH 04/10] ptrace: kill tracehook_notify_jctl()
  2011-01-28 15:08 [PATCHSET] ptrace,signal: group stop / ptrace updates Tejun Heo
                   ` (2 preceding siblings ...)
  2011-01-28 15:08 ` [PATCH 03/10] signal: remove superflous try_to_freeze() loop in do_signal_stop() Tejun Heo
@ 2011-01-28 15:08 ` Tejun Heo
  2011-01-28 21:09   ` Roland McGrath
  2011-01-28 15:08 ` [PATCH 05/10] ptrace: add @why to ptrace_stop() Tejun Heo
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-01-28 15:08 UTC (permalink / raw)
  To: roland, oleg, jan.kratochvil, linux-kernel; +Cc: torvalds, akpm, Tejun Heo

tracehook_notify_jctl() aids in determining whether and what to report
to the parent when a task is stopped or continued.  The function also
adds an extra requirement that siglock may be released across it,
which is currently unused and quite difficult to satisfy in a
well-defined manner.

As job control and the notifications are about to receive a major
overhaul, remove the tracehook and open code it.  If ever necessary,
let's factor it out after the overhaul.

= Oleg spotted incorrect CLD_CONTINUED/STOPPED selection when ptraced.
  Fixed.
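
As a rough check that the open-coded version in do_signal_stop()
selects the same notification as the removed hook, the two can be
compared in a small userspace model.  This is a sketch under stated
assumptions: MODEL_CLD_STOPPED and the stop_notify_*() helpers are
hypothetical names, the ptraced state is reduced to a boolean, and the
model only covers calls where group_stop_count is at least one (it
ignores the siglock drop window mentioned in the removed comment).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CLD_STOPPED si_code. */
enum { MODEL_CLD_STOPPED = 1 };

/* Model of the removed hook: keep @notify, else report @why if ptraced. */
static int notify_jctl_old(int notify, int why, bool ptraced)
{
	return notify ? notify : (ptraced ? why : 0);
}

/* Old do_signal_stop() selection, given the count before decrementing. */
static int stop_notify_old(int count_before, bool ptraced)
{
	int notify = (count_before == 1) ? MODEL_CLD_STOPPED : 0;

	return notify_jctl_old(notify, MODEL_CLD_STOPPED, ptraced);
}

/* Open-coded selection as introduced by the patch. */
static int stop_notify_new(int count_before, bool ptraced)
{
	int notify = 0;

	if (--count_before == 0)
		notify = MODEL_CLD_STOPPED;	/* last task to stop */
	if (ptraced)
		notify = MODEL_CLD_STOPPED;	/* ptraced tasks always report */
	return notify;
}
```

Under these assumptions both variants notify when the task is the last
one to stop or is being ptraced, and stay silent otherwise.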

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
---
 include/linux/tracehook.h |   27 ---------------------------
 kernel/signal.c           |   34 ++++++++++++++--------------------
 2 files changed, 14 insertions(+), 47 deletions(-)

diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index 3a2e66d..b073f3c 100644
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -469,33 +469,6 @@ static inline int tracehook_get_signal(struct task_struct *task,
 }
 
 /**
- * tracehook_notify_jctl - report about job control stop/continue
- * @notify:		zero, %CLD_STOPPED or %CLD_CONTINUED
- * @why:		%CLD_STOPPED or %CLD_CONTINUED
- *
- * This is called when we might call do_notify_parent_cldstop().
- *
- * @notify is zero if we would not ordinarily send a %SIGCHLD,
- * or is the %CLD_STOPPED or %CLD_CONTINUED .si_code for %SIGCHLD.
- *
- * @why is %CLD_STOPPED when about to stop for job control;
- * we are already in %TASK_STOPPED state, about to call schedule().
- * It might also be that we have just exited (check %PF_EXITING),
- * but need to report that a group-wide stop is complete.
- *
- * @why is %CLD_CONTINUED when waking up after job control stop and
- * ready to make a delayed @notify report.
- *
- * Return the %CLD_* value for %SIGCHLD, or zero to generate no signal.
- *
- * Called with the siglock held.
- */
-static inline int tracehook_notify_jctl(int notify, int why)
-{
-	return notify ?: (current->ptrace & PT_PTRACED) ? why : 0;
-}
-
-/**
  * tracehook_finish_jctl - report about return from job control stop
  *
  * This is called by do_signal_stop() after wakeup.
diff --git a/kernel/signal.c b/kernel/signal.c
index 0a6816a..7dc0ca2 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1727,7 +1727,7 @@ void ptrace_notify(int exit_code)
 static int do_signal_stop(int signr)
 {
 	struct signal_struct *sig = current->signal;
-	int notify;
+	int notify = 0;
 
 	if (!sig->group_stop_count) {
 		struct task_struct *t;
@@ -1759,19 +1759,16 @@ static int do_signal_stop(int signr)
 	 * a group stop in progress and we are the last to stop, report
 	 * to the parent.  When ptraced, every thread reports itself.
 	 */
-	notify = sig->group_stop_count == 1 ? CLD_STOPPED : 0;
-	notify = tracehook_notify_jctl(notify, CLD_STOPPED);
-	/*
-	 * tracehook_notify_jctl() can drop and reacquire siglock, so
-	 * we keep ->group_stop_count != 0 before the call. If SIGCONT
-	 * or SIGKILL comes in between ->group_stop_count == 0.
-	 */
-	if (sig->group_stop_count) {
-		if (!--sig->group_stop_count)
-			sig->flags = SIGNAL_STOP_STOPPED;
-		current->exit_code = sig->group_exit_code;
-		__set_current_state(TASK_STOPPED);
+	if (!--sig->group_stop_count) {
+		sig->flags = SIGNAL_STOP_STOPPED;
+		notify = CLD_STOPPED;
 	}
+	if (task_ptrace(current))
+		notify = CLD_STOPPED;
+
+	current->exit_code = sig->group_exit_code;
+	__set_current_state(TASK_STOPPED);
+
 	spin_unlock_irq(&current->sighand->siglock);
 
 	if (notify) {
@@ -1860,14 +1857,11 @@ relock:
 
 		signal->flags &= ~SIGNAL_CLD_MASK;
 
-		why = tracehook_notify_jctl(why, CLD_CONTINUED);
 		spin_unlock_irq(&sighand->siglock);
 
-		if (why) {
-			read_lock(&tasklist_lock);
-			do_notify_parent_cldstop(current->group_leader, why);
-			read_unlock(&tasklist_lock);
-		}
+		read_lock(&tasklist_lock);
+		do_notify_parent_cldstop(current->group_leader, why);
+		read_unlock(&tasklist_lock);
 		goto relock;
 	}
 
@@ -2034,7 +2028,7 @@ void exit_signals(struct task_struct *tsk)
 	if (unlikely(tsk->signal->group_stop_count) &&
 			!--tsk->signal->group_stop_count) {
 		tsk->signal->flags = SIGNAL_STOP_STOPPED;
-		group_stop = tracehook_notify_jctl(CLD_STOPPED, CLD_STOPPED);
+		group_stop = CLD_STOPPED;
 	}
 out:
 	spin_unlock_irq(&tsk->sighand->siglock);
-- 
1.7.1



* [PATCH 05/10] ptrace: add @why to ptrace_stop()
  2011-01-28 15:08 [PATCHSET] ptrace,signal: group stop / ptrace updates Tejun Heo
                   ` (3 preceding siblings ...)
  2011-01-28 15:08 ` [PATCH 04/10] ptrace: kill tracehook_notify_jctl() Tejun Heo
@ 2011-01-28 15:08 ` Tejun Heo
  2011-01-28 18:48   ` Roland McGrath
  2011-01-28 15:08 ` [PATCH 06/10] signal: fix premature completion of group stop when interfered by ptrace Tejun Heo
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-01-28 15:08 UTC (permalink / raw)
  To: roland, oleg, jan.kratochvil, linux-kernel; +Cc: torvalds, akpm, Tejun Heo

To prepare for cleanup of the interaction between group stop and
ptrace, add @why to ptrace_stop().  Existing users are updated such
that there is no behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/signal.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 7dc0ca2..4569801 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1617,7 +1617,7 @@ static int sigkill_pending(struct task_struct *tsk)
  * If we actually decide not to stop at all because the tracer
  * is gone, we keep current->exit_code unless clear_code.
  */
-static void ptrace_stop(int exit_code, int clear_code, siginfo_t *info)
+static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	__releases(&current->sighand->siglock)
 	__acquires(&current->sighand->siglock)
 {
@@ -1655,7 +1655,7 @@ static void ptrace_stop(int exit_code, int clear_code, siginfo_t *info)
 	spin_unlock_irq(&current->sighand->siglock);
 	read_lock(&tasklist_lock);
 	if (may_ptrace_stop()) {
-		do_notify_parent_cldstop(current, CLD_TRAPPED);
+		do_notify_parent_cldstop(current, why);
 		/*
 		 * Don't want to allow preemption here, because
 		 * sys_ptrace() needs this task to be inactive.
@@ -1714,7 +1714,7 @@ void ptrace_notify(int exit_code)
 
 	/* Let the debugger run.  */
 	spin_lock_irq(&current->sighand->siglock);
-	ptrace_stop(exit_code, 1, &info);
+	ptrace_stop(exit_code, CLD_TRAPPED, 1, &info);
 	spin_unlock_irq(&current->sighand->siglock);
 }
 
@@ -1795,7 +1795,7 @@ static int ptrace_signal(int signr, siginfo_t *info,
 	ptrace_signal_deliver(regs, cookie);
 
 	/* Let the debugger run.  */
-	ptrace_stop(signr, 0, info);
+	ptrace_stop(signr, CLD_TRAPPED, 0, info);
 
 	/* We're back.  Did the debugger cancel the sig?  */
 	signr = current->exit_code;
-- 
1.7.1



* [PATCH 06/10] signal: fix premature completion of group stop when interfered by ptrace
  2011-01-28 15:08 [PATCHSET] ptrace,signal: group stop / ptrace updates Tejun Heo
                   ` (4 preceding siblings ...)
  2011-01-28 15:08 ` [PATCH 05/10] ptrace: add @why to ptrace_stop() Tejun Heo
@ 2011-01-28 15:08 ` Tejun Heo
  2011-01-28 21:22   ` Roland McGrath
  2011-01-28 15:08 ` [PATCH 07/10] signal: use GROUP_STOP_PENDING to stop once for a single group stop Tejun Heo
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-01-28 15:08 UTC (permalink / raw)
  To: roland, oleg, jan.kratochvil, linux-kernel; +Cc: torvalds, akpm, Tejun Heo

task->signal->group_stop_count is used to track the progress of group
stop.  It's initialized to the number of tasks which need to stop for
group stop to finish and each stopping or trapping task decrements it.
However, each task doesn't keep track of whether it decremented the
counter or not and if woken up before the group stop is complete and
stops again, it can decrement the counter multiple times.

Please consider the following example code.

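 #include <pthread.h>
 #include <signal.h>
 #include <sys/ptrace.h>
 #include <sys/types.h>
 #include <sys/wait.h>
 #include <unistd.h>

 static void *worker(void *arg)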
 {
	 while (1) ;
	 return NULL;
 }

 int main(void)
 {
	 pthread_t thread;
	 pid_t pid;
	 int i;

	 pid = fork();
	 if (!pid) {
		 for (i = 0; i < 5; i++)
			 pthread_create(&thread, NULL, worker, NULL);
		 while (1) ;
		 return 0;
	 }

	 ptrace(PTRACE_ATTACH, pid, NULL, NULL);
	 while (1) {
		 waitid(P_PID, pid, NULL, WSTOPPED);
		 ptrace(PTRACE_SINGLESTEP, pid, NULL, (void *)(long)SIGSTOP);
	 }
	 return 0;
 }

The child creates five threads and the parent continuously traps the
first thread and whenever the child gets a signal, SIGSTOP is
delivered.  If an external process sends SIGSTOP to the child, all
other threads in the process should reliably stop.  However, due to
the above bug, the first thread will often end up consuming
group_stop_count multiple times and SIGSTOP often ends up stopping
none or part of the other four threads.

This patch adds a new field task->group_stop which is protected by
siglock and uses GROUP_STOP_CONSUME flag to track which task is still
to consume group_stop_count to fix this bug.

task_clear_group_stop_pending() and task_participate_group_stop() are
added to help manipulate group stop states.  As ptrace_stop() now
also uses task_participate_group_stop(), it will set
SIGNAL_STOP_STOPPED if it completes a group stop.
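
The double consumption and its fix can be sketched in a userspace
model.  This is an illustration only: the model_* types, the
participate_*() helpers and remaining_after_double_stop() are
hypothetical, locking is omitted, and the pre-patch decrements in
ptrace_stop() and do_signal_stop() are folded into one helper.

```c
#include <assert.h>
#include <stdbool.h>

#define GROUP_STOP_CONSUME	(1u << 17)	/* same bit as in the patch */

struct model_signal {
	int group_stop_count;
	bool stopped;			/* stands in for SIGNAL_STOP_STOPPED */
};

struct model_task {
	unsigned int group_stop;
	struct model_signal *signal;
};

/* Pre-patch bookkeeping: every stop or trap decrements the counter. */
static bool participate_old(struct model_task *t)
{
	struct model_signal *sig = t->signal;

	if (sig->group_stop_count > 0 && --sig->group_stop_count == 0) {
		sig->stopped = true;
		return true;
	}
	return false;
}

/* Patched bookkeeping: decrement only if this task still owes a consume. */
static bool participate_new(struct model_task *t)
{
	struct model_signal *sig = t->signal;
	bool consume = t->group_stop & GROUP_STOP_CONSUME;

	t->group_stop &= ~GROUP_STOP_CONSUME;
	if (!consume)
		return false;
	if (sig->group_stop_count > 0)
		sig->group_stop_count--;
	if (sig->group_stop_count == 0) {
		sig->stopped = true;
		return true;
	}
	return false;
}

/* One task stops twice during a group stop of @nr_tasks tasks. */
static int remaining_after_double_stop(int nr_tasks, bool patched)
{
	struct model_signal sig = { .group_stop_count = nr_tasks, .stopped = false };
	struct model_task first = { .group_stop = GROUP_STOP_CONSUME, .signal = &sig };
	int i;

	for (i = 0; i < 2; i++) {
		if (patched)
			participate_new(&first);
		else
			participate_old(&first);
	}
	return sig.group_stop_count;
}
```

With six tasks in the group, the old bookkeeping leaves the counter at
four after the first thread stops twice, so the group stop can complete
before every other thread has stopped; the patched bookkeeping leaves
it at five.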

There still are many issues regarding the interaction between group
stop and ptrace.  Patches to address them will follow.

- Oleg spotted duplicate GROUP_STOP_CONSUME.  Dropped.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
---
 include/linux/sched.h |    6 ++++
 kernel/signal.c       |   62 ++++++++++++++++++++++++++++++++++++++++++------
 2 files changed, 60 insertions(+), 8 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index d747f94..58df43d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1261,6 +1261,7 @@ struct task_struct {
 	int exit_state;
 	int exit_code, exit_signal;
 	int pdeath_signal;  /*  The signal sent when the parent dies  */
+	unsigned int group_stop;	/* GROUP_STOP_*, siglock protected */
 	/* ??? */
 	unsigned int personality;
 	unsigned did_exec:1;
@@ -1772,6 +1773,11 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 #define tsk_used_math(p) ((p)->flags & PF_USED_MATH)
 #define used_math() tsk_used_math(current)
 
+/*
+ * task->group_stop flags
+ */
+#define GROUP_STOP_CONSUME	(1 << 17) /* consume group stop count */
+
 #ifdef CONFIG_PREEMPT_RCU
 
 #define RCU_READ_UNLOCK_BLOCKED (1 << 0) /* blocked while in RCU read-side. */
diff --git a/kernel/signal.c b/kernel/signal.c
index 4569801..238eeba 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -223,6 +223,52 @@ static inline void print_dropped_signal(int sig)
 				current->comm, current->pid, sig);
 }
 
+/**
+ * task_clear_group_stop_pending - clear pending group stop
+ * @task: target task
+ *
+ * Clear group stop states for @task.
+ *
+ * CONTEXT:
+ * Must be called with @task->sighand->siglock held.
+ */
+static void task_clear_group_stop_pending(struct task_struct *task)
+{
+	task->group_stop &= ~GROUP_STOP_CONSUME;
+}
+
+/**
+ * task_participate_group_stop - participate in a group stop
+ * @task: task participating in a group stop
+ *
+ * @task is participating in a group stop.  Group stop states are cleared
+ * and the group stop count is consumed if %GROUP_STOP_CONSUME was set.  If
+ * the consumption completes the group stop, the appropriate %SIGNAL_*
+ * flags are set.
+ *
+ * CONTEXT:
+ * Must be called with @task->sighand->siglock held.
+ */
+static bool task_participate_group_stop(struct task_struct *task)
+{
+	struct signal_struct *sig = task->signal;
+	bool consume = task->group_stop & GROUP_STOP_CONSUME;
+
+	task_clear_group_stop_pending(task);
+
+	if (!consume)
+		return false;
+
+	if (!WARN_ON_ONCE(sig->group_stop_count == 0))
+		sig->group_stop_count--;
+
+	if (!sig->group_stop_count) {
+		sig->flags = SIGNAL_STOP_STOPPED;
+		return true;
+	}
+	return false;
+}
+
 /*
  * allocate a new signal queue record
  * - this may be called without locks if and only if t == current, otherwise an
@@ -1645,7 +1691,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	 * we must participate in the bookkeeping.
 	 */
 	if (current->signal->group_stop_count > 0)
-		--current->signal->group_stop_count;
+		task_participate_group_stop(current);
 
 	current->last_siginfo = info;
 	current->exit_code = exit_code;
@@ -1730,6 +1776,7 @@ static int do_signal_stop(int signr)
 	int notify = 0;
 
 	if (!sig->group_stop_count) {
+		unsigned int gstop = GROUP_STOP_CONSUME;
 		struct task_struct *t;
 
 		if (!likely(sig->flags & SIGNAL_STOP_DEQUEUED) ||
@@ -1741,6 +1788,7 @@ static int do_signal_stop(int signr)
 		 */
 		sig->group_exit_code = signr;
 
+		current->group_stop = gstop;
 		sig->group_stop_count = 1;
 		for (t = next_thread(current); t != current; t = next_thread(t))
 			/*
@@ -1750,19 +1798,19 @@ static int do_signal_stop(int signr)
 			 */
 			if (!(t->flags & PF_EXITING) &&
 			    !task_is_stopped_or_traced(t)) {
+				t->group_stop = gstop;
 				sig->group_stop_count++;
 				signal_wake_up(t, 0);
-			}
+			} else
+				task_clear_group_stop_pending(t);
 	}
 	/*
 	 * If there are no other threads in the group, or if there is
 	 * a group stop in progress and we are the last to stop, report
 	 * to the parent.  When ptraced, every thread reports itself.
 	 */
-	if (!--sig->group_stop_count) {
-		sig->flags = SIGNAL_STOP_STOPPED;
+	if (task_participate_group_stop(current))
 		notify = CLD_STOPPED;
-	}
 	if (task_ptrace(current))
 		notify = CLD_STOPPED;
 
@@ -2026,10 +2074,8 @@ void exit_signals(struct task_struct *tsk)
 			recalc_sigpending_and_wake(t);
 
 	if (unlikely(tsk->signal->group_stop_count) &&
-			!--tsk->signal->group_stop_count) {
-		tsk->signal->flags = SIGNAL_STOP_STOPPED;
+	    task_participate_group_stop(tsk))
 		group_stop = CLD_STOPPED;
-	}
 out:
 	spin_unlock_irq(&tsk->sighand->siglock);
 
-- 
1.7.1



* [PATCH 07/10] signal: use GROUP_STOP_PENDING to stop once for a single group stop
  2011-01-28 15:08 [PATCHSET] ptrace,signal: group stop / ptrace updates Tejun Heo
                   ` (5 preceding siblings ...)
  2011-01-28 15:08 ` [PATCH 06/10] signal: fix premature completion of group stop when interfered by ptrace Tejun Heo
@ 2011-01-28 15:08 ` Tejun Heo
  2011-01-28 15:08 ` [PATCH 08/10] ptrace: participate in group stop from ptrace_stop() iff the task is trapping for " Tejun Heo
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 160+ messages in thread
From: Tejun Heo @ 2011-01-28 15:08 UTC (permalink / raw)
  To: roland, oleg, jan.kratochvil, linux-kernel; +Cc: torvalds, akpm, Tejun Heo

Currently task->signal->group_stop_count is used to decide whether to
stop for group stop.  However, if there is a task in the group which
is taking a long time to stop, other tasks which are continued by
ptrace would repeatedly stop for the same group stop until the group
stop is complete.

Conversely, if a ptraced task is in TASK_TRACED state, the debugger
won't get notified of group stops which is inconsistent compared to
the ptraced task in any other state.

This patch introduces GROUP_STOP_PENDING which tracks whether a task
is yet to stop for the group stop in progress.  The flag is set when a
group stop starts, cleared when the task stops for the first time for
that group stop, and consulted whenever it needs to be determined
whether the task should participate in a group stop.  Note that tasks
in TASK_TRACED now also participate in group stop.

This results in the following behavior changes.

* For a single group stop, a ptracer would see at most one stop
  reported.

* A ptracee in TASK_TRACED now also participates in group stop and the
  tracer would get the notification.  However, as a ptraced task could
  be in TASK_STOPPED state or any ptrace trap could consume group
  stop, the notification may still be missing.  These will be
  addressed with further patches.

* A ptracee may start a group stop while one is still in progress if
  the tracer let it continue with stop signal delivery.  Group stop
  code handles this correctly.
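
The "at most one stop per group stop" behavior can be sketched with a
userspace model.  This is an illustration under stated assumptions:
the model_* types and the helpers start_group_stop(), should_stop_*(),
participate() and stops_seen() are hypothetical, locking is omitted,
and the scenario has a second task that never gets around to stopping.

```c
#include <assert.h>
#include <stdbool.h>

#define GROUP_STOP_PENDING	(1u << 16)	/* same bits as in the patch */
#define GROUP_STOP_CONSUME	(1u << 17)

struct model_group { int group_stop_count; };
struct model_task { unsigned int group_stop; };

/* Begin a group stop: each of the @n tasks is marked and counted. */
static void start_group_stop(struct model_group *g, struct model_task *tasks, int n)
{
	int i;

	g->group_stop_count = n;
	for (i = 0; i < n; i++)
		tasks[i].group_stop = GROUP_STOP_PENDING | GROUP_STOP_CONSUME;
}

/* Pre-patch test: stop as long as the group-wide count is non-zero. */
static bool should_stop_old(const struct model_group *g)
{
	return g->group_stop_count > 0;
}

/* Patched test: stop only if this task has not yet stopped. */
static bool should_stop_new(const struct model_task *t)
{
	return t->group_stop & GROUP_STOP_PENDING;
}

/* Clear the per-task state and consume the count at most once. */
static void participate(struct model_group *g, struct model_task *t)
{
	bool consume = t->group_stop & GROUP_STOP_CONSUME;

	t->group_stop &= ~(GROUP_STOP_PENDING | GROUP_STOP_CONSUME);
	if (consume && g->group_stop_count > 0)
		g->group_stop_count--;
}

/* Task 0 is resumed @resumes times while task 1 is slow to stop. */
static int stops_seen(int resumes, bool patched)
{
	struct model_group g;
	struct model_task tasks[2];
	int stops = 0, i;

	start_group_stop(&g, tasks, 2);
	for (i = 0; i <= resumes; i++) {
		bool stop = patched ? should_stop_new(&tasks[0])
				    : should_stop_old(&g);
		if (!stop)
			continue;
		participate(&g, &tasks[0]);
		stops++;
	}
	return stops;
}
```

In this model, a task resumed three times stops four times under the
count-based test but only once under the GROUP_STOP_PENDING test.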

Oleg:

* Spotted that a task might skip signal check even when its
  GROUP_STOP_PENDING is set.  Fixed by updating
  recalc_sigpending_tsk() to check GROUP_STOP_PENDING instead of
  group_stop_count.

* Pointed out that task->group_stop should be cleared whenever
  task->signal->group_stop_count is cleared.  Fixed accordingly.

* Pointed out the behavior inconsistency between TASK_TRACED and
  RUNNING and the last behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
---
 fs/exec.c             |    1 +
 include/linux/sched.h |    3 +++
 kernel/signal.c       |   36 +++++++++++++++++++++---------------
 3 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index c62efcb..0928da8 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1653,6 +1653,7 @@ static int zap_process(struct task_struct *start, int exit_code)
 
 	t = start;
 	do {
+		task_clear_group_stop_pending(t);
 		if (t != current && t->mm) {
 			sigaddset(&t->pending.signal, SIGKILL);
 			signal_wake_up(t, 1);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 58df43d..0fc6c5e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1776,8 +1776,11 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 /*
  * task->group_stop flags
  */
+#define GROUP_STOP_PENDING	(1 << 16) /* task should stop for group stop */
 #define GROUP_STOP_CONSUME	(1 << 17) /* consume group stop count */
 
+extern void task_clear_group_stop_pending(struct task_struct *task);
+
 #ifdef CONFIG_PREEMPT_RCU
 
 #define RCU_READ_UNLOCK_BLOCKED (1 << 0) /* blocked while in RCU read-side. */
diff --git a/kernel/signal.c b/kernel/signal.c
index 238eeba..7527778 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -124,7 +124,7 @@ static inline int has_pending_signals(sigset_t *signal, sigset_t *blocked)
 
 static int recalc_sigpending_tsk(struct task_struct *t)
 {
-	if (t->signal->group_stop_count > 0 ||
+	if ((t->group_stop & GROUP_STOP_PENDING) ||
 	    PENDING(&t->pending, &t->blocked) ||
 	    PENDING(&t->signal->shared_pending, &t->blocked)) {
 		set_tsk_thread_flag(t, TIF_SIGPENDING);
@@ -232,19 +232,19 @@ static inline void print_dropped_signal(int sig)
  * CONTEXT:
  * Must be called with @task->sighand->siglock held.
  */
-static void task_clear_group_stop_pending(struct task_struct *task)
+void task_clear_group_stop_pending(struct task_struct *task)
 {
-	task->group_stop &= ~GROUP_STOP_CONSUME;
+	task->group_stop &= ~(GROUP_STOP_PENDING | GROUP_STOP_CONSUME);
 }
 
 /**
  * task_participate_group_stop - participate in a group stop
  * @task: task participating in a group stop
  *
- * @task is participating in a group stop.  Group stop states are cleared
- * and the group stop count is consumed if %GROUP_STOP_CONSUME was set.  If
- * the consumption completes the group stop, the appropriate %SIGNAL_*
- * flags are set.
+ * @task has GROUP_STOP_PENDING set and is participating in a group stop.
+ * Group stop states are cleared and the group stop count is consumed if
+ * %GROUP_STOP_CONSUME was set.  If the consumption completes the group
+ * stop, the appropriate %SIGNAL_* flags are set.
  *
  * CONTEXT:
  * Must be called with @task->sighand->siglock held.
@@ -254,6 +254,8 @@ static bool task_participate_group_stop(struct task_struct *task)
 	struct signal_struct *sig = task->signal;
 	bool consume = task->group_stop & GROUP_STOP_CONSUME;
 
+	WARN_ON_ONCE(!(task->group_stop & GROUP_STOP_PENDING));
+
 	task_clear_group_stop_pending(task);
 
 	if (!consume)
@@ -765,6 +767,9 @@ static int prepare_signal(int sig, struct task_struct *p, int from_ancestor_ns)
 		t = p;
 		do {
 			unsigned int state;
+
+			task_clear_group_stop_pending(t);
+
 			rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending);
 			/*
 			 * If there is a handler for SIGCONT, we must make
@@ -906,6 +911,7 @@ static void complete_signal(int sig, struct task_struct *p, int group)
 			signal->group_stop_count = 0;
 			t = p;
 			do {
+				task_clear_group_stop_pending(t);
 				sigaddset(&t->pending.signal, SIGKILL);
 				signal_wake_up(t, 1);
 			} while_each_thread(p, t);
@@ -1139,6 +1145,7 @@ int zap_other_threads(struct task_struct *p)
 	p->signal->group_stop_count = 0;
 
 	while_each_thread(p, t) {
+		task_clear_group_stop_pending(t);
 		count++;
 
 		/* Don't bother with already dead threads */
@@ -1690,7 +1697,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	 * If there is a group stop in progress,
 	 * we must participate in the bookkeeping.
 	 */
-	if (current->signal->group_stop_count > 0)
+	if (current->group_stop & GROUP_STOP_PENDING)
 		task_participate_group_stop(current);
 
 	current->last_siginfo = info;
@@ -1775,8 +1782,8 @@ static int do_signal_stop(int signr)
 	struct signal_struct *sig = current->signal;
 	int notify = 0;
 
-	if (!sig->group_stop_count) {
-		unsigned int gstop = GROUP_STOP_CONSUME;
+	if (!(current->group_stop & GROUP_STOP_PENDING)) {
+		unsigned int gstop = GROUP_STOP_PENDING | GROUP_STOP_CONSUME;
 		struct task_struct *t;
 
 		if (!likely(sig->flags & SIGNAL_STOP_DEQUEUED) ||
@@ -1796,8 +1803,7 @@ static int do_signal_stop(int signr)
 			 * stop is always done with the siglock held,
 			 * so this check has no races.
 			 */
-			if (!(t->flags & PF_EXITING) &&
-			    !task_is_stopped_or_traced(t)) {
+			if (!(t->flags & PF_EXITING) && !task_is_stopped(t)) {
 				t->group_stop = gstop;
 				sig->group_stop_count++;
 				signal_wake_up(t, 0);
@@ -1926,8 +1932,8 @@ relock:
 		if (unlikely(signr != 0))
 			ka = return_ka;
 		else {
-			if (unlikely(signal->group_stop_count > 0) &&
-			    do_signal_stop(0))
+			if (unlikely(current->group_stop &
+				     GROUP_STOP_PENDING) && do_signal_stop(0))
 				goto relock;
 
 			signr = dequeue_signal(current, &current->blocked,
@@ -2073,7 +2079,7 @@ void exit_signals(struct task_struct *tsk)
 		if (!signal_pending(t) && !(t->flags & PF_EXITING))
 			recalc_sigpending_and_wake(t);
 
-	if (unlikely(tsk->signal->group_stop_count) &&
+	if (unlikely(tsk->group_stop & GROUP_STOP_PENDING) &&
 	    task_participate_group_stop(tsk))
 		group_stop = CLD_STOPPED;
 out:
-- 
1.7.1


* [PATCH 08/10] ptrace: participate in group stop from ptrace_stop() iff the task is trapping for group stop
  2011-01-28 15:08 [PATCHSET] ptrace,signal: group stop / ptrace updates Tejun Heo
                   ` (6 preceding siblings ...)
  2011-01-28 15:08 ` [PATCH 07/10] signal: use GROUP_STOP_PENDING to stop once for a single group stop Tejun Heo
@ 2011-01-28 15:08 ` Tejun Heo
  2011-01-28 21:30   ` Roland McGrath
  2011-01-28 15:08 ` [PATCH 09/10] ptrace: make do_signal_stop() use ptrace_stop() if the task is being ptraced Tejun Heo
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-01-28 15:08 UTC (permalink / raw)
  To: roland, oleg, jan.kratochvil, linux-kernel; +Cc: torvalds, akpm, Tejun Heo

Currently, ptrace_stop() unconditionally participates in group stop
bookkeeping.  This is unnecessary and inaccurate.  Make it participate
only if the task is trapping for group stop - i.e. if @why is
CLD_STOPPED.  As ptrace_stop() currently is not used when trapping for
group stop, this is equivalent to disabling group stop participation
from ptrace_stop().

A visible behavior change is increased likelihood of delayed group
stop completion if the thread group contains one or more ptraced
tasks.

This is to prepare for further cleanup of the interaction between
group stop and ptrace.
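
To illustrate the delayed completion described above, here is a small
userland model of the bookkeeping.  It is not kernel code.  The
GROUP_STOP_* values are the ones used in this series; struct model_task,
struct model_group, the WHY_* constants and both functions are made up
for the sketch, and model_participate() only approximates what
task_participate_group_stop() is assumed to do.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values mirror the series; everything else is a userland model. */
#define GROUP_STOP_PENDING	(1 << 16)
#define GROUP_STOP_CONSUME	(1 << 17)

/* Stand-ins for CLD_TRAPPED / CLD_STOPPED (hypothetical names). */
enum trap_why { WHY_TRAPPED, WHY_STOPPED };

struct model_group {
	int group_stop_count;		/* threads still expected to stop */
};

struct model_task {
	unsigned int group_stop;	/* GROUP_STOP_* flags */
	struct model_group *grp;
};

/*
 * Assumed behavior of task_participate_group_stop(): clear PENDING and
 * CONSUME, drop the count if this task was counted, and return true
 * when the count reaches zero, i.e. the group stop is complete.
 */
static bool model_participate(struct model_task *t)
{
	bool consume = t->group_stop & GROUP_STOP_CONSUME;

	assert(t->group_stop & GROUP_STOP_PENDING);
	t->group_stop &= ~(GROUP_STOP_PENDING | GROUP_STOP_CONSUME);

	if (!consume)
		return false;
	if (t->grp->group_stop_count > 0)
		t->grp->group_stop_count--;
	return t->grp->group_stop_count == 0;
}

/* The check ptrace_stop() performs after this patch. */
static bool model_ptrace_stop_bookkeeping(struct model_task *t,
					  enum trap_why why)
{
	if (why == WHY_STOPPED && (t->group_stop & GROUP_STOP_PENDING))
		return model_participate(t);
	return false;
}
```

With two threads counted, a thread that traps for a reason other than
group stop leaves the count at 2 and keeps PENDING set, so in this model
the stop completes only once that thread later stops with CLD_STOPPED.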

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
---
 kernel/signal.c |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 7527778..2e20dad 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1694,10 +1694,13 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	}
 
 	/*
-	 * If there is a group stop in progress,
-	 * we must participate in the bookkeeping.
+	 * If @why is CLD_STOPPED, we're trapping to participate in a group
+	 * stop.  Do the bookkeeping.  Note that if SIGCONT was delivered
+	 * while siglock was released for the arch hook, PENDING could be
+	 * clear now.  We act as if SIGCONT is received after TASK_TRACED
+	 * is entered - ignore it.
 	 */
-	if (current->group_stop & GROUP_STOP_PENDING)
+	if (why == CLD_STOPPED && (current->group_stop & GROUP_STOP_PENDING))
 		task_participate_group_stop(current);
 
 	current->last_siginfo = info;
-- 
1.7.1


* [PATCH 09/10] ptrace: make do_signal_stop() use ptrace_stop() if the task is being ptraced
  2011-01-28 15:08 [PATCHSET] ptrace,signal: group stop / ptrace updates Tejun Heo
                   ` (7 preceding siblings ...)
  2011-01-28 15:08 ` [PATCH 08/10] ptrace: participate in group stop from ptrace_stop() iff the task is trapping for " Tejun Heo
@ 2011-01-28 15:08 ` Tejun Heo
  2011-01-28 15:08 ` [PATCH 10/10] ptrace: clean transitions between TASK_STOPPED and TRACED Tejun Heo
  2011-01-28 16:54 ` [PATCHSET] ptrace,signal: group stop / ptrace updates Ingo Molnar
  10 siblings, 0 replies; 160+ messages in thread
From: Tejun Heo @ 2011-01-28 15:08 UTC (permalink / raw)
  To: roland, oleg, jan.kratochvil, linux-kernel; +Cc: torvalds, akpm, Tejun Heo

A ptraced task would still stop at do_signal_stop() when it's stopping
for stop signals and do_signal_stop() behaves the same whether the
task is ptraced or not.  However, in addition to stopping,
ptrace_stop() also does ptrace specific stuff like calling
architecture specific callbacks, so this behavior makes the code more
fragile and difficult to understand.

This patch makes do_signal_stop() test whether the task is ptraced and
use ptrace_stop() if so.  This renders tracehook_notify_jctl() rather
pointless as the ptrace notification is now handled by ptrace_stop()
regardless of the return value from the tracehook.  It probably is a
good idea to update it.

This doesn't solve the whole problem as tasks already in stopped state
would stay in the regular stop when ptrace attached.  That part will
be handled by the next patch.

Oleg pointed out that this makes a userland-visible change.  Before,
SIGCONT would be able to wake up a task in group stop even if the task
is ptraced if the tracer hasn't issued another ptrace command
afterwards (as the next ptrace command transitions the state into
TASK_TRACED which ignores SIGCONT wakeups).  With this and the next
patch, SIGCONT may race with the transition into TASK_TRACED and is
ignored if the tracee already entered TASK_TRACED.

Another userland visible change of this and the next patch is that the
ptracee's state would now be TASK_TRACED where it used to be
TASK_STOPPED, which is visible via fs/proc.
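
For anyone wanting to observe that difference from userland, a minimal
sketch follows.  It reads the one-letter code from the "State:" line of
/proc/<pid>/status, where a regular stop shows up as 'T' and a tracing
stop as 't'.  proc_task_state() and proc_self_state() are hypothetical
helper names, not part of this series.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return the one-letter state from the "State:" line of
 * /proc/<pid>/status, or '?' if the file can't be read or parsed.
 */
static char proc_task_state(pid_t pid)
{
	char path[64], line[256], state = '?';
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%d/status", (int)pid);
	f = fopen(path, "r");
	if (!f)
		return '?';

	while (fgets(line, sizeof(line), f)) {
		/* Only the "State:" line matches the literal prefix. */
		if (sscanf(line, "State: %c", &state) == 1)
			break;
	}
	fclose(f);
	return state;
}

/* Convenience wrapper: state of the calling process. */
static char proc_self_state(void)
{
	return proc_task_state(getpid());
}
```

A tracer could call proc_task_state() on the tracee after a stop signal
has been delivered and compare the result against 'T' and 't'.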

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
---
 kernel/signal.c |   43 +++++++++++++++++++++++++------------------
 1 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 2e20dad..4404474 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1783,7 +1783,6 @@ void ptrace_notify(int exit_code)
 static int do_signal_stop(int signr)
 {
 	struct signal_struct *sig = current->signal;
-	int notify = 0;
 
 	if (!(current->group_stop & GROUP_STOP_PENDING)) {
 		unsigned int gstop = GROUP_STOP_PENDING | GROUP_STOP_CONSUME;
@@ -1813,29 +1812,37 @@ static int do_signal_stop(int signr)
 			} else
 				task_clear_group_stop_pending(t);
 	}
-	/*
-	 * If there are no other threads in the group, or if there is
-	 * a group stop in progress and we are the last to stop, report
-	 * to the parent.  When ptraced, every thread reports itself.
-	 */
-	if (task_participate_group_stop(current))
-		notify = CLD_STOPPED;
-	if (task_ptrace(current))
-		notify = CLD_STOPPED;
 
 	current->exit_code = sig->group_exit_code;
 	__set_current_state(TASK_STOPPED);
 
-	spin_unlock_irq(&current->sighand->siglock);
+	if (likely(!task_ptrace(current))) {
+		int notify = 0;
 
-	if (notify) {
-		read_lock(&tasklist_lock);
-		do_notify_parent_cldstop(current, notify);
-		read_unlock(&tasklist_lock);
-	}
+		/*
+		 * If there are no other threads in the group, or if there
+		 * is a group stop in progress and we are the last to stop,
+		 * report to the parent.
+		 */
+		if (task_participate_group_stop(current))
+			notify = CLD_STOPPED;
 
-	/* Now we don't run again until woken by SIGCONT or SIGKILL */
-	schedule();
+		spin_unlock_irq(&current->sighand->siglock);
+
+		if (notify) {
+			read_lock(&tasklist_lock);
+			do_notify_parent_cldstop(current, notify);
+			read_unlock(&tasklist_lock);
+		}
+
+		/* Now we don't run again until woken by SIGCONT or SIGKILL */
+		schedule();
+
+		spin_lock_irq(&current->sighand->siglock);
+	} else
+		ptrace_stop(current->exit_code, CLD_STOPPED, 0, NULL);
+
+	spin_unlock_irq(&current->sighand->siglock);
 
 	tracehook_finish_jctl();
 	current->exit_code = 0;
-- 
1.7.1


* [PATCH 10/10] ptrace: clean transitions between TASK_STOPPED and TRACED
  2011-01-28 15:08 [PATCHSET] ptrace,signal: group stop / ptrace updates Tejun Heo
                   ` (8 preceding siblings ...)
  2011-01-28 15:08 ` [PATCH 09/10] ptrace: make do_signal_stop() use ptrace_stop() if the task is being ptraced Tejun Heo
@ 2011-01-28 15:08 ` Tejun Heo
  2011-02-03 20:41   ` [PATCH 0/1] (Was: ptrace: clean transitions between TASK_STOPPED and TRACED) Oleg Nesterov
  2011-01-28 16:54 ` [PATCHSET] ptrace,signal: group stop / ptrace updates Ingo Molnar
  10 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-01-28 15:08 UTC (permalink / raw)
  To: roland, oleg, jan.kratochvil, linux-kernel; +Cc: torvalds, akpm, Tejun Heo

Currently, if the task is STOPPED on ptrace attach, it's left alone
and the state is silently changed to TRACED on the next ptrace call.
The behavior breaks the assumption that arch_ptrace_stop() is called
before any task is poked by ptrace and is ugly in that a task
manipulates the state of another task directly.

With GROUP_STOP_PENDING, the transitions between TASK_STOPPED and
TRACED can be made clean.  The tracer can use the flag to tell the
tracee to retry stop on attach and detach.  On retry, the tracee will
enter the desired state in the correct way.  The lower 16 bits of
task->group_stop is used to remember the signal number which caused
the last group stop.  This is used while retrying for ptrace attach as
the original group_exit_code could have been consumed with wait(2) by
then.

As the real parent may wait(2) and consume the group_exit_code
anytime, the group_exit_code needs to be saved separately so that it
can be used when switching from regular sleep to ptrace_stop().  This
is recorded in the lower 16 bits of task->group_stop.
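
The packing can be sketched in plain C as below.  The GROUP_STOP_*
values are the ones this patch defines; the three helper functions are
hypothetical names used only for illustration, the assert() stands in
for WARN_ON_ONCE(), and group_stop_clear_pending() assumes that clearing
a pending stop drops only PENDING and CONSUME.

```c
#include <assert.h>

#define GROUP_STOP_SIGMASK	0xffff    /* signr of the last group stop */
#define GROUP_STOP_PENDING	(1 << 16) /* task should stop for group stop */
#define GROUP_STOP_CONSUME	(1 << 17) /* consume group stop count */
#define GROUP_STOP_TRAPPING	(1 << 18) /* switching from STOPPED to TRACED */

/*
 * Record @signr for a new group stop.  The old signr is replaced while
 * unrelated bits such as TRAPPING are preserved.
 */
static unsigned int group_stop_record(unsigned int group_stop, int signr)
{
	assert((signr & ~GROUP_STOP_SIGMASK) == 0);

	group_stop &= ~GROUP_STOP_SIGMASK;
	return group_stop | signr | GROUP_STOP_PENDING | GROUP_STOP_CONSUME;
}

/* Signal number to report when the stop is retried via ptrace_stop(). */
static int group_stop_signr(unsigned int group_stop)
{
	return group_stop & GROUP_STOP_SIGMASK;
}

/* Assumed effect of clearing a pending stop: signr stays recorded. */
static unsigned int group_stop_clear_pending(unsigned int group_stop)
{
	return group_stop & ~(GROUP_STOP_PENDING | GROUP_STOP_CONSUME);
}
```

Because the signr survives the clearing of PENDING, a later attach can
set PENDING again and still report the original stop signal even if the
real parent has already consumed group_exit_code.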

If a task is already stopped and there's no intervening SIGCONT, a
ptrace request immediately following a successful PTRACE_ATTACH should
always succeed even if the tracer doesn't wait(2) for attach
completion; however, with this change, the tracee might still be
TASK_RUNNING trying to enter TASK_TRACED which would cause the
following request to fail with -ESRCH.

This intermediate state is hidden from the ptracer by setting
GROUP_STOP_TRAPPING on attach and making ptrace_check_attach() wait
for it to clear on its signal->wait_chldexit.  Completing the
transition or getting killed clears TRAPPING and wakes up the tracer.

Note that the STOPPED -> RUNNING -> TRACED transition is still visible
to other threads which are in the same group as the ptracer and the
reverse transition is visible to all.  Please read the comments for
details.

Oleg:

* Spotted a race condition where a task may retry group stop without
  proper bookkeeping.  Fixed by redoing bookkeeping on retry.

* Spotted that the transition is visible to userland in several
  different ways.  Most are fixed with GROUP_STOP_TRAPPING.  Unhandled
  corner case is documented.

* Pointed out not setting GROUP_STOP_SIGMASK on an already stopped
  task would result in more consistent behavior.

* Pointed out that calling ptrace_stop() from do_signal_stop() in
  TASK_STOPPED can race with group stop start logic and then confuse
  the TRAPPING wait in ptrace_check_attach().  ptrace_stop() is now
  called with TASK_RUNNING.

* Suggested using signal->wait_chldexit instead of bit wait.

* Spotted a race condition between TRACED transition and clearing of
  TRAPPING.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
---
 include/linux/sched.h |    2 +
 kernel/ptrace.c       |   49 +++++++++++++++++++++++++++---
 kernel/signal.c       |   79 +++++++++++++++++++++++++++++++++++++++++--------
 3 files changed, 112 insertions(+), 18 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0fc6c5e..8e541e6 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1776,8 +1776,10 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 /*
  * task->group_stop flags
  */
+#define GROUP_STOP_SIGMASK	0xffff    /* signr of the last group stop */
 #define GROUP_STOP_PENDING	(1 << 16) /* task should stop for group stop */
 #define GROUP_STOP_CONSUME	(1 << 17) /* consume group stop count */
+#define GROUP_STOP_TRAPPING	(1 << 18) /* switching from STOPPED to TRACED */
 
 extern void task_clear_group_stop_pending(struct task_struct *task);
 
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index a8c9f26..31de220 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -49,14 +49,22 @@ static void ptrace_untrace(struct task_struct *child)
 	spin_lock(&child->sighand->siglock);
 	if (task_is_traced(child)) {
 		/*
-		 * If the group stop is completed or in progress,
-		 * this thread was already counted as stopped.
+		 * If group stop is completed or in progress, it should
+		 * participate in the group stop.  Set GROUP_STOP_PENDING
+		 * before kicking it.
+		 *
+		 * This involves TRACED -> RUNNING -> STOPPED transition
+		 * which is similar to but in the opposite direction of
+		 * what happens while attaching to a stopped task.
+		 * However, in this direction, the intermediate RUNNING
+		 * state is not hidden even from the current ptracer and if
+		 * it immediately re-attaches and performs a WNOHANG
+		 * wait(2), it may fail.
 		 */
 		if (child->signal->flags & SIGNAL_STOP_STOPPED ||
 		    child->signal->group_stop_count)
-			__set_task_state(child, TASK_STOPPED);
-		else
-			signal_wake_up(child, 1);
+			child->group_stop |= GROUP_STOP_PENDING;
+		signal_wake_up(child, 1);
 	}
 	spin_unlock(&child->sighand->siglock);
 }
@@ -165,6 +173,7 @@ bool ptrace_may_access(struct task_struct *task, unsigned int mode)
 
 int ptrace_attach(struct task_struct *task)
 {
+	bool wait_trap = false;
 	int retval;
 
 	audit_ptrace(task);
@@ -204,12 +213,42 @@ int ptrace_attach(struct task_struct *task)
 	__ptrace_link(task, current);
 	send_sig_info(SIGSTOP, SEND_SIG_FORCED, task);
 
+	spin_lock(&task->sighand->siglock);
+
+	/*
+	 * If the task is already STOPPED, set GROUP_STOP_PENDING and
+	 * TRAPPING, and kick it so that it transits to TRACED.  TRAPPING
+	 * will be cleared if the child completes the transition or any
+	 * event which clears the group stop states happens.  We'll wait
+	 * for the transition to complete before returning from this
+	 * function.
+	 *
+	 * This hides STOPPED -> RUNNING -> TRACED transition from the
+	 * attaching thread but a different thread in the same group can
+	 * still observe the transient RUNNING state.  IOW, if another
+	 * thread's WNOHANG wait(2) on the stopped tracee races against
+	 * ATTACH, the wait(2) may fail due to the transient RUNNING.
+	 *
+	 * The following task_is_stopped() test is safe as both transitions
+	 * in and out of STOPPED are protected by siglock.
+	 */
+	if (task_is_stopped(task)) {
+		task->group_stop |= GROUP_STOP_PENDING | GROUP_STOP_TRAPPING;
+		signal_wake_up(task, 1);
+		wait_trap = true;
+	}
+
+	spin_unlock(&task->sighand->siglock);
+
 	retval = 0;
 unlock_tasklist:
 	write_unlock_irq(&tasklist_lock);
 unlock_creds:
 	mutex_unlock(&task->signal->cred_guard_mutex);
 out:
+	if (wait_trap)
+		wait_event(current->signal->wait_chldexit,
+			   !(task->group_stop & GROUP_STOP_TRAPPING));
 	return retval;
 }
 
diff --git a/kernel/signal.c b/kernel/signal.c
index 4404474..c146150 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -224,6 +224,26 @@ static inline void print_dropped_signal(int sig)
 }
 
 /**
+ * task_clear_group_stop_trapping - clear group stop trapping bit
+ * @task: target task
+ *
+ * If GROUP_STOP_TRAPPING is set, a ptracer is waiting for us.  Clear it
+ * and wake up the ptracer.  Note that we don't need any further locking.
+ * @task->siglock guarantees that @task->parent points to the ptracer.
+ *
+ * CONTEXT:
+ * Must be called with @task->sighand->siglock held.
+ */
+static void task_clear_group_stop_trapping(struct task_struct *task)
+{
+	if (unlikely(task->group_stop & GROUP_STOP_TRAPPING)) {
+		task->group_stop &= ~GROUP_STOP_TRAPPING;
+		__wake_up_sync(&task->parent->signal->wait_chldexit,
+			       TASK_UNINTERRUPTIBLE, 1);
+	}
+}
+
+/**
  * task_clear_group_stop_pending - clear pending group stop
  * @task: target task
  *
@@ -1706,8 +1726,20 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	current->last_siginfo = info;
 	current->exit_code = exit_code;
 
-	/* Let the debugger run.  */
-	__set_current_state(TASK_TRACED);
+	/*
+	 * TRACED should be visible before TRAPPING is cleared; otherwise,
+	 * the tracer might fail do_wait().
+	 */
+	set_current_state(TASK_TRACED);
+
+	/*
+	 * We're committing to trapping.  Clearing GROUP_STOP_TRAPPING and
+	 * transition to TASK_TRACED should be atomic with respect to
+	 * siglock.  This should be done after the arch hook as siglock is
+	 * released and regrabbed across it.
+	 */
+	task_clear_group_stop_trapping(current);
+
 	spin_unlock_irq(&current->sighand->siglock);
 	read_lock(&tasklist_lock);
 	if (may_ptrace_stop()) {
@@ -1788,6 +1820,9 @@ static int do_signal_stop(int signr)
 		unsigned int gstop = GROUP_STOP_PENDING | GROUP_STOP_CONSUME;
 		struct task_struct *t;
 
+		/* signr will be recorded in task->group_stop for retries */
+		WARN_ON_ONCE(signr & ~GROUP_STOP_SIGMASK);
+
 		if (!likely(sig->flags & SIGNAL_STOP_DEQUEUED) ||
 		    unlikely(signal_group_exit(sig)))
 			return 0;
@@ -1797,25 +1832,27 @@ static int do_signal_stop(int signr)
 		 */
 		sig->group_exit_code = signr;
 
-		current->group_stop = gstop;
+		current->group_stop &= ~GROUP_STOP_SIGMASK;
+		current->group_stop |= signr | gstop;
 		sig->group_stop_count = 1;
-		for (t = next_thread(current); t != current; t = next_thread(t))
+		for (t = next_thread(current); t != current;
+		     t = next_thread(t)) {
+			t->group_stop &= ~GROUP_STOP_SIGMASK;
 			/*
 			 * Setting state to TASK_STOPPED for a group
 			 * stop is always done with the siglock held,
 			 * so this check has no races.
 			 */
 			if (!(t->flags & PF_EXITING) && !task_is_stopped(t)) {
-				t->group_stop = gstop;
+				t->group_stop |= signr | gstop;
 				sig->group_stop_count++;
 				signal_wake_up(t, 0);
-			} else
+			} else {
 				task_clear_group_stop_pending(t);
+			}
+		}
 	}
-
-	current->exit_code = sig->group_exit_code;
-	__set_current_state(TASK_STOPPED);
-
+retry:
 	if (likely(!task_ptrace(current))) {
 		int notify = 0;
 
@@ -1827,6 +1864,7 @@ static int do_signal_stop(int signr)
 		if (task_participate_group_stop(current))
 			notify = CLD_STOPPED;
 
+		__set_current_state(TASK_STOPPED);
 		spin_unlock_irq(&current->sighand->siglock);
 
 		if (notify) {
@@ -1839,13 +1877,28 @@ static int do_signal_stop(int signr)
 		schedule();
 
 		spin_lock_irq(&current->sighand->siglock);
-	} else
-		ptrace_stop(current->exit_code, CLD_STOPPED, 0, NULL);
+	} else {
+		ptrace_stop(current->group_stop & GROUP_STOP_SIGMASK,
+			    CLD_STOPPED, 0, NULL);
+		current->exit_code = 0;
+	}
+
+	/*
+	 * GROUP_STOP_PENDING could be set if another group stop has
+	 * started since being woken up or ptrace wants us to transit
+	 * between TASK_STOPPED and TRACED.  Retry group stop.
+	 */
+	if (current->group_stop & GROUP_STOP_PENDING) {
+		WARN_ON_ONCE(!(current->group_stop & GROUP_STOP_SIGMASK));
+		goto retry;
+	}
+
+	/* PTRACE_ATTACH might have raced with task killing, clear trapping */
+	task_clear_group_stop_trapping(current);
 
 	spin_unlock_irq(&current->sighand->siglock);
 
 	tracehook_finish_jctl();
-	current->exit_code = 0;
 
 	return 1;
 }
-- 
1.7.1


* Re: [PATCHSET] ptrace,signal: group stop / ptrace updates
  2011-01-28 15:08 [PATCHSET] ptrace,signal: group stop / ptrace updates Tejun Heo
                   ` (9 preceding siblings ...)
  2011-01-28 15:08 ` [PATCH 10/10] ptrace: clean transitions between TASK_STOPPED and TRACED Tejun Heo
@ 2011-01-28 16:54 ` Ingo Molnar
  2011-01-28 17:41   ` Thomas Gleixner
  2011-01-28 17:55   ` Oleg Nesterov
  10 siblings, 2 replies; 160+ messages in thread
From: Ingo Molnar @ 2011-01-28 16:54 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, oleg, jan.kratochvil, linux-kernel, torvalds, akpm,
	Peter Zijlstra, Thomas Gleixner, Frédéric Weisbecker


Hi,

I'm hijacking this thread to report a signal handling bug that Linux and Bash have, 
and which has been there at least for 10 years since i started using SMP Linux 
systems ...

It's not easy to reproduce but today i found a reproducer - maybe you guys have an 
idea what's going on.

There are two very simple scripts, one calls the other in an infinite loop:

 $ cat test-signal
 #!/bin/bash

 while true; do ./test-signal2; done

 $ cat test-signal2
 #!/bin/bash

 true

The bug is that occasionally Ctrl-C does not get processed, and that the Ctrl-C is 
'lost'. It can be reproduced here by running ./test-signal several times, and 
Ctrl-C-ing it:

 $ ./test-signal
 ^C
 $ ./test-signal
 ^C^C
 $ ./test-signal
 ^C

See that '^C^C' line? That is where i had to do Ctrl-C twice.

It only fails here about once every 10 times, so it's very rare. I have a stock F14 
system running on that box, with the very latest .38 based kernel.

Any ideas what's going on?

Thanks,

	Ingo

* Re: [PATCHSET] ptrace,signal: group stop / ptrace updates
  2011-01-28 16:54 ` [PATCHSET] ptrace,signal: group stop / ptrace updates Ingo Molnar
@ 2011-01-28 17:41   ` Thomas Gleixner
  2011-01-28 18:04     ` Anca Emanuel
  2011-01-28 17:55   ` Oleg Nesterov
  1 sibling, 1 reply; 160+ messages in thread
From: Thomas Gleixner @ 2011-01-28 17:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Tejun Heo, roland, oleg, jan.kratochvil, linux-kernel, torvalds,
	akpm, Peter Zijlstra, Frédéric Weisbecker

On Fri, 28 Jan 2011, Ingo Molnar wrote:
> See that '^C^C' line? That is where i had to do Ctrl-C twice.
> 
> It only fails here about once every 10 times, so it's very rare. I have a stock F14 
> system running on that box, with the very latest .38 based kernel.

Tripped over the refuse ^C thing today twice. Had to kill a kernel
build from another shell. It just happily displayed ^C and never
stopped. That happens once in a while and I have no idea either how to
debug that.

Thanks,

	tglx

* Re: [PATCHSET] ptrace,signal: group stop / ptrace updates
  2011-01-28 16:54 ` [PATCHSET] ptrace,signal: group stop / ptrace updates Ingo Molnar
  2011-01-28 17:41   ` Thomas Gleixner
@ 2011-01-28 17:55   ` Oleg Nesterov
  2011-01-28 18:29     ` Bash not reacting to Ctrl-C Ingo Molnar
  1 sibling, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-01-28 17:55 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Tejun Heo, roland, jan.kratochvil, linux-kernel, torvalds, akpm,
	Peter Zijlstra, Thomas Gleixner, Frédéric Weisbecker

On 01/28, Ingo Molnar wrote:
>
> The bug is that occasionally Ctrl-C does not get processed, and that the Ctrl-C is
> 'lost'. It can be reproduced here by running ./test-signal several times, and
> Ctrl-C-ing it:
>
>  $ ./test-signal
>  ^C
>  $ ./test-signal
>  ^C^C
>  $ ./test-signal
>  ^C
>
> See that '^C^C' line? That is where i had to do Ctrl-C twice.

Reproduced.

At first glance, /bin/sh should be blamed... Hmm, probably yes,
I even reproduced this under strace, and this is what I see

	wait4(-1, 0x7fff388431c4, 0, NULL) = ? ERESTARTSYS (To be restarted)
	--- SIGINT (Interrupt) @ 0 (0) ---
	rt_sigreturn(0)                         = -1 EINTR (Interrupted system call)
	wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 9706

So, ^C is not lost, but ./test-signal doesn't want to exit.




This is what ./test-signal does when ^C does work:

	wait4(-1, 0x7fff1c283b74, 0, NULL)      = ? ERESTARTSYS (To be restarted)
	--- SIGINT (Interrupt) @ 0 (0) ---
	rt_sigreturn(0)                         = -1 EINTR (Interrupted system call)

OK, it doesn't exit immediately, but then it kills itself:

	wait4(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGINT}], 0, NULL) = 19585
	rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f3c3035b150}, {0x433d30, [], SA_RESTORER, 0x7f3c3035b150}, 8) = 0
	rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f3c3035b150}, {SIG_DFL, [], SA_RESTORER, 0x7f3c3035b150}, 8) = 0
	kill(19584, SIGINT)




Looking into the previous log (when it doesn't exit) again,

	wait4(-1, 0x7fff388431c4, 0, NULL) = ? ERESTARTSYS (To be restarted)
	--- SIGINT (Interrupt) @ 0 (0) ---
	rt_sigreturn(0)                         = -1 EINTR (Interrupted system call)
	wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 9706
	rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
	--- SIGCHLD (Child exited) @ 0 (0) ---
	wait4(-1, 0x7fff38842d24, WNOHANG, NULL) = -1 ECHILD (No child processes)
	rt_sigreturn(0x8)                       = 0
	rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f3cbdbd0150}, {0x433d30, [], SA_RESTORER, 0x7f3cbdbd0150}, 8) = 0
	rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
	rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
	rt_sigprocmask(SIG_BLOCK, [INT CHLD], [], 8) = 0
	clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f3cbe9ab780) = 9707

Perhaps the handler for SIGCHLD clears some internal i_am_going_to_exit flag,
I dunno.

Oleg.


* Re: [PATCHSET] ptrace,signal: group stop / ptrace updates
  2011-01-28 17:41   ` Thomas Gleixner
@ 2011-01-28 18:04     ` Anca Emanuel
  2011-01-28 18:36       ` Mathieu Desnoyers
  0 siblings, 1 reply; 160+ messages in thread
From: Anca Emanuel @ 2011-01-28 18:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Tejun Heo, roland, oleg, jan.kratochvil,
	linux-kernel, torvalds, akpm, Peter Zijlstra,
	Frédéric Weisbecker, Mathieu Desnoyers

On Fri, Jan 28, 2011 at 7:41 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Fri, 28 Jan 2011, Ingo Molnar wrote:
>> See that '^C^C' line? That is where i had to do Ctrl-C twice.
>>
>> It only fails here about once every 10 times, so it's very rare. I have a stock F14
>> system running on that box, with the very latest .38 based kernel.
>
> Tripped over the refuse ^C thing today twice. Had to kill a kernel
> build from another shell. It just happily displayed ^C and never
> stopped. That happens once in a while and I have no idea either how to
> debug that.

cc: Mathieu

Use lttng ?

* Re: Bash not reacting to Ctrl-C
  2011-01-28 17:55   ` Oleg Nesterov
@ 2011-01-28 18:29     ` Ingo Molnar
  2011-02-05 20:34       ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Ingo Molnar @ 2011-01-28 18:29 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Tejun Heo, roland, jan.kratochvil, linux-kernel, torvalds, akpm,
	Peter Zijlstra, Thomas Gleixner, Frédéric Weisbecker


* Oleg Nesterov <oleg@redhat.com> wrote:

> On 01/28, Ingo Molnar wrote:
> >
> > The bug is that occasionally Ctrl-C does not get processed, and that the Ctrl-C is
> > 'lost'. It can be reproduced here by running ./test-signal several times, and
> > Ctrl-C-ing it:
> >
> >  $ ./test-signal
> >  ^C
> >  $ ./test-signal
> >  ^C^C
> >  $ ./test-signal
> >  ^C
> >
> > See that '^C^C' line? That is where i had to do Ctrl-C twice.
> 
> Reproduced.
> 
> At first glance, /bin/sh should be blamed... Hmm, probably yes,
> I even reproduced this under strace, and this is what I see
> 
> 	wait4(-1, 0x7fff388431c4, 0, NULL) = ? ERESTARTSYS (To be restarted)
> 	--- SIGINT (Interrupt) @ 0 (0) ---
> 	rt_sigreturn(0)                         = -1 EINTR (Interrupted system call)
> 	wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 9706
> 
> So, ^C is not lost, but ./test-signal doesn't want to exit.

Might be some Bash assumption or race that works under other OSs but somehow Linux 
does differently. IIRC Bash is being developed on MacOS-X.

But it's happening all the time (with yum for example - but also with make jobs, as 
Thomas has reported it) - this is simply the first time i managed to reproduce it 
with something really simple.

Thanks,

	Ingo

* Re: [PATCHSET] ptrace,signal: group stop / ptrace updates
  2011-01-28 18:04     ` Anca Emanuel
@ 2011-01-28 18:36       ` Mathieu Desnoyers
  0 siblings, 0 replies; 160+ messages in thread
From: Mathieu Desnoyers @ 2011-01-28 18:36 UTC (permalink / raw)
  To: Anca Emanuel
  Cc: Thomas Gleixner, Ingo Molnar, Tejun Heo, roland, oleg,
	jan.kratochvil, linux-kernel, torvalds, akpm, Peter Zijlstra,
	Frédéric Weisbecker

* Anca Emanuel (anca.emanuel@gmail.com) wrote:
> On Fri, Jan 28, 2011 at 7:41 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Fri, 28 Jan 2011, Ingo Molnar wrote:
> >> See that '^C^C' line? That is where i had to do Ctrl-C twice.
> >>
> >> It only fails here about once every 10 times, so it's very rare. I have a stock F14
> >> system running on that box, with the very latest .38 based kernel.
> >
> > Tripped over the refuse ^C thing today twice. Had to kill a kernel
> > build from another shell. It just happily displayed ^C and never
> > stopped. That happens once in a while and I have no idea either how to
> > debug that.
> 
> cc: Mathieu
> 
> Use lttng ?

Heh :) I'm sure Ingo and Thomas have their own tools for that ;) There is
one extra thing in the LTTng instrumentation that can help solve this problem:
the "input subsystem" instrumentation (enabled with ltt-armall -i). You can then
get a dump of:

- Your keystrokes (you can then grep for your ctrl-c input)
- Read/poll/select system calls (so you know when your terminal receives the
  input).
- Signals sent/delivered

Some of these are already instrumented in the mainline kernel, so you might get
away without the input subsystem instrumentation.

If I had to take a wild guess, my bet would be to take a look in the area of
signal delivery, but you never know, maybe it's a userspace bug in the X
terminal emulator code that is causing this weirdness.

Hope this helps,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

* Re: [PATCH 02/10] ptrace: remove the extra wake_up_process() from ptrace_detach()
  2011-01-28 15:08 ` [PATCH 02/10] ptrace: remove the extra wake_up_process() from ptrace_detach() Tejun Heo
@ 2011-01-28 18:46   ` Roland McGrath
  2011-01-31 10:38     ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Roland McGrath @ 2011-01-28 18:46 UTC (permalink / raw)
  To: Tejun Heo; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

NAK.  Let's have the wake_up_state(task, TASK_TRACED|TASK_STOPPED) version
go in first.  That one will be more appropriate for -stable.  Even if we
shortly remove it entirely in mainline, having the more conservative
intermediate state in the git history will make any future problems more
susceptible to bisection.

Thanks,
Roland

* Re: [PATCH 03/10] signal: remove superflous try_to_freeze() loop in do_signal_stop()
  2011-01-28 15:08 ` [PATCH 03/10] signal: remove superflous try_to_freeze() loop in do_signal_stop() Tejun Heo
@ 2011-01-28 18:46   ` Roland McGrath
  0 siblings, 0 replies; 160+ messages in thread
From: Roland McGrath @ 2011-01-28 18:46 UTC (permalink / raw)
  To: Tejun Heo; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

Acked-by: Roland McGrath <roland@redhat.com>

* Re: [PATCH 05/10] ptrace: add @why to ptrace_stop()
  2011-01-28 15:08 ` [PATCH 05/10] ptrace: add @why to ptrace_stop() Tejun Heo
@ 2011-01-28 18:48   ` Roland McGrath
  0 siblings, 0 replies; 160+ messages in thread
From: Roland McGrath @ 2011-01-28 18:48 UTC (permalink / raw)
  To: Tejun Heo; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

> ptrace, add @why to ptrace_stop().  Existing users are updateda such
								^
log typo.

Otherwise,

Acked-by: Roland McGrath <roland@redhat.com>


* Re: [PATCH 04/10] ptrace: kill tracehook_notify_jctl()
  2011-01-28 15:08 ` [PATCH 04/10] ptrace: kill tracehook_notify_jctl() Tejun Heo
@ 2011-01-28 21:09   ` Roland McGrath
  0 siblings, 0 replies; 160+ messages in thread
From: Roland McGrath @ 2011-01-28 21:09 UTC (permalink / raw)
  To: Tejun Heo; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

I'm OK with this one if Oleg is.  I'll leave it up to him.  What's good
about the tracehook functions is that they clearly specify the semantics
and the assumptions in their kerneldoc comments.  It's ok to change things
around and have some fur flying while we clean things up.  But we really
should get back to a situation where the semantics and the logic of the
code are clearly documented, and we don't have implementation details and
ptrace user ABI semantics jumbled together implicitly in code that doesn't
explain what it all means.


Thanks,
Roland


* Re: [PATCH 06/10] signal: fix premature completion of group stop when interfered by ptrace
  2011-01-28 15:08 ` [PATCH 06/10] signal: fix premature completion of group stop when interfered by ptrace Tejun Heo
@ 2011-01-28 21:22   ` Roland McGrath
  2011-01-31 11:00     ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Roland McGrath @ 2011-01-28 21:22 UTC (permalink / raw)
  To: Tejun Heo; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

It feels nasty to add a word to task_struct just for this.
I don't see another place to store such bookkeeping bits.
But I'm not entirely convinced that we'll really need them
when we conclude on fully cleaning up the whole picture.


Thanks,
Roland


* Re: [PATCH 08/10] ptrace: participate in group stop from ptrace_stop() iff the task is trapping for group stop
  2011-01-28 15:08 ` [PATCH 08/10] ptrace: participate in group stop from ptrace_stop() iff the task is trapping for " Tejun Heo
@ 2011-01-28 21:30   ` Roland McGrath
  2011-01-31 11:26     ` Tejun Heo
  2011-02-01 19:36     ` Oleg Nesterov
  0 siblings, 2 replies; 160+ messages in thread
From: Roland McGrath @ 2011-01-28 21:30 UTC (permalink / raw)
  To: Tejun Heo; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

> A visible behavior change is increased likelihood of delayed group
> stop completion if the thread group contains one or more ptraced
> tasks.

I object to that difference in behavior.  As I've said before, I don't
think there should be any option to circumvent a group stop via ptrace.
If you think otherwise, you have a hard road to convince me of it.  

It was always the intent that traced tasks should participate in the
group stop bookkeeping.  I suspect the better line of fixes will be just
to tie up the loose ends of the ptrace interactions so that all ptrace
stops do the correct group_stop_count bookkeeping and notifications.  If
there is a group stop in progress but not yet complete, then PTRACE_CONT
on a thread in the group should probably just move it from TASK_TRACED
to TASK_STOPPED without resuming it at all.  

Once a group stop is complete, then probably the ideal is that
PTRACE_CONT would not resume a thread until a real SIGCONT cleared the
job control stop condition.  But it's likely that existing ptrace users
have expectations contrary to that, so we'll have to see.


Thanks,
Roland


* Re: [PATCH 02/10] ptrace: remove the extra wake_up_process() from ptrace_detach()
  2011-01-28 18:46   ` Roland McGrath
@ 2011-01-31 10:38     ` Tejun Heo
  2011-02-01 10:26       ` [PATCH] ptrace: use safer wake up on ptrace_detach() Tejun Heo
  2011-02-02  5:28       ` [PATCH 02/10] ptrace: remove the extra wake_up_process() from ptrace_detach() Roland McGrath
  0 siblings, 2 replies; 160+ messages in thread
From: Tejun Heo @ 2011-01-31 10:38 UTC (permalink / raw)
  To: Roland McGrath; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

Hello,

On Fri, Jan 28, 2011 at 10:46:01AM -0800, Roland McGrath wrote:
> NAK.  Let's have the wake_up_state(task, TASK_TRACED|TASK_STOPPED) version
> go in first.  That one will be more appropriate for -stable.  Even if we
> shortly remove it entirely in mainline, having the more conservative
> intermediate state in the git history will make any future problems more
> susceptible to bisection.

Yeap, sounds reasonable.  So, you're saying it can go away but should
be converted to safer sleep first, right?  I'll resequence the patches.

Thank you.

-- 
tejun


* Re: [PATCH 06/10] signal: fix premature completion of group stop when interfered by ptrace
  2011-01-28 21:22   ` Roland McGrath
@ 2011-01-31 11:00     ` Tejun Heo
  2011-02-02  5:44       ` Roland McGrath
  0 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-01-31 11:00 UTC (permalink / raw)
  To: Roland McGrath; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

Hello,

On Fri, Jan 28, 2011 at 01:22:57PM -0800, Roland McGrath wrote:
> It feels nasty to add a word to task_struct just for this.
> I don't see another place to store such bookkeeping bits.

I have plans to separate out the ptrace-related stuff from task_struct
so that it's allocated iff it's used, which will save some tens of
bytes on the task struct, so there at least is a plan to compensate
for the added cruft.

> But I'm not entirely convinced that we'll really need them when we
> conclude on fully cleaning up the whole picture.

I really don't know at this point.  I tried to make it share one of
the related fields but there needs to be a per-task field which is
protected by siglock and there currently isn't any, so...

Thanks.

-- 
tejun


* Re: [PATCH 08/10] ptrace: participate in group stop from ptrace_stop() iff the task is trapping for group stop
  2011-01-28 21:30   ` Roland McGrath
@ 2011-01-31 11:26     ` Tejun Heo
  2011-02-02  5:57       ` Roland McGrath
  2011-02-01 19:36     ` Oleg Nesterov
  1 sibling, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-01-31 11:26 UTC (permalink / raw)
  To: Roland McGrath; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

Hello, Roland.

On Fri, Jan 28, 2011 at 01:30:09PM -0800, Roland McGrath wrote:
> > A visible behavior change is increased likelihood of delayed group
> > stop completion if the thread group contains one or more ptraced
> > tasks.
> 
> I object to that difference in behavior.  As I've said before, I don't
> think there should be any option to circumvent a group stop via ptrace.
> If you think otherwise, you have a hard road to convince me of it.  

Yes, I do have some other ideas.  When a ptraced task gets a stop
signal, its delivery is controlled by the tracer, right?  So, right
from the beginning, group stop having consistent precedence over
ptrace breaks.

As long as the interaction is consistent and well defined, I don't
really think it matters one way or the other but given the above
precedence and the current ptracers' expectations, I can't see how we
would be able to prioritize group stop over ptrace at this point.

> It was always the intent that traced tasks should participate in the
> group stop bookkeeping.  I suspect the better line of fixes will be just
> to tie up the loose ends of the ptrace interactions so that all ptrace
> stops do the correct group_stop_count bookkeeping and notifications.

The problem is that those loose ends can't be tied up without breaking
the current users.  PTRACE_CONT has priority over group stop and it's
very visible from userland.  I'm afraid the window of opportunity to
make that behavior the default passed quite some time ago.

What we can do is make the overriding behavior well defined and
logical.  This change makes the interaction at least logical - the
tracer would reliably know that the tracee has participated and
stopped for group stop whereas before the patch the tracer can't tell
whether a task has or hasn't participated in a pending group stop.

Please note that this change in practice only affects when the
completion of group stop is notified.  As our group stop notification
is almost completely broken while ptraced, this can't really break
anything further.

> If there is a group stop in progress but not yet complete, then
> PTRACE_CONT on a thread in the group should probably just move it
> from TASK_TRACED to TASK_STOPPED without resuming it at all.

I really don't think that's an option at this point and can't see how
such behavior could be made consistent given that the ptracer has inherent
superiority over signal delivery.  That would make initiation of group
stop controlled by the ptracer but participation not.  The behavior
becomes essentially nondeterministic depending on which task in the
group gets the signal.  :-(

> Once a group stop is complete, then probably the ideal is that
> PTRACE_CONT would not resume a thread until a real SIGCONT cleared
> the job control stop condition.  But it's likely that existing
> ptrace users have expectations contrary to that, so we'll have to
> see.

So, no, I don't think that would be possible or even desirable.

Thank you.

-- 
tejun


* [PATCH] ptrace: use safer wake up on ptrace_detach()
  2011-01-31 10:38     ` Tejun Heo
@ 2011-02-01 10:26       ` Tejun Heo
  2011-02-01 13:40         ` Oleg Nesterov
                           ` (2 more replies)
  2011-02-02  5:28       ` [PATCH 02/10] ptrace: remove the extra wake_up_process() from ptrace_detach() Roland McGrath
  1 sibling, 3 replies; 160+ messages in thread
From: Tejun Heo @ 2011-02-01 10:26 UTC (permalink / raw)
  To: Roland McGrath; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

The wake_up_process() call in ptrace_detach() is spurious and not
interlocked with the tracee state.  IOW, the tracee could be running
or sleeping in any place in the kernel by the time wake_up_process()
is called.  This can lead to the tracee waking up unexpectedly which
can be dangerous.

The wake_up is spurious and should be removed but for now reduce its
toxicity by only waking up if the tracee is in TRACED or STOPPED
state.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@kernel.org
---
So, something like this.  Roland, Oleg, can you guys please ack this?
Also, how should these ptrace patches be routed?  Shall I set up a git
tree?

Thank you.

 kernel/ptrace.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: work/kernel/ptrace.c
===================================================================
--- work.orig/kernel/ptrace.c
+++ work/kernel/ptrace.c
@@ -313,7 +313,7 @@ int ptrace_detach(struct task_struct *ch
 		child->exit_code = data;
 		dead = __ptrace_detach(current, child);
 		if (!child->exit_state)
-			wake_up_process(child);
+			wake_up_state(child, TASK_TRACED | TASK_STOPPED);
 	}
 	write_unlock_irq(&tasklist_lock);
 


* Re: [PATCH] ptrace: use safer wake up on ptrace_detach()
  2011-02-01 10:26       ` [PATCH] ptrace: use safer wake up on ptrace_detach() Tejun Heo
@ 2011-02-01 13:40         ` Oleg Nesterov
  2011-02-01 15:07           ` Tejun Heo
  2011-02-02  0:27         ` Andrew Morton
  2011-02-02  5:29         ` Roland McGrath
  2 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-01 13:40 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/01, Tejun Heo wrote:
>
> --- work.orig/kernel/ptrace.c
> +++ work/kernel/ptrace.c
> @@ -313,7 +313,7 @@ int ptrace_detach(struct task_struct *ch
>  		child->exit_code = data;
>  		dead = __ptrace_detach(current, child);
>  		if (!child->exit_state)
> -			wake_up_process(child);
> +			wake_up_state(child, TASK_TRACED | TASK_STOPPED);

Well, it can't be TASK_TRACED at this point. And of course this still
contradicts __set_task_state(child, TASK_STOPPED) in ptrace_untrace().
IOW, to me the previous patch makes more sense.

But OK, I understand Roland's concerns. And, at least this change
fixes the bug mentioned in 95a3540d.

Acked-by: Oleg Nesterov <oleg@redhat.com>



* Re: [PATCH] ptrace: use safer wake up on ptrace_detach()
  2011-02-01 13:40         ` Oleg Nesterov
@ 2011-02-01 15:07           ` Tejun Heo
  2011-02-01 19:17             ` Oleg Nesterov
  2011-02-02  5:31             ` Roland McGrath
  0 siblings, 2 replies; 160+ messages in thread
From: Tejun Heo @ 2011-02-01 15:07 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

Hello,

On Tue, Feb 01, 2011 at 02:40:37PM +0100, Oleg Nesterov wrote:
> On 02/01, Tejun Heo wrote:
> >
> > --- work.orig/kernel/ptrace.c
> > +++ work/kernel/ptrace.c
> > @@ -313,7 +313,7 @@ int ptrace_detach(struct task_struct *ch
> >  		child->exit_code = data;
> >  		dead = __ptrace_detach(current, child);
> >  		if (!child->exit_state)
> > -			wake_up_process(child);
> > +			wake_up_state(child, TASK_TRACED | TASK_STOPPED);
> 
> Well, it can't be TASK_TRACED at this point. And of course this still
> contradicts __set_task_state(child, TASK_STOPPED) in ptrace_untrace().
> IOW, to me the previous patch makes more sense.

Yeah, it can't be in TRACED but the whole point of the patch is
avoiding rude wakeups, so as long as it doesn't end up waking random
[un]interruptible sleeps...  It will be removed later anyway.
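
To make the difference concrete, here is a minimal userspace model of a
state-masked wakeup.  It is only a sketch and not kernel code: the
model_* names and the bit values are made up for illustration, and the
real wake_up_state() and wake_up_process() go through the scheduler's
wakeup path.  The only point shown is that a wakeup restricted to
STOPPED/TRACED leaves a task sleeping in any other state alone.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy stand-ins for task states.  The bit values are illustrative only
 * and are NOT the kernel's TASK_* definitions.
 */
enum {
	MODEL_RUNNING         = 0,
	MODEL_INTERRUPTIBLE   = 1 << 0,
	MODEL_UNINTERRUPTIBLE = 1 << 1,
	MODEL_STOPPED         = 1 << 2,
	MODEL_TRACED          = 1 << 3,
};

#define MODEL_ALL_SLEEP_STATES \
	(MODEL_INTERRUPTIBLE | MODEL_UNINTERRUPTIBLE | \
	 MODEL_STOPPED | MODEL_TRACED)

struct model_task {
	unsigned int state;
};

/* Wake the task only if its current state is included in @mask. */
static bool model_wake_up_state(struct model_task *t, unsigned int mask)
{
	if (!(t->state & mask))
		return false;	/* state not in mask: leave the task alone */
	t->state = MODEL_RUNNING;
	return true;
}

/* Unconditional variant: wakes the task from any sleeping state. */
static bool model_wake_up_any(struct model_task *t)
{
	return model_wake_up_state(t, MODEL_ALL_SLEEP_STATES);
}
```

With this model, a task in MODEL_UNINTERRUPTIBLE is untouched by
model_wake_up_state(t, MODEL_TRACED | MODEL_STOPPED) but is woken by
model_wake_up_any(t), which is the "rude wakeup" case.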

> But OK, I understand Roland's concerns. And, at least this change
> fixes the bug mentioned in 95a3540d.
> 
> Acked-by: Oleg Nesterov <oleg@redhat.com>

Oleg, Roland, you guys are the maintainers, so how do you guys want to
route the patches which have been acked?  As it's likely that there
will be quite some number of ptrace patches, it will be better to have
a git tree.

Thank you.

-- 
tejun


* Re: [PATCH] ptrace: use safer wake up on ptrace_detach()
  2011-02-01 15:07           ` Tejun Heo
@ 2011-02-01 19:17             ` Oleg Nesterov
  2011-02-02  5:31             ` Roland McGrath
  1 sibling, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-01 19:17 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/01, Tejun Heo wrote:
>
> Hello,
>
> On Tue, Feb 01, 2011 at 02:40:37PM +0100, Oleg Nesterov wrote:
> > On 02/01, Tejun Heo wrote:
> > >
> > > --- work.orig/kernel/ptrace.c
> > > +++ work/kernel/ptrace.c
> > > @@ -313,7 +313,7 @@ int ptrace_detach(struct task_struct *ch
> > >  		child->exit_code = data;
> > >  		dead = __ptrace_detach(current, child);
> > >  		if (!child->exit_state)
> > > -			wake_up_process(child);
> > > +			wake_up_state(child, TASK_TRACED | TASK_STOPPED);
> >
> > Well, it can't be TASK_TRACED at this point. And of course this still
> > contradicts __set_task_state(child, TASK_STOPPED) in ptrace_untrace().
> > IOW, to me the previous patch makes more sense.
>
> Yeah, it can't be in TRACED but the whole point of the patch is
> avoiding rude wakeups, so as long as it doesn't end up waking random
> [un]interruptible sleeps...  It will be removed later anyway.

Yes, yes, I understand.

> > But OK, I understand Roland's concerns. And, at least this change
> > fixes the bug mentioned in 95a3540d.
> >
> > Acked-by: Oleg Nesterov <oleg@redhat.com>
>
> Oleg, Roland, you guys are the maintainers, so how do you guys want to
> route the patches which have been acked?

Well. I know only one way, send it to akpm ;)

> As it's likely that there
> will be quite some number of ptrace patches, it will be better to have
> a git tree.

Probably yes... but everything in this area goes through -mm so far.

Oleg.



* Re: [PATCH 08/10] ptrace: participate in group stop from ptrace_stop() iff the task is trapping for group stop
  2011-01-28 21:30   ` Roland McGrath
  2011-01-31 11:26     ` Tejun Heo
@ 2011-02-01 19:36     ` Oleg Nesterov
  1 sibling, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-01 19:36 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Tejun Heo, jan.kratochvil, linux-kernel, torvalds, akpm

On 01/28, Roland McGrath wrote:
>
> If
> there is a group stop in progress but not yet complete, then PTRACE_CONT
> on a thread in the group should probably just move it from TASK_TRACED
> to TASK_STOPPED without resuming it at all.
>
> Once a group stop is complete, then probably the ideal is that
> PTRACE_CONT would not resume a thread until a real SIGCONT cleared the
> job control stop condition.

Well. I agree. I even tried to mention this before.

Or, if PTRACE_CONT resumes the tracee it should clear SIGNAL_STOP_STOPPED.

> But it's likely that existing ptrace users
> have expectations contrary to that,

Sure ;)




Btw. I just realized that I didn't reply explicitly to this series.
Because I thought we already discussed everything and I have nothing
to add. I think that everything is technically correct. I mean, I
believe the patches do exactly what the changelog says.

Oleg.



* Re: [PATCH] ptrace: use safer wake up on ptrace_detach()
  2011-02-01 10:26       ` [PATCH] ptrace: use safer wake up on ptrace_detach() Tejun Heo
  2011-02-01 13:40         ` Oleg Nesterov
@ 2011-02-02  0:27         ` Andrew Morton
  2011-02-02  5:33           ` Roland McGrath
  2011-02-02  5:29         ` Roland McGrath
  2 siblings, 1 reply; 160+ messages in thread
From: Andrew Morton @ 2011-02-02  0:27 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Roland McGrath, oleg, jan.kratochvil, linux-kernel, torvalds

On Tue, 1 Feb 2011 11:26:18 +0100
Tejun Heo <tj@kernel.org> wrote:

> The wake_up_process() call in ptrace_detach() is spurious and not
> interlocked with the tracee state.  IOW, the tracee could be running
> or sleeping in any place in the kernel by the time wake_up_process()
> is called.  This can lead to the tracee waking up unexpectedly which
> can be dangerous.
> 
> The wake_up is spurious and should be removed but for now reduce its
> toxicity by only waking up if the tracee is in TRACED or STOPPED
> state.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Roland McGrath <roland@redhat.com>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: stable@kernel.org

Am unable to work out why you tagged it for backporting.  It fixes some
observed bug?  Perhaps a regression?

> Index: work/kernel/ptrace.c
> ===================================================================
> --- work.orig/kernel/ptrace.c
> +++ work/kernel/ptrace.c
> @@ -313,7 +313,7 @@ int ptrace_detach(struct task_struct *ch
>  		child->exit_code = data;
>  		dead = __ptrace_detach(current, child);
>  		if (!child->exit_state)
> -			wake_up_process(child);
> +			wake_up_state(child, TASK_TRACED | TASK_STOPPED);
>  	}
>  	write_unlock_irq(&tasklist_lock);
>  


* Re: [PATCH 02/10] ptrace: remove the extra wake_up_process() from ptrace_detach()
  2011-01-31 10:38     ` Tejun Heo
  2011-02-01 10:26       ` [PATCH] ptrace: use safer wake up on ptrace_detach() Tejun Heo
@ 2011-02-02  5:28       ` Roland McGrath
  1 sibling, 0 replies; 160+ messages in thread
From: Roland McGrath @ 2011-02-02  5:28 UTC (permalink / raw)
  To: Tejun Heo; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

> Yeap, sounds reasonable.  So, you're saying it can go away but should
> be converted to safer sleep first, right?  I'll resequence the patches.

Correct.


Thanks,
Roland


* Re: [PATCH] ptrace: use safer wake up on ptrace_detach()
  2011-02-01 10:26       ` [PATCH] ptrace: use safer wake up on ptrace_detach() Tejun Heo
  2011-02-01 13:40         ` Oleg Nesterov
  2011-02-02  0:27         ` Andrew Morton
@ 2011-02-02  5:29         ` Roland McGrath
  2 siblings, 0 replies; 160+ messages in thread
From: Roland McGrath @ 2011-02-02  5:29 UTC (permalink / raw)
  To: Tejun Heo; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

Acked-by: Roland McGrath <roland@redhat.com>


* Re: [PATCH] ptrace: use safer wake up on ptrace_detach()
  2011-02-01 15:07           ` Tejun Heo
  2011-02-01 19:17             ` Oleg Nesterov
@ 2011-02-02  5:31             ` Roland McGrath
  2011-02-02 10:35               ` Tejun Heo
  1 sibling, 1 reply; 160+ messages in thread
From: Roland McGrath @ 2011-02-02  5:31 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Oleg Nesterov, jan.kratochvil, linux-kernel, torvalds, akpm

> Oleg, Roland, you guys are the maintainers, so how do you guys want to
> route the patches which have been acked?  As it's likely that there
> will be quite some number of ptrace patches, it will be better to have
> a git tree.

In practice we are never trusted enough with changes in this area to get
them just pulled from a tree of ours.  They have to go through akpm and
Linus for individual approval before they get in.  So I see no point in me
or Oleg maintaining a git tree that won't get merged directly anyway.


Thanks,
Roland


* Re: [PATCH] ptrace: use safer wake up on ptrace_detach()
  2011-02-02  0:27         ` Andrew Morton
@ 2011-02-02  5:33           ` Roland McGrath
  2011-02-02  5:38             ` Andrew Morton
  2011-02-02 21:40             ` Oleg Nesterov
  0 siblings, 2 replies; 160+ messages in thread
From: Roland McGrath @ 2011-02-02  5:33 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Tejun Heo, oleg, jan.kratochvil, linux-kernel, torvalds

> Am unable to work out why you tagged it for backporting.  It fixes some
> observed bug?  Perhaps a regression?

No observed bug, only theoretical ones (AFAIK, never even a ginned-up
synthetic test case has been demonstrated).  Certainly not a regression,
since it has been this (wrong) way since the dawn of time.  I don't think
this first change is dangerous for -stable, but I have seen no positive
rationale for pushing it there.


Thanks,
Roland


* Re: [PATCH] ptrace: use safer wake up on ptrace_detach()
  2011-02-02  5:33           ` Roland McGrath
@ 2011-02-02  5:38             ` Andrew Morton
  2011-02-02 10:34               ` Tejun Heo
  2011-02-02 21:40             ` Oleg Nesterov
  1 sibling, 1 reply; 160+ messages in thread
From: Andrew Morton @ 2011-02-02  5:38 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Tejun Heo, oleg, jan.kratochvil, linux-kernel, torvalds

On Tue,  1 Feb 2011 21:33:31 -0800 (PST) Roland McGrath <roland@redhat.com> wrote:

> > Am unable to work out why you tagged it for backporting.  It fixes some
> > observed bug?  Perhaps a regression?
> 
> No observed bug, only theoretical ones (AFAIK, never even a ginned-up
> synthetic test case has been demonstrated).  Certainly not a regression,
> since it has been this (wrong) way since the dawn of time.  I don't think
> this first change is dangerous for -stable, but I have seen no positive
> rationale for pushing it there.
> 

OK, thanks.  I shall destabilize my copy of this patch.


* Re: [PATCH 06/10] signal: fix premature completion of group stop when interfered by ptrace
  2011-01-31 11:00     ` Tejun Heo
@ 2011-02-02  5:44       ` Roland McGrath
  2011-02-02 10:56         ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Roland McGrath @ 2011-02-02  5:44 UTC (permalink / raw)
  To: Tejun Heo; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

> I have plans to separate out the ptrace-related stuff from task_struct
> so that it's allocated iff it's used, which will save some tens of
> bytes on the task struct, so there at least is a plan to compensate
> for the added cruft.

My, that sounds familiar.  Oleg and I have pursued that before, though
not in exactly the same context.  It sure gets complicated quickly in
the corners.  But we would still like to see it get done one way or
another.

> > But I'm not entirely convinced that we'll really need them when we
> > conclude on fully cleaning up the whole picture.
> 
> I really don't know at this point.  I tried to make it share one of
> the related fields but there needs to be a per-task field which is
> protected by siglock and there currently isn't any, so...

My point was that I am not yet convinced we'll need any new bookkeeping of
that sort once we fully work out the group-stop interactions.  We're still
discussing that in the other threads.  So we'll see where that goes.


Thanks,
Roland


* Re: [PATCH 08/10] ptrace: participate in group stop from ptrace_stop() iff the task is trapping for group stop
  2011-01-31 11:26     ` Tejun Heo
@ 2011-02-02  5:57       ` Roland McGrath
  2011-02-02 10:53         ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Roland McGrath @ 2011-02-02  5:57 UTC (permalink / raw)
  To: Tejun Heo; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

> Yes, I do have some other ideas.  When a ptraced task gets a stop
> signal, its delivery is controlled by the tracer, right?  So, right
> from the beginning, group stop having consistent precedence over
> ptrace breaks.

I would not agree with that way of describing it.  The tracer controls
what signals actually get delivered.  A group stop doesn't begin until
a stop signal is actually delivered (or a core dump is started).  What
I'm talking about when I say that group stop should have precedence is
that once a group stop is initiated by one thread, then it should
happen fully and immediately to all threads.  When a tracer intercepts
a signal and does not specify it for delivery, then no signal is
delivered and so no thread initiates a group stop at all.  That's an
entirely different kettle of fish.
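
The "intercepts a signal and does not specify it for delivery" case can
be sketched from userspace as below.  This is an illustration only, not
something taken from the series: suppress_stop_signal() and
CHILD_EXIT_CODE are made-up names, and it assumes a Linux system where
an unprivileged process may trace its own child.  The child raises
SIGSTOP, the tracer sees the stop, and PTRACE_CONT with a data argument
of 0 resumes the child with no signal delivered, so no group stop is
ever initiated.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_EXIT_CODE 7	/* arbitrary marker value for this demo */

/*
 * Fork a traced child that raises SIGSTOP, intercept the signal in the
 * tracer and resume the child without delivering it.  Returns the
 * child's exit code, or -1 on any unexpected outcome.
 */
static int suppress_stop_signal(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: ask to be traced by the parent, then stop. */
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
			_exit(1);
		raise(SIGSTOP);
		_exit(CHILD_EXIT_CODE);
	}

	/* Tracer: the child reports the signal instead of acting on it. */
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (!WIFSTOPPED(status))
		return -1;	/* child already exited or was killed */
	if (WSTOPSIG(status) != SIGSTOP) {
		kill(pid, SIGKILL);
		waitpid(pid, NULL, 0);
		return -1;
	}

	/* data == NULL (0): resume without delivering any signal. */
	if (ptrace(PTRACE_CONT, pid, NULL, NULL) < 0) {
		kill(pid, SIGKILL);
		waitpid(pid, NULL, 0);
		return -1;
	}

	/* The child was never stopped by job control; it just exits. */
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Passing SIGSTOP instead of NULL as the last PTRACE_CONT argument would
be the other case: the signal is then actually delivered and the stop
begins.  How the tracer and the group stop interact from that point on
is exactly what this thread is arguing about, so the sketch stops short
of it.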

> As long as the interaction is consistent and well defined, I don't
> really think it matters one way or the other but given the above
> precedence and the current ptracers' expectations, I can't see how we
> would be able to prioritize group stop over ptrace at this point.

I don't follow this logic at all.  Perhaps it is predicated on the
wrong idea of what "initiating a group stop" means, and you would not
say this given my paragraph above.  If you mean something different
than the misperception that a group stop exists at all before a signal
is truly delivered, then I don't understand what you mean.

> The problem is that those loose ends can't be tied up without breaking
> the current users.  PTRACE_CONT has priority over group stop and it's
> very visible from userland.  I'm afraid the window of opportunity to
> make that behavior the default passed quite some time ago.

I am not convinced of that at all, though I certainly wouldn't say now
that it's a settled question yet.  The userland expectations are
pretty convoluted too.  I suspect that what you are calling the
userland expectations for PTRACE_CONT to ignore the state of a pending
group stop are in fact just fallout of userland confusion about what's
a job control stop and what's a ptrace stop.

> > If there is a group stop in progress but not yet complete, then
> > PTRACE_CONT on a thread in the group should probably just move it
> > from TASK_TRACED to TASK_STOPPED without resuming it at all.
> 
> I really don't think that's an option at this point and can't see how
> such behavior could be made consistent given that the ptracer has inherent
> superiority over signal delivery.  That would make initiation of group
> stop controlled by the ptracer but participation not.  The behavior
> becomes essentially nondeterministic depending on which task in the
> group gets the signal.  :-(

I don't think I follow your logic and I certainly don't agree with
your conclusion.  It's simple: if a stop signal is actually delivered,
then every thread stops.  If one thread is traced and another is not,
then the tracer can prevent a signal from being delivered to one
thread and cannot prevent it from being delivered to another.

> > Once a group stop is complete, then probably the ideal is that
> > PTRACE_CONT would not resume a thread until a real SIGCONT cleared
> > the job control stop condition.  But it's likely that existing
> > ptrace users have expectations contrary to that, so we'll have to
> > see.
> 
> So, no, I don't think that would be possible or even desirable.

Of course it's possible.  We have to work out the entire web of
assumptions and ramifications to be sure what's desirable given
practical compatibility constraints.  I think it's just plain obvious
that it's the desirable thing in the abstract--that tracing stops and
job control stops should be independent functions.


Thanks,
Roland


* Re: [PATCH] ptrace: use safer wake up on ptrace_detach()
  2011-02-02  5:38             ` Andrew Morton
@ 2011-02-02 10:34               ` Tejun Heo
  2011-02-02 19:33                 ` Andrew Morton
  0 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-02 10:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Roland McGrath, oleg, jan.kratochvil, linux-kernel, torvalds

Hello,

On Tue, Feb 01, 2011 at 09:38:28PM -0800, Andrew Morton wrote:
> On Tue,  1 Feb 2011 21:33:31 -0800 (PST) Roland McGrath <roland@redhat.com> wrote:
> 
> > > Am unable to work out why you tagged it for backporting.  It fixes some
> > > observed bug?  Perhaps a regression?
> > 
> > No observed bug, only theoretical ones (AFAIK, never even a ginned-up
> > synthetic test case has been demonstrated).  Certainly not a regression,
> > since it has been this (wrong) way since the dawn of time.  I don't think
> > this first change is dangerous for -stable, but I have seen no positive
> > rationale for pushing it there.
> > 
> 
> OK, thanks.  I shall destabilize my copy of this patch.

It can be used as an attack vector.  I don't think it will take too
much effort to come up with an attack which triggers an oops somewhere.
Most sleeps are wrapped in condition test loops and should be safe but
we have quite a number of places where sleep and wakeup conditions are
expected to be interlocked.  Although the window of opportunity is
tiny, ptrace can be used by non-privileged users and with some loading
the window can definitely be extended and exploited.
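
As a userspace analogy for the "condition test loop" case (a sketch
with POSIX threads, not kernel code; struct waiter, sleeper() and
run_spurious_wakeup_demo() are hypothetical names), a sleeper that
re-checks its condition after every wakeup is immune to a stray wakeup.
The dangerous sites are the ones that sleep once and assume that being
woken implies the condition holds.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct waiter {
	pthread_mutex_t lock;
	pthread_cond_t  cond;
	bool ready;	/* the real wakeup condition */
	bool saw_ready;	/* what the sleeper observed when it resumed */
};

/* Sleeper: re-tests the condition after every wakeup, stray or not. */
static void *sleeper(void *arg)
{
	struct waiter *w = arg;

	pthread_mutex_lock(&w->lock);
	while (!w->ready)
		pthread_cond_wait(&w->cond, &w->lock);
	w->saw_ready = w->ready;
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

/*
 * Issue one stray wakeup (condition not set) followed by a genuine
 * one.  Returns what the sleeper observed when it finally proceeded.
 */
static bool run_spurious_wakeup_demo(void)
{
	struct waiter w = { .ready = false, .saw_ready = false };
	pthread_t tid;

	pthread_mutex_init(&w.lock, NULL);
	pthread_cond_init(&w.cond, NULL);

	if (pthread_create(&tid, NULL, sleeper, &w) != 0)
		return false;

	/* Stray wakeup: nobody has set ->ready yet. */
	pthread_mutex_lock(&w.lock);
	pthread_cond_broadcast(&w.cond);
	pthread_mutex_unlock(&w.lock);

	/* Genuine wakeup: set the condition, then wake. */
	pthread_mutex_lock(&w.lock);
	w.ready = true;
	pthread_cond_broadcast(&w.cond);
	pthread_mutex_unlock(&w.lock);

	pthread_join(tid, NULL);
	pthread_cond_destroy(&w.cond);
	pthread_mutex_destroy(&w.lock);
	return w.saw_ready;
}
```

If the while loop in sleeper() were a plain if, the stray broadcast
could let the sleeper continue with ->ready still false, which is the
userspace counterpart of the interlocked sleep/wakeup sites mentioned
above.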

The chance of this problem being visible under normal usage is
extremely low so no wonder there is no related bug report but that is
very different from being safe against targeted attacks.

As the likelihood of causing user-noticeable breakage is very low, I
think we'd better push it through -stable.

Thanks.

-- 
tejun


* Re: [PATCH] ptrace: use safer wake up on ptrace_detach()
  2011-02-02  5:31             ` Roland McGrath
@ 2011-02-02 10:35               ` Tejun Heo
  0 siblings, 0 replies; 160+ messages in thread
From: Tejun Heo @ 2011-02-02 10:35 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Oleg Nesterov, jan.kratochvil, linux-kernel, torvalds, akpm

Hello,

On Tue, Feb 01, 2011 at 09:31:44PM -0800, Roland McGrath wrote:
> > Oleg, Roland, you guys are the maintainers, so how do you guys want to
> > route the patches which have been acked?  As it's likely that there
> > will be quite some number of ptrace patches, it will be better to have
> > a git tree.
> 
> In practice we are never trusted enough with changes in this area to get
> them just pulled from a tree of ours.  They have to go through akpm and
> Linus for individual approval before they get in.  So I see no point in me
> or Oleg maintaining a git tree that won't get merged directly anyway.

Hmm... okay.  Alright, through -mm then.

Thanks.

-- 
tejun


* Re: [PATCH 08/10] ptrace: participate in group stop from ptrace_stop() iff the task is trapping for group stop
  2011-02-02  5:57       ` Roland McGrath
@ 2011-02-02 10:53         ` Tejun Heo
  2011-02-03 10:02           ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-02 10:53 UTC (permalink / raw)
  To: Roland McGrath; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

Hello,

On Tue, Feb 01, 2011 at 09:57:11PM -0800, Roland McGrath wrote:
> > The problem is that those loose ends can't be tied up without breaking
> > the current users.  PTRACE_CONT has priority over group stop and it's
> > very visible from userland.  I'm afraid the window of opportunity to
> > make that behavior the default had already passed quite some time ago.
> 
> I am not convinced of that at all, though I certainly wouldn't say now
> that it's a settled question yet.  The userland expectations are
> pretty convoluted too.  I suspect that what you are calling the
> userland expectations for PTRACE_CONT to ignore the state of a pending
> group stop are in fact just fallout of userland confusion about what's
> a job control stop and what's a ptrace stop.

Okay, but that doesn't change the fact that currently PTRACE_CONT is
superior to group stop.  You're suggesting to reverse the priority.  I
can't see how that would be possible.  I'm confused because that's a
_MUCH_ bigger change than the ones suggested here or in the whole
series.  If we can change that, it's almost free for all, which I
don't mind, but isn't consistent with how things have progressed up to
now.

We can introduce a new interface which behaves that way but I don't
think we can reverse the priority without breaking a lot of userland
visible behaviors.

> I don't think I follow your logic and I certainly don't agree with
> your conclusion.  It's simple: if a stop signal is actually delivered,
> then every thread stops.  If one thread is traced and another is not,
> then the tracer can prevent a signal from being delivered to one
> thread and cannot prevent it from being delivered to another.

I suppose it depends on POV and I can see your point too.

> > > Once a group stop is complete, then probably the ideal is that
> > > PTRACE_CONT would not resume a thread until a real SIGCONT cleared
> > > the job control stop condition.  But it's likely that existing
> > > ptrace users have expectations contrary to that, so we'll have to
> > > see.
> > 
> > So, no, I don't think that would be possible or even desirable.
> 
> Of course it's possible.  We have to work out the entire web of
> assumptions and ramifications to be sure what's desirable given
> practical compatibility constraints.  I think it's just plain obvious
> that it's the desirable thing in the abstract--that tracing stops and
> job control stops should be independent functions.

You'll have to show a lot more details on how to untangle "the entire
web of assumptions and ramifications" so that we can reverse the
current priority without causing noticeable behavior differences to
the existing users.  I don't agree that what you suggest is "plain
obviously desirable" but that's almost beside the point.  I really
can't see how that would be realistically possible at this point.

So, can you please elaborate how to reverse that and at the same time
avoid breaking existing assumptions?

Thanks.

-- 
tejun


* Re: [PATCH 06/10] signal: fix premature completion of group stop when interfered by ptrace
  2011-02-02  5:44       ` Roland McGrath
@ 2011-02-02 10:56         ` Tejun Heo
  0 siblings, 0 replies; 160+ messages in thread
From: Tejun Heo @ 2011-02-02 10:56 UTC (permalink / raw)
  To: Roland McGrath; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

Hello,

On Tue, Feb 01, 2011 at 09:44:05PM -0800, Roland McGrath wrote:
> > I have plans to separate out ptrace related stuff from task_struct so
> > that it's allocated iff it's used, which will save some tens of
> > bytes on the task struct, so there at least is a plan to compensate
> > for the added cruft.
> 
> My, that sounds familiar.  Oleg and I have pursued that before, though
> not in exactly the same context.  It sure gets complicated quickly in
> the corners.  But we would still like to see it get done one way or
> another.

Yeap, I have mostly working code.  It was necessary to allow nesting,
so...

> > > But I'm not entirely convinced that we'll really need them when we
> > > conclude on fully cleaning up the whole picture.
> > 
> > I really don't know at this point.  I tried to make it share one of
> > the related fields but there needs to be a per-task field which is
> > protected by siglock and there currently isn't any, so...
> 
> My point was that I am not yet convinced we'll need any new bookkeeping of
> that sort once we fully work out the group-stop interactions.  We're still
> discussing that in the other threads.  So we'll see where that goes.

Yeah, sure.

Thanks.

-- 
tejun


* Re: [PATCH] ptrace: use safer wake up on ptrace_detach()
  2011-02-02 10:34               ` Tejun Heo
@ 2011-02-02 19:33                 ` Andrew Morton
  2011-02-02 20:01                   ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Andrew Morton @ 2011-02-02 19:33 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Roland McGrath, oleg, jan.kratochvil, linux-kernel, torvalds

On Wed, 2 Feb 2011 11:34:02 +0100
Tejun Heo <tj@kernel.org> wrote:

> Hello,
> 
> On Tue, Feb 01, 2011 at 09:38:28PM -0800, Andrew Morton wrote:
> > On Tue,  1 Feb 2011 21:33:31 -0800 (PST) Roland McGrath <roland@redhat.com> wrote:
> > 
> > > > Am unable to work out why you tagged it for backporting.  It fixes some
> > > > observed bug?  Perhaps a regression?
> > > 
> > > No observed bug, only theoretical ones (AFAIK, never even a ginned-up
> > > synthetic test case has been demonstrated).  Certainly not a regression,
> > > since it has been this (wrong) way since the dawn of time.  I don't think
> > > this first change is dangerous for -stable, but I have seen no positive
> > > rationale for pushing it there.
> > > 
> > 
> > OK, thanks.  I shall destabilize my copy of this patch.
> 
> It can be used as an attack vector.  I don't think it will take too
> much effort to come up with an attack which triggers oops somewhere.
> Most sleeps are wrapped in condition test loops and should be safe but
> we have quite a number of places where sleep and wakeup conditions are
> expected to be interlocked.  Although the window of opportunity is
> tiny, ptrace can be used by non-privileged users and with some loading
> the window can definitely be extended and exploited.
> 
> The chance of this problem being visible under normal usage is
> extremely low so no wonder there is no related bug report but that is
> very different from being safe against targeted attacks.
> 
> As the likelihood of causing user noticeable breakage is very low, I
> think we better push it through -stable.
> 

We're learning some lessons about changelogging here :(

I added this:

: This bug can possibly be used as an attack vector.  I don't think
: it will take too much effort to come up with an attack which triggers
: oops somewhere.  Most sleeps are wrapped in condition test loops and
: should be safe but we have quite a number of places where sleep and
: wakeup conditions are expected to be interlocked.  Although the
: window of opportunity is tiny, ptrace can be used by non-privileged
: users and with some loading the window can definitely be extended and
: exploited.

to the changelog so the -stable maintainers can understand why we're
sending this patch at them.



* Re: [PATCH] ptrace: use safer wake up on ptrace_detach()
  2011-02-02 19:33                 ` Andrew Morton
@ 2011-02-02 20:01                   ` Tejun Heo
  0 siblings, 0 replies; 160+ messages in thread
From: Tejun Heo @ 2011-02-02 20:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Roland McGrath, oleg, jan.kratochvil, linux-kernel, torvalds

Hello,

On Wed, Feb 2, 2011 at 8:33 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> We're learning some lessons about changelogging here :(
>
> I added this:

I was aiming for being a bit less explicit, which may or may not have
been a good idea.  Anyways, sounds good to me.

Thanks.

-- 
tejun


* Re: [PATCH] ptrace: use safer wake up on ptrace_detach()
  2011-02-02  5:33           ` Roland McGrath
  2011-02-02  5:38             ` Andrew Morton
@ 2011-02-02 21:40             ` Oleg Nesterov
  1 sibling, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-02 21:40 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Andrew Morton, Tejun Heo, jan.kratochvil, linux-kernel, torvalds

On 02/01, Roland McGrath wrote:
>
> > Am unable to work out why you tagged it for backporting.  It fixes some
> > observed bug?  Perhaps a regression?
>
> No observed bug,

Well, this bug triggered the problem in practice, this was the reason
for 95a3540d.

> I don't think
> this first change is dangerous for -stable, but I have seen no positive
> rationale for pushing it there.

Agreed, I don't really think -stable needs this change, but otoh
it is obviously safe.

Oleg.



* Re: [PATCH 08/10] ptrace: participate in group stop from ptrace_stop() iff the task is trapping for group stop
  2011-02-02 10:53         ` Tejun Heo
@ 2011-02-03 10:02           ` Tejun Heo
  0 siblings, 0 replies; 160+ messages in thread
From: Tejun Heo @ 2011-02-03 10:02 UTC (permalink / raw)
  To: Roland McGrath; +Cc: oleg, jan.kratochvil, linux-kernel, torvalds, akpm

Hey, again.

On Wed, Feb 02, 2011 at 11:53:03AM +0100, Tejun Heo wrote:
> You'll have to show a lot more details on how to untangle "the entire
> web of assumptions and ramifications" so that we can reverse the
> current priority without causing noticeable behavior differences to
> the existing users.  I don't agree that what you suggest is "plain
> obviously desirable" but that's almost beside the point.  I really
> can't see how that would be realistically possible at this point.
> 
> So, can you please elaborate how to reverse that and at the same time
> avoid breaking existing assumptions?

I've been thinking about this, and the more I think about it, the less I
see how we can flip this priority without adversely affecting
the expected userland behavior.

For example, if a gdb traced task is instructed to participate in a
group stop and then hits a ptrace trap, it would have to participate
in the group stop as it enters ptrace trap, right?  gdb's wait(2)
would complete indicating ptrace trap.  After the user tells the task
to continue, the task shouldn't resume until SIGCONT is received;
however, at this point, there's no way for gdb to tell what's going on
with the tracee.  It'll wait with its input prompt disabled until
someone sends SIGCONT from outside to the tracee.  Please note that ^C
wouldn't do anything either in this state.

So, as I wrote before, I don't think we can change this at this point.
If ptrace behaved like that from the beginning, gdb would have behaved
differently and worked around those cases but that hasn't been the
case and I don't see any way we can flip the priority and get away
with it without impacting the current users significantly.

If I'm missing something, please enlighten me.

Thank you.

-- 
tejun


* [PATCH 0/1] (Was: ptrace: clean transitions between TASK_STOPPED and TRACED)
  2011-01-28 15:08 ` [PATCH 10/10] ptrace: clean transitions between TASK_STOPPED and TRACED Tejun Heo
@ 2011-02-03 20:41   ` Oleg Nesterov
  2011-02-03 20:41     ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-03 20:41 UTC (permalink / raw)
  To: Tejun Heo, Roland McGrath; +Cc: jan.kratochvil, linux-kernel, torvalds, akpm

On 01/28, Tejun Heo wrote:
>
> Currently, if the task is STOPPED on ptrace attach, it's left alone
> and the state is silently changed to TRACED on the next ptrace call.

In particular, this means that it is very hard to attach correctly.
Any ptrace request needs STOPPED/TRACED tracee, but apart from wait()
there is no simple way to verify this and many applications (imho
rightly) assume that wait() after PTRACE_ATTACH should work.

While this patch should fix this old known problem, it needs more
discussion.

Tejun, Roland, perhaps it makes sense to fix this particular problem
first? Personally I do not know, but as Jan reports this is really
annoying for gdb at least.

Oleg.



* [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-03 20:41   ` [PATCH 0/1] (Was: ptrace: clean transitions between TASK_STOPPED and TRACED) Oleg Nesterov
@ 2011-02-03 20:41     ` Oleg Nesterov
  2011-02-03 21:36       ` Roland McGrath
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-03 20:41 UTC (permalink / raw)
  To: Tejun Heo, Roland McGrath; +Cc: jan.kratochvil, linux-kernel, torvalds, akpm

Test-case:

	#include <stdio.h>
	#include <unistd.h>
	#include <signal.h>
	#include <sys/ptrace.h>
	#include <sys/wait.h>
	#include <assert.h>

	int main(void)
	{
		int pid, stat;

		pid = fork();
		if (!pid) {
			kill(getpid(), SIGSTOP);
			assert(0);
		}

		if (!fork()) {
			assert(ptrace(PTRACE_ATTACH, pid, 0,0) == 0);
			/* eat ->exit_code */
			assert(wait(&stat) == pid);
			/* exit instead of DETACH to avoid the extra wakeup */
			assert(stat == 0x137f);
			return 0;
		}
		wait(NULL);

		assert(ptrace(PTRACE_ATTACH, pid, 0,0) == 0);
		alarm(1);
		assert(wait(&stat) == pid);
		assert(stat == 0x137f);
		assert(ptrace(PTRACE_DETACH, pid, 0,0) == 0);

		kill(pid, SIGKILL);
		return 0;
	}
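
A note on the magic number: 0x137f is the wait status of a task stopped
by SIGSTOP where SIGSTOP is 19 (0x13), as on x86.  The low byte 0x7f
marks a stop and the next byte carries the signal.  A minimal sketch of
the decoding, using the standard WIFSTOPPED()/WSTOPSIG() macros; the
helper names stop_status() and stop_signal() are made up for this
illustration, and the bit layout is the conventional Linux encoding
rather than something POSIX guarantees:

```c
#include <assert.h>
#include <signal.h>
#include <sys/wait.h>

/* Build the status reported for a task stopped by `sig`:
 * stop signal in bits 8-15, 0x7f marker in the low byte
 * (assumed Linux layout). */
int stop_status(int sig)
{
	assert(sig > 0 && sig < 128);
	return (sig << 8) | 0x7f;
}

/* Return the stop signal encoded in `status`, or 0 if the status
 * does not describe a stopped task. */
int stop_signal(int status)
{
	return WIFSTOPPED(status) ? WSTOPSIG(status) : 0;
}
```

Comparing against stop_status(SIGSTOP) instead of the literal 0x137f
would keep the test-case valid on architectures that number SIGSTOP
differently.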

Without this patch wait() hangs forever after the 2nd PTRACE_ATTACH.
This is because task->exit_code was already cleared by the previous
debugger. Change ptrace_attach() to restore ->exit_code in this case.

The new exit_code is not necessarily accurate and we can't use
->group_exit_code because it can be cleared by ->real_parent too,
but I think this doesn't really matter.

Even with this patch SIGCONT can "steal" exit_code and the pending
SIGSTOP, but in this case the tracee will report SIGCONT to the new
debugger, so wait() won't hang anyway.

(Of course, SIGCONT after wait() can break PTRACE_DETACH but this
 is another story, any ptrace request can fail if the tracee is
 TASK_STOPPED).

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 kernel/ptrace.c |   15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

--- 37/kernel/ptrace.c~attach_exit_code	2010-11-02 19:48:08.000000000 +0100
+++ 37/kernel/ptrace.c	2011-02-03 20:39:26.000000000 +0100
@@ -202,9 +202,22 @@ int ptrace_attach(struct task_struct *ta
 		task->ptrace |= PT_PTRACE_CAP;
 
 	__ptrace_link(task, current);
-	send_sig_info(SIGSTOP, SEND_SIG_FORCED, task);
 
+	if (task_is_stopped(task)) {
+		/* safe, we checked ->exit_state */
+		spin_lock(&task->sighand->siglock);
+		/*
+		 * This can only happen if the previous tracer cleared
+		 * ->exit_code, make sure do_wait() will not hang.
+		 */
+		if (task_is_stopped(task) && !task->exit_code)
+			task->exit_code = SIGSTOP;
+		spin_unlock(&task->sighand->siglock);
+	}
+
+	send_sig_info(SIGSTOP, SEND_SIG_FORCED, task);
 	retval = 0;
+
 unlock_tasklist:
 	write_unlock_irq(&tasklist_lock);
 unlock_creds:



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-03 20:41     ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Oleg Nesterov
@ 2011-02-03 21:36       ` Roland McGrath
  2011-02-03 21:44         ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Roland McGrath @ 2011-02-03 21:36 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Tejun Heo, jan.kratochvil, linux-kernel, torvalds, akpm

IMHO this sort of band-aid does not really help the overall situation.
It takes something that is intricate and fiddly and just fiddles it a
bit more.  Userland will still have to handle older kernels where this
behavior is not there.  If userland does anything that relies on this
new behavior, then it will have to try somehow to figure out which
kernel versions have which behavior and adapt, etc.

When the old behaviors are unhelpful like this, I think it is really
better to add new mechanisms instead.  We can make new mechanisms more
clear and straightforward for userland to work with from the beginning.
Then the compatibility picture for userland is simply to try a new call
or new ptrace request or new option bit or whatever it is.  When the
kernel supports the new thing, things are easy.  When it doesn't, then
they cope with life as they have been coping before on old kernels.

I have some ideas about new things to add for this problem area.  But
we have to think those through carefully and discuss all the details
thoroughly with Jan and other folks working on userland debuggers
before we write the kernel side.


Thanks,
Roland


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-03 21:36       ` Roland McGrath
@ 2011-02-03 21:44         ` Oleg Nesterov
  2011-02-04 10:53           ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-03 21:44 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Tejun Heo, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/03, Roland McGrath wrote:
>
> IMHO this sort of band-aid does not really help the overall situation.
> It takes something that is intricate and fiddly and just fiddles it a
> bit more.  Userland will still have to handle older kernels where this
> behavior is not there.  If userland does anything that relies on this
> new behavior, then it will have to try somehow to figure out which
> kernel versions have which behavior and adapt, etc.

Absolutely agreed.

As I said, I am not sure this patch makes sense. I only sent it
because I have to react to the bug report.

> When the old behaviors are unhelpful like this, I think it is really
> better to add new mechanisms instead.

Agreed!

Can't resist, let me repeat... imho ptrace is unfixable ;)


OK, please ignore this patch.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-03 21:44         ` Oleg Nesterov
@ 2011-02-04 10:53           ` Tejun Heo
  2011-02-04 13:04             ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-04 10:53 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

Hello, Oleg, Roland.

On Thu, Feb 03, 2011 at 10:44:50PM +0100, Oleg Nesterov wrote:
> On 02/03, Roland McGrath wrote:
> >
> > IMHO this sort of band-aid does not really help the overall situation.
> > It takes something that is intricate and fiddly and just fiddles it a
> > bit more.  Userland will still have to handle older kernels where this
> > behavior is not there.  If userland does anything that relies on this
> > new behavior, then it will have to try somehow to figure out which
> > kernel versions have which behavior and adapt, etc.
> 
> Absolutely agreed.
> 
> As I said, I am not sure this patch makes sense. I only sent it
> because I have to react to the bug report.
> 
> > When the old behaviors are unhelpful like this, I think it is really
> > better to add new mechanisms instead.
> 
> Agreed!
> 
> Can't resist, let me repeat... imho ptrace is unfixable ;)

Hmm... I can't reproduce the problem here, but isn't the problematic
part here the mixing of ptrace and group stop and silently
transforming group stop into ptrace and ptracer consuming the usual
exit code instead of the ptrace specific one?

Also, I don't agree with the notion that doing something entirely new
would magically solve all the problems.  Improvements are achieved
through evolution.  For ptrace, the situation definitely is aggravated
by the use of wait and weird interaction with group stop, but the
interaction is inherently complex for a debugging facility and most
problems won't automatically go away with a new interface.

Actually, I think such an approach is significantly harmful to
improvements of the existing code base.  Instead of encouraging
investigating the actual problems and making sensible tradeoffs, such
an approach discourages making reasonable tradeoffs with the false
expectation that something in the future will magically solve the
problems.  In a lot of cases, there are no unicorns.  Proceeding
forward while managing damage at reasonable level is usually the right
way to go.

That said, well, there always are exceptions and maybe there are some
rainbow farting unicorns in the ptrace land.

Thanks.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-04 10:53           ` Tejun Heo
@ 2011-02-04 13:04             ` Oleg Nesterov
  2011-02-04 14:48               ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-04 13:04 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/04, Tejun Heo wrote:
>
> Hello, Oleg, Roland.
>
> On Thu, Feb 03, 2011 at 10:44:50PM +0100, Oleg Nesterov wrote:
> > On 02/03, Roland McGrath wrote:
> > >
> > > IMHO this sort of band-aid does not really help the overall situation.
> > > It takes something that is intricate and fiddly and just fiddles it a
> > > bit more.  Userland will still have to handle older kernels where this
> > > behavior is not there.  If userland does anything that relies on this
> > > new behavior, then it will have to try somehow to figure out which
> > > kernel versions have which behavior and adapt, etc.
> >
> > Absolutely agreed.
> >
> > As I said, I am not sure this patch makes sense. I only sent it
> > because I have to react to the bug report.
> >
> > > When the old behaviors are unhelpful like this, I think it is really
> > > better to add new mechanisms instead.
> >
> > Agreed!
> >
> > Can't resist, let me repeat... imho ptrace is unfixable ;)
>
> Hmm... I can't reproduce the problem here,

Very strange. Do you mean the test-case doesn't die? (on vanilla kernel).

> but isn't the problematic
> part here the mixing of ptrace and group stop and silently
> transforming group stop into ptrace

Not exactly,

> and ptracer consuming the usual
> exit code instead of the ptrace specific one?

Well, unless the task dies nobody except ptrace can use ->exit_code.

The problem is:

	- the task T stops, it sets ->exit_code exactly because
	  the tracer can attach after that

	- the tracer attaches, does wait(), consumes exit_code
	  and exits

	- another tracer attaches, but exit_code == 0
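
The sequence above can be sketched as a small userspace toy model in C.
This is an illustration only: struct toy_task and the toy_* functions
are hypothetical stand-ins, not kernel interfaces, and a toy_wait()
result of 0 marks the point where a real wait() would find nothing to
report and block.  The `restore` flag mimics what [PATCH 1/1] earlier
in this thread does in ptrace_attach().

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the two pieces of task state involved. */
struct toy_task {
	bool stopped;
	int exit_code;
};

/* The task stops and publishes the stop signal for a later wait(). */
void toy_stop(struct toy_task *t)
{
	assert(t != NULL);
	t->stopped = true;
	t->exit_code = SIGSTOP;
}

/* Attach; with `restore` set, refill a code a previous tracer consumed. */
void toy_attach(struct toy_task *t, bool restore)
{
	assert(t != NULL);
	if (restore && t->stopped && !t->exit_code)
		t->exit_code = SIGSTOP;
}

/* wait(): report and consume the code; 0 means nothing to report. */
int toy_wait(struct toy_task *t)
{
	int code;

	assert(t != NULL);
	code = t->exit_code;
	t->exit_code = 0;
	return code;
}
```

In this model the first tracer's toy_wait() returns SIGSTOP, a second
attach without `restore` leaves toy_wait() returning 0, and a second
attach with `restore` makes it return SIGSTOP again.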

There is no STOPPED/TRACED transformation at all.

> Also, I don't agree with the notion that doing something entirely new
> would magically solve all the problems.  Improvements are achieved
> through evolution.  For ptrace, the situation definitely is aggravated
> by the use of wait

... and reparenting, and signals.

> and weird interaction with group stop,

Yes. And to me the main problem is not the current behaviour. The
problem is that we never tried to define the correct behavior.
OK, real_parent can miss the notification. We can fix this, but
for what? The tracer can resume the thread "silently", this doesn't
look very good anyway.

But even this doesn't matter. We can not change ptrace API so that,
say, it does not reparent the tracee. Once we do this, we already
have the new API.

So, personally I think we need the new API. And we already have
utrace which allows implementing "anything" on top of it, including
the old ptrace for compatibility.

> Actually, I think such approach is significantly harmful to
> improvements of the existing code base.  Instead of encouraging
> investigating the actual problems and making sensible tradeoffs, such
> approach discourages making reasonable tradeoffs with the false
> expectation that something in the future will magically solve the
> problems.  In a lot of cases, there are no unicorns.  Proceeding
> forward while managing damage at reasonable level is usually the right
> way to go.

Well, perhaps I am wrong, this is only my opinion.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-04 13:04             ` Oleg Nesterov
@ 2011-02-04 14:48               ` Tejun Heo
  2011-02-04 17:06                 ` Oleg Nesterov
  2011-02-13 21:24                 ` Denys Vlasenko
  0 siblings, 2 replies; 160+ messages in thread
From: Tejun Heo @ 2011-02-04 14:48 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

Hey, guys.

On Fri, Feb 04, 2011 at 02:04:55PM +0100, Oleg Nesterov wrote:
> > Hmm... I can't reproduce the problem here,
> 
> Very strange. Do you mean the test-case doesn't die? (on vanilla kernel).

Heh, it turns out the second child was attaching before the first
had succeeded in stopping itself, so when it gets detached for the first
time, the first child then stops, generating a new exit_code.  Adding a
small delay to the parent after the first child started made it
reliably fail on the vanilla kernel.

> > but isn't the problematic
> > part here the mixing of ptrace and group stop and sliently
> > transforming group stop into ptrace
> 
> Not exactly,
> 
> > and ptracer consuming the usual
> > exit code instead of the ptrace specific one?
> 
> Well, unless the task dies nobody except ptrace can use ->exit_code.
> 
> The problem is:
> 
> 	- the task T stops, it sets ->exit_code exactly because
> 	  the tracer can attach after that
> 
> 	- the tracer attaches, does wait(), consumes exit_code
> 	  and exits
> 
> 	- another tracer attaches, but exit_code == 0
> 
> There is no STOPPED/TRACED transformation at all.

But there is.  It happens because there is no clear distinction between
group stop and ptrace_stop.  With my first series applied, it doesn't
happen anymore because ptracer _never_ depends on or consumes group
stop exit_code.  The exit_code is cached in task->group_stop and used
when the tracee enters ptrace_stop() for group stop.  It doesn't
matter how many times it gets detached or re-attached, or whether
someone else consumes the group stop exit_code.

> > Also, I don't agree with the notion that doing something entirely new
> > would magically solve all the problems.  Improvements are achieved
> > through evolution.  For ptrace, the situation definitely is aggravated
> > by the use of wait
> 
> ... and reparenting, and signals.
> 
> > and weird interaction with group stop,
> 
> Yes. And to me the main problem is not the current behaviour. The
> problem is that we never tried to define the correct behavior.
> OK, real_parent can miss the notification. We can fix this, but
> for what? The tracer can resume the thread "silently", this doesn't
> look very good anyway.

Yes, I agree it's ugly but that's what we already have.  I think we
can still achieve well-defined behavior even with ptracer allowed to
diddle with the task while group stop is in effect.  It may not be
immediately intuitive but I personally think it actually would be more
useful to do things that way, as long as we clearly lay out what is
supported and what is undefined.

I think a good compromise would be guaranteeing that when the ptracer
goes away, the tracee is put into a state the real parent can
agree to and the real parent is notified that it has happened.  We
are already skipping all notifications to the real parent for ptraced
children, there's no pressing need to change that.  If there becomes a
real pressing requirement to change that.

> But even this doesn't matter. We can not change ptrace API so that,
> say, it does not reparent the tracee. Once we do this, we already
> have the new API.

I would argue that we can get by well enough by trimming and updating
the current ptrace API.

> So, personally I think we need the new API. And we already have
> utrace which allows implementing "anything" on top of it, including
> the old ptrace for compatibility.

I could be wrong (with pretty high probability) but I don't really see
the pressing need for a completely new API.  ptrace sure is ugly and
quirky but it's something people are already used to.

> Well, perhaps I am wrong, this is only my opinion.

That's all anyone can do anyway and I'm much more likely to be wrong
on the subject than you and Roland.  I just hope to find out where I'm
wrong.

Thank you.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-04 14:48               ` Tejun Heo
@ 2011-02-04 17:06                 ` Oleg Nesterov
  2011-02-05 13:39                   ` Tejun Heo
  2011-02-13 21:24                 ` Denys Vlasenko
  1 sibling, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-04 17:06 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/04, Tejun Heo wrote:
>
> Hey, guys.
>
> On Fri, Feb 04, 2011 at 02:04:55PM +0100, Oleg Nesterov wrote:
> > > Hmm... I can't reproduce the problem here,
> >
> > Very strange. Do you mean the test-case doesn't die? (on vanilla kernel).
>
> Heh, it turns out the second child was attaching before the first
> had succeeded in stopping itself, so when it gets detached for the first
> time, the first child then stops, generating a new exit_code.

OOPS! Yes, the test-case is racy.

> Adding a
> small delay to the parent after the first child started made it
> reliably fail on the vanilla kernel.

Yes, or waitpid(WSTOPPED).
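
A minimal sketch of that second option, assuming nothing beyond fork(),
kill() and waitpid(): the parent waits for the stop to be reported
before anyone attaches, which closes the race.  The helper names are
hypothetical, and WUNTRACED is used as the waitpid() spelling of the
flag (on Linux WSTOPPED is defined to the same value).

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that stops itself, like the first fork() of the
 * test-case.  Returns the child's pid, or -1 on failure. */
pid_t spawn_self_stopping_child(void)
{
	pid_t pid = fork();

	if (pid == 0) {
		kill(getpid(), SIGSTOP);
		_exit(1);	/* only reached if the child is continued */
	}
	return pid;
}

/* Block until `pid` has really entered its stop.  Returns the stop
 * signal, or -1 on any other outcome. */
int wait_until_stopped(pid_t pid)
{
	int status;

	assert(pid > 0);
	if (waitpid(pid, &status, WUNTRACED) != pid)
		return -1;
	return WIFSTOPPED(status) ? WSTOPSIG(status) : -1;
}

/* Kill and reap the child so the sketch leaves nothing behind. */
void reap_child(pid_t pid)
{
	assert(pid > 0);
	kill(pid, SIGKILL);
	waitpid(pid, NULL, 0);
}
```

In the test-case this would mean calling wait_until_stopped(pid) in
the parent right after the first fork(), before the second fork().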

> > There is no STOPPED/TRACED transformation at all.
>
> But there is.  It happens because there is no clear distinction between
> group stop and ptrace_stop.

Ah, in this sense I agree.

> With my first series applied, it doesn't
> happen anymore because ptracer _never_ depends on or consumes group
> stop exit_code.

Yes, I know ;) Please note "[PATCH 0/1]", I specially mentioned that
your patch should fix the problem too. And yes, my patch is the hack
which doesn't even try to address the underlying problem.

> Yes, I agree it's ugly but that's what we already have.  I think we
> can still achieve well-defined behavior even with ptracer allowed to
> diddle with the task while group stop is in effect.  It may not be
> immediately intuitive but I personally think it actually would be more
> useful to do things that way, as long as we clearly lay out what is
> supported and what is undefined.
>
> I think a good compromise would be guaranteeing that when the ptracer
> goes away, the tracee is put into a state the real parent can
> agree to and the real parent is notified that it has happened.

I am not sure. The tracing should be transparent as much as possible.

> We
> are already skipping all notifications to the real parent for ptraced
> children, there's no pressing need to change that.  If there becomes a
> real pressing requirement to change that.

And this looks just wrong to me. Say, why doesn't the ptraced application
react to ^Z? It does, it is just that its parent has no chance to see
this. (yes, yes, we should also change do_wait).

> I could be wrong (with pretty high probability) but I don't really see
> the pressing need for a completely new API.  ptrace sure is ugly and
> quirky but it's something people are already used to.

I won't argue. And yes, in any case ptrace can't go away, we should
try to improve it anyway. The obvious problem is that almost any
"visible" improvement breaks something.

Oleg.


^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-04 17:06                 ` Oleg Nesterov
@ 2011-02-05 13:39                   ` Tejun Heo
  2011-02-07 13:42                     ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-05 13:39 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

Hello, Oleg.

On Fri, Feb 04, 2011 at 06:06:02PM +0100, Oleg Nesterov wrote:
> > I think a good compromise would be guaranteeing that when the ptracer
> > goes away, the tracee would be put into the state the real parent can
> > agree to and the real parent would be notified that it has happened.
> 
> I am not sure. The tracing should be transparent as much as possible.

Yes but there are different ways to achieve that.  Please read on.

> > We are already skipping all notifications to the real parent for
> > ptraced children, there's no pressing need to change that.  If
> > there becomes a real pressing requirement to change that.
> 
> And this looks just wrong to me. Say, why the ptraced application
> doesn't react to ^Z ? It does, just it parent have no chance to see
> this. (yes, yes, we should also change do_wait).

That's the shortcomings of the current implementation.  The specific
problem sure can be fixed by putting group stop on top of ptrace but
that is not the only direction.  In fact, that actually is the
direction we CAN'T take with ptrace because changing that will create
a lot more problems that can't be worked around.

We can introduce something completely new but we can also augment the
current implementation with what's necessary to remedy the problem.
The ptracer is already notified when the ptracee enters group stop.
There's nothing stopping us giving ptracer the ability to tell ptracee
to participate in group stop completion and notify the real parent.
It will be an extra feature, probably a new PTRACE_ operation.  This
way, the existing users behave the same way while the ones which are
updated to use the new feature would behave better w.r.t. group stop
while being ptraced.

The above is one of many possibilities and it might not be possible /
desirable as described.  I haven't thought that much about it, but my
point is that "oh, we'll have something completely new which cures
everything at once, so let's reject changes to the current code as
much as possible" does not help anyone.  Approaches like that rarely
work.  For example, the problem in this thread is cleanly solved by
really examining the problem and fixing it at the source (the mixup of
group and ptrace stop).  Accumulating such proper improvements and
thus evolving the code base is a much more effective way.

So, let's stop chasing unicorns.  Cake is a lie.  Let's work on what
we already have and incrementally improve it.  There might be some
extreme corner cases where userland might notice differences but as I
wrote before given the wide range of inconsistencies the current code
is showing, I think we actually have good amount of latitude.  It'll
be mostly about making sensible tradeoffs and testing the existing
users.

> > I could be wrong (with pretty high probability) but I don't really see
> > the pressing need for a completely new API.  ptrace sure is ugly and
> > quirky but it's something people are already used to.
> 
> I won't argue. And yes, in any case ptrace can't go away, we should
> try to improve it anyway. The obvious problem is that almost any
> "visible" improvement breaks something.

I don't believe any 'visible' improvement breaks something.  In fact,
the current users seem quite resilient to behavior changes (they
should be, as the current code behaves inconsistently in many places).
There sure will be some cases which would be put into
undefined/unsupported area but there's no reason to preserve every
single inconsistent detail of the current behavior.  As everything
else, it is a trade off we can make.

Thank you.

-- 
tejun

^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: Bash not reacting to Ctrl-C
  2011-01-28 18:29     ` Bash not reacting to Ctrl-C Ingo Molnar
@ 2011-02-05 20:34       ` Oleg Nesterov
  2011-02-07 13:08         ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-05 20:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Tejun Heo, roland, jan.kratochvil, linux-kernel, torvalds, akpm,
	Peter Zijlstra, Thomas Gleixner, Frédéric Weisbecker

On 01/28, Ingo Molnar wrote:
>
> * Oleg Nesterov <oleg@redhat.com> wrote:
>
> > On 01/28, Ingo Molnar wrote:
> > >
> > > The bug is that occasionally Ctrl-C does not get processed, and that the Ctrl-C is
> > > 'lost'. It can be reproduced here by running ./test-signal several times, and
> > > Ctrl-C-ing it:
> > >
> > >  $ ./test-signal
> > >  ^C
> > >  $ ./test-signal
> > >  ^C^C
> > >  $ ./test-signal
> > >  ^C
> > >
> > > See that '^C^C' line? That is where i had to do Ctrl-C twice.
> >
> > Reproduced.
> >
> > At first glance, /bin/sh should be blamed... Hmm, probably yes,
> > I even reproduced this under strace, and this is what I see
> >
> > 	wait4(-1, 0x7fff388431c4, 0, NULL) = ? ERESTARTSYS (To be restarted)
> > 	--- SIGINT (Interrupt) @ 0 (0) ---
> > 	rt_sigreturn(0)                         = -1 EINTR (Interrupted system call)
> > 	wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 9706
> >
> > So, ^C is not lost, but ./test-signal doesn't want to exit.
>
> Might be some Bash assumption or race that works under other OSs but somehow Linux
> does differently. IIRC Bash is being developed on MacOS-X.
>
> But it's happening all the time (with yum for example - but also with makejobs, as
> Thomas has reported it) - this is simply the first time i managed to reproduce it
> with something really simple.

OK, I seem to understand what happens. Of course I am not sure, I never
looked into these sources before...

Suppose that jctl ^C races with the normal child exit. In this case
waitchld() sets child->status = status (zero in this case) and calls
set_job_status_and_cleanup().

set_job_status_and_cleanup() notices wait_sigint_received and sends
SIGINT to itself (termsig_handler (SIGINT)), but somehow it assumes
that the last foreground job should be terminated by SIGINT too:

	 else if (wait_sigint_received && (WTERMSIG (child->status) == SIGINT) &&

Then the next wait_for() clears wait_sigint_received and bash
loses ^C

Oleg.


^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: Bash not reacting to Ctrl-C
  2011-02-05 20:34       ` Oleg Nesterov
@ 2011-02-07 13:08         ` Oleg Nesterov
  2011-02-09  6:17           ` Michael Witten
  2011-02-11 14:41           ` Pavel Machek
  0 siblings, 2 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-07 13:08 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Tejun Heo, roland, jan.kratochvil, linux-kernel, torvalds, akpm,
	Peter Zijlstra, Thomas Gleixner, Frédéric Weisbecker

On 02/05, Oleg Nesterov wrote:
>
> On 01/28, Ingo Molnar wrote:
> >
> > * Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > > On 01/28, Ingo Molnar wrote:
> > > >
> > > > The bug is that occasionally Ctrl-C does not get processed, and that the Ctrl-C is
> > > > 'lost'. It can be reproduced here by running ./test-signal several times, and
> > > > Ctrl-C-ing it:
> > > >
> > > >  $ ./test-signal
> > > >  ^C
> > > >  $ ./test-signal
> > > >  ^C^C
> > > >  $ ./test-signal
> > > >  ^C
> > > >
> > > > See that '^C^C' line? That is where i had to do Ctrl-C twice.
> > >
> > > Reproduced.
> > >
> > > At first glance, /bin/sh should be blamed... Hmm, probably yes,
> > > I even reproduced this under strace, and this is what I see
> > >
> > > 	wait4(-1, 0x7fff388431c4, 0, NULL) = ? ERESTARTSYS (To be restarted)
> > > 	--- SIGINT (Interrupt) @ 0 (0) ---
> > > 	rt_sigreturn(0)                         = -1 EINTR (Interrupted system call)
> > > 	wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 9706
> > >
> > > So, ^C is not lost, but ./test-signal doesn't want to exit.
> >
> > Might be some Bash assumption or race that works under other OSs but somehow Linux
> > does differently. IIRC Bash is being developed on MacOS-X.
> >
> > But it's happening all the time (with yum for example - but also with makejobs, as
> > Thomas has reported it) - this is simply the first time i managed to reproduce it
> > with something really simple.
>
> OK, I seem to understand what happens. Of course I am not sure, I never
> looked into these sources before...
>
> Suppose that jctl ^C races with the normal child exit. In this case
> waitchld() sets child->status = status (zero in this case) and calls
> set_job_status_and_cleanup().
>
> set_job_status_and_cleanup() notices wait_sigint_received and sends
> SIGINT to itself (termsig_handler (SIGINT)), but somehow it assumes
> that the last foreground job should be terminated by SIGINT too:
>
> 	 else if (wait_sigint_received && (WTERMSIG (child->status) == SIGINT) &&
>
> Then the next wait_for() clears wait_sigint_received and bash
> loses ^C

IOW.

Now that it is clear what happens, the test-case becomes even more
trivial:

	bash-4.1$ ./bash -c 'while true; do /bin/true; done'
	^C^C

needs 4-5 attempts on my machine.

The patch below fixes the problem, but most probably it is not
correct. Although I don't understand the point of "status == SIGINT"
check, we already checked this job is dead. But I won't pretend I
really understand this code.
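
A rough sketch of the race, assuming only the shape of the condition quoted above: when the child exits normally just as ^C arrives, its wait status carries no SIGINT, so a check that also requires WTERMSIG(status) == SIGINT does nothing. The function names are hypothetical and this is not bash's code.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run a child that exits normally or dies from SIGINT; return its wait status. */
int child_status(int die_by_sigint)
{
	int status = -1;
	pid_t pid = fork();

	assert(pid >= 0);
	if (pid == 0) {
		if (die_by_sigint) {
			signal(SIGINT, SIG_DFL);	/* make sure SIGINT is fatal */
			raise(SIGINT);
		}
		_exit(0);
	}
	waitpid(pid, &status, 0);
	return status;
}

/* Shape of the quoted check: the shell saw SIGINT and the child died from it. */
int acts_on_sigint_original(int wait_sigint_received, int status)
{
	return wait_sigint_received && WTERMSIG(status) == SIGINT;
}

/* Same check with the child-status test dropped. */
int acts_on_sigint_relaxed(int wait_sigint_received, int status)
{
	(void)status;			/* deliberately ignored */
	return wait_sigint_received != 0;
}
```

With a normally exited child and wait_sigint_received set, the first predicate is false and the second is true, which is the difference between the shell dropping ^C and acting on it.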

Oleg.

--- bash-4.1/jobs.c~ctrlc_exit_race	2011-02-07 13:52:48.000000000 +0100
+++ bash-4.1/jobs.c	2011-02-07 13:55:30.000000000 +0100
@@ -3299,7 +3299,7 @@ set_job_status_and_cleanup (job)
 	 signals are sent to process groups) or via kill(2) to the foreground
 	 process by another process (or itself).  If the shell did receive the
 	 SIGINT, it needs to perform normal SIGINT processing. */
-      else if (wait_sigint_received && (WTERMSIG (child->status) == SIGINT) &&
+      else if (wait_sigint_received /*&& (WTERMSIG (child->status) == SIGINT)*/ &&
 	      IS_FOREGROUND (job) && IS_JOBCONTROL (job) == 0)
 	{
 	  int old_frozen;


^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-05 13:39                   ` Tejun Heo
@ 2011-02-07 13:42                     ` Oleg Nesterov
  2011-02-07 14:11                       ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-07 13:42 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/05, Tejun Heo wrote:
>
> On Fri, Feb 04, 2011 at 06:06:02PM +0100, Oleg Nesterov wrote:
>
> > > We are already skipping all notifications to the real parent for
> > > ptraced children, there's no pressing need to change that.  If
> > > there becomes a real pressing requirement to change that.
> >
> > And this looks just wrong to me. Say, why the ptraced application
> > doesn't react to ^Z ? It does, just it parent have no chance to see
> > this. (yes, yes, we should also change do_wait).
>
> That's the shortcomings of the current implementation.  The specific
> problem sure can be fixed by putting group stop on top of ptrace but
> that is not the only direction.  In fact, that actually is the
> direction we CAN'T take with ptrace because changing that will create
> a lot more problems that can't be worked around.

Which problems?

> We can introduce something completely new but we can also augment the
> current implementation with what's necessary to remedy the problem.
> The ptracer is already notified when the ptracee enters group stop.
> There's nothing stopping us giving ptracer the ability to tell ptracee
> to participate in group stop completion and notify the real parent.
> It will be an extra feature, probably a new PTRACE_ operation.

Yes. But I still can't understand the point, please see below.

For the moment, please forget about CLD_CONTINUED. Forget that do_wait()
doesn't work for the real_parent, this is simple (afaics) to fix.

> This
> way, the existing users behave the same way

Wait. The current behaviour is just broken. This is bad. But, at the
same time, this is good: it gives us more rights to introduce the
user-visible changes once we decide what exactly we want.

Firstly, the current behaviour is unpredictable. Suppose that we have
two threads, T1 and T2. Only one thread is ptraced. Now, real_parent
will be notified or not, depending on which thread calls do_signal_stop()
last. I see no point to preserve this randomness. And note that
real_parent can be notified anyway.
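
To make the T1/T2 example concrete, here is a toy model, not kernel code. It assumes that only the thread completing the group stop generates the notification, and that the notification goes to the tracer instead of real_parent when that thread is ptraced. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct toy_thread {
	bool ptraced;
};

/*
 * Threads stop in the sequence given by order[].  Returns true if
 * real_parent would be notified under the assumptions stated above.
 */
bool toy_real_parent_notified(const struct toy_thread *threads,
			      const size_t *order, size_t n)
{
	size_t group_stop_count = n;	/* threads that still have to stop */
	bool real_parent_notified = false;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		const struct toy_thread *t = &threads[order[i]];

		/* Only the thread that completes the group stop reports it. */
		if (--group_stop_count == 0)
			real_parent_notified = !t->ptraced;
	}
	return real_parent_notified;
}
```

With T1 ptraced and T2 not, the order T1, T2 notifies real_parent while the order T2, T1 does not, so the outcome depends purely on scheduling.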

Otoh, I don't really understand why should we delay the notification
to real_parent until detach in the simplest case (with your patches)
by default.

> while the ones which are
> updated to use the new feature would behave better w.r.t. group stop
> while being ptraced.

OK, perhaps the extra feature is fine since it gives more control,
but simplicity is the nice goal too. Especially if we are talking
about incremental changes.

> "oh, we'll have something completely new which cures
> everything at once, so let's reject changes to the current code as
> much as possible"

No, no, sorry, I didn't mean exactly this. At least I certainly didn't
mean we should not fix the bugs ;)

> For example, the problem in this thread is cleanly solved by
> really examining the problem and fixing the problem at the source (the
> mixup of group and ptrace stop)

Yes, but I am worried that this change (in its current form) makes
it impossible to create a TASK_STOPPED tracee, but you already know this.

> So, let's stop chasing unicorns.  Cake is a lie.  Let's work on what
> we already have and incrementally improve it.

OK. But what I can't understand is why the alternative change is
not better. Once again:

	- the stopping thread always notifies the debugger

	- the last thread notifies both: debugger and real_parent

	- do_wait() is modified so that WSTOPPED always works
	  for real_parent, even if its child is ptraced.

(to remind, please ignore SIGCONT/ptrace_resume problems, and of
 course ptrace_stop() playing with group_stop_count should be fixed).
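
The three rules can be sketched as a toy model, not kernel code. It assumes "notifies the debugger" applies only to ptraced threads, and it leaves do_wait() out entirely. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct toy_thread {
	bool ptraced;
};

struct toy_notifications {
	size_t to_debugger;
	size_t to_real_parent;
};

/* Threads stop in the sequence given by order[]. */
struct toy_notifications toy_proposed_group_stop(const struct toy_thread *threads,
						  const size_t *order, size_t n)
{
	struct toy_notifications out = { 0, 0 };
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		const struct toy_thread *t = &threads[order[i]];

		/* Every stopping thread reports to its debugger, if it has one. */
		if (t->ptraced)
			out.to_debugger++;

		/* The last thread additionally reports to real_parent. */
		if (i == n - 1)
			out.to_real_parent++;
	}
	return out;
}
```

In this model real_parent gets exactly one notification per group stop regardless of which thread stops last.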

Roland, Tejun, seriously, could you explain why this is bad?

This is much simpler (including implementation) and straightforward.
Easy to understand. No need to remember the state of the tracee wrt
group-stop.

If we are going to add the new ptrace options, I'd rather prefer
to add PTRACE_O_DONT_NOTIFY_REAL_PARENT if we want to control this
behaviour.

What do you think?

Oleg.


^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-07 13:42                     ` Oleg Nesterov
@ 2011-02-07 14:11                       ` Tejun Heo
  2011-02-07 15:37                         ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-07 14:11 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

Hello, Oleg.

On Mon, Feb 07, 2011 at 02:42:35PM +0100, Oleg Nesterov wrote:
> > That's the shortcomings of the current implementation.  The specific
> > problem sure can be fixed by putting group stop on top of ptrace but
> > that is not the only direction.  In fact, that actually is the
> > direction we CAN'T take with ptrace because changing that will create
> > a lot more problems that can't be worked around.
> 
> Which problems?

I was talking about prioritizing group stop over ptrace in general.
Please see the following messages.

 http://article.gmane.org/gmane.linux.kernel/1095119
 http://article.gmane.org/gmane.linux.kernel/1095603

Notifying the parent w/o making group stop superior to ptrace sure is
a possibility.

> > This way, the existing users behave the same way
> 
> Wait. The current behaviour is just broken. This is bad. But, at the
> same time, this is good: it gives us more rights to introduce the
> user-visible changes once we decide what exactly we want.
> 
> Firstly, the current behaviour is unpredictable. Suppose that we have
> two threads, T1 and T2. Only one thread is ptraced. Now, real_parent
> will be notified or not, depending on which thread calls do_signal_stop()
> last. I see no point to preserve this randomness. And note that
> real_parent can be notified anyway.
> 
> Otoh, I don't really understand why should we delay the notification
> to real_parent until detach in the simplest case (with your patches)
> by default.

Yeah, agreed regarding the notification.

> > For example, the problem in this thread is cleanly solved by
> > really examining the problem and fixing the problem at the source (the
> > mixup of group and ptrace stop)
> 
> Yes, but I am worried that this change (in its current form) makes
> impossible to create a TASK_STOPPED tracee, but you already know this.

Why is that a problem?  A ptraced task stops in TASK_TRACED.  It
should for a number of different reasons.  We can augment the group
stop notification if desirable.

> OK. But what I can't understand is why the alternative change is
> not better. Once again:
> 
> 	- the stopping thread always notifies the debugger
> 
> 	- the last thread notifies both: debugger and real_parent
> 
> 	- do_wait() is modified so that WSTOPPED always works
> 	  for real_parent, even if its child is ptraced.

I think the disconnection comes from the scope of the problem.  If we
restrict our attention to group stop notification, I agree that what
you're describing seems like a good compromise.  What I was objecting
to was putting group stop mechanism in general on top of ptrace.  I
can't see how that would work.

Also, for a ptraced task, what would you consider to be participating
in a group stop?  I think it should only include the case where the
tracee actually stops for group stop excluding all other trapping
points.  That way ptracer also can tell that the tracee has stopped
for group stop and participated in group stop completion/notification.

But, I don't think this really changes the need for state tracking.
We would still have to put the tracee into appropriate mode on detach.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-07 14:11                       ` Tejun Heo
@ 2011-02-07 15:37                         ` Oleg Nesterov
  2011-02-07 16:31                           ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-07 15:37 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/07, Tejun Heo wrote:
>
> Hello, Oleg.
>
> On Mon, Feb 07, 2011 at 02:42:35PM +0100, Oleg Nesterov wrote:
> > > That's the shortcomings of the current implementation.  The specific
> > > problem sure can be fixed by putting group stop on top of ptrace but
> > > that is not the only direction.  In fact, that actually is the
> > > direction we CAN'T take with ptrace because changing that will create
> > > a lot more problems that can't be worked around.
> >
> > Which problems?
>
> I was talking about prioritizing group stop over ptrace in general.
> Please see the following messages.
>
>  http://article.gmane.org/gmane.linux.kernel/1095119
>  http://article.gmane.org/gmane.linux.kernel/1095603

Yes, I tried to read this... But I have to admit I can hardly understand
your discussion with Roland. More precisely, I don't understand what
exactly you have in mind.

One (may be off-topic) note,

	On 01/31, Tejun Heo wrote:
	>
	> On Fri, Jan 28, 2011 at 01:30:09PM -0800, Roland McGrath wrote:
	> > > A visible behavior change is increased likelihood of delayed group
	> > > stop completion if the thread group contains one or more ptraced
	> > > tasks.
	> >
	> > I object to that difference in behavior.  As I've said before, I don't
	> > think there should be any option to circumvent a group stop via ptrace.
	> > If you think otherwise, you have a hard road to convince me of it.

I agree with Roland here.

	> Yes, I do have some other ideas.  When a ptraced task gets a stop
	> signal, its delivery is controlled by the tracer, right?

Right, but note that the tracer does not fully control the group-stop.
Once a thread dequeues SIGSTOP (and please note this thread can be !traced),
all other threads (traced or not) should participate.
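
As a userland illustration of the tracer controlling delivery, here is a minimal sketch using the standard ptrace(2) calls. The function name is hypothetical. It covers only the single-threaded case: the tracee's SIGSTOP is reported to the tracer first, and PTRACE_CONT with a zero signal argument resumes the tracee without delivering it.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the tracee's exit code, or -1 on failure. */
int suppress_stop_signal(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
			_exit(1);
		raise(SIGSTOP);		/* reported to the tracer first */
		_exit(42);		/* reached only after being resumed */
	}
	assert(pid > 0);

	/* The tracee stops and reports SIGSTOP to us. */
	if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status))
		return -1;

	/* data == 0: resume without delivering the signal. */
	if (ptrace(PTRACE_CONT, pid, NULL, NULL) < 0) {
		kill(pid, SIGKILL);
		waitpid(pid, &status, 0);
		return -1;
	}

	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

The multi-threaded case discussed here is exactly where this stops being the whole story: another, untraced thread can dequeue the stop signal and start the group stop without the tracer being asked.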

As for SIGCONT priority, see below.

> Notifying the parent w/o making group stop superior to ptrace sure is
> a possibility.

Could you please reiterate? I think I missed something before, and
now I do not really understand what do you mean.

> > > For example, the problem in this thread is cleanly solved by
> > > really examining the problem and fixing the problem at the source (the
> > > mixup of group and ptrace stop)
> >
> > Yes, but I am worried that this change (in its current form) makes
> > impossible to create a TASK_STOPPED tracee, but you already know this.
>
> Why is that a problem?

See above. Because I think ptrace should not "hide" jctl stops (at
least by default), and SIGCONT should work in this case.

> A ptraced task stops in TASK_TRACED.

Unless it reacts to SIGSTOP/group_stop_count.

> > OK. But what I can't understand is why the alternative change is
> > not better. Once again:
> >
> > 	- the stopping thread always notifies the debugger
> >
> > 	- the last thread notifies both: debugger and real_parent
> >
> > 	- do_wait() is modified so that WSTOPPED always works
> > 	  for real_parent, even if its child is ptraced.
>
> I think the disconnection comes from the scope of the problem.  If we
> restrict our attention to group stop notification.

Of course, we shouldn't restrict.

> I agree that what
> you're describing seems like a good compromise.  What I was objecting
> to was putting group stop mechanism in general on top of ptrace.  I
> can't see how that would work.

And I still can't understand why this can't work ;)

And I don't really understand "putting group stop mechanism in general
on top of ptrace". It is very possible I am wrong, but I see this from
the different angle: stop/ptrace should be "parallel".

> Also, for a ptraced task, what would you consider to be participating
> in a group stop?

Yes, this is the question.

> I think it should only include the case where the
> tracee actually stops for group stop excluding all other trapping
> points.

I was thinking about this too and probably this makes sense. But
I think at least initial changes should keep the current behaviour
(assuming this behaviour is fixed).

> But, I don't think this really changes the need for state tracking.
> We would still have to put the tracee into appropriate mode on detach.

Sure, but we already have SIGNAL_STOP_STOPPED/group_signal_stop. I meant,
we do not need to remember the state per-thread.


As for SIGCONT. Roland suggests (roughly) to change ptrace_resume()
so that it doesn't wakeup the stopped tracee until the real SIGCONT
comes. And this makes sense to me.

	On 02/03, Tejun Heo wrote:
	>
	> I've been thinking about this and the more I think about it I don't
	> see how we can make this priority flipping without adversely affecting
	> the expected userland behavior.
	>
	> For example, if a gdb traced task is instructed to participate in a
	> group stop and then hits a ptrace trap, it would have to participate
	> in the group stop as it enters ptrace trap, right?  gdb's wait(2)
	> would complete indicating ptrace trap.  After the user tells the task
	> to continue, the task shouldn't resume until SIGCONT is received;

Yes. But to me, this looks correct! The tracee shouldn't resume exactly
because it is stopped.

	> however, at this point, there's no way for gdb to tell what's going on
	> with the tracee.

Yes. I think this should be improved somehow, currently gdb can only
look in /proc/tid/status to detect this case.
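
For reference, a minimal sketch of that polling approach: read the "State:" line from /proc/<tid>/status. The helper name is hypothetical. On kernels that distinguish the two, a job control stop is expected to show as 'T' and a ptrace stop as 't'.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return the state letter from the "State:" line of /proc/<tid>/status,
 * or '?' if it cannot be read.  (Hypothetical helper.)
 */
char read_task_state(pid_t tid)
{
	char path[64];
	char line[256];
	char state = '?';
	FILE *f;

	assert(tid > 0);
	snprintf(path, sizeof(path), "/proc/%d/status", (int)tid);
	f = fopen(path, "r");
	if (!f)
		return '?';

	while (fgets(line, sizeof(line), f)) {
		/* The line looks like "State:\tR (running)". */
		if (sscanf(line, "State: %c", &state) == 1)
			break;
	}
	fclose(f);
	return state;
}
```

This is a snapshot that can be stale by the time it is examined, which is why a polling interface is a poor substitute for a real notification.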

	> If ptrace behaved like that from the beginning, gdb would have behaved
	> differently and worked around those cases but that hasn't been the
	> case

Cough... I thought we agreed it is better to break some corner cases
but make ptrace more consistent ;)

But yes, I see your point. And while I think that Roland's suggestion is
fine, I also have another proposal

	- never send CLD_CONTINUED to the tracer, always send it to parent.

	  Firstly, this is completely pointless: ptrace is per-thread, while
	  this notification is per-process

	- change do_wait() so that WCONTINUED works for real_parent

	- change ptrace_resume() to check SIGNAL_STOP_STOPPED case. It should
	  act as SIGCONT in this case. Yes: "act as SIGCONT" needs more
	  discussion.

Oleg.


^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-07 15:37                         ` Oleg Nesterov
@ 2011-02-07 16:31                           ` Tejun Heo
  2011-02-07 17:48                             ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-07 16:31 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

Hey, Oleg.

On Mon, Feb 07, 2011 at 04:37:23PM +0100, Oleg Nesterov wrote:
> On 02/07, Tejun Heo wrote:
> >  http://article.gmane.org/gmane.linux.kernel/1095119
> >  http://article.gmane.org/gmane.linux.kernel/1095603
> 
> Yes, I tried to read this... But I have to admit I can hardly understand
> your discussion with Roland. More precisely, I don't understand what
> exactly you have in mind.

Heh, okay, I'll try harder.  :-)

> One (may be off-topic) note,
> 
> 	On 01/31, Tejun Heo wrote:
> 	>
> 	> On Fri, Jan 28, 2011 at 01:30:09PM -0800, Roland McGrath wrote:
> 	> > > A visible behavior change is increased likelihood of delayed group
> 	> > > stop completion if the thread group contains one or more ptraced
> 	> > > tasks.
> 	> >
> 	> > I object to that difference in behavior.  As I've said before, I don't
> 	> > think there should be any option to circumvent a group stop via ptrace.
> 	> > If you think otherwise, you have a hard road to convince me of it.
> 
> I agree with Roland here.
> 
> 	> Yes, I do have some other ideas.  When a ptraced task gets a stop
> 	> signal, its delivery is controlled by the tracer, right?
> 
> Right, but note that the tracer does not fully control the group-stop.
> Once a thread dequeues SIGSTOP (and please note this thread can be !traced),
> all other threads (traced or not) should participate.

I don't know.  Maybe it's more consistent that way and I'm not
fundamentally against that but it is a big behavior change and I don't
think it falls in the corner case category.  Please read on.

> As for SIGCONT priority, see below.
> 
> > Notifying the parent w/o making group stop superior to ptrace sure is
> > a possibility.
> 
> Could you please reiterate? I think I missed something before, and
> now I do not really understand what do you mean.

I was trying to say that it's still possible to deliver group stop
notifications to the real parent while letting the ptracer override
group stop with PTRACE_CONT.

> > > > For example, the problem in this thread is cleanly solved by
> > > > really examining the problem and fixing the problem at the source (the
> > > > mixup of group and ptrace stop)
> > >
> > > Yes, but I am worried that this change (in its current form) makes
> > > impossible to create a TASK_STOPPED tracee, but you already know this.
> >
> > Why is that a problem?
> 
> See above. Because I think ptrace should not "hide" jctl stops (at
> least by default), and SIGCONT should work in this case.
>
> > A ptraced task stops in TASK_TRACED.
> 
> Unless it reacts to SIGSTOP/group_stop_count.

What do you do about PTRACE requests while a task is group stopped?
Reject them?  Block them?

> > I agree that what
> > you're describing seems like a good compromise.  What I was objecting
> > to was putting group stop mechanism in general on top of ptrace.  I
> > can't see how that would work.
> 
> And I still can't understand why this can't work ;)
> 
> And I don't really understand "putting group stop mechanism in general
> on top of ptrace". It is very possible I am wrong, but I see this from
> the different angle: stop/ptrace should be "parallel".

Hmmm... currently ptrace overrides group stop and has full control
over when and where the tracee stops and continues, which I think is a
quite visible assumption.  I don't think it's an extreme corner case
we can break.  For example, if a user gdb's a program which raises one
of the stop signals, currently the user expects to be able to continue
the program from within gdb.  If we make group stop override
ptrace, there's no other recourse than sending signal from outside.

It is a deep rooted assumption that ptracer has full control over
execution of the tracee.  I can't see how we would be able to change
that.  Moreover, although it may not be immediately intuitive, I
actually think it is more useful behavior for ptrace for
e.g. debugging multithread behavior as long as we can keep the whole
thing well defined.

> > I think it should only include the case where the
> > tracee actually stops for group stop excluding all other trapping
> > points.
> 
> I was thinking about this too and probably this makes sense. But
> I think at least initial changes should keep the current behaviour
> (assuming this behaviour is fixed).

But if you make the other change but not this one, we end up with
ptrace which doesn't notify the ptracer what's going on.  Apart from
_polling_ /proc/tid/status, there is no mechanism to discover the
tracee's state.  The only thing it achieves is the integrity of group
stop and I'm not really sure whether that's something worth achieving
at the cost of debugging capabilities especially when we don't _have_
to lose them.

> > But, I don't think this really changes the need for state tracking.
> > We would still have to put the tracee into appropriate mode on detach.
> 
> Sure, but we already have SIGNAL_STOP_STOPPED/group_signal_stop. I meant,
> we do not need to remember the state per-thread.

Yeah, yeah, I was still thinking about allowing PTRACE_CONT in which
case we need to keep track of who did what.

> As for SIGCONT. Roland suggests (roughly) to change ptrace_resume()
> so that it doesn't wakeup the stopped tracee until the real SIGCONT
> comes. And this makes sense to me.

I agree it adds more integrity to group stop but at the cost of
debugging capabilities.  I'm not yet convinced integrity of group stop
is that important.  Why is it such a big deal?

> 	> For example, if a gdb traced task is instructed to participate in a
> 	> group stop and then hits a ptrace trap, it would have to participate
> 	> in the group stop as it enters ptrace trap, right?  gdb's wait(2)
> 	> would complete indicating ptrace trap.  After the user tells the task
> 	> to continue, the task shouldn't resume until SIGCONT is received;
> 
> Yes. But to me, this looks correct! The tracee shouldn't resume exactly
> because it is stopped.
> 
> 	> however, at this point, there's no way for gdb to tell what's going on
> 	> with the tracee.
> 
> Yes. I think this should be improved somehow, currently gdb can only
> look in /proc/tid/status to detect this case.
> 
> 	> If ptrace behaved like that from the beginning, gdb would have behaved
> 	> differently and worked around those cases but that hasn't been the
> 	> case
> 
> Cough... I thought we agreed it is better to break some corner cases
> but make ptrace more consistent ;)

Yeah, yeah, sure.  I just don't agree it is a corner case that we can
change.  It looks like a quite fundamental assumption/expectation to
me and changing it takes away existing debugging capabilities.  It's
something people would actually be using / used to.

> But yes, I see your point. And while I think that Roland's suggestion is
> fine, I also have another proposal
> 
> 	- never send CLD_CONTINUED to the tracer, always send it to parent.
> 
> 	  Firstly, this is completely pointless: ptrace is per-thread, while
> 	  this notification is per-process

CLD_STOPPED is too but while ptrace is attached the notifications are
made per-task and delivered to the tracer.  But, if there's no side
effect to worry about, sure.

> 	- change do_wait() so that WCONTINUED for real_parent
> 
> 	- change ptrace_resume() to check SIGNAL_STOP_STOPPED case. It should
> 	  act as SIGCONT in this case. Yes: "act as SIGCONT" needs more
> 	  discussion.

I don't know.  I think this is the key question.  Whether to de-throne
PTRACE_CONT such that it cannot override group stop.  As I've already
said several times, I think it is a pretty fundamental
property of ptrace and change to it would be quite visible, in
negative way, from userland.  Furthermore, I think actually the
current behavior is more desirable, not from the POV of group stop
integrity but from that of debugging capabilities.  I don't think
group stop integrity is that important as long as we can keep the
state well defined while ptraced && consistent after the debugger is
gone.

Thanks.

-- 
tejun

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-07 16:31                           ` Tejun Heo
@ 2011-02-07 17:48                             ` Oleg Nesterov
  2011-02-09 14:18                               ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-07 17:48 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/07, Tejun Heo wrote:
>
> Hey, Oleg.
>
> On Mon, Feb 07, 2011 at 04:37:23PM +0100, Oleg Nesterov wrote:
> >
> > 	> Yes, I do have some other ideas.  When a ptraced task gets a stop
> > 	> signal, its delivery is controlled by the tracer, right?
> >
> > Right, but note that the tracer does not fully control the group-stop.
> > Once a thread dequeues SIGSTOP (and please note this thread can be !traced),
> > all other threads (traced or not) should participate.
>
> I don't know.  Maybe it's more consistent that way and I'm not
> fundamentally against that but it is a big behavior change

Hmm. I tried to describe the current behaviour...

> > > Notifying the parent w/o making group stop superior to ptrace sure is
> > > a possibility.
> >
> > Could you please reiterate? I think I missed something before, and
> > now I do not really understand what you mean.
>
> I was trying to say that it's still possible to deliver group stop
> notifications to the real parent while letting the ptracer override
> group stop with PTRACE_CONT.

Do you mean the current code? Yes, this is possible. And yes, this
doesn't look good. PTRACE_CONT should either notify the real parent
or not resume the tracee.

> > > A ptraced task stops in TASK_TRACED.
> >
> > Unless it reacts to SIGSTOP/group_stop_count.
>
> What do you do about PTRACE requests while a task is group stopped?
> Reject them?  Block them?

Yes, another known oddity. Of course we shouldn't reject or block.
Perhaps we can ignore this initially. If SIGCONT comes after another
request does STOPPED/TRACED it clears SIGNAL_STOP_STOPPED, but the
tracee won't run until the next PTRACE_CONT, this makes sense.

The problem is, gdb can't leave the tracee in STOPPED state if it
wants. We need to improve this somehow (like in your previous example
with gdb).

> > > I agree that what
> > > you're describing seems like a good compromise.  What I was objecting
> > > to was putting group stop mechanism in general on top of ptrace.  I
> > > can't see how that would work.
> >
> > And I still can't understand why this can't work ;)
> >
> > And I don't really understand "putting group stop mechanism in general
> > on top of ptrace". It is very possible I am wrong, but I see this from
> > the different angle: stop/ptrace should be "parallel".
>
> Hmmm... currently ptrace overrides group stop and has full control
> over when and where the tracee stops and continues,

Only if it attaches to every thread in the thread group. Otherwise,
if a non-traced thread has already initiated the group-stop, the tracee
will notice TIF_SIGPENDING eventually and call do_signal_stop(),
debugger can't control this.

> I don't think it's an extreme corner case
> we can break.  For example, if a user gdb's a program which raises one
> of the stop signals, currently the user expects to be able to continue
> the program from within the gdb.  If we make group stop override
> ptrace, there's no other recourse than sending signal from outside.

Yes. Of course, gdb can be "fixed", it can send SIGCONT.

But yes, this is the noticeable change, that is why I suggested
ptrace_resume-acts-as-SIGCONT logic. Ugly, yes, but more or less
compatible. (although let me repeat, _personally_ I'd prefer to
simply tell user-space to learn the new rules ;)

> > > I think it should only include the case where the
> > > tracee actually stops for group stop excluding all other trapping
> > > points.
> >
> > I was thinking about this too and probably this makes sense. But
> > I think at least initial changes should keep the current behaviour
> > (assuming this behaviour is fixed).
>
> But if you make the other change but not this one, we end up with
> ptrace which doesn't notify the ptracer what's going on.  Apart from
> _polling_ /proc/tid/status, there is no mechanism to discover the
> tracee's state.

(Don't forget, ptrace_stop() should be fixed to notify and set
 SIGNAL_STOP_STOPPED if needed. Damn, it is not exactly trivial as
 I thought initially... but hopefully possible anyway)

If gdb attaches to all threads it can detect this case, otherwise
it doesn't fully control the group-stop anyway.

But,

> The only thing it achieves is the integrity of group
> stop

Given that SIGCHLD doesn't queue and with or without your changes
we send it per-thread, it is not trivial for gdb to detect the
group-stop anyway. Again, the kernel should help somehow.
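
The "doesn't queue" part can be seen from userspace with any standard
signal.  The following is a minimal sketch in Python, using SIGUSR1 as a
stand-in for SIGCHLD (an assumption made only to keep the example
self-contained; it needs Python 3.8 or later for signal.raise_signal()).
Two instances raised while the signal is blocked are merged into a
single pending one.

```python
import signal

received = []

def handler(signum, frame):
    # Record every delivery that actually reaches the handler.
    received.append(signum)

signal.signal(signal.SIGUSR1, handler)

# Block the signal so that both instances become pending before delivery.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
signal.raise_signal(signal.SIGUSR1)
signal.raise_signal(signal.SIGUSR1)

# Unblocking delivers the pending signal.  The second instance was merged
# into the first, so the handler runs only once.
signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})

print(len(received))
```

A tracer watching several threads therefore cannot count SIGCHLDs; it has
to keep calling wait until nothing more is reported.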

> and I'm not really sure whether that's something worth achieving
> at the cost of debugging capabilities especially when we don't _have_
> to lose them.

But we do not? I mean, at least this is not worse than the current
behaviour.

> > As for SIGCONT. Roland suggests (roughly) to change ptrace_resume()
> > so that it doesn't wakeup the stopped tracee until the real SIGCONT
> > comes. And this makes sense to me.
>
> I agree it adds more integrity to group stop but at the cost of
> debugging capabilities.  I'm not yet convinced integrity of group stop
> is that important.  Why is it such a big deal?

Of course I can't "prove" it is that important. But I think so.

> > But yes, I see your point. And while I think that Roland's suggestion is
> > fine, I also have another proposal
> >
> > 	- never send CLD_CONTINUED to the tracer, always send it to parent.
> >
> > 	  Firstly, this is completely pointless: ptrace is per-thread, while
> > 	  this notification is per-process
>
> CLD_STOPPED is too but while ptrace is attached the notifications are
> made per-task and delivered to the tracer.

No, there is a difference. Sure, CLD_STOPPED is per-process without
ptrace. But CLD_CONTINUED continues to be per-process even if all
threads are traced.

> > 	- change do_wait() so that WCONTINUED for real_parent
> >
> > 	- change ptrace_resume() to check SIGNAL_STOP_STOPPED case. It should
> > 	  act as SIGCONT in this case. Yes: "act as SIGCONT" needs more
> > 	  discussion.
>
> I don't know.

Heh, if only I could say I know ;)

> I think this is the key question.  Whether to de-throne
> PTRACE_CONT such that it cannot override group stop.  As I've already
> said several times, I think it is a pretty fundamental
> property of ptrace

Again, I am a bit confused. Note that PTRACE_CONT overrides
group stop if we do the above. It should wake up the tracee, in
SIGCONT-compatible way (yes, the latter is not exactly clear).

But at least this should be visible to real parent. We shouldn't
silently make the stopped tracee running while its real_parent
thinks everything is stopped.

> and change to it would be quite visible,

If you meant Roland's suggestion (PTRACE_CONT doesn't wakeup
but needs SIGCONT) - yes.

> in
> negative way, from userland.

At least, in a non-compatible way.

Oleg.


* Re: Bash not reacting to Ctrl-C
  2011-02-07 13:08         ` Oleg Nesterov
@ 2011-02-09  6:17           ` Michael Witten
  2011-02-09 14:53             ` Ingo Molnar
  2011-02-11 14:41           ` Pavel Machek
  1 sibling, 1 reply; 160+ messages in thread
From: Michael Witten @ 2011-02-09  6:17 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Tejun Heo, roland, jan.kratochvil, linux-kernel,
	torvalds, akpm, Peter Zijlstra, Thomas Gleixner,
	Frédéric Weisbecker

On Mon, Feb 7, 2011 at 07:08, Oleg Nesterov <oleg@redhat.com> wrote:
> Now that it is clear what happens, the test-case becomes even more
> trivial:
>
>        bash-4.1$ ./bash -c 'while true; do /bin/true; done'
>        ^C^C
>
> needs 4-5 attempts on my machine.

I feel like the odd penguin out.

I can't reproduce the behavior in question when using that example (I
haven't tried the other).

I'm running:

    * bash version 4.1.9(2)-release (i686-pc-linux-gnu)

    * linux 2.6.38-rc4 (100b33c8bd8a3235fd0b7948338d6cbb3db3c63d)

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-07 17:48                             ` Oleg Nesterov
@ 2011-02-09 14:18                               ` Tejun Heo
  2011-02-09 14:21                                 ` Tejun Heo
                                                   ` (2 more replies)
  0 siblings, 3 replies; 160+ messages in thread
From: Tejun Heo @ 2011-02-09 14:18 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

Hello, Oleg.

On Mon, Feb 07, 2011 at 06:48:21PM +0100, Oleg Nesterov wrote:
> > I don't know.  Maybe it's more consistent that way and I'm not
> > fundamentally against that but it is a big behavior change
> 
> Hmm. I tried to describe the current behaviour...

We can make it behave like the following.  { | } denotes two
alternative behaviors regarding SIGCONT.

  If a group stop is initiated while, or was in progress when a task
  is ptraced, the task will stop for group stop and notify the ptracer
  accordingly.  Note that the task could be trapped elsewhere delaying
  this from happening.  When the task stops for group stop, it
  participates in group stop as if it is not ptraced and the real
  parent is notified of group stop completion.

  Note that { the task is put into TASK_TRACED state and group stop
  resume by SIGCONT is ignored. | the task is put into TASK_STOPPED
  state and the following PTRACE request will transition it into
  TASK_TRACED.  If SIGCONT is received before transition to
  TASK_TRACED is made, the task will resume execution.  If PTRACE
  request races with SIGCONT, PTRACE request may fail. }

  The ptracer may resume execution of the task using PTRACE_CONT
  without affecting other tasks in the group.  The task will not stop
  for the same group stop again while ptraced.

  On ptrace detach, if group stop is in effect, the task will be put
  into TASK_STOPPED state and if it is the first time the task is
  stopping for the current group stop, it will participate in group
  stop completion.

This can be phrased better but it seems well defined enough for me.  I
take it that one of your concerns is direct transition into
TASK_TRACED on group stop while ptraced which prevents the tracee from
reacting to the following SIGCONT.  I'm not sure how much of an actual
problem it is given that our notification to real parent hasn't worked
at all till now but we can definitely implement proper TASK_STOPPED ->
TRACED transition on the next PTRACE request.  There exists a
fundamental race condition between SIGCONT and the next PTRACE call
but I don't think it's such a big deal as long as the transition
itself is done properly.
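
The two alternatives in braces above can be compared with a toy state
machine.  This is a sketch in Python, not kernel code: the event names,
the run() helper and the table layout are made up for illustration, and
only the transitions spelled out in the description are listed.

```python
RUNNING, STOPPED, TRACED = "RUNNING", "TASK_STOPPED", "TASK_TRACED"

# First alternative: group stop puts the ptraced task straight into
# TASK_TRACED and a later SIGCONT is ignored.
DIRECT_TRACED = {
    (RUNNING, "group_stop"): TRACED,
    (TRACED, "SIGCONT"): TRACED,
    (TRACED, "PTRACE_CONT"): RUNNING,
}

# Second alternative: group stop puts the task into TASK_STOPPED, the next
# ptrace request moves it to TASK_TRACED, and SIGCONT arriving before that
# request resumes the task.
VIA_STOPPED = {
    (RUNNING, "group_stop"): STOPPED,
    (STOPPED, "ptrace_request"): TRACED,
    (STOPPED, "SIGCONT"): RUNNING,
    (TRACED, "PTRACE_CONT"): RUNNING,
}

def run(table, events, state=RUNNING):
    """Apply events in order; KeyError means the description does not cover the case."""
    for event in events:
        state = table[(state, event)]
    return state

if __name__ == "__main__":
    print(run(DIRECT_TRACED, ["group_stop", "SIGCONT"]))
    print(run(VIA_STOPPED, ["group_stop", "SIGCONT"]))
    print(run(VIA_STOPPED, ["group_stop", "ptrace_request", "PTRACE_CONT"]))
```

The race is visible in the second table: after "group_stop", whichever of
"SIGCONT" and "ptrace_request" comes first decides whether the task ends
up running or in TASK_TRACED.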

If we don't go that route, another solution would be to add a ptrace
call which can listen to SIGCONT.  ie. PTRACE_WAIT_CONT or whatever
which the ptracer can call once it knows the tracee entered group
stop.

In either case, the fundamentals of ptrace operation don't really
change.  All ptrace operations are still per-task and ptracer almost
always has control over execution of the tracee.  Sure, it allows
ptraced task to escape group stop but it seems defined clear enough
and IMHO actually is a helpful debugging feature.  After all, it's not
like stop signals can be used for absolutely reliable job control.
There's an inherent race against SIGCONT.

> > What do you do about PTRACE requests while a task is group stopped?
> > Reject them?  Block them?
> 
> Yes, another known oddity. Of course we shouldn't reject or block.
> Perhaps we can ignore this initially. If SIGCONT comes after another
> request does STOPPED/TRACED it clears SIGNAL_STOP_STOPPED, but the
> tracee won't run until the next PTRACE_CONT, this makes sense.

That conceptually might make sense but other than the conceptual
integrity it widely changes the assumptions and is less useful than
the current behavior.  I don't really see why we would want to do
that.

> The problem is, gdb can't leave the tracee in STOPPED state if it
> wants. We need to improve this somehow (like in your previous example
> with gdb).
>
> Only if it attaches to every thread in the thread group. Otherwise,
> if a non-traced thread has already initiated the group-stop, the tracee
> will notice TIF_SIGPENDING eventually and call do_signal_stop(),
> debugger can't control this.

The debugger is still notified and can override it.  gdb already can
and does.

> > I don't think it's an extreme corner case
> > we can break.  For example, if a user gdb's a program which raises one
> > of the stop signals, currently the user expects to be able to continue
> > the program from within the gdb.  If we make group stop override
> > ptrace, there's no other recourse than sending signal from outside.
> 
> Yes. Of course, gdb can be "fixed", it can send SIGCONT.
> 
> But yes, this is the noticeable change, that is why I suggested
> ptrace_resume-acts-as-SIGCONT logic. Ugly, yes, but more or less
> compatible. (although let me repeat, _personally_ I'd prefer to
> simply tell user-space to learn the new rules ;)

I can't really agree there.  First, to me, it seems like too radical a
change and secondly the resulting behavior might look conceptually
pleasing but is not as useful as the current one.  Why make a change
which results in reduced usefulness while noticeably breaking existing
users?

> > The only thing it achieves is the integrity of group
> > stop
> 
> Given that SIGCHLD doesn't queue and with or without your changes
> we send it per-thread, it is not trivial for gdb to detect the
> group-stop anyway. Again, the kernel should help somehow.

Hmmm?  Isn't this discoverable from the exit code from wait?
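
What wait reports for a single task can be decoded as below.  This is a
rough sketch in Python using the standard os.W* macros; describe() is a
hypothetical helper, and the demo only shows a plain job-control stop of
one child, not a ptrace stop and not a notification covering the whole
thread group.

```python
import os
import signal

def describe(status):
    """Classify a status word as returned by waitpid(2)."""
    if os.WIFSTOPPED(status):
        return "stopped by signal %d" % os.WSTOPSIG(status)
    if os.WIFCONTINUED(status):
        return "continued"
    if os.WIFSIGNALED(status):
        return "killed by signal %d" % os.WTERMSIG(status)
    if os.WIFEXITED(status):
        return "exited with code %d" % os.WEXITSTATUS(status)
    return "unknown"

if __name__ == "__main__":
    # Fork a child that stops itself, then observe the stop via waitpid().
    pid = os.fork()
    if pid == 0:
        os.kill(os.getpid(), signal.SIGSTOP)
        os._exit(0)
    _, status = os.waitpid(pid, os.WUNTRACED)
    print(describe(status))
    # Resume the child and collect its exit status.
    os.kill(pid, signal.SIGCONT)
    _, status = os.waitpid(pid, 0)
    print(describe(status))
```

The status says which signal stopped that one task; whether the rest of
the group has stopped as well is not part of it.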

> > and I'm not really sure whether that's something worth achieving
> > at the cost of debugging capabilities especially when we don't _have_
> > to lose them.
> 
> But we do not? I mean, at least this is not worse than the current
> behaviour.

I think it's worse.  With your changes, debuggers can't diddle the
tasks behind group stop's back which the current users already expect.

> > > As for SIGCONT. Roland suggests (roughly) to change ptrace_resume()
> > > so that it doesn't wakeup the stopped tracee until the real SIGCONT
> > > comes. And this makes sense to me.
> >
> > I agree it adds more integrity to group stop but at the cost of
> > debugging capabilities.  I'm not yet convinced integrity of group stop
> > is that important.  Why is it such a big deal?
> 
> Of course I can't "prove" it is that important. But I think so.

Heh, I'm asking for proof that it is more useful. :-) But I'm still
curious why you think it's important because the benefits aren't
apparent to me.  Roland and you seem to share this opinion without
much discussion so maybe I'm missing something?

> > CLD_STOPPED is too but while ptrace is attached the notifications are
> > made per-task and delivered to the tracer.
> 
> No, there is a difference. Sure, CLD_STOPPED is per-process without
> ptrace. But CLD_CONTINUED continues to be per-process even if all
> threads are traced.

Hmm... I need to think more about it.  I'm not fully following your
point.

> > I think this is the key question.  Whether to de-throne
> > PTRACE_CONT such that it cannot override group stop.  As I've already
> > said several times, I think it is a pretty fundamental
> > property of ptrace
> 
> Again, I am a bit confused. Note that PTRACE_CONT overrides
> group stop if we do the above. It should wake up the tracee, in
> SIGCONT-compatible way (yes, the latter is not exactly clear).

What do you mean?  Waking up in SIGCONT-compatible way?  Sending
SIGCONT ending the whole group stop?

> But at least this should be visible to real parent. We shouldn't
> silently make the stopped tracee running while its real_parent
> thinks everything is stopped.

I think maybe this is where our different POVs come from.  To me, it
isn't too objectionable to allow debuggers to diddle with tracees
behind the real parent's back.  In fact, it would be quite useful when
debugging job control related behaviors.  I wouldn't have much problem
accepting the other way around - ie. strict job control even while
being debugged, but given that it is already allowed and visible, I
fail to see why we should change the behavior.  It doesn't seem to
have enough benefits to warrant such visible change.

If I change the patchset such that group stop while ptraced first
enters TASK_STOPPED and then transitions into TASK_TRACED on the next
PTRACE call, would that be something you guys can agree on?  We would
still need to ask the tracee to move into TASK_TRACED and so there
will be race window which would be visible under very convoluted
situations but IMHO they truly are the extreme corner cases.

Thanks.

-- 
tejun

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-09 14:18                               ` Tejun Heo
@ 2011-02-09 14:21                                 ` Tejun Heo
  2011-02-09 21:25                                 ` Oleg Nesterov
  2011-02-13 22:25                                 ` Denys Vlasenko
  2 siblings, 0 replies; 160+ messages in thread
From: Tejun Heo @ 2011-02-09 14:21 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On Wed, Feb 09, 2011 at 03:18:03PM +0100, Tejun Heo wrote:
> Heh, I'm asking for proof that it is more useful. :-) But I'm still
          ^
         not
:-)

-- 
tejun

* Re: Bash not reacting to Ctrl-C
  2011-02-09  6:17           ` Michael Witten
@ 2011-02-09 14:53             ` Ingo Molnar
  2011-02-09 19:37               ` Michael Witten
  0 siblings, 1 reply; 160+ messages in thread
From: Ingo Molnar @ 2011-02-09 14:53 UTC (permalink / raw)
  To: Michael Witten
  Cc: Oleg Nesterov, Tejun Heo, roland, jan.kratochvil, linux-kernel,
	torvalds, akpm, Peter Zijlstra, Thomas Gleixner,
	Frédéric Weisbecker


* Michael Witten <mfwitten@gmail.com> wrote:

> On Mon, Feb 7, 2011 at 07:08, Oleg Nesterov <oleg@redhat.com> wrote:
> > Now that it is clear what happens, the test-case becomes even more
> > trivial:
> >
> >        bash-4.1$ ./bash -c 'while true; do /bin/true; done'
> >        ^C^C
> >
> > needs 4-5 attempts on my machine.
> 
> I feel like the odd penguin out.
> 
> I can't reproduce the behavior in question when using that example (I
> haven't tried the other).
> 
> I'm running:
> 
>     * bash version 4.1.9(2)-release (i686-pc-linux-gnu)
> 
>     * linux 2.6.38-rc4 (100b33c8bd8a3235fd0b7948338d6cbb3db3c63d)

Oleg provided another testcase; can you reproduce the Ctrl-C problem with
it?

#!/bin/bash

perl -we '$SIG{INT} = sub {exit}; sleep'

echo "Hehe, I am going to sleep after ^C"
sleep 100


Thanks,

	Ingo

* Re: Bash not reacting to Ctrl-C
  2011-02-09 14:53             ` Ingo Molnar
@ 2011-02-09 19:37               ` Michael Witten
  0 siblings, 0 replies; 160+ messages in thread
From: Michael Witten @ 2011-02-09 19:37 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Oleg Nesterov, Tejun Heo, roland, jan.kratochvil, linux-kernel,
	torvalds, akpm, Peter Zijlstra, Thomas Gleixner,
	Frédéric Weisbecker, bug-bash, Chet Ramey

On Wed, Feb 9, 2011 at 08:53, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Michael Witten <mfwitten@gmail.com> wrote:
>
>> On Mon, Feb 7, 2011 at 07:08, Oleg Nesterov <oleg@redhat.com> wrote:
>> > Now that it is clear what happens, the test-case becomes even more
>> > trivial:
>> >
>> >        bash-4.1$ ./bash -c 'while true; do /bin/true; done'
>> >        ^C^C
>> >
>> > needs 4-5 attempts on my machine.
>>
>> I feel like the odd penguin out.
>>
>> I can't reproduce the behavior in question when using that example (I
>> haven't tried the other).
>>
>> I'm running:
>>
>>     * bash version 4.1.9(2)-release (i686-pc-linux-gnu)
>>
>>     * linux 2.6.38-rc4 (100b33c8bd8a3235fd0b7948338d6cbb3db3c63d)
>
> Oleg provided another testcase; can you reproduce the Ctrl-C problem with
> it?
>
> #!/bin/bash
>
> perl -we '$SIG{INT} = sub {exit}; sleep'
>
> echo "Hehe, I am going to sleep after ^C"
> sleep 100
>
>
> Thanks,
>
>        Ingo
>

Yes, that requires me to press Ctrl-C twice in order to escape the
entire script. However, what do you expect the following to do:

#!/bin/bash

perl -we '$SIG{INT} = "IGNORE"; sleep 10'

echo "Hehe, I am going to sleep after ^C"
sleep 100

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-09 14:18                               ` Tejun Heo
  2011-02-09 14:21                                 ` Tejun Heo
@ 2011-02-09 21:25                                 ` Oleg Nesterov
  2011-02-13 23:01                                   ` Denys Vlasenko
  2011-02-14 14:50                                   ` Tejun Heo
  2011-02-13 22:25                                 ` Denys Vlasenko
  2 siblings, 2 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-09 21:25 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/09, Tejun Heo wrote:
>
> We can make it behave like the following.  { | } denotes two
> alternative behaviors regarding SIGCONT.
>
>   If a group stop is initiated while, or was in progress when a task
>   is ptraced, the task will stop for group stop and notify the ptracer
>   accordingly.  Note that the task could be trapped elsewhere delaying
>   this from happening.  When the task stops for group stop, it
>   participates in group stop as if it is not ptraced and the real
>   parent is notified of group stop completion.

OK,

>   Note that { the task is put into TASK_TRACED state and group stop
>   resume by SIGCONT is ignored. | the task is put into TASK_STOPPED
>   state and the following PTRACE request will transition it into
>   TASK_TRACED.  If SIGCONT is received before transition to
>   TASK_TRACED is made, the task will resume execution.  If PTRACE
>   request races with SIGCONT, PTRACE request may fail. }

To me, the first variant looks better. But, only because it is closer
to the current behaviour. I mean, it is better to change the things
incrementally.

But in the longer term - I do not know. Personally, I like the
TASK_STOPPED variant. To the point, I was thinking that (perhaps)
we can change ptrace_stop() so that it simply calls do_signal_stop()
if it notices ->group_stop_count != 0.

>   The ptracer may resume execution of the task using PTRACE_CONT
>   without affecting other tasks in the group.

And this is what I do not like. I just can't accept the fact there
is a running thread in the SIGNAL_STOP_STOPPED group.

But yes: this is what the current code does, I am not sure we can
change this, and both PTRACE_CONT-doesnt-resume-until-SIGCONT and
PTRACE_CONT-acts-as-SIGCONT are not "perfect" either.

>   On ptrace detach, if group stop is in effect, the task will be put
>   into TASK_STOPPED state and if it is the first time the task is
>   stopping for the current group stop, it will participate in group
>   stop completion.

Yes. (but depends on above).

> This can be phrased better but it seems well defined enough for me.  I
> take it that one of your concerns is direct transition into
> TASK_TRACED on group stop while ptraced which prevents the tracee from
> reacting to the following SIGCONT.

Yes,

> I'm not sure how much of an actual
> problem it is given that our notification to real parent hasn't worked
> at all till now

Yes! And this is a very good argument in favour of all your objections ;)

Yes, this doesn't work anyway. But I _think_ this is the bug, if
we are going to change this code we should fix this bug as well.

Again, again, this is very subjective, I agree.

> but we can definitely implement proper TASK_STOPPED ->
> TRACED transition on the next PTRACE request.

I guess, you mean that the current code bypasses the
ptrace_stop()->arch_ptrace_stop_needed() code while doing s/STOPPED/TRACED ?

Oh, currently I am ignoring this, my only concern is how this all
looks to userland. But this is a good point, and I have to admit
that I never realized this is just wrong. Yes, I agree, we should do
something, but this is not visible to user-space (except this should
fix the bug ;)

> There exists a
> fundamental race condition between SIGCONT and the next PTRACE call

Yes, and this race is already here, ptracer should take care.

> If we don't go that route, another solution would be to add a ptrace
> call which can listen to SIGCONT.  ie. PTRACE_WAIT_CONT or whatever
> which the ptracer can call once it knows the tracee entered group
> stop.

Perhaps... Or something else, but surely there is room for improvement.
Fortunately, changes like this are "safe". I mean, they can't break
anything. We should just try not to get them wrong ;)

> In either case, the fundamentals of ptrace operation don't really
> change.  All ptrace operations are still per-task and ptracer almost
> always has control over execution of the tracee.  Sure, it allows
> ptraced task to escape group stop but it seems defined clear enough
> and IMHO actually is a helpful debugging feature.

Heh, I think we found the place where we can't convince each other.
What if we toss a coin?

> After all, it's not
> like stop signals can be used for absolutely reliable job control.
> There's an inherent race against SIGCONT.

Sure, if we are talking about SIGCONT from "nowhere". But, in the same
way, ^Z is not reliable either.

> > > What do you do about PTRACE requests while a task is group stopped?
> > > Reject them?  Block them?
> >
> > Yes, another known oddity. Of course we shouldn't reject or block.
> > Perhaps we can ignore this initially. If SIGCONT comes after another
> > request does STOPPED/TRACED it clears SIGNAL_STOP_STOPPED, but the
> > tracee won't run until the next PTRACE_CONT, this makes sense.
>
> That conceptually might make sense

I only meant, this makes sense initially.

> but other than the conceptual
> integrity it widely changes the assumptions and is less useful than
> the current behavior.

Hmm, this is what we currently have?

> I don't really see why we would want to do
> that.

No, I think we do not really want this in the longer term. But I
can't say what exactly we want.

> > Only if it attaches to every thread in the thread group. Otherwise,
> > if a non-traced thread has already initiated the group-stop, the tracee
> > will notice TIF_SIGPENDING eventually and call do_signal_stop(),
> > debugger can't control this.
>
> The debugger is still notified and can override it.

Hmm... no, it can't? Of course it is notified after the tracee
participates and calls do_signal_stop() and gdb can resume it then.
But it can't prevent the tracee from stopping.

> > > I don't think it's an extreme corner case
> > > we can break.  For example, if a user gdb's a program which raises one
> > > of the stop signals, currently the user expects to be able to continue
> > > the program from within the gdb.  If we make group stop override
> > > ptrace, there's no other recourse than sending signal from outside.
> >
> > Yes. Of course, gdb can be "fixed", it can send SIGCONT.
> >
> > But yes, this is the noticeable change, that is why I suggested
> > ptrace_resume-acts-as-SIGCONT logic. Ugly, yes, but more or less
> > compatible. (although let me repeat, _personally_ I'd prefer to
> > simply tell user-space to learn the new rules ;)
>
> I can't really agree there.  First, to me, it seems like too radical a
> change

(I assume you mean PTRACE_CONT-doesnt-resume variant?)

> and secondly the resulting behavior might look conceptually
> pleasing but is not as useful as the current one.  Why make a change
> which results in reduced usefulness while noticeably breaking existing
> users?

I don't really agree with "not as useful", but this doesn't matter.
I agree with "noticeably breaking", this is enough. (assuming my
guess above is correct).

> > Given that SIGCHLD doesn't queue and with or without your changes
> > we send it per-thread, it is not trivial for gdb to detect the
> > group-stop anyway. Again, the kernel should help somehow.
>
> Hmmm?  Isn't this discoverable from the exit code from wait?

Sure. Probably I misunderstood. I thought, you mean we need something
like per-process "the whole group is stopped" notification for the
debugger.

> > > and I'm not really sure whether that's something worth achieving
> > > at the cost of debugging capabilities especially when we don't _have_
> > > to lose them.
> >
> > But we do not? I mean, at least this is not worse than the current
> > behaviour.
>
> I think it's worse.  With your changes, debuggers can't diddle the
> tasks behind group stop's back which the current users already expect.

OK, I certainly misunderstood you, and now I can't restore the context.
Could you spell it out?

> > > I agree it adds more integrity to group stop but at the cost of
> > > debugging capabilities.  I'm not yet convinced integrity of group stop
> > > is that important.  Why is it such a big deal?
> >
> > Of course I can't "prove" it is that important. But I think so.
>
> Heh, I'm not asking for proof that it is more useful. :-) But I'm still
> curious why you think it's important because the benefits aren't
> apparent to me.  Roland and you seem to share this opinion without
> much discussion so maybe I'm missing something?

I can't!

I have hated this ever since I noticed that the application doesn't
respond to ^Z under strace. And I was using strace exactly because I wanted
to debug some (I can't recall exactly which) problems with jctl. That is all.

But in any case. Some users run the services under ptrace. I mean,
the application is born, runs, and dies under ptrace. That is why personally
I certainly do not like anything which delays until detach (say,
the-tracee-doesnt-participate-in-group-stop-until-detach logic).

> > > CLD_STOPPED is too but while ptrace is attached the notifications are
> > > made per-task and delivered to the tracer.
> >
> > No, there is a difference. Sure, CLD_STOPPED is per-process without
> > ptrace. But CLD_CONTINUED continues to be per-process even if all
> > threads are traced.
>
> Hmm... I need to think more about it.  I'm not fully following your
> point.

This is simple. No matter how many threads we have, no matter how
many of them are ptraced, we send a single CLD_CONTINUED notification.
The only difference ptrace can make is: we look at ->group_leader
to decide who will get this notification.

> > > I think this is the key question.  Whether to de-throne
> > > PTRACE_CONT such that it cannot override group stop.  As I've already
> > > said several times already, I think it is a pretty fundamental
> > > property of ptrace
> >
> > Again, I am a bit confused. Note that PTRACE_CONT overrides
> > group stop if we do the above. It should wake up the tracee, in
> > SIGCONT-compatible way (yes, the latter is not exactly clear).
>
> What do you mean?  Waking up in SIGCONT-compatible way?  Sending
> SIGCONT ending the whole group stop?

Yes. I do not mean we should literally do send_sig_info(SIGCONT)
of course.

> > But at least this should be visible to real parent. We shouldn't
> > silently make the stopped tracee running while its real_parent
> > thinks everything is stopped.
>
> I think maybe this is where our different POVs come from.

Yes, probably.

> To me, it
> isn't too objectionable to allow debuggers to diddle with tracees
> behind the real parent's back.  In fact, it would be quite useful when
> debugging job control related behaviors.  I wouldn't have much problem
> accepting the other way around - ie. strict job control even while
> being debugged, but given that it is already allowed and visible, I
> fail to see why we should change the behavior.  It doesn't seem to
> have enough benefits to warrant such visible change.

All I can say is: sure, I see your point, and perhaps you are right
and I am wrong.

I'd really like to force CC list to participate ;)

> If I change the patchset such that group stop while ptraced first
> enters TASK_STOPPED and then transitions into TASK_TRACED on the next
> PTRACE call,

Again, I am not sure I understand what exactly you mean... If you
mean that it is wrong to simply change the state of the tracee in
ptrace_check_attach() without arch_ptrace_stop() - I agree, this
probably should be fixed.

I am wondering, if there is a simpler change... probably not.

But. this looks a bit off-topic (I mean, this looks orthogonal
to the other things we are discussing), or I missed something else?

> there will be race window which would be visible

Personally, I think this is fine.

Oleg.



* Re: Bash not reacting to Ctrl-C
  2011-02-07 13:08         ` Oleg Nesterov
  2011-02-09  6:17           ` Michael Witten
@ 2011-02-11 14:41           ` Pavel Machek
  1 sibling, 0 replies; 160+ messages in thread
From: Pavel Machek @ 2011-02-11 14:41 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Tejun Heo, roland, jan.kratochvil, linux-kernel,
	torvalds, akpm, Peter Zijlstra, Thomas Gleixner,
	Frédéric Weisbecker

Hi!

> > set_job_status_and_cleanup() notices wait_sigint_received and sends
> > SIGINT to itself (termsig_handler (SIGINT)), but somehow it assumes
> > that the last foreground job should be terminated by SIGINT too:
> >
> > 	 else if (wait_sigint_received && (WTERMSIG (child->status) == SIGINT) &&
> >
> > Then the next wait_for() clears wait_sigint_received and bash
> > loses ^C
> 
> IOW.
> 
> Now that it is clear what happens, the test-case becomes even more
> trivial:
> 
> 	bash-4.1$ ./bash -c 'while true; do /bin/true; done'
> 	^C^C
> 
> needs 4-5 attempts on my machine.

Huh, this happened to me so often that I assumed it was a feature
:-(. Reproducible on both UP ARM and 4-way x86...

Ok, it would be very good to get it fixed.


									Pavel

> --- bash-4.1/jobs.c~ctrlc_exit_race	2011-02-07 13:52:48.000000000 +0100
> +++ bash-4.1/jobs.c	2011-02-07 13:55:30.000000000 +0100
> @@ -3299,7 +3299,7 @@ set_job_status_and_cleanup (job)
>  	 signals are sent to process groups) or via kill(2) to the foreground
>  	 process by another process (or itself).  If the shell did receive the
>  	 SIGINT, it needs to perform normal SIGINT processing. */
> -      else if (wait_sigint_received && (WTERMSIG (child->status) == SIGINT) &&
> +      else if (wait_sigint_received /*&& (WTERMSIG (child->status) == SIGINT)*/ &&
>  	      IS_FOREGROUND (job) && IS_JOBCONTROL (job) == 0)
>  	{
>  	  int old_frozen;


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-04 14:48               ` Tejun Heo
  2011-02-04 17:06                 ` Oleg Nesterov
@ 2011-02-13 21:24                 ` Denys Vlasenko
  2011-02-14 15:06                   ` Oleg Nesterov
  1 sibling, 1 reply; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-13 21:24 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On Friday 04 February 2011 15:48, Tejun Heo wrote:
> > But even this doesn't matter. We can not change ptrace API so that,
> > say, it does not reparent the tracee. Once we do this, we already
> > have the new API.
> 
> I would argue that we can get by well enough by trimming and updating
> the current ptrace API.

In the past Roland wasn't very enthusiastic about changes which
were fixing ptrace's bugs-turned-features.

If you want to do that, you need to convince him to change
his position a bit.

For example, PTRACE_DETACH requires tracee to be stopped to succeed.
If debugger tries to detach while the tracee is running, it will get
an error. This forces debugger to do stupid things like sending SIGSTOP,
then waiting for tracee to stop, then doing PTRACE_DETACH, then
sending SIGCONT. Of course, while this dance is performed,
any SIGSTOPs/SIGCONTs which may be sent to the tracee by other processes
are totally disrupted.
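
The dance described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the tracee is single-threaded, the caller is already its tracer, and `detach_running_tracee()` is a name made up for this example.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Detach from a running, single-threaded tracee that the caller is
 * already tracing.  Returns 0 on success, -1 on failure.
 * Hypothetical helper for this example only.
 */
static int detach_running_tracee(pid_t pid)
{
        int status;

        /* 1. Force the tracee into a ptrace stop. */
        if (kill(pid, SIGSTOP) == -1)
                return -1;

        /* 2. Wait until the tracer is told about a SIGSTOP. */
        for (;;) {
                if (waitpid(pid, &status, 0) == -1)
                        return -1;
                if (!WIFSTOPPED(status))
                        return -1;      /* tracee exited or was killed */
                if (WSTOPSIG(status) == SIGSTOP)
                        break;
                /* Some other signal was reported first: pass it through. */
                if (ptrace(PTRACE_CONT, pid, NULL,
                           (void *)(long)WSTOPSIG(status)) == -1)
                        return -1;
        }

        /* 3. Detach, suppressing the reported SIGSTOP (data = 0). */
        if (ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1)
                return -1;

        /* 4. Make sure the former tracee is running again. */
        return kill(pid, SIGCONT);
}
```

Note that step 2 cannot tell the injected SIGSTOP apart from one sent by a third party, and step 4 ends any group stop somebody else asked for, which is the disruption complained about above.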

The natural (for me) fix is to make PTRACE_DETACH work even on running
tracee. It simply makes a lot of sense. Why on earth do we need tracee
to be stopped? There is no reason.

But this is a change in ptrace behavior, and therefore is not acceptable
for Roland.

Basically, we have slightly different ideas about what ptrace's development goals are.

From my POV, we want to have an strace tool which is *completely*, 100%
transparent for the traced process and its parent (sans decrease in speed).
No "vfork turns into fork under strace" (we had this some time ago),
no missed or misinterpreted SIGSTOPs or SIGTRAPs. Real parent
should still see its child stopping on ^Z even if strace -p PID
attached itself to the child. Etc etc etc.

If this can only be achieved by slightly changing (basically, fixing)
ptrace API, I'd go for it.

In many cases, Roland won't.

He is the maintainer. If you want changes which break some aspects of
current behavior, even quirky ones, you need to convince him.


> I could be wrong (with pretty high probability) but I don't really see
> the pressing need for a completely new API.  ptrace sure is ugly and
> quirky but it's something people are already used to.

-- 
vda


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-09 14:18                               ` Tejun Heo
  2011-02-09 14:21                                 ` Tejun Heo
  2011-02-09 21:25                                 ` Oleg Nesterov
@ 2011-02-13 22:25                                 ` Denys Vlasenko
  2011-02-14 15:13                                   ` Tejun Heo
  2011-02-14 15:31                                   ` Oleg Nesterov
  2 siblings, 2 replies; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-13 22:25 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On Wednesday 09 February 2011 15:18, Tejun Heo wrote:
> Hello, Oleg.
> 
> On Mon, Feb 07, 2011 at 06:48:21PM +0100, Oleg Nesterov wrote:
> > > I don't know.  Maybe it's more consistent that way and I'm not
> > > fundamentally against that but it is a big behavior change
> > 
> > Hmm. I tried to describe the current behaviour...
> 
> We can make it behave like the following.  { | } denotes two
> alternative behaviors regarding SIGCONT.
> 
>   If a group stop is initiated while, or was in progress when a task
>   is ptraced, the task will stop for group stop and notify the ptracer
>   accordingly.  Note that the task could be trapped elsewhere delaying
>   this from happening.  When the task stops for group stop, it
>   participates in group stop as if it is not ptraced and the real
>   parent is notified of group stop completion.
> 
>   Note that { the task is put into TASK_TRACED state and group stop
>   resume by SIGCONT is ignored. | the task is put into TASK_STOPPED
>   state and the following PTRACE request will transition it into
>   TASK_TRACED.  If SIGCONT is received before transition to
>   TASK_TRACED is made, the task will resume execution.  If PTRACE
>   request races with SIGCONT, PTRACE request may fail. }
> 
>   The ptracer may resume execution of the task using PTRACE_CONT
>   without affecting other tasks in the group.  The task will not stop
>   for the same group stop again while ptraced.
> 
>   On ptrace detach, if group stop is in effect, the task will be put
>   into TASK_STOPPED state and if it is the first time the task is
>   stopping for the current group stop, it will participate in group
>   stop completion.
> 
> This can be phrased better but it seems well defined enough for me.  I
> take it that one of your concerns is direct transition into
> TASK_TRACED on group stop while ptraced which prevents the tracee from
> reacting to the following SIGCONT.  I'm not sure how much of an actual
> problem it is given that our notification to real parent hasn't worked
> at all till now but we can definitely implement proper TASK_STOPPED ->
> TRACED transition on the next PTRACE request.  There exists a
> fundamental race condition between SIGCONT and the next PTRACE call
> but I don't think it's such a big deal as long as the transition
> itself is done properly.
> 
> If we don't go that route, another solution would be to add a ptrace
> call which can listen to SIGCONT.  ie. PTRACE_WAIT_CONT or whatever
> which the ptracer can call once it knows the tracee entered group
> stop.
> 
> In either case, the fundamentals of ptrace operation don't really
> change.  All ptrace operations are still per-task and ptracer almost
> always has control over execution of the tracee.  Sure, it allows
> ptraced task to escape group stop but it seems defined clear enough
> and IMHO actually is a helpful debugging feature.  After all, it's not
> like stop signals can be used for absolutely reliable job control.
> There's an inherent race against SIGCONT.
> 
> > > What do you do about PTRACE requests while a task is group stopped?
> > > Reject them?  Block them?
> > 
> > Yes, another known oddity. Of course we shouldn't reject or block.
> > Perhaps we can ignore this initially. If SIGCONT comes after another
> > request does STOPPED/TRACED it clears SIGNAL_STOP_STOPPED, but the
> > tracee won't run until the next PTRACE_CONT, this makes sense.
> 
> That conceptually might make sense but other than the conceptual
> integrity it widely changes the assumptions and is less useful than
> the current behavior.  I don't really see why we would want to do
> that.
> 
> > The problem is, gdb can't leave the tracee in STOPPED state if it
> > wants. We need to improve this somehow (like in your previous example
> > with gdb).
> >
> > Only if it attaches to every thread in the thread group. Otherwise,
> > if the non-thread has already initiated the group-stop, the tracee
> > will notice TIF_SIGPENDING eventually and call do_signal_stop(),
> > debugger can't control this.
> 
> The debugger is still notified and can override it.  gdb already can
> and does.
> 
> > > I don't think it's an extreme corner case
> > > we can break.  For example, if a user gdb's a program which raises one
> > > of the stop signals, currently the user expects to be able to continue
> > > the program from within the gdb.  If we make group stop override
> > > ptrace, there's no other recourse than sending signal from outside.
> > 
> > Yes. Of course, gdb can be "fixed", it can send SIGCONT.
> > 
> > But yes, this is the noticeable change, that is why I suggested
> > ptrace_resume-acts-as-SIGCONT logic. Ugly, yes, but more or less
> > compatible. (although let me repeat, _personally_ I'd prefer to
> > simply tell user-space to learn the new rules ;)
> 
> I can't really agree there.  First, to me, it seems like too radical a
> change and secondly the resulting behavior might look conceptually
> pleasing but is not as useful as the current one.  Why make a change
> which results in reduced usefulness while noticeably breaking existing
> users?
> 
> > > The only thing it achieves is the integrity of group
> > > stop
> > 
> > Given that SIGCHLD doesn't queue and with or without your changes
> > we send it per-thread, it is not trivial for gdb to detect the
> > group-stop anyway. Again, the kernel should help somehow.
> 
> Hmmm?  Isn't this discoverable from the exit code from wait?
> 
> > > and I'm not really sure whether that's something worth achieving
> > > at the cost of debugging capabilities especially when we don't _have_
> > > to lose them.
> > 
> > But we do not? I mean, at least this is not worse than the current
> > behaviour.
> 
> I think it's worse.  With your changes, debuggers can't diddle the
> tasks behind group stop's back which the current users already expect.

But this "diddling behind group stop's back" is exactly the current
problem with stop signals.

Here I try to stop a ptraced process:

$ strace -tt sleep 30
23:02:15.619262 execve("/bin/sleep", ["sleep", "30"], [/* 30 vars */]) = 0
...
23:02:15.622112 nanosleep({30, 0}, NULL) = ? ERESTART_RESTARTBLOCK (To be restarted)
23:02:23.781165 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
23:02:23.781251 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
    (I forgot again why we see it twice. Another quirk I guess...)
23:02:23.781310 restart_syscall(<... resuming interrupted call ...>) = 0
23:02:45.622433 close(1)                = 0
23:02:45.622743 close(2)                = 0
23:02:45.622885 exit_group(0)           = ?

Why didn't sleep stop?

Because PTRACE_SYSCALL brought the task out of group stop at once,
even though strace did try hard to not do so:

    ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP) <-- note SIGSTOP!

PTRACE_CONT in this situation would do the same.

You are saying that it is useful that gdb restarts a group-stopped task
with a mere PTRACE_CONT. Above is a counter-example where it is anti-useful:
I would much prefer strace to see the task sit stopped until it gets SIGCONT
(or some fatal signal).

Why can't gdb use SIGCONT instead of PTRACE_CONT, just like every
other tool which needs to resume stopped tasks?


Hmm... it occurred to me that we can use 4th argument of
PTRACE_CONT/SYSCALL to distinguish between the case when tracer wants
to leave tracee stopped (pass SIGSTOP), or wants it to be resumed
(pass 0), or even wants to simulate "real" signal
arriving (pass other SIGfoo).

This will automagically fix "strace sleep" case, because
strace already is sending SIGSTOP in PTRACE_SYSCALL (currently,
in vain).
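
To make the idea concrete, the proposed meaning of the 4th argument could be modelled like this. It is purely a sketch of the proposal, not existing kernel behavior; `resume_action` and `classify_resume_data()` are hypothetical names.

```c
#include <signal.h>

/* Hypothetical outcomes of PTRACE_CONT/SYSCALL under the proposal. */
enum resume_action {
        RESUME_RUN,             /* data == 0: resume, inject nothing */
        RESUME_STAY_STOPPED,    /* data == SIGSTOP: stay in group stop */
        RESUME_DELIVER_SIGNAL   /* any other signal: deliver it */
};

/* Map the tracer's data argument to what the tracee would do. */
static enum resume_action classify_resume_data(int data)
{
        if (data == 0)
                return RESUME_RUN;
        if (data == SIGSTOP)
                return RESUME_STAY_STOPPED;
        return RESUME_DELIVER_SIGNAL;
}
```

With such a rule, strace's existing ptrace(PTRACE_SYSCALL, pid, 0x1, SIGSTOP) would map to RESUME_STAY_STOPPED, while a tracer passing 0 would get the task running.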

-- 
vda


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-09 21:25                                 ` Oleg Nesterov
@ 2011-02-13 23:01                                   ` Denys Vlasenko
  2011-02-14  9:03                                     ` Jan Kratochvil
  2011-02-14 15:51                                     ` Oleg Nesterov
  2011-02-14 14:50                                   ` Tejun Heo
  1 sibling, 2 replies; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-13 23:01 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On Wednesday 09 February 2011 22:25, Oleg Nesterov wrote:
> >   Note that { the task is put into TASK_TRACED state and group stop
> >   resume by SIGCONT is ignored. | the task is put into TASK_STOPPED
> >   state and the following PTRACE request will transition it into
> >   TASK_TRACED.  If SIGCONT is received before transition to
> >   TASK_TRACED is made, the task will resume execution.  If PTRACE
> >   request races with SIGCONT, PTRACE request may fail. }
> 
> To me, the first variant looks better. But, only because it is closer
> to the current behaviour. I mean, it is better to change the things
> incrementally.
> 
> But in the longer term - I do not know. Personally, I like the
> TASK_STOPPED variant. To the point, I was thinking that (perhaps)
> we can change ptrace_stop() so that it simply calls do_signal_stop()
> if it notices ->group_stop_count != 0.
> 
> >   The ptracer may resume execution of the task using PTRACE_CONT
> >   without affecting other tasks in the group.
> 
> And this is what I do not like. I just can't accept the fact there
> is a running thread in the SIGNAL_STOP_STOPPED group.
> 
> But yes: this is what the current code does, I am not sure we can
> change this, and both PTRACE_CONT-doesnt-resume-until-SIGCONT and
> PTRACE_CONT-acts-as-SIGCONT are not "perfect" too.

Can you enumerate the reasons why each of them is not perfect?
I want to understand your thinking better here.


> > There exists a
> > fundamental race condition between SIGCONT and the next PTRACE call
> 
> Yes, and this race is already here, ptracer should take care.

From the API POV, there is no race, if we assume Oleg's interpretation
that "stopped/not-stopped" and "traced/not-traced" states are
completely orthogonal:

As long as task is in "traced" state and it is in ptrace-stop, SIGCONT
delivered to it does not make it run. Only next PTRACE_CONT (or SYSCALL)
will. Neither will SIGCONT delivered to any other thread group member:
even though this will terminate group-stop state and all untraced
tasks will start running, all tasks which are in ptrace-stop will not:
they will wait for the next PTRACE_CONT (or SYSCALL).

I realize that currently it doesn't work like this, because
group-stop and ptrace-stop are intermingled concepts right now.
My point is, it can be made to work that way, and become free
of this particular race.
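
A toy model of this "orthogonal states" interpretation, just to pin down what I mean. It describes the proposed semantics, not what the kernel does today, and every name in it (`task_model`, `model_sigcont()`, `model_ptrace_cont()`, `task_runs()`) is invented for the example.

```c
#include <stdbool.h>

/* Two independent flags instead of one intermingled state. */
struct task_model {
        bool group_stopped;     /* set by a stop signal, cleared by SIGCONT */
        bool ptrace_stopped;    /* set on ptrace-stop, cleared by the tracer */
};

/* A task runs only when neither kind of stop holds it. */
static bool task_runs(const struct task_model *t)
{
        return !t->group_stopped && !t->ptrace_stopped;
}

/* SIGCONT ends the group stop but leaves a ptrace-stop untouched. */
static void model_sigcont(struct task_model *t)
{
        t->group_stopped = false;
}

/* PTRACE_CONT/SYSCALL ends the ptrace-stop but not the group stop. */
static void model_ptrace_cont(struct task_model *t)
{
        t->ptrace_stopped = false;
}
```

In this model the order of SIGCONT and PTRACE_CONT does not matter, which is why the race goes away.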


> > In either case, the fundamentals of ptrace operation don't really
> > change.  All ptrace operations are still per-task and ptracer almost
> > always has control over execution of the tracee.  Sure, it allows
> > ptraced task to escape group stop but it seems defined clear enough
> > and IMHO actually is a helpful debugging feature.
> 
> Heh, I think we found the place where we can't convince each other.
> What if we toss a coin?

I'm with Oleg on this. If debugger wants to terminate group-stop,
it should just send SIGCONT, not depend on the obscure feature (it is not
documented, right?) that PTRACE_CONT somehow affects group-stop state.


> > > > What do you do about PTRACE requests while a task is group stopped?
> > > > Reject them?  Block them?
> > >
> > > Yes, another known oddity. Of course we shouldn't reject or block.

Why do they need to be rejected or blocked? Think again about
"strace sleep" interrupted by SIGSTOP (or SIGTSTP):

* sleep runs in nanosleep
* SIGSTOP arrives, strace sees it
* strace logs it and allows it via ptrace(PTRACE_SYSCALL, ..., SIGSTOP)
* sleep process enters group-stop
* nothing happens until some other signal arrives
* say, SIGCONT arrives
* strace logs it and allows it via ptrace(PTRACE_SYSCALL, ..., SIGCONT)

I believe your question is "what if tracer wants to do a ptrace op
on tracee while it is in group-stop" (step 4 above)?

The answer is simple:
the same as if tracer wants to do a ptrace op on tracee while it is running,
that is - ptrace() should return error. For the tracer (in my example,
strace) there is no difference in state after ptrace(PTRACE_SYSCALL, ..., SIGSTOP)
and ptrace(PTRACE_SYSCALL, ..., <0 or SIGWINCH or any other sig>):
in both cases tracer must wait for tracee to enter ptrace-stop before
any ptrace op is allowed.

Jan, from gdb developer's POV, do you have a problem with this?


> > Heh, I'm not asking for proof that it is more useful. :-) But I'm still
> > curious why you think it's important because the benefits aren't
> > apparent to me.  Roland and you seem to share this opinion without
> > much discussion so maybe I'm missing something?
> 
> I can't!
> 
> I hate this from the time when I noticed that the application doesn't
> respond to ^Z under strace. And I used strace exactly because I wanted
> to debug some (I can't recall exactly) problems with jctl. That is all.

Recently I had exactly this experience too. It's frustrating.


> > To me, it
> > isn't too objectionable to allow debuggers to diddle with tracees
> > behind the real parent's back.  In fact, it would be quite useful when
> > debugging job control related behaviors.  I wouldn't have much problem
> > accepting the other way around - ie. strict job control even while
> > being debugged, but given that it is already allowed and visible, I
> > fail to see why we should change the behavior.  It doesn't seem to
> > have enough benefits to warrant such visible change.
> 
> All I can say is: sure, I see your point, and perhaps you are right
> and I am wrong.
> 
> I'd really like to force CC list to participate ;)

You just succeeded :)

-- 
vda


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-13 23:01                                   ` Denys Vlasenko
@ 2011-02-14  9:03                                     ` Jan Kratochvil
  2011-02-14 11:39                                       ` Denys Vlasenko
                                                         ` (2 more replies)
  2011-02-14 15:51                                     ` Oleg Nesterov
  1 sibling, 3 replies; 160+ messages in thread
From: Jan Kratochvil @ 2011-02-14  9:03 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Oleg Nesterov, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On Mon, 14 Feb 2011 00:01:47 +0100, Denys Vlasenko wrote:
> * sleep runs in nanosleep
> * SIGSTOP arrives, strace sees it
> * strace logs it and allows it via ptrace(PTRACE_SYSCALL, ..., SIGSTOP)
> * sleep process enters group-stop

The last point breaks the documented behavior of ptrace:
	If data is nonzero and not SIGSTOP, it is interpreted as a signal to
	be delivered to the child; otherwise, no signal is delivered.

I do not see that it would affect gdb.  strace will change its behavior when
SIGSTOP is sent to its tracee although the new behavior may be OK.

It is more a subject of apps compatibility testing with such a kernel change.


> * nothing happens until some other signal arrives
> * say, SIGCONT arrives

What if another signal arrives?  The tracer probably should not be notified as
the tracee is in a group-stop.


> * strace logs it and allows it via ptrace(PTRACE_SYSCALL, ..., SIGCONT)


Thanks,
Jan


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14  9:03                                     ` Jan Kratochvil
@ 2011-02-14 11:39                                       ` Denys Vlasenko
  2011-02-14 17:32                                         ` Oleg Nesterov
  2011-02-14 16:01                                       ` Oleg Nesterov
  2011-02-26  3:59                                       ` Pavel Machek
  2 siblings, 1 reply; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-14 11:39 UTC (permalink / raw)
  To: Jan Kratochvil
  Cc: Oleg Nesterov, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On Monday 14 February 2011 10:03, Jan Kratochvil wrote:
> On Mon, 14 Feb 2011 00:01:47 +0100, Denys Vlasenko wrote:
> > * sleep runs in nanosleep
> > * SIGSTOP arrives, strace sees it
> > * strace logs it and allows it via ptrace(PTRACE_SYSCALL, ..., SIGSTOP)
> > * sleep process enters group-stop
> 
> The last point breaks the documented behavior of ptrace:
> 	If data is nonzero and not SIGSTOP, it is interpreted as a signal to
> 	be delivered to the child; otherwise, no signal is delivered.

But SIGSTOP _is_ delivered - that's why sleep process stops.

> I do not see that it would affect gdb.  strace will change its behavior when
> SIGSTOP is sent to its tracee although the new behavior may be OK.
> 
> It is more a subject of apps compatibility testing with such a kernel change.
> 
> 
> > * nothing happens until some other signal arrives
> > * say, SIGCONT arrives
> 
> What if another signal arrives?  The tracer probably should not be notified as
> the tracee is in a group-stop.

The behavior here ideally should be the same as for non-traced process:
the signals are remembered while process is stopped, and it sees them
only after SIGCONT, as demonstrated by the following program

#include <errno.h>
#include <string.h>
#include <stdio.h>
#include <signal.h>
#include <unistd.h>

/* Report each delivered signal; errno is saved and restored around the I/O. */
static void sig(int n)
{
        char buf[128];
        int e = errno;
        sprintf(buf, "sig: %d %s\n", n, strsignal(n));
        write(1, buf, strlen(buf));
        errno = e;
}
int main(void)
{
        /* SIGSTOP cannot be caught, so this call fails and has no effect. */
        signal(SIGSTOP, sig);
        signal(SIGCONT, sig);
        signal(SIGWINCH, sig);
        signal(SIGABRT, sig);
 again:
        printf("PID: %d\n", getpid());
        fflush(NULL);
        errno = 0;
        sleep(30);
        int e = errno;
        printf("after sleep: errno=%d %s\n", e, strerror(e));
        /* Restart the sleep if it was interrupted by a signal. */
        if (e) goto again;
        return 0;
}

# ./a.out
PID: 16382
  <------ kill -STOP 16382
  <------ kill -ABRT 16382
  <------ kill -WINCH 16382
  <------ kill -CONT 16382
sig: 28 Window changed
sig: 18 Continued
sig: 6 Aborted
after sleep: errno=4 Interrupted system call
PID: 16382


I believe it would be best if debugger sees signals immediately,
but when it does ptrace(PTRACE_CONT/SYSCALL, ..., <sig>)
in order to send signals to group-stopped tracee, they are queued
to it without terminating group-stop. When SIGCONT arrives,
ptrace(PTRACE_CONT/SYSCALL, ..., SIGCONT) terminates group-stop
and causes all queued signals to be handled (in random order,
not necessarily in the order of arrival. Even CONT handler is
not guaranteed to be called first, as you see above).
 
-- 
vda


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-09 21:25                                 ` Oleg Nesterov
  2011-02-13 23:01                                   ` Denys Vlasenko
@ 2011-02-14 14:50                                   ` Tejun Heo
  2011-02-14 18:53                                     ` Oleg Nesterov
  1 sibling, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-14 14:50 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

Hello, Oleg.  Sorry about the delay.

On Wed, Feb 09, 2011 at 10:25:26PM +0100, Oleg Nesterov wrote:
> >   Note that { the task is put into TASK_TRACED state and group stop
> >   resume by SIGCONT is ignored. | the task is put into TASK_STOPPED
> >   state and the following PTRACE request will transition it into
> >   TASK_TRACED.  If SIGCONT is received before transition to
> >   TASK_TRACED is made, the task will resume execution.  If PTRACE
> >   request races with SIGCONT, PTRACE request may fail. }
> 
> To me, the first variant looks better. But, only because it is closer
> to the current behaviour. I mean, it is better to change the things
> incrementally.

Alright, it's the first variant then.

> But in the longer term - I do not know. Personally, I like the
> TASK_STOPPED variant. To the point, I was thinking that (perhaps)
> we can change ptrace_stop() so that it simply calls do_signal_stop()
> if it notices ->group_stop_count != 0.

I don't know.  IMHO it's enough to give the ptracer a way to honor
group stop and notification so things like strace can stay more or
less transparent.  I don't think it's something we need to enforce.

> >   The ptracer may resume execution of the task using PTRACE_CONT
> >   without affecting other tasks in the group.
> 
> And this is what I do not like. I just can't accept the fact there
> is a running thread in the SIGNAL_STOP_STOPPED group.

Let's agree to disagree here.  I agree it's weird but don't think it's
necessarily a bad thing that needs to be changed.

> But yes: this is what the current code does, I am not sure we can
> change this, and both PTRACE_CONT-doesnt-resume-until-SIGCONT and
> PTRACE_CONT-acts-as-SIGCONT are not "perfect" too.

Yeap.

> > I'm not sure how much of an actual problem it is given that our
> > notification to real parent hasn't worked at all till now
> 
> Yes! and this is a very good argument in favour of all your objections ;)

My objections to what?  The one thing that I'm really against is
allowing group stop to override PTRACE_CONT and that's visible
regardless of notifications.  Other than that, I think we're pretty
much in agreement now, aren't we?

> > but we can definitely implement proper TASK_STOPPED ->
> > TRACED transition on the next PTRACE request.
> 
> I guess, you mean that the current code bypasses the
> ptrace_stop()->arch_ptrace_stop_needed() code while doing s/STOPPED/TRACED ?

Yes.

> Oh, currently I am ignoring this, my only concern is how this all
> looks to the userland. But this is the good point, and I have to admit
> that I never realized this is just wrong. Yes, I agree, we should do
> something, but this is not visible to user-space (except this should
> fix the bug ;)

Mostly not but there's the obscure window where the tracee goes
through TASK_RUNNING which can be visible from userland from another
task of the ptracer group, which I think is ignorable as long as it's
properly masked from the ptracing task itself.  I wanted to make sure
we agree on that one.

> > If we don't go that route, another solution would be to add a ptrace
> > call which can listen to SIGCONT.  ie. PTRACE_WAIT_CONT or whatever
> > which the ptracer can call once it knows the tracee entered group
> > stop.
> 
> Perhaps... Or something else, but surely there is room for improvement.
> Fortunately, the changes like this are "safe". I mean, they can
> break nothing. Just we should try to not make them wrong ;)

But if we go with the first option and strace and friends can stay
mostly transparent, I don't think we'll really need this change.  It's
a bit hairy but still usable.

> > In either case, the fundamentals of ptrace operation don't really
> > change.  All ptrace operations are still per-task and ptracer almost
> > always has control over execution of the tracee.  Sure, it allows
> > ptraced task to escape group stop but it seems defined clear enough
> > and IMHO actually is a helpful debugging feature.
> 
> Heh, I think we found the place where we can't convince each other.
> What if we toss a coin?

Right, let's leave it alone for now.

> > After all, it's not like stop signals can be used for absolutely
> > reliable job control.  There's an inherent race against SIGCONT.
> 
> Sure, if we are talking about SIGCONT from "nowhere". But, the same
> way ^Z is not reliable too.

It doesn't even have to be from "nowhere".  A process can be raising
SIGCONT itself.  To me, the whole thing feels more like an
administration aid by design than a strict monitor/control mechanism.
This is quite subjective, I agree.

> > but other than the conceptual integrity it widely changes the
> > assumptions and is less useful than the current behavior.
> 
> Hmm, this is what we currently have?

I'm a bit lost on what you mean above but I was still talking about
PTRACE_CONT being able to escape group stop.

> > The debugger is still notified and can override it.
> 
> Hmm... no, it can't? Of course it is notified after the tracee
> participates and calls do_signal_stop() and gdb can resume it then.
> But it can't prevent the tracee from stopping.

That was what I meant.  It can't prevent group stop from happening but
it knows when it happens and can escape it.

> > I can't really agree there.  First, to me, it seems like too radical a
> > change
> 
> (I assume you mean PTRACE_CONT-doesnt-resume variant?)

Yeah, maybe I was a bit too obsessed with it?  :-)

> > > Given that SIGCHLD doesn't queue and with or without your changes
> > > we send it per-thread, it is not trivial for gdb to detect the
> > > group-stop anyway. Again, the kernel should help somehow.
> >
> > Hmmm?  Isn't this discoverable from the exit code from wait?
> 
> Sure. Probably I misunderstood. I thought, you mean we need something
> like per-process "the whole group is stopped" notification for the
> debugger.

Oh, I didn't mean that.  Just the usual trace task specific
notification.

> > I think it's worse.  With your changes, debuggers can't diddle the
> > tasks behind group stop's back which the current users already expect.
> 
> OK, I certainly misunderstood you, and now I can't restore the context.
> Could you spell?

I was still talking about PTRACE_CONT escaping the tracee from group
stop.

> > Heh, I'm not asking for proof that it is more useful. :-) But I'm still
> > curious why you think it's important because the benefits aren't
> > apparent to me.  Roland and you seem to share this opinion without
> > much discussion so maybe I'm missing something?
> 
> I can't!
> 
> I hate this from the time when I noticed that the application doesn't
> respond to ^Z under strace. And I used strace exactly because I wanted
> to debug some (I can't recall exactly) problems with jctl. That is all.
> 
> But in any case. Some users run the services under ptrace. I mean,
> the application is born, runs, and dies under ptrace. That is why personally
> I certainly do not like anything which delays until detach (say,
> the-tracee-doesnt-participate-in-group-stop-until-detach logic).

I see, so let's make them participate in the group stop and notify the
real parent of the group stop.  As long as PTRACE_CONT behavior isn't
changed, I don't object.

> > If I change the patchset such that group stop while ptraced first
> > enters TASK_STOPPED and then transitions into TASK_TRACED on the next
> > PTRACE call,
> 
> Again, I am not sure I understand what exactly you mean... If you
> mean that it is wrong to simply change the state of the tracee in
> ptrace_check_attach() without arch_ptrace_stop() - I agree, this
> probably should be fixed.

Yeap, that one.

> I am wondering, if there is a simpler change... probably not.
> 
> But. this looks a bit off-topic (I mean, this looks orthogonal
> to the other things we are discussing), or I missed something else?

Hmmmm... maybe I'm missing something but I think we're in kind of
agreement now.  I'm gonna change the patchset such that SIGSTOP while
being ptraced would group stop the tracee so that SIGCONT can wake it
up, which will make the only changes visible to userland the odd race
window things.

> > there will be race window which would be visible
> 
> Personally, I think this is fine.

Alright.

Thanks.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-13 21:24                 ` Denys Vlasenko
@ 2011-02-14 15:06                   ` Oleg Nesterov
  2011-02-14 15:19                     ` Tejun Heo
  2011-02-14 17:05                     ` Denys Vlasenko
  0 siblings, 2 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 15:06 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/13, Denys Vlasenko wrote:
>
> For example, PTRACE_DETACH requires tracee to be stopped to succeed.
> If debugger tries to detach while the tracee is running, it will get
> an error. This forces debugger to do stupid things like sending SIGSTOP,
> then waiting for tracee to stop, then doing PTRACE_DETACH, then
> sending SIGCONT. Of course, while this dance is performed,
> any SIGSTOPs/SIGCONTs which may be  sent to the tracee by other processes
> are totally disrupted by this.

Yes.

> The natural (for me) fix is to make PTRACE_DETACH work even on running
> tracee. It simply makes a lot of sense. Why on earth do we need tracee
> to be stopped? There is no reason.

Agreed, but

> But this is a change in ptrace behavior, and therefore is not acceptable
> for Roland.

I agree with Roland. Not only is this too visible a change, it is also not
clear what detach-with-signal can do if the tracee is not stopped.

This was (very briefly) discussed recently. Probably we can implement
PTRACE_DETACH_RUNNING (the name is random) which doesn't require the
stopped tracee but ignores the "data" argument.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-13 22:25                                 ` Denys Vlasenko
@ 2011-02-14 15:13                                   ` Tejun Heo
  2011-02-14 16:15                                     ` Oleg Nesterov
  2011-02-14 17:20                                     ` Denys Vlasenko
  2011-02-14 15:31                                   ` Oleg Nesterov
  1 sibling, 2 replies; 160+ messages in thread
From: Tejun Heo @ 2011-02-14 15:13 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Oleg Nesterov, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hello, Denys.

On Sun, Feb 13, 2011 at 11:25:55PM +0100, Denys Vlasenko wrote:
> But this "diddling behind group stop's back" is exactly the current
> problem with stop signals.

Maybe.  I don't necessarily agree, though I can see your point.  I
think the more important part is that this is behavior which is quite
noticeable from userland.

> Here I try to stop a ptraced process:
> 
> $ strace -tt sleep 30
> 23:02:15.619262 execve("/bin/sleep", ["sleep", "30"], [/* 30 vars */]) = 0
> ...
> 23:02:15.622112 nanosleep({30, 0}, NULL) = ? ERESTART_RESTARTBLOCK (To be restarted)
> 23:02:23.781165 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> 23:02:23.781251 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
>     (I forgot again why we see it twice. Another quirk I guess...)
> 23:02:23.781310 restart_syscall(<... resuming interrupted call ...>) = 0
> 23:02:45.622433 close(1)                = 0
> 23:02:45.622743 close(2)                = 0
> 23:02:45.622885 exit_group(0)           = ?
> 
> Why sleep didn't stop?
> 
> Because PTRACE_SYSCALL brought the task out of group stop at once,
> even though strace did try hard to not do so:
> 
>     ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP) <-- note SIGSTOP!
> 
> PTRACE_CONT in this situation would do the same.

This can be fixed by updating strace, right?  strace can look at the
wait(2) exit code and if the tracee stopped for group stop, wait for
the tracee to be continued instead of issuing PTRACE_SYSCALL.
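
A minimal sketch of that check in C, not strace's actual code.  It
assumes the rule that a stop reported with one of the four stopping
signals is a group stop when PTRACE_GETSIGINFO on the tracee fails with
EINVAL.  classify_stop() and the enum names are hypothetical, and the
caller is assumed to pass in the errno from its own PTRACE_GETSIGINFO
attempt (0 on success).

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/wait.h>

/* Hypothetical names, for illustration only. */
enum stop_kind {
	STOP_NONE,	/* exited, killed, or continued */
	STOP_PTRACE,	/* signal delivery, syscall stop, etc. */
	STOP_GROUP	/* tracee is taking part in a group stop */
};

/* Only these four signals can start a group stop. */
static bool is_stopping_signal(int sig)
{
	return sig == SIGSTOP || sig == SIGTSTP ||
	       sig == SIGTTIN || sig == SIGTTOU;
}

/*
 * status: value filled in by waitpid() for the tracee.
 * getsiginfo_errno: 0 if ptrace(PTRACE_GETSIGINFO, ...) succeeded on the
 * tracee, otherwise the errno it failed with (assumed to be supplied by
 * the caller).
 */
static enum stop_kind classify_stop(int status, int getsiginfo_errno)
{
	if (!WIFSTOPPED(status))
		return STOP_NONE;
	if (is_stopping_signal(WSTOPSIG(status)) && getsiginfo_errno == EINVAL)
		return STOP_GROUP;
	return STOP_PTRACE;
}
```

On STOP_GROUP the tracer would skip PTRACE_SYSCALL and leave the tracee
stopped; on STOP_PTRACE it would resume as it does today.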

> You are saying that it is useful that gdb restarts group-stopped task
> with mere PTRACE_CONT. Above is a counter-example where it is anti-useful:
> I would muchly prefer strace to see task sit stopped until it gets SIGCONT
> (or some fatal signal).

This is more of an issue which can be improved in strace.  Sure,
changing the kernel to enforce group stop over ptrace would make this
case behave better but at the cost of breaking gdb.

> Why gdb can't use SIGCONT instead of PTRACE_CONT, just like every
> other tool which needs to resume stopped tasks?

Because that's how PTRACE_CONT behaved the whole time.  It can but
just hasn't needed to.

Thanks.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 15:06                   ` Oleg Nesterov
@ 2011-02-14 15:19                     ` Tejun Heo
  2011-02-14 16:20                       ` Oleg Nesterov
  2011-02-14 17:05                     ` Denys Vlasenko
  1 sibling, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-14 15:19 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Denys Vlasenko, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hello,

On Mon, Feb 14, 2011 at 04:06:56PM +0100, Oleg Nesterov wrote:
> This was (very briefly) discussed recently. Probably we can implement
> PTRACE_DETACH_RUNNING (the name is random) which doesn't require the
> stopped tracee but ignores the "data" argument.

I think the root problem is not how ptrace detaches but how ptrace
attaches and stops tracee.  If we have a clean way to seize the
tracee, how we detach doesn't really matter.  For example, a new
ptrace call which stops the tracee and puts it in a ptrace command
ready state without messing with the signal and group stop stuff.

Thanks.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-13 22:25                                 ` Denys Vlasenko
  2011-02-14 15:13                                   ` Tejun Heo
@ 2011-02-14 15:31                                   ` Oleg Nesterov
  2011-02-14 17:24                                     ` Denys Vlasenko
  1 sibling, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 15:31 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/13, Denys Vlasenko wrote:
>
> On Wednesday 09 February 2011 15:18, Tejun Heo wrote:
> >
> > > > and I'm not really sure whether that's something worth achieving
> > > > at the cost of debugging capabilities especially when we don't _have_
> > > > to lose them.
> > >
> > > But we do not? I mean, at least this is not worse than the current
> > > behaviour.
> >
> > I think it's worse.  With your changes, debuggers can't diddle the
> > tasks behind group stop's back which the current users already expect.
>
> But this "diddling behind group stop's back" is exactly the current
> problem with stop signals.
>
> Here I try to stop a ptraced process:
>
> $ strace -tt sleep 30
> 23:02:15.619262 execve("/bin/sleep", ["sleep", "30"], [/* 30 vars */]) = 0
> ...
> 23:02:15.622112 nanosleep({30, 0}, NULL) = ? ERESTART_RESTARTBLOCK (To be restarted)
> 23:02:23.781165 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> 23:02:23.781251 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
>     (I forgot again why we see it twice. Another quirk I guess...)

      (this is correct, the tracee first reports signal=SIGSTOP, then
       it reports that it has actually stopped, with exit_code=SIGSTOP)

> 23:02:23.781310 restart_syscall(<... resuming interrupted call ...>) = 0
> 23:02:45.622433 close(1)                = 0
> 23:02:45.622743 close(2)                = 0
> 23:02:45.622885 exit_group(0)           = ?
>
> Why sleep didn't stop?

Yes. And I think this all should be fixed.

Although, depending on how we change the kernel, strace may need the
fixes too.

> Because PTRACE_SYSCALL brought the task out of group stop at once,
> even though strace did try hard to not do so:
>
>     ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP) <-- note SIGSTOP!

Yes.

(just to clarify, data=SIGSTOP has no effect when the tracee reports
 from do_signal_stop. iow, when it reports i-am-stopped)

But otherwise I agree, and that was my point too.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-13 23:01                                   ` Denys Vlasenko
  2011-02-14  9:03                                     ` Jan Kratochvil
@ 2011-02-14 15:51                                     ` Oleg Nesterov
  1 sibling, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 15:51 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/14, Denys Vlasenko wrote:
>
> On Wednesday 09 February 2011 22:25, Oleg Nesterov wrote:
> >
> > But yes: this is what the current code does, I am not sure we can
> > change this, and both PTRACE_CONT-doesnt-resume-until-SIGCONT and
> > PTRACE_CONT-acts-as-SIGCONT are not "perfect" too.
>
> Can you enumerate reasons why each of them are not perfect?
> I want to understand your thinking better here.

Standard answer: this can break things ;)

Also, PTRACE_CONT-acts-as-SIGCONT looks a bit ugly: it can wake up
other tracees (or we can turn them into TASK_TRACED, I dunno).

> > Yes, and this race is already here, ptracer should take care.
>
> From the API POV, there is no race,

Sorry for confusion... I just meant that if the tracee is TASK_STOPPED
then ptrace(PTRACE_WHATEVER) can always fail if it races with SIGCONT
from the third party.

> > > In either case, the fundamentals of ptrace operation don't really
> > > change.  All ptrace operations are still per-task and ptracer almost
> > > always has control over execution of the tracee.  Sure, it allows
> > > ptraced task to escape group stop but it seems defined clear enough
> > > and IMHO actually is a helpful debugging feature.
> >
> > Heh, I think we found the place where we can't convince each other.
> > What if we toss a coin?
>
> I'm with Oleg on this. If debugger wants to terminate group-stop,
> it should just send SIGCONT, not depend on the obscure feature (it is not
> documented, right?) that PTRACE_CONT somehow affects group-stop state.

Yes, this is PTRACE_CONT-doesnt-resume-until-SIGCONT suggested by Roland.

But Tejun rightly points out that this can confuse gdb (and nobody knows what
else ;) Can we do this change and require the applications to learn
the new rules? I do not know.

> > I hate this from the time when I noticed that the application doesn't
> > respond to ^Z under strace. And I used strace exactly because I wanted
> > to debug some (I can't recall exactly) problems with jctl. That is all.
>
> Recently I had exactly this experience too. It's frustrating.

Agreed.

> You just succeeded :)

Thanks ;)

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14  9:03                                     ` Jan Kratochvil
  2011-02-14 11:39                                       ` Denys Vlasenko
@ 2011-02-14 16:01                                       ` Oleg Nesterov
  2011-02-26  3:59                                       ` Pavel Machek
  2 siblings, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 16:01 UTC (permalink / raw)
  To: Jan Kratochvil
  Cc: Denys Vlasenko, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On 02/14, Jan Kratochvil wrote:
>
> On Mon, 14 Feb 2011 00:01:47 +0100, Denys Vlasenko wrote:
> > * sleep runs in nanosleep
> > * SIGSTOP arrives, strace sees it
> > * strace logs it and allows it via ptrace(PTRACE_SYSCALL, ..., SIGSTOP)
> > * sleep process enters group-stop
>
> The last point breaks the documented behavior of ptrace:

Well, afaics no. This is what we currently do.

> 	If data is nonzero and not SIGSTOP, it is interpreted as a signal to
> 	be delivered to the child; otherwise, no signal is delivered.

Fantastic. I never knew the man page states this (although the documentation
above means PTRACE_CONT).

But this is not true. And iirc this was never true. Neither PTRACE_CONT
nor any other request treats SIGSTOP specially.

(also, please note that the signal is not necessarily delivered, only
 if we are going to resume the tracee after it reported the signal or
 syscall entry/exit)

> > * nothing happens until some other signal arrives
> > * say, SIGCONT arrives
>
> What if other signal arrives?

Only SIGCONT can resume the stopped task (ignoring SIGKILL).

> The tracer probably should not be notified as
> the tracee is in a group-stop.

It is.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 15:13                                   ` Tejun Heo
@ 2011-02-14 16:15                                     ` Oleg Nesterov
  2011-02-14 16:33                                       ` Tejun Heo
  2011-02-14 17:20                                     ` Denys Vlasenko
  1 sibling, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 16:15 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Denys Vlasenko, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On 02/14, Tejun Heo wrote:
>
> Hello, Denys.
>
> On Sun, Feb 13, 2011 at 11:25:55PM +0100, Denys Vlasenko wrote:
>
> > $ strace -tt sleep 30
> > 23:02:15.619262 execve("/bin/sleep", ["sleep", "30"], [/* 30 vars */]) = 0
> > ...
> > 23:02:15.622112 nanosleep({30, 0}, NULL) = ? ERESTART_RESTARTBLOCK (To be restarted)
> > 23:02:23.781165 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> > 23:02:23.781251 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> >     (I forgot again why we see it twice. Another quirk I guess...)
> > 23:02:23.781310 restart_syscall(<... resuming interrupted call ...>) = 0
> > 23:02:45.622433 close(1)                = 0
> > 23:02:45.622743 close(2)                = 0
> > 23:02:45.622885 exit_group(0)           = ?
> >
> > Why sleep didn't stop?
> >
> > Because PTRACE_SYSCALL brought the task out of group stop at once,
> > even though strace did try hard to not do so:
> >
> >     ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP) <-- note SIGSTOP!
> >
> > PTRACE_CONT in this situation would do the same.
>
> This can be fixed by updating strace, right?  strace can look at the
> wait(2) exit code and if the tracee stopped for group stop, wait for
> the tracee to be continued instead of issuing PTRACE_SYSCALL.

Yes, in this particular case strace could be more clever.

But. The tracee should react to SIGCONT after that, this means we
shouldn't "delay" this stop or force the TASK_TRACED state.

And note that in this case real_parent == debugger. Another case
is more interesting, and this means we shouldn't delay or hide the
notifications.

(I just tried to summarize the previous discussion for Denys)

> > Why gdb can't use SIGCONT instead of PTRACE_CONT, just like every
> > other tool which needs to resume stopped tasks?
>
> Because that's how PTRACE_CONT behaved the whole time.

Unfortunately, this is true.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 15:19                     ` Tejun Heo
@ 2011-02-14 16:20                       ` Oleg Nesterov
  0 siblings, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 16:20 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Denys Vlasenko, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On 02/14, Tejun Heo wrote:
>
> Hello,
>
> On Mon, Feb 14, 2011 at 04:06:56PM +0100, Oleg Nesterov wrote:
> > This was (very briefly) discussed recently. Probably we can implement
> > PTRACE_DETACH_RUNNING (the name is random) which doesn't require the
> > stopped tracee but ignores the "data" argument.
>
> I think the root problem is not how ptrace detaches but how ptrace
> attaches and stops tracee.

Agreed, but please note that currently it is _very_ nontrivial to detach
correctly.

> If we have a clean way to seize the
> tracee, how we detach doesn't really matter.  For example, a new
> ptrace call which stops the tracee and puts it in a ptrace command
> ready state without messing with the signal and group stop stuff.

Indeed. Also briefly discussed: PTRACE_INTERRUPT.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 16:15                                     ` Oleg Nesterov
@ 2011-02-14 16:33                                       ` Tejun Heo
  2011-02-14 17:23                                         ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-14 16:33 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Denys Vlasenko, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hello,

On Mon, Feb 14, 2011 at 05:15:15PM +0100, Oleg Nesterov wrote:
> > > PTRACE_CONT in this situation would do the same.
> >
> > This can be fixed by updating strace, right?  strace can look at the
> > wait(2) exit code and if the tracee stopped for group stop, wait for
> > the tracee to be continued instead of issuing PTRACE_SYSCALL.
> 
> Yes, in this particular case strace could be more clever.
> 
> But. The tracee should react to SIGCONT after that, this means we
> shouldn't "delay" this stop or force the TASK_TRACED state.

Yeap, which is achievable by treating group stop differently from
ptrace traps and make it proceed to TASK_TRACED only if ptrace wants
to issue commands.  (reiterating just to make sure there's no
misunderstanding)

> And note that in this case real_parent == debugger. Another case
> is more interesting, and this means we shouldn't delay or hide the
> notifications.
> 
> (I just tried to summarize the previous discussion for Denys)

Agreed.  We should be notifying both the real parent and ptracer.

Thanks.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 15:06                   ` Oleg Nesterov
  2011-02-14 15:19                     ` Tejun Heo
@ 2011-02-14 17:05                     ` Denys Vlasenko
  2011-02-14 17:18                       ` Oleg Nesterov
  1 sibling, 1 reply; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-14 17:05 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On Mon, Feb 14, 2011 at 4:06 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 02/13, Denys Vlasenko wrote:
>>
>> For example, PTRACE_DETACH requires tracee to be stopped to succeed.
>> If debugger tries to detach while the tracee is running, it will get
>> an error. This forces debugger to do stupid things like sending SIGSTOP,
>> then waiting for tracee to stop, then doing PTRACE_DETACH, then
>> sending SIGCONT. Of course, while this dance is performed,
>> any SIGSTOPs/SIGCONTs which may be  sent to the tracee by other processes
>> are totally disrupted by this.
>
> Yes.
>
>> The natural (for me) fix is to make PTRACE_DETACH work even on running
>> tracee. It simply makes a lot of sense. Why on earth do we need tracee
>> to be stopped? There is no reason.
>
> Agreed, but
>
>> But this is a change in ptrace behavior, and therefore is not acceptable
>> for Roland.
>
> I agree with Roland. Not only this is too visible change, it is not clear
> what detach-with-signal can do if the tracee is not stopped.
>
> This was (very briefly) discussed recently. Probably we can implement
> PTRACE_DETACH_RUNNING (the name is random) which doesn't require the
> stopped tracee but ignores the "data" argument.

IIRC data argument is already ignored by PTRACE_CONT if it is issued in
the ptrace stop which wasn't caused by signal delivery to the tracee.

Basically, *if debugger sees SIGfoo*, it can either allow it:
ptrace(PTRACE_CONT, ...,  SIGfoo);
ignore it:
ptrace(PTRACE_CONT, ...,  0);
or even inject some other signal:
ptrace(PTRACE_CONT, ...,  SIGbar);

but if it resumes tracee from, say, post-execve ptrace stop,
it can't inject a signal: last ptrace() argument will be ignored.

So, it isn't a new precedent to make
ptrace(PTRACE_DETACH, ...,  <something>);
to ignore <something> if tracee isn't in signal-delivery-induced ptrace stop.
In particular, if it isn't in any stop at all, if it's running.
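
The three choices above can be sketched as a small C helper.  This is
only an illustration under the assumption that the tracee is sitting in
a signal-delivery stop; resume_signal(), resume_tracee() and the
policy names are hypothetical.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Hypothetical policy names, for illustration only. */
enum sig_policy {
	POLICY_DELIVER,	/* allow the signal the tracer saw */
	POLICY_DROP,	/* ignore it */
	POLICY_REPLACE	/* inject some other signal instead */
};

/* Pick the value to pass as the last ptrace() argument. */
static int resume_signal(enum sig_policy policy, int seen_sig, int other_sig)
{
	switch (policy) {
	case POLICY_DELIVER:
		return seen_sig;
	case POLICY_REPLACE:
		return other_sig;
	case POLICY_DROP:
	default:
		return 0;
	}
}

/*
 * Resume the tracee with the chosen signal.  Outside of a
 * signal-delivery stop the data argument may simply be ignored.
 */
static long resume_tracee(pid_t pid, enum sig_policy policy,
			  int seen_sig, int other_sig)
{
	long sig = resume_signal(policy, seen_sig, other_sig);

	return ptrace(PTRACE_CONT, pid, (void *)0, (void *)sig);
}
```

For example, resume_tracee(pid, POLICY_DROP, SIGUSR1, 0) would continue
the tracee without it ever seeing SIGUSR1.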

-- 
vda


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 17:05                     ` Denys Vlasenko
@ 2011-02-14 17:18                       ` Oleg Nesterov
  0 siblings, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 17:18 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/14, Denys Vlasenko wrote:
>
> On Mon, Feb 14, 2011 at 4:06 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> > On 02/13, Denys Vlasenko wrote:
> >>
> >> For example, PTRACE_DETACH requires tracee to be stopped to succeed.
> >> If debugger tries to detach while the tracee is running, it will get
> >> an error. This forces debugger to do stupid things like sending SIGSTOP,
> >> then waiting for tracee to stop, then doing PTRACE_DETACH, then
> >> sending SIGCONT. Of course, while this dance is performed,
> >> any SIGSTOPs/SIGCONTs which may be  sent to the tracee by other processes
> >> are totally disrupted by this.
> >
> > Yes.
> >
> >> The natural (for me) fix is to make PTRACE_DETACH work even on running
> >> tracee. It simply makes a lot of sense. Why on earth do we need tracee
> >> to be stopped? There is no reason.
> >
> > Agreed, but
> >
> >> But this is a change in ptrace behavior, and therefore is not acceptable
> >> for Roland.
> >
> > I agree with Roland. Not only this is too visible change, it is not clear
> > what detach-with-signal can do if the tracee is not stopped.
> >
> > This was (very briefly) discussed recently. Probably we can implement
> > PTRACE_DETACH_RUNNING (the name is random) which doesn't require the
> > stopped tracee but ignores the "data" argument.
>
> IIRC data argument is already ignored by PTRACE_CONT if it is issued in
> the ptrace stop which wasn't caused by signal delivery to the tracee.

Yes, almost correct. There are a couple of exceptions like syscall entry.

> So, it isn't a new precedent to make
> ptrace(PTRACE_DETACH, ...,  <something>);
> to ignore <something> if tracee isn't in signal-delivery-induced ptrace stop.
> In particular, if it isn't in any stop at all, if it's running.

Yes, agreed, but still the new option looks safer to me.

Suppose that debugger sends SIGFOO to the tracee and then it does
ptrace(PTRACE_DETACH, SIGBAR). If it forgets to do wait() in between
it should see the error.

Perhaps this doesn't matter, but compatibility is always good unless
we have to break things. And we should change strace anyway to detach
without SIGSTOP/etc, it can use the new option.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 15:13                                   ` Tejun Heo
  2011-02-14 16:15                                     ` Oleg Nesterov
@ 2011-02-14 17:20                                     ` Denys Vlasenko
  2011-02-14 17:30                                       ` Tejun Heo
                                                         ` (2 more replies)
  1 sibling, 3 replies; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-14 17:20 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On Mon, Feb 14, 2011 at 4:13 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Denys.
>
> On Sun, Feb 13, 2011 at 11:25:55PM +0100, Denys Vlasenko wrote:
>> But this "diddling behind group stop's back" is exactly the current
>> problem with stop signals.
>
> Maybe.  I don't necessarily agree, though I can see your point.  I
> think the more important part is that this is behavior which is quite
> noticeable from userland.
>
>> Here I try to stop a ptraced process:
>>
>> $ strace -tt sleep 30
>> 23:02:15.619262 execve("/bin/sleep", ["sleep", "30"], [/* 30 vars */]) = 0
>> ...
>> 23:02:15.622112 nanosleep({30, 0}, NULL) = ? ERESTART_RESTARTBLOCK (To be restarted)
>> 23:02:23.781165 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
>> 23:02:23.781251 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
>>     (I forgot again why we see it twice. Another quirk I guess...)
>> 23:02:23.781310 restart_syscall(<... resuming interrupted call ...>) = 0
>> 23:02:45.622433 close(1)                = 0
>> 23:02:45.622743 close(2)                = 0
>> 23:02:45.622885 exit_group(0)           = ?
>>
>> Why sleep didn't stop?
>>
>> Because PTRACE_SYSCALL brought the task out of group stop at once,
>> even though strace did try hard to not do so:
>>
>>     ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP) <-- note SIGSTOP!
>>
>> PTRACE_CONT in this situation would do the same.
>
> This can be fixed by updating strace, right?  strace can look at the
> wait(2) exit code and if the tracee stopped for group stop, wait for
> the tracee to be continued instead of issuing PTRACE_SYSCALL.

But tracee didn't stop _yet_. Signal is not delivered _yet_, debugger
can decide at this point whether to deliver it:
ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP)
or ignore:
ptrace(PTRACE_SYSCALL, $PID, 0x1, 0)

strace has to deliver SIGSTOP if it wants to make program run exactly
as it would run without strace. So it tries to do so.
Currently, ptrace machinery doesn't react as strace, its user, expects it to.

You are proposing to special-case SIGSTOP delivery in strace:
"if (sig != SIGSTOP) ptrace(PTRACE_SYSCALL, $PID, 0x1, sig)"

This is problematic. For example, it will have the effect of not stopping
the other threads of a multi-threaded process.


>> You are saying that it is useful that gdb restarts group-stopped task
>> with mere PTRACE_CONT. Above is a counter-example where it is anti-useful:
>> I would muchly prefer strace to see task sit stopped until it gets SIGCONT
>> (or some fatal signal).
>
> This is more of an issue which can be improved in strace.  Sure,
> changing the kernel to enforce group stop over ptrace would make this
> case behave better but at the cost of breaking gdb.
>
>> Why gdb can't use SIGCONT instead of PTRACE_CONT, just like every
>> other tool which needs to resume stopped tasks?
>
> Because that's how PTRACE_CONT behaved the whole time.  It can but
> just hasn't needed to.

Jan, please put on your gdb maintainer's hat, we need your opinion here.
Is it a problem from gdb's POV?

-- 
vda.


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 16:33                                       ` Tejun Heo
@ 2011-02-14 17:23                                         ` Oleg Nesterov
  0 siblings, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 17:23 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Denys Vlasenko, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On 02/14, Tejun Heo wrote:
>
> On Mon, Feb 14, 2011 at 05:15:15PM +0100, Oleg Nesterov wrote:
> > > > PTRACE_CONT in this situation would do the same.
> > >
> > > This can be fixed by updating strace, right?  strace can look at the
> > > wait(2) exit code and if the tracee stopped for group stop, wait for
> > > the tracee to be continued instead of issuing PTRACE_SYSCALL.
> >
> > Yes, in this particular case strace could be more clever.
> >
> > But. The tracee should react to SIGCONT after that, this means we
> > shouldn't "delay" this stop or force the TASK_TRACED state.
>
> Yeap, which is achievable by treating group stop differently from
> ptrace traps and make it proceed to TASK_TRACED only if ptrace wants
> to issue commands.

Yes, agreed. And this is exactly what we currently do. Except, as you
pointed out, the simple s/STOPPED/TRACED/ change is buggy. But the fix
you suggested should be almost invisible to the userland.

> (reiterating just to make sure there's no
> misunderstanding)

The same from my side ;)

Oleg.


^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 15:31                                   ` Oleg Nesterov
@ 2011-02-14 17:24                                     ` Denys Vlasenko
  2011-02-14 17:39                                       ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-14 17:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On Mon, Feb 14, 2011 at 4:31 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 02/13, Denys Vlasenko wrote:
>> On Wednesday 09 February 2011 15:18, Tejun Heo wrote:
>> > > > and I'm not really sure whether that's something worth achieving
>> > > > at the cost of debugging capabilities especially when we don't _have_
>> > > > to lose them.
>> > >
>> > > But we do not? I mean, at least this is not worse than the current
>> > > behaviour.
>> >
>> > I think it's worse.  With your changes, debuggers can't diddle the
>> > tasks behind group stop's back which the current users already expect.
>>
>> But this "diddling behind group stop's back" is exactly the current
>> problem with stop signals.
>>
>> Here I try to stop a ptraced process:
>>
>> $ strace -tt sleep 30
>> 23:02:15.619262 execve("/bin/sleep", ["sleep", "30"], [/* 30 vars */]) = 0
>> ...
>> 23:02:15.622112 nanosleep({30, 0}, NULL) = ? ERESTART_RESTARTBLOCK (To be restarted)
>> 23:02:23.781165 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
>> 23:02:23.781251 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
>>     (I forgot again why we see it twice. Another quirk I guess...)
>
>      (this is correct, the tracee reports the signal=SIGSTOP, then
>       it reports it actually stopps with exit_code=SIGSTOP)

Ah, I see. Is there any way debugger can distinguish between these two
different stops?

>> 23:02:23.781310 restart_syscall(<... resuming interrupted call ...>) = 0
>> 23:02:45.622433 close(1)                = 0
>> 23:02:45.622743 close(2)                = 0
>> 23:02:45.622885 exit_group(0)           = ?
>>
>> Why sleep didn't stop?
>
> Yes. And I think this all should be fixed.
>
> Although, depending on how we change the kernel, strace may need the
> fixes too.

Exactly my thoughts. strace must not try to inject another SIGSTOP
when it sees the second SIGSTOP event. Currently, it does,
because it has no way to understand that the second one
*is not a signal delivery*.

-- 
vda

^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 17:20                                     ` Denys Vlasenko
@ 2011-02-14 17:30                                       ` Tejun Heo
  2011-02-14 17:45                                         ` Oleg Nesterov
  2011-02-14 17:54                                         ` Denys Vlasenko
  2011-02-14 17:51                                       ` Oleg Nesterov
  2011-02-16 21:51                                       ` Jan Kratochvil
  2 siblings, 2 replies; 160+ messages in thread
From: Tejun Heo @ 2011-02-14 17:30 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Oleg Nesterov, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hello,

On Mon, Feb 14, 2011 at 06:20:52PM +0100, Denys Vlasenko wrote:
> >> 23:02:15.622112 nanosleep({30, 0}, NULL) = ? ERESTART_RESTARTBLOCK (To be restarted)
> >> 23:02:23.781165 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> >> 23:02:23.781251 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> >>     (I forgot again why we see it twice. Another quirk I guess...)
> >> 23:02:23.781310 restart_syscall(<... resuming interrupted call ...>) = 0
> >> 23:02:45.622433 close(1)                = 0
> >> 23:02:45.622743 close(2)                = 0
> >> 23:02:45.622885 exit_group(0)           = ?
...
> > This can be fixed by updating strace, right?  strace can look at the
> > wait(2) exit code and if the tracee stopped for group stop, wait for
> > the tracee to be continued instead of issuing PTRACE_SYSCALL.
> 
> But tracee didn't stop _yet_. Signal is not delivered _yet_, debugger
> can decide at this point whether to deliver it:
> ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP)
> or ignore:
> ptrace(PTRACE_SYSCALL, $PID, 0x1, 0)
> 
> strace has to deliver SIGSTOP if it wants to make program run exactly
> as it would run without strace. So it tries to do so.
> Currently, ptrace machinery doesn't react as strace, its user, expects it to.

Okay, maybe I'm missing something but so once SIGSTOP is determined to
be delivered, then the tracee enters group stop and that's the second
SIGSTOP notification you get.  At that point, strace should wait for
the tracee to be continued by SIGCONT.  That should work, right?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 11:39                                       ` Denys Vlasenko
@ 2011-02-14 17:32                                         ` Oleg Nesterov
  0 siblings, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 17:32 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Jan Kratochvil, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On 02/14, Denys Vlasenko wrote:
>
> On Monday 14 February 2011 10:03, Jan Kratochvil wrote:
> > On Mon, 14 Feb 2011 00:01:47 +0100, Denys Vlasenko wrote:
> > > * sleep runs in nanosleep
> > > * SIGSTOP arrives, strace sees it
> > > * strace logs it and allows it via ptrace(PTRACE_SYSCALL, ..., SIGSTOP)
> > > * sleep process enters group-stop
> >
> > The last point breaks the documented behavior of ptrace:
> > 	If data is nonzero and not SIGSTOP, it is interpreted as a signal to
> > 	be delivered to the child; otherwise, no signal is delivered.
>
> But SIGSTOP _is_ delivered - that's why sleep process stops.

Yes.

> > What if other signal arrives?  The tracer probably should not be notified as
> > the tracee is in a group-stop.
>
> The behavior here ideally should be the same as for non-traced process:
> the signals are remembered while process is stopped, and it sees them
> only after SIGCONT, as demonstrated by the following program

Agreed. And this is what we currently do.

> I believe it would be best if debugger sees signals immediately,
> but when it does ptrace(PTRACE_CONT/SYSCALL, ..., <sig>)
> in order to send signals to group-stopped tracee, they are queued
> to it without terminating group-stop. When SIGCONT arrives,
> ptrace(PTRACE_CONT/SYSCALL, ..., SIGCONT) terminates group-stop
> and causes all queued signals to be handled (in random order,
> not necessarily in the order of arrival. Even CONT handler is
> not guaranteed to be called first, as you see above).

Yes, personally I think this would be the best behaviour.

But, damn, again, again, again, yes this change is very noticeable.
Tejun is right too.

Oleg.


^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 17:24                                     ` Denys Vlasenko
@ 2011-02-14 17:39                                       ` Oleg Nesterov
  2011-02-14 17:57                                         ` Denys Vlasenko
  2011-02-14 18:59                                         ` Denys Vlasenko
  0 siblings, 2 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 17:39 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/14, Denys Vlasenko wrote:
>
> On Mon, Feb 14, 2011 at 4:31 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> > On 02/13, Denys Vlasenko wrote:
> >>
> >> $ strace -tt sleep 30
> >> 23:02:15.619262 execve("/bin/sleep", ["sleep", "30"], [/* 30 vars */]) = 0
> >> ...
> >> 23:02:15.622112 nanosleep({30, 0}, NULL) = ? ERESTART_RESTARTBLOCK (To be restarted)
> >> 23:02:23.781165 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> >> 23:02:23.781251 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> >>     (I forgot again why we see it twice. Another quirk I guess...)
> >
> >      (this is correct, the tracee reports the signal=SIGSTOP, then
> >       it reports it actually stopps with exit_code=SIGSTOP)
>
> Ah, I see. Is there any way debugger can distinguish between these two
> different stops?

IIRC, the (only?) way to distinguish is to check last_siginfo != NULL
via ptrace(PTRACE_GETSIGINFO).

Oleg.


^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 17:30                                       ` Tejun Heo
@ 2011-02-14 17:45                                         ` Oleg Nesterov
  2011-02-14 17:54                                         ` Denys Vlasenko
  1 sibling, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 17:45 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Denys Vlasenko, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On 02/14, Tejun Heo wrote:
>
> On Mon, Feb 14, 2011 at 06:20:52PM +0100, Denys Vlasenko wrote:
> > >> 23:02:15.622112 nanosleep({30, 0}, NULL) = ? ERESTART_RESTARTBLOCK (To be restarted)
> > >> 23:02:23.781165 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> > >> 23:02:23.781251 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> > >>     (I forgot again why we see it twice. Another quirk I guess...)
> > >> 23:02:23.781310 restart_syscall(<... resuming interrupted call ...>) = 0
> > >> 23:02:45.622433 close(1)                = 0
> > >> 23:02:45.622743 close(2)                = 0
> > >> 23:02:45.622885 exit_group(0)           = ?
> ...
> > > This can be fixed by updating strace, right?  strace can look at the
> > > wait(2) exit code and if the tracee stopped for group stop, wait for
> > > the tracee to be continued instead of issuing PTRACE_SYSCALL.
> >
> > But tracee didn't stop _yet_. Signal is not delivered _yet_, debugger
> > can decide at this point whether to deliver it:
> > ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP)
> > or ignore:
> > ptrace(PTRACE_SYSCALL, $PID, 0x1, 0)
> >
> > strace has to deliver SIGSTOP if it wants to make program run exactly
> > as it would run without strace. So it tries to do so.
> > Currently, ptrace machinery doesn't react as strace, its user, expects it to.
>
> Okay, maybe I'm missing something but so once SIGSTOP is determined to
> be delivered, then the tracee enters group stop and that's the second
> SIGSTOP notification you get.

Yes, this is correct.

But my head spins ;) I have already lost the picture.

> At that point, strace should wait for
> the tracee to be continued by SIGCONT.  That should work, right?

Again, given that strace is the real parent, in this particular
case I think strace can work as you suggest.

Oleg.


^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 17:20                                     ` Denys Vlasenko
  2011-02-14 17:30                                       ` Tejun Heo
@ 2011-02-14 17:51                                       ` Oleg Nesterov
  2011-02-14 18:55                                         ` Denys Vlasenko
  2011-02-16 21:51                                       ` Jan Kratochvil
  2 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 17:51 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/14, Denys Vlasenko wrote:
>
> >> $ strace -tt sleep 30
> >> 23:02:15.619262 execve("/bin/sleep", ["sleep", "30"], [/* 30 vars */]) = 0
> >> ...
> >> 23:02:15.622112 nanosleep({30, 0}, NULL) = ? ERESTART_RESTARTBLOCK (To be restarted)
> >> 23:02:23.781165 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> >> 23:02:23.781251 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> >>     (I forgot again why we see it twice. Another quirk I guess...)
> >> 23:02:23.781310 restart_syscall(<... resuming interrupted call ...>) = 0
> >> 23:02:45.622433 close(1)                = 0
> >> 23:02:45.622743 close(2)                = 0
> >> 23:02:45.622885 exit_group(0)           = ?
> >>
> >> Why sleep didn't stop?
> >>
> >> Because PTRACE_SYSCALL brought the task out of group stop at once,
> >> even though strace did try hard to not do so:
> >>
> >>     ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP) <-- note SIGSTOP!
> >>
> >> PTRACE_CONT in this situation would do the same.
> >
> > This can be fixed by updating strace, right?  strace can look at the
> > wait(2) exit code and if the tracee stopped for group stop, wait for
> > the tracee to be continued instead of issuing PTRACE_SYSCALL.

Ah, I seem to understand the confusion, let me repeat...

> But tracee didn't stop _yet_.

This depends on "_yet_". strace does ptrace(SYSCALL, SIGSTOP) twice.
The first time it does this after the tracee reports the signal, and
the tracee stops.

> Signal is not delivered _yet_, debugger
> can decide at this point whether to deliver it:
> ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP)
> or ignore:
> ptrace(PTRACE_SYSCALL, $PID, 0x1, 0)
>
> strace has to deliver SIGSTOP if it wants to make program run exactly
> as it would run without strace. So it tries to do so.
> Currently, ptrace machinery doesn't react as strace, its user, expects it to.

It does, see above. But then the tracee actually stops, and reports
this to the tracer. However, strace handles this case as if this was
another signal=SIGSTOP, so it does ptrace(SYSCALL, SIGSTOP) again.

SIGSTOP has no effect, but PTRACE_SYSCALL wakes up the tracee.
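
To make the two cases concrete, here is a rough sketch of how a tracer
could pick the data argument for the restart request. restart_signal is
a hypothetical helper, not strace code, and it only covers which signal
to pass; whether the tracer should restart the tracee at all after the
second report is exactly what is still being discussed.

```c
#include <signal.h>
#include <stdbool.h>

/* Hypothetical helper: choose the "data" argument for
 * PTRACE_SYSCALL / PTRACE_CONT after a SIGSTOP report. */
int restart_signal(bool signal_delivery_stop, int stopsig)
{
    /* First report: the signal is not delivered yet, so pass it on
     * to keep the tracee behaving as it would without a tracer. */
    if (signal_delivery_stop)
        return stopsig;

    /* Second report: the tracee has already stopped, and a signal
     * injected here has no effect, so pass nothing. */
    return 0;
}
```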

> >> Why gdb can't use SIGCONT instead of PTRACE_CONT, just like every
> >> other tool which needs to resume stopped tasks?
> >
> > Because that's how PTRACE_CONT behaved the whole time.  It can but
> > just hasn't needed to.
>
> Jan, please put on your gdb maintainer's hat, we need your opinion here.
> Is it a problem from gdb's POV?

Yes, please ;)

Oleg.


^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 17:30                                       ` Tejun Heo
  2011-02-14 17:45                                         ` Oleg Nesterov
@ 2011-02-14 17:54                                         ` Denys Vlasenko
  2011-02-21 15:16                                           ` Tejun Heo
  1 sibling, 1 reply; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-14 17:54 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On Mon, Feb 14, 2011 at 6:30 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On Mon, Feb 14, 2011 at 06:20:52PM +0100, Denys Vlasenko wrote:
>> >> 23:02:15.622112 nanosleep({30, 0}, NULL) = ? ERESTART_RESTARTBLOCK (To be restarted)
>> >> 23:02:23.781165 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
>> >> 23:02:23.781251 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
>> >>     (I forgot again why we see it twice. Another quirk I guess...)
>> >> 23:02:23.781310 restart_syscall(<... resuming interrupted call ...>) = 0
>> >> 23:02:45.622433 close(1)                = 0
>> >> 23:02:45.622743 close(2)                = 0
>> >> 23:02:45.622885 exit_group(0)           = ?
> ...
>> > This can be fixed by updating strace, right?  strace can look at the
>> > wait(2) exit code and if the tracee stopped for group stop, wait for
>> > the tracee to be continued instead of issuing PTRACE_SYSCALL.
>>
>> But tracee didn't stop _yet_. Signal is not delivered _yet_, debugger
>> can decide at this point whether to deliver it:
>> ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP)
>> or ignore:
>> ptrace(PTRACE_SYSCALL, $PID, 0x1, 0)
>>
>> strace has to deliver SIGSTOP if it wants to make program run exactly
>> as it would run without strace. So it tries to do so.
>> Currently, ptrace machinery doesn't react as strace, its user, expects it to.
>
> Okay, maybe I'm missing something but so once SIGSTOP is determined to
> be delivered, then the tracee enters group stop and that's the second
> SIGSTOP notification you get.  At that point, strace should wait for
> the tracee to be continued by SIGCONT.  That should work, right?

Do you mean "Will it work on current kernels" or
"that's what strace has to do and then it is supposed to work correctly,
modulo bugs"?

"Will it work on current kernels" - I don't know. Need to experiment.

"That's what strace has to do and then it is supposed to work correctly,
modulo bugs" - it depends on how we define group-stop and ptrace-stop
relationship.

In this particular scenario, first SIGSTOP is ptrace-stop.
Obviously, we must issue ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP)
to continue.

Second SIGSTOP is notification of tracee's group-stop to debugger.

The question is, logically, by sending this notification, does the tracee
also enter ptrace-stop, or does it not? (IOW: is ptrace-stop a separate
bit in task state, independent of group-stop?)
If yes, then we need to release tracee from ptrace-stop (but it will remain in
group-stop) by issuing ptrace(PTRACE_SYSCALL, $PID, 0x1, 0).
If not, then we must not do so, because the task is not ptrace-stopped,
and ptrace(PTRACE_SYSCALL, $PID, 0x1, 0) is undefined (I think it should
error out to indicate that).
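
To illustrate the "separate bit" reading, here is a toy model. It is not
kernel code; struct toy_task and the toy_* functions are invented for
this sketch, and they only encode the semantics proposed above: the two
stop reasons are independent flags, and the task runs only when both are
clear.

```c
#include <stdbool.h>

/* Toy model only: two independent stop flags. */
struct toy_task {
    bool group_stopped;  /* set by a delivered SIGSTOP, cleared by SIGCONT */
    bool ptrace_stopped; /* set when a stop is reported to the tracer */
};

bool toy_can_run(const struct toy_task *t)
{
    return !t->group_stopped && !t->ptrace_stopped;
}

/* The tracee enters group-stop and notifies the debugger. */
void toy_report_group_stop(struct toy_task *t)
{
    t->group_stopped = true;
    t->ptrace_stopped = true;
}

/* ptrace(PTRACE_SYSCALL, pid, 0x1, 0): releases the ptrace-stop only.
 * Returns -1 if the task is not ptrace-stopped, modelling the
 * "should error out" case. */
int toy_ptrace_restart(struct toy_task *t)
{
    if (!t->ptrace_stopped)
        return -1;
    t->ptrace_stopped = false;
    return 0;
}

/* SIGCONT ends the group-stop. */
void toy_sigcont(struct toy_task *t)
{
    t->group_stopped = false;
}
```

Under this model the restart request after the second SIGSTOP succeeds
but the task stays stopped until SIGCONT, and a second restart request
with no stop pending is rejected.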

How do you prefer to define it?

-- 
vda

^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 17:39                                       ` Oleg Nesterov
@ 2011-02-14 17:57                                         ` Denys Vlasenko
  2011-02-14 18:00                                           ` Oleg Nesterov
  2011-02-14 18:59                                         ` Denys Vlasenko
  1 sibling, 1 reply; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-14 17:57 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On Mon, Feb 14, 2011 at 6:39 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 02/14, Denys Vlasenko wrote:
>>
>> On Mon, Feb 14, 2011 at 4:31 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>> > On 02/13, Denys Vlasenko wrote:
>> >>
>> >> $ strace -tt sleep 30
>> >> 23:02:15.619262 execve("/bin/sleep", ["sleep", "30"], [/* 30 vars */]) = 0
>> >> ...
>> >> 23:02:15.622112 nanosleep({30, 0}, NULL) = ? ERESTART_RESTARTBLOCK (To be restarted)
>> >> 23:02:23.781165 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
>> >> 23:02:23.781251 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
>> >>     (I forgot again why we see it twice. Another quirk I guess...)
>> >
>> >      (this is correct, the tracee reports the signal=SIGSTOP, then
>> >       it reports it actually stopps with exit_code=SIGSTOP)
>>
>> Ah, I see. Is there any way debugger can distinguish between these two
>> different stops?
>
> IIRC, the (only?) way to distinguish is to check last_siginfo != NULL
> via ptrace(PTRACE_GETSIGINFO).

What do you think strace needs to do when it sees second SIGSTOP
(meaning "in theory", not "on current kernel which may be buggy")?

ptrace(PTRACE_SYSCALL, $PID, 0x1, 0)?
nothing?
something else?

-- 
vda

^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 17:57                                         ` Denys Vlasenko
@ 2011-02-14 18:00                                           ` Oleg Nesterov
  2011-02-14 18:06                                             ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 18:00 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/14, Denys Vlasenko wrote:
>
> On Mon, Feb 14, 2011 at 6:39 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> > On 02/14, Denys Vlasenko wrote:
> >>
> >> On Mon, Feb 14, 2011 at 4:31 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> >> > On 02/13, Denys Vlasenko wrote:
> >> >>
> >> >> $ strace -tt sleep 30
> >> >> 23:02:15.619262 execve("/bin/sleep", ["sleep", "30"], [/* 30 vars */]) = 0
> >> >> ...
> >> >> 23:02:15.622112 nanosleep({30, 0}, NULL) = ? ERESTART_RESTARTBLOCK (To be restarted)
> >> >> 23:02:23.781165 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> >> >> 23:02:23.781251 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> >> >>     (I forgot again why we see it twice. Another quirk I guess...)
> >> >
> >> >      (this is correct, the tracee reports the signal=SIGSTOP, then
> >> >       it reports it actually stopps with exit_code=SIGSTOP)
> >>
> >> Ah, I see. Is there any way debugger can distinguish between these two
> >> different stops?
> >
> > IIRC, the (only?) way to distinguish is to check last_siginfo != NULL
> > via ptrace(PTRACE_GETSIGINFO).
>
> What do you think strace needs to do when it sees second SIGSTOP
> (meaning "in theory", not "on current kernel which may be buggy")?
>
> ptrace(PTRACE_SYSCALL, $PID, 0x1, 0)?

Probably this, or even ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP).
I think.

(assuming that ptrace_resume() respects TASK_STOPPED)

Oleg.


^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 18:00                                           ` Oleg Nesterov
@ 2011-02-14 18:06                                             ` Oleg Nesterov
  0 siblings, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 18:06 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/14, Oleg Nesterov wrote:
>
> On 02/14, Denys Vlasenko wrote:
> >
> > On Mon, Feb 14, 2011 at 6:39 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> > > On 02/14, Denys Vlasenko wrote:
> > >>
> > >> On Mon, Feb 14, 2011 at 4:31 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> > >> > On 02/13, Denys Vlasenko wrote:
> > >> >>
> > >> >> $ strace -tt sleep 30
> > >> >> 23:02:15.619262 execve("/bin/sleep", ["sleep", "30"], [/* 30 vars */]) = 0
> > >> >> ...
> > >> >> 23:02:15.622112 nanosleep({30, 0}, NULL) = ? ERESTART_RESTARTBLOCK (To be restarted)
> > >> >> 23:02:23.781165 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> > >> >> 23:02:23.781251 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> > >> >>     (I forgot again why we see it twice. Another quirk I guess...)
> > >> >
> > >> >      (this is correct, the tracee reports the signal=SIGSTOP, then
> > >> >       it reports it actually stopps with exit_code=SIGSTOP)
> > >>
> > >> Ah, I see. Is there any way debugger can distinguish between these two
> > >> different stops?
> > >
> > > IIRC, the (only?) way to distinguish is to check last_siginfo != NULL
> > > via ptrace(PTRACE_GETSIGINFO).
> >
> > What do you think strace needs to do when it sees second SIGSTOP
> > (meaning "in theory", not "on current kernel which may be buggy")?
> >
> > ptrace(PTRACE_SYSCALL, $PID, 0x1, 0)?
>
> proably this, or even ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP).
> I think.
>
> (assuming that ptrace_resume() respects TASK_STOPPED)

Oh, but I forgot to mention... there is another problem, _any_
ptrace request when the tracee is stopped turns it into TASK_TRACED.

Oleg.


^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 14:50                                   ` Tejun Heo
@ 2011-02-14 18:53                                     ` Oleg Nesterov
  0 siblings, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 18:53 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/14, Tejun Heo wrote:
>
> Hello, Oleg.  Sorry about the delay.
>
> On Wed, Feb 09, 2011 at 10:25:26PM +0100, Oleg Nesterov wrote:
> > >   Note that { the task is put into TASK_TRACED state and group stop
> > >   resume by SIGCONT is ignored. | the task is put into TASK_STOPPED
> > >   state and the following PTRACE request will transition it into
> > >   TASK_TRACED.  If SIGCONT is received before transition to
> > >   TASK_TRACED is made, the task will resume execution.  If PTRACE
> > >   request faces with SIGCONT, PTRACE request may fail. }
> >
> > To me, the first variant looks better. But, only because it is closer
> > to the current behaviour. I mean, it is better to change the things
> > incrementally.
>
> Alright, it's the first variant then.
>
> > But in the longer term - I do not know. Personally, I like the
> > TASK_STOPPED variant. To the point, I was thinking that (perhaps)
> > we can change ptrace_stop() so that it simply calls do_signal_stop()
> > if it notices ->group_stop_count != 0.
>
> I don't know.  IMHO it's enough to give the ptracer a way to honor
> group stop and notification so things like strace can stay more or
> less transparent.  I don't think it's something we need to enforce.

Agreed, I don't know either.

> > And this is what I do not like. I just can't accept the fact there
> > is a running thread in the SIGNAL_STOP_STOPPED group.
>
> Let's agree to disagree here.  I agree it's weird but don't think it's
> necessarily a bad thing that needs to be changed.

Yes, I see.

> > Yes! and this is very good argument in favour of all your objections ;)
>
> My objections to what?  The one thing that I'm really against is
> allowing group stop to override PTRACE_CONT and that's visible
> regardless of notifications.  Other than that, I think we're pretty
> much in agreement now, aren't we?

Ah, OK, good.

> > Oh, currently I am ignoring this, my only concern is how this all
> > looks to the userland. But this is the good point, and I have to admit
> > that I never realized this is just wrong. Yes, I agree, we should do
> > something, but this is not visible to user-space (except this should
> > fix the bug ;)
>
> Mostly not but there's the obscure window where the tracee goes
> through TASK_RUNNING which can be visible from userland from another
> task of the ptracer group, which I think is ignorable as long as it's
> properly masked from the ptracing task itself.  I wanted to make sure
> we agree on that one.

I think we are.

> > Sure, if we are talking about SIGCONT from "nowhere". But, the same
> > way ^Z is not reliable too.
>
> It doesn't even have to be from "nowhere".  A process can be raising
> SIGCONT itself.  To me, the whole thing feels more like an
> administration aid by design than a strict monitor/control mechanism.

Yes, this is true.

> > I hate this from the time when I noticed that the application doesn't
> > respond to ^Z under strace. And I used strace exactly because I wanted
> > do debug some (I can't recall exactly) problems with jctl. That is all.
> >
> > But in any case. Some users run the services under ptrace. I mean,
> > the application borns/runs/dies under ptrace. That is why personally
> > I certainly do not like anything which delays until detach (say,
> > the-tracee-doesnt-participate-in-group-stop-until-detach logic).
>
> I see, so let's make them participate in the group stop and notify the
> real parent of the group stop.

Great.

> As long as PTRACE_CONT behavior isn't
> changed, I don't object.

Well, this is important. And, the changes above depend on this. I mean,
if we do not change PTRACE_CONT behavior, then the tracee should remember
if it already participated in this group stop (like your previous patches
did).

I do not think I can add something else to this discussion. But as you
already noticed, Denys has joined the club ;)

And I hope Jan and Roland can comment on this too.

I won't argue any longer (at least I'll try ;), I understand your concerns.

Oleg.


^ permalink raw reply	[flat|nested] 160+ messages in thread

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 17:51                                       ` Oleg Nesterov
@ 2011-02-14 18:55                                         ` Denys Vlasenko
  2011-02-14 19:01                                           ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-14 18:55 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On Mon, Feb 14, 2011 at 6:51 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 02/14, Denys Vlasenko wrote:
>>
>> >> $ strace -tt sleep 30
>> >> 23:02:15.619262 execve("/bin/sleep", ["sleep", "30"], [/* 30 vars */]) = 0
>> >> ...
>> >> 23:02:15.622112 nanosleep({30, 0}, NULL) = ? ERESTART_RESTARTBLOCK (To be restarted)
>> >> 23:02:23.781165 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
>> >> 23:02:23.781251 --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
>> >>     (I forgot again why we see it twice. Another quirk I guess...)
>> >> 23:02:23.781310 restart_syscall(<... resuming interrupted call ...>) = 0
>> >> 23:02:45.622433 close(1)                = 0
>> >> 23:02:45.622743 close(2)                = 0
>> >> 23:02:45.622885 exit_group(0)           = ?
>> >>
>> >> Why sleep didn't stop?
>> >>
>> >> Because PTRACE_SYSCALL brought the task out of group stop at once,
>> >> even though strace did try hard to not do so:
>> >>
>> >>     ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP) <-- note SIGSTOP!
>> >>
>> >> PTRACE_CONT in this situation would do the same.
>> >
>> > This can be fixed by updating strace, right?  strace can look at the
>> > wait(2) exit code and if the tracee stopped for group stop, wait for
>> > the tracee to be continued instead of issuing PTRACE_SYSCALL.
>
> Ah, I seem to understand the confusion, let me repeat...
>
>> But tracee didn't stop _yet_.
>
> This depends on "_yet_". strace does ptrace(SYSCALL, SIGSTOP) twice.
> The first time it does this after the tracee reports the signal, and
> the tracee stopps.
>
>> Signal is not delivered _yet_, debugger
>> can decide at this point whether to deliver it:
>> ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP)
>> or ignore:
>> ptrace(PTRACE_SYSCALL, $PID, 0x1, 0)
>>
>> strace has to deliver SIGSTOP if it wants to make program run exactly
>> as it would run without strace. So it tries to do so.
>> Currently, ptrace machinery doesn't react as strace, its user, expects it to.
>
> It does, see above. But then the tracee actually stopps, and report
> this to the tracer. However, strace handles this case as if this was
> another signal=SIGSTOP, so it does ptrace(SYSCALL, SIGSTOP) again.
>
> SIGSTOP has no effect, but PTRACE_SYSCALL wakeups the tracee.

I performed a small experiment. You are right, SIGSTOP here
is ignored, and PTRACE_SYSCALL wakes the tracee up:
replacing SIGSTOP with 0 doesn't change anything.

I tried to simply not do ptrace(PTRACE_SYSCALL, ..., 0) at all.
Behavior changes, but it is still wrong. Now tracee doesn't wake up
on SIGCONT. Here is the run of modified strace:

# strace -tt -s99 -oLOG ./strace sleep 55
execve("/bin/sleep", ["sleep", "55"], [/* 48 vars */]) = 0
brk(0)                                  = 0x22a9000
...
nanosleep({55, 0}, NULL)                = ? ERESTART_RESTARTBLOCK (To be restarted)
          <-- kill -STOP 25339
--- SIGSTOP (Stopped (signal)) @ 0 (0) --- STOP: si_signo:19 si_code:0 si_status:0 si_value:(nil)
--- SIGSTOP (Stopped (signal)) @ 0 (0) --- STOP: ptrace(PTRACE_GETSIGINFO) failed
...does not exit for minutes...
           <-- kill -CONT 25339
...still nothing, it is stopped, does not exit for minutes...
           <-- kill -KILL 25339
+++ killed by SIGKILL +++


Here is what patched strace saw and did:

19:41:09.601764 wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], __WALL, NULL) = 25339
19:41:09.601914 rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT PIPE TERM], NULL, 8) = 0
19:41:09.602081 ptrace(PTRACE_GETSIGINFO, 25339, 0, {si_signo=SIGSTOP, si_code=SI_USER, si_pid=10105, si_uid=0, si_value={int=0, ptr=0}}) = 0
19:41:09.602273 write(2, "--- SIGSTOP (Stopped (signal)) @ 0 (0) --- STOP: si_signo:19 si_code:0 si_status:0 si_value:(nil) \n", 99) = 99
19:41:09.602456 ptrace(PTRACE_SYSCALL, 25339, 0x1, SIGSTOP) = 0
19:41:09.602582 --- SIGCHLD (Child exited) @ 0 (0) ---
19:41:09.602652 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0

19:41:09.602792 wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], __WALL, NULL) = 25339
19:41:09.602927 rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT PIPE TERM], NULL, 8) = 0
19:41:09.603081 ptrace(PTRACE_GETSIGINFO, 25339, 0, 0x7fff436fc730) = -1 EINVAL (Invalid argument)
19:41:09.603231 write(2, "--- SIGSTOP (Stopped (signal)) @ 0 (0) --- STOP: ptrace(PTRACE_GETSIGINFO) failed \n", 83) = 83
19:41:09.603369 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
      <<<< the change is here. Unpatched strace would do ptrace(PTRACE_SYSCALL, 25339, 0x1, SIGSTOP) >>>

19:41:09.603511 wait4(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL}], __WALL, NULL) = 25339
      <<<< SIGCONT is not visible! >>>>>
19:47:00.836723 --- SIGCHLD (Child exited) @ 0 (0) ---
19:47:00.836804 rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT PIPE TERM], NULL, 8) = 0
19:47:00.837010 write(2, "+++ killed by SIGKILL +++\n", 26) = 26
19:47:00.837212 rt_sigaction(SIGKILL, {SIG_DFL, [KILL],
SA_RESTORER|SA_RESTART, 0x7f5df12d5970}, {0x7fff436f0043, ~[HUP INT
BUS USR2 PIPE ALRM TTIN XCPU PROF WINCH IO PWR RTMIN RT_16 RT_17 RT_18
RT
19:47:00.837458 gettid()                = 25338
19:47:00.837596 tgkill(25338, 25338, SIGKILL <unfinished ...>
19:47:00.837831 +++ killed by SIGKILL +++

As you see, SIGCONT was completely invisible to debugger.

-- 
vda

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 17:39                                       ` Oleg Nesterov
  2011-02-14 17:57                                         ` Denys Vlasenko
@ 2011-02-14 18:59                                         ` Denys Vlasenko
  1 sibling, 0 replies; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-14 18:59 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On Mon, Feb 14, 2011 at 6:39 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>> Ah, I see. Is there any way debugger can distinguish between these two
>> different stops?
>
> IIRC, the (only?) way to distinguish is to check last_siginfo != NULL
> via ptrace(PTRACE_GETSIGINFO).

Right. ptrace(PTRACE_GETSIGINFO) will fail on non-signal-delivery stops.
Thus, it can be used to distinguish different kinds of SIGSTOPs in my example.
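
A minimal user-space sketch of that check, assuming a Linux tracer that
has just seen waitpid() report a stop. The names classify_stop(),
probe_stop() and enum stop_kind are made up for illustration; only
ptrace(PTRACE_GETSIGINFO) and the wait macros are real interfaces. As
Oleg notes in this thread, the PTRACE_GETSIGINFO call itself changes a
group-stopped tracee to TASK_TRACED, so the probe has side effects.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical classification, not a kernel or strace type. */
enum stop_kind {
    STOP_NONE,            /* wait status does not describe a stop */
    STOP_SIGNAL_DELIVERY, /* siginfo available: a signal is being reported */
    STOP_NO_SIGINFO,      /* EINVAL: no siginfo, as in the second stop above */
    STOP_UNKNOWN          /* some other error, e.g. ESRCH if the tracee died */
};

/* Pure decision step, kept separate so it can be checked without a tracee. */
enum stop_kind classify_stop(int wstatus, long ret, int err)
{
    if (!WIFSTOPPED(wstatus))
        return STOP_NONE;
    if (ret == 0)
        return STOP_SIGNAL_DELIVERY;
    if (err == EINVAL)
        return STOP_NO_SIGINFO;
    return STOP_UNKNOWN;
}

/* Ask the kernel whether the stopped tracee has a siginfo to report. */
enum stop_kind probe_stop(pid_t pid, int wstatus)
{
    siginfo_t si;
    long ret;

    assert(pid > 0);
    errno = 0;
    ret = ptrace(PTRACE_GETSIGINFO, pid, NULL, &si);
    return classify_stop(wstatus, ret, errno);
}
```

A tracer would call probe_stop() right after waitpid() and only pass the
stop signal back to the tracee when the result is STOP_SIGNAL_DELIVERY.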
-- 
vda

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 18:55                                         ` Denys Vlasenko
@ 2011-02-14 19:01                                           ` Oleg Nesterov
  2011-02-14 19:42                                             ` Denys Vlasenko
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 19:01 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/14, Denys Vlasenko wrote:
>
> I tried to simply not do ptrace(PTRACE_SYSCALL, ..., 0) at all.
> Behavior changes, but it is still wrong. Now tracee doesn't wake up
> on SIGCONT.

please see below,

> 19:41:09.601764 wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}],
> __WALL, NULL) = 25339
> 19:41:09.601914 rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT PIPE TERM], NULL, 8) = 0
> 19:41:09.602081 ptrace(PTRACE_GETSIGINFO, 25339, 0, {si_signo=SIGSTOP,
> si_code=SI_USER, si_pid=10105, si_uid=0, si_value={int=0, ptr=0}}) = 0
> 19:41:09.602273 write(2, "--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
> STOP: si_signo:19 si_code:0 si_status:0 si_value:(nil) \n", 99) = 99
> 19:41:09.602456 ptrace(PTRACE_SYSCALL, 25339, 0x1, SIGSTOP) = 0
> 19:41:09.602582 --- SIGCHLD (Child exited) @ 0 (0) ---
> 19:41:09.602652 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> 19:41:09.602792 wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}],
> __WALL, NULL) = 25339

OK, it is stopped.

> 19:41:09.603081 ptrace(PTRACE_GETSIGINFO, 25339, 0, 0x7fff436fc730) =

And this changes the state to TASK_TRACED. See another email from me.
That is why SIGCONT doesn't work.

This is another problem, the kernel should help somehow. This was
already discussed a bit, but it is not clear what exactly we can do.

Oleg.


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 19:01                                           ` Oleg Nesterov
@ 2011-02-14 19:42                                             ` Denys Vlasenko
  2011-02-14 20:01                                               ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-14 19:42 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On Mon, Feb 14, 2011 at 8:01 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 02/14, Denys Vlasenko wrote:
>>
>> I tried to simply not do ptrace(PTRACE_SYSCALL, ..., 0) at all.
>> Behavior changes, but it is still wrong. Now tracee doesn't wake up
>> on SIGCONT.
>
> please see below,
>
>> 19:41:09.601764 wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}],
>> __WALL, NULL) = 25339
>> 19:41:09.601914 rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT PIPE TERM], NULL, 8) = 0
>> 19:41:09.602081 ptrace(PTRACE_GETSIGINFO, 25339, 0, {si_signo=SIGSTOP,
>> si_code=SI_USER, si_pid=10105, si_uid=0, si_value={int=0, ptr=0}}) = 0
>> 19:41:09.602273 write(2, "--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
>> STOP: si_signo:19 si_code:0 si_status:0 si_value:(nil) \n", 99) = 99
>> 19:41:09.602456 ptrace(PTRACE_SYSCALL, 25339, 0x1, SIGSTOP) = 0
>> 19:41:09.602582 --- SIGCHLD (Child exited) @ 0 (0) ---
>> 19:41:09.602652 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
>> 19:41:09.602792 wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}],
>> __WALL, NULL) = 25339
>
> OK, it is stopped.
>
>> 19:41:09.603081 ptrace(PTRACE_GETSIGINFO, 25339, 0, 0x7fff436fc730) =
>
> And this changes the state to TASK_TRACED. See another email from me.
> That is why SIGCONT doesn't work.
>
> This is another problem, the kernel should help somehow. This was
> already discussed a bit, but it is not clear what exactly we can do.

Yes, I understand what happens.

Basically, we have TASK_RUNNING, TASK_STOPPED and TASK_TRACED
states, and after entering TASK_TRACED state we lose information
in which state we were before entering it. We need to remember
old state and restore it in order for this example to work.

Or we can avoid entering TASK_TRACED on ptrace(PTRACE_GETSIGINFO) et al.
Can we remain in TASK_STOPPED?

-- 
vda

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 19:42                                             ` Denys Vlasenko
@ 2011-02-14 20:01                                               ` Oleg Nesterov
  2011-02-15 15:24                                                 ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-14 20:01 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/14, Denys Vlasenko wrote:
>
> Basically, we have TASK_RUNNING, TASK_STOPPED and TASK_TRACED
> states, and after entering TASK_TRACED state we lose information
> in which state we were before entering it. We need to remember
> old state and restore it in order for this example to work.

Actually, we do not lose this info. So the kernel can change it
back to TASK_STOPPED after ptrace(PTRACE_CONT), and this is what
PTRACE_CONT-doesnt-resume-until-SIGCONT was supposed to do.

> Or we can avoid entering TASK_TRACED on ptrace(PTRACE_GETSIGINFO) et al.
> Can we remain in TASK_STOPPED?

Oh, unlikely, I think.

Oleg.


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 20:01                                               ` Oleg Nesterov
@ 2011-02-15 15:24                                                 ` Tejun Heo
  2011-02-15 15:58                                                   ` Oleg Nesterov
  2011-02-15 17:31                                                   ` Roland McGrath
  0 siblings, 2 replies; 160+ messages in thread
From: Tejun Heo @ 2011-02-15 15:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Denys Vlasenko, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hello, Oleg, Denys.

On Mon, Feb 14, 2011 at 09:01:30PM +0100, Oleg Nesterov wrote:
> > Or we can avoid entering TASK_TRACED on ptrace(PTRACE_GETSIGINFO) et al.
> > Can we remain in TASK_STOPPED?
> 
> Oh, unlikely, I think.

Actually I was thinking along this line.  We can allow
PTRACE_GETSIGINFO to proceed without forcing the tracee into TRACED
state, the rationale being the operation is required to tell between
group stop and ptrace trap.  Am I missing something?

Thanks.

-- 
tejun

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-15 15:24                                                 ` Tejun Heo
@ 2011-02-15 15:58                                                   ` Oleg Nesterov
  2011-02-15 17:31                                                   ` Roland McGrath
  1 sibling, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-15 15:58 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Denys Vlasenko, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hello Tejun,

On 02/15, Tejun Heo wrote:
>
> Hello, Oleg, Denys.
>
> On Mon, Feb 14, 2011 at 09:01:30PM +0100, Oleg Nesterov wrote:
> > > Or we can avoid entering TASK_TRACED on ptrace(PTRACE_GETSIGINFO) et al.
> > > Can we remain in TASK_STOPPED?
> >
> > Oh, unlikely, I think.
>
> Actually I was thinking along this line.  We can allow
> PTRACE_GETSIGINFO to proceed without forcing the tracee into TRACED
> state, the rationale being the operation is required to tell between
> group stop and ptrace trap.  Am I missing something?

I do not think this is really wrong (except this means another user
visible change and I never know if it is fine).

But I think it doesn't really help. Yes, this is probably enough for
strace (I don't know for sure), but a more "sophisticated" debugger
may want to do something else with the stopped tracee.


And. Denys suggested this assuming PTRACE_CONT-doesnt-resume-until-SIGCONT,
and in this case this is not really needed. The debugger can safely do
PTRACE_GETSIGINFO even if this changes the state to TASK_TRACED.
Once it does PTRACE_CONT the tracee becomes "visible" to SIGCONT. Or,
if SIGCONT comes in between, the tracee runs.
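
The sequence described above can be sketched as a toy state machine.
This is not kernel code and only models the *proposed* behaviour
(PTRACE_CONT-doesnt-resume-until-SIGCONT); every name below is
hypothetical, and group_stopped merely stands in for the
SIGNAL_STOP_STOPPED flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_state { MODEL_RUNNING, MODEL_STOPPED, MODEL_TRACED };

struct tracee_model {
    enum model_state state;
    bool group_stopped; /* stands in for SIGNAL_STOP_STOPPED */
};

/* A stop signal takes effect: the group stop is remembered in the flag. */
void model_group_stop(struct tracee_model *t)
{
    assert(t != NULL);
    t->group_stopped = true;
    if (t->state == MODEL_RUNNING)
        t->state = MODEL_STOPPED;
}

/* Any ptrace request on a stopped tracee, e.g. PTRACE_GETSIGINFO. */
void model_ptrace_request(struct tracee_model *t)
{
    assert(t != NULL);
    if (t->state == MODEL_STOPPED)
        t->state = MODEL_TRACED;
}

/* SIGCONT ends the group stop; a traced task stays held by the tracer. */
void model_sigcont(struct tracee_model *t)
{
    assert(t != NULL);
    t->group_stopped = false;
    if (t->state == MODEL_STOPPED)
        t->state = MODEL_RUNNING;
}

/* Proposed PTRACE_CONT: fall back to the group stop if it is still in effect. */
void model_ptrace_cont(struct tracee_model *t)
{
    assert(t != NULL);
    if (t->state == MODEL_TRACED)
        t->state = t->group_stopped ? MODEL_STOPPED : MODEL_RUNNING;
}
```

In this model the TRACED state never loses the fact that a group stop is
in effect, which is why the debugger can issue PTRACE_GETSIGINFO first
and PTRACE_CONT afterwards without hiding the tracee from SIGCONT.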

Oleg.


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-15 15:24                                                 ` Tejun Heo
  2011-02-15 15:58                                                   ` Oleg Nesterov
@ 2011-02-15 17:31                                                   ` Roland McGrath
  2011-02-15 20:27                                                     ` Oleg Nesterov
  1 sibling, 1 reply; 160+ messages in thread
From: Roland McGrath @ 2011-02-15 17:31 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, Denys Vlasenko, jan.kratochvil, linux-kernel,
	torvalds, akpm

> Actually I was thinking along this line.  We can allow
> PTRACE_GETSIGINFO to proceed without forcing the tracee into TRACED
> state, the rationale being the operation is required to tell between
> group stop and ptrace trap.  Am I missing something?

The reason for the transition to TASK_TRACED is to prevent a race with
SIGCONT waking the task.  There is always a race with SIGKILL waking it,
but the circumstances where that can really matter are far fewer.
You need to make sure that the work PTRACE_GETSIGINFO does to access
last_siginfo cannot race with that pointer disappearing or the stack
space it points to becoming invalid.  I think the use of siglock ensures
that, but Oleg should verify it.


Thanks,
Roland

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-15 17:31                                                   ` Roland McGrath
@ 2011-02-15 20:27                                                     ` Oleg Nesterov
  2011-02-18 17:02                                                       ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-15 20:27 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Tejun Heo, Denys Vlasenko, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/15, Roland McGrath wrote:
>
> > Actually I was thinking along this line.  We can allow
> > PTRACE_GETSIGINFO to proceed without forcing the tracee into TRACED
> > state, the rationale being the operation is required to tell between
> > group stop and ptrace trap.  Am I missing something?
>
> The reason for the transition to TASK_TRACED is to prevent a race with
> SIGCONT waking the task.  There is always a race with SIGKILL waking it,
> but the circumstances where that can really matter are far fewer.
> You need to make sure that the work PTRACE_GETSIGINFO does to access
> last_siginfo cannot race with that pointer disappearing or the stack
> space it points to becoming invalid.  I think the use of siglock ensures
> that, but Oleg should verify it.

Yes, I think this is safe.

I do not really like this idea because it looks a bit strange to treat
PTRACE_GETSIGINFO specially, and this doesn't solve all problems. And,
once again, I still hope we can change ptrace_resume() so that it doesn't
wake up the stopped (I mean, SIGNAL_STOP_STOPPED) tracee, in this case this
hack is not needed.

And. We are going to add new requests which don't need the stopped
tracee anyway. So we can just add PTRACE_HAS_SIGINFO which returns
child->last_siginfo != NULL. This looks simpler, and this is compatible.
Of course this check is racy, but this doesn't matter. PTRACE_GETSIGINFO
is equally racy if it doesn't change the state to TASK_TRACED.

But I won't argue if you/Denys/Tejun prefer to change PTRACE_GETSIGINFO.

Oleg.


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 17:20                                     ` Denys Vlasenko
  2011-02-14 17:30                                       ` Tejun Heo
  2011-02-14 17:51                                       ` Oleg Nesterov
@ 2011-02-16 21:51                                       ` Jan Kratochvil
  2011-02-17  3:37                                         ` Denys Vlasenko
  2011-02-17 16:49                                         ` Oleg Nesterov
  2 siblings, 2 replies; 160+ messages in thread
From: Jan Kratochvil @ 2011-02-16 21:51 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Tejun Heo, Oleg Nesterov, Roland McGrath, linux-kernel, torvalds, akpm

On Mon, 14 Feb 2011 18:20:52 +0100, Denys Vlasenko wrote:
> Jan, please put on your gdb maintainer's hat, we need your opinion here.
> Is it a problem from gdb's POV?

Here is a summary of current and my wished behavior:

Make PTRACE_DETACH (data=SIGSTOP) working - that is to leave the process in
`T (stopped)' without any single PC step.  This works in some kernels and
does not work in other kernels, it is "detach-stopped" test in:
cvs -d :pserver:anoncvs:anoncvs@sourceware.org:/cvs/systemtap co ptrace-tests 

The current upstream GDB trick of
  PTRACE_ATTACH
  if /proc/PID/status->State: == `T (stopped)'
    tgkill(SIGSTOP)
    PTRACE_CONT(0)
  waitpid->SIGSTOP (or preceded by some other signal but 1x SIGSTOP will come)
should remain compatible, as is implemented in:
http://sourceware.org/cgi-bin/cvsweb.cgi/src/gdb/linux-nat.c.diff?r1=1.80&r2=1.81&cvsroot=src
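
One possible reading of the steps above as C, for a single-threaded
tracee. The function names attach_with_trick() and status_says_stopped()
are hypothetical; the real implementation is in the linux-nat.c change
linked above and differs in detail. Signals that arrive before the
SIGSTOP are simply dropped in this sketch, which a real debugger would
not do.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* True if the "State:" line of /proc/PID/status text reads "T (stopped)". */
bool status_says_stopped(const char *text)
{
    const char *line = text;

    assert(text != NULL);
    while (line != NULL && *line != '\0') {
        if (strncmp(line, "State:", 6) == 0) {
            const char *v = line + 6;
            while (*v == ' ' || *v == '\t')
                v++;
            return strncmp(v, "T (stopped)", 11) == 0;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return false;
}

/* Returns 0 once the tracee has reported SIGSTOP, -1 on any failure. */
int attach_with_trick(pid_t pid)
{
    char path[64], buf[4096];
    size_t n;
    int status;
    FILE *f;

    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;

    snprintf(path, sizeof(path), "/proc/%d/status", (int)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';

    if (status_says_stopped(buf)) {
        /* Already stopped: queue a fresh SIGSTOP and let the tracee report it. */
        if (syscall(SYS_tgkill, pid, pid, SIGSTOP) == -1)
            return -1;
        if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1)
            return -1;
    }

    for (;;) {
        if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status))
            return -1;
        if (WSTOPSIG(status) == SIGSTOP)
            return 0;
        /* Simplification: drop the earlier signal instead of saving it. */
        if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1)
            return -1;
    }
}
```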

Make the GDB trick above no longer needed, so that in the case it was invented
for a simple PTRACE_ATTACH, wait->SIGSTOP, PTRACE_DETACH(0) also works:
  foreign process: kill(child process, SIGSTOP)
  parent process:  wait() -> SIGSTOP (the notification is now eaten-out)
  child process is now in `T (stopped)'
  debugger: PTRACE_ATTACH(child process)
  debugger: waitpid -> should get SIGSTOP, even despite it was eaten-out above
This works in some kernels and does not work in other kernels.

A new proposal is to preserve the process's `T (stopped)' for
a naive/legacy debugger / ptrace tool doing PTRACE_ATTACH, wait->SIGSTOP,
PTRACE_DETACH(0), incl. GDB doing the "GDB trick" above.
That is after PTRACE_DETACH(0) the process should remain `T (stopped)'
iff the process was `T (stopped)' before PTRACE_ATTACH.
 - PTRACE_DETACH(0)       should preserve `T (stopped)'.
but also:
 - PTRACE_DETACH(SIGSTOP) should force `T (stopped)'.
 - PTRACE_DETACH(SIGCONT) should force freely running process.
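
The three rules above can be written down as a small table in C. This
only encodes the wished-for semantics, not what any kernel does today;
detach_outcome() and enum detach_result are hypothetical names, and the
sketch assumes no SIGSTOP or SIGCONT reaches the tracee between attach
and detach. Other data values are not covered by the proposal and are
reported as unspecified.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical result type for the proposal above. */
enum detach_result {
    DETACH_RUNNING,     /* process runs freely after detach */
    DETACH_STOPPED,     /* process is left in `T (stopped)' */
    DETACH_UNSPECIFIED  /* the proposal does not say */
};

/* Expected state after PTRACE_DETACH(data) under the proposed rules. */
enum detach_result detach_outcome(bool stopped_before_attach, int data)
{
    assert(data >= 0);
    if (data == 0)
        return stopped_before_attach ? DETACH_STOPPED : DETACH_RUNNING;
    if (data == SIGSTOP)
        return DETACH_STOPPED;
    if (data == SIGCONT)
        return DETACH_RUNNING;
    return DETACH_UNSPECIFIED;
}
```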


The behavior of SIGSTOP and SIGCONT received during active ptrace session
I find as a new feature without having much to keep backward compatibility.
+
You have concluded a plan how to do a real `T (stopped)' on received SIGSTOP
using PTRACE_GETSIGINFO, OK, go with that.
+
Personally I would keep it completely hidden from the debugger and only
remember the last SIGCONT vs. SIGSTOP for the case the session ends with
PTRACE_DETACH(0).  Debugger/strace would not be able to display any externally
received SIGSTOP/SIGCONT.  PTRACE_CONT(SIGSTOP) and PTRACE_CONT(SIGCONT)
should behave as PTRACE_CONT(0) to clean up compatibility with existing tools.
For a general transparent tracing there is at least systemtap.


Thanks,
Jan

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-16 21:51                                       ` Jan Kratochvil
@ 2011-02-17  3:37                                         ` Denys Vlasenko
  2011-02-17 19:19                                           ` Oleg Nesterov
  2011-02-17 16:49                                         ` Oleg Nesterov
  1 sibling, 1 reply; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-17  3:37 UTC (permalink / raw)
  To: Jan Kratochvil
  Cc: Tejun Heo, Oleg Nesterov, Roland McGrath, linux-kernel, torvalds, akpm

Hi Jan,

Thanks for joining!


On Wednesday 16 February 2011 22:51, Jan Kratochvil wrote:
> On Mon, 14 Feb 2011 18:20:52 +0100, Denys Vlasenko wrote:
> > Jan, please put on your gdb maintainer's hat, we need your opinion here.
> > Is it a problem from gdb's POV?
> 
> Here is a summary of current and my wished behavior:
> 
> Make PTRACE_DETACH (data=SIGSTOP) working - that is to leave the process in
> `T (stopped)' without any single PC step.

This coincides with my and Oleg's desire to make SIGSTOP work with
all ptrace restart commands.


> This works in some kernels and 
> does not work in other kernels, it is "detach-stopped" test in:
> cvs -d :pserver:anoncvs:anoncvs@sourceware.org:/cvs/systemtap co ptrace-tests 
> 
> The current upstream GDB trick of
>   PTRACE_ATTACH
>   if /proc/PID/status->State: == `T (stopped)'
>     tgkill(SIGSTOP)
>     PTRACE_CONT(0)
>   waitpid->SIGSTOP (or preceded by some other signal but 1x SIGSTOP will come)

I don't fully understand the steps of the trick.

* ptrace(PTRACE_ATTACH)
* [do you waitpid? How exactly (once? loop until SIGSTOP or ECHILD?)]
* if /proc/PID/status->State: == `T (stopped)'   [why? Is it test
                                 "was it already stopped when we attached?"
                                 What are the other possible states?]
    tgkill(SIGSTOP)
    PTRACE_CONT(0)
* waitpid->SIGSTOP  [What does this mean? "Loop on waitpid until SIGSTOP"?]

Moreover, I don't see the actual detaching step.

Can you describe the trick in more detail?


> should remain compatible, as is implemented in:
> http://sourceware.org/cgi-bin/cvsweb.cgi/src/gdb/linux-nat.c.diff?r1=1.80&r2=1.81&cvsroot=src

If I understand correctly what you are trying to do -
to inject another SIGSTOP so that DETACH won't miss it,
then a fixed ptrace which doesn't lose SIGSTOPs
will also work with the old gdb which does this trick.
The trick will be unnecessary, but it will still work.


> Make the GDB trick above no longer needed, so that in the case it was invented
> for a simple PTRACE_ATTACH, wait->SIGSTOP, PTRACE_DETACH(0) also works:
>   foreign process: kill(child process, SIGSTOP)
>   parent process:  wait() -> SIGSTOP (the notification is now eaten-out)
>   child process is now in `T (stopped)'
>   debugger: PTRACE_ATTACH(child process)
>   debugger: waitpid -> should get SIGSTOP, even despite it was eaten-out above
> This works in some kernels and does not work in other kernels.

I believe there is a proposal to add PTRACE_ATTACH_NOSTOP+PTRACE_INTERRUPT,
which I guess is basically PTRACE_STOP replacement which does not mess things up
with artificial SIGSTOP. Aha, I found it:

http://sourceware.org/ml/archer/2011-q1/msg00026.html

(I do not understand why PTRACE_ATTACH_NOSTOP doesn't simply
make next waitpid() return SIGTRAP stop, but maybe it makes sense
after deeper analysis... anyway, PTRACE_ATTACH_NOSTOP + PTRACE_INTERRUPT
sequence described there is not a problem to use by userspace)

This will work reliably both in the above case and in the case
where the real parent has not yet eaten the STOP notification.


> A new proposal is to preserve the process's `T (stopped)' for
> a naive/legacy debugger / ptrace tool doing PTRACE_ATTACH, wait->SIGSTOP,
> PTRACE_DETACH(0), incl. GDB doing the "GDB trick" above.

Right, that's the overarching idea: to preserve the stopped state
across ptrace ops.


> That is after PTRACE_DETACH(0) the process should remain `T (stopped)'
> iff the process was `T (stopped)' before PTRACE_ATTACH.
>  - PTRACE_DETACH(0)       should preserve `T (stopped)'.

I assume you are thinking about PTRACE_ATTACH + wait():SIGSTOP
+ PTRACE_DETACH(0) sequence.

It looks logical to use 0 in *this* sequence, but consider
the following sequence:

....
ptrace(PTRACE_CONT, 0)
waitpid(): got SIGFOO
ptrace(PTRACE_CONT/SYSCALL/DETACH, 0)

What do we have here? A signal delivery notification.
In this case, in the next ptrace restarting operation, we specify
whether to inject the signal or not. PTRACE_DETACH is a restarting
op. SIGSTOP is a signal. Why should the rules for them, or their
combination, be different? Why should 0 still inject the stop?

Basically, assuming kernel got fixed wrt loss of group-stop state,
I don't see why gdb can't simply use PTRACE_DETACH with whatever
signal it saw in last stop (using 0 if it saw non-signal stop)?
If you did see SIGSTOP delivery, pass SIGSTOP with PTRACE_DETACH,
and it will be preserved.
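
A minimal sketch of that rule, assuming the tracer already knows whether
the last stop was a signal-delivery stop (for example by probing with
PTRACE_GETSIGINFO). restart_signal() and detach_passing_signal() are
hypothetical helpers, and whether the kernel honours the injected
SIGSTOP still depends on the group-stop fixes under discussion.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Signal to pass as the data argument of a restarting request. */
int restart_signal(bool signal_delivery_stop, int stop_sig)
{
    assert(stop_sig >= 0);
    /* Only a signal-delivery stop has a signal to hand back; otherwise use 0. */
    return signal_delivery_stop ? stop_sig : 0;
}

/* Detach, handing back whatever signal the last stop reported. */
long detach_passing_signal(pid_t pid, bool signal_delivery_stop, int stop_sig)
{
    int sig = restart_signal(signal_delivery_stop, stop_sig);

    return ptrace(PTRACE_DETACH, pid, NULL, (void *)(intptr_t)sig);
}
```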

Regarding PTRACE_ATTACH + wait():SIGSTOP + PTRACE_DETACH(???)
sequence. I think this SIGSTOP should be considered by kernel
not to be a signal delivery ptrace stop - because this SIGSTOP
is "artificial". Therefore, next PTRACE_DETACH should ignore
its signal parameter. IOW, it shouldn't matter whether you
use PTRACE_DETACH(0) or PTRACE_DETACH(SIGSTOP) - the stop state
should be preserved. IOW: neither SIGSTOP nor SIGCONT should
be injected. If we assume this interpretation, then PTRACE_DETACH(0)
will work correctly.


> but also:
>  - PTRACE_DETACH(SIGSTOP) should force `T (stopped)'.

Of course. It should inject SIGSTOP. That stops the process
(or rather, it should. You are right that currently it is not
always working).


>  - PTRACE_DETACH(SIGCONT) should force freely running process.

Of course. It should inject SIGCONT.

Do you need / want it to work even from a non-signal ptrace stop?
I.e. should this work too?

... ... waitpid():SIGTRAP + ptrace(PTRACE_DETACH, SIGCONT)


> The behavior of SIGSTOP and SIGCONT received during active ptrace session
> I find as a new feature without having much to keep backward compatibility.
> +
> You have concluded a plan how to do a real `T (stopped)' on received SIGSTOP
> using PTRACE_GETSIGINFO, OK, go with that.
> +
> Personally I would keep it completely hidden from the debugger and only
> remember the last SIGCONT vs. SIGSTOP for the case the session ends with
> PTRACE_DETACH(0).  Debugger/strace would not be able to display any externally
> received SIGSTOP/SIGCONT.  PTRACE_CONT(SIGSTOP) and PTRACE_CONT(SIGCONT)
> should behave as PTRACE_CONT(0) to clean up compatibility with existing tools.
> For a general transparent tracing there is at least systemtap.

Again, the idea is close to what you are saying, just a bit different.

The idea is to make SIGSTOP/SIGCONT no different from other signals,
as far as the userspace-visible ptrace API is concerned.

Two different ways you can end up with a stopped tracee:

(1) When waitpid says that tracee got SIGSTOP, the debugger must decide, does
it want to propagate it or not. If it decides to propagate it, it does
ptrace(PTRACE_restarting_op, SIGSTOP). Which injects SIGSTOP to tracee.
If PTRACE_restarting_op == PTRACE_DETACH, the tracee is detached and stopped.
Therefore you must not use ptrace(PTRACE_DETACH, 0), it is logically wrong
if you want SIGSTOP to take effect.

(Note: there is a small twist with PTRACE_restarting_op != PTRACE_DETACH,
because the tracee is still attached and therefore it immediately reports
to debugger that it indeed has stopped in response to SIGSTOP -
which looks very similar to *SIGSTOP delivery*. For example, currently
strace is confused - it thinks that *another SIGSTOP* came in -
and issues *another* ptrace(PTRACE_SYSCALL, SIGSTOP),
which, being issued _not_ after signal delivery ptrace stop, can't inject
the SIGSTOP. Which is harmless, but confusing. Querying GETSIGINFO
can disambiguate this, and make strace output more informative. I have a patch.
But for PTRACE_DETACH it doesn't matter...)

(2) If you PTRACE_ATTACHed (or better, PTRACE_ATTACH_NOSTOP + PTRACE_INTERRUPTed,
because PTRACE_ATTACH is racy versus simultaneous SIGSTOP)
to the process which is already stopped, you don't need to do
ptrace(PTRACE_DETACH, SIGSTOP) in order to detach and leave it stopped.
Actually, if you do waitpid():SIGTRAP + ptrace(PTRACE_DETACH, <anything>),
<anything> will be ignored, because tracee is not in signal delivery ptrace stop,
so you can do ptrace(PTRACE_DETACH, SIGSTOP), no harm done:
if process was already stopped, it will remain stopped.
If it wasn't stopped, it will continue.


So, the only special trick you seem to want is to make ptrace(PTRACE_DETACH, SIGCONT)
to forcibly unpause stopped task, even if done from non-signal ptrace stop. Right?

I guess this can be special-cased, but can't the same be trivially achieved by
kill(SIGCONT) + ptrace(PTRACE_DETACH, SIGCONT)?
This will avoid the need to special case in the kernel...


-- 
vda

* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-16 21:51                                       ` Jan Kratochvil
  2011-02-17  3:37                                         ` Denys Vlasenko
@ 2011-02-17 16:49                                         ` Oleg Nesterov
  2011-02-17 18:58                                           ` Roland McGrath
  2011-02-18 21:34                                           ` Jan Kratochvil
  1 sibling, 2 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-17 16:49 UTC (permalink / raw)
  To: Jan Kratochvil
  Cc: Denys Vlasenko, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On 02/16, Jan Kratochvil wrote:
>
> On Mon, 14 Feb 2011 18:20:52 +0100, Denys Vlasenko wrote:
> > Jan, please put on your gdb maintainer's hat, we need your opinion here.
> > Is it a problem from gdb's POV?
>
> Here is a summary of current and my wished behavior:
>
> Make PTRACE_DETACH (data=SIGSTOP) working

(OK, but afaics this is a bit off-topic ;).

> - that is to leave the process in
> `T (stopped)' without any single PC step.

This is not exactly clear to me... I mean "without any single PC step".
Why?

> This works in some kernels and
> does not work in other kernels,

Afaics, this only works in utrace-based kernels.

In upstream kernel, we have the extra wake_up_state() in ptrace_detach().
And,

> it is "detach-stopped" test in:

But there is another problem which can't be really tested by detach-stopped
(because it detaches when the tracee was already stopped). The
SIGNAL_STOP_DEQUEUED logic is not correct.

> The current upstream GDB trick of
>   PTRACE_ATTACH
>   if /proc/PID/status->State: == `T (stopped)'
>     tgkill(SIGSTOP)
>     PTRACE_CONT(0)
>   waitpid->SIGSTOP (or preceded by some other signal but 1x SIGSTOP will come)
> should remain compatible,

Oh. OK. It should be at first glance, despite the fact PTRACE_CONT()
doesn't actually resume. But then we need this patch ;)

> Make the GDB trick above no longer needed,

It is still needed. Again, this patch should make this trick unnecessary.

(To clarify, Tejun's patches fix this problem too, but we are trying to
 discuss another behaviour).

> so that in the case it was invented
> for a simple PTRACE_ATTACH, wait->SIGSTOP, PTRACE_DETACH(0) also works:
>   foreign process: kill(child process, SIGSTOP)
>   parent process:  wait() -> SIGSTOP (the notification is now eaten-out)
>   child process is now in `T (stopped)'
>   debugger: PTRACE_ATTACH(child process)
>   debugger: waitpid -> should get SIGSTOP, even despite it was eaten-out above
> This works in some kernels and does not work in other kernels.

Yes, but in fact this is another problem, it was fixed by 90bc8d8b
"do_wait: fix waiting for the group stop with the dead leader".



> A new proposal is to preserve the process's `T (stopped)' for
> a naive/legacy debugger / ptrace tool doing PTRACE_ATTACH, wait->SIGSTOP,
> PTRACE_DETACH(0), incl. GDB doing the "GDB trick" above.
> That is after PTRACE_DETACH(0) the process should remain `T (stopped)'
> iff the process was `T (stopped)' before PTRACE_ATTACH.
>  - PTRACE_DETACH(0)       should preserve `T (stopped)'.

Hmm. OK, but I assume you meant "unless the tracee was resumed in between".

> but also:
>  - PTRACE_DETACH(SIGSTOP) should force `T (stopped)'.
>  - PTRACE_DETACH(SIGCONT) should force freely running process.

OK... Yes, perhaps PTRACE_{DETACH,CONT}(SIGCONT) should override
SIGNAL_STOP_STOPPED too. This makes sense, and this connects to
the problem with SIGNAL_STOP_DEQUEUED I mentioned above.

But. Let me remind. PTRACE_DETACH(SIGXXX) does not always work as
gdb thinks, SIGXXX can be ignored. For example, PTRACE_KILL-after-
step-into-handler gdb bug. But this is another story.

> The behavior of SIGSTOP and SIGCONT received during active ptrace session
> I find as a new feature without having much to keep backward compatibility.
> +
> You have concluded a plan how to do a real `T (stopped)' on received SIGSTOP
> using PTRACE_GETSIGINFO, OK, go with that.

Well, not exactly. Please forget about PTRACE_GETSIGINFO.

Suppose that the tracee is 'T (stopped)'. Because the debugger did
PTRACE_CONT(SIGSTOP), or because debugger attached to the stopped task.

Currently, PTRACE_CONT(WHATEVER) after that always resumes the tracee,
despite the fact it is still stopped in some sense. This leads to
numerous oddities/bugs.

What we propose is to change this so that the tracee does not run
until it actually receives SIGCONT. Yes, _perhaps_ PTRACE_CONT(SIGCONT)
should be treated specially, but I think this is relatively minor issue.

> Personally I would keep it completely hidden from the debugger and only
> remember the last SIGCONT vs. SIGSTOP for the case the session ends with
> PTRACE_DETACH(0).  Debugger/strace would not be able to display any externally
> received SIGSTOP/SIGCONT.  PTRACE_CONT(SIGSTOP) and PTRACE_CONT(SIGCONT)
> should behave as PTRACE_CONT(0) to clean up compatibility with existing tools.

Can't understand... could you explain?

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-17 16:49                                         ` Oleg Nesterov
@ 2011-02-17 18:58                                           ` Roland McGrath
  2011-02-17 19:33                                             ` Oleg Nesterov
  2011-02-18 21:34                                           ` Jan Kratochvil
  1 sibling, 1 reply; 160+ messages in thread
From: Roland McGrath @ 2011-02-17 18:58 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jan Kratochvil, Denys Vlasenko, Tejun Heo, linux-kernel, torvalds, akpm

> OK... Yes, perhaps PTRACE_{DETACH,CONT}(SIGCONT) should override
> SIGNAL_STOP_STOPPED too. This makes sense, and this connects to
> the problem with SIGNAL_STOP_DEQUEUED I mentioned above.

It's not at all clear this really does make sense.  I think this may
reflect a (common) misunderstanding of what the SIGCONT semantics are
(aside from ptrace).  Resuming the process is not the action of
delivering SIGCONT.  Rather, *generating* SIGCONT is what resumes the
process--it does so immediately, and regardless of whether SIGCONT is
blocked or ignored.  (It's also at generation-time that its other
magical semantics apply, of clearing all pending stop signals.)  
By the time SIGCONT is actually delivered, it is no different than
SIGWINCH.  (Conversely, with stop signals, generation time is when
the magic semantics of clearing pending SIGCONT occur, but the actual
stopping is indeed the delivery-time action.)

In ptrace, the report of a signal to the tracer is that it is about to
be delivered, not that it was just generated.  e.g., it won't be
reported immediately if it was blocked, only when it's unblocked and
being delivered.  Likewise, a signal given to PTRACE_CONT et al is not
generating a signal, it is continuing the delivery path of a signal.
So IMHO what makes most sense given what all the normal semantics are
is that PTRACE_CONT,SIGCONT does nothing magical, and generating a
fresh SIGCONT (i.e. kill) is the way you resume from job control stop.
If ptrace is involved, that will mean waking up long enough to dequeue
the SIGCONT and get a new ptrace stop for that.  (Things are a bit
more subtle if there are multiple threads.)
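
A minimal userspace sketch of the generation-time semantics described
above, assuming Linux/POSIX job control with no ptrace involved.  The
function name is made up for illustration.  The child ignores SIGCONT,
so delivering it can do nothing, yet kill() alone is enough to resume
the child:

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that ignores SIGCONT and then stops itself.  The parent
 * generates SIGCONT with kill(); once resumed, the child exits with 42.
 * Returns the child's exit status, or -1 on an unexpected wait result.
 */
int sigcont_resumes_ignoring_child(void)
{
	int status;
	pid_t pid = fork();

	assert(pid >= 0);
	if (pid == 0) {
		signal(SIGCONT, SIG_IGN);	/* delivery of SIGCONT is a no-op */
		raise(SIGSTOP);			/* job control stop */
		_exit(42);			/* reached only after being resumed */
	}

	/* Wait until the child has really stopped. */
	if (waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status))
		return -1;

	kill(pid, SIGCONT);			/* generation, not delivery, resumes it */

	/* Without WUNTRACED/WCONTINUED this returns only on termination. */
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

If the stop were lifted only by delivering SIGCONT, the child would stay
stopped and the final waitpid() would never return.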


Thanks,
Roland


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-17  3:37                                         ` Denys Vlasenko
@ 2011-02-17 19:19                                           ` Oleg Nesterov
  2011-02-18 21:11                                             ` Jan Kratochvil
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-17 19:19 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Jan Kratochvil, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On 02/17, Denys Vlasenko wrote:
>
> Hi Jan,
>
> Thanks for joining!

Yep ;)

> On Wednesday 16 February 2011 22:51, Jan Kratochvil wrote:
>
> > The current upstream GDB trick of
> >   PTRACE_ATTACH
> >   if /proc/PID/status->State: == `T (stopped)'
> >     tgkill(SIGSTOP)
> >     PTRACE_CONT(0)
> >   waitpid->SIGSTOP (or preceded by some other signal but 1x SIGSTOP will come)
>
> I don't fully understand the steps of the trick.

Please see my reply to Jan. In short: ->exit_code can be zero after
attach. This is an unlikely case after 90bc8d8b, but still possible.

> I believe there is a proposal to add PTRACE_ATTACH_NOSTOP+PTRACE_INTERRUPT,

(This is more or less orthogonal to the discussed problem, I think
 we should discuss this separately).

> > That is after PTRACE_DETACH(0) the process should remain `T (stopped)'
> > iff the process was `T (stopped)' before PTRACE_ATTACH.
> >  - PTRACE_DETACH(0)       should preserve `T (stopped)'.
>
> I assume you are thinking about PTRACE_ATTACH + wait():SIGSTOP
> + PTRACE_DETACH(0) sequence.

plus it should be stopped before attach, I assume. Otherwise this is
not true with the current code.

> It looks logical to use 0 in *this* sequence, but consider
> the following sequence:
>
> ....
> ptrace(PTRACE_CONT, 0)
> waitpid(): got SIGFOO
> ptrace(PTRACE_CONT/SYSCALL/DETACH, 0)

Agreed, and this matches (I hope) my reply.

> Regarding PTRACE_ATTACH + wait():SIGSTOP + PTRACE_DETACH(???)
> sequence. I think this SIGSTOP should be considered by kernel
> not to be a signal delivery ptrace stop - because this SIGSTOP
> is "artificial".

Well, this is another case when we have what we have. Unfortunately
it is not artificial. We can add the new PTRACE_ATTACH_* requests
(and we are going to do this), but I don't think we can change this.

> Actually, if you do waitpid():SIGTRAP + ptrace(PTRACE_DETACH, <anything>),
> <anything> will be ignored, because tracee is not in signal delivery ptrace stop,

Well, to clarify, this depends on SIGTRAP above. But in general yes,
<anything> can be ignored. In particular, it _is_ ignored when the
tracee is stopped (T (stopped)). I do not know whether this was intended or not,
but this is how the code currently works.

Perhaps ptrace interface should give more respect to SIGXXX argument,
I have no idea. But let me repeat, imho this is another issue.

> So, the only special trick you seem to want is to make ptrace(PTRACE_DETACH, SIGCONT)
> to forcibly unpause stopped task, even if done from non-signal ptrace stop. Right?
>
> I guess this can be special-cased, but can't the same be trivially achieved by
> kill(SIGCONT) + ptrace(PTRACE_DETACH, SIGCONT)?

Hmm. Agreed. Contrary to what I said in the previous email.

> This will avoid the need to special case in the kernel...

And note that this currently doesn't work anyway.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-17 18:58                                           ` Roland McGrath
@ 2011-02-17 19:33                                             ` Oleg Nesterov
  0 siblings, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-17 19:33 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Jan Kratochvil, Denys Vlasenko, Tejun Heo, linux-kernel, torvalds, akpm

On 02/17, Roland McGrath wrote:
>
> > OK... Yes, perhaps PTRACE_{DETACH,CONT}(SIGCONT) should override
> > SIGNAL_STOP_STOPPED too. This makes sense, and this connects to
> > the problem with SIGNAL_STOP_DEQUEUED I mentioned above.
>
> It's not at all clear this really does make sense.

Yes, I already changed my mind, see another email from me ;)

> I think this may
> reflect a (common) misunderstanding of what the SIGCONT semantics are
> (aside from ptrace).  Resuming the process is not the action of
> delivering SIGCONT.

Yes. I was confused by "for a naive/legacy debugger"; somehow I missed
that this doesn't work with the current code anyway, so there is no need
to add this feature.

> So IMHO what makes most sense given what all the normal semantics are
> is that PTRACE_CONT,SIGCONT does nothing magical, and generating a
> fresh SIGCONT (i.e. kill) is the way you resume from job control stop.

Exactly.

All other things we discussed are "small details". This is the most
noticeable change.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-15 20:27                                                     ` Oleg Nesterov
@ 2011-02-18 17:02                                                       ` Tejun Heo
  2011-02-18 19:37                                                         ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-18 17:02 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, Denys Vlasenko, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hello, Oleg.

Still trying to follow the new discussion.

On Tue, Feb 15, 2011 at 09:27:47PM +0100, Oleg Nesterov wrote:
> > The reason for the transition to TASK_TRACED is to prevent a race with
> > SIGCONT waking the task.  There is always a race with SIGKILL waking it,
> > but the circumstances where that can really matter are far fewer.
> > You need to make sure that the work PTRACE_GETSIGINFO does to access
> > last_siginfo cannot race with that pointer disappearing or the stack
> > space it points to becoming invalid.  I think the use of siglock ensures
> > that, but Oleg should verify it.
> 
> Yes, I think this is safe.
> 
> I do not really like this idea because it looks a bit strange to treat
> PTRACE_GETSIGINFO specially, and this doesn't solve all problems. And,
> once again, I still hope we can change ptrace_resume() so that it doesn't
> wakeup the stopped (I mean, SIGNAL_STOP_STOPPED) tracee, in this case this
> hack is not needed.
> 
> And. We are going to add the new requests which don't need the stopped
> tracee anyway. So we can just add PTRACE_HAS_SIGINFO which returns
> child->last_siginfo != NULL. This looks simpler, and this is compatible.
> Of course this check is racy, but this doesn't matter. PTRACE_GETSIGINFO
> is equally racy if it doesn't change the state to TASK_TRACED.

This is probably where we disagree the most but I think the weird part
isn't making PTRACE_GETSIGINFO exempt from TASK_TRACED transition.  The
weirdness starts when the tracee is put into TASK_STOPPED while being
ptraced.  I think such dual modes of operation inherently lead to
strange problems.

Instead of having a simple "a ptracee stops in TASK_TRACED and its
execution is under the control of ptrace", we end up with the tracee
stopping here or there depending on why it stops and the involved
behavioral subtleties like consumption of wait state and the mentioned
GETSIGINFO problem.  The patch which puts the tracee into TASK_TRACED
on ATTACH already fixes two problems discussed in this thread without
doing anything wonky.  I think it says a lot.

As it currently stands, SIGSTOP/CONT while ptraced doesn't work and
even if we bend the rules subtly and provide sneaky ways like the
above, userland needs to be modified to make use of it anyway.  I
think it would be far cleaner to simply make ptracee always stop in
TASK_TRACED and give the ptracer a way to notice what's happening to
the tracee w.r.t. group stop, so that it can comply if it wants to.
What do you think?

Thank you.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-18 17:02                                                       ` Tejun Heo
@ 2011-02-18 19:37                                                         ` Oleg Nesterov
  2011-02-21 16:22                                                           ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-18 19:37 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Roland McGrath, Denys Vlasenko, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hello Tejun,

On 02/18, Tejun Heo wrote:
> Hello, Oleg.
>
> Still trying to follow the new discussion.

And how is it going?

As for me, I am not sure I can follow it ;)

> On Tue, Feb 15, 2011 at 09:27:47PM +0100, Oleg Nesterov wrote:
> > > The reason for the transition to TASK_TRACED is to prevent a race with
> > > SIGCONT waking the task.  There is always a race with SIGKILL waking it,
> > > but the circumstances where that can really matter are far fewer.
> > > You need to make sure that the work PTRACE_GETSIGINFO does to access
> > > last_siginfo cannot race with that pointer disappearing or the stack
> > > space it points to becoming invalid.  I think the use of siglock ensures
> > > that, but Oleg should verify it.
> >
> > Yes, I think this is safe.
> >
> > I do not really like this idea because it looks a bit strange to treat
> > PTRACE_GETSIGINFO specially, and this doesn't solve all problems. And,
> > once again, I still hope we can change ptrace_resume() so that it doesn't
> > wakeup the stopped (I mean, SIGNAL_STOP_STOPPED) tracee, in this case this
> > hack is not needed.
> >
> > And. We are going to add the new requests which don't need the stopped
> > tracee anyway. So we can just add PTRACE_HAS_SIGINFO which returns
> > child->last_siginfo != NULL. This looks simpler, and this is compatible.
> > Of course this check is racy, but this doesn't matter. PTRACE_GETSIGINFO
> > is equally racy if it doesn't change the state to TASK_TRACED.
>
> This is probably where we disagree the most but I think the weird part
> isn't making PTRACE_GETSIGINFO exempt from TASK_TRACED transition.  The
> weirdness starts when the tracee is put into TASK_STOPPED while being
> ptraced.  I think such dual modes of operation inherently lead to
> strange problems.
>
> Instead of having a simple "a ptracee stops in TASK_TRACED and its
> execution is under the control of ptrace",

In fact, I am not sure I really disagree with this part, but see below.

> The patch which puts the tracee into TASK_TRACED
> on ATTACH already fixes two problems discussed in this thread without
> doing anything wonky.  I think it says a lot.

Yes. One off-topic note... if we are talking about this patch only,
I do not think it makes sense to add the new member into task_struct
so that STOPPED/TRACED transition can always report the exactly correct
->exit_code. I think we can just use group_exit_code ?: SIGSTOP.
But, again, this is off-topic.


> As it currently stands, SIGSTOP/CONT while ptraced doesn't work

And this is probably where we disagree the most. I think this is a bug,
and this should be fixed.

> and
> even if we bend the rules subtly and provide sneaky ways like the
> above, userland needs to be modified to make use of it anyway.

Yes. But with the current code we can't modify, say, strace so
that SIGSTOP/CONT can work "correctly".

> I
> think it would be far cleaner to simply make ptracee always stop in
> TASK_TRACED and give the ptracer a way to notice what's happening to
> the tracee

Well. If we accept the proposed PTRACE_CONT-needs-SIGCONT behaviour,
then I think this probably makes sense. The tracee stops under ptrace,
the possible SIGCONT shouldn't abuse debugger which wants to know, say,
the state of registers.

To be honest, I don't understand whether I changed my mind now, or
I was never against this particular change in behaviour.

Once debugger does PTRACE_CONT, the tracee becomes TASK_STOPPED and
now it is "visible" to SIGCONT (or the tracee resumes if SIGCONT has
come in between).

But I think you will equally blame this TRACED/STOPPED transition
as "behavioral subtleties" and I can understand you even if I disagree.
And yes, this leads to other questions. But note that this greatly
simplifies things. The tracee can never participate in the same
group-stop twice.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-17 19:19                                           ` Oleg Nesterov
@ 2011-02-18 21:11                                             ` Jan Kratochvil
  2011-02-19 20:16                                               ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Jan Kratochvil @ 2011-02-18 21:11 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Denys Vlasenko, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On Thu, 17 Feb 2011 20:19:52 +0100, Oleg Nesterov wrote:
> > > That is after PTRACE_DETACH(0) the process should remain `T (stopped)'
> > > iff the process was `T (stopped)' before PTRACE_ATTACH.
> > >  - PTRACE_DETACH(0)       should preserve `T (stopped)'.
> >
> > I assume you are thinking about PTRACE_ATTACH + wait():SIGSTOP
> > + PTRACE_DETACH(0) sequence.
> 
> plus it should be stopped before attach, I assume. Otherwise this is
> not true with the current code.

I did not talk about the current code.  I was making a proposal of new
behavior (which should not break existing software).

If PTRACE_ATTACH was done on process with `T (stopped)' then after
PTRACE_DETACH(0) again the process should be `T (stopped)'.


Regards,
Jan


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-17 16:49                                         ` Oleg Nesterov
  2011-02-17 18:58                                           ` Roland McGrath
@ 2011-02-18 21:34                                           ` Jan Kratochvil
  2011-02-19 20:06                                             ` Oleg Nesterov
  1 sibling, 1 reply; 160+ messages in thread
From: Jan Kratochvil @ 2011-02-18 21:34 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Denys Vlasenko, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On Thu, 17 Feb 2011 17:49:06 +0100, Oleg Nesterov wrote:
> > - that is to leave the process in
> > `T (stopped)' without any single PC step.
> 
> This is not exactly clear to me... I mean "without any single PC step".
> Why?

Engineers investigating problems of applications SIGSTOP it when it is in the
critical situation.  Then they run gcore, gstack etc.  After they are
satisfied with the analysis they send SIGCONT.

If the application being investigated changes state between the various tools
it may be confusing as the dumps will not match.  Also in some cases some
critical state being investigated may get lost.
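
A rough sketch of that workflow, assuming Linux with /proc mounted;
proc_state() and inspect_stopped_child() are names made up for this
example.  The two reads of /proc/PID/status stand in for two separate
tools looking at the same stopped process:

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Return the letter from the "State:" line of /proc/PID/status, or '?'. */
char proc_state(pid_t pid)
{
	char path[64], line[256], state = '?';
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%d/status", (int)pid);
	f = fopen(path, "r");
	if (!f)
		return '?';
	while (fgets(line, sizeof(line), f)) {
		/* The line looks like "State:\tT (stopped)". */
		if (sscanf(line, "State: %c", &state) == 1)
			break;
	}
	fclose(f);
	return state;
}

/*
 * SIGSTOP a child, sample its state twice, then SIGCONT it and clean up.
 * Returns 0 if both samples were 'T', -1 otherwise.
 */
int inspect_stopped_child(void)
{
	int status, ok = 0;
	pid_t pid = fork();

	assert(pid >= 0);
	if (pid == 0) {
		for (;;)
			pause();		/* idle until killed */
	}

	kill(pid, SIGSTOP);
	if (waitpid(pid, &status, WUNTRACED) == pid && WIFSTOPPED(status)) {
		char first = proc_state(pid);	/* e.g. the gcore run */
		char second = proc_state(pid);	/* e.g. the gstack run */

		ok = (first == 'T' && second == 'T');
		kill(pid, SIGCONT);		/* the engineer is done */
	}

	kill(pid, SIGKILL);			/* test cleanup only */
	waitpid(pid, &status, 0);
	return ok ? 0 : -1;
}
```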


> > A new proposal is to preserve the process's `T (stopped)' for
> > a naive/legacy debugger / ptrace tool doing PTRACE_ATTACH, wait->SIGSTOP,
> > PTRACE_DETACH(0), incl. GDB doing the "GDB trick" above.
> > That is after PTRACE_DETACH(0) the process should remain `T (stopped)'
> > iff the process was `T (stopped)' before PTRACE_ATTACH.
> >  - PTRACE_DETACH(0)       should preserve `T (stopped)'.
> 
> Hmm. OK, but I assume you meant "unless the tracee was resumed in between".

You described the exact behavior of current Fedora/RHEL gdb.  But in general
I do not insist on it, one can for example run an inferior function call
during the investigation-under-SIGSTOP described above, even in such case one
still wants to detach the application still in the `T (stopped)' mode.

Detaching process as '(T) stopped' is not such a problem as the app/user can
send SIGCONT to it.  But accidentally unstopping the process during detach
cannot be fixed or worked around.


> But. Let me remind. PTRACE_DETACH(SIGXXX) does not always work as
> gdb thinks, SIGXXX can be ignored.

In such case it is a bug.  Due to this bug there is probably the
tgkill(SIGSTOP)+PTRACE_DETACH(0) used by the "detach-stopped-rhel5"
ptrace-testsuite testfile, IIUC.


> > Personally I would keep it completely hidden from the debugger and only
> > remember the last SIGCONT vs. SIGSTOP for the case the session ends with
> > PTRACE_DETACH(0).  Debugger/strace would not be able to display any externally
> > received SIGSTOP/SIGCONT.  PTRACE_CONT(SIGSTOP) and PTRACE_CONT(SIGCONT)
> > should behave as PTRACE_CONT(0) to clean up compatibility with existing tools.
> 
> Can't understand... could you explain?

A process is not in the `T (stopped)' state randomly.  AFAIK it is there due
to an engineer sending it SIGSTOP.  Applications themselves do not use SIGSTOP
to get into `T (stopped)' during their execution.

And if the engineer sent SIGSTOP it was intentional.  The engineer does not
want some tool to accidentally cancel his intentional SIGSTOP.  When the
engineer decides so (s)he can send SIGCONT appropriately.

SIGSTOP I find as a hard stop and thus even the tracers/debuggers of
the `T (stopped)' process should just get no response from it.  I do not think
ptrace is a good tool for some general system monitoring - to see any
SIGCONT/SIGSTOP deliveries - because ptrace is (a) single-master limited
(second PTRACE_ATTACH gets EPERM) and (b) ptrace-control is not transparent
due to the threads/races timing (on `t (tracing stop)').  For global system
tracing incl. the SIGCONT/SIGSTOP deliveries, fully transparent tools like
systemtap are more suitable.

Therefore if the debugger sends some SIGSTOP/SIGCONT those should be rather
ignored for compatibility reasons as they may be either just bogus or used as
workarounds (such as in the FSF GDB PTRACE_ATTACH-SIGSTOP-trick) of ptrace
bugs which should no longer be needed.



Thanks,
Jan


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-18 21:34                                           ` Jan Kratochvil
@ 2011-02-19 20:06                                             ` Oleg Nesterov
  2011-02-20  9:40                                               ` Jan Kratochvil
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-19 20:06 UTC (permalink / raw)
  To: Jan Kratochvil
  Cc: Denys Vlasenko, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On 02/18, Jan Kratochvil wrote:
>
> On Thu, 17 Feb 2011 17:49:06 +0100, Oleg Nesterov wrote:
> > > - that is to leave the process in
> > > `T (stopped)' without any single PC step.
> >
> > This is not exactly clear to me... I mean "without any single PC step".
> > Why?
>
> Engineers investigating problems of applications SIGSTOP it when it is in the
> critical situation.  Then they run gcore, gstack etc.  After they are
> satisfied with the analysis they send SIGCONT.
>
> If the application being investigated changes state between the various tools
> it may be confusing as the dumps will not match.  Also in some cases some
> critical state being investigated may get lost.

Which state can be changed?

Of course, the tracee shouldn't return to the user-space before the
stop, it shouldn't change its registers or anything which can be
noticed by gcore/gstack/etc. But why it can't do any single PC step
in kernel?

> > > A new proposal is to preserve the process's `T (stopped)' for
> > > a naive/legacy debugger / ptrace tool doing PTRACE_ATTACH, wait->SIGSTOP,
> > > PTRACE_DETACH(0), incl. GDB doing the "GDB trick" above.
> > > That is after PTRACE_DETACH(0) the process should remain `T (stopped)'
> > > iff the process was `T (stopped)' before PTRACE_ATTACH.
> > >  - PTRACE_DETACH(0)       should preserve `T (stopped)'.
> >
> > Hmm. OK, but I assume you meant "unless the tracee was resumed in between".
>
> You described the exact behavior of current Fedora/RHEL gdb.  But in general
> I do not insist on it, one can for example run an inferior function call
> during the investigation-under-SIGSTOP described above, even in such case one
> still wants to detach the application still in the `T (stopped)' mode.
>
> Detaching process as '(T) stopped' is not such a problem as the app/user can
> send SIGCONT to it.  But accidentally unstopping the process during detach
> cannot be fixed or worked around.

Now I am confused. I do not really understand what you want, but
I feel that what you are trying to suggest is not very right (or I
misunderstood you).

But, once again, I think this should be discussed in another thread.
Yes, we have multiple "it-doesnt-stop-after-detach" problems, but we
can't discuss all problems in this thread.

> > But. Let me remind. PTRACE_DETACH(SIGXXX) does not always work as
> > gdb thinks, SIGXXX can be ignored.
>
> In such case it is a bug.

Well, maybe this is a bug, maybe not. This is a fact ;) Perhaps we should
change this but, again, we need another thread for discussion.

And, btw, we already discussed this. I explained many times that DETACH/CONT
do not always respect SIGXXX argument, you never said this should be fixed.
In particular, I already mentioned in this thread that this argument has no
effect after the jctl stop. I guess this only proves we should discuss this
separately to avoid the confusion ;)

> Due to this bug there is probably the
> tgkill(SIGSTOP)+PTRACE_DETACH(0) used by the "detach-stopped-rhel5"
> ptrace-testsuite testfile, IIUC.

No. I already tried to explain the problems with detach-stopped in my
previous email:

	> This works in some kernels and
	> does not work in other kernels,

	Afaics, this only works in utrace-based kernels.

	In upstream kernel, we have the extra wake_up_state() in ptrace_detach().
	And,

	> it is "detach-stopped" test in:

	But there is another problem which can't be really tested by detach-stopped
	(because it detaches when the tracee was already stopped). The
	SIGNAL_STOP_DEQUEUED logic is not correct.

In short: the wrong wakeup + the broken SIGNAL_STOP_DEQUEUED logic.
detach-stopped-rhel5 does tkill(SIGSTOP), this fixes the latter. But
it can fail anyway afaics, just the probability is low.

> A process is not in the `T (stopped)' state randomly.  AFAIK it is there due
> to an engineer sending it SIGSTOP.  Applications themselves do not use SIGSTOP
> to get into `T (stopped)' during their execution.

Well, they do ;) but I think this doesn't matter.

> And if the engineer sent SIGSTOP it was intentional.  The engineer does not
> want some tool to accidentally cancel his intentional SIGSTOP.  When the
> engineer decides so (s)he can send SIGCONT appropriately.
>
> SIGSTOP I find as a hard stop and thus even the tracers/debuggers of
> the `T (stopped)' process should just get no response from it.

If I understand you correctly, then I agree very much here, and this was
our point.

But I am afraid I could misunderstand, please see below.

> I do not think
> ptrace is a good tool for some general system monitoring - to see any
> SIGCONT/SIGSTOP deliveries - because ptrace is (a) single-master limited
> (second PTRACE_ATTACH gets EPERM)

This is what I certainly can't understand,

> and (b) ptrace-control is not transparent
> due to the threads/races timing (on `t (tracing stop)').

We are going to try to fix these races,

> Therefore if the debugger sends some SIGSTOP/SIGCONT those should be rather
> ignored for compatibility reasons

Well, I don't think so. In particular they shouldn't be ignored for
compatibility reasons.


Jan. Could you please explicitly answer our question? We have numerous
problems with jctl and ptrace. Everything is just broken. And it is broken
by design, that is why it is not easy to fix the code: we should first
discuss what we want to get as a result. Please forget about attach/detach
for the moment. I'll repeat my question:

	Suppose that the tracee is 'T (stopped)'. Because the debugger did
	PTRACE_CONT(SIGSTOP), or because debugger attached to the stopped task.

	Currently, PTRACE_CONT(WHATEVER) after that always resumes the tracee,
	despite the fact it is still stopped in some sense. This leads to
	numerous oddities/bugs.

	What we propose is to change this so that the tracee does not run
	until it actually receives SIGCONT.

Is it OK for gdb or not?

IOW. To simplify. Suppose we have a task in 'T (stopped)' state. Then
debugger comes and does

	ptrace(PTRACE_ATTACH);
	ptrace(PTRACE_CONT, 0);

With the current code the tracee runs after that. We want to change
the kernel so that the tracee won't run, but becomes 'T (stopped)'
again. It only runs when it gets SIGCONT.

Do you agree with such a change?


And yes, yes,

	ptrace(PTRACE_ATTACH);
	ptrace(PTRACE_DETACH, 0);

should leave it stopped too, of course.
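
The attach + PTRACE_CONT(0) scenario can be set up from userspace
roughly as follows.  This is only a sketch: the helper names are made
up, and a waitpid() is added between the two requests because
PTRACE_CONT needs the tracee to be in a ptrace stop.  The state letter
it returns is simply whatever the running kernel does, so it is not
meant to demonstrate either the current or the proposed behaviour.  On
kernels affected by the do_wait() problem in the subject line, the
second waitpid() may not return.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Return the state field of /proc/PID/stat (right after "(comm)"), or '?'. */
static char stat_state(pid_t pid)
{
	char path[64], buf[512];
	char *p;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);
	f = fopen(path, "r");
	if (!f)
		return '?';
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return '?';
	}
	fclose(f);

	p = strrchr(buf, ')');		/* comm may itself contain ')' */
	if (!p || p[1] != ' ' || p[2] == '\0')
		return '?';
	return p[2];
}

/*
 * Stop a child, attach to it, PTRACE_CONT(0) it, and report its state
 * letter a moment later.  Returns '?' if any step failed.
 */
char state_after_attach_cont(void)
{
	int status;
	char state = '?';
	pid_t pid = fork();

	assert(pid >= 0);
	if (pid == 0) {
		for (;;)
			pause();	/* idle until killed */
	}

	kill(pid, SIGSTOP);		/* put it into 'T (stopped)' first */
	if (waitpid(pid, &status, WUNTRACED) == pid && WIFSTOPPED(status) &&
	    ptrace(PTRACE_ATTACH, pid, NULL, NULL) == 0 &&
	    waitpid(pid, &status, 0) == pid && WIFSTOPPED(status) &&
	    ptrace(PTRACE_CONT, pid, NULL, NULL) == 0) {
		usleep(100 * 1000);	/* give the tracee a moment to settle */
		state = stat_state(pid);
	}

	kill(pid, SIGKILL);		/* cleanup; also ends the trace */
	waitpid(pid, &status, 0);
	return state;
}
```

Printing the returned letter before and after a kernel change is one way
to see which of the behaviours discussed here a given kernel implements.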

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-18 21:11                                             ` Jan Kratochvil
@ 2011-02-19 20:16                                               ` Oleg Nesterov
  0 siblings, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-19 20:16 UTC (permalink / raw)
  To: Jan Kratochvil
  Cc: Denys Vlasenko, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On 02/18, Jan Kratochvil wrote:
>
> On Thu, 17 Feb 2011 20:19:52 +0100, Oleg Nesterov wrote:
> > > > That is after PTRACE_DETACH(0) the process should remain `T (stopped)'
> > > > iff the process was `T (stopped)' before PTRACE_ATTACH.
> > > >  - PTRACE_DETACH(0)       should preserve `T (stopped)'.
> > >
> > > I assume you are thinking about PTRACE_ATTACH + wait():SIGSTOP
> > > + PTRACE_DETACH(0) sequence.
> >
> > plus it should be stopped before attach, I assume. Otherwise this is
> > not true with the current code.
>
> I did not talk about the current code.  I was making a proposal of new
> behavior (which should not break existing software).

Confused.

> If PTRACE_ATTACH was done on process with `T (stopped)'

this matters "it should be stopped before attach"

> then after
> PTRACE_DETACH(0) again the process should be `T (stopped)'.

Regardless of what the debugger did in between? This can't be right.
I'd say, it doesn't make sense to take the state of the tracee before
PTRACE_ATTACH into account. What does matter, is its state before
PTRACE_DETACH.

If the debugger did not resume the tracee before PTRACE_DETACH, then
of course I agree, PTRACE_DETACH(0) should preserve T (stopped).

But again, lets discuss this separately.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-19 20:06                                             ` Oleg Nesterov
@ 2011-02-20  9:40                                               ` Jan Kratochvil
  2011-02-20 17:06                                                 ` Denys Vlasenko
  2011-02-20 17:16                                                 ` Oleg Nesterov
  0 siblings, 2 replies; 160+ messages in thread
From: Jan Kratochvil @ 2011-02-20  9:40 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Denys Vlasenko, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On Sat, 19 Feb 2011 21:06:37 +0100, Oleg Nesterov wrote:
> On 02/18, Jan Kratochvil wrote:
> > If the application being investigated changes state between the various tools
> > it may be confusing as the dumps will not match.  Also in some cases some
> > critical state being investigated may get lost.
> 
> Which state can be changed?
> 
> Of course, the tracee shouldn't return to the user-space before the
> stop, it shouldn't change its registers or anything which can be
> noticed by gcore/gstack/etc.

Yes, I meant this.

> But why it can't do any single PC step in kernel?

I do not know about the kernel internals.


> > I do not think
> > ptrace is a good tool for some general system monitoring - to see any
> > SIGCONT/SIGSTOP deliveries - because ptrace is (a) single-master limited
> > (second PTRACE_ATTACH gets EPERM)
> 
> This is what I certainly can't understand,
> 
> > and (b) ptrace-control is not transparent
> > due to the threads/races timing (on `t (tracing stop)').
> 
> We are going to try to fix these races,

No, even if ptrace were perfect, with the current design of ptrace it will
still affect the timing of its debuggee.

From practice - GDB (which is single-threaded) is debugging multi-threaded
application and there are some bugs in GDB regarding proper handling of the
multi-threading events.  If you: strace gdb multithreadedapp -ex 'run'
then the bugs are no longer visible as strace-ing gdb will slow it down
compared to multithreadedapp so that the bugs in GDB are no longer
reproducible.

In such cases I used debug printf()s in GDB before without strace, nowadays
I can use systemtap instead of strace and the bug can be still reproducible.

This is why I believe when we design ptrace we should target more the
intrusive debugging tools than non-intrusive tracing tools as nowadays there
are better non-intrusive tracing ways than ptrace.


> Jan. Could you please explicitly answer our question? We have numerous
> problems with jctl and ptrace. Everything is just broken. And it is broken
> by design, that is why it is not easy to fix the code: we should first
> discuss what do we want to get in result. Please forget about attach/detach
> for the moment. I'll repeat my question:
> 
> 	Suppose that the tracee is 'T (stopped)'. Because the debugger did
> 	PTRACE_CONT(SIGSTOP), or because debugger attached to the stopped task.
> 
> 	Currently, PTRACE_CONT(WHATEVER) after that always resumes the tracee,
> 	despite the fact it is still stopped in some sense. This leads to
> 	numerous oddities/bugs.
> 
> 	What we propose is to change this so that the tracee does not run
> 	until it actually receives SIGCONT.
> 
> Is it OK for gdb or not?

From GDB I do not see a problem.  It will change interactive behavior when
SIGSTOP is received; we can put a warning there when GDB does
PTRACE_CONT(SIGSTOP) so that the user knows (s)he should do an external SIGCONT
and that the debugging is not broken.

When we talk about FSF GDB there isn't too much SIGSTOP-related code present.
There is the PTRACE_ATTACH-trick referenced before
	http://sourceware.org/cgi-bin/cvsweb.cgi/src/gdb/linux-nat.c.diff?r1=1.80&r2=1.81&cvsroot=src

There is also heavily used tkill(SIGSTOP), wait->SIGSTOP, PTRACE_CONT(0)
instead of some PTRACE_INTERRUPT (to stop each thread without affecting its
later run in any way).  This should not be changed.

With Fedora/RHEL GDB there were always hacks matching its specific Fedora/RHEL
kernel version which is offtopic here.

In GDB you can control as a user the way you want to continue the process:
	(gdb) signal SIGCONT
	Continuing with signal SIGCONT.
	(gdb) help signal
	Continue program giving it signal specified by the argument.
	An argument of "0" means continue program without giving it a signal.

Sure by default GDB does not do anything special, it will respawn (using
PTRACE_CONT(SIGSTOP)) any SIGSTOP it sees due to the default setting of:
	(gdb) handle SIGSTOP
	Signal        Stop      Print   Pass to program Description
	SIGSTOP       Yes       Yes     Yes             Stopped (signal)

This is what causes the double SIGSTOP reporting discussed before:
	(gdb) run
	Starting program: /bin/sleep 1h
	# external kill -STOP <inferior pid>
	Program received signal SIGSTOP, Stopped (signal).
	# State:	t (tracing stop)
	(gdb) continue 
	Continuing.
	Program received signal SIGSTOP, Stopped (signal).
	# State:	t (tracing stop)
	(gdb) continue 
	Continuing.
	# State:	S (sleeping)

Your proposal, I expect, is:
	(gdb) run
	Starting program: /bin/sleep 1h
	# external kill -STOP <inferior pid>
	Program received signal SIGSTOP, Stopped (signal).
	# State:	t (tracing stop)
	(gdb) continue 
	Continuing.
	# State:	T (stopped)
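
The `# State:' annotations in these transcripts are the State field of
/proc/<pid>/status.  A minimal C sketch for reading it, so that the two
behaviours can be compared from a test program; the helper name
read_proc_state() and its "pid 0 means the caller" convention are illustrative
assumptions, not part of GDB or the kernel:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return the one-letter task state from /proc/<pid>/status
 * ('R', 'S', 'T', 't', ...), or '?' if it cannot be read.
 * pid == 0 means the calling process (a convention of this sketch only).
 */
char read_proc_state(pid_t pid)
{
    char path[64];
    char line[256];
    char state = '?';
    FILE *f;

    assert(pid >= 0);
    if (pid == 0)
        pid = getpid();

    snprintf(path, sizeof(path), "/proc/%ld/status", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return '?';

    while (fgets(line, sizeof(line), f) != NULL) {
        /* The line of interest looks like "State:\tT (stopped)". */
        if (sscanf(line, "State: %c", &state) == 1)
            break;
    }
    fclose(f);
    return state;
}
```

After the last `continue' above, such a helper would be expected to return 'S'
with the current behaviour and 'T' with the proposed one.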

For non-interactive gstack (backtrace) / gcore (core dumping) GDB does not do
any PTRACE_CONT so it is offtopic here.

Upstream GDB always does PTRACE_DETACH(0).  Unexpected detach behavior would
be unwise but we do not discuss the PTRACE_DETACH here.


> IOW. To simplify. Suppose we have a task in 'T (stopped)' state. Then
> debugger comes and does
> 
> 	ptrace(PTRACE_ATTACH);
> 	PTRACE(PTRACE_CONT, 0);
> 
> With the current code the tracee runs after that. We want to change
> the kernel so that the tracee won't run, but becomes 'T (stopped)'
> again. It only runs when it gets SIGCONT.
> 
> Do you agree with such a change?
> 
> 
> And yes, yes,
> 
> 	ptrace(PTRACE_ATTACH);
> 	ptrace(PTRACE_DETACH, 0)
> 
> should leave it stopped too, of course.

Neither GDB nor (I believe) anybody else does PTRACE_ATTACH without
wait->SIGSTOP, otherwise it would leave regular processes `(T) stopped'.  So I
find your question irrelevant.

If you ask about:
	ptrace(PTRACE_ATTACH);
	waitpid; eat SIGSTOP (being aware it may not be the first signal)
	PTRACE(PTRACE_CONT, 0);

Then I believe the debuggee should run (and not be stopped), as one can do:
	# kill -STOP applicationpid
	# gdb -p applicationpid
	(gdb) print getpid()
	(gdb) print show_me_your_internal_debug_dump()
	(gdb) quit
	# expecting applicationpid to still be stopped, which it currently is not.

The inferior calls are implemented as:
	PTRACE_GETREGS - save them somewhere
	PTRACE_SETREGS - change $RIP to show_me_your_internal_debug_dump
	PTRACE_CONT(0)
	waitpid->SIGTRAP - breakpoint show_me_your_internal_debug_dump returned
	PTRACE_SETREGS - restore the initial registers

Your other case:
	ptrace(PTRACE_ATTACH);
	waitpid; eat SIGSTOP (being aware it may not be the first signal)
	ptrace(PTRACE_DETACH, 0)

if the inferior was `(T) stopped' before, currently does not leave the inferior
`(T) stopped'.  It would be good if this changed.

Also, if GDB (or another debugging/tracing tool) crashes, the kernel does an
automatic PTRACE_DETACH.  In such a case, if the inferior was `(T) stopped' it
should still be kept `(T) stopped'.


On Sat, 19 Feb 2011 21:16:03 +0100, Oleg Nesterov wrote:
> On 02/18, Jan Kratochvil wrote:
> >
> > On Thu, 17 Feb 2011 20:19:52 +0100, Oleg Nesterov wrote:
> > > > > That is after PTRACE_DETACH(0) the process should remain `T (stopped)'
> > > > > iff the process was `T (stopped)' before PTRACE_ATTACH.
> > > > >  - PTRACE_DETACH(0)       should preserve `T (stopped)'.
> > > >
> > > > I assume you are thinking about PTRACE_ATTACH + wait():SIGSTOP
> > > > + PTRACE_DETACH(0) sequence.
> > >
> > > plus it should be stopped before attach, I assume. Otherwise this
> > > not true with the current code.
> >
> > I did not talk about the current code.  I was making a proposal of new
> > behavior (which should not break existing software).
> 
> Confused.
> 
> > If PTRACE_ATTACH was done on process with `T (stopped)'
> 
> this matters "it should be stopped before attach"
> 
> > then after
> > PTRACE_DETACH(0) again the process should be `T (stopped)'.
> 
> Regardless of what the debugger did in between?

Yes.

> This can't be right.  I'd say, it doesn't make sense to take the state of
> the tracee before PTRACE_ATTACH into account. What does matter, is its state
> before PTRACE_DETACH.

I do not agree with this point.  Real world debugging programs are buggy in
various ways and if they break you do not want to accidentally resume the `T
(stopped)' inferior under investigation.


> If the debugger did not resume the tracee before PTRACE_DETACH, then
> of course I agree, PTRACE_DETACH(0) should preserve T (stopped).

There are the common inferior calls in use, mostly because the debugger (GDB)
does not (even more before Python scripting was implemented) provide enough
user-providable per-application debugging facilities so they got implemented
into the inferiors themselves and people use GDB inferior calls to call them.


Thanks,
Jan


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-20  9:40                                               ` Jan Kratochvil
@ 2011-02-20 17:06                                                 ` Denys Vlasenko
  2011-02-20 17:48                                                   ` Oleg Nesterov
  2011-02-20 19:10                                                   ` Jan Kratochvil
  2011-02-20 17:16                                                 ` Oleg Nesterov
  1 sibling, 2 replies; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-20 17:06 UTC (permalink / raw)
  To: Jan Kratochvil, torvalds
  Cc: Oleg Nesterov, Tejun Heo, Roland McGrath, linux-kernel, akpm

On Sunday 20 February 2011 10:40, Jan Kratochvil wrote:
> Sure by default GDB does not do anything special, it will respawn (using
> PTRACE_CONT(SIGSTOP)) any SIGSTOP it sees due to the default setting of:
> 	(gdb) handle SIGSTOP
> 	Signal        Stop      Print   Pass to program Description
> 	SIGSTOP       Yes       Yes     Yes             Stopped (signal)
> 
> Therefore there happens the double SIGSTOP reporting as discussed before:
> 	(gdb) run
> 	Starting program: /bin/sleep 1h
> 	# external kill -STOP <inferior pid>
> 	Program received signal SIGSTOP, Stopped (signal).
> 	# State:	t (tracing stop)
> 	(gdb) continue 
> 	Continuing.
> 	Program received signal SIGSTOP, Stopped (signal).
> 	# State:	t (tracing stop)
> 	(gdb) continue 
> 	Continuing.
> 	# State:	S (sleeping)
> 
> Your proposal is I expect:
> 	(gdb) run
> 	Starting program: /bin/sleep 1h
> 	# external kill -STOP <inferior pid>
> 	Program received signal SIGSTOP, Stopped (signal).
> 	# State:	t (tracing stop)
> 	(gdb) continue 
> 	Continuing.
> 	# State:	T (stopped)

Not exactly. Even after we fix kernel so that it properly preserves
group-stop across ptrace-stops, gdb will still see TWO 
waitpid:SIGSTOP events, not one.

First one says "the tracee has received SIGSTOP", and after PTRACE_CONT(SIGSTOP),
second one says "the tracee has stopped because of SIGSTOP".
Currently, neither strace nor gdb understands that second one
is different from first.

Here is how strace can be improved by querying PTRACE_GETSIGINFO:

+                               entered_stopped_state = 0;
+                               if (WSTOPSIG(status) == SIGSTOP ||
+                                   WSTOPSIG(status) == SIGTSTP) {
+                                       /*
+                                        * PTRACE_GETSIGINFO fails if this was
+                                        * genuine *stop* notification,
+                                        * not *signal* notification
+                                        */
+                                       if (ptrace(PTRACE_GETSIGINFO, pid,
+                                                   0, &si) != 0)
+                                               entered_stopped_state = 1;
+                               }
                                printleader(tcp);
-                               tprintf("--- %s (%s) @ %lx (%lx) ---",
+                               tprintf(entered_stopped_state
+                                       ? "--- stopped by %s ---"
+                                       : "--- %s (%s) @ %lx (%lx) ---",
                                        signame(WSTOPSIG(status)),
                                        strsignal(WSTOPSIG(status)), pc, addr);


Before the patch, strace shows a confusing log:

--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---

After the patch, the log is more understandable:

--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
--- stopped by SIGSTOP ---


I think you can use a similar trick in gdb, so that the second message says
"Program stopped due to signal SIGSTOP, Stopped (signal)",
not "Program received signal SIGSTOP, Stopped (signal)".
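
The same check as a standalone C sketch, independent of strace internals.  The
names stop_kind, classify_stop() and probe_stop() are hypothetical; the only
real interfaces used are the waitpid status macros and
ptrace(PTRACE_GETSIGINFO).  Like the patch above, it assumes that
PTRACE_GETSIGINFO fails when the stop is a genuine stop notification rather
than a signal notification:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

enum stop_kind {
    STOP_NONE,    /* status does not describe a stop at all */
    STOP_SIGNAL,  /* "the tracee has received SIG" */
    STOP_GROUP    /* "the tracee has stopped because of SIG" */
};

/*
 * Pure decision step, kept separate so it can be checked without a tracee.
 * getsiginfo_failed is nonzero if PTRACE_GETSIGINFO returned an error.
 * Only SIGSTOP and SIGTSTP are considered, mirroring the patch.
 */
enum stop_kind classify_stop(int status, int getsiginfo_failed)
{
    int sig;

    if (!WIFSTOPPED(status))
        return STOP_NONE;

    sig = WSTOPSIG(status);
    if ((sig == SIGSTOP || sig == SIGTSTP) && getsiginfo_failed)
        return STOP_GROUP;

    return STOP_SIGNAL;
}

/* Query the kernel for a tracee that waitpid() just reported with status. */
enum stop_kind probe_stop(pid_t pid, int status)
{
    siginfo_t si;
    int failed = 0;

    assert(pid > 0);
    if (WIFSTOPPED(status))
        failed = ptrace(PTRACE_GETSIGINFO, pid, NULL, &si) != 0;

    return classify_stop(status, failed);
}
```

A debugger loop would call probe_stop(pid, status) right after waitpid() and
print "stopped due to" for STOP_GROUP and "received" for STOP_SIGNAL.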

-- 
vda


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-20  9:40                                               ` Jan Kratochvil
  2011-02-20 17:06                                                 ` Denys Vlasenko
@ 2011-02-20 17:16                                                 ` Oleg Nesterov
  2011-02-20 18:52                                                   ` Jan Kratochvil
  1 sibling, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-20 17:16 UTC (permalink / raw)
  To: Jan Kratochvil
  Cc: Denys Vlasenko, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On 02/20, Jan Kratochvil wrote:
>
> On Sat, 19 Feb 2011 21:06:37 +0100, Oleg Nesterov wrote:
> > On 02/18, Jan Kratochvil wrote:
> > > If the application being investigated changes state between the various tools
> > > it may be confusing as the dumps will not match.  But in some cases some
> > > critical state being investigated may get lost.
> >
> > Which state can be changed?
> >
> > Of course, the tracee shouldn't return to the user-space before the
> > stop, it shouldn't change its registers or anything which can be
> > noticed by gcore/gstack/etc.
>
> Yes, I meant this.

OK, I misunderstood. Agreed, it shouldn't change the state.

> > > and (b) ptrace-control is not transparent
> > > due to the threads/races timing (on `t (tracing stop)').
> >
> > We are going to try to fix these races,
>
> No, even if ptrace will be perfect with the current design of ptrace it will
> be affecting timing of its debuggee.

Ah, indeed this is true, I misunderstood you again. In particular, yes, strace
shadows the problems quite often.

> This is why I believe when we design ptrace we should target more the
> intrusive debugging tools than non-intrusive tracing tools as nowadays there
> are better non-intrusive tracing ways than ptrace.

Perhaps. (I am not sure systemtap is always the right answer, and utrace
was designed with "non-intrusive tracing" in mind (I think), but this is
off-topic too).

> > Jan. Could you please explicitly answer our question? We have numerous
> > problems with jctl and ptrace. Everything is just broken. And it is broken
> > by design, that is why it is not easy to fix the code: we should first
> > discuss what do we want to get in result. Please forget about attach/detach
> > for the moment. I'll repeat my question:
> >
> > 	Suppose that the tracee is 'T (stopped)'. Because the debugger did
> > 	PTRACE_CONT(SIGSTOP), or because debugger attached to the stopped task.
> >
> > 	Currently, PTRACE_CONT(WHATEVER) after that always resumes the tracee,
> > 	despite the fact it is still stopped in some sense. This leads to
> > 	numerous oddities/bugs.
> >
> > 	What we propose is to change this so that the tracee does not run
> > 	until it actually receives SIGCONT.
> >
> > Is it OK for gdb or not?
>
> From GDB I do not see a problem.  It will change interactive behavior when
> SIGSTOP is received, we can put a warning there when GDB does
> PTRACE_CONT(SIGSTOP) so that the user knows (s)he should do external SIGCONT
> and that the debugging is not broken.

OK, thanks.

> There is also heavily used tkill(SIGSTOP), wait->SIGSTOP, PTRACE_CONT(0)
> instead of some PTRACE_INTERRUPT (to stop each thread without affecting its
> later run in any way).  This should not be changed.

This should work, afaics.

> In GDB you can control as a user the way you want to continue the process:
> 	(gdb) signal SIGCONT
> 	Continuing with signal SIGCONT.

This just sets the argument for PTRACE_CONT, afaics

> Sure by default GDB does not do anything special, it will respawn (using
> PTRACE_CONT(SIGSTOP)) any SIGSTOP it sees due to the default setting of:

OK,

> Therefore there happens the double SIGSTOP reporting as discussed before:
> 	(gdb) run
> 	Starting program: /bin/sleep 1h
> 	# external kill -STOP <inferior pid>
> 	Program received signal SIGSTOP, Stopped (signal).
> 	# State:	t (tracing stop)
> 	(gdb) continue
> 	Continuing.
> 	Program received signal SIGSTOP, Stopped (signal).
> 	# State:	t (tracing stop)
> 	(gdb) continue
> 	Continuing.
> 	# State:	S (sleeping)
>
> Your proposal is I expect:
> 	(gdb) run
> 	Starting program: /bin/sleep 1h
> 	# external kill -STOP <inferior pid>
> 	Program received signal SIGSTOP, Stopped (signal).
> 	# State:	t (tracing stop)
> 	(gdb) continue
> 	Continuing.
> 	# State:	T (stopped)

Yes, I expect the same.

> > IOW. To simplify. Suppose we have a task in 'T (stopped)' state. Then
> > debugger comes and does
> >
> > 	ptrace(PTRACE_ATTACH);
> > 	PTRACE(PTRACE_CONT, 0);
> >
> > With the current code the tracee runs after that. We want to change
> > the kernel so that the tracee won't run, but becomes 'T (stopped)'
> > again. It only runs when it gets SIGCONT.
> >
> > Do you agree with such a change?
> >
> >
> > And yes, yes,
> >
> > 	ptrace(PTRACE_ATTACH);
> > 	ptrace(PTRACE_DETACH, 0)
> >
> > should leave it stopped too, of course.
>
> GDB (and I believe nobody else) does PTRACE_ATTACH without wait->SIGSTOP,
> otherwise it would make `(T) stopped' regular processes.  So I find your
> question irrelevant.

Not sure I understand why "without wait->SIGSTOP" matters. The tracee
was stopped before attach. If you mean the bugs like
wait-can-hang-after-attach they should be fixed, but this is another
story.

> If you ask about:
> 	ptrace(PTRACE_ATTACH);
> 	waitpid; eat SIGSTOP (being aware it may not be the first signal)
> 	PTRACE(PTRACE_CONT, 0);
>
> Then I believe the debugee should run (and not to be stopped)

Can't understand... waitpid() does not eat SIGSTOP. It only eats the
notification. PTRACE(PTRACE_CONT, 0) cancels/eats the pending SIGSTOP
which was sent during ATTACH, but only if it was already dequeued
(iow, waitpid() succeeds exactly because of that SIGSTOP sent by
PTRACE_ATTACH).

So. If the tracee was not stopped before PTRACE_ATTACH - it should run.
Otherwise it should wait for SIGCONT to resume.

> as one can do:
> 	# kill -STOP applicationpid
> 	# gdb -p applicationpid
> 	(gdb) print getpid()

This calls the function on behalf of the tracee, yes?

In this case, if we do the proposed change, getpid() won't run until
SIGCONT comes.

> 	(gdb) quit
> 	# expecting applicationpid is still stopped, which currently is not.

I am not sure this expectation is correct, with or without the proposed
change.

The tracee was stopped. gdb makes it running. If gdb wants it stopped,
it should take care during/before the detach. Like it should restore
eip before detach. The kernel can't know what ptracer wants.

> Your other case:
> 	ptrace(PTRACE_ATTACH);
> 	waitpid; eat SIGSTOP (being aware it may not be the first signal)
> 	ptrace(PTRACE_DETACH, 0)
>
> if inferior was `(T) stopped' before currently does not leave inferior
> `(T) stopped'.  It would be good if this changes.

Yes. IOW, in this case attach/detach should not change the state of
the tracee. If it was stopped it should be left stopped, otherwise
it should continue to run.

> > > then after
> > > PTRACE_DETACH(0) again the process should be `T (stopped)'.
> >
> > Regardless of what the debugger did in between?
>
> Yes.
>
> > This can't be right.  I'd say, it doesn't make sense to take the state of
> > the tracee before PTRACE_ATTACH into account. What does matter, is its state
> > before PTRACE_DETACH.
>
> I do not agree with this point.  Real world debugging programs are buggy

Of course. But the kernel can't know how it should "fix" the bugs in the
user-space. The kernel itself has a lot of bugs which we are trying to fix ;)

What if a correct application attaches to the stopped task and wants to
resume it and detach? Why should the kernel remember the state of the tracee
_before_ it was attached and then always restore this state?

But probably (I hope) I misunderstood you.

> > If the debugger did not resume the tracee before PTRACE_DETACH, then
> > of course I agree, PTRACE_DETACH(0) should preserve T (stopped).
>
> There are the common inferior calls in use, mostly because the debugger (GDB)
> does not (even more before Python scripting was implemented) provide enough
> user-providable per-application debugging facilities so they got implemented
> into the inferiors themselves and people use GDB inferior calls to call them.

See above (getpid() example).


Jan, I think this is irrelevant. Once again. We are trying to enforce the
very simple rule: the stopped tracee remains stopped until it gets SIGCONT.
No matter what gdb does (unless of course it sends this signal itself) the
tracee doesn't run (let's ignore SIGKILL to simplify the further discussion).

And it doesn't matter why it is stopped, either because it was already
stopped before attach or it was stopped during the debug session (iow,
gdb acks SIGSTOP received while the task was ptraced).

Now, again. The tracee was 'T (stopped)' before attach. Then gdb comes,
plays with the tracee, and does PTRACE_DETACH. In this case the tracee
should be stopped unless it gets SIGCONT before detach. And if it gets
SIGCONT it is no longer stopped, the kernel shouldn't ignore SIGCONT.

If gdb wants to resume the tracee temporarily (say, call getpid()) it should
send SIGCONT itself.

And in this case, if gdb wants the stopped tracee after detach - it should
take care. It can send another SIGSTOP or do ptrace(PTRACE_DETACH, SIGSTOP).
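
To make the rule concrete, here is a toy C model of the tracee's visible state
under the current and the proposed behaviour.  It is only a sketch of the
semantics discussed in this thread, not kernel code; struct tracee_model,
enum policy and the model_*() functions are invented for illustration, and 'S'
stands for "running or sleeping" as in the transcripts:

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

struct tracee_model {
    bool group_stopped;  /* a stop took effect and no SIGCONT arrived since */
    bool trapped;        /* sitting in a ptrace stop, 't (tracing stop)' */
};

enum policy { POLICY_CURRENT, POLICY_PROPOSED };

/* State letter as /proc would show it in this simplified model. */
char model_state(const struct tracee_model *t)
{
    if (t->trapped)
        return 't';
    return t->group_stopped ? 'T' : 'S';
}

/* Attach, or any later report to the debugger: the tracee enters a trap. */
void model_trap(struct tracee_model *t)
{
    t->trapped = true;
}

/* PTRACE_CONT(sig) or PTRACE_DETACH(sig); both leave the trap. */
void model_resume(struct tracee_model *t, int sig, enum policy p)
{
    assert(t->trapped);
    t->trapped = false;

    if (t->group_stopped) {
        /* Current: any resume lets a stopped tracee run again.
         * Proposed: it stays stopped until SIGCONT. */
        if (p == POLICY_CURRENT)
            t->group_stopped = false;
    } else if (sig == SIGSTOP) {
        /* The debugger passes SIGSTOP on: the stop takes effect. */
        t->group_stopped = true;
    }
}

/* SIGCONT from anyone, including the debugger itself. */
void model_sigcont(struct tracee_model *t)
{
    t->group_stopped = false;
}
```

In this model a debugger that wants to call getpid() in a stopped tracee under
POLICY_PROPOSED does model_sigcont() first, and passes SIGSTOP on the final
model_resume() if it wants the tracee left stopped.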

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-20 17:06                                                 ` Denys Vlasenko
@ 2011-02-20 17:48                                                   ` Oleg Nesterov
  2011-02-20 19:10                                                   ` Jan Kratochvil
  1 sibling, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-20 17:48 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Jan Kratochvil, torvalds, Tejun Heo, Roland McGrath, linux-kernel, akpm

On 02/20, Denys Vlasenko wrote:
>
> On Sunday 20 February 2011 10:40, Jan Kratochvil wrote:
> > Sure by default GDB does not do anything special, it will respawn (using
> > PTRACE_CONT(SIGSTOP)) any SIGSTOP it sees due to the default setting of:
> > 	(gdb) handle SIGSTOP
> > 	Signal        Stop      Print   Pass to program Description
> > 	SIGSTOP       Yes       Yes     Yes             Stopped (signal)
> >
> > Therefore there happens the double SIGSTOP reporting as discussed before:
> > 	(gdb) run
> > 	Starting program: /bin/sleep 1h
> > 	# external kill -STOP <inferior pid>
> > 	Program received signal SIGSTOP, Stopped (signal).
> > 	# State:	t (tracing stop)
> > 	(gdb) continue
> > 	Continuing.
> > 	Program received signal SIGSTOP, Stopped (signal).
> > 	# State:	t (tracing stop)
> > 	(gdb) continue
> > 	Continuing.
> > 	# State:	S (sleeping)
> >
> > Your proposal is I expect:
> > 	(gdb) run
> > 	Starting program: /bin/sleep 1h
> > 	# external kill -STOP <inferior pid>
> > 	Program received signal SIGSTOP, Stopped (signal).
> > 	# State:	t (tracing stop)
> > 	(gdb) continue
> > 	Continuing.
> > 	# State:	T (stopped)
>
> Not exactly. Even after we fix kernel so that it properly preserves
> group-stop across ptrace-stops, gdb will still see TWO
> waitpid:SIGSTOP events, not one.

Yes, I didn't notice the second report doesn't show SIGSTOP twice.
The only important change is

	(gdb) continue
	Continuing.
-	# State:        S (sleeping)
+	# State:        T (stopped)


> I think you can use similar trick in gdb, so that second message says
> "Program stopped due to signal SIGSTOP, Stopped (signal)",
> not "Program received signal SIGSTOP, Stopped (signal)".

Agreed.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-20 17:16                                                 ` Oleg Nesterov
@ 2011-02-20 18:52                                                   ` Jan Kratochvil
  2011-02-20 20:38                                                     ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Jan Kratochvil @ 2011-02-20 18:52 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Denys Vlasenko, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On Sun, 20 Feb 2011 18:16:58 +0100, Oleg Nesterov wrote:
> On 02/20, Jan Kratochvil wrote:
> > In GDB you can control as a user the way you want to continue the process:
> > 	(gdb) signal SIGCONT
> > 	Continuing with signal SIGCONT.
> 
> This just sets the argument for PTRACE_CONT, afaics

Yes.


> > > IOW. To simplify. Suppose we have a task in 'T (stopped)' state. Then
> > > debugger comes and does
> > >
> > > 	ptrace(PTRACE_ATTACH);
> > > 	PTRACE(PTRACE_CONT, 0);
> > >
> > > With the current code the tracee runs after that. We want to change
> > > the kernel so that the tracee won't run, but becomes 'T (stopped)'
> > > again. It only runs when it gets SIGCONT.
> > >
> > > Do you agree with such a change?
> > >
> > >
> > > And yes, yes,
> > >
> > > 	ptrace(PTRACE_ATTACH);
> > > 	ptrace(PTRACE_DETACH, 0)
> > >
> > > should leave it stopped too, of course.
> >
> > GDB (and I believe nobody else) does PTRACE_ATTACH without wait->SIGSTOP,
> > otherwise it would make `(T) stopped' regular processes.  So I find your
> > question irrelevant.
> 
> Not sure I understand why "without wait->SIGSTOP" matters. The tracee
> was stopped before attach. If you mean the bugs like
> wait-can-hang-after-attach they should be fixed, but this is another
> story.

If I do (on kernel-debug-2.6.35.11-83.fc14.x86_64)
	ptrace (PTRACE_ATTACH);
	sleep (1);
	ptrace (PTRACE_DETACH, 0);

even without the wait() it really has no effect.  I thought it would behave like:
	ptrace (PTRACE_ATTACH);
	exit (0);

which will make the debuggee `T (stopped)'.


> > as one can do:
> > 	# kill -STOP applicationpid
> > 	# gdb -p applicationpid
> > 	(gdb) print getpid()
> 
> This calls the function on behalf of the tracee, yes?

Yes.


> In this case, if we do the proposed change, getpid() won't run until
> SIGCONT comes.

And this is exactly the kind of incompatibility issue you were asking about.

OTOH FSF GDB can attach to `T (stopped)' processes only since 2008 (released
2009) and I have never seen non-commercial users have any issues with
`T (stopped)' processes, so SIGSTOP backward compatibility for FSF GDB may
not matter much.


> > 	(gdb) quit
> > 	# expecting applicationpid is still stopped, which currently is not.
> 
> I am not sure this expectation is correct, with or without the proposed
> change.
> 
> The tracee was stopped. gdb makes it running. If gdb wants it stopped,
> it should take care during/before the detach. Like it should restore
> eip before detach. The kernel can't know what ptracer wants.

The ptracer does PTRACE_DETACH(0), which is meant to "restore the original state".
If GDB uses uprobes one day it would also expect removal of breakpoints from
the inferior.  The same situation applies for intentional or unintentional
(crash) _exit() of the debugger.


> > > This can't be right.  I'd say, it doesn't make sense to take the state of
> > > the tracee before PTRACE_ATTACH into account. What does matter, is its state
> > > before PTRACE_DETACH.
> >
> > I do not agree with this point.  Real world debugging programs are buggy
> 
> Of course. But the kernel can't know how it should "fix" the bugs in the
> user-space. The kernel itself has a lot of bugs which we are trying to fix ;)

The same way it automatically frees memory, closes file descriptors etc. it
can also restore the debuggee's state.  Unless the debugger enforces
`T (stopped)' by PTRACE_DETACH (SIGSTOP) or cancels any saved `T (stopped)' by
PTRACE_DETACH (SIGCONT).
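
Written out as a decision function, the alternative suggested here would look
roughly like the C sketch below.  The function names are hypothetical, and
treating any other signal value like 0 is an assumption of this sketch; the
thread does not say what should happen for them:

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/*
 * Should the tracee be `T (stopped)' after PTRACE_DETACH(detach_sig)?
 * stopped_before_attach is the state the kernel would have to remember
 * from the time of PTRACE_ATTACH.
 */
bool stopped_after_detach(bool stopped_before_attach, int detach_sig)
{
    assert(detach_sig >= 0);

    if (detach_sig == SIGSTOP)
        return true;    /* debugger enforces the stop */
    if (detach_sig == SIGCONT)
        return false;   /* debugger cancels any saved stop */

    /* PTRACE_DETACH(0): restore whatever was there before attach. */
    return stopped_before_attach;
}

/* A crashing or exiting debugger is treated like PTRACE_DETACH(0). */
bool stopped_after_debugger_exit(bool stopped_before_attach)
{
    return stopped_after_detach(stopped_before_attach, 0);
}
```

The point of contention is the last branch: it depends on state from before
the attach rather than on the state at detach time.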


> What if a correct application attaches to the stopped task and wants to
> resume it and detach?

PTRACE_DETACH (SIGCONT)


> Why the kernel should remember the state of the tracee
> _before_ it was attached and then always restore this state?

As it does for other resources (memory/fds).  One day it should do it even
with hardware watchpoints (hw_breakpoint) and software breakpoints (uprobes).


> Jan, I think this is irrelevant. Once again. We are trying to enforce the
> very simple rule: the stopped tracee remains stopped until it gets SIGCONT.

I do not mind but you asked about compatibility with GDB and this breaks a use
case of post-2008 GDB.

I think such incompatibility is acceptable as there was the eaten-out SIGSTOP
notification so `T (stopped)' processes could not be debugged for many years
by GDB before 2008 anyway.


> Now, again. The tracee was 'T (stopped)' before attach. Then gdb comes,
> plays with the tracee, and does PTRACE_DETACH. In this case the tracee
> should be stopped unless it gets SIGCONT before detach.

Here is a mix of two issues:

I have shown you why it should be `(T) stopped' (on PTRACE_DETACH(0)) even if
GDB did whatever while the inferior is under debug.  This is a new feature and
not a compatibility issue.

Another issue is that inferior function calls will not work for
`(T) stopped'-on-attach processes.  This is a backward-incompatible change but
IMO acceptable, as this case did not work before the 2008 (2009 release) FSF GDB.


> If gdb wants to resume the tracee temporarily (say, call getpid()) it should
> send SIGCONT itself.

This is a new feature.  GDB can do whatever as a new feature.  The problem is
existing GDBs do not do it.  And for some reason (probably as each vendor has
GDB patched a lot) there remain very old GDBs in use out there.


> And in this case, if gdb wants the stopped tracee after detach - it should
> take care. It can send another SIGSTOP or do ptrace(PTRACE_DETACH, SIGSTOP).

GDB can crash (yes, it happens), then it will accidentally resume the
inferior.  There also exist other tools (proprietary TotalView debugger etc.)
which I cannot speak for, if they can already attach to `(T) stopped'
processes (and would get broken by the new behavior in such case).



Thanks,
Jan


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-20 17:06                                                 ` Denys Vlasenko
  2011-02-20 17:48                                                   ` Oleg Nesterov
@ 2011-02-20 19:10                                                   ` Jan Kratochvil
  2011-02-20 19:16                                                     ` Oleg Nesterov
  1 sibling, 1 reply; 160+ messages in thread
From: Jan Kratochvil @ 2011-02-20 19:10 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: torvalds, Oleg Nesterov, Tejun Heo, Roland McGrath, linux-kernel, akpm

On Sun, 20 Feb 2011 18:06:30 +0100, Denys Vlasenko wrote:
> On Sunday 20 February 2011 10:40, Jan Kratochvil wrote:
> > Sure by default GDB does not do anything special, it will respawn (using
> > PTRACE_CONT(SIGSTOP)) any SIGSTOP it sees due to the default setting of:
> > 	(gdb) handle SIGSTOP
> > 	Signal        Stop      Print   Pass to program Description
> > 	SIGSTOP       Yes       Yes     Yes             Stopped (signal)
> > 
> > Therefore there happens the double SIGSTOP reporting as discussed before:
> > 	(gdb) run
> > 	Starting program: /bin/sleep 1h
> > 	# external kill -STOP <inferior pid>
> > 	Program received signal SIGSTOP, Stopped (signal).
> > 	# State:	t (tracing stop)
> > 	(gdb) continue 
> > 	Continuing.
> > 	Program received signal SIGSTOP, Stopped (signal).
> > 	# State:	t (tracing stop)
> > 	(gdb) continue 
> > 	Continuing.
> > 	# State:	S (sleeping)
> > 
> > Your proposal is I expect:
> > 	(gdb) run
> > 	Starting program: /bin/sleep 1h
> > 	# external kill -STOP <inferior pid>
> > 	Program received signal SIGSTOP, Stopped (signal).
> > 	# State:	t (tracing stop)
> > 	(gdb) continue 
> > 	Continuing.
> > 	# State:	T (stopped)
> 
> Not exactly. Even after we fix kernel so that it properly preserves
> group-stop across ptrace-stops, gdb will still see TWO 
> waitpid:SIGSTOP events, not one.
> 
> First one says "the tracee has received SIGSTOP", and after PTRACE_CONT(SIGSTOP),
> second one says "the tracee has stopped because of SIGSTOP".
> Currently, neither strace nor gdb understands that second one
> is different from first.

I thought the kernel change being discussed was that the second SIGSTOP would
be received by the process's original parent, so that CTRL-Z on a debugged
process from the shell works the normal way, as without a debugger.


Thanks,
Jan


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-20 19:10                                                   ` Jan Kratochvil
@ 2011-02-20 19:16                                                     ` Oleg Nesterov
  0 siblings, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-20 19:16 UTC (permalink / raw)
  To: Jan Kratochvil
  Cc: Denys Vlasenko, torvalds, Tejun Heo, Roland McGrath, linux-kernel, akpm

On 02/20, Jan Kratochvil wrote:
>
> On Sun, 20 Feb 2011 18:06:30 +0100, Denys Vlasenko wrote:
>
> > Not exactly. Even after we fix kernel so that it properly preserves
> > group-stop across ptrace-stops, gdb will still see TWO
> > waitpid:SIGSTOP events, not one.
> >
> > First one says "the tracee has received SIGSTOP", and after PTRACE_CONT(SIGSTOP),
> > second one says "the tracee has stopped because of SIGSTOP".
> > Currently, neither strace nor gdb understands that second one
> > is different from first.
>
> I thought the kernel change being discussed was that the second SIGSTOP
> notification would be delivered to the process's original parent,

No, no, the debugger should see this notification whatever we do.

> so that CTRL-Z on a debugged process
> from shell works the normal way, as without a debugger.

Yes, this is yet another problem (didn't I say we have a lot of them? ;),
CTRL-Z should work and thus we have to notify the original parent as well.
And even this is not enough, we should change do_wait(WSTOPPED).
But this is another story, gdb shouldn't be affected by these changes.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-20 18:52                                                   ` Jan Kratochvil
@ 2011-02-20 20:38                                                     ` Oleg Nesterov
  2011-02-20 21:06                                                       ` `(T) stopped' preservation after _exit() [Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH] Jan Kratochvil
  2011-02-20 21:20                                                       ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Jan Kratochvil
  0 siblings, 2 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-20 20:38 UTC (permalink / raw)
  To: Jan Kratochvil
  Cc: Denys Vlasenko, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On 02/20, Jan Kratochvil wrote:
>
> On Sun, 20 Feb 2011 18:16:58 +0100, Oleg Nesterov wrote:
> > > > IOW. To simplify. Suppose we have a task in 'T (stopped)' state. Then
> > > > debugger comes and does
> > > >
> > > > 	ptrace(PTRACE_ATTACH);
> > > > 	ptrace(PTRACE_CONT, 0);
> > > >
> > > > With the current code the tracee runs after that. We want to change
> > > > the kernel so that the tracee won't run, but becomes 'T (stopped)'
> > > > again. It only runs when it gets SIGCONT.
> > > >
> > > > Do you agree with such a change?
> > > >
> > > >
> > > > And yes, yes,
> > > >
> > > > 	ptrace(PTRACE_ATTACH);
> > > > 	ptrace(PTRACE_DETACH, 0)
> > > >
> > > > should leave it stopped too, of course.
> > >
> > > GDB (and I believe nobody else) does PTRACE_ATTACH without wait->SIGSTOP,
> > > otherwise it would make `(T) stopped' regular processes.  So I find your
> > > question irrelevant.
> >
> > Not sure I understand why "without wait->SIGSTOP" matters. The tracee
> > was stopped before attach. If you mean the bugs like
> > wait-can-hang-after-attach they should be fixed, but this is another
> > story.
>
> If I do (on kernel-debug-2.6.35.11-83.fc14.x86_64)
> 	ptrace (PTRACE_ATTACH);
> 	sleep (1);
> 	ptrace (PTRACE_DETACH, 0);
>
> even without the wait() it really has no effect.

Well, what does this "has no effect" mean? ;) I am totally confused.
We were talking about the case when the tracee was stopped before
attach, right?

(In this case I don't understand sleep() above, but this doesn't matter)

In this case it should be stopped after detach, do you agree?

And it will be stopped with or without the proposed change. And it will
be stopped with the current kernel.

So, I still can't understand...

If you meant the failing detach-stopped test-case, then please note
that it differs, mostly because it is multithreaded. And it fails
because (again! ;) we have more problems here. I already mentioned
them: the wrong wakeup and the broken SIGNAL_STOP_DEQUEUED logic.

> > In this case, if we do the proposed change, getpid() won't run until
> > SIGCONT comes.
>
> And this is such an incompatibility issue you were asking about.

Sure, I understand. Like in your previous example,

	> >     # State:        t (tracing stop)
	> >     (gdb) continue
	> >     Continuing.
	> >     # State:        T (stopped)

this is the same user-visible change.
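
The "State:" letters in these transcripts come from /proc/<pid>/status. As a
rough illustration (a sketch, not code from this thread), the field could be
read as below; proc_state_from_text() and proc_state() are hypothetical helper
names, and the only assumption about the file layout is a "State:" line
followed by whitespace and a one-letter code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the one-letter state code from the text of /proc/<pid>/status,
 * e.g. 'T' for "State:\tT (stopped)" or 't' for "State:\tt (tracing stop)".
 * Returns '\0' if no State: line is found.
 */
char proc_state_from_text(const char *status_text)
{
	const char *line = status_text;

	assert(status_text != NULL);
	while (line != NULL && *line != '\0') {
		if (strncmp(line, "State:", 6) == 0) {
			const char *p = line + 6;

			/* skip the separator between "State:" and the letter */
			while (*p == ' ' || *p == '\t')
				p++;
			return (*p == '\n') ? '\0' : *p;
		}
		/* advance to the start of the next line, if any */
		line = strchr(line, '\n');
		if (line != NULL)
			line++;
	}
	return '\0';
}

/* Read /proc/<pid>/status and return its state letter, or '\0' on error. */
char proc_state(long pid)
{
	char path[64], buf[4096];
	size_t n;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%ld/status", pid);
	f = fopen(path, "r");
	if (f == NULL)
		return '\0';
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return proc_state_from_text(buf);
}
```

With the proposed change, a test that today expects anything other than 'T'
after "(gdb) continue" on a group-stopped inferior would see 'T' instead.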

That is why we are asking you, we need your opinion: whether this
change is acceptable or not for gdb.

> > The tracee was stopped. gdb makes it running. If gdb wants it stopped,
> > it should take care during/before the detach. Like it should restore
> > eip before detach. The kernel can't know what ptracer wants.
>
> The ptracer does PTRACE_DETACH(0), therefore to "restore the original state".
> If GDB uses uprobes one day it would also expect removal of breakpoints from
> the inferior.  The same situation applies for intentional or unintentional
> (crash) _exit() of the debugger.

This doesn't explain why the kernel should restore TASK_STOPPED.
Suppose the tracee creates the file when it was run under debugger,
should the kernel remove this file after detach? STOPPED->RUNNING
transition is the same, I think. And, if SIGCONT comes in between,
the tracee is no longer stopped. It needs another SIGSTOP, gdb can
do this via ptrace(DETACH, SIGSTOP).

> > > > This can't be right.  I'd say, it doesn't make sense to take the state of
> > > > the tracee before PTRACE_ATTACH into account. What does matter, is its state
> > > > before PTRACE_DETACH.
> > >
> > > I do not agree with this point.  Real world debugging programs are buggy
> >
> > Of course. But the kernel can't know how it should "fix" the bugs in the
> > user-space. The kernel itself has a lot of bugs which we are trying to fix ;)
>
> The same way it automatically frees memory, closes file descriptors etc. it
> can also restore the debugee's state.

Then why does gdb have to restore $RIP after '(gdb) print getpid()' + detach?
Following your logic, this should be handled by the kernel as well.

In any case. This doesn't work currently, and it was never supposed to
work. If you think we need this new (and imho very wrong ;) feature -
let's discuss this separately.

> > What if the correct application attaches to the stopped task and wants to
> > resume it and detach?
>
> PTRACE_DETACH (SIGCONT)

This was already discussed and you even managed to confuse me ;) Initially
I was going to agree with this special case, but then I agreed with Denys
and after that Roland nacked this approach.

> > Jan, I think this is irrelevant. Once again. We are trying to enforce the
> > very simple rule: the stopped tracee remains stopped until it gets SIGCONT.
>
> I do not mind but you asked about compatibility with GDB and this breaks a use
> case of post-2008 GDB.

Of course! Of course this is user-visible change, that is why we are asking
you.

The problem is, _any_ fix in this area is user-visible. By definition ;)
Even if we fix the obvious bug (like this damn wrong wakeup) we can
break something.

Let me remind you. ptrace_detach()->wake_up_process() is just wrong. I tried
to remove it, but later this patch was reverted because gdb was not
happy. But if we keep this wakeup, then PTRACE_DETACH can _never_ leave
the tracee in the stopped state reliably, whatever we do. So, what can we
do? We are going to kill it again, even if this can break something else.
All this code is just wrong. We can not magically change it so that
everyone will be happy.

> I think such incompatibility is acceptable as there was the eaten-out SIGSTOP
> notification so `T (stopped)' processes could not be debugged for many years
> by GDB before 2008 anyway.

OK, thanks.

So. So far I assume you are not against this change ;)

> > Now, again. The tracee was 'T (stopped)' before attach. Then gdb comes,
> > plays with the tracee, and does PTRACE_DETACH. In this case the tracee
> > should be stopped unless it gets SIGCONT before detach.
>
> Here is a mix of two issues:
>
> I have shown you why it should be `(T) stopped' (on PTRACE_DETACH(0)) even if
> GDB did whatever while the inferior is under debug.

I disagree ;) but,

> This is a new feature and
> not a compatibility issue.

Yes. So let's discuss this separately as a new-feature request.

> Another issue is debuggee's inferior function calls will not work for
> `(T) stopped'-on-attach processes.  This is backward incompatible change but
> IMO acceptable as such case did not work till 2008 (2009 release) FSF GDB.

OK,

> > If gdb wants to resume the tracee temporarily (say, call getpid()) it should
> > send SIGCONT itself.
>
> This is a new feature.  GDB can do whatever as a new feature.  The problem is
> existing GDBs do not do it.

Yes, yes, sure, I understand. Again, and again, this is the user-visible
change and that is why we need your opinion.

> > And in this case, if gdb wants the stopped tracee after detach - it should
> > take care. It can send another SIGSTOP or do ptrace(PTRACE_DETACH, SIGSTOP).
>
> GDB can crash (yes, it happens), then it will accidentally resume the
> inferior.

See above.

Oleg.



* `(T) stopped' preservation after _exit()  [Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH]
  2011-02-20 20:38                                                     ` Oleg Nesterov
@ 2011-02-20 21:06                                                       ` Jan Kratochvil
  2011-02-20 21:19                                                         ` Oleg Nesterov
  2011-02-20 21:20                                                       ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Jan Kratochvil
  1 sibling, 1 reply; 160+ messages in thread
From: Jan Kratochvil @ 2011-02-20 21:06 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Denys Vlasenko, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On Sun, 20 Feb 2011 21:38:19 +0100, Oleg Nesterov wrote:
> If you think we need this new (and imho very wrong ;) feature -
> lets discuss this separately.


> On 02/20, Jan Kratochvil wrote:
> > On Sun, 20 Feb 2011 18:16:58 +0100, Oleg Nesterov wrote:
> > > The tracee was stopped. gdb makes it running. If gdb wants it stopped,
> > > it should take care during/before the detach. Like it should restore
> > > eip before detach. The kernel can't know what ptracer wants.
> >
> > The ptracer does PTRACE_DETACH(0), therefore to "restore the original state".
> > If GDB uses uprobes one day it would also expect removal of breakpoints from
> > the inferior.  The same situation applies for intentional or unintentional
> > (crash) _exit() of the debugger.
> 
> This doesn't explain why the kernel should restore TASK_STOPPED.
> Suppose the tracee creates the file when it was run under debugger,
> should the kernel remove this file after detach?

No, opening such file is like writing data into a file.


> And, if SIGCONT comes in between, the tracee is no longer stopped. It needs
> another SIGSTOP, gdb can do this via ptrace(DETACH, SIGSTOP).

It cannot if it crashes in between.  It would be OK if it could do so right
after the inferior call.  Which, I realize now, it can: after the inferior call
returns (wait->SIGTRAP) GDB can do PTRACE_CONT(SIGSTOP), wait->SIGSTOP, and then
it can do PTRACE_GETREGS etc.  After _exit() it will then be like after
PTRACE_DETACH(0) and the debuggee still remains `(T) stopped', won't it?


Thanks,
Jan


* Re: `(T) stopped' preservation after _exit()  [Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH]
  2011-02-20 21:06                                                       ` `(T) stopped' preservation after _exit() [Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH] Jan Kratochvil
@ 2011-02-20 21:19                                                         ` Oleg Nesterov
  0 siblings, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-20 21:19 UTC (permalink / raw)
  To: Jan Kratochvil
  Cc: Denys Vlasenko, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On 02/20, Jan Kratochvil wrote:
>
> On Sun, 20 Feb 2011 21:38:19 +0100, Oleg Nesterov wrote:
> >
> > This doesn't explain why the kernel should restore TASK_STOPPED.
> > Suppose the tracee creates the file when it was run under debugger,
> > should the kernel remove this file after detach?
>
> No, opening such file is like writing data into a file.

OK, let it be writing data into a file. But in this case the kernel
shouldn't truncate the file after detach or if the tracer crashes ;)

> > And, if SIGCONT comes in between, the tracee is no longer stopped. It needs
> > another SIGSTOP, gdb can do this via ptrace(DETACH, SIGSTOP).
>
> It cannot if it crashes in between.

Yes,

> It would be OK if it can do so right
> after the inferior call.  Which I realize now it can, after the inferior call
> returns (wait->SIGTRAP) GDB can do PTRACE_CONT(SIGSTOP), wait->SIGSTOP and now
> it can do PTRACE_GETREGS etc. while after _exit() it will be like after
> PTRACE_DETACH(0) and the debuggee still remains `(T) stopped', doesn't it?

Yes. Or gdb can just send SIGSTOP to the tracee.

Modulo other bugs we have, but these bugs should be fixed anyway.

For example, once again, PTRACE_DETACH/PTRACE_CONT can ignore SIGXXX.
I never knew if it was designed this way or this should be fixed, but
there is one particular case which looks like the oversight to me: If
the tracee reports SIGTRAP after it steps into the signal handler, then
SIGXXX is ignored after PTRACE_CONT/DETACH (and this btw affects gdb).

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-20 20:38                                                     ` Oleg Nesterov
  2011-02-20 21:06                                                       ` `(T) stopped' preservation after _exit() [Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH] Jan Kratochvil
@ 2011-02-20 21:20                                                       ` Jan Kratochvil
  2011-02-21 14:23                                                         ` Oleg Nesterov
  1 sibling, 1 reply; 160+ messages in thread
From: Jan Kratochvil @ 2011-02-20 21:20 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Denys Vlasenko, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On Sun, 20 Feb 2011 21:38:19 +0100, Oleg Nesterov wrote:
> On 02/20, Jan Kratochvil wrote:
> > If I do (on kernel-debug-2.6.35.11-83.fc14.x86_64)
> > 	ptrace (PTRACE_ATTACH);
> > 	sleep (1);
> > 	ptrace (PTRACE_DETACH, 0);
> >
> > even without the wait() it really has no effect.
> 
> Well. what does this "has no effect" mean? ;) I am totally confused.
> We were talking about the case when the tracee was stopped before
> attach, right?

No, the case it is not `(T) stopped'.  I was surprised by this ptrace behavior
but it is offtopic and not useful so let's drop it.


> So. So far I assume you are not against this change ;)

No, although you should provide the patch in advance, it would be nice to also
post it first to <gdb@sourceware.org> for comments.

Now, if a new GDB is to allow inferior function calls on a previously
`(T) stopped' process, doing PTRACE_CONT(SIGCONT) to execute the call should
be harmless, but how do we make it `(T) stopped' afterwards?  PTRACE_CONT(SIGSTOP)
right after the inferior call will make old kernels run the inferior - we
do not want that.  GDB can only wait till the end of the debugging session and
do PTRACE_DETACH(SIGSTOP).  But then we are back at the point that if GDB
crashes in between, the inferior will accidentally resume.

(This is the ``(T) stopped' preservation after _exit()' thread along claimed
to be unrelated.)


Thanks,
Jan


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-20 21:20                                                       ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Jan Kratochvil
@ 2011-02-21 14:23                                                         ` Oleg Nesterov
  2011-02-23 16:44                                                           ` Jan Kratochvil
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-21 14:23 UTC (permalink / raw)
  To: Jan Kratochvil
  Cc: Denys Vlasenko, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On 02/20, Jan Kratochvil wrote:
>
> On Sun, 20 Feb 2011 21:38:19 +0100, Oleg Nesterov wrote:
>
> > So. So far I assume you are not against this change ;)
>
> No, although you should provide the patch in advance, it would be nice to also
> post it first to <gdb@sourceware.org> for comments.

OK.

> Now if new GDB should allow inferior functions calls on previously
> `(T) stopped' process doing PTRACE_CONT(SIGCONT)

No, no, this won't work. You need to send SIGCONT via kill/tkill. Once
again, we can add the special case for PTRACE_CONT(SIGCONT), but please
look at Roland's comment: http://marc.info/?l=linux-kernel&m=129796917823181

And given that currently gdb does PTRACE_CONT(0) this special case can't
help anyway unless you change gdb.
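
As a rough illustration of "send SIGCONT via kill/tkill" (a sketch, not code
from gdb or from this thread): glibc provides no tkill() wrapper, so the call
goes through syscall(2). send_sigcont() is a hypothetical helper name. Note
that SIGCONT resumes the whole thread group no matter which thread it is
directed at.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send SIGCONT to one thread with tkill(2), instead of passing SIGCONT as
 * the data argument of PTRACE_CONT.  Returns 0 on success, or -1 with errno
 * set on failure (for example EINVAL for a non-positive tid).
 */
int send_sigcont(pid_t tid)
{
	return (int)syscall(SYS_tkill, tid, SIGCONT);
}
```

A debugger would call this on the tracee before the inferior call, and later
arrange for a fresh SIGSTOP if it wants the tracee stopped again.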

> but how to make it `(T) stopped' afterwards?  PTRACE_CONT(SIGSTOP)
> right after the inferior call will make the old kernels run the inferior - we
> do not want that.

Hmm... probably I am totally confused... but PTRACE_CONT(SIGSTOP)
should work in this case, the tracee reports SIGTRAP after the single-step
(if I understand correctly how gdb implements this).

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14 17:54                                         ` Denys Vlasenko
@ 2011-02-21 15:16                                           ` Tejun Heo
  2011-02-21 15:28                                             ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-21 15:16 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Oleg Nesterov, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hello,

On Mon, Feb 14, 2011 at 06:54:37PM +0100, Denys Vlasenko wrote:
> > Okay, maybe I'm missing something but so once SIGSTOP is determined to
> > be delivered, then the tracee enters group stop and that's the second
> > SIGSTOP notification you get.  At that point, strace should wait for
> > the tracee to be continued by SIGCONT.  That should work, right?
> 
> Do you mean "Will it work on current kernels" or "that's what strace
> has to do and then it is supposed to work correctly, modulo bugs"?

Yes and no, I think it will mostly work on current kernels if we
concentrate only on the actual stopping and continuing part; however,
there still are two obstacles.

1. The distinction between the first SIGSTOP trapping and the second
   can only be reliably done by GETSIGINFO which in turn will put the
   tracee into TASK_TRACED making the tracee ignore the future SIGCONT
   and the tracer has no way to detect reception of it either.  The
   tracer can make the distinction by looking at the sequence of
   events but it wouldn't work for multithreaded cases and right after
   attach.

2. Due to reparenting, wait(2) notifications (including the SIGCLDs)
   don't get to the real parent at all.

#2 just needs fixing.  I don't think there will be a lot of different
opinions on that one; however, #1 is trickier and one of the biggest
reasons why we have this long thread.
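
To make point 1 concrete, here is a sketch of how a PTRACE_ATTACH-style tracer
could tell the two SIGSTOP reports apart. It relies on the behaviour described
in ptrace(2): PTRACE_GETSIGINFO fails with EINVAL when the stop is a group stop
rather than a trap carrying siginfo. classify_stop() and enum stop_kind are
hypothetical names, and the caller is assumed to pass in the return value and
errno of its own ptrace(PTRACE_GETSIGINFO, ...) call. As noted in point 1,
that call itself has the side effect of moving the tracee into TASK_TRACED.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/wait.h>

enum stop_kind {
	STOP_NOT_STOPPED,	/* status is not a stop notification */
	STOP_PTRACE_TRAP,	/* siginfo available: signal delivery or other trap */
	STOP_GROUP_STOP,	/* tracee is participating in a group stop */
	STOP_UNKNOWN,		/* GETSIGINFO failed for another reason, e.g. ESRCH */
};

static int is_stop_signal(int sig)
{
	return sig == SIGSTOP || sig == SIGTSTP ||
	       sig == SIGTTIN || sig == SIGTTOU;
}

/*
 * Classify a waitpid() status using the outcome of a following
 * PTRACE_GETSIGINFO on the same tracee.
 */
enum stop_kind classify_stop(int status, long getsiginfo_ret,
			     int getsiginfo_errno)
{
	assert(getsiginfo_ret == 0 || getsiginfo_ret == -1);

	if (!WIFSTOPPED(status))
		return STOP_NOT_STOPPED;
	if (getsiginfo_ret == 0)
		return STOP_PTRACE_TRAP;
	if (getsiginfo_errno == EINVAL && is_stop_signal(WSTOPSIG(status)))
		return STOP_GROUP_STOP;
	return STOP_UNKNOWN;
}
```

For the first SIGSTOP report the siginfo lookup succeeds; for the second it
fails with EINVAL, which is the only reliable distinction available here.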

> In this particular scenario, first SIGSTOP is ptrace-stop.
> Obviously, we must issue ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP)
> to continue.
> 
> Second SIGSTOP is notification of tracee's group-stop to debugger.

So, at this point, the debugger shouldn't be continuing the tracee by
calling PTRACE_SYSCALL but do something else.  What that should be is
still being discussed.

> The question is, logically, by sending this notification, does tracee,
> or does it not enter into ptrace-stop too? (IOW: is ptrace-stop a separate
> bit in task state, independent of group-stop?)
> If yes, then we need to release tracee from ptrace-stop (but it will remain in
> group-stop) by issuing ptrace(PTRACE_SYSCALL, $PID, 0x1, 0).
> If not, then we must not do so, because the task is not ptrace-stopped,
> and ptrace(PTRACE_SYSCALL, $PID, 0x1, 0) is undefined (I think it should
> error out to indicate that).

That precisely is what is being discussed.  IIUC, Oleg and Roland are
saying that the tracee should enter group stop but not ptrace trap at
that point and then transition into ptrace trap on the first PTRACE
call.  I was agreeing with that at first but changed my mind after
reading these discussions and now I think we should just put it in
ptrace trap and give the debugger a way to notice the end of group
stop.

Thanks.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-21 15:16                                           ` Tejun Heo
@ 2011-02-21 15:28                                             ` Oleg Nesterov
  2011-02-21 16:11                                               ` [pseudo patch] ptrace should respect the group stop Oleg Nesterov
  2011-02-22 16:24                                               ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Tejun Heo
  0 siblings, 2 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-21 15:28 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Denys Vlasenko, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On 02/21, Tejun Heo wrote:
>
> 1. The distinction between the first SIGSTOP trapping and the second
>    can only be reliably done by GETSIGINFO which in turn will put the
>    tracee into TASK_TRACED making the tracee ignore the future SIGCONT

Yes, but please see below.

> 2. Due to reparenting, wait(2) notifications (including the SIGCLDs)
>    don't get to the real parent at all.
>
> #2 just needs fixing.

Yes.

> That precisely is what is being discussed.  IIUC, Oleg and Roland are
> saying that the tracee should enter group stop but not ptrace trap at
> that point and then transition into ptrace trap on the first PTRACE
> call.

Actually I am not saying this (at least now, probably I did).

Once again. We have the bug with arch_ptrace_stop_needed(), but let's
ignore it to simplify the discussion.

Suppose that the tracee calls do_signal_stop() and participates in the
group stop. To me, it doesn't really matter (in the context of this
discussion) if it stops in TASK_STOPPED or TASK_TRACED (and where it
stops).

However, I am starting to agree that TASK_TRACED looks cleaner.

What is important, I think ptrace should respect SIGNAL_STOP_STOPPED.
IOW, when the tracee is group-stopped (TASK_STOPPED or TASK_TRACED,
doesn't matter), ptrace_resume() should not wake it up, but merely
do set_task_state(TASK_STOPPED) and make it resumable by SIGCONT.
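
The rule described above can be modelled in a few lines of userspace C. This
is only a toy state machine, not kernel code: struct model_task, its
group_stopped flag (standing in for SIGNAL_STOP_STOPPED) and both functions
are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum model_state { MODEL_RUNNING, MODEL_STOPPED, MODEL_TRACED };

struct model_task {
	enum model_state state;
	bool group_stopped;	/* stands in for SIGNAL_STOP_STOPPED */
};

/*
 * Proposed rule: resuming a tracee whose group is stopped does not let it
 * run; it is parked in the stopped state until SIGCONT arrives.
 */
void model_ptrace_resume(struct model_task *t)
{
	assert(t->state != MODEL_RUNNING);
	t->state = t->group_stopped ? MODEL_STOPPED : MODEL_RUNNING;
}

/*
 * SIGCONT ends the group stop.  A task parked in MODEL_STOPPED runs again;
 * a task still in MODEL_TRACED stays under tracer control until resumed.
 */
void model_sigcont(struct model_task *t)
{
	t->group_stopped = false;
	if (t->state == MODEL_STOPPED)
		t->state = MODEL_RUNNING;
}
```

In this model, resume followed by SIGCONT and SIGCONT followed by resume both
end with the task running, but resume alone never does.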

Oleg.



* [pseudo patch] ptrace should respect the group stop
  2011-02-21 15:28                                             ` Oleg Nesterov
@ 2011-02-21 16:11                                               ` Oleg Nesterov
  2011-02-22 16:24                                               ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Tejun Heo
  1 sibling, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-21 16:11 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Denys Vlasenko, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On 02/21, Oleg Nesterov wrote:
>
> On 02/21, Tejun Heo wrote:
> >
> > That precisely is what is being discussed.  IIUC, Oleg and Roland are
> > saying that the tracee should enter group stop but not ptrace trap at
> > that point and then transition into ptrace trap on the first PTRACE
> > call.
>
> Actually I am not saying this (at least now, probably I did).
>
> Once again. We have the bug with arch_ptrace_stop_needed(), but lets
> ignore it to simplify the discussion.
>
> Suppose that the tracee calls do_signal_stop() and participates in the
> group stop. To me, it doesn't really matter (in the context of this
> discussion) if it stops in TASK_STOPPED or TASK_TRACED (and where it
> stops).
>
> However, I am starting to agree that TASK_TRACED looks more clean.
>
> What is important, I think ptrace should respect SIGNAL_STOP_STOPPED.
> IOW, when the tracee is group-stopped (TASK_STOPPED or TASK_TRACED,
> doesn't matter), ptrace_resume() should not wake it up, but merely
> do set_task_state(TASK_STOPPED) and make it resumable by SIGCONT.

IOW. I mean something like the (uncompiled, incomplete) patch below as
an initial approximation.

Oleg.

--- x/kernel/ptrace.c
+++ x/kernel/ptrace.c
@@ -38,15 +38,15 @@ void __ptrace_link(struct task_struct *c
 }
 
 /*
- * Turn a tracing stop into a normal stop now, since with no tracer there
- * would be no way to wake it up with SIGCONT or SIGKILL.  If there was a
- * signal sent that would resume the child, but didn't because it was in
- * TASK_TRACED, resume it now.
- * Requires that irqs be disabled.
+ * NEW COMMENT
  */
-static void ptrace_untrace(struct task_struct *child)
+static void ptrace_wake_up(struct task_struct *child)
 {
-	spin_lock(&child->sighand->siglock);
+	unsigned long flags;
+
+	if (!lock_task_sighand(child, &flags))
+		return;
+
 	if (task_is_traced(child)) {
 		/*
 		 * If the group stop is completed or in progress,
@@ -56,9 +56,9 @@ static void ptrace_untrace(struct task_s
 		    child->signal->group_stop_count)
 			__set_task_state(child, TASK_STOPPED);
 		else
-			signal_wake_up(child, 1);
+			wake_up_state(child, TASK_TRACED);
 	}
-	spin_unlock(&child->sighand->siglock);
+	unlock_task_sighand(child, &flags);
 }
 
 /*
@@ -76,7 +76,7 @@ void __ptrace_unlink(struct task_struct 
 	list_del_init(&child->ptrace_entry);
 
 	if (task_is_traced(child))
-		ptrace_untrace(child);
+		ptrace_wake_up(child);
 }
 
 /*
@@ -312,8 +312,6 @@ int ptrace_detach(struct task_struct *ch
 	if (child->ptrace) {
 		child->exit_code = data;
 		dead = __ptrace_detach(current, child);
-		if (!child->exit_state)
-			wake_up_state(child, TASK_TRACED | TASK_STOPPED);
 	}
 	write_unlock_irq(&tasklist_lock);
 
@@ -514,7 +512,7 @@ static int ptrace_resume(struct task_str
 	}
 
 	child->exit_code = data;
-	wake_up_process(child);
+	ptrace_wake_up(child);
 
 	return 0;
 }
--- x/kernel/signal.c
+++ x/kernel/signal.c
@@ -1644,8 +1644,11 @@ static void ptrace_stop(int exit_code, i
 	 * If there is a group stop in progress,
 	 * we must participate in the bookkeeping.
 	 */
-	if (current->signal->group_stop_count > 0)
-		--current->signal->group_stop_count;
+	if (current->signal->group_stop_count) {
+		// XXX: this is not enough, we can race with detach
+		if (!--current->signal->group_stop_count)
+			current->signal->flags = SIGNAL_STOP_STOPPED;
+	}
 
 	current->last_siginfo = info;
 	current->exit_code = exit_code;
@@ -1825,6 +1828,9 @@ static int ptrace_signal(int signr, sigi
 	if (sigismember(&current->blocked, signr)) {
 		specific_send_sig_info(signr, info, current);
 		signr = 0;
+	} else if (sig_kernel_stop(signr)) {
+		// XXX: not exactly right but anyway better
+		current->signal->flags |= SIGNAL_STOP_DEQUEUED;
 	}
 
 	return signr;



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-18 19:37                                                         ` Oleg Nesterov
@ 2011-02-21 16:22                                                           ` Tejun Heo
  2011-02-21 16:49                                                             ` Oleg Nesterov
  2011-02-24 20:29                                                             ` Oleg Nesterov
  0 siblings, 2 replies; 160+ messages in thread
From: Tejun Heo @ 2011-02-21 16:22 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, Denys Vlasenko, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hey, :-)

On Fri, Feb 18, 2011 at 08:37:09PM +0100, Oleg Nesterov wrote:
> > Still trying to follow the new discussion.
> 
> And how does it go?
> 
> As for me, I am not sure I can follow it ;)

The issues Denys brought up are okay but I still haven't gotten my
head wrapped around what Jan and you are talking about.  Urgh... :-)

> > Instead of having simple "a ptracer stops in TASK_TRACED and its
> > execution is under the control of ptrace",
> 
> In fact, I am not sure I really disagree with this part, but see below.
> 
> > The patch which puts the tracee into TASK_TRACED
> > on ATTACH already fixes two problems discussed in this thread without
> > doing anything wonky.  I think it says a lot.
> 
> Yes. One off-topic note... if we are talking about this patch only,
> I do not think it makes sense to add the new member into task_struct
> so that STOPPED/TRACED transition can always report the exactly correct
> ->exit_code. I think we can just use group_exit_code ?: SIGSTOP.
> But, again, this is off-topic.

It shares the task->group_stop which is needed for other things
anyway, but yeah, if we're sure it's either that or SIGSTOP that would
definitely be better.  Hmmm, but it can be other things.  There are
many signals which can trigger group stop.  Maybe this is not
important but then again preserving this doesn't cost us much either.

BTW, I plan on separating out all ptrace related stuff into a separate
struct as it's not used by most tasks anyway, so I don't think we need
to be too concerned about several more fields.

> > As it currently stands, SIGSTOP/CONT while ptraced doesn't work
> 
> And this is probably where we disagree the most. I think this is a bug,
> and this should be fixed.

I don't think we disagree that it is a bug.  I want to fix it too but
we definitely seem to disagree on how.  I want to give more control to
the ptracer so that the tracer has enough information and control to
follow the group stop semantics if it wants to and you want to give
more control to group stop so that it overrides the tracer and always
does the right thing regarding group stop.

> > and even if we bend the rules subtly and provide sneaky ways like
> > the above, userland needs to be modified to make use of it anyway.
> 
> Yes. But with the current code we can't modify, say, strace so
> that SIGSTOP/CONT can work "correctly".

Agreed, not possible.  The kernel needs to be improved one way or the
other.

> > I think it would be far cleaner to simply make ptracee always stop
> > in TASK_TRACED and give the ptracer a way to notice what's
> > happening to the tracee
> 
> Well. If we accept the proposed PTRACE_CONT-needs-SIGCONT behaviour,
> then I think this probably makes sense. The tracee stops under ptrace,
> the possible SIGCONT shouldn't abuse debugger which wants to know, say,
> the state of registers.

The objections I have against PTRACE_CONT-needs-SIGCONT are,

* It will be very different from the current behavior.

* ptrace, sans the odd SIGSTOP on attach which we should remove, is
  per-task.  Sending out SIGCONT on PTRACE_CONT would break that.  I
  really don't think that's a good idea.

* PTRACE_CONT would be behaving completely differently depending on
  whether it's resuming from group stop or other traps.

> To be honest, I don't understand whether I changed my mind now, or
> I was never against this particular change in behaviour.
> 
> Once debugger does PTRACE_CONT, the tracee becomes TASK_STOPPED and
> now it is "visible" to SIGCONT (or the tracee resumes if SIGCONT has
> come in between).
> 
> But I think you will equally blame this TRACED/STOPPED transition
> as "behavioral subtleties" and I can understand you even if I disagree.
> And yes, this leads to other questions. But note that this greatly
> simplifies things. The tracee can never participate in the same
> group-stop twice.

But that's not really because the problem is solved.  The problem is
put out of scope by forcing the tracer to always override group stop.
That's a rather big departure from the current behavior and capability
and, frankly, I don't think it is a good direction to head in.  It's like
giving up useful features for conceptual purity.  We can make it work
without regressing on capabilities.

Thanks.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-21 16:22                                                           ` Tejun Heo
@ 2011-02-21 16:49                                                             ` Oleg Nesterov
  2011-02-21 16:59                                                               ` Tejun Heo
  2011-02-24 20:29                                                             ` Oleg Nesterov
  1 sibling, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-21 16:49 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Roland McGrath, Denys Vlasenko, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hi,

I can't write the full reply right now, just one note...

On 02/21, Tejun Heo wrote:
>
> BTW, I plan on separating out all ptrace related stuff into a separate
> struct as it's not used by most tasks anyway, so I don't think we need
> to be too concerned about several more fields.

This is funny.

I already did this change (it even had some review on lkml). And it was
discussed again (on archer@sourceware.org) a couple of days ago. Yes,
probably we should finally do this. Unfortunately, this complicates the
attach-to-all-threads request (new feature), but anyway.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-21 16:49                                                             ` Oleg Nesterov
@ 2011-02-21 16:59                                                               ` Tejun Heo
  2011-02-23 19:31                                                                 ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-21 16:59 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, Denys Vlasenko, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hey,

On Mon, Feb 21, 2011 at 05:49:08PM +0100, Oleg Nesterov wrote:
> > BTW, I plan on separating out all ptrace related stuff into a separate
> > struct as it's not used by most tasks anyway, so I don't think we need
> > to be too concerned about several more fields.
> 
> This is funny.
> 
> I already did this change (it even had some review on lkml). And it was
> discussed again (on archer@sourceware.org) a couple of days ago. Yes,
> probably we should finally do this. Unfortunately, this complicates the
> attach-to-all-threads request (new feature), but anyway.

Heh, well, I also have a mostly working patch but hey if you already
did it, all the better.  :-)

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-21 15:28                                             ` Oleg Nesterov
  2011-02-21 16:11                                               ` [pseudo patch] ptrace should respect the group stop Oleg Nesterov
@ 2011-02-22 16:24                                               ` Tejun Heo
  2011-02-24 21:08                                                 ` Oleg Nesterov
  1 sibling, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-22 16:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Denys Vlasenko, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

Okay, I think I finally caught up with the discussion (hopefully).

On Mon, Feb 21, 2011 at 04:28:55PM +0100, Oleg Nesterov wrote:
> However, I am starting to agree that TASK_TRACED looks more clean.
> 
> What is important, I think ptrace should respect SIGNAL_STOP_STOPPED.
> IOW, when the tracee is group-stopped (TASK_STOPPED or TASK_TRACED,
> doesn't matter), ptrace_resume() should not wake it up, but merely
> do set_task_state(TASK_STATE) and make it resumeable by SIGCONT.

I don't think that's gonna fly.  It first is a very user-visible
change to how ptrace_resume() works and it removes a lot of debugging
capability.  As Jan's examples showed, there are things which the
debugger does behind group stop's back and some of them are quite
legitimate and useful things to do like running some code in the
tracee context for the tracer and adjusting where the task is stopped.

If you mix ptrace trap and group stop and then fix group stop
notification, not only multithreaded debugging becomes quite
cumbersome (suddenly ptracing becomes per-process thing instead of
per-thread), it becomes almost impossible to debug jctl behaviors.
Jctl becomes completely intertwined with ptracing and the real parent
would get numerous notifications during the course of debugging.

I think they belong to different layers and they should stack instead
of mix.  I'll try to write up a summary for how I think it can be done
later but in short I think we just need two more PTRACE calls (one for
combined SIGSTOPless attach + INTERRUPT and the other for jctl
monitoring) and there doesn't need to be any fundamental revolt in how
ptrace and jctl interact with each other.  The current ptrace behavior
is quirky and rough on the edges but I think the fundamentals are
correct in that it's something which is fundamentally bound to a task
(not task group) and operates below jctl.  We just need to iron out
the interactions so that the outcome makes sense.

That way, most current users won't notice the difference (e.g. entering
TASK_TRACED directly on SIGSTOP doesn't make any difference to strace or
gdb, they already issue PTRACE calls immediately afterwards) while newer
ones can take advantage of new features to show better jctl behavior.

Thanks.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-21 14:23                                                         ` Oleg Nesterov
@ 2011-02-23 16:44                                                           ` Jan Kratochvil
  0 siblings, 0 replies; 160+ messages in thread
From: Jan Kratochvil @ 2011-02-23 16:44 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Denys Vlasenko, Tejun Heo, Roland McGrath, linux-kernel, torvalds, akpm

On Mon, 21 Feb 2011 15:23:25 +0100, Oleg Nesterov wrote:
> On 02/20, Jan Kratochvil wrote:
> > Now if new GDB should allow inferior functions calls on previously
> > `(T) stopped' process doing PTRACE_CONT(SIGCONT)
> 
> No, no, this won't work. You need to send SIGCONT via kill/tkill. Once
> again, we can add the special case for PTRACE_CONT(SIGCONT), but please
> look at Roland's comment: http://marc.info/?l=linux-kernel&m=129796917823181
> 
> And given that currently gdb does PTRACE_CONT(0) this special case can't
> help anyway unless you change gdb.

I would better play with a patched kernel.


> > but how to make it `(T) stopped' afterwards?  PTRACE_CONT(SIGSTOP)
> > right after the inferior call will make the old kernels run the inferior - we
> > do not want that.
> 
> Hmm... probably I am totally confused... but PTRACE_CONT(SIGSTOP)
> should work in this case, the tracee reports SIGTRAP after the single-step
> (if I understand correctly how gdb implements this).

The inferior call returns to a breakpoint (0xcc), this is the reason of the
SIGTRAP at the end.  I expect PTRACE_CONT(SIGSTOP) could work even in such
case.


Thanks,
Jan


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-21 16:59                                                               ` Tejun Heo
@ 2011-02-23 19:31                                                                 ` Oleg Nesterov
  2011-02-25 15:10                                                                   ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-23 19:31 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Roland McGrath, Denys Vlasenko, jan.kratochvil, linux-kernel,
	torvalds, akpm

On 02/21, Tejun Heo wrote:
>
> Hey,
>
> On Mon, Feb 21, 2011 at 05:49:08PM +0100, Oleg Nesterov wrote:
> > > BTW, I plan on separating out all ptrace related stuff into a separate
> > > struct as it's not used by most tasks anyway, so I don't think we need
> > > to be too concerned about several more fields.
> >
> > This is funny.
> >
> > I already did this change (it even had some review on lkml). And it was
> > discussed again (on archer@sourceware.org) a couple of days ago. Yes,
> > probably we should finally do this. Unfortunately, this complicates the
> > attach-to-all-threads request (new feature), but anyway.
>
> Heh, well, I also have a mostly working patch but hey if you already
> did it, all the better.  :-)

Argh, sorry. If you already have the patch - please send it.

Otherwise I'll try to refresh the old series, but not before the next week.


And. As usual, sorry for the delay, I'll try to reply to other emails
tomorrow.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-21 16:22                                                           ` Tejun Heo
  2011-02-21 16:49                                                             ` Oleg Nesterov
@ 2011-02-24 20:29                                                             ` Oleg Nesterov
  2011-02-25 15:51                                                               ` Tejun Heo
  1 sibling, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-24 20:29 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Roland McGrath, Denys Vlasenko, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hi Tejun,

On 02/21, Tejun Heo wrote:
>

Damn. Today is 02/24 ;) sorry.

> On Fri, Feb 18, 2011 at 08:37:09PM +0100, Oleg Nesterov wrote:
>
> > > As it currently stands, SIGSTOP/CONT while ptraced doesn't work
> >
> > And this is probably where we disagree the most. I think this is bug,
> > and this should be fixed.
>
> I don't think we disagree that it is a bug.  I want to fix it too but
> we definitely seem to disagree on how.

Yes, but I also think that the running tracee in the SIGNAL_STOP_STOPPED
process is bug by itself. IIUC, you think this is fine.

> I want to give more control to
> the ptracer so that the tracer has enough information and control to
> follow the group stop semantics if it wants to and you want to give
> more control to group stop so that it overrides the tracer and always
> does the right thing regarding group stop.

Yes, but debugger still has the control. It can nack SIGSTOP, or if
the tracee was already stopped it can send SIGCONT.

> > > I think it would be far cleaner to simply make ptracee always stop
> > > in TASK_TRACED and give the ptracer a way to notice what's
> > > happening to the tracee
> >
> > Well. If we accept the proposed PTRACE_CONT-needs-SIGCONT behaviour,
> > then I think this probably makes sense. The tracee stops under ptrace,
> > the possible SIGCONT shouldn't abuse debugger which wants to know, say,
> > the state of registers.
>
> The objections I have against PTRACE_CONT-needs-SIGCONT are,
>
> * It will be very different from the current behavior.

Unfortunately, you are right. Again, I think the current behaviour
is very wrong, but of course you are right that this behaviour is
very old, and thus perhaps we can't change it whatever I think.

> * ptrace, sans the odd SIGSTOP on attach which we should remove, is
>   per-task.  Sending out SIGCONT on PTRACE_CONT would break that.  I
>   really don't think that's a good idea.

Hmm. But why do you think we should always send SIGCONT after attach?

> * PTRACE_CONT would be behaving completely differently depending on
>   whether it's resuming from group stop or other traps.

Afaics, no. It does not matter from where the tracee resumes. See
the [pseudo patch] I sent. Once again, it doesn't really work, it
only tries to explain what I mean.

> > Once debugger does PTRACE_CONT, the tracee becomes TASK_STOPPED and
> > now it is "visible" to SIGCONT (or the tracee resumes if SIGCONT has
> > come in between).
> >
> > But I think you will equally blame this TRACED/STOPPED transition
> > as "behavioral subtleties" and I can understand you even if I disagree.
> > And yes, this leads to other questions. But note that this greatly
> > simplifies things. The tracee can never participate in the same
> > group-stop twice.
>
> But that's not really because the problem is solved.  The problem is
> put out of scope by forcing the tracer to always override group stop.

Hmm, can't understand... But probably I should just reply to the next
email from you.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-22 16:24                                               ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Tejun Heo
@ 2011-02-24 21:08                                                 ` Oleg Nesterov
  2011-02-25 15:45                                                   ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-24 21:08 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Denys Vlasenko, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On 02/22, Tejun Heo wrote:
>
> Okay, I think I finally caught up with the discussion (hopefully).
>
> On Mon, Feb 21, 2011 at 04:28:55PM +0100, Oleg Nesterov wrote:
> > However, I am starting to agree that TASK_TRACED looks more clean.
> >
> > What is important, I think ptrace should respect SIGNAL_STOP_STOPPED.
> > IOW, when the tracee is group-stopped (TASK_STOPPED or TASK_TRACED,
> > doesn't matter), ptrace_resume() should not wake it up, but merely
> > do set_task_state(TASK_STATE) and make it resumeable by SIGCONT.
>
> I don't think that's gonna fly.  It first is a very user-visible
> change to how ptrace_resume() works

Yes. But can't resist, this is a bit unfair ;) It was you who convinced
me we should cleanup this horror somehow, even if we break some corner
cases.

However, again, I can't argue. Perhaps this change is too radical.

In particular, if Jan thinks this is not acceptable - I'll shut up
immediately.

> and it removes a lot of debugging
> capability.

Well. I don't think it limits the current ptrace interface somehow,
but:

> As Jan's examples showed, there are things which the
> debugger does behind group stop's back and some of them are quite
> legitimate and useful things to do like running some code

Yes. This can surprise a user which runs the unmodified debugger.

> If you mix ptrace trap and group stop and then fix group stop
> notification, not only multithreaded debugging becomes quite
> cumbersome (suddenly ptracing becomes per-process thing instead of
> per-thread),

It should be, imho. Like SIGKILL, SIGSTOP/SIGCONT are not per-thread.
This is per-process thing.

> it becomes almost impossible to debug jctl behaviors.
> Jctl becomes completely intertwined with ptracing and the real parent
> would get numerous notifications during the course of debugging.

Again, I think this is a win. The real parent should know that, say,
its child becomes running after it was stopped. It does not matter
why it was CLD_CONTINUED, it was resumed and that is all.

> I think they belong to different layers and they should stack instead
> of mix.  I'll try to write up a summary for how I think it can be done
> later

OK. You know, we already spent sooooooooooooooooooooooooooooooooooooooo
much time discussing this, I have the strong desire to agree in advance
with anything new ;)

> but in short I think we just need two more PTRACE calls (one for
> combined SIGSTOPless attach + INTERRUPT

Yes, we are discussing these requests on archer,

> and the other for jctl
> monitoring)

Of course, we can add the new requests to help gdb/strace/whatever
to handle jctl. In fact I think we should in any case.

But this is "easy". In the context of this discussion, my only concern
is the current behaviour.

> and there doesn't need to be any fundamental revolt in how
> ptrace and jctl interact with each other. The current ptrace behavior
> is quirky and rough on the edges but I think the fundamentals are
> correct in that it's something which is fundamentally bound to a task
> (not task group)

Aha. see above. I feel this is not true. But, as usual, can't prove.

> That way, most current users won't notice (e.g. entering TASK_TRACED
> directly on SIGSTOP doesn't make any difference to strace or gdb,

Just in case, let me repeat...  Yes, I think you are right, TASK_TRACED
looks more clean if the tracee does do_signal_stop().

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-23 19:31                                                                 ` Oleg Nesterov
@ 2011-02-25 15:10                                                                   ` Tejun Heo
  0 siblings, 0 replies; 160+ messages in thread
From: Tejun Heo @ 2011-02-25 15:10 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, Denys Vlasenko, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hello,

On Wed, Feb 23, 2011 at 08:31:17PM +0100, Oleg Nesterov wrote:
> > Heh, well, I also have a mostly working patch but hey if you already
> > did it, all the better.  :-)
> 
> Argh, sorry. If you already have the patch - please send it.
> 
> Otherwise I'll try to refresh the old series, but not before the next week.

Please go ahead and send yours.  Mine still needs to be separated from
other stuff and isn't tested much.

Thanks.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-24 21:08                                                 ` Oleg Nesterov
@ 2011-02-25 15:45                                                   ` Tejun Heo
  2011-02-25 17:42                                                     ` Roland McGrath
  2011-02-28 15:23                                                     ` Oleg Nesterov
  0 siblings, 2 replies; 160+ messages in thread
From: Tejun Heo @ 2011-02-25 15:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Denys Vlasenko, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hello,

On Thu, Feb 24, 2011 at 10:08:19PM +0100, Oleg Nesterov wrote:
> On 02/22, Tejun Heo wrote:
> > > What is important, I think ptrace should respect SIGNAL_STOP_STOPPED.
> > > IOW, when the tracee is group-stopped (TASK_STOPPED or TASK_TRACED,
> > > doesn't matter), ptrace_resume() should not wake it up, but merely
> > > do set_task_state(TASK_STATE) and make it resumeable by SIGCONT.
> >
> > I don't think that's gonna fly.  It first is a very user-visible
> > change to how ptrace_resume() works
> 
> Yes. But can't resist, this is a bit unfair ;) It was you who convinced
> me we should cleanup this horror somehow, even if we break some corner
> cases.

Okay, but I don't think I have changed my position.  Things like the
weird race windows visible from some other thread fall in weird corner
cases but the basic stop/resume behavior is much more fundamental than
that and I don't think it would be wise to change that when there are
other ways to solve the problems.  I was referring more to the subtle
implementation details which prevent the problems from being solved
than changing the model itself.

> However, again, I can't argue. Perhaps this change is too radical.

Or maybe I'm just throwing out different arguments as I see fit.  :-P

Anyways, I'm opposed to changing the principles of the current
behavior for two reasons.

1. Changing those would be too visible and can be avoided by taking a
   different approach.

2. I think, in principle, the current per-task behavior is better than
   the proposed behavior of making jctl and ptrace intertwined.

> > As Jan's examples showed, there are things which the
> > debugger does behind group stop's back and some of them are quite
> > legitimate and useful things to do like running some code
> 
> Yes. This can surprise a user which runs the unmodified debugger.

Yeap, it would.

> > If you mix ptrace trap and group stop and then fix group stop
> > notification, not only multithreaded debugging becomes quite
> > cumbersome (suddenly ptracing becomes per-process thing instead of
> > per-thread),
> 
> It should be, imho. Like SIGKILL, SIGSTOP/SIGCONT are not per-thread.
> This is per-process thing.

jctl should be and will stay per-process, but that doesn't mean
ptrace needs to interact with them at process level.  ptrace can still
be per-task and operate beneath jctl, which is what I'm proposing to
do.

Requiring ptrace to follow jctl's rules might be conceptually
appealing (not to me) but it changes the current behavior a lot and
affects multithread debugging capability significantly.  I really
can't see much upside of such change.

> > it becomes almost impossible to debug jctl behaviors.
> > Jctl becomes completely intertwined with ptracing and the real parent
> > would get numerous notifications during the course of debugging.
> 
> Again, I think this is a win. The real parent should know that, say,
> its child becomes running after it was stopped. It does not matter
> why it was CLD_CONTINUED, it was resumed and that is all.

I see and we do disagree.  :-)

> > I think they belong to different layers and they should stack instead
> > of mix.  I'll try to write up a summary for how I think it can be done
> > later
> 
> OK. You know, we already spent sooooooooooooooooooooooooooooooooooooooo
> much time discussing this, I have the strong desire to agree in advance
> with anything new ;)

Yeah!!  I'll try to write it up tomorrow.

> > but in short I think we just need two more PTRACE calls (one for
> > combined SIGSTOPless attach + INTERRUPT
> 
> Yes, we are discussing these requests on archer,

Can we please do that on LKML?  It's a kernel change after all.

> > and the other for jctl monitoring)
> 
> Of course, we can add the new requests to help gdb/strace/whatever
> to handle jctl. In fact I think we should in any case.
> 
> But this is "easy". In the context of this discussion, my only concern
> is the current behaviour.

IMHO, as long as we don't break the current users in any significant
way, it should be okay, but I don't think we can or should make
fundamental changes to the existing behaviors.  My position is, I
guess, to change the things which prevent us from implementing the new
things.

Thank you.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-24 20:29                                                             ` Oleg Nesterov
@ 2011-02-25 15:51                                                               ` Tejun Heo
  2011-02-26  2:48                                                                 ` Denys Vlasenko
  0 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-25 15:51 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, Denys Vlasenko, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hello,

On Thu, Feb 24, 2011 at 09:29:41PM +0100, Oleg Nesterov wrote:
> Damn. Today is 02/24 ;) sorry.

No need.  I've been pretty lazy with this thread too.  :-)

> > On Fri, Feb 18, 2011 at 08:37:09PM +0100, Oleg Nesterov wrote:
> >
> > > > As it currently stands, SIGSTOP/CONT while ptraced doesn't work
> > >
> > > And this is probably where we disagree the most. I think this is bug,
> > > and this should be fixed.
> >
> > I don't think we disagree that it is a bug.  I want to fix it too but
> > we definitely seem to disagree on how.
> 
> Yes, but I also think that the running tracee in the SIGNAL_STOP_STOPPED
> process is bug by itself. IIUC, you think this is fine.

Yeap, I actually think that's the better way.

> > * ptrace, sans the odd SIGSTOP on attach which we should remove, is
> >   per-task.  Sending out SIGCONT on PTRACE_CONT would break that.  I
> >   really don't think that's a good idea.
> 
> Hmm. But why do you think we should always send SIGCONT after attach?

Hmmm... my sentences were confusing.  I was trying to say,

* ptrace, as it currently stands, is largely per-task.  One exception
  is the implicit SIGSTOP which is sent on PTRACE_ATTACH but this
  should be replaced with a more transparent attach request which
  doesn't affect jctl states.

* Sending out SIGCONT on PTRACE_CONT on jctl stopped tracee adds
  another exception to per-task behavior, which I don't think is a
  good idea.
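
For readers who want to see the implicit SIGSTOP on attach, here is a minimal sketch. It is not part of the patchset. The function name attach_stop_signal() is made up for illustration. It forks an idle child, attaches with PTRACE_ATTACH, and returns the stop signal that waitpid() reports, or -1 if anything fails (for example when ptrace is not permitted in the environment). Per ptrace(2), the reported signal is expected to be SIGSTOP even though nobody sent one explicitly.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: attach to a freshly forked child and return the
 * signal it stopped with, or -1 on any failure.
 */
int attach_stop_signal(void)
{
    int status;
    int sig = -1;
    pid_t child = fork();

    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: idle until the parent kills us. */
        for (;;)
            pause();
    }

    /* PTRACE_ATTACH queues a SIGSTOP for the tracee as a side effect. */
    if (ptrace(PTRACE_ATTACH, child, NULL, NULL) == 0) {
        /* The tracer learns about the resulting stop through waitpid(). */
        if (waitpid(child, &status, 0) == child && WIFSTOPPED(status))
            sig = WSTOPSIG(status);
    }

    /* Clean up: SIGKILL terminates the child whether or not it is traced. */
    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return sig;
}
```

Calling attach_stop_signal() from a small main() and comparing the result against SIGSTOP is enough to observe the side effect that a more transparent attach request would avoid.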

> > * PTRACE_CONT would be behaving completely differently depending on
> >   whether it's resuming from group stop or other traps.
> 
> Afaics, no. It does not matter from where the tracee resumes. See
> the [pseudo patch] I sent. Once again, it doesn't really work, it
> only tries to explain what I mean.

I see.  I'll read the patch again.

Thanks.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-25 15:45                                                   ` Tejun Heo
@ 2011-02-25 17:42                                                     ` Roland McGrath
  2011-02-28 15:23                                                     ` Oleg Nesterov
  1 sibling, 0 replies; 160+ messages in thread
From: Roland McGrath @ 2011-02-25 17:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, Denys Vlasenko, jan.kratochvil, linux-kernel,
	torvalds, akpm

> > Yes, we are discussing these requests on archer,
> 
> Can we please do that on LKML?  It's a kernel change after all.

What's being discussed on the debugger list is batting around ideas of
what kinds of potential new features could be useful to the debugger.
No decisions about kernel issues are being made, and of course when it
comes to proposing specific kernel features, that would be discussed
here.  But the debugger community is who needs to discuss what they
would actually want and would actually make use of.  There is no point
in discussing the details of new ptrace features for the benefit of
the debugger on a kernel list before the debugger community has come
to some consensus about what they would really make use of.  It would
be counterproductive to start proposing and implementing random new
half-baked ideas in the kernel without first being sure that they are
things the debugger actually needs and the debugger developers will
actually do the work to exploit.  We've had enough of that already,
leading to the current morass of ill-specified features that don't
help the debugger people very much.


Thanks,
Roland


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-25 15:51                                                               ` Tejun Heo
@ 2011-02-26  2:48                                                                 ` Denys Vlasenko
  2011-02-28 12:56                                                                   ` Tejun Heo
  2011-02-28 14:36                                                                   ` Oleg Nesterov
  0 siblings, 2 replies; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-26  2:48 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On Friday 25 February 2011 16:51, Tejun Heo wrote:
> > > * ptrace, sans the odd SIGSTOP on attach which we should remove, is
> > >   per-task.  Sending out SIGCONT on PTRACE_CONT would break that.  I
> > >   really don't think that's a good idea.
> > 
> > Hmm. But why do you think we should always send SIGCONT after attach?
> 
> Hmmm... my sentences were confusing.  I was trying to say,
> 
> * ptrace, as it currently stands, is largely per-task.  One exception
>   is the implicit SIGSTOP which is sent on PTRACE_ATTACH but this
>   should be replaced with a more transparent attach request which
>   doesn't affect jctl states.
> 
> * Sending out SIGCONT on PTRACE_CONT on jctl stopped tracee adds
>   another exception to per-task behavior, which I don't think is a
>   good idea.

Guys, it looks like we finally identified some points on which
everyone on this thread agrees (at least I don't see any strong
objections).

To enumerate:

* PTRACE_ATTACH's insertion of SIGSTOP is a design bug, but it is
  so ingrained by now that we don't want to change PTRACE_ATTACH
  semantics. We fix this situation by introducing a new ptrace call,
  PTRACE_ATTACH_NOSTOP, which has a saner API.

* group-stop state is currently not preserved across ptrace-stop.
  This makes, in particular, ^Z and SIGSTOP inoperative for straced
  programs. Everyone agrees this needs to be fixed.
  (There is a small bug of not notifying real parent about the group-stop,
  I don't want to go there since it is also non-contentious - everybody
  is in agreement this also should be fixed in "obvious" way).

* HOWEVER, this behavior _is_ indeed used by gdb to run small fragments
  of tracee even if it's stopped. Jan's example:
    # gdb -p applicationpid
    (gdb) print getpid()
    (gdb) print show_me_your_internal_debug_dump()
    (gdb) continue
  gdb people want to preserve this feature.
  How does gdb implement this? I assume it does this by modifying IP,
  setting a breakpoint on the return address, and issuing PTRACE_CONT(0).
  Currently it works because of the "group-stop is ignored under ptrace" bug.


How can we accommodate this gdb need while fixing this bug?


Oleg's POV is that gdb should SIGCONT the tracee (at least if it is
currently in group-stop). This has the advantage of using a standard Unix
tool. The disadvantage is that SIGCONT will wake up *all* threads,
and that it will cause user-visible effects: the SIGCONT handler will be
run, and the parent can (or "should be able to", we may have a bug there
too) see the child as WCONTINUED.
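
The user-visible side of SIGCONT mentioned here can be shown without ptrace at all. The following is a minimal sketch, with the hypothetical function name observe_stop_then_continue(): a child stops itself with SIGSTOP, the real parent sees the stop via WUNTRACED, sends SIGCONT, and then sees the resume via WCONTINUED. It only demonstrates the plain job-control notifications and says nothing about how they behave under a tracer.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 if the parent observed the child
 * stopping and then continuing, -1 otherwise.
 */
int observe_stop_then_continue(void)
{
    int status;
    int result = -1;
    pid_t child = fork();

    if (child < 0)
        return -1;

    if (child == 0) {
        raise(SIGSTOP);     /* enter a job-control stop */
        for (;;)
            pause();        /* after SIGCONT, idle until killed */
    }

    /* WUNTRACED makes waitpid() report the job-control stop. */
    if (waitpid(child, &status, WUNTRACED) == child &&
        WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP) {
        /* SIGCONT is process-wide: it resumes every thread. */
        kill(child, SIGCONT);

        /* WCONTINUED makes waitpid() report the resume to the parent. */
        if (waitpid(child, &status, WCONTINUED) == child &&
            WIFCONTINUED(status))
            result = 0;
    }

    /* Clean up the child. */
    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return result;
}
```

This is the notification a shell would receive if a debugger resumed a stopped tracee with a real SIGCONT, which is why that approach is visible outside the debugging session.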

Frankly, it seems that this is hardly acceptable for gdb. gdb people
do want here a "secret" backdoor-ish way to make a *thread*
(not the whole process) running even when the process is in group-stop.
Yes, this is a "violation" of the convention that normally
stopped process has all threads stopped, and it makes Oleg feel
it is "wrong", but it is also useful, and used in real life.
We can't ignore that.


Jan's idea is to make kernel remember group-stop state upon attach,
preserve current behavior of ignoring group-stop while attached,
and restore group-stop upon detach.
Sorry Jan, this won't work in many cases. It won't fix the
"stracing makes process ignore SIGSTOP" bug - the result will be
that the buggy behavior will still be observed. Neither will it work for
    # gdb -p applicationpid
    (gdb) print getpid()
    (gdb) print show_me_your_internal_debug_dump()
    (gdb) continue
- the "continue" will make application run even if we attached to it while
it was stopped. It will ONLY work for
    # gdb -p applicationpid
    (gdb) print getpid()
    (gdb) print show_me_your_internal_debug_dump()
    (gdb) quit
sequence. Which is good, but not good enough.


Tejun, you are disagreeing with Oleg's proposal. Do you have a proposal
which looks better to you? Or do you propose to just leave it as-is,
that is, to continue to ignore group-stop under ptrace?


From my side, I really want to see "group-stop is ignored under ptrace"
bug fixed, yet I feel gdb's needs are legitimate. Perhaps I can help
by presenting a few ideas how to open a backdoor in ptrace API for gdb:

(a) Special-case ptrace(PTRACE_CONT/SYSCALL, pid, 0, SIGCONT) to do
"special restart for gdb" thing. Problem with this idea is that we can
be in ptrace-stop caused by genuine signal delivery, and using
ptrace(PTRACE_CONT/SYSCALL, SIGCONT) from it means "inject SIGCONT".
IOW: this creates ambiguity.

    or

(b) Abuse "addr" parameter in ptrace(PTRACE_CONT/SYSCALL, pid, addr, sig).
Currently, it is unused. Can we define a value for it which means
"do gdb hacky restart under group-stop, if tracee is indeed under group-stop"?
(the value should be different from 0 and 0x1 - values currently used by strace)

    or

(c) Add ptrace(PTRACE_CONT2/SYSCALL2/SINGLESTEP2) with the semantic of
"do gdb hacky restart under group-stop, if tracee is indeed under group-stop".
I like it less because we have at least three restarting PTRACE_foo,
maybe even four if we want to have DETACH2 too.
Duplicating every one of them feels ugly.

    or

(d) Add a ptrace option PTRACE_O_IGNORE_JOB_STOP which can be set/cleared
by PTRACE_SETOPTIONS and which modifies ptrace-restart behavior.
gdb will set the option before it wants to do a
"restart-which-ignores-group-stop", and clear it again when it
no longer wants it. In the example above:
    # gdb -p applicationpid
    (gdb) print getpid() # sets IGNORE_JOB_STOP before PTRACE_CONT(0)
    (gdb) print show_me_your_internal_debug_dump() # sets IGNORE_JOB_STOP
    (gdb) continue       # clears IGNORE_JOB_STOP before PTRACE_CONT(0)


Unfortunately, none of them look particularly elegant, and all of them
will require gdb to be changed.

Jan, which one of these proposed changes to API looks "least bad" to you
from gdb POV?

Of course, feel free to provide a better proposal.

-- 
vda


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-14  9:03                                     ` Jan Kratochvil
  2011-02-14 11:39                                       ` Denys Vlasenko
  2011-02-14 16:01                                       ` Oleg Nesterov
@ 2011-02-26  3:59                                       ` Pavel Machek
  2 siblings, 0 replies; 160+ messages in thread
From: Pavel Machek @ 2011-02-26  3:59 UTC (permalink / raw)
  To: Jan Kratochvil
  Cc: Denys Vlasenko, Oleg Nesterov, Tejun Heo, Roland McGrath,
	linux-kernel, torvalds, akpm

On Mon 2011-02-14 10:03:56, Jan Kratochvil wrote:
> On Mon, 14 Feb 2011 00:01:47 +0100, Denys Vlasenko wrote:
> > * sleep runs in nanosleep
> > * SIGSTOP arrives, strace sees it
> > * strace logs it and allows it via ptrace(PTRACE_SYSCALL, ..., SIGSTOP)
> > * sleep process enters group-stop
> 
> The last point breaks the documented behavior of ptrace:
> 	If data is nonzero and not SIGSTOP, it is interpreted as a signal to
> 	be delivered to the child; otherwise, no signal is delivered.
> 
> I do not see it would affect gdb.  strace will change its behavior when
> SIGSTOP is sent to its tracee although the new behavior may be OK.
> 
> It is more a subject of apps compatibility testing with such a kernel change.

apps compatibility testing?

No, we don't change kernel APIs like that -- those are called
regressions.

Just make ptrace(PTRACE_SYSCALL_2, ...) with fixed semantics.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-26  2:48                                                                 ` Denys Vlasenko
@ 2011-02-28 12:56                                                                   ` Tejun Heo
  2011-02-28 13:16                                                                     ` Denys Vlasenko
  2011-02-28 14:36                                                                   ` Oleg Nesterov
  1 sibling, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-28 12:56 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Oleg Nesterov, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hello, Denys.

On Sat, Feb 26, 2011 at 03:48:03AM +0100, Denys Vlasenko wrote:
> * PTRACE_ATTACH's insertion of SIGSTOP is a design bug, but it is
>   so ingrained by now that we don't want to change PTRACE_ATTACH
>   semantic. We fix this situation by introducing a new ptrace call,
>   PTRACE_ATTACH_NOSTOP, which has saner API.

I'm thinking about a slightly different one.  Instead of having
PTRACE_ATTACH_NOSTOP + INTERRUPT, I think one which attaches and
cleanly seizes the tracee would be better.  Let's say it's
PTRACE_SEIZE.  This should be able to serve both ATTACH and INTERRUPT,
but this is a detail.

> * group-stop state is currently not preserved across ptrace-stop.
>   This makes, in particular, ^Z and SIGSTOP inoperative for straced
>   programs. Everyone agrees this needs to be fixed.
>   (There is a small bug of not notifying real parent about the group-stop,
>   I don't want to go there since it is also non-contentious - everybody
>   is in agreement this also should be fixed in "obvious" way).

Yeap, we do agree on this one, unfortunately not on how yet.

> * HOWEVER, this behavior _is_ indeed used by gdb to run small fragments
>   of tracee even if it's stopped. Jan's example:
>     # gdb -p applicationpid
>     (gdb) print getpid()
>     (gdb) print show_me_your_internal_debug_dump()
>     (gdb) continue
>   gdb people want to preserve this feature.
>   How gdb implements this? I assume it does this by modifying IP,
>   setting a breakpoint on return address, and issues PTRACE_CONT(0).
>   Currently it works because of "group-stop is ignored under ptrace" bug.

I don't think it works because of "group-stop is ignored under ptrace"
bug.  IMO, it's because ptrace is inherently per-task not
per-task-group, which I think is the right way to do it.

> How we can accommodate this gdb need while fixing this bug?
>
> Oleg's POV is that gdb should SIGCONT the tracee (at least if it is
> currently in group-stop). This has the advantage of using standard Unix
> tool. The disadvantage is that SIGCONT will wake up *all* threads,
> and that it will cause user-visible effects (SIGCONT handler will be run,
> parent can (or "should be able to", we may have a bug there too)
> see child to be WCONTINUED.
> 
> Frankly, it seems that this is hardly acceptable for gdb. gdb people
> do want here a "secret" backdoor-ish way to make a *thread*
> (not the whole process) running even when the process is in group-stop.
> Yes, this is a "violation" of the convention that normally
> stopped process has all threads stopped, and it makes Oleg feel
> it is "wrong", but it is also useful, and used in real life.
> We can't ignore that.

Yeah, agreed and as I said multiple times I think this is by design
and actually the better and more useful behavior, albeit slightly less
intuitive.

> Jan's idea is to make kernel remember group-stop state upon attach,
> preserve current behavior of ignoring group-stop while attached,
> and restore group-stop upon detach.
> Sorry Jan, this won't work in many cases. It won't fix the
> "stracing makes process ignore SIGSTOP" bug - the result will be
> that buggy behavior will be still observed. Neither it will work for
>     # gdb -p applicationpid
>     (gdb) print getpid()
>     (gdb) print show_me_your_internal_debug_dump()
>     (gdb) continue
> - the "continue" will make application run even if we attached to it while
> it was stopped. It will ONLY work for
>     # gdb -p applicationpid
>     (gdb) print getpid()
>     (gdb) print show_me_your_internal_debug_dump()
>     (gdb) quit
> sequence. Which is good, but not good enough.
> 
> Tejun, you are disagreeing with Oleg's proposal.

Yeap.

> Do you have a proposal which looks better to you? Or do you propose
> to just leave it as-is, that is, to continue to ignore group-stop
> under ptrace?

I'm writing my proposal now.  Will post soon.  Was too lazy to do
anything during the weekend.

> From my side, i really want to see "group-stop is ignored under ptrace"
> bug fixed, yet I feel gdb's needs are legitimate. Perhaps I can help
> by presenting a few ideas how to open a backdoor in ptrace API for gdb:
> 
> (a) Special-case ptrace(PTRACE_CONT/SYSCALL, pid, 0, SIGCONT) to do
> "special restart for gdb" thing. Problem with this idea is that we can
> be in ptrace-stop caused by genuine signal delivery, and using
> ptrace(PTRACE_CONT/SYSCALL, SIGCONT) from it means "inject SIGCONT".
> IOW: this creates ambiguity.
> 
>     or
> 
> (b) Abuse "addr" parameter in ptrace(PTRACE_CONT/SYSCALL, pid, addr, sig).
> Currently, it is unused. Can we define a value for it which means
> "do gdb hacky restart under group-stop, if tracee is indeed under group-stop"?
> (the value should be different from 0 and 0x1 - values currently used by strace)
> 
>     or
> 
> (c) Add ptrace(PTRACE_CONT2/SYSCALL2/SINGLESTEP2) with the semantic of
> "do gdb hacky restart under group-stop, if tracee is indeed under group-stop".
> I like it less because we have at least three restarting PTRACE_foo,
> maybe even four if we want to have DETACH2 too.
> Duplicating every one of them feels ugly.
> 
>     or
> 
> (d) Add a ptrace option PTRACE_O_IGNORE_JOB_STOP which can be set/cleared
> by PTRACE_SETOPTIONS and which modifies ptrace-restart behavior.
> gdb will set the option before it wants to do
> "restart-which-ignores-group-stop", and clears it again when it
> no longer wants it. In the example above:
>     # gdb -p applicationpid
>     (gdb) print getpid() # sets IGNORE_JOB_STOP before PTRACE_CONT(0)
>     (gdb) print show_me_your_internal_debug_dump() # sets IGNORE_JOB_STOP
>     (gdb) continue       # clears IGNORE_JOB_STOP before PTRACE_CONT(0)

I don't think any such hack is necessary.  We just need to let the
ptracer know what's going on.  There's no need to discern between trap
resume and group stop resume.  Anyways, will come back soon with a
proposal.

Thanks.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-28 12:56                                                                   ` Tejun Heo
@ 2011-02-28 13:16                                                                     ` Denys Vlasenko
  2011-02-28 13:29                                                                       ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-28 13:16 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On Mon, Feb 28, 2011 at 1:56 PM, Tejun Heo <tj@kernel.org> wrote:
>> * group-stop state is currently not preserved across ptrace-stop.
>>   This makes, in particular, ^Z and SIGSTOP inoperative for straced
>>   programs. Everyone agrees this needs to be fixed.
>>   (There is a small bug of not notifying real parent about the group-stop,
>>   I don't want to go there since it is also non-contentious - everybody
>>   is in agreement this also should be fixed in "obvious" way).
>
> Yeap, we do agree on this one, unfortunately not on how yet.
>
>> * HOWEVER, this behavior _is_ indeed used by gdb to run small fragments
>>   of tracee even if it's stopped. Jan's example:
>>     # gdb -p applicationpid
>>     (gdb) print getpid()
>>     (gdb) print show_me_your_internal_debug_dump()
>>     (gdb) continue
>>   gdb people want to preserve this feature.
>>   How gdb implements this? I assume it does this by modifying IP,
>>   setting a breakpoint on return address, and issues PTRACE_CONT(0).
>>   Currently it works because of "group-stop is ignored under ptrace" bug.
>
> I don't think it works because of "group-stop is ignored under ptrace"
> bug.

How so?
Imagine the following: the tracee was stopped (two cases: it was stopped
before we attached to it, or it was stopped by SIGSTOP during the debug
session), and we run on a hypothetical kernel which preserves group-stop.
At this point, the user does this in gdb:

(gdb) print getpid()

gdb modifies IP, sets a breakpoint on the return address, and issues
PTRACE_CONT(0). The kernel has to put the tracee into group-stop, right?
Because if it doesn't, if it makes the tracee run, then the kernel is
still broken. For example, stracing a program and sending SIGSTOP to it
won't work: the sequence of events will be
got SIGSTOP because SIGSTOP was delivered
PTRACE_SYSCALL(SIGSTOP) - "inject it"
got SIGSTOP because tracee is in group-stop now
PTRACE_SYSCALL(SIGSTOP) - equivalent to PTRACE_SYSCALL(0)
  because we aren't in signal delivery ptrace-stop
and tracee continues.

That's why I think gdb's "print getpid()" today depends on the bug.
If we simply fix the bug (by making PTRACE_CONT/SYSCALL(0)
re-enter group-stop), then "print getpid()" will stop working
for stopped tracees.
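
The step "equivalent to PTRACE_SYSCALL(0)" in the sequence above can be written down as a tiny model. This is a toy sketch of the behavior as described in this mail, not kernel code; the enum and the function name injected_signal are invented for illustration.

```c
#include <assert.h>
#include <signal.h>

/* The two kinds of stop that appear in the strace + SIGSTOP sequence. */
enum stop_kind {
    SIGNAL_DELIVERY_STOP,   /* a signal is about to be delivered to the tracee */
    GROUP_STOP              /* tracee stopped as part of a job-control stop */
};

/*
 * Toy model (hypothetical helper): which signal the tracee actually
 * receives when the tracer restarts it with PTRACE_SYSCALL(data)
 * from the given kind of stop.
 */
int injected_signal(enum stop_kind kind, int data)
{
    assert(data >= 0);
    if (kind == SIGNAL_DELIVERY_STOP)
        return data;    /* data is the signal to deliver; 0 suppresses it */
    return 0;           /* elsewhere data is ignored, same as passing 0 */
}
```

In this model the first PTRACE_SYSCALL(SIGSTOP) injects SIGSTOP, while the second one, issued from group-stop, injects nothing and simply lets the tracee run.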

> IMO, it's because ptrace is inherently per-task not
> per-task-group, which I think is the right way to do it.

Yes, it is, and I don't propose to change that.
However, I don't see how that is relevant to examples
I just described.

> Yeah, agreed and as I said multiple times I think this is by design
> and actually the better and more useful behavior, albeit slightly less
> intuitive.

As I described, the current behavior breaks stracing of programs
which get SIGSTOPed or SIGTSTP'ed (^Z).
Which is pretty lame - ^Z is not exactly a rare use case.

-- 
vda


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-28 13:16                                                                     ` Denys Vlasenko
@ 2011-02-28 13:29                                                                       ` Tejun Heo
  2011-02-28 13:41                                                                         ` Denys Vlasenko
  0 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-28 13:29 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Oleg Nesterov, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On Mon, Feb 28, 2011 at 02:16:48PM +0100, Denys Vlasenko wrote:
> (gdb) print getpid()
> 
> gdb modifies IP, sets breakpoint on return address, and issues PTRACE_CONT(0).
> Kernel has to put the tracee into group-stop, right?
> Because if it doesn't, if it makes tracee run, then the kernel is
> still broken. For example,
> stracing a program and sending SIGSTOP on it won't work: the sequence
> of events will be
> got SIGSTOP because SIGSTOP was delivered
> PTRACE_SYSCALL(SIGSTOP) - "inject it"
> got SIGSTOP because tracee is in group-stop now
> PTRACE_SYSCALL(SIGSTOP) - equivalent to PTRACE_SYSCALL(0)
>   because we aren't in signal delivery ptrace-stop
> and tracee continues.
> 
> That's why I think gdb's "print getpid()" today depends on the bug.
> If we simply fix the bug (by making PTRACE_CONT/SYSCALL(0)
> re-enter group-stop), then "print getpid()" will stop working
> for stopped tracees.

There's no reason to make the tracee re-enter group stop after pulling
it out to execute 'print getpid()'.  The only thing necessary is a way
for the debugger to find out that group stop has been lifted.  The
debugger then can resume the tracee if it wishes so.  ie. group stop
becomes a trap point + a state which the debugger can monitor.  If the
debugger wants the tracee to follow the jctl behavior, it can do so by
resuming the tracee as it sees fit.

Thanks.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-28 13:29                                                                       ` Tejun Heo
@ 2011-02-28 13:41                                                                         ` Denys Vlasenko
  2011-02-28 13:53                                                                           ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-28 13:41 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On Mon, Feb 28, 2011 at 2:29 PM, Tejun Heo <tj@kernel.org> wrote:
> On Mon, Feb 28, 2011 at 02:16:48PM +0100, Denys Vlasenko wrote:
>> (gdb) print getpid()
>>
>> gdb modifies IP, sets breakpoint on return address, and issues PTRACE_CONT(0).
>> Kernel has to put the tracee into group-stop, right?
>> Because if it doesn't, if it makes tracee run, then the kernel is
>> still broken. For example,
>> stracing a program and sending SIGSTOP on it won't work: the sequence
>> of events will be
>> got SIGSTOP because SIGSTOP was delivered
>> PTRACE_SYSCALL(SIGSTOP) - "inject it"
>> got SIGSTOP because tracee is in group-stop now
>> PTRACE_SYSCALL(SIGSTOP) - equivalent to PTRACE_SYSCALL(0)
>>   because we aren't in signal delivery ptrace-stop
>> and tracee continues.
>>
>> That's why I think gdb's "print getpid()" today depends on the bug.
>> If we simply fix the bug (by making PTRACE_CONT/SYSCALL(0)
>> re-enter group-stop), then "print getpid()" will stop working
>> for stopped tracees.
>
> There's no reason to make the tracee re-enter group stop after pulling
> it out to execute 'print getpid()'.

If we want to execute 'print getpid()', you are right, we don't want
to enter group stop. That's use case #1. But there is use case #2:
"strace sleep" + ^Z. If we want to keep stracing sleep
without resuming it, our PTRACE_SYSCALL(0) *must* make sleep
enter group stop. Otherwise, sleep won't be stopped. It will
continue sleeping and will exit, which is not what we want. Right?

>  The only thing necessary is a way
> for the debugger to find out that group stop has been lifted.

What do you mean by "has been *lifted*"?

> The debugger then can resume the tracee if it wishes so.  ie. group stop
> becomes a trap point + a state which the debugger can monitor.  If the
> debugger wants the tracee to follow the jctl behavior, it can do so by
> resuming the tracee as it sees fit.

Can you describe this in more detail? Do you propose that the
debugger can detect that we are in group stop (it is already sort-of
possible with PTRACE_GETSIGINFO) and if it doesn't want to
restart the tracee, it simply doesn't do any PTRACE_SYSCALL/CONT?
I tried that. This makes the tracee sit in *ptrace* stop, not group stop.
Meaning: the debugger is never able to see the waking SIGCONT:
waitpid doesn't report it to the debugger.

-- 
vda


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-28 13:41                                                                         ` Denys Vlasenko
@ 2011-02-28 13:53                                                                           ` Tejun Heo
  2011-02-28 14:25                                                                             ` Denys Vlasenko
  0 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-28 13:53 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Oleg Nesterov, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hello,

On Mon, Feb 28, 2011 at 02:41:29PM +0100, Denys Vlasenko wrote:
> > There's no reason to make the tracee re-enter group stop after pulling
> > it out to execute 'print getpid()'.
> 
> If we want to execute 'print getpid()', you are right, we don't want
> to enter group stop. That's use case #1. But there is use case #2:
> "strace sleep" + ^Z. if we want to continue stracing sleep
> without continuing, our PTRACE_SYSCALL(0) *must* make sleep
> enter group stop. Otherwise, sleep won't be stopped. It will
> continue sleeping and will exit, which is not what we want. Right?

I don't really follow the distinction you're trying to make.  It
doesn't matter what you're trying to do.  All that's necessary is for
the debugger to find out whether group stop is in effect or not and a
way to control the tracee's execution.  Nothing else is necessary.
The debugger already knows when the tracee enters group stop.  We just
need a way for the debugger to find out when group stop stops.

> >  The only thing necessary is a way
> > for the debugger to find out that group stop has been lifted.
> 
> What do you mean by "has been *lifted*"?

Somebody sent SIGCONT.

> > The debugger then can resume the tracee if it wishes so.  ie. group stop
> > becomes a trap point + a state which the debugger can monitor.  If the
> > debugger wants the tracee to follow the jctl behavior, it can do so by
> > resuming the tracee as it sees fit.
> 
> Can you describe this in more details? Do you propose that
> debugger can detect that we are in group stop (it is already sort-of
> possible with PTRACE_GETSIGINFO) and if it doesn't want to
> restart tracee, it simply doesn't do any PTRACE_SYSCALL/CONT?
> I tried that. This makes tracee sit in *ptrace* stop, not group stop.
> Meaning: debugger is never be able to see waking SIGCONT:
> waitpid doesn't report it to the debugger.

A tracee should _always_ enter ptrace trap whenever stopping while
ptraced.  The stop here or stop there depending on the type of stop
behavior is hardly useful and very fragile (I think it's inherently
fragile that way).  Again, the only missing thing is a way for the
debugger to find out when task stop stops.  BTW, I'm not talking about
the current behavior.  There's no way to make jctl work properly as it
is.  We need to improve the kernel one way or another.

Thanks.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-28 13:53                                                                           ` Tejun Heo
@ 2011-02-28 14:25                                                                             ` Denys Vlasenko
  2011-02-28 14:39                                                                               ` Tejun Heo
  0 siblings, 1 reply; 160+ messages in thread
From: Denys Vlasenko @ 2011-02-28 14:25 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On Mon, Feb 28, 2011 at 2:53 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On Mon, Feb 28, 2011 at 02:41:29PM +0100, Denys Vlasenko wrote:
>> > There's no reason to make the tracee re-enter group stop after pulling
>> > it out to execute 'print getpid()'.
>>
>> If we want to execute 'print getpid()', you are right, we don't want
>> to enter group stop. That's use case #1. But there is use case #2:
>> "strace sleep" + ^Z. if we want to continue stracing sleep
>> without continuing, our PTRACE_SYSCALL(0) *must* make sleep
>> enter group stop. Otherwise, sleep won't be stopped. It will
>> continue sleeping and will exit, which is not what we want. Right?
>
> I don't really follow the distinction you're trying to make.  It
> doesn't matter what you're trying to do.  All that's necessary is for
> the debugger to find out whether group stop is in effect or not and a
> way to control the tracee's execution.  Nothing else is necessary.
> The debugger already knows when the tracee enters group stop.  We just
> need a way for the debugger to find out when group stop stops.

If I understand you right, you are proposing to handle strace+^Z
scenario this way (simplified pseudo-C):

for(;;) {
  waitpid(-1, ...);
  tracee_is_stopped = 0;
  if (waitpid reported *stopping* signal) {
    PTRACE_GETSIGINFO();
    if (PTRACE_GETSIGINFO failed)
      tracee_is_stopped = 1;
  }
  if (!tracee_is_stopped)
    PTRACE_SYSCALL(signal);
}

This requires an API change to make waitpid in the debugger
see waking SIGCONTs even if we did not restart the tracee
(at least in the case when the tracee was in group stop).

This sounds like a viable plan from userspace POV.
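
The decision step of the pseudo-C loop above could look roughly like the sketch below. It assumes Linux with glibc and leans on the rule later documented in ptrace(2), that PTRACE_GETSIGINFO failing with EINVAL on a stopping signal indicates group-stop. The function names are made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Only these four signals can put a task into a job-control stop. */
int is_stopping_signal(int sig)
{
    return sig == SIGSTOP || sig == SIGTSTP ||
           sig == SIGTTIN || sig == SIGTTOU;
}

/*
 * Pure decision step: given the stop signal reported by waitpid() and the
 * outcome of PTRACE_GETSIGINFO (return value and errno), decide whether
 * the tracee is in group-stop and should not be restarted.
 */
int is_group_stop(int stopsig, long getsiginfo_rc, int getsiginfo_errno)
{
    if (!is_stopping_signal(stopsig))
        return 0;
    /* Other errors (e.g. ESRCH) mean something else, such as a dead tracee. */
    return getsiginfo_rc == -1 && getsiginfo_errno == EINVAL;
}

/* Wrapper that asks the kernel; pid must be a tracee that just reported a stop. */
int tracee_in_group_stop(pid_t pid, int wait_status)
{
    siginfo_t si;
    long rc;

    assert(WIFSTOPPED(wait_status));
    errno = 0;
    rc = ptrace(PTRACE_GETSIGINFO, pid, NULL, &si);
    return is_group_stop(WSTOPSIG(wait_status), rc, errno);
}
```

A tracer loop would call tracee_in_group_stop() after each waitpid() and skip the PTRACE_SYSCALL restart when it returns 1. Seeing the later SIGCONT still needs the kernel change discussed here.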

>> Can you describe this in more details? Do you propose that
>> debugger can detect that we are in group stop (it is already sort-of
>> possible with PTRACE_GETSIGINFO) and if it doesn't want to
>> restart tracee, it simply doesn't do any PTRACE_SYSCALL/CONT?
>> I tried that. This makes tracee sit in *ptrace* stop, not group stop.
>> Meaning: debugger is never be able to see waking SIGCONT:
>> waitpid doesn't report it to the debugger.
>
> A tracee should _always_ enter ptrace trap whenever stopping while
> ptraced.  The stop here or stop there depending on the type of stop
> behavior is hardly useful and very fragile (I think it's inherently
> fragile that way).  Again, the only missing thing is a way for the
> debugger to find out when task stop stops.

PTRACE_GETSIGINFO tells you that. It's a bit of a hack
(PTRACE_GETSIGINFO was meant to be used for a different purpose)
but it seems to be working well.

The problematic case is when we attach to *already stopped* task.
gdb today goes through insane gyrations to detect that condition
(I believe it looks into /proc/TID/state).
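
For reference, a check of this kind can be sketched as below. It assumes the per-task state is read from the "State:" line of /proc/<tid>/status, where 'T' marks a job-control stop and, on kernels that distinguish the two, lowercase 't' marks a tracing stop. This is not gdb's actual code, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Return 1 if `line` is a "State:" line reporting a job-control stop ('T'),
 * 0 otherwise.  A tracing stop ('t') is deliberately not counted.
 */
int status_line_is_stopped(const char *line)
{
    const char *p;

    assert(line != NULL);
    if (strncmp(line, "State:", 6) != 0)
        return 0;
    p = line + 6;
    while (*p == ' ' || *p == '\t')
        p++;                    /* skip the separator after "State:" */
    return *p == 'T';
}

/* Scan /proc/<tid>/status; returns 1 if stopped, 0 if not, -1 on error. */
int task_is_stopped(pid_t tid)
{
    char path[64], line[256];
    int result = 0;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%ld/status", (long)tid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    while (fgets(line, sizeof(line), f) != NULL) {
        if (status_line_is_stopped(line)) {
            result = 1;
            break;
        }
    }
    fclose(f);
    return result;
}
```

Such a check is inherently racy, since the state can change between reading the file and the attach, which is part of why a kernel-side notification looks more attractive.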

If we are going to add PTRACE_ATTACH_NOSTOP or something like it,
perhaps we need to take care to make tracee state detectable
at attach time more easily. Maybe by delivering "I have stopped with SIGfoo"
notification to the debugger if tracee was in group stop?

> BTW, I'm not talking about
> the current behavior.  There's no way to make jctl work properly as it
> is.  We need to improve the kernel one way or another.

I'm happy to hear that :)
Thanks!
-- 
vda


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-26  2:48                                                                 ` Denys Vlasenko
  2011-02-28 12:56                                                                   ` Tejun Heo
@ 2011-02-28 14:36                                                                   ` Oleg Nesterov
  1 sibling, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-28 14:36 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Tejun Heo, Roland McGrath, jan.kratochvil, linux-kernel, torvalds, akpm

On 02/26, Denys Vlasenko wrote:
>
> * HOWEVER, this behavior _is_ indeed used by gdb to run small fragments
>   of tracee even if it's stopped. Jan's example:
>     # gdb -p applicationpid
>     (gdb) print getpid()
>     (gdb) print show_me_your_internal_debug_dump()
>     (gdb) continue
>   gdb people want to preserve this feature.

Yes. Jan is looking at this, and probably he will nack this change.

> How we can accommodate this gdb need while fixing this bug?
>
>
> Oleg's POV is that gdb should SIGCONT the tracee (at least if it is
> currently in group-stop). This has the advantage of using standard Unix
> tool. The disadvantage is that SIGCONT will wake up *all* threads,

Not necessarily. That is why, btw, I started to like Tejun's suggestion
that the traced task should always stop in TASK_TRACED state. This means
SIGCONT can only wake up the tracee after PTRACE_CONT from the debugger.

Even without enforcing TASK_TRACED from the kernel side, gdb should
do at least one ptrace() call after attach, which makes it TASK_TRACED
anyway.

> gdb people
> do want here a "secret" backdoor-ish way to make a *thread*
> (not the whole process) running even when the process is in group-stop.

And this is what I disagree with. This was my main motivation to start
this hopeless^W lengthy discussion ;) I simply can't accept the current
behaviour: the task runs while the kernel and parent think the whole
process is stopped.

That is why I also considered another (and imho worse) option. OK, let's
resume the tracee even if it is stopped. But in this case, let's clear
SIGNAL_STOP_STOPPED and notify its parent.

> how to open a backdoor in ptrace API for gdb:

Probably I am wrong, but in the context of this discussion I do not
care much about the new possible requests/improvements in gdb/kernel.

Of course we can do something to make gdb happy, but the problem is
the current/old code. The main objection (and I have to respect it)
is: this change is not compatible.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-28 14:25                                                                             ` Denys Vlasenko
@ 2011-02-28 14:39                                                                               ` Tejun Heo
  2011-02-28 16:48                                                                                 ` Oleg Nesterov
  0 siblings, 1 reply; 160+ messages in thread
From: Tejun Heo @ 2011-02-28 14:39 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Oleg Nesterov, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

Hey,

On Mon, Feb 28, 2011 at 03:25:53PM +0100, Denys Vlasenko wrote:
> On Mon, Feb 28, 2011 at 2:53 PM, Tejun Heo <tj@kernel.org> wrote:
> > I don't really follow the distinction you're trying to make.  It
> > doesn't matter what you're trying to do.  All that's necessary is for
> > the debugger to find out whether group stop is in effect or not and a
> > way to control the tracee's execution.  Nothing else is necessary.
> > The debugger already knows when the tracee enters group stop.  We just
> > need a way for the debugger to find out when group stop stops.
> 
> If I understand you right, you are proposing to handle strace+^Z
> scenario this way (simplified pseudo-C):
> 
> for(;;) {
>   waitpid(-1, ...);
>   tracee_is_stopped = 0;
>   if (waitpid reported *stopping* signal) {
>     PTRACE_GETSIGINFO();
>     if (PTRACE_GETSIGINFO failed)
>       tracee_is_stopped = 1;
>   }
>   if (!tracee_is_stopped)
>     PTRACE_SYSCALL(signal);
> }
> 
> This requires API change to make waitpid in debugger
> see waking SIGCONTs even if we did not restart the tracee
> (at least in the case when tracee was in group stop).

Yes, something like that.  I'm still not sure how to notify end of
group stop to the debugger tho.  Using wait(2) would be the path of
the least resistance but as you pointed out it does change the
behavior.  I think what we can do is to switch on the behavior when
the new attach call is used.  We'll probably have to pay some
attention to make the notification race-free and reliable but I think
it shouldn't be too difficult.

> This sounds like viable plan from userspace POV.

Cool.

> > A tracee should _always_ enter ptrace trap whenever stopping while
> > ptraced.  The stop here or stop there depending on the type of stop
> > behavior is hardly useful and very fragile (I think it's inherently
> > fragile that way).  Again, the only missing thing is a way for the
> > debugger to find out when task stop stops.
> 
> PTRACE_GETSIGINFO tells you that. It's a bit of a hack
> (PTRACE_GETSIGINFO was meant to be used for a different purpose)
> but it seems to be working well.

Yeap, what works works.  We probably want to explain its use in the
man page but I don't think there's any reason to add a new mechanism
for this.

> The problematic case is when we attach to *already stopped* task.
> gdb today goes through insane gyrations to detect that condition
> (I believe it looks into /proc/TID/state).
> 
> If we are going to add PTRACE_ATTACH_NOSTOP or something like it,
> perhaps we need to take care to make tracee state detectable
> at attach time more easily. Maybe by delivering "I have stopped with SIGfoo"
> notification to the debugger if tracee was in group stop?

I believe this is already solved by making the tracee always enter
TASK_TRACED via ptrace_stop() on attach, which always reports the
group stop signal to the ptracer whether the real parent has consumed
it or not.  The patch is at the top of this gigantic thread.  Oleg,
this one is solved, right?

Thanks.

-- 
tejun


* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-25 15:45                                                   ` Tejun Heo
  2011-02-25 17:42                                                     ` Roland McGrath
@ 2011-02-28 15:23                                                     ` Oleg Nesterov
  1 sibling, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-28 15:23 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Denys Vlasenko, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On 02/25, Tejun Heo wrote:
>
> Hello,
>
> On Thu, Feb 24, 2011 at 10:08:19PM +0100, Oleg Nesterov wrote:
>
> > > If you mix ptrace trap and group stop and then fix group stop
> > > notification, not only multithreaded debugging becomes quite
> > > cumbersome (suddenly ptracing becomes per-process thing instead of
> > > per-thread),
> >
> > It should be, imho. Like SIGKILL, SIGSTOP/SIGCONT are not per-thread.
> > This is per-process thing.
>
> jctl should be and will stay per-process, but that doesn't mean
> ptrace needs to interact with them at process level.  ptrace can still
> be per-task and operate beneath jctl, which is what I'm proposing to
> do.

This is what I don't fully understand... Yes, ptrace can still be
per-thread. But yes, if gdb sends SIGCONT to one thread, this affects
the whole group even if it doesn't wake up them all, this is true.

OK, I think this doesn't matter, at least we understand how/where
we do not agree with each other.

> > > but in short I think we just need two more PTRACE calls (one for
> > > combined SIGSTOPless attach + INTERRUPT
> >
> > Yes, we are discussing these requests on archer,
>
> Can we please do that on LKML?  It's a kernel change after all.

Right now this has almost nothing to do with the kernel. Currently we
are trying to understand what gdb needs. But, of course, after that
we should discuss the possible kernel improvements/changes on lkml.

Oleg.



* Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
  2011-02-28 14:39                                                                               ` Tejun Heo
@ 2011-02-28 16:48                                                                                 ` Oleg Nesterov
  0 siblings, 0 replies; 160+ messages in thread
From: Oleg Nesterov @ 2011-02-28 16:48 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Denys Vlasenko, Roland McGrath, jan.kratochvil, linux-kernel,
	torvalds, akpm

On 02/28, Tejun Heo wrote:
>
> I believe this is already solved by making the tracee always enter
> TASK_TRACED via ptrace_stop() on attach, which always reports the
> group stop signal to the ptracer whether the real parent has consumed
> it or not.  The patch is at the top of this gigantic thread.  Oleg,
> this is a solved one, right?

Yes, sure. gdb will never see the stopped (I mean task->state) tracee
after attach, so it does not need to do anything special. The problem
with exit_code == 0 is fixed too.

Oleg.



Thread overview: 160+ messages
2011-01-28 15:08 [PATCHSET] ptrace,signal: group stop / ptrace updates Tejun Heo
2011-01-28 15:08 ` [PATCH 01/10] signal: fix SIGCONT notification code Tejun Heo
2011-01-28 15:08 ` [PATCH 02/10] ptrace: remove the extra wake_up_process() from ptrace_detach() Tejun Heo
2011-01-28 18:46   ` Roland McGrath
2011-01-31 10:38     ` Tejun Heo
2011-02-01 10:26       ` [PATCH] ptrace: use safer wake up on ptrace_detach() Tejun Heo
2011-02-01 13:40         ` Oleg Nesterov
2011-02-01 15:07           ` Tejun Heo
2011-02-01 19:17             ` Oleg Nesterov
2011-02-02  5:31             ` Roland McGrath
2011-02-02 10:35               ` Tejun Heo
2011-02-02  0:27         ` Andrew Morton
2011-02-02  5:33           ` Roland McGrath
2011-02-02  5:38             ` Andrew Morton
2011-02-02 10:34               ` Tejun Heo
2011-02-02 19:33                 ` Andrew Morton
2011-02-02 20:01                   ` Tejun Heo
2011-02-02 21:40             ` Oleg Nesterov
2011-02-02  5:29         ` Roland McGrath
2011-02-02  5:28       ` [PATCH 02/10] ptrace: remove the extra wake_up_process() from ptrace_detach() Roland McGrath
2011-01-28 15:08 ` [PATCH 03/10] signal: remove superflous try_to_freeze() loop in do_signal_stop() Tejun Heo
2011-01-28 18:46   ` Roland McGrath
2011-01-28 15:08 ` [PATCH 04/10] ptrace: kill tracehook_notify_jctl() Tejun Heo
2011-01-28 21:09   ` Roland McGrath
2011-01-28 15:08 ` [PATCH 05/10] ptrace: add @why to ptrace_stop() Tejun Heo
2011-01-28 18:48   ` Roland McGrath
2011-01-28 15:08 ` [PATCH 06/10] signal: fix premature completion of group stop when interfered by ptrace Tejun Heo
2011-01-28 21:22   ` Roland McGrath
2011-01-31 11:00     ` Tejun Heo
2011-02-02  5:44       ` Roland McGrath
2011-02-02 10:56         ` Tejun Heo
2011-01-28 15:08 ` [PATCH 07/10] signal: use GROUP_STOP_PENDING to stop once for a single group stop Tejun Heo
2011-01-28 15:08 ` [PATCH 08/10] ptrace: participate in group stop from ptrace_stop() iff the task is trapping for " Tejun Heo
2011-01-28 21:30   ` Roland McGrath
2011-01-31 11:26     ` Tejun Heo
2011-02-02  5:57       ` Roland McGrath
2011-02-02 10:53         ` Tejun Heo
2011-02-03 10:02           ` Tejun Heo
2011-02-01 19:36     ` Oleg Nesterov
2011-01-28 15:08 ` [PATCH 09/10] ptrace: make do_signal_stop() use ptrace_stop() if the task is being ptraced Tejun Heo
2011-01-28 15:08 ` [PATCH 10/10] ptrace: clean transitions between TASK_STOPPED and TRACED Tejun Heo
2011-02-03 20:41   ` [PATCH 0/1] (Was: ptrace: clean transitions between TASK_STOPPED and TRACED) Oleg Nesterov
2011-02-03 20:41     ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Oleg Nesterov
2011-02-03 21:36       ` Roland McGrath
2011-02-03 21:44         ` Oleg Nesterov
2011-02-04 10:53           ` Tejun Heo
2011-02-04 13:04             ` Oleg Nesterov
2011-02-04 14:48               ` Tejun Heo
2011-02-04 17:06                 ` Oleg Nesterov
2011-02-05 13:39                   ` Tejun Heo
2011-02-07 13:42                     ` Oleg Nesterov
2011-02-07 14:11                       ` Tejun Heo
2011-02-07 15:37                         ` Oleg Nesterov
2011-02-07 16:31                           ` Tejun Heo
2011-02-07 17:48                             ` Oleg Nesterov
2011-02-09 14:18                               ` Tejun Heo
2011-02-09 14:21                                 ` Tejun Heo
2011-02-09 21:25                                 ` Oleg Nesterov
2011-02-13 23:01                                   ` Denys Vlasenko
2011-02-14  9:03                                     ` Jan Kratochvil
2011-02-14 11:39                                       ` Denys Vlasenko
2011-02-14 17:32                                         ` Oleg Nesterov
2011-02-14 16:01                                       ` Oleg Nesterov
2011-02-26  3:59                                       ` Pavel Machek
2011-02-14 15:51                                     ` Oleg Nesterov
2011-02-14 14:50                                   ` Tejun Heo
2011-02-14 18:53                                     ` Oleg Nesterov
2011-02-13 22:25                                 ` Denys Vlasenko
2011-02-14 15:13                                   ` Tejun Heo
2011-02-14 16:15                                     ` Oleg Nesterov
2011-02-14 16:33                                       ` Tejun Heo
2011-02-14 17:23                                         ` Oleg Nesterov
2011-02-14 17:20                                     ` Denys Vlasenko
2011-02-14 17:30                                       ` Tejun Heo
2011-02-14 17:45                                         ` Oleg Nesterov
2011-02-14 17:54                                         ` Denys Vlasenko
2011-02-21 15:16                                           ` Tejun Heo
2011-02-21 15:28                                             ` Oleg Nesterov
2011-02-21 16:11                                               ` [pseudo patch] ptrace should respect the group stop Oleg Nesterov
2011-02-22 16:24                                               ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Tejun Heo
2011-02-24 21:08                                                 ` Oleg Nesterov
2011-02-25 15:45                                                   ` Tejun Heo
2011-02-25 17:42                                                     ` Roland McGrath
2011-02-28 15:23                                                     ` Oleg Nesterov
2011-02-14 17:51                                       ` Oleg Nesterov
2011-02-14 18:55                                         ` Denys Vlasenko
2011-02-14 19:01                                           ` Oleg Nesterov
2011-02-14 19:42                                             ` Denys Vlasenko
2011-02-14 20:01                                               ` Oleg Nesterov
2011-02-15 15:24                                                 ` Tejun Heo
2011-02-15 15:58                                                   ` Oleg Nesterov
2011-02-15 17:31                                                   ` Roland McGrath
2011-02-15 20:27                                                     ` Oleg Nesterov
2011-02-18 17:02                                                       ` Tejun Heo
2011-02-18 19:37                                                         ` Oleg Nesterov
2011-02-21 16:22                                                           ` Tejun Heo
2011-02-21 16:49                                                             ` Oleg Nesterov
2011-02-21 16:59                                                               ` Tejun Heo
2011-02-23 19:31                                                                 ` Oleg Nesterov
2011-02-25 15:10                                                                   ` Tejun Heo
2011-02-24 20:29                                                             ` Oleg Nesterov
2011-02-25 15:51                                                               ` Tejun Heo
2011-02-26  2:48                                                                 ` Denys Vlasenko
2011-02-28 12:56                                                                   ` Tejun Heo
2011-02-28 13:16                                                                     ` Denys Vlasenko
2011-02-28 13:29                                                                       ` Tejun Heo
2011-02-28 13:41                                                                         ` Denys Vlasenko
2011-02-28 13:53                                                                           ` Tejun Heo
2011-02-28 14:25                                                                             ` Denys Vlasenko
2011-02-28 14:39                                                                               ` Tejun Heo
2011-02-28 16:48                                                                                 ` Oleg Nesterov
2011-02-28 14:36                                                                   ` Oleg Nesterov
2011-02-16 21:51                                       ` Jan Kratochvil
2011-02-17  3:37                                         ` Denys Vlasenko
2011-02-17 19:19                                           ` Oleg Nesterov
2011-02-18 21:11                                             ` Jan Kratochvil
2011-02-19 20:16                                               ` Oleg Nesterov
2011-02-17 16:49                                         ` Oleg Nesterov
2011-02-17 18:58                                           ` Roland McGrath
2011-02-17 19:33                                             ` Oleg Nesterov
2011-02-18 21:34                                           ` Jan Kratochvil
2011-02-19 20:06                                             ` Oleg Nesterov
2011-02-20  9:40                                               ` Jan Kratochvil
2011-02-20 17:06                                                 ` Denys Vlasenko
2011-02-20 17:48                                                   ` Oleg Nesterov
2011-02-20 19:10                                                   ` Jan Kratochvil
2011-02-20 19:16                                                     ` Oleg Nesterov
2011-02-20 17:16                                                 ` Oleg Nesterov
2011-02-20 18:52                                                   ` Jan Kratochvil
2011-02-20 20:38                                                     ` Oleg Nesterov
2011-02-20 21:06                                                       ` `(T) stopped' preservation after _exit() [Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH] Jan Kratochvil
2011-02-20 21:19                                                         ` Oleg Nesterov
2011-02-20 21:20                                                       ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Jan Kratochvil
2011-02-21 14:23                                                         ` Oleg Nesterov
2011-02-23 16:44                                                           ` Jan Kratochvil
2011-02-14 15:31                                   ` Oleg Nesterov
2011-02-14 17:24                                     ` Denys Vlasenko
2011-02-14 17:39                                       ` Oleg Nesterov
2011-02-14 17:57                                         ` Denys Vlasenko
2011-02-14 18:00                                           ` Oleg Nesterov
2011-02-14 18:06                                             ` Oleg Nesterov
2011-02-14 18:59                                         ` Denys Vlasenko
2011-02-13 21:24                 ` Denys Vlasenko
2011-02-14 15:06                   ` Oleg Nesterov
2011-02-14 15:19                     ` Tejun Heo
2011-02-14 16:20                       ` Oleg Nesterov
2011-02-14 17:05                     ` Denys Vlasenko
2011-02-14 17:18                       ` Oleg Nesterov
2011-01-28 16:54 ` [PATCHSET] ptrace,signal: group stop / ptrace updates Ingo Molnar
2011-01-28 17:41   ` Thomas Gleixner
2011-01-28 18:04     ` Anca Emanuel
2011-01-28 18:36       ` Mathieu Desnoyers
2011-01-28 17:55   ` Oleg Nesterov
2011-01-28 18:29     ` Bash not reacting to Ctrl-C Ingo Molnar
2011-02-05 20:34       ` Oleg Nesterov
2011-02-07 13:08         ` Oleg Nesterov
2011-02-09  6:17           ` Michael Witten
2011-02-09 14:53             ` Ingo Molnar
2011-02-09 19:37               ` Michael Witten
2011-02-11 14:41           ` Pavel Machek
