[RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced
@ 2011-03-08 19:56 Tejun Heo
  2011-03-08 19:56 ` [PATCH 1/8] job control: Don't set group_stop exit_code if re-entering job control stop Tejun Heo
                   ` (12 more replies)
  0 siblings, 13 replies; 47+ messages in thread
From: Tejun Heo @ 2011-03-08 19:56 UTC (permalink / raw)
  To: oleg, roland, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan

Hello,

This patchset implements "P2. Fix notifications to the real parent" of
the ptrace job control improvements proposal[1].

As the whole job control / ptrace logic is quite delicate, I tried to
be very granual with changes and add plenty of explanations for
subtleties.

This patchset contains the following eight patches.

 0001-job-control-Don-t-set-group_stop-exit_code-if-re-ent.patch
 0002-job-control-Small-reorganization-of-wait_consider_ta.patch
 0003-job-control-Fix-ptracer-wait-2-hang-and-explain-nota.patch
 0004-job-control-Allow-access-to-job-control-events-throu.patch
 0005-job-control-Add-for_ptrace-to-do_notify_parent_cldst.patch
 0006-job-control-Job-control-stop-notifications-should-al.patch
 0007-job-control-Notify-the-real-parent-of-job-control-ev.patch
 0008-job-control-Don-t-send-duplicate-job-control-stop-no.patch

0001-0003 fix subtle issues and prepare for wait(2) related changes.
I don't think any of this needs to be backported.  0004 make the job
control stopped/continued states visible to the real parent regardless
of ptrace.

0005-0006 prepare for notification related changes.  0007-0008 enable
proper job control notifications to the real parent regardless of
ptrace.

This patchset introduces behavior changes; however, all the changes
are visible only to the real parent.  As basically everything job
control related was broken while ptraced, I don't think these changes
need to be conditionalized (ie. enabled only when explicitly told so).
Doing so is likely to only increase confusion with few benefits.

After this patchset, I can ^Z a strace(1) attached command and get a
shell prompt.  The interaction of course is not perfect yet because
strace(1) currently unconditionally overrides job control stop, but
it's a progress.  For strace(1) itself, nothing really changes.

This patchset is based on 2.6.38-rc7 (212e3499b2) + "group stop /
ptrace updates, take#2" patchset[2] which implements P1.  The git
branch for the P1 patchset has been regenerated with Oleg's Acked-by's
added and available in ptrace-review-P1 branch.

This patchset is available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git ptrace-review-P2

HEAD is fbbe61c532ba23d2d00bff7c2b354e0522ae4afc.  git.korg takes some
time to sync so if it shows older commit, please try again after a
while.  diffstat follows.

 kernel/exit.c   |   83 +++++++++++++++++++++++++++++++---------
 kernel/signal.c |  116 ++++++++++++++++++++++++++++++++++++++++++++++++++------
 2 files changed, 170 insertions(+), 29 deletions(-)

Thanks.

--
tejun

[1] http://thread.gmane.org/gmane.linux.kernel/1107045
[2] http://thread.gmane.org/gmane.linux.kernel/1109224

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 1/8] job control: Don't set group_stop exit_code if re-entering job control stop
  2011-03-08 19:56 [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
@ 2011-03-08 19:56 ` Tejun Heo
  2011-03-21 13:20   ` Oleg Nesterov
  2011-03-08 19:56 ` [PATCH 2/8] job control: Small reorganization of wait_consider_task() Tejun Heo
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 47+ messages in thread
From: Tejun Heo @ 2011-03-08 19:56 UTC (permalink / raw)
  To: oleg, roland, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, Tejun Heo

While ptraced, a task may be resumed while the containing process is
still job control stopped.  If the task receives another stop signal
in this state, it will still initiate group stop, which generates
group_exit_code, which the real parent would be able to see once the
ptracer detaches.

In this scenario, the real parent may see two consecutive CLD_STOPPED
events from two stop signals without intervening SIGCONT, which
normally is impossible.

Test case follows.

#include <stdio.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

int main(void)
{
	pid_t tracee;
	siginfo_t si;

	tracee = fork();
	if (!tracee)
		while (1)
			pause();

	kill(tracee, SIGSTOP);
	waitid(P_PID, tracee, &si, WSTOPPED);

	if (!fork()) {
		ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
		waitid(P_PID, tracee, &si, WSTOPPED);
		ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
		waitid(P_PID, tracee, &si, WSTOPPED);
		ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
		waitid(P_PID, tracee, &si, WSTOPPED);
		ptrace(PTRACE_DETACH, tracee, NULL, NULL);
		return 0;
	}

	while (1) {
		si.si_pid = 0;
		waitid(P_PID, tracee, &si, WSTOPPED | WNOHANG);
		if (si.si_pid)
			printf("st=%02d c=%02d\n", si.si_status, si.si_code);
	}
	return 0;
}

Before the patch, the latter waitid() in polling mode reports the
second stopped event generated by the implied SIGSTOP of
PTRACE_ATTACH.

  st=19 c=05
  ^C

After the patch, the second event is not reported.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/signal.c |   23 ++++++++++++++++++++---
 1 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index c146150..6130cb8 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1827,10 +1827,27 @@ static int do_signal_stop(int signr)
 		    unlikely(signal_group_exit(sig)))
 			return 0;
 		/*
-		 * There is no group stop already in progress.
-		 * We must initiate one now.
+		 * There is no group stop already in progress.  We must
+		 * initiate one now.
+		 *
+		 * While ptraced, a task may be resumed while group stop is
+		 * still in effect and then receive a stop signal and
+		 * initiate another group stop.  This deviates from the
+		 * usual behavior as two consecutive stop signals can't
+		 * cause two group stops when !ptraced.
+		 *
+		 * The condition can be distinguished by testing whether
+		 * SIGNAL_STOP_STOPPED is already set.  Don't generate
+		 * group_exit_code in such case.
+		 *
+		 * This is not necessary for SIGNAL_STOP_CONTINUED because
+		 * an intervening stop signal is required to cause two
+		 * continued events regardless of ptrace.
 		 */
-		sig->group_exit_code = signr;
+		if (!(sig->flags & SIGNAL_STOP_STOPPED))
+			sig->group_exit_code = signr;
+		else
+			WARN_ON_ONCE(!task_ptrace(current));
 
 		current->group_stop &= ~GROUP_STOP_SIGMASK;
 		current->group_stop |= signr | gstop;
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 2/8] job control: Small reorganization of wait_consider_task()
  2011-03-08 19:56 [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
  2011-03-08 19:56 ` [PATCH 1/8] job control: Don't set group_stop exit_code if re-entering job control stop Tejun Heo
@ 2011-03-08 19:56 ` Tejun Heo
  2011-03-08 19:56 ` [PATCH 3/8] job control: Fix ptracer wait(2) hang and explain notask_error clearing Tejun Heo
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 47+ messages in thread
From: Tejun Heo @ 2011-03-08 19:56 UTC (permalink / raw)
  To: oleg, roland, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, Tejun Heo

Move EXIT_DEAD test in wait_consider_task() above ptrace check.  As
ptraced tasks can't be EXIT_DEAD, this change doesn't cause any
behavior change.  This is to prepare for further changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/exit.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index f9a45eb..b4a935c 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1537,6 +1537,10 @@ static int wait_consider_task(struct wait_opts *wo, int ptrace,
 		return 0;
 	}
 
+	/* dead body doesn't have much to contribute */
+	if (p->exit_state == EXIT_DEAD)
+		return 0;
+
 	if (likely(!ptrace) && unlikely(task_ptrace(p))) {
 		/*
 		 * This child is hidden by ptrace.
@@ -1546,9 +1550,6 @@ static int wait_consider_task(struct wait_opts *wo, int ptrace,
 		return 0;
 	}
 
-	if (p->exit_state == EXIT_DEAD)
-		return 0;
-
 	/*
 	 * We don't reap group leaders with subthreads.
 	 */
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 3/8] job control: Fix ptracer wait(2) hang and explain notask_error clearing
  2011-03-08 19:56 [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
  2011-03-08 19:56 ` [PATCH 1/8] job control: Don't set group_stop exit_code if re-entering job control stop Tejun Heo
  2011-03-08 19:56 ` [PATCH 2/8] job control: Small reorganization of wait_consider_task() Tejun Heo
@ 2011-03-08 19:56 ` Tejun Heo
  2011-03-21 15:19   ` Oleg Nesterov
  2011-03-22 10:51   ` [PATCH UPDATED " Tejun Heo
  2011-03-08 19:56 ` [PATCH 4/8] job control: Allow access to job control events through ptracees Tejun Heo
                   ` (9 subsequent siblings)
  12 siblings, 2 replies; 47+ messages in thread
From: Tejun Heo @ 2011-03-08 19:56 UTC (permalink / raw)
  To: oleg, roland, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, Tejun Heo

wait(2) and friends allow access to stopped/continued states through
zombies, which is required as the states are process-wide and should
be accessible whether the leader task is alive or undead.
wait_consider_task() implements this by always clearing notask_error
and going through wait_task_stopped/continued() for unreaped zombies.

However, while ptraced, the stopped state is per-task and as such if
the ptracee became a zombie, there's no further stopped event to
listen to and wait(2) and friends should return -ECHILD on the tracee.

Fix it by clearing notask_error only if WCONTINUED is set for ptraced
zombies.  While at it, document why clearing notask_error is safe for
each case.

Test case follows.

  #include <stdio.h>
  #include <unistd.h>
  #include <pthread.h>
  #include <time.h>
  #include <sys/types.h>
  #include <sys/ptrace.h>
  #include <sys/wait.h>

  static void *nooper(void *arg)
  {
	  pause();
	  return NULL;
  }

  int main(void)
  {
	  const struct timespec ts1s = { .tv_sec = 1 };
	  pid_t tracee, tracer;
	  siginfo_t si;

	  tracee = fork();
	  if (tracee == 0) {
		  pthread_t thr;

		  pthread_create(&thr, NULL, nooper, NULL);
		  nanosleep(&ts1s, NULL);
		  printf("tracee exiting\n");
		  pthread_exit(NULL);	/* let subthread run */
	  }

	  tracer = fork();
	  if (tracer == 0) {
		  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
		  while (1) {
			  if (waitid(P_PID, tracee, &si, WSTOPPED) < 0) {
				  perror("waitid");
				  break;
			  }
			  ptrace(PTRACE_CONT, tracee, NULL,
				 (void *)(long)si.si_status);
		  }
		  return 0;
	  }

	  waitid(P_PID, tracer, &si, WEXITED);
	  kill(tracee, SIGKILL);
	  return 0;
  }

Before the patch, after the tracee becomes a zombie, the tracer's
waitid(WSTOPPED) never returns and the program doesn't terminate.

  tracee exiting
  ^C

After the patch, tracee exiting triggers waitid() to fail.

  tracee exiting
  waitid: No child processes

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/exit.c |   44 ++++++++++++++++++++++++++++++++++----------
 1 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index b4a935c..7171821 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1550,17 +1550,41 @@ static int wait_consider_task(struct wait_opts *wo, int ptrace,
 		return 0;
 	}
 
-	/*
-	 * We don't reap group leaders with subthreads.
-	 */
-	if (p->exit_state == EXIT_ZOMBIE && !delay_group_leader(p))
-		return wait_task_zombie(wo, p);
+	/* slay zombie? */
+	if (p->exit_state == EXIT_ZOMBIE) {
+		/* we don't reap group leaders with subthreads */
+		if (!delay_group_leader(p))
+			return wait_task_zombie(wo, p);
 
-	/*
-	 * It's stopped or running now, so it might
-	 * later continue, exit, or stop again.
-	 */
-	wo->notask_error = 0;
+		/*
+		 * Allow access to stopped/continued state via zombie by
+		 * falling through.  Clearing of notask_error is complex.
+		 *
+		 * When !@ptrace:
+		 *
+		 * If WEXITED is set, notask_error should naturally be
+		 * cleared.  If not, subset of WSTOPPED|WCONTINUED is set,
+		 * so, if there are live subthreads, there are events to
+		 * wait for.  If all subthreads are dead, it's still safe
+		 * to clear - this function will be called again in finite
+		 * amount time once all the subthreads are released and
+		 * will then return without clearing.
+		 *
+		 * When @ptrace:
+		 *
+		 * Stopped state is per-task and thus can't change once the
+		 * target task dies, so notask_error should be cleared only
+		 * if WCONTINUED is set.
+		 */
+		if (likely(!ptrace) || (wo->wo_flags & WCONTINUED))
+			wo->notask_error = 0;
+	} else {
+		/*
+		 * @p is alive and it's gonna stop, continue or exit, so
+		 * there always is something to wait for.
+		 */
+		wo->notask_error = 0;
+	}
 
 	if (task_stopped_code(p, ptrace))
 		return wait_task_stopped(wo, ptrace, p);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 4/8] job control: Allow access to job control events through ptracees
  2011-03-08 19:56 [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
                   ` (2 preceding siblings ...)
  2011-03-08 19:56 ` [PATCH 3/8] job control: Fix ptracer wait(2) hang and explain notask_error clearing Tejun Heo
@ 2011-03-08 19:56 ` Tejun Heo
  2011-03-21 16:39   ` Oleg Nesterov
  2011-03-22 11:10   ` [PATCH UPDATED " Tejun Heo
  2011-03-08 19:56 ` [PATCH 5/8] job control: Add @for_ptrace to do_notify_parent_cldstop() Tejun Heo
                   ` (8 subsequent siblings)
  12 siblings, 2 replies; 47+ messages in thread
From: Tejun Heo @ 2011-03-08 19:56 UTC (permalink / raw)
  To: oleg, roland, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, Tejun Heo

Currently a real parent can't access job control stopped/continued
events through a ptraced child.  This utterly breaks job control when
the children are ptraced.

For example, if a program is run from an interactive and then
strace(1) attaches to it, pressing ^Z would send SIGTSTP and strace(1)
would notice it but the shell has no way to tell whether the child
entered job control stop and thus can't tell when to take over the
terminal - leading to awkward lone ^Z on the terminal.

Because the job control and ptrace stopped states are independent,
there is no reason to prevent real parents from accessing the stopped
state regardless of ptrace.  The continued state isn't separate but
ptracers don't have any use for them as ptracees can never resume
without explicit command from their ptracers, so as long as ptracers
don't consume it, it should be fine.

Although this is a behavior change, because the previous behavior is
utterly broken when viewed from real parents and the change is only
visible to real parents, I don't think it's necessary to make this
behavior optional.

One situation to be careful about is when the real parent is ptracing.
The parent is the recipient of both ptrace and job control stop events
and one stop can be reported as both job control and ptrace stops.  As
this can break the current ptrace users, suppress job control stopped
events for these cases.

If a real parent ptracer wants to know about both job control and
ptrace stops, it can create a separate task to serve the role as real
parent.

Note that this only updates wait(2) side of things.  The real parent
can access the states via wait(2) but still is not properly notified
(woken up and delivered signal).  Test case polls wait(2) with WNOHANG
to work around.  Notification will be updated by future patches.

Test case follows.

  #include <stdio.h>
  #include <unistd.h>
  #include <time.h>
  #include <errno.h>
  #include <sys/types.h>
  #include <sys/ptrace.h>
  #include <sys/wait.h>

  int main(void)
  {
	  const struct timespec ts100ms = { .tv_nsec = 100000000 };
	  pid_t tracee, tracer;
	  siginfo_t si;
	  int i;

	  tracee = fork();
	  if (tracee == 0) {
		  while (1) {
			  printf("tracee: SIGSTOP\n");
			  raise(SIGSTOP);
			  nanosleep(&ts100ms, NULL);
			  printf("tracee: SIGCONT\n");
			  raise(SIGCONT);
			  nanosleep(&ts100ms, NULL);
		  }
	  }

	  waitid(P_PID, tracee, &si, WSTOPPED | WNOHANG | WNOWAIT);

	  tracer = fork();
	  if (tracer == 0) {
		  nanosleep(&ts100ms, NULL);
		  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);

		  for (i = 0; i < 11; i++) {
			  si.si_pid = 0;
			  waitid(P_PID, tracee, &si, WSTOPPED);
			  if (si.si_pid && si.si_code == CLD_TRAPPED)
				  ptrace(PTRACE_CONT, tracee, NULL,
					 (void *)(long)si.si_status);
		  }
		  printf("tracer: EXITING\n");
		  return 0;
	  }

	  while (1) {
		  si.si_pid = 0;
		  waitid(P_PID, tracee, &si,
			 WSTOPPED | WCONTINUED | WEXITED | WNOHANG);
		  if (si.si_pid)
			  printf("mommy : WAIT status=%02d code=%02d\n",
				 si.si_status, si.si_code);
		  nanosleep(&ts100ms, NULL);
	  }
	  return 0;
  }

Before the patch, while ptraced, the parent can't see any job control
events.

  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  tracee: SIGSTOP
  tracee: SIGCONT
  tracee: SIGSTOP
  tracee: SIGCONT
  tracee: SIGSTOP
  tracer: EXITING
  mommy : WAIT status=19 code=05
  ^C

After the patch,

  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  mommy : WAIT status=18 code=06
  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  mommy : WAIT status=18 code=06
  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  mommy : WAIT status=18 code=06
  tracee: SIGSTOP
  tracer: EXITING
  mommy : WAIT status=19 code=05
  ^C

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/exit.c |   40 ++++++++++++++++++++++++++++++++--------
 1 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 7171821..922e41d 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1541,17 +1541,19 @@ static int wait_consider_task(struct wait_opts *wo, int ptrace,
 	if (p->exit_state == EXIT_DEAD)
 		return 0;
 
-	if (likely(!ptrace) && unlikely(task_ptrace(p))) {
+	/* slay zombie? */
+	if (p->exit_state == EXIT_ZOMBIE) {
 		/*
-		 * This child is hidden by ptrace.
-		 * We aren't allowed to see it now, but eventually we will.
+		 * A zombie ptracee is only visible to its ptracer.
+		 * Notification and reaping will be cascaded to the real
+		 * parent when the ptracer detaches.
 		 */
-		wo->notask_error = 0;
-		return 0;
-	}
+		if (likely(!ptrace) && unlikely(task_ptrace(p))) {
+			/* it will become visible, clear notask_error */
+			wo->notask_error = 0;
+			return 0;
+		}
 
-	/* slay zombie? */
-	if (p->exit_state == EXIT_ZOMBIE) {
 		/* we don't reap group leaders with subthreads */
 		if (!delay_group_leader(p))
 			return wait_task_zombie(wo, p);
@@ -1580,15 +1582,37 @@ static int wait_consider_task(struct wait_opts *wo, int ptrace,
 			wo->notask_error = 0;
 	} else {
 		/*
+		 * If %current is ptracing @p, hide group stop/continued
+		 * state when looking at @p as the real parent; otherwise,
+		 * a single stop can be reported twice as group and ptrace
+		 * stops.
+		 *
+		 * If a ptracer wants to distinguish the two events for its
+		 * own children, it should create a separate task which
+		 * takes the role of real parent.
+		 */
+		if (likely(!ptrace) && task_ptrace(p) && p->parent == current)
+			return 0;
+
+		/*
 		 * @p is alive and it's gonna stop, continue or exit, so
 		 * there always is something to wait for.
 		 */
 		wo->notask_error = 0;
 	}
 
+	/*
+	 * Wait for stopped.  Depending on @ptrace, different stopped state
+	 * is used and the two don't interact with each other.
+	 */
 	if (task_stopped_code(p, ptrace))
 		return wait_task_stopped(wo, ptrace, p);
 
+	/*
+	 * Wait for continued.  There's only one continued state and the
+	 * ptracer can consume it which can confuse the real parent.  Don't
+	 * use WCONTINUED from ptracer.  You don't need or want it.
+	 */
 	return wait_task_continued(wo, p);
 }
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 5/8] job control: Add @for_ptrace to do_notify_parent_cldstop()
  2011-03-08 19:56 [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
                   ` (3 preceding siblings ...)
  2011-03-08 19:56 ` [PATCH 4/8] job control: Allow access to job control events through ptracees Tejun Heo
@ 2011-03-08 19:56 ` Tejun Heo
  2011-03-08 19:56 ` [PATCH 6/8] job control: Job control stop notifications should always go to the real parent Tejun Heo
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 47+ messages in thread
From: Tejun Heo @ 2011-03-08 19:56 UTC (permalink / raw)
  To: oleg, roland, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, Tejun Heo

Currently, do_notify_parent_cldstop() determines whether the
notification is for the real parent or ptracer.  Move the decision to
the caller by adding @for_ptrace parameter to
do_notify_parent_cldstop().  All the callers are updated to pass
task_ptrace(target_task), so this patch doesn't cause any behavior
difference.

While at it, add function comment to do_notify_parent_cldstop().

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/signal.c |   31 ++++++++++++++++++++++++-------
 1 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 6130cb8..4acca00 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1595,16 +1595,30 @@ int do_notify_parent(struct task_struct *tsk, int sig)
 	return ret;
 }
 
-static void do_notify_parent_cldstop(struct task_struct *tsk, int why)
+/**
+ * do_notify_parent_cldstop - notify parent of stopped/continued state change
+ * @tsk: task reporting the state change
+ * @for_ptracer: the notification is for ptracer
+ * @why: CLD_{CONTINUED|STOPPED|TRAPPED} to report
+ *
+ * Notify @tsk's parent that the stopped/continued state has changed.  If
+ * @for_ptracer is %false, @tsk's group leader notifies to its real parent.
+ * If %true, @tsk reports to @tsk->parent which should be the ptracer.
+ *
+ * CONTEXT:
+ * Must be called with tasklist_lock at least read locked.
+ */
+static void do_notify_parent_cldstop(struct task_struct *tsk,
+				     bool for_ptracer, int why)
 {
 	struct siginfo info;
 	unsigned long flags;
 	struct task_struct *parent;
 	struct sighand_struct *sighand;
 
-	if (task_ptrace(tsk))
+	if (for_ptracer) {
 		parent = tsk->parent;
-	else {
+	} else {
 		tsk = tsk->group_leader;
 		parent = tsk->real_parent;
 	}
@@ -1743,7 +1757,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	spin_unlock_irq(&current->sighand->siglock);
 	read_lock(&tasklist_lock);
 	if (may_ptrace_stop()) {
-		do_notify_parent_cldstop(current, why);
+		do_notify_parent_cldstop(current, task_ptrace(current), why);
 		/*
 		 * Don't want to allow preemption here, because
 		 * sys_ptrace() needs this task to be inactive.
@@ -1886,7 +1900,8 @@ retry:
 
 		if (notify) {
 			read_lock(&tasklist_lock);
-			do_notify_parent_cldstop(current, notify);
+			do_notify_parent_cldstop(current, task_ptrace(current),
+						 notify);
 			read_unlock(&tasklist_lock);
 		}
 
@@ -1982,6 +1997,7 @@ relock:
 	 * the CLD_ si_code into SIGNAL_CLD_MASK bits.
 	 */
 	if (unlikely(signal->flags & SIGNAL_CLD_MASK)) {
+		struct task_struct *leader;
 		int why;
 
 		if (signal->flags & SIGNAL_CLD_CONTINUED)
@@ -1994,7 +2010,8 @@ relock:
 		spin_unlock_irq(&sighand->siglock);
 
 		read_lock(&tasklist_lock);
-		do_notify_parent_cldstop(current->group_leader, why);
+		leader = current->group_leader;
+		do_notify_parent_cldstop(leader, task_ptrace(leader), why);
 		read_unlock(&tasklist_lock);
 		goto relock;
 	}
@@ -2167,7 +2184,7 @@ out:
 
 	if (unlikely(group_stop)) {
 		read_lock(&tasklist_lock);
-		do_notify_parent_cldstop(tsk, group_stop);
+		do_notify_parent_cldstop(tsk, task_ptrace(tsk), group_stop);
 		read_unlock(&tasklist_lock);
 	}
 }
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 6/8] job control: Job control stop notifications should always go to the real parent
  2011-03-08 19:56 [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
                   ` (4 preceding siblings ...)
  2011-03-08 19:56 ` [PATCH 5/8] job control: Add @for_ptrace to do_notify_parent_cldstop() Tejun Heo
@ 2011-03-08 19:56 ` Tejun Heo
  2011-03-21 17:12   ` Oleg Nesterov
  2011-03-08 19:56 ` [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace Tejun Heo
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 47+ messages in thread
From: Tejun Heo @ 2011-03-08 19:56 UTC (permalink / raw)
  To: oleg, roland, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, Tejun Heo

The stopped notifications in do_signal_stop() and exit_signals() are
always for the completion of job control.  The one in do_signal_stop()
may be delivered to the ptracer if PTRACE_ATTACH races with
notification and the one in exit_signals() if task exits while
ptraced.

In both cases, the notifications are meaningless and confusing to the
ptracer as it never accesses the group stop state while the real
parent would miss notifications for the events it is watching.

Make sure these notifications always go to the real parent by calling
do_notify_parent_cld_stop() with %false @for_ptrace.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/signal.c |   18 +++++++++++++++---
 1 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 4acca00..52120d6 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1898,10 +1898,18 @@ retry:
 		__set_current_state(TASK_STOPPED);
 		spin_unlock_irq(&current->sighand->siglock);
 
+		/*
+		 * Notify the parent of the group stop completion.  Because
+		 * we're not holding either the siglock or tasklist_lock
+		 * here, ptracer may attach inbetween; however, this is for
+		 * group stop and should always be delivered to the real
+		 * parent of the group leader.  The new ptracer will get
+		 * its notification when this task transitions into
+		 * TASK_TRACED.
+		 */
 		if (notify) {
 			read_lock(&tasklist_lock);
-			do_notify_parent_cldstop(current, task_ptrace(current),
-						 notify);
+			do_notify_parent_cldstop(current, false, notify);
 			read_unlock(&tasklist_lock);
 		}
 
@@ -2182,9 +2190,13 @@ void exit_signals(struct task_struct *tsk)
 out:
 	spin_unlock_irq(&tsk->sighand->siglock);
 
+	/*
+	 * If group stop has completed, deliver the notification.  This
+	 * should always go to the real parent of the group leader.
+	 */
 	if (unlikely(group_stop)) {
 		read_lock(&tasklist_lock);
-		do_notify_parent_cldstop(tsk, task_ptrace(tsk), group_stop);
+		do_notify_parent_cldstop(tsk, false, group_stop);
 		read_unlock(&tasklist_lock);
 	}
 }
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace
  2011-03-08 19:56 [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
                   ` (5 preceding siblings ...)
  2011-03-08 19:56 ` [PATCH 6/8] job control: Job control stop notifications should always go to the real parent Tejun Heo
@ 2011-03-08 19:56 ` Tejun Heo
  2011-03-21 17:43   ` Oleg Nesterov
  2011-03-22 11:30   ` [PATCH UPDATED " Tejun Heo
  2011-03-08 19:56 ` [PATCH 8/8] job control: Don't send duplicate job control stop notification while ptraced Tejun Heo
                   ` (5 subsequent siblings)
  12 siblings, 2 replies; 47+ messages in thread
From: Tejun Heo @ 2011-03-08 19:56 UTC (permalink / raw)
  To: oleg, roland, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, Tejun Heo

With recent changes, job control and ptrace stopped states are
properly separated and accessible to the real parent and the ptracer
respectively; however, notifications of job control stopped/continued
events to the real parent while ptraced are still missing.

A ptracee participates in group stop in ptrace_stop() but the
completion isn't notified.  If participation results in completion of
group stop, notify the real parent of the event.  The ptrace and group
stops are separate and can be handled as such.

However, when the real parent is the ptracer, only the ptrace stop
event is visible through wait(2) and the duplicate notifications are
different from the current behaivor and are confusing.  Suppress group
stop notification in such cases.

The continued state is shared between the real parent and the ptracer
but is only meaningful to the real parent.  Always notify the real
parent and notify the ptracer too for backward compatibility.  Similar
to stop notification, if the real parent is the ptracer, suppress a
duplicate notification.

Test case follows.

  #include <stdio.h>
  #include <unistd.h>
  #include <time.h>
  #include <errno.h>
  #include <sys/types.h>
  #include <sys/ptrace.h>
  #include <sys/wait.h>

  int main(void)
  {
	  const struct timespec ts100ms = { .tv_nsec = 100000000 };
	  pid_t tracee, tracer;
	  siginfo_t si;
	  int i;

	  tracee = fork();
	  if (tracee == 0) {
		  while (1) {
			  printf("tracee: SIGSTOP\n");
			  raise(SIGSTOP);
			  nanosleep(&ts100ms, NULL);
			  printf("tracee: SIGCONT\n");
			  raise(SIGCONT);
			  nanosleep(&ts100ms, NULL);
		  }
	  }

	  waitid(P_PID, tracee, &si, WSTOPPED | WNOHANG | WNOWAIT);

	  tracer = fork();
	  if (tracer == 0) {
		  nanosleep(&ts100ms, NULL);
		  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);

		  for (i = 0; i < 11; i++) {
			  si.si_pid = 0;
			  waitid(P_PID, tracee, &si, WSTOPPED);
			  if (si.si_pid && si.si_code == CLD_TRAPPED)
				  ptrace(PTRACE_CONT, tracee, NULL,
					 (void *)(long)si.si_status);
		  }
		  printf("tracer: EXITING\n");
		  return 0;
	  }

	  while (1) {
		  si.si_pid = 0;
		  waitid(P_PID, tracee, &si, WSTOPPED | WCONTINUED | WEXITED);
		  if (si.si_pid)
			  printf("mommy : WAIT status=%02d code=%02d\n",
				 si.si_status, si.si_code);
	  }
	  return 0;
  }

Before this patch, while ptraced, the real parent doesn't get
notifications for job control events, so although it can access those
events, the later waitid(2) call never wakes up.

  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  tracee: SIGSTOP
  tracee: SIGCONT
  tracee: SIGSTOP
  tracee: SIGCONT
  tracee: SIGSTOP
  tracer: EXITING
  mommy : WAIT status=19 code=05
  ^C

After this patch, it works as expected.

  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  mommy : WAIT status=18 code=06
  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  mommy : WAIT status=18 code=06
  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  mommy : WAIT status=18 code=06
  tracee: SIGSTOP
  tracer: EXITING
  mommy : WAIT status=19 code=05
  ^C

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/signal.c |   44 +++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 52120d6..74f097c 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1694,6 +1694,15 @@ static int sigkill_pending(struct task_struct *tsk)
 }
 
 /*
+ * Test whether the target task of the usual cldstop notification - the
+ * real_parent of the group_leader of @child - is the ptracer.
+ */
+static bool real_parent_is_ptracer(struct task_struct *child)
+{
+	return child->parent == child->group_leader->real_parent;
+}
+
+/*
  * This must be called with current->sighand->siglock held.
  *
  * This should be the path for all ptrace stops.
@@ -1708,6 +1717,8 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	__releases(&current->sighand->siglock)
 	__acquires(&current->sighand->siglock)
 {
+	bool gstop_done = false;
+
 	if (arch_ptrace_stop_needed(exit_code, info)) {
 		/*
 		 * The arch code has something special to do before a
@@ -1735,7 +1746,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	 * is entered - ignore it.
 	 */
 	if (why == CLD_STOPPED && (current->group_stop & GROUP_STOP_PENDING))
-		task_participate_group_stop(current);
+		gstop_done = task_participate_group_stop(current);
 
 	current->last_siginfo = info;
 	current->exit_code = exit_code;
@@ -1757,7 +1768,20 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	spin_unlock_irq(&current->sighand->siglock);
 	read_lock(&tasklist_lock);
 	if (may_ptrace_stop()) {
-		do_notify_parent_cldstop(current, task_ptrace(current), why);
+		/*
+		 * Notify parents of the stop.
+		 *
+		 * While ptraced, there are two parents - the ptracer and
+		 * the real_parent of the group_leader.  The ptracer should
+		 * know about every stop while the real parent is only
+		 * interested in the completion of group stop.  The states
+		 * for the two don't interact with each other.  Notify
+		 * separately unless they're gonna be duplicates.
+		 */
+		do_notify_parent_cldstop(current, true, why);
+		if (gstop_done && !real_parent_is_ptracer(current))
+			do_notify_parent_cldstop(current, false, why);
+
 		/*
 		 * Don't want to allow preemption here, because
 		 * sys_ptrace() needs this task to be inactive.
@@ -2017,10 +2041,24 @@ relock:
 
 		spin_unlock_irq(&sighand->siglock);
 
+		/*
+		 * Notify the parent that we're continuing.  This event is
+		 * always per-process and doesn't make whole lot of sense
+		 * for ptracers, who shouldn't consume the state via
+		 * wait(2) either, but, for backward compatibility, notify
+		 * the ptracer of the group leader too unless it's gonna be
+		 * a duplicate.
+		 */
 		read_lock(&tasklist_lock);
+
+		do_notify_parent_cldstop(current, false, why);
+
 		leader = current->group_leader;
-		do_notify_parent_cldstop(leader, task_ptrace(leader), why);
+		if (task_ptrace(leader) && !real_parent_is_ptracer(leader))
+			do_notify_parent_cldstop(leader, true, why);
+
 		read_unlock(&tasklist_lock);
+
 		goto relock;
 	}
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 8/8] job control: Don't send duplicate job control stop notification while ptraced
  2011-03-08 19:56 [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
                   ` (6 preceding siblings ...)
  2011-03-08 19:56 ` [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace Tejun Heo
@ 2011-03-08 19:56 ` Tejun Heo
  2011-03-21 17:48   ` Oleg Nesterov
  2011-03-08 20:01 ` [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent " Linus Torvalds
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 47+ messages in thread
From: Tejun Heo @ 2011-03-08 19:56 UTC (permalink / raw)
  To: oleg, roland, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, Tejun Heo

Just as group_exit_code shouldn't be generated when a PTRACE_CONT'd
task re-enters job control stop, notifiction for the event should be
suppressed too.  The logic is the same as the group_exit_code
generation suppression in do_signal_stop(), if SIGNAL_STOP_STOPPED is
already set, the task is re-entering job control stop without
intervening SIGCONT and the notifications should be suppressed.

Test case follows.

  #include <stdio.h>
  #include <unistd.h>
  #include <signal.h>
  #include <time.h>
  #include <sys/ptrace.h>
  #include <sys/wait.h>

  static const struct timespec ts100ms = { .tv_nsec = 100000000 };
  static pid_t tracee, tracer;

  static const char *pid_who(pid_t pid)
  {
	  return pid == tracee ? "tracee" : (pid == tracer ? "tracer" : "mommy ");
  }

  static void sigchld_sigaction(int signo, siginfo_t *si, void *ucxt)
  {
	  printf("%s: SIG status=%02d code=%02d (%s)\n",
		 pid_who(getpid()), si->si_status, si->si_code,
		 pid_who(si->si_pid));
  }

  int main(void)
  {
	  const struct sigaction chld_sa = { .sa_sigaction = sigchld_sigaction,
					     .sa_flags = SA_SIGINFO|SA_RESTART };
	  siginfo_t si;

	  sigaction(SIGCHLD, &chld_sa, NULL);

	  tracee = fork();
	  if (!tracee) {
		  tracee = getpid();
		  while (1)
			  pause();
	  }

	  kill(tracee, SIGSTOP);
	  waitid(P_PID, tracee, &si, WSTOPPED);

	  tracer = fork();
	  if (!tracer) {
		  tracer = getpid();
		  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
		  waitid(P_PID, tracee, &si, WSTOPPED);
		  ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
		  waitid(P_PID, tracee, &si, WSTOPPED);
		  ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
		  waitid(P_PID, tracee, &si, WSTOPPED);
		  printf("tracer: detaching\n");
		  ptrace(PTRACE_DETACH, tracee, NULL, NULL);
		  return 0;
	  }

	  while (1)
		  pause();
	  return 0;
  }

Before the patch, the parent gets the second notification for the
tracee after the tracer detaches.  si_status is zero because
group_exit_code is not set by the group stop completion which
triggered this notification.

  mommy : SIG status=19 code=05 (tracee)
  tracer: SIG status=00 code=05 (tracee)
  tracer: SIG status=19 code=04 (tracee)
  tracer: SIG status=00 code=05 (tracee)
  tracer: detaching
  mommy : SIG status=00 code=05 (tracee)
  mommy : SIG status=00 code=01 (tracer)
  ^C

After the patch, the duplicate notification is gone.

  mommy : SIG status=19 code=05 (tracee)
  tracer: SIG status=00 code=05 (tracee)
  tracer: SIG status=19 code=04 (tracee)
  tracer: SIG status=00 code=05 (tracee)
  tracer: detaching
  mommy : SIG status=00 code=01 (tracer)
  ^C

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/signal.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 74f097c..8304563 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -268,6 +268,10 @@ void task_clear_group_stop_pending(struct task_struct *task)
  *
  * CONTEXT:
  * Must be called with @task->sighand->siglock held.
+ *
+ * RETURNS:
+ * %true if group stop completion should be notified to the parent, %false
+ * otherwise.
  */
 static bool task_participate_group_stop(struct task_struct *task)
 {
@@ -284,7 +288,11 @@ static bool task_participate_group_stop(struct task_struct *task)
 	if (!WARN_ON_ONCE(sig->group_stop_count == 0))
 		sig->group_stop_count--;
 
-	if (!sig->group_stop_count) {
+	/*
+	 * Tell the caller to notify completion iff we are entering into a
+	 * fresh group stop.  Read comment in do_signal_stop() for details.
+	 */
+	if (!sig->group_stop_count && !(sig->flags & SIGNAL_STOP_STOPPED)) {
 		sig->flags = SIGNAL_STOP_STOPPED;
 		return true;
 	}
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced
  2011-03-08 19:56 [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
                   ` (7 preceding siblings ...)
  2011-03-08 19:56 ` [PATCH 8/8] job control: Don't send duplicate job control stop notification while ptraced Tejun Heo
@ 2011-03-08 20:01 ` Linus Torvalds
  2011-03-09 16:50 ` Oleg Nesterov
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 47+ messages in thread
From: Linus Torvalds @ 2011-03-08 20:01 UTC (permalink / raw)
  To: Tejun Heo
  Cc: oleg, roland, jan.kratochvil, vda.linux, linux-kernel, akpm, indan

On Tue, Mar 8, 2011 at 11:56 AM, Tejun Heo <tj@kernel.org> wrote:
>
> This patchset implements "P2. Fix notifications to the real parent" of
> the ptrace job control improvements proposal[1].

Ok, from skimming the patches, I like the whole series. I especially
liked how all the steps were small and well-commented, both in the
code and in the changelog. I could basically follow the patch series
from just reading the changelogs, and the patches themselves looked
fine too.

                        Linus

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced
  2011-03-08 19:56 [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
                   ` (8 preceding siblings ...)
  2011-03-08 20:01 ` [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent " Linus Torvalds
@ 2011-03-09 16:50 ` Oleg Nesterov
  2011-03-22 10:20 ` [PATCH 0.1/8] ptrace: Collapse ptrace_untrace() into __ptrace_unlink() Tejun Heo
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-09 16:50 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/08, Tejun Heo wrote:
>
> This patchset implements "P2. Fix notifications to the real parent" of
> the ptrace job control improvements proposal[1].

And at first glance this does what we discussed before. I mean, even if
(perhaps) this series has some problems/bugs it looks right "in general"
to me.

> As the whole job control / ptrace logic is quite delicate, I tried to
> be very granual with changes and add plenty of explanations for
> subtleties.

Yeah... And right now I have some concerns about the details, but I am
not ready to share them. Will try tomorrow, it turns out I need to re-read
the previous series.

Say, even 1/8 doesn't look exactly right with the current ptrace code,
WARN_ON_ONCE() can be easily triggered afaics.

Oleg.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/8] job control: Don't set group_stop exit_code if re-entering job control stop
  2011-03-08 19:56 ` [PATCH 1/8] job control: Don't set group_stop exit_code if re-entering job control stop Tejun Heo
@ 2011-03-21 13:20   ` Oleg Nesterov
  2011-03-21 15:52     ` Tejun Heo
  0 siblings, 1 reply; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-21 13:20 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/08, Tejun Heo wrote:
>

Hi Tejun,

I hope you still remember you sent these patches, perhaps you can
even recall what they should do ;)

> @@ -1827,10 +1827,27 @@ static int do_signal_stop(int signr)
>  		    unlikely(signal_group_exit(sig)))
>  			return 0;
>  		/*
> -		 * There is no group stop already in progress.
> -		 * We must initiate one now.
> +		 * There is no group stop already in progress.  We must
> +		 * initiate one now.
> +		 *
> +		 * While ptraced, a task may be resumed while group stop is
> +		 * still in effect and then receive a stop signal and
> +		 * initiate another group stop.  This deviates from the
> +		 * usual behavior as two consecutive stop signals can't
> +		 * cause two group stops when !ptraced.
> +		 *
> +		 * The condition can be distinguished by testing whether
> +		 * SIGNAL_STOP_STOPPED is already set.  Don't generate
> +		 * group_exit_code in such case.
> +		 *
> +		 * This is not necessary for SIGNAL_STOP_CONTINUED because
> +		 * an intervening stop signal is required to cause two
> +		 * continued events regardless of ptrace.
>  		 */
> -		sig->group_exit_code = signr;
> +		if (!(sig->flags & SIGNAL_STOP_STOPPED))
> +			sig->group_exit_code = signr;
> +		else
> +			WARN_ON_ONCE(!task_ptrace(current));

Yes. But WARN_ON_ONCE() is wrong. Suppose that the tracee was stopped,
then PTRACE_CONT'ed, then it gets another SIGSTOP and reports it. Now
suppose that debugger does PTRACE_CONT(SIGSTOP) and exits before the
tracee processes this signal.

OTOH, this WARN_ON_ONCE() makes sense, but we should fix __ptrace_unlink().
This path should take siglock and check SIGNAL_STOP_STOPPED unconditionally.
This should also fix other problems with detach && SIGNAL_STOP_STOPPED.

Also. We should take ->group_stop_count != 0 into account, we should not
set (change) ->group_exit_code in this case too. This is is only "real"
problem in this patch I can see. Other comments are mostly the random
thoughts.

But lets look at the code below,

		for (t = next_thread(current); t != current;
		     t = next_thread(t)) {
			t->group_stop &= ~GROUP_STOP_SIGMASK;
			/*
			 * Setting state to TASK_STOPPED for a group
			 * stop is always done with the siglock held,
			 * so this check has no races.
			 */
			if (!(t->flags & PF_EXITING) && !task_is_stopped(t)) {
				t->group_stop |= signr | gstop;
				sig->group_stop_count++;
				signal_wake_up(t, 0);
			} else {
				task_clear_group_stop_pending(t);
			}
		}

Somehow I no longer understand "else task_clear_group_stop_pending()".
I mean, is it really needed?

If task_is_stopped() == T or it is PF_EXITING, this task has already
done task_participate_group_stop(), no?

Also. I do not think it is correct to change the "signr" part of
->group_stop (unless it is zero) when ->group_stop_count != 0
for other threads. This is minor, but still doesn't look exactly
correct. Probably we can ignore this.

Hmm. it turns out "group_stop & GROUP_STOP_SIGMASK" is only needed
to handle this special case: if debugger PTRACE_CONT's or more
stopped tracees and then som thread initiates the stop again, other
threads need to know that "signr". Otherwise this part of ->group_stop
is only valid "inside" the retry loop in do_signal_stop(), it can
be a local variable. I wonder if we can simply report SIGSTOP in
this case and kill the GROUP_STOP_SIGMASK logic. Nevermind.

And. I think this code does something we do not really want. Why do
we _always_ ask the task_is_traced() threads to participate?

2 threads T1 and T2, both stopped. they are TASK_TRACED, I mean
SIGNAL_STOP_STOPPED is stopped and both have already participated.

Debuggere PTRACE_CONTs T1, then it calls do_signal_stop() again
and sets T2->group_stop = GROUP_STOP_PENDING | GROUP_STOP_CONSUME.
This T2->group_stop doesn't look right, we can report the wrong
extra CLD_STOPPED after detach while ->group_exit_code can be 0.
I think that !task_ptrace(current) case in do_signal_stop() should
take SIGNAL_STOP_STOPPED into account, but perhaps we need another
GROUP_STOP_REPORTED bit, I am not sure.

Or, if debugger PTRACE_CONT's T2, it will report another
ptrace_stop(CLD_STOPPED) immediately, this differs from the current
behaviour although probably we do not care.

Oleg.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/8] job control: Fix ptracer wait(2) hang and explain notask_error clearing
  2011-03-08 19:56 ` [PATCH 3/8] job control: Fix ptracer wait(2) hang and explain notask_error clearing Tejun Heo
@ 2011-03-21 15:19   ` Oleg Nesterov
  2011-03-21 16:09     ` Oleg Nesterov
  2011-03-21 16:12     ` Tejun Heo
  2011-03-22 10:51   ` [PATCH UPDATED " Tejun Heo
  1 sibling, 2 replies; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-21 15:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/08, Tejun Heo wrote:
>
>   static void *nooper(void *arg)
>   {
> 	  pause();
> 	  return NULL;
>   }
>
>   int main(void)
>   {
> 	  const struct timespec ts1s = { .tv_sec = 1 };
> 	  pid_t tracee, tracer;
> 	  siginfo_t si;
>
> 	  tracee = fork();
> 	  if (tracee == 0) {
> 		  pthread_t thr;
>
> 		  pthread_create(&thr, NULL, nooper, NULL);
> 		  nanosleep(&ts1s, NULL);
> 		  printf("tracee exiting\n");
> 		  pthread_exit(NULL);	/* let subthread run */
> 	  }
>
> 	  tracer = fork();
> 	  if (tracer == 0) {
> 		  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
> 		  while (1) {
> 			  if (waitid(P_PID, tracee, &si, WSTOPPED) < 0) {
> 				  perror("waitid");
> 				  break;
> 			  }
> 			  ptrace(PTRACE_CONT, tracee, NULL,
> 				 (void *)(long)si.si_status);
> 		  }
> 		  return 0;
> 	  }
>
> 	  waitid(P_PID, tracer, &si, WEXITED);
> 	  kill(tracee, SIGKILL);
> 	  return 0;
>   }
>
> Before the patch, after the tracee becomes a zombie, the tracer's
> waitid(WSTOPPED) never returns and the program doesn't terminate.

Yes. The program should terminate after nooper() exits. Agreed, this
looks strange.

> After the patch, tracee exiting triggers waitid() to fail.
>
>   tracee exiting
>   waitid: No child processes

OK, but

> @@ -1550,17 +1550,41 @@ static int wait_consider_task(struct wait_opts *wo, int ptrace,
>  		return 0;
>  	}
>
> -	/*
> -	 * We don't reap group leaders with subthreads.
> -	 */
> -	if (p->exit_state == EXIT_ZOMBIE && !delay_group_leader(p))
> -		return wait_task_zombie(wo, p);
> +	/* slay zombie? */
> +	if (p->exit_state == EXIT_ZOMBIE) {
> +		/* we don't reap group leaders with subthreads */
> +		if (!delay_group_leader(p))
> +			return wait_task_zombie(wo, p);
>
> -	/*
> -	 * It's stopped or running now, so it might
> -	 * later continue, exit, or stop again.
> -	 */
> -	wo->notask_error = 0;
> +		/*
> +		 * Allow access to stopped/continued state via zombie by
> +		 * falling through.  Clearing of notask_error is complex.
> +		 *
> +		 * When !@ptrace:
> +		 *
> +		 * If WEXITED is set, notask_error should naturally be
> +		 * cleared.  If not, subset of WSTOPPED|WCONTINUED is set,
> +		 * so, if there are live subthreads, there are events to
> +		 * wait for.  If all subthreads are dead, it's still safe
> +		 * to clear - this function will be called again in finite
> +		 * amount time once all the subthreads are released and
> +		 * will then return without clearing.
> +		 *
> +		 * When @ptrace:
> +		 *
> +		 * Stopped state is per-task and thus can't change once the
> +		 * target task dies, so notask_error should be cleared only
> +		 * if WCONTINUED is set.
> +		 */
> +		if (likely(!ptrace) || (wo->wo_flags & WCONTINUED))
> +			wo->notask_error = 0;

I don't understand this part. Suppose that this task is not traced and
its real parent does do_wait(WEXITED). We shouldn't return -ECHLD in
this case (EXIT_ZOMBIE && delay_group_leader()).

If the task is traced and debugger does do_wait(WEXITED) we should not
return -ECHLD too.

So I think this patch is wrong in its current state. Hmm, I just noticed
the comment says "If WEXITED is set, notask_error should ...", but this
is not what the code does?

But the main problem is, I do not think do_wait() should block in this
case, and thus I am starting to think this patch is not "complete".

I mean, we are going to add the user-visible change, and I agree the
current behaviour looks strange. But if we change this code, perhaps
the tracer should ignore delay_group_leader() at all (unless we are
going to do wait_task_zombie) ?

Your test-case could use waitid(WEXITED) instead WSTOPPED with the same
result, it should hang. Why it hangs? The tracee is dead, we can't do
ptrace(PTRACE_DETACH), and we can do nothing until other threads exit.
This looks equally strange.

IOW. Assuming that ptrace == T and WEXITED is set, perhaps we should
do something like this pseudo-code

	if (p->exit_state == EXIT_ZOMBIE) {
		if (!delay_group_leader(p))
			return wait_task_zombie(wo, p);

		ptrace_unlink();
		wait_task_zombie(WNOWAIT);
	}

However. This is another user-visible change, we need another discussion
even if I am right. In particular, it is not clear what should we do
if parent == real_parent. And probably this can confuse gdb, but iirc
gdb already have the problems with the dead leader anyway.

Oleg.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/8] job control: Don't set group_stop exit_code if re-entering job control stop
  2011-03-21 13:20   ` Oleg Nesterov
@ 2011-03-21 15:52     ` Tejun Heo
  2011-03-22 18:44       ` Oleg Nesterov
  0 siblings, 1 reply; 47+ messages in thread
From: Tejun Heo @ 2011-03-21 15:52 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

Hey, Oleg.

On Mon, Mar 21, 2011 at 02:20:24PM +0100, Oleg Nesterov wrote:
> I hope you still remember you sent these patches, perhaps you can
> even recall what they should do ;)

Yeah, was about to ping you actually.  BTW, I biked for a couple of
hours and had a beer and am feeling pretty stupid at the moment, so my
chance of writing something stupid is higher than usual.

> > -		sig->group_exit_code = signr;
> > +		if (!(sig->flags & SIGNAL_STOP_STOPPED))
> > +			sig->group_exit_code = signr;
> > +		else
> > +			WARN_ON_ONCE(!task_ptrace(current));
> 
> Yes. But WARN_ON_ONCE() is wrong. Suppose that the tracee was stopped,
> then PTRACE_CONT'ed, then it gets another SIGSTOP and reports it. Now
> suppose that debugger does PTRACE_CONT(SIGSTOP) and exits before the
> tracee processes this signal.

I see.  It can't happen for PTRACE_DETACH due to ptrace_check_attach()
but can if the debugger is exiting.

> OTOH, this WARN_ON_ONCE() makes sense, but we should fix __ptrace_unlink().
> This path should take siglock and check SIGNAL_STOP_STOPPED unconditionally.
> This should also fix other problems with detach && SIGNAL_STOP_STOPPED.

Right, the task_is_traced() tests.  I agree that the culprit is
__ptrace_unlink().  It should always put the tracee in the appropriate
state.  I'll prep a patch for this.

> Also. We should take ->group_stop_count != 0 into account, we should not
> set (change) ->group_exit_code in this case too.

Hmmm... I'm not sure whether this matters.  If the group stop hasn't
completed yet, I don't think we need to guarantee which one is
reported.  With single thread, the condition can't happen and, with
multithread, the userland has no way to tell which one of the two
signals actually started group stop so we can report any.  The
ordering is undefined.

It might be cleaner that way but I think the current code could be
easier to explain.  Mights and coulds.

> But lets look at the code below,
> 
> 		for (t = next_thread(current); t != current;
> 		     t = next_thread(t)) {
> 			t->group_stop &= ~GROUP_STOP_SIGMASK;
> 			/*
> 			 * Setting state to TASK_STOPPED for a group
> 			 * stop is always done with the siglock held,
> 			 * so this check has no races.
> 			 */
> 			if (!(t->flags & PF_EXITING) && !task_is_stopped(t)) {
> 				t->group_stop |= signr | gstop;
> 				sig->group_stop_count++;
> 				signal_wake_up(t, 0);
> 			} else {
> 				task_clear_group_stop_pending(t);
> 			}
> 		}
> 
> Somehow I no longer understand "else task_clear_group_stop_pending()".
> I mean, is it really needed?
> 
> If task_is_stopped() == T or it is PF_EXITING, this task has already
> done task_participate_group_stop(), no?

Hmm... I think this is from the older implementation where
task_clear_group_stop_pending() did more than just clearing the two
flags.  We can remove it.

> Also. I do not think it is correct to change the "signr" part of
> ->group_stop (unless it is zero) when ->group_stop_count != 0
> for other threads. This is minor, but still doesn't look exactly
> correct. Probably we can ignore this.

GROUP_STOP_SIGMASK only affects ptracers.  The extra stop signal is
visible to one ptracer and possibly other ptracers too, so I think
it's more appropriate to report the newest signal.  Please consider
the following scenario.

* Two threads in a process.  SIGNAL_STOP_STOPPED.

* PTRACE_ATTACH on both threads and both are PTRACE_CONT'd.

* A stopsig is delivered to one of them which initiates group stop.

* Both ptracers notice the tracees participating in the group stop.

> Hmm. it turns out "group_stop & GROUP_STOP_SIGMASK" is only needed
> to handle this special case: if debugger PTRACE_CONT's or more
> stopped tracees and then som thread initiates the stop again, other
> threads need to know that "signr". Otherwise this part of ->group_stop
> is only valid "inside" the retry loop in do_signal_stop(), it can
> be a local variable. I wonder if we can simply report SIGSTOP in
> this case and kill the GROUP_STOP_SIGMASK logic. Nevermind.

I think there was a reason why it couldn't be a local variable.  Let's
see if I can remember it.  Ah, okay, please consider the following
scenario.  This one needs to be documented.

* A thread does group stop and the parent consumed exit code.

* ptracer attaches and sees the group stop signal.

* PTRACE_CONT and the thread leaves do_signal_stop().

* PTRACE_DETACH.  The thread returns to do_signal_stop() and re-enters
  TASK_STOPPED.

* Another ptracer does PTRACE_ATTACH.

The second ptracer wants to know the signo too but if it were stored
in a local variable, it wouldn't be available anywhere.

> And. I think this code does something we do not really want. Why do
> we _always_ ask the task_is_traced() threads to participate?
> 
> 2 threads T1 and T2, both stopped. they are TASK_TRACED, I mean
> SIGNAL_STOP_STOPPED is stopped and both have already participated.
> 
> Debuggere PTRACE_CONTs T1, then it calls do_signal_stop() again
> and sets T2->group_stop = GROUP_STOP_PENDING | GROUP_STOP_CONSUME.
> This T2->group_stop doesn't look right, we can report the wrong
> extra CLD_STOPPED after detach while ->group_exit_code can be 0.
> I think that !task_ptrace(current) case in do_signal_stop() should
> take SIGNAL_STOP_STOPPED into account, but perhaps we need another
> GROUP_STOP_REPORTED bit, I am not sure.

Isn't this addressed by the last patch in this thread?

> Or, if debugger PTRACE_CONT's T2, it will report another
> ptrace_stop(CLD_STOPPED) immediately, this differs from the current
> behaviour although probably we do not care.

This was changed by "signal: Use GROUP_STOP_PENDING to stop once for a
single group stop".  The previous behavior was inconsistent.  If the
tracee was running it would get notified; otherwise, not.  Also, the
tracer could be notified multiple times for the same group stop.  I
think the new behavior is sane and falls within the boundaries set by
the inconsistencies of the previous behavior.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/8] job control: Fix ptracer wait(2) hang and explain notask_error clearing
  2011-03-21 15:19   ` Oleg Nesterov
@ 2011-03-21 16:09     ` Oleg Nesterov
  2011-03-21 16:12     ` Tejun Heo
  1 sibling, 0 replies; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-21 16:09 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/21, Oleg Nesterov wrote:
>
> On 03/08, Tejun Heo wrote:
> >
> > +		if (likely(!ptrace) || (wo->wo_flags & WCONTINUED))
> > +			wo->notask_error = 0;
>
> I don't understand this part. Suppose that this task is not traced and
> its real parent does do_wait(WEXITED). We shouldn't return -ECHLD in
> this case (EXIT_ZOMBIE && delay_group_leader()).

Argh. please ingnore this part. I misread the code above.

> If the task is traced and debugger does do_wait(WEXITED) we should not
> return -ECHLD too.

Still true, or I missed something again...

Oleg.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/8] job control: Fix ptracer wait(2) hang and explain notask_error clearing
  2011-03-21 15:19   ` Oleg Nesterov
  2011-03-21 16:09     ` Oleg Nesterov
@ 2011-03-21 16:12     ` Tejun Heo
  2011-03-22 19:08       ` Oleg Nesterov
  1 sibling, 1 reply; 47+ messages in thread
From: Tejun Heo @ 2011-03-21 16:12 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On Mon, Mar 21, 2011 at 04:19:41PM +0100, Oleg Nesterov wrote:
> > +		/*
> > +		 * Allow access to stopped/continued state via zombie by
> > +		 * falling through.  Clearing of notask_error is complex.
> > +		 *
> > +		 * When !@ptrace:
> > +		 *
> > +		 * If WEXITED is set, notask_error should naturally be
> > +		 * cleared.  If not, subset of WSTOPPED|WCONTINUED is set,
> > +		 * so, if there are live subthreads, there are events to
> > +		 * wait for.  If all subthreads are dead, it's still safe
> > +		 * to clear - this function will be called again in finite
> > +		 * amount time once all the subthreads are released and
> > +		 * will then return without clearing.
> > +		 *
> > +		 * When @ptrace:
> > +		 *
> > +		 * Stopped state is per-task and thus can't change once the
> > +		 * target task dies, so notask_error should be cleared only
> > +		 * if WCONTINUED is set.
> > +		 */
> > +		if (likely(!ptrace) || (wo->wo_flags & WCONTINUED))
> > +			wo->notask_error = 0;
> 
> I don't understand this part. Suppose that this task is not traced and
> its real parent does do_wait(WEXITED). We shouldn't return -ECHLD in
> this case (EXIT_ZOMBIE && delay_group_leader()).
> 
> If the task is traced and debugger does do_wait(WEXITED) we should not
> return -ECHLD too.
> 
> So I think this patch is wrong in its current state. Hmm, I just noticed
> the comment says "If WEXITED is set, notask_error should ...", but this
> is not what the code does?

That comment is for !@ptrace, so the code and comment are in sync, but
yes, if WEXITED is set, notask_error better be cleared regardless of
@ptrace.

> But the main problem is, I do not think do_wait() should block in this
> case, and thus I am starting to think this patch is not "complete".
> 
> I mean, we are going to add the user-visible change, and I agree the
> current behaviour looks strange. But if we change this code, perhaps
> the tracer should ignore delay_group_leader() at all (unless we are
> going to do wait_task_zombie) ?
> 
> Your test-case could use waitid(WEXITED) instead WSTOPPED with the same
> result, it should hang. Why it hangs? The tracee is dead, we can't do
> ptrace(PTRACE_DETACH), and we can do nothing until other threads exit.
> This looks equally strange.
> 
> IOW. Assuming that ptrace == T and WEXITED is set, perhaps we should
> do something like this pseudo-code
> 
> 	if (p->exit_state == EXIT_ZOMBIE) {
> 		if (!delay_group_leader(p))
> 			return wait_task_zombie(wo, p);
> 
> 		ptrace_unlink();
> 		wait_task_zombie(WNOWAIT);
> 	}
> 
> However. This is another user-visible change, we need another discussion
> even if I am right. In particular, it is not clear what should we do
> if parent == real_parent. And probably this can confuse gdb, but iirc
> gdb already have the problems with the dead leader anyway.

Interesting point.  Yeah, I agree.  wait(WEXITED) from the ptracer
should only wait for the tracee itself, not the group.  When they are
one and the same, I don't think we need to do anything differently
from now.

If we change the behavior that way, it would also fit better with the
rest of the new behavior where the real parent and ptracer have
separate roles when wait(2)ing for stopped states.

The question is how the change would affect the existing users.  When
the debugee is a direct child, nothing will change.  When attaching to
a separate group, I don't think it even matters.  Does gdb handle
group leader any differently from the rest when attached to an
unrelated group?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 4/8] job control: Allow access to job control events through ptracees
  2011-03-08 19:56 ` [PATCH 4/8] job control: Allow access to job control events through ptracees Tejun Heo
@ 2011-03-21 16:39   ` Oleg Nesterov
  2011-03-21 17:20     ` Tejun Heo
  2011-03-22 11:10   ` [PATCH UPDATED " Tejun Heo
  1 sibling, 1 reply; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-21 16:39 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/08, Tejun Heo wrote:
>
> Currently a real parent can't access job control stopped/continued
> events through a ptraced child.  This utterly breaks job control when
> the children are ptraced.

Yes. This is what we need.

Oh. I'll try to recheck this all once again later, there are too many
combinations we should worry about. But looks correct.

Minor problem:

> One situation to be careful about is when the real parent is ptracing.
> The parent is the recipient of both ptrace and job control stop events
> and one stop can be reported as both job control and ptrace stops.  As
> this can break the current ptrace users, suppress job control stopped
> events for these cases.

Yes, but

> @@ -1580,15 +1582,37 @@ static int wait_consider_task(struct wait_opts *wo, int ptrace,
>  			wo->notask_error = 0;
>  	} else {
>  		/*
> +		 * If %current is ptracing @p, hide group stop/continued
> +		 * state when looking at @p as the real parent; otherwise,
> +		 * a single stop can be reported twice as group and ptrace
> +		 * stops.
> +		 *
> +		 * If a ptracer wants to distinguish the two events for its
> +		 * own children, it should create a separate task which
> +		 * takes the role of real parent.
> +		 */
> +		if (likely(!ptrace) && task_ptrace(p) && p->parent == current)
> +			return 0;

This doesn't look exactly right. Ignoring __WNOTHREAD, do_wait() should work
the same way for every thread in parent/debugger's thread group. IOW, we
should probably check same_thread_group(p->parent, p->real_parent) instead
of "== current".



OTOH, this is minor, perhaps we do not care. And  we have more oddities
like this. In fact I think we should just change ptrace_reparented() to
use same_thread_group(), this makes wait_task_zombie() more consistent.
But, again, this needs another discussion and off-topic right now.

Oleg.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 6/8] job control: Job control stop notifications should always go to the real parent
  2011-03-08 19:56 ` [PATCH 6/8] job control: Job control stop notifications should always go to the real parent Tejun Heo
@ 2011-03-21 17:12   ` Oleg Nesterov
  0 siblings, 0 replies; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-21 17:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/08, Tejun Heo wrote:
>
> The stopped notifications in do_signal_stop() and exit_signals() are
> always for the completion of job control.  The one in do_signal_stop()
> may be delivered to the ptracer if PTRACE_ATTACH races with
> notification and the one in exit_signals() if task exits while
> ptraced.
>
> In both cases, the notifications are meaningless and confusing to the
> ptracer as it never accesses the group stop state while the real
> parent would miss notifications for the events it is watching.
>
> Make sure these notifications always go to the real parent by calling
> do_notify_parent_cld_stop() with %false @for_ptrace.

Yes, looks correct.

Oleg.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 4/8] job control: Allow access to job control events through ptracees
  2011-03-21 16:39   ` Oleg Nesterov
@ 2011-03-21 17:20     ` Tejun Heo
  0 siblings, 0 replies; 47+ messages in thread
From: Tejun Heo @ 2011-03-21 17:20 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

Hello,

On Mon, Mar 21, 2011 at 05:39:50PM +0100, Oleg Nesterov wrote:
> > @@ -1580,15 +1582,37 @@ static int wait_consider_task(struct wait_opts *wo, int ptrace,
> >  			wo->notask_error = 0;
> >  	} else {
> >  		/*
> > +		 * If %current is ptracing @p, hide group stop/continued
> > +		 * state when looking at @p as the real parent; otherwise,
> > +		 * a single stop can be reported twice as group and ptrace
> > +		 * stops.
> > +		 *
> > +		 * If a ptracer wants to distinguish the two events for its
> > +		 * own children, it should create a separate task which
> > +		 * takes the role of real parent.
> > +		 */
> > +		if (likely(!ptrace) && task_ptrace(p) && p->parent == current)
> > +			return 0;
> 
> This doesn't look exactly right. Ignoring __WNOTHREAD, do_wait() should work
> the same way for every thread in parent/debugger's thread group. IOW, we
> should probably check same_thread_group(p->parent, p->real_parent) instead
> of "== current".

Alright, I'll update the test here and in the other patch which has
similar logic.

BTW, what are you planning about patch routing?  Are you gonna setup a
tree?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace
  2011-03-08 19:56 ` [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace Tejun Heo
@ 2011-03-21 17:43   ` Oleg Nesterov
  2011-03-22  8:04     ` Tejun Heo
  2011-03-22 11:30   ` [PATCH UPDATED " Tejun Heo
  1 sibling, 1 reply; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-21 17:43 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/08, Tejun Heo wrote:
>
> With recent changes, job control and ptrace stopped states are
> properly separated and accessible to the real parent and the ptracer
> respectively; however, notifications of job control stopped/continued
> events to the real parent while ptraced are still missing.

Yes, great.

>  /*
> + * Test whether the target task of the usual cldstop notification - the
> + * real_parent of the group_leader of @child - is the ptracer.
> + */
> +static bool real_parent_is_ptracer(struct task_struct *child)
> +{
> +	return child->parent == child->group_leader->real_parent;
> +}

Again, I am not sure we do not need same_thread_group(), but this
is minor.

Hmm... in fact I can't convince myself we really need to look at
child->group_leader, will recheck... Anyway, this is minor too.

> @@ -1757,7 +1768,20 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
>  	spin_unlock_irq(&current->sighand->siglock);
>  	read_lock(&tasklist_lock);
>  	if (may_ptrace_stop()) {
> -		do_notify_parent_cldstop(current, task_ptrace(current), why);
> +		/*
> +		 * Notify parents of the stop.
> +		 *
> +		 * While ptraced, there are two parents - the ptracer and
> +		 * the real_parent of the group_leader.  The ptracer should
> +		 * know about every stop while the real parent is only
> +		 * interested in the completion of group stop.  The states
> +		 * for the two don't interact with each other.  Notify
> +		 * separately unless they're gonna be duplicates.
> +		 */
> +		do_notify_parent_cldstop(current, true, why);
> +		if (gstop_done && !real_parent_is_ptracer(current))
> +			do_notify_parent_cldstop(current, false, why);

OK.

But what about "else" branch? If gstop_done == T but debugger has gone
between spin_unlock(siglock) and read_lock(tasklist), we should do
something.

ptrace_untrace() restores GROUP_STOP_PENDING in this case, so this task
will stop again. But notification is lost.

Just in case, it is not that I blame this patch. Just I think we need
a bit more changes here. Unless I missed something.

> @@ -2017,10 +2041,24 @@ relock:
>
> +		/*
> +		 * Notify the parent that we're continuing.  This event is
> +		 * always per-process and doesn't make whole lot of sense
> +		 * for ptracers, who shouldn't consume the state via
> +		 * wait(2) either, but, for backward compatibility, notify
> +		 * the ptracer of the group leader too unless it's gonna be
> +		 * a duplicate.
> +		 */
>  		read_lock(&tasklist_lock);
> +
> +		do_notify_parent_cldstop(current, false, why);

Nice,

>  		leader = current->group_leader;
> +		if (task_ptrace(leader) && !real_parent_is_ptracer(leader))
> +			do_notify_parent_cldstop(leader, true, why);

Well, yes... This is ugly but compatible and documented, so I agree.

Oleg.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 8/8] job control: Don't send duplicate job control stop notification while ptraced
  2011-03-08 19:56 ` [PATCH 8/8] job control: Don't send duplicate job control stop notification while ptraced Tejun Heo
@ 2011-03-21 17:48   ` Oleg Nesterov
  0 siblings, 0 replies; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-21 17:48 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/08, Tejun Heo wrote:
>
>  static bool task_participate_group_stop(struct task_struct *task)
>  {
> @@ -284,7 +288,11 @@ static bool task_participate_group_stop(struct task_struct *task)
>  	if (!WARN_ON_ONCE(sig->group_stop_count == 0))
>  		sig->group_stop_count--;
>  
> -	if (!sig->group_stop_count) {
> +	/*
> +	 * Tell the caller to notify completion iff we are entering into a
> +	 * fresh group stop.  Read comment in do_signal_stop() for details.
> +	 */
> +	if (!sig->group_stop_count && !(sig->flags & SIGNAL_STOP_STOPPED)) {
>  		sig->flags = SIGNAL_STOP_STOPPED;
>  		return true;

Ah, indeed. And this fixes the problems with the extra notification I
mentioned when I was looking at 1/8.

Oleg.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace
  2011-03-21 17:43   ` Oleg Nesterov
@ 2011-03-22  8:04     ` Tejun Heo
  2011-03-22 19:44       ` Oleg Nesterov
  0 siblings, 1 reply; 47+ messages in thread
From: Tejun Heo @ 2011-03-22  8:04 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On Mon, Mar 21, 2011 at 06:43:06PM +0100, Oleg Nesterov wrote:
> > + * Test whether the target task of the usual cldstop notification - the
> > + * real_parent of the group_leader of @child - is the ptracer.
> > + */
> > +static bool real_parent_is_ptracer(struct task_struct *child)
> > +{
> > +	return child->parent == child->group_leader->real_parent;
> > +}
> 
> Again, I am not sure we do not need same_thread_group(), but this
> is minor.

Yeah, same_thread_group() would be better.

> Hmm... in fact I can't convince myself we really need to look at
> child->group_leader, will recheck... Anyway, this is minor too.

Care to elaborate?

> > @@ -1757,7 +1768,20 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
> >  	spin_unlock_irq(&current->sighand->siglock);
> >  	read_lock(&tasklist_lock);
> >  	if (may_ptrace_stop()) {
> > -		do_notify_parent_cldstop(current, task_ptrace(current), why);
> > +		/*
> > +		 * Notify parents of the stop.
> > +		 *
> > +		 * While ptraced, there are two parents - the ptracer and
> > +		 * the real_parent of the group_leader.  The ptracer should
> > +		 * know about every stop while the real parent is only
> > +		 * interested in the completion of group stop.  The states
> > +		 * for the two don't interact with each other.  Notify
> > +		 * separately unless they're gonna be duplicates.
> > +		 */
> > +		do_notify_parent_cldstop(current, true, why);
> > +		if (gstop_done && !real_parent_is_ptracer(current))
> > +			do_notify_parent_cldstop(current, false, why);
> 
> OK.
> 
> But what about "else" branch? If gstop_done == T but debugger has gone
> between spin_unlock(siglock) and read_lock(tasklist), we should do
> something.
> 
> ptrace_untrace() restores GROUP_STOP_PENDING in this case, so this task
> will stop again. But notification is lost.
> 
> Just in case, it is not that I blame this patch. Just I think we need
> a bit more changes here. Unless I missed something.

You mean when may_ptrace_stop() fails, right?  Yeah, we need
notification in the else clause.  Will add it.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 0.1/8] ptrace: Collapse ptrace_untrace() into __ptrace_unlink()
  2011-03-08 19:56 [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
                   ` (9 preceding siblings ...)
  2011-03-09 16:50 ` Oleg Nesterov
@ 2011-03-22 10:20 ` Tejun Heo
  2011-03-22 10:20 ` [PATCH 0.2/8] ptrace: Always put ptracee into appropriate execution state Tejun Heo
  2011-03-22 13:11 ` [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
  12 siblings, 0 replies; 47+ messages in thread
From: Tejun Heo @ 2011-03-22 10:20 UTC (permalink / raw)
  To: oleg, roland, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan

Remove the extra task_is_traced() check in __ptrace_unlink() and
collapse ptrace_untrace() into __ptrace_unlink().  This is to prepare
for further changes.

While at it, drop the comment on top of ptrace_untrace() and convert
__ptrace_unlink() comment to docbook format.  Detailed comment will be
added by the next patch.

This patch doesn't cause any visible behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/ptrace.c |   40 +++++++++++++++-------------------------
 1 file changed, 15 insertions(+), 25 deletions(-)

Index: work/kernel/ptrace.c
===================================================================
--- work.orig/kernel/ptrace.c
+++ work/kernel/ptrace.c
@@ -37,15 +37,23 @@ void __ptrace_link(struct task_struct *c
 	child->parent = new_parent;
 }
 
-/*
- * Turn a tracing stop into a normal stop now, since with no tracer there
- * would be no way to wake it up with SIGCONT or SIGKILL.  If there was a
- * signal sent that would resume the child, but didn't because it was in
- * TASK_TRACED, resume it now.
- * Requires that irqs be disabled.
+/**
+ * __ptrace_unlink - unlink ptracee and restore its execution state
+ * @child: ptracee to be unlinked
+ *
+ * Remove @child from the ptrace list, move it back to the original parent.
+ *
+ * CONTEXT:
+ * write_lock_irq(tasklist_lock)
  */
-static void ptrace_untrace(struct task_struct *child)
+void __ptrace_unlink(struct task_struct *child)
 {
+	BUG_ON(!child->ptrace);
+
+	child->ptrace = 0;
+	child->parent = child->real_parent;
+	list_del_init(&child->ptrace_entry);
+
 	spin_lock(&child->sighand->siglock);
 	if (task_is_traced(child)) {
 		/*
@@ -70,24 +78,6 @@ static void ptrace_untrace(struct task_s
 }
 
 /*
- * unptrace a task: move it back to its original parent and
- * remove it from the ptrace list.
- *
- * Must be called with the tasklist lock write-held.
- */
-void __ptrace_unlink(struct task_struct *child)
-{
-	BUG_ON(!child->ptrace);
-
-	child->ptrace = 0;
-	child->parent = child->real_parent;
-	list_del_init(&child->ptrace_entry);
-
-	if (task_is_traced(child))
-		ptrace_untrace(child);
-}
-
-/*
  * Check that we have indeed attached to the thing..
  */
 int ptrace_check_attach(struct task_struct *child, int kill)

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 0.2/8] ptrace: Always put ptracee into appropriate execution state
  2011-03-08 19:56 [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
                   ` (10 preceding siblings ...)
  2011-03-22 10:20 ` [PATCH 0.1/8] ptrace: Collapse ptrace_untrace() into __ptrace_unlink() Tejun Heo
@ 2011-03-22 10:20 ` Tejun Heo
  2011-03-22 20:33   ` Oleg Nesterov
  2011-03-22 13:11 ` [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
  12 siblings, 1 reply; 47+ messages in thread
From: Tejun Heo @ 2011-03-22 10:20 UTC (permalink / raw)
  To: oleg, roland, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan

Currently, __ptrace_unlink() wakes up the tracee iff it's in
TASK_TRACED.  For unlinking from PTRACE_DETACH, this is correct as the
tracee is guaranteed to be in TASK_TRACED or dead; however, unlinking
also happens when the ptracer exits and in this case the ptracee can
be in any state and ptrace might be left running even if the group it
belongs to is stopped.

This patch updates __ptrace_unlink() such that GROUP_STOP_PENDING is
reinstated regardless of the ptracee's current state as long as it's
alive and makes sure that signal_wake_up() is called if execution
state transition is necessary.

Test case follows.

  #include <unistd.h>
  #include <time.h>
  #include <sys/types.h>
  #include <sys/ptrace.h>
  #include <sys/wait.h>

  static const struct timespec ts1s = { .tv_sec = 1 };

  int main(void)
  {
	  pid_t tracee;
	  siginfo_t si;

	  tracee = fork();
	  if (tracee == 0) {
		  while (1) {
			  nanosleep(&ts1s, NULL);
			  write(1, ".", 1);
		  }
	  }

	  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
	  waitid(P_PID, tracee, &si, WSTOPPED);
	  ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
	  waitid(P_PID, tracee, &si, WSTOPPED);
	  ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
	  write(1, "exiting", 7);
	  return 0;
  }

Before the patch, after the parent process exits, the child is left
running and prints out "." every second.

  exiting..... (continues)

After the patch, the group stop initiated by the implied SIGSTOP from
PTRACE_ATTACH is re-established when the parent exits.

  exiting

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/ptrace.c |   59 +++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 39 insertions(+), 20 deletions(-)

Index: work/kernel/ptrace.c
===================================================================
--- work.orig/kernel/ptrace.c
+++ work/kernel/ptrace.c
@@ -41,7 +41,26 @@ void __ptrace_link(struct task_struct *c
  * __ptrace_unlink - unlink ptracee and restore its execution state
  * @child: ptracee to be unlinked
  *
- * Remove @child from the ptrace list, move it back to the original parent.
+ * Remove @child from the ptrace list, move it back to the original parent,
+ * and restore the execution state so that it conforms to the group stop
+ * state.
+ *
+ * Unlinking can happen via two paths - explicit PTRACE_DETACH or ptracer
+ * exiting.  For PTRACE_DETACH, unless the ptracee has been killed between
+ * ptrace_check_attach() and here, it's guaranteed to be in TASK_TRACED.
+ * If the ptracer is exiting, the ptracee can be in any state.
+ *
+ * After detach, the ptracee should be in a state which conforms to the
+ * group stop.  If the group is stopped or in the process of stopping, the
+ * ptracee should be put into TASK_STOPPED; otherwise, it should be woken
+ * up from TASK_TRACED.
+ *
+ * If the ptracee is in TASK_TRACED and needs to be moved to TASK_STOPPED,
+ * it goes through TRACED -> RUNNING -> STOPPED transition which is similar
+ * to but in the opposite direction of what happens while attaching to a
+ * stopped task.  However, in this direction, the intermediate RUNNING
+ * state is not hidden even from the current ptracer and if it immediately
+ * re-attaches and performs a WNOHANG wait(2), it may fail.
  *
  * CONTEXT:
  * write_lock_irq(tasklist_lock)
@@ -55,25 +74,25 @@ void __ptrace_unlink(struct task_struct
 	list_del_init(&child->ptrace_entry);
 
 	spin_lock(&child->sighand->siglock);
-	if (task_is_traced(child)) {
-		/*
-		 * If group stop is completed or in progress, it should
-		 * participate in the group stop.  Set GROUP_STOP_PENDING
-		 * before kicking it.
-		 *
-		 * This involves TRACED -> RUNNING -> STOPPED transition
-		 * which is similar to but in the opposite direction of
-		 * what happens while attaching to a stopped task.
-		 * However, in this direction, the intermediate RUNNING
-		 * state is not hidden even from the current ptracer and if
-		 * it immediately re-attaches and performs a WNOHANG
-		 * wait(2), it may fail.
-		 */
-		if (child->signal->flags & SIGNAL_STOP_STOPPED ||
-		    child->signal->group_stop_count)
-			child->group_stop |= GROUP_STOP_PENDING;
-		signal_wake_up(child, 1);
-	}
+
+	/*
+	 * Reinstate GROUP_STOP_PENDING if group stop is in effect and
+	 * @child isn't dead.
+	 */
+	if (!(child->flags & PF_EXITING) &&
+	    (child->signal->flags & SIGNAL_STOP_STOPPED ||
+	     child->signal->group_stop_count))
+		child->group_stop |= GROUP_STOP_PENDING;
+
+	/*
+	 * If transition to TASK_STOPPED is pending or in TASK_TRACED, kick
+	 * @child in the butt.  Note that @resume should be used iff @child
+	 * is in TASK_TRACED; otherwise, we might unduly disrupt
+	 * TASK_KILLABLE sleeps.
+	 */
+	if (child->group_stop & GROUP_STOP_PENDING || task_is_traced(child))
+		signal_wake_up(child, task_is_traced(child));
+
 	spin_unlock(&child->sighand->siglock);
 }
 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH UPDATED 3/8] job control: Fix ptracer wait(2) hang and explain notask_error clearing
  2011-03-08 19:56 ` [PATCH 3/8] job control: Fix ptracer wait(2) hang and explain notask_error clearing Tejun Heo
  2011-03-21 15:19   ` Oleg Nesterov
@ 2011-03-22 10:51   ` Tejun Heo
  1 sibling, 0 replies; 47+ messages in thread
From: Tejun Heo @ 2011-03-22 10:51 UTC (permalink / raw)
  To: oleg, roland, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan

wait(2) and friends allow access to stopped/continued states through
zombies, which is required as the states are process-wide and should
be accessible whether the leader task is alive or undead.
wait_consider_task() implements this by always clearing notask_error
and going through wait_task_stopped/continued() for unreaped zombies.

However, while ptraced, the stopped state is per-task and as such if
the ptracee became a zombie, there's no further stopped event to
listen to and wait(2) and friends should return -ECHILD on the tracee.

Fix it by clearing notask_error only if WCONTINUED | WEXITED is set
for ptraced zombies.  While at it, document why clearing notask_error
is safe for each case.

Test case follows.

  #include <stdio.h>
  #include <unistd.h>
  #include <pthread.h>
  #include <time.h>
  #include <sys/types.h>
  #include <sys/ptrace.h>
  #include <sys/wait.h>

  static void *nooper(void *arg)
  {
	  pause();
	  return NULL;
  }

  int main(void)
  {
	  const struct timespec ts1s = { .tv_sec = 1 };
	  pid_t tracee, tracer;
	  siginfo_t si;

	  tracee = fork();
	  if (tracee == 0) {
		  pthread_t thr;

		  pthread_create(&thr, NULL, nooper, NULL);
		  nanosleep(&ts1s, NULL);
		  printf("tracee exiting\n");
		  pthread_exit(NULL);	/* let subthread run */
	  }

	  tracer = fork();
	  if (tracer == 0) {
		  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
		  while (1) {
			  if (waitid(P_PID, tracee, &si, WSTOPPED) < 0) {
				  perror("waitid");
				  break;
			  }
			  ptrace(PTRACE_CONT, tracee, NULL,
				 (void *)(long)si.si_status);
		  }
		  return 0;
	  }

	  waitid(P_PID, tracer, &si, WEXITED);
	  kill(tracee, SIGKILL);
	  return 0;
  }

Before the patch, after the tracee becomes a zombie, the tracer's
waitid(WSTOPPED) never returns and the program doesn't terminate.

  tracee exiting
  ^C

After the patch, tracee exiting triggers waitid() to fail.

  tracee exiting
  waitid: No child processes

-v2: Oleg pointed out that exited in addition to continued can happen
     for ptraced dead group leader.  Clear notask_error for ptraced
     child on WEXITED too.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
---
WEXITED bug fixed.  Let's tackle the ptraced per-task wait(2) thing
later.  Thanks.

 kernel/exit.c |   44 ++++++++++++++++++++++++++++++++++----------
 1 file changed, 34 insertions(+), 10 deletions(-)

Index: work/kernel/exit.c
===================================================================
--- work.orig/kernel/exit.c
+++ work/kernel/exit.c
@@ -1550,17 +1550,41 @@ static int wait_consider_task(struct wai
 		return 0;
 	}
 
-	/*
-	 * We don't reap group leaders with subthreads.
-	 */
-	if (p->exit_state == EXIT_ZOMBIE && !delay_group_leader(p))
-		return wait_task_zombie(wo, p);
+	/* slay zombie? */
+	if (p->exit_state == EXIT_ZOMBIE) {
+		/* we don't reap group leaders with subthreads */
+		if (!delay_group_leader(p))
+			return wait_task_zombie(wo, p);
 
-	/*
-	 * It's stopped or running now, so it might
-	 * later continue, exit, or stop again.
-	 */
-	wo->notask_error = 0;
+		/*
+		 * Allow access to stopped/continued state via zombie by
+		 * falling through.  Clearing of notask_error is complex.
+		 *
+		 * When !@ptrace:
+		 *
+		 * If WEXITED is set, notask_error should naturally be
+		 * cleared.  If not, subset of WSTOPPED|WCONTINUED is set,
+		 * so, if there are live subthreads, there are events to
+		 * wait for.  If all subthreads are dead, it's still safe
+		 * to clear - this function will be called again in finite
+		 * amount time once all the subthreads are released and
+		 * will then return without clearing.
+		 *
+		 * When @ptrace:
+		 *
+		 * Stopped state is per-task and thus can't change once the
+		 * target task dies.  Only continued and exited can happen.
+		 * Clear notask_error if WCONTINUED | WEXITED.
+		 */
+		if (likely(!ptrace) || (wo->wo_flags & (WCONTINUED | WEXITED)))
+			wo->notask_error = 0;
+	} else {
+		/*
+		 * @p is alive and it's gonna stop, continue or exit, so
+		 * there always is something to wait for.
+		 */
+		wo->notask_error = 0;
+	}
 
 	if (task_stopped_code(p, ptrace))
 		return wait_task_stopped(wo, ptrace, p);

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH UPDATED 4/8] job control: Allow access to job control events through ptracees
  2011-03-08 19:56 ` [PATCH 4/8] job control: Allow access to job control events through ptracees Tejun Heo
  2011-03-21 16:39   ` Oleg Nesterov
@ 2011-03-22 11:10   ` Tejun Heo
  1 sibling, 0 replies; 47+ messages in thread
From: Tejun Heo @ 2011-03-22 11:10 UTC (permalink / raw)
  To: oleg, roland, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan

Currently a real parent can't access job control stopped/continued
events through a ptraced child.  This utterly breaks job control when
the children are ptraced.

For example, if a program is run from an interactive shell and then
strace(1) attaches to it, pressing ^Z would send SIGTSTP and strace(1)
would notice it but the shell has no way to tell whether the child
entered job control stop and thus can't tell when to take over the
terminal - leading to awkward lone ^Z on the terminal.

Because the job control and ptrace stopped states are independent,
there is no reason to prevent real parents from accessing the stopped
state regardless of ptrace.  The continued state isn't separate but
ptracers don't have any use for them as ptracees can never resume
without explicit command from their ptracers, so as long as ptracers
don't consume it, it should be fine.

Although this is a behavior change, because the previous behavior is
utterly broken when viewed from real parents and the change is only
visible to real parents, I don't think it's necessary to make this
behavior optional.

One situation to be careful about is when a task from the real
parent's group is ptracing.  The parent group is the recipient of both
ptrace and job control stop events and one stop can be reported as
both job control and ptrace stops.  As this can break the current
ptrace users, suppress job control stopped events for these cases.

If a real parent ptracer wants to know about both job control and
ptrace stops, it can create a separate process to serve the role of
real parent.

Note that this only updates wait(2) side of things.  The real parent
can access the states via wait(2) but still is not properly notified
(woken up and delivered signal).  Test case polls wait(2) with WNOHANG
to work around.  Notification will be updated by future patches.

Test case follows.

  #include <stdio.h>
  #include <unistd.h>
  #include <time.h>
  #include <errno.h>
  #include <sys/types.h>
  #include <sys/ptrace.h>
  #include <sys/wait.h>

  int main(void)
  {
	  const struct timespec ts100ms = { .tv_nsec = 100000000 };
	  pid_t tracee, tracer;
	  siginfo_t si;
	  int i;

	  tracee = fork();
	  if (tracee == 0) {
		  while (1) {
			  printf("tracee: SIGSTOP\n");
			  raise(SIGSTOP);
			  nanosleep(&ts100ms, NULL);
			  printf("tracee: SIGCONT\n");
			  raise(SIGCONT);
			  nanosleep(&ts100ms, NULL);
		  }
	  }

	  waitid(P_PID, tracee, &si, WSTOPPED | WNOHANG | WNOWAIT);

	  tracer = fork();
	  if (tracer == 0) {
		  nanosleep(&ts100ms, NULL);
		  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);

		  for (i = 0; i < 11; i++) {
			  si.si_pid = 0;
			  waitid(P_PID, tracee, &si, WSTOPPED);
			  if (si.si_pid && si.si_code == CLD_TRAPPED)
				  ptrace(PTRACE_CONT, tracee, NULL,
					 (void *)(long)si.si_status);
		  }
		  printf("tracer: EXITING\n");
		  return 0;
	  }

	  while (1) {
		  si.si_pid = 0;
		  waitid(P_PID, tracee, &si,
			 WSTOPPED | WCONTINUED | WEXITED | WNOHANG);
		  if (si.si_pid)
			  printf("mommy : WAIT status=%02d code=%02d\n",
				 si.si_status, si.si_code);
		  nanosleep(&ts100ms, NULL);
	  }
	  return 0;
  }

Before the patch, while ptraced, the parent can't see any job control
events.

  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  tracee: SIGSTOP
  tracee: SIGCONT
  tracee: SIGSTOP
  tracee: SIGCONT
  tracee: SIGSTOP
  tracer: EXITING
  mommy : WAIT status=19 code=05
  ^C

After the patch,

  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  mommy : WAIT status=18 code=06
  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  mommy : WAIT status=18 code=06
  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  mommy : WAIT status=18 code=06
  tracee: SIGSTOP
  tracer: EXITING
  mommy : WAIT status=19 code=05
  ^C

-v2: Oleg pointed out that wait(2) should be suppressed for the real
     parent's group instead of only the real parent task itself.
     Updated accordingly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 kernel/exit.c |   41 +++++++++++++++++++++++++++++++++--------
 1 file changed, 33 insertions(+), 8 deletions(-)

Index: work/kernel/exit.c
===================================================================
--- work.orig/kernel/exit.c
+++ work/kernel/exit.c
@@ -1541,17 +1541,19 @@ static int wait_consider_task(struct wai
 	if (p->exit_state == EXIT_DEAD)
 		return 0;
 
-	if (likely(!ptrace) && unlikely(task_ptrace(p))) {
+	/* slay zombie? */
+	if (p->exit_state == EXIT_ZOMBIE) {
 		/*
-		 * This child is hidden by ptrace.
-		 * We aren't allowed to see it now, but eventually we will.
+		 * A zombie ptracee is only visible to its ptracer.
+		 * Notification and reaping will be cascaded to the real
+		 * parent when the ptracer detaches.
 		 */
-		wo->notask_error = 0;
-		return 0;
-	}
+		if (likely(!ptrace) && unlikely(task_ptrace(p))) {
+			/* it will become visible, clear notask_error */
+			wo->notask_error = 0;
+			return 0;
+		}
 
-	/* slay zombie? */
-	if (p->exit_state == EXIT_ZOMBIE) {
 		/* we don't reap group leaders with subthreads */
 		if (!delay_group_leader(p))
 			return wait_task_zombie(wo, p);
@@ -1580,15 +1582,38 @@ static int wait_consider_task(struct wai
 			wo->notask_error = 0;
 	} else {
 		/*
+		 * If @p is ptraced by a task in its real parent's group,
+		 * hide group stop/continued state when looking at @p as
+		 * the real parent; otherwise, a single stop can be
+		 * reported twice as group and ptrace stops.
+		 *
+		 * If a ptracer wants to distinguish the two events for its
+		 * own children, it should create a separate process which
+		 * takes the role of real parent.
+		 */
+		if (likely(!ptrace) && task_ptrace(p) &&
+		    same_thread_group(p->parent, p->real_parent))
+			return 0;
+
+		/*
 		 * @p is alive and it's gonna stop, continue or exit, so
 		 * there always is something to wait for.
 		 */
 		wo->notask_error = 0;
 	}
 
+	/*
+	 * Wait for stopped.  Depending on @ptrace, different stopped state
+	 * is used and the two don't interact with each other.
+	 */
 	if (task_stopped_code(p, ptrace))
 		return wait_task_stopped(wo, ptrace, p);
 
+	/*
+	 * Wait for continued.  There's only one continued state and the
+	 * ptracer can consume it which can confuse the real parent.  Don't
+	 * use WCONTINUED from ptracer.  You don't need or want it.
+	 */
 	return wait_task_continued(wo, p);
 }
 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH UPDATED 7/8] job control: Notify the real parent of job control events regardless of ptrace
  2011-03-08 19:56 ` [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace Tejun Heo
  2011-03-21 17:43   ` Oleg Nesterov
@ 2011-03-22 11:30   ` Tejun Heo
  1 sibling, 0 replies; 47+ messages in thread
From: Tejun Heo @ 2011-03-22 11:30 UTC (permalink / raw)
  To: oleg, roland, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan

With recent changes, job control and ptrace stopped states are
properly separated and accessible to the real parent and the ptracer
respectively; however, notifications of job control stopped/continued
events to the real parent while ptraced are still missing.

A ptracee participates in group stop in ptrace_stop() but the
completion isn't notified.  If participation results in completion of
group stop, notify the real parent of the event.  The ptrace and group
stops are separate and can be handled as such.

However, when the real parent and the ptracer are in the same thread
group, only the ptrace stop event is visible through wait(2) and the
duplicate notifications are different from the current behavior and
are confusing.  Suppress group stop notification in such cases.

The continued state is shared between the real parent and the ptracer
but is only meaningful to the real parent.  Always notify the real
parent and notify the ptracer too for backward compatibility.  Similar
to stop notification, if the real parent is the ptracer, suppress a
duplicate notification.

Test case follows.

  #include <stdio.h>
  #include <unistd.h>
  #include <time.h>
  #include <errno.h>
  #include <sys/types.h>
  #include <sys/ptrace.h>
  #include <sys/wait.h>

  int main(void)
  {
	  const struct timespec ts100ms = { .tv_nsec = 100000000 };
	  pid_t tracee, tracer;
	  siginfo_t si;
	  int i;

	  tracee = fork();
	  if (tracee == 0) {
		  while (1) {
			  printf("tracee: SIGSTOP\n");
			  raise(SIGSTOP);
			  nanosleep(&ts100ms, NULL);
			  printf("tracee: SIGCONT\n");
			  raise(SIGCONT);
			  nanosleep(&ts100ms, NULL);
		  }
	  }

	  waitid(P_PID, tracee, &si, WSTOPPED | WNOHANG | WNOWAIT);

	  tracer = fork();
	  if (tracer == 0) {
		  nanosleep(&ts100ms, NULL);
		  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);

		  for (i = 0; i < 11; i++) {
			  si.si_pid = 0;
			  waitid(P_PID, tracee, &si, WSTOPPED);
			  if (si.si_pid && si.si_code == CLD_TRAPPED)
				  ptrace(PTRACE_CONT, tracee, NULL,
					 (void *)(long)si.si_status);
		  }
		  printf("tracer: EXITING\n");
		  return 0;
	  }

	  while (1) {
		  si.si_pid = 0;
		  waitid(P_PID, tracee, &si, WSTOPPED | WCONTINUED | WEXITED);
		  if (si.si_pid)
			  printf("mommy : WAIT status=%02d code=%02d\n",
				 si.si_status, si.si_code);
	  }
	  return 0;
  }

Before this patch, while ptraced, the real parent doesn't get
notifications for job control events, so although it can access those
events, the later waitid(2) call never wakes up.

  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  tracee: SIGSTOP
  tracee: SIGCONT
  tracee: SIGSTOP
  tracee: SIGCONT
  tracee: SIGSTOP
  tracer: EXITING
  mommy : WAIT status=19 code=05
  ^C

After this patch, it works as expected.

  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  mommy : WAIT status=18 code=06
  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  mommy : WAIT status=18 code=06
  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  mommy : WAIT status=18 code=06
  tracee: SIGSTOP
  tracer: EXITING
  mommy : WAIT status=19 code=05
  ^C

-v2: Oleg pointed out that

     * Group stop notification to the real parent should also happen
       when ptracer detach races with ptrace_stop().

     * real_parent_is_ptracer() should be testing thread group
       equality not the task itself as wait(2) and stop/cont
       notifications are normally thread-group wide.

     Both issues are fixed accordingly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 kernel/signal.c |   55 ++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 52 insertions(+), 3 deletions(-)

Index: work/kernel/signal.c
===================================================================
--- work.orig/kernel/signal.c
+++ work/kernel/signal.c
@@ -1694,6 +1694,17 @@ static int sigkill_pending(struct task_s
 }
 
 /*
+ * Test whether the target task of the usual cldstop notification - the
+ * real_parent of the group_leader of @child - is in the same group as the
+ * ptracer.
+ */
+static bool real_parent_is_ptracer(struct task_struct *child)
+{
+	return same_thread_group(child->parent,
+				 child->group_leader->real_parent);
+}
+
+/*
  * This must be called with current->sighand->siglock held.
  *
  * This should be the path for all ptrace stops.
@@ -1708,6 +1719,8 @@ static void ptrace_stop(int exit_code, i
 	__releases(&current->sighand->siglock)
 	__acquires(&current->sighand->siglock)
 {
+	bool gstop_done = false;
+
 	if (arch_ptrace_stop_needed(exit_code, info)) {
 		/*
 		 * The arch code has something special to do before a
@@ -1735,7 +1748,7 @@ static void ptrace_stop(int exit_code, i
 	 * is entered - ignore it.
 	 */
 	if (why == CLD_STOPPED && (current->group_stop & GROUP_STOP_PENDING))
-		task_participate_group_stop(current);
+		gstop_done = task_participate_group_stop(current);
 
 	current->last_siginfo = info;
 	current->exit_code = exit_code;
@@ -1757,7 +1770,20 @@ static void ptrace_stop(int exit_code, i
 	spin_unlock_irq(&current->sighand->siglock);
 	read_lock(&tasklist_lock);
 	if (may_ptrace_stop()) {
-		do_notify_parent_cldstop(current, task_ptrace(current), why);
+		/*
+		 * Notify parents of the stop.
+		 *
+		 * While ptraced, there are two parents - the ptracer and
+		 * the real_parent of the group_leader.  The ptracer should
+		 * know about every stop while the real parent is only
+		 * interested in the completion of group stop.  The states
+		 * for the two don't interact with each other.  Notify
+		 * separately unless they're gonna be duplicates.
+		 */
+		do_notify_parent_cldstop(current, true, why);
+		if (gstop_done && !real_parent_is_ptracer(current))
+			do_notify_parent_cldstop(current, false, why);
+
 		/*
 		 * Don't want to allow preemption here, because
 		 * sys_ptrace() needs this task to be inactive.
@@ -1772,7 +1798,16 @@ static void ptrace_stop(int exit_code, i
 		/*
 		 * By the time we got the lock, our tracer went away.
 		 * Don't drop the lock yet, another tracer may come.
+		 *
+		 * If @gstop_done, the ptracer went away between group stop
+		 * completion and here.  During detach, it would have set
+		 * GROUP_STOP_PENDING on us and we'll re-enter TASK_STOPPED
+		 * in do_signal_stop() on return, so notifying the real
+		 * parent of the group stop completion is enough.
 		 */
+		if (gstop_done)
+			do_notify_parent_cldstop(current, false, why);
+
 		__set_current_state(TASK_RUNNING);
 		if (clear_code)
 			current->exit_code = 0;
@@ -2017,10 +2052,24 @@ relock:
 
 		spin_unlock_irq(&sighand->siglock);
 
+		/*
+		 * Notify the parent that we're continuing.  This event is
+		 * always per-process and doesn't make whole lot of sense
+		 * for ptracers, who shouldn't consume the state via
+		 * wait(2) either, but, for backward compatibility, notify
+		 * the ptracer of the group leader too unless it's gonna be
+		 * a duplicate.
+		 */
 		read_lock(&tasklist_lock);
+
+		do_notify_parent_cldstop(current, false, why);
+
 		leader = current->group_leader;
-		do_notify_parent_cldstop(leader, task_ptrace(leader), why);
+		if (task_ptrace(leader) && !real_parent_is_ptracer(leader))
+			do_notify_parent_cldstop(leader, true, why);
+
 		read_unlock(&tasklist_lock);
+
 		goto relock;
 	}
 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced
  2011-03-08 19:56 [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
                   ` (11 preceding siblings ...)
  2011-03-22 10:20 ` [PATCH 0.2/8] ptrace: Always put ptracee into appropriate execution state Tejun Heo
@ 2011-03-22 13:11 ` Tejun Heo
  2011-03-22 20:59   ` Oleg Nesterov
  12 siblings, 1 reply; 47+ messages in thread
From: Tejun Heo @ 2011-03-22 13:11 UTC (permalink / raw)
  To: oleg, roland, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan

Hello, Oleg.

The updated patches should address the issues pointed out in your
reviews, at least the ones which are appropriate to address now.
Please let me know when you want the whole series resent.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/8] job control: Don't set group_stop exit_code if re-entering job control stop
  2011-03-21 15:52     ` Tejun Heo
@ 2011-03-22 18:44       ` Oleg Nesterov
  2011-03-23  8:44         ` Tejun Heo
  0 siblings, 1 reply; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-22 18:44 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/21, Tejun Heo wrote:
>
> On Mon, Mar 21, 2011 at 02:20:24PM +0100, Oleg Nesterov wrote:
> > Also. We should take ->group_stop_count != 0 into account, we should not
> > set (change) ->group_exit_code in this case too.
>
> Hmmm... I'm not sure whether this matters.  If the group stop hasn't
> completed yet, I don't think we need to guarantee which one is
> reported.

I agree, this is minor, but

> with
> multithread, the userland has no way to tell which one of the two
> signals actually started group stop so we can report any.  The
> ordering is undefined.

Yes, if two threads dequeue the sig_kernel_stop() signals, the ordering
is undefined. But ptrace should not change sig->group_exit_code once
it was already set by the thread which starts the group stop.

Suppose that debugger PTRACE_CONTs the stopped thread and then it
gets SIGSTOP and calls do_signal_stop() again. In theory this all is
possible before SIGNAL_STOP_STOPPED. This can confuse real_parent.
Say, real_parent itself sends SIGTTIN to the child and naturally
expects WIFSTOPPED() == SIGTTIN.

And btw, I think we should not change/set ->group_stop_count if

But again, even if I am right this is minor and we can change this
later. And btw, I think we should not change/set ->group_stop_count
if SIGNAL_STOP_STOPPED || group_stop_count. We will see.

IOW. In short - you can skip the whole email. I do not see any
real problems which could delay this patch.

> > 			if (!(t->flags & PF_EXITING) && !task_is_stopped(t)) {
> > 				t->group_stop |= signr | gstop;
> > 				sig->group_stop_count++;
> > 				signal_wake_up(t, 0);
> > 			} else {
> > 				task_clear_group_stop_pending(t);
> > 			}
> > 		}
> >
> > Somehow I no longer understand "else task_clear_group_stop_pending()".
> > I mean, is it really needed?
> >
> > If task_is_stopped() == T or it is PF_EXITING, this task has already
> > done task_participate_group_stop(), no?
>
> Hmm... I think this is from the older implementation where
> task_clear_group_stop_pending() did more than just clearing the two
> flags.  We can remove it.

OK, good.

> > Also. I do not think it is correct to change the "signr" part of
> > ->group_stop (unless it is zero) when ->group_stop_count != 0
> > for other threads. This is minor, but still doesn't look exactly
> > correct. Probably we can ignore this.
>
> GROUP_STOP_SIGMASK only affects ptracers.  The extra stop signal is
> visible to one ptracer and possibly other ptracers too, so I think
> it's more appropriate to report the newest signal.  Please consider
> the following scenario.
>
> * Two threads in a process.  SIGNAL_STOP_STOPPED.
>
> * PTRACE_ATTACH on both threads and both are PTRACE_CONT'd.
>
> * A stopsig is delivered to one of them which initiates group stop.
>
> * Both ptracers notice the tracees participating in the group stop.

Sure, this is correct. But I meant another case. In the scenario
above, suppose that only one thread was PTRACE_CONT'ed.

Again, again, again. This is very minor. I do not mean it makes sense
to change this, at least right now. Just I have a vague feeling we
can cleanup this a little bit later.

For example. Ignoring task_participate_group_stop(), why
task_clear_group_stop_pending() preserves the GROUP_STOP_SIGMASK ?
This doesn't hurt, sure. But looks a bit inconsistent.

Anyway, please forget. Right now this is off-topic.

> > Hmm. it turns out "group_stop & GROUP_STOP_SIGMASK" is only needed
> > to handle this special case: if debugger PTRACE_CONT's or more
> > stopped tracees and then som thread initiates the stop again, other
> > threads need to know that "signr". Otherwise this part of ->group_stop
> > is only valid "inside" the retry loop in do_signal_stop(), it can
> > be a local variable. I wonder if we can simply report SIGSTOP in
> > this case and kill the GROUP_STOP_SIGMASK logic. Nevermind.
>
> I think there was a reason why it couldn't be a local variable.  Let's
> see if I can remember it.  Ah, okay, please consider the following
> scenario.  This one needs to be documented.
>
> * A thread does group stop and the parent consumed exit code.
>
> * ptracer attaches and sees the group stop signal.
>
> * PTRACE_CONT and the thread leaves do_signal_stop().
>
> * PTRACE_DETACH.  The thread returns to do_signal_stop() and re-enters
>   TASK_STOPPED.
>
> * Another ptracer does PTRACE_ATTACH.
>
> The second ptracer wants to know the signo too but if it were stored
> in a local variable, it wouldn't be available anywhere.

Yes, sure. But this is basically the same case. My point was, this
signr must be correct inside the retry loop, otherwise we could just
report SIGSTOP because we can't guarantee the "correct" signr anyway.

Let's look at your example again. Suppose that the process was stopped
by SIGSTOP.

When the first ptracer attaches, each thread correctly reports SIGSTOP.
But if it plays with PTRACE_CONT and then detaches, the next ptracer can
see, say, SIGTTIN. And different threads can report different signals.

Again, again, again. I do not want to delay this changes. But _perhaps_
we can cleanup this a bit later.

OK, lets keep GROUP_STOP_SIGMASK. Then, perhaps, we should never change
"group_stop & GROUP_STOP_SIGMASK" if it is non-zero. PTRACE_CONT should
clear it.

The only special case is stop->PTRACE_CONT->do_signal_stop(). In this
case the tracee should keep its "group_stop & GROUP_STOP_SIGMASK" but
use "int signr" passed to do_signal_stop() as an argument for
ptrace_stop().

Not that I seriously think this needs the attention. In any case,
not now.

> > And. I think this code does something we do not really want. Why do
> > we _always_ ask the task_is_traced() threads to participate?
> >
> > 2 threads T1 and T2, both stopped. they are TASK_TRACED, I mean
> > SIGNAL_STOP_STOPPED is stopped and both have already participated.
> >
> > Debuggere PTRACE_CONTs T1, then it calls do_signal_stop() again
> > and sets T2->group_stop = GROUP_STOP_PENDING | GROUP_STOP_CONSUME.
> > This T2->group_stop doesn't look right, we can report the wrong
> > extra CLD_STOPPED after detach while ->group_exit_code can be 0.
> > I think that !task_ptrace(current) case in do_signal_stop() should
> > take SIGNAL_STOP_STOPPED into account, but perhaps we need another
> > GROUP_STOP_REPORTED bit, I am not sure.
>
> Isn't this addressed by the last patch in this thread?

Yes, it was.

> > Or, if debugger PTRACE_CONT's T2, it will report another
> > ptrace_stop(CLD_STOPPED) immediately, this differs from the current
> > behaviour although probably we do not care.
>
> This was changed by "signal: Use GROUP_STOP_PENDING to stop once for a
> single group stop".

Not sure I understand... We are setting GROUP_STOP_PENDING | CONSUME
again. T2 has already reported ptrace_stop(CLD_STOPPED) to the tracer.
It is stopped. Now it will report another CLD_STOPPED after PTRACE_CONT.

Oleg.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/8] job control: Fix ptracer wait(2) hang and explain notask_error clearing
  2011-03-21 16:12     ` Tejun Heo
@ 2011-03-22 19:08       ` Oleg Nesterov
  0 siblings, 0 replies; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-22 19:08 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/21, Tejun Heo wrote:
>
> On Mon, Mar 21, 2011 at 04:19:41PM +0100, Oleg Nesterov wrote:
> > But the main problem is, I do not think do_wait() should block in this
> > case, and thus I am starting to think this patch is not "complete".

Just in case... But of course I didn't mean this patch should be
updated to handle the EXIT_ZOMBIE case.

> > Your test-case could use waitid(WEXITED) instead WSTOPPED with the same
> > result, it should hang. Why it hangs? The tracee is dead, we can't do
> > ptrace(PTRACE_DETACH), and we can do nothing until other threads exit.
> > This looks equally strange.
> >
> > IOW. Assuming that ptrace == T and WEXITED is set, perhaps we should
> > do something like this pseudo-code
> >
> > 	if (p->exit_state == EXIT_ZOMBIE) {
> > 		if (!delay_group_leader(p))
> > 			return wait_task_zombie(wo, p);
> >
> > 		ptrace_unlink();
> > 		wait_task_zombie(WNOWAIT);
> > 	}
> >
> > However. This is another user-visible change, we need another discussion
> > even if I am right. In particular, it is not clear what should we do
> > if parent == real_parent. And probably this can confuse gdb, but iirc
> > gdb already have the problems with the dead leader anyway.
>
> Interesting point.  Yeah, I agree.  wait(WEXITED) from the ptracer
> should only wait for the tracee itself, not the group.  When they are
> one and the same, I don't think we need to do anything differently
> from now.
>
> If we change the behavior that way, it would also fit better with the
> rest of the new behavior where the real parent and ptracer have
> separate roles when wait(2)ing for stopped states.
>
> The question is how the change would affect the existing users.

Yes, of course. Perhaps we can never do this.

> When
> the debugee is a direct child, nothing will change.

Actually, I think this is the most problematic case... Perhaps
it would be safer to add WEXITED_THREAD for ptrace. I dunno.

> When attaching to
> a separate group, I don't think it even matters.  Does gdb handle
> group leader any differently from the rest when attached to an
> unrelated group?

gdb certainly has some problems with the dead leaders. But I can't
recall what exactly. Will try to check later...

In any case, I only tried to discuss what else we can do with the
current strange semantics. When it comes to ptrace, group_leader
should not represent the whole process.

Oleg.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace
  2011-03-22  8:04     ` Tejun Heo
@ 2011-03-22 19:44       ` Oleg Nesterov
  2011-03-23  9:17         ` Tejun Heo
  0 siblings, 1 reply; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-22 19:44 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/22, Tejun Heo wrote:
>
> On Mon, Mar 21, 2011 at 06:43:06PM +0100, Oleg Nesterov wrote:
> > Hmm... in fact I can't convince myself we really need to look at
> > child->group_leader, will recheck... Anyway, this is minor too.
>
> Care to elaborate?

child->real_parent must be equal to child->group_leader->real_parent.

What we need is ptrace_reparented(). But as I said, it should be
fixed first (I think) and this needs a separate patch/discussion.

So yes, we need a new helper.

Well, and look at get_signal_to_deliver(),

		leader = current->group_leader;
		if (task_ptrace(leader) && !real_parent_is_ptracer(leader))
			do_notify_parent_cldstop(leader, true, why);

Indeed. But in fact task_ptrace(leader) is not needed. If
real_parent_is_ptracer(tsk) returns false, then tsk must be ptraced.

So we could write

		if (!real_parent_is_ptracer(leader))
			do_notify_parent_cldstop(leader, true, why);

but now this looks confusing because of the naming. Compare with

		if (ptrace_reparented(leader))
			do_notify_parent_cldstop(leader);

Oleg.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0.2/8] ptrace: Always put ptracee into appropriate execution state
  2011-03-22 10:20 ` [PATCH 0.2/8] ptrace: Always put ptracee into appropriate execution state Tejun Heo
@ 2011-03-22 20:33   ` Oleg Nesterov
  2011-03-23  8:00     ` Tejun Heo
  0 siblings, 1 reply; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-22 20:33 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/22, Tejun Heo wrote:
>
> This patch updates __ptrace_unlink() such that GROUP_STOP_PENDING is
> reinstated regardless of the ptracee's current state as long as it's
> alive and makes sure that signal_wake_up() is called if execution
> state transition is necessary.

Looks correct (and the previous one too).

But I don't understand the PF_EXITING check,

> +	/*
> +	 * Reinstate GROUP_STOP_PENDING if group stop is in effect and
> +	 * @child isn't dead.
> +	 */
> +	if (!(child->flags & PF_EXITING) &&
> +	    (child->signal->flags & SIGNAL_STOP_STOPPED ||
> +	     child->signal->group_stop_count))
> +		child->group_stop |= GROUP_STOP_PENDING;

Why do we need to filter out PF_EXITING tasks? This doesn't look
strictly necessary. And note that exit_signals() doesn't always
take ->siglock, we can race anyway.

> +	 * Note that @resume should be used iff @child
> +	 * is in TASK_TRACED; otherwise, we might unduly disrupt
> +	 * TASK_KILLABLE sleeps.

Yes. but, just in case,

> +	 */
> +	if (child->group_stop & GROUP_STOP_PENDING || task_is_traced(child))
> +		signal_wake_up(child, task_is_traced(child));

signal_wake_up() is not needed if task_is_traced(). Even if we added
GROUP_STOP_PENDING, ptrace_stop() does recalc_sigpending_tsk() anyway
before return.

So we could do

	if (SIGNAL_STOP_STOPPED || group_stop_count) {
		child->group_stop |= GROUP_STOP_PENDING;
		signal_wake_up(child, 0);
	}

	if (task_is_traced(child))
		wake_up_state(TASK_TRACED);

But probably a single wakeup looks more simple/clean, so I agree.

Oleg.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced
  2011-03-22 13:11 ` [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
@ 2011-03-22 20:59   ` Oleg Nesterov
  2011-03-23  8:48     ` Tejun Heo
  0 siblings, 1 reply; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-22 20:59 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/22, Tejun Heo wrote:
>
> Hello, Oleg.
>
> The updated patches should address the issues pointed out in your
> reviews, at least the ones which are appropriate to address now.

Yes, thanks. "UPDATED" patches look fine. Please feel free to add my
reviewed-by to the whole series.

I'd only ask you to remove ->group_leader from real_parent_is_ptracer()
(btw, note that wait_consider_task() can use this helper too) although
this is minor.

> Please let me know when you want the whole series resent.

Cough. Resent? If you ask me, I think you should just send them to
Linus while the merge window is opened.

I am trying to get the account on kernel.org, but until then I hope
you can flush the pending ptrace-review branches.

In case I misunderstood you... I think that the whole series is good
and the patches should not be updated. Just for example, I'd like to
rename real_parent_is_ptracer (and remove ->group_leader), but we can
make another patch for that.

And yes, I bet we missed something and we will need the subsequent
fixes anyway ;)

Oleg.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0.2/8] ptrace: Always put ptracee into appropriate execution state
  2011-03-22 20:33   ` Oleg Nesterov
@ 2011-03-23  8:00     ` Tejun Heo
  0 siblings, 0 replies; 47+ messages in thread
From: Tejun Heo @ 2011-03-23  8:00 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

Hello,

On Tue, Mar 22, 2011 at 09:33:22PM +0100, Oleg Nesterov wrote:
> > +	/*
> > +	 * Reinstate GROUP_STOP_PENDING if group stop is in effect and
> > +	 * @child isn't dead.
> > +	 */
> > +	if (!(child->flags & PF_EXITING) &&
> > +	    (child->signal->flags & SIGNAL_STOP_STOPPED ||
> > +	     child->signal->group_stop_count))
> > +		child->group_stop |= GROUP_STOP_PENDING;
> 
> Why do we need to filter out PF_EXITING tasks? This doesn't look
> strictly necessary. And note that exit_signals() doesn't always
> take ->siglock, we can race anyway.

I don't think it's strictly necessary either but in the usual group
stop initiation path in do_signal_stop(), PF_EXITING is checked before
setting PENDING, so I think it's better to remain consistent with
that.

> > +	 * Note that @resume should be used iff @child
> > +	 * is in TASK_TRACED; otherwise, we might unduly disrupt
> > +	 * TASK_KILLABLE sleeps.
> 
> Yes. but, just in case,
> 
> > +	 */
> > +	if (child->group_stop & GROUP_STOP_PENDING || task_is_traced(child))
> > +		signal_wake_up(child, task_is_traced(child));
> 
> signal_wake_up() is not needed if task_is_traced(). Even if we added
> GROUP_STOP_PENDING, ptrace_stop() does recalc_sigpending_tsk() anyway
> before return.
> 
> So we could do
> 
> 	if (SIGNAL_STOP_STOPPED || group_stop_count) {
> 		child->group_stop |= GROUP_STOP_PENDING;
> 		signal_wake_up(child, 0);
> 	}
> 
> 	if (task_is_traced(child))
> 		wake_up_state(TASK_TRACED);
> 
> But probably a single wakeup looks more simple/clean, so I agree.

Yeah, I'd like to avoid mixing calls to wake_up_state() and
signal_wake_up().  It gets confusing and there's no reason for
micro-optimization here.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/8] job control: Don't set group_stop exit_code if re-entering job control stop
  2011-03-22 18:44       ` Oleg Nesterov
@ 2011-03-23  8:44         ` Tejun Heo
  2011-03-23 16:40           ` Oleg Nesterov
  0 siblings, 1 reply; 47+ messages in thread
From: Tejun Heo @ 2011-03-23  8:44 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

Hello,

On Tue, Mar 22, 2011 at 07:44:15PM +0100, Oleg Nesterov wrote:
> Suppose that debugger PTRACE_CONTs the stopped thread and then it
> gets SIGSTOP and calls do_signal_stop() again. In theory this all is
> possible before SIGNAL_STOP_STOPPED. This can confuse real_parent.
> Say, real_parent itself sends SIGTTIN to the child and naturally
> expects WIFSTOPPED() == SIGTTIN.

Hmmm... There are two competing signals in that case - SIGTTIN sent by
the parent and SIGSTOP sent by someone else.  The parent doesn't have
any control over which signal wins the race nor is it visible in any
way before the group stop is complete.  The two signals are
equivalent.

I can't think of a scenario where the parent would be able to
differentiate the two signals (in the sense that it can say the latter
is the wrong signal).  If you can, please share.

> But again, even if I am right this is minor and we can change this
> later. And btw, I think we should not change/set ->group_stop_count
> if SIGNAL_STOP_STOPPED || group_stop_count. We will see.

Yeah, maybe.  We would be suppressing group stop completion anyway so
the count doesn't make whole lot of difference.  The only difference
would be that group stop completion would be delayed if stop signals
are continuously delievered by ptracee's in the thread group.  I don't
think it matters one way or the other.

> For example. Ignoring task_participate_group_stop(), why
> task_clear_group_stop_pending() preserves the GROUP_STOP_SIGMASK ?
> This doesn't hurt, sure. But looks a bit inconsistent.

Yeah, it probably is better to clear the signo too.

> > * A thread does group stop and the parent consumed exit code.
> >
> > * ptracer attaches and sees the group stop signal.
> >
> > * PTRACE_CONT and the thread leaves do_signal_stop().
> >
> > * PTRACE_DETACH.  The thread returns to do_signal_stop() and re-enters
> >   TASK_STOPPED.
> >
> > * Another ptracer does PTRACE_ATTACH.
> >
> > The second ptracer wants to know the signo too but if it were stored
> > in a local variable, it wouldn't be available anywhere.
> 
> Yes, sure. But this is basically the same case. My point was, this
> signr must be correct inside the retry loop, otherwise we could just
> report SIGSTOP because we can't guarantee the "correct" signr anyway.
> 
> Let's look at your example again. Suppose that the process was stopped
> by SIGSTOP.
> 
> When the first ptracer attaches, each thread correctly reports SIGSTOP.
> But if it plays with PTRACE_CONT and then detaches, the next ptracer can
> see, say, SIGTTIN. And different threads can report different signals.

I think it depends on how you look at it.  Group stop behavior is
different under ptrace anyway.  We allow PTRACE_CONT'd tracees to
re-initiate and re-enter the existing group stop.  Which signo is the
right one to report is debatable, but,

* For the tracee which delivers the signal which re-initiates the
  existing group stop, it would be just weird to report different
  signo from the one delivered.

* It would also be weird to report different signo for other tracees
  in the group depending on whether they were running at the time of
  re-initiation or not.

So, I think just defining the stop signo to be reported while ptraced
as "the latest group stop signal delivered" isn't too bad.  It's
simple and more consistent with the existing behavior.

> > > Or, if debugger PTRACE_CONT's T2, it will report another
> > > ptrace_stop(CLD_STOPPED) immediately, this differs from the current
> > > behaviour although probably we do not care.
> >
> > This was changed by "signal: Use GROUP_STOP_PENDING to stop once for a
> > single group stop".
> 
> Not sure I understand... We are setting GROUP_STOP_PENDING | CONSUME
> again. T2 has already reported ptrace_stop(CLD_STOPPED) to the tracer.
> It is stopped. Now it will report another CLD_STOPPED after PTRACE_CONT.

Okay, I see.  Maybe we should discern between traced for group stop
from other traps but then again given the group stop re-entering while
ptraced it can be considered a relatively consistent behavior.  Yeah,
but probably better to remove the double reporting.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced
  2011-03-22 20:59   ` Oleg Nesterov
@ 2011-03-23  8:48     ` Tejun Heo
  0 siblings, 0 replies; 47+ messages in thread
From: Tejun Heo @ 2011-03-23  8:48 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

Hello,

> I'd only ask you to remove ->group_leader from real_parent_is_ptracer()
> (btw, note that wait_consider_task() can use this helper too) although
> this is minor.

Yeah, will also rename the function as suggested.

> > Please let me know when you want the whole series resent.
> 
> Cough. Resent? If you ask me, I think you should just send them to
> Linus while the merge window is opened.

Too late for that.  Let's target the next merge window.

> I am trying to get the account on kernel.org, but until then I hope
> you can flush the pending ptrace-review branches.

I'll set up a branch until you can get your tree up and request
inclusion to linux-next.

> And yes, I bet we missed something and we will need the subsequent
> fixes anyway ;)

Oh yeah, I have no doubt about that.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace
  2011-03-22 19:44       ` Oleg Nesterov
@ 2011-03-23  9:17         ` Tejun Heo
  2011-03-23  9:24           ` Tejun Heo
  0 siblings, 1 reply; 47+ messages in thread
From: Tejun Heo @ 2011-03-23  9:17 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

Hey, Oleg.

On Tue, Mar 22, 2011 at 08:44:16PM +0100, Oleg Nesterov wrote:
> On 03/22, Tejun Heo wrote:
> >
> > On Mon, Mar 21, 2011 at 06:43:06PM +0100, Oleg Nesterov wrote:
> > > Hmm... in fact I can't convince myself we really need to look at
> > > child->group_leader, will recheck... Anyway, this is minor too.
> >
> > Care to elaborate?
> 
> child->real_parent must be equal to child->group_leader->real_parent.

Okay.

> What we need is ptrace_reparented(). But as I said, it should be
> fixed first (I think) and this needs a separate patch/discussion.
> 
> So yes, we need a new helper.
> 
> Well, and look at get_signal_to_deliver(),
> 
> 		leader = current->group_leader;
> 		if (task_ptrace(leader) && !real_parent_is_ptracer(leader))
> 			do_notify_parent_cldstop(leader, true, why);
> 
> Indeed. But in fact task_ptrace(leader) is not needed. If
> real_parent_is_ptracer(tsk) returns false, then tsk must be ptraced.
> 
> So we could write
> 
> 		if (!real_parent_is_ptracer(leader))
> 			do_notify_parent_cldstop(leader, true, why);
> 
> but now this looks confusing because of the naming. Compare with
> 
> 		if (ptrace_reparented(leader))
> 			do_notify_parent_cldstop(leader);

I see but I don't think ptrace_reparented() is a good name for
!same_thread_group(child->parent, child->real_parent).  It gets too
confusing with child->parent != child->real_parent test.

Anyways, I'll leave the renaming for later patch and just update the
->group_leader part.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace
  2011-03-23  9:17         ` Tejun Heo
@ 2011-03-23  9:24           ` Tejun Heo
  2011-03-23 16:46             ` Oleg Nesterov
  0 siblings, 1 reply; 47+ messages in thread
From: Tejun Heo @ 2011-03-23  9:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On Wed, Mar 23, 2011 at 10:17:36AM +0100, Tejun Heo wrote:
> Hey, Oleg.
> 
> On Tue, Mar 22, 2011 at 08:44:16PM +0100, Oleg Nesterov wrote:
> > On 03/22, Tejun Heo wrote:
> > >
> > > On Mon, Mar 21, 2011 at 06:43:06PM +0100, Oleg Nesterov wrote:
> > > > Hmm... in fact I can't convince myself we really need to look at
> > > > child->group_leader, will recheck... Anyway, this is minor too.
> > >
> > > Care to elaborate?
> > 
> > child->real_parent must be equal to child->group_leader->real_parent.
> 
> Okay.

Ooh, on a related note, we probably want to change
do_notify_parent_cldstop() too.  tsk->group_leader->real_parent is
used as the delivery target when !@for_ptracer.  This is the same with
tsk->real_parent and the code has been like this for a long time but
is a bit confusing.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/8] job control: Don't set group_stop exit_code if re-entering job control stop
  2011-03-23  8:44         ` Tejun Heo
@ 2011-03-23 16:40           ` Oleg Nesterov
  2011-03-23 17:02             ` Tejun Heo
  0 siblings, 1 reply; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-23 16:40 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/23, Tejun Heo wrote:
>
> Hello,
>
> On Tue, Mar 22, 2011 at 07:44:15PM +0100, Oleg Nesterov wrote:
> > Suppose that debugger PTRACE_CONTs the stopped thread and then it
> > gets SIGSTOP and calls do_signal_stop() again. In theory this all is
> > possible before SIGNAL_STOP_STOPPED. This can confuse real_parent.
> > Say, real_parent itself sends SIGTTIN to the child and naturally
> > expects WIFSTOPPED() == SIGTTIN.
>
> Hmmm... There are two competing signals in that case - SIGTTIN sent by
> the parent and SIGSTOP sent by someone else.

"someone else" can be PTRACE_CONT(SIGSTOP) from the debugger.

> I can't think of a scenario where the parent would be able to
> differentiate the two signals (in the sense that it can say the latter
> is the wrong signal).  If you can, please share.

I didn't mean this is really wrong or can lead to some problems.
Just this looks inconsistent a bit.

> > Not sure I understand... We are setting GROUP_STOP_PENDING | CONSUME
> > again. T2 has already reported ptrace_stop(CLD_STOPPED) to the tracer.
> > It is stopped. Now it will report another CLD_STOPPED after PTRACE_CONT.
>
> Okay, I see.  Maybe we should discern between traced for group stop
> from other traps but then again given the group stop re-entering while
> ptraced it can be considered a relatively consistent behavior.  Yeah,
> but probably better to remove the double reporting.

Yes. I have a vague feeling a new GROUP_STOP_YES_I_AM_STOPPED can
be useful anyway. It should be set by task_participate_group_stop()
if the task participates. We will see.

Oleg.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace
  2011-03-23  9:24           ` Tejun Heo
@ 2011-03-23 16:46             ` Oleg Nesterov
  2011-03-23 16:59               ` Tejun Heo
  0 siblings, 1 reply; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-23 16:46 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/23, Tejun Heo wrote:
>
> On Wed, Mar 23, 2011 at 10:17:36AM +0100, Tejun Heo wrote:
> > >
> > > child->real_parent must be equal to child->group_leader->real_parent.
> >
> > Okay.
>
> Ooh, on a related note, we probably want to change
> do_notify_parent_cldstop() too.  tsk->group_leader->real_parent is
> used as the delivery target when !@for_ptracer.  This is the same with
> tsk->real_parent and the code has been like this for a long time but
> is a bit confusing.

Yes, although in this case we do

		tsk = tsk->group_leader;
		parent = tsk->real_parent;

We need to change tsk to report the correct si_pid. But we could do

		parent = tsk->real_parent;
		tsk = tsk->group_leader;

not sure this looks less confusing.

Oleg.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace
  2011-03-23 16:46             ` Oleg Nesterov
@ 2011-03-23 16:59               ` Tejun Heo
  2011-03-23 17:07                 ` Oleg Nesterov
  0 siblings, 1 reply; 47+ messages in thread
From: Tejun Heo @ 2011-03-23 16:59 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

Hello,

On Wed, Mar 23, 2011 at 05:46:57PM +0100, Oleg Nesterov wrote:
> > Ooh, on a related note, we probably want to change
> > do_notify_parent_cldstop() too.  tsk->group_leader->real_parent is
> > used as the delivery target when !@for_ptracer.  This is the same with
> > tsk->real_parent and the code has been like this for a long time but
> > is a bit confusing.
> 
> Yes, although in this case we do
> 
> 		tsk = tsk->group_leader;
> 		parent = tsk->real_parent;
> 
> We need to change tsk to report the correct si_pid. But we could do
> 
> 		parent = tsk->real_parent;
> 		tsk = tsk->group_leader;
> 
> not sure this looks less confusing.

They themselves are about the same but the inconsistency with the
is_real_parent() test is a bit confusing, I think.  If we're testing
for duplicates by testing ->real_parent against ->parent then the
actual delivery should be also be using ->real_parent and ->parent,
so...

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/8] job control: Don't set group_stop exit_code if re-entering job control stop
  2011-03-23 16:40           ` Oleg Nesterov
@ 2011-03-23 17:02             ` Tejun Heo
  2011-03-23 17:09               ` Oleg Nesterov
  0 siblings, 1 reply; 47+ messages in thread
From: Tejun Heo @ 2011-03-23 17:02 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

Hey,

On Wed, Mar 23, 2011 at 05:40:14PM +0100, Oleg Nesterov wrote:
> > Hmmm... There are two competing signals in that case - SIGTTIN sent by
> > the parent and SIGSTOP sent by someone else.
> 
> "someone else" can be PTRACE_CONT(SIGSTOP) from the debugger.

The debugger should be doing PTRACE_CONT(notified_signo).  Am I
missing something?  Anyways, this doesn't really matter.  I kinda like
just testing STOP_STOPPED both in stop initiation and notification
paths but well it's no biggie either way.

> > Okay, I see.  Maybe we should discern between traced for group stop
> > from other traps but then again given the group stop re-entering while
> > ptraced it can be considered a relatively consistent behavior.  Yeah,
> > but probably better to remove the double reporting.
> 
> Yes. I have a vague feeling a new GROUP_STOP_YES_I_AM_STOPPED can
> be useful anyway. It should be set by task_participate_group_stop()
> if the task participates. We will see.

Yeah, agreed.  Let's mark it and ensure that a task doesn't stop for
group stop back-to-back.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace
  2011-03-23 16:59               ` Tejun Heo
@ 2011-03-23 17:07                 ` Oleg Nesterov
  2011-03-23 17:20                   ` Tejun Heo
  0 siblings, 1 reply; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-23 17:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/23, Tejun Heo wrote:
>
> Hello,
>
> On Wed, Mar 23, 2011 at 05:46:57PM +0100, Oleg Nesterov wrote:
> > > Ooh, on a related note, we probably want to change
> > > do_notify_parent_cldstop() too.  tsk->group_leader->real_parent is
> > > used as the delivery target when !@for_ptracer.  This is the same with
> > > tsk->real_parent and the code has been like this for a long time but
> > > is a bit confusing.
> >
> > Yes, although in this case we do
> >
> > 		tsk = tsk->group_leader;
> > 		parent = tsk->real_parent;
> >
> > We need to change tsk to report the correct si_pid. But we could do
> >
> > 		parent = tsk->real_parent;
> > 		tsk = tsk->group_leader;
> >
> > not sure this looks less confusing.
>
> They themselves are about the same but the inconsistency with the
> is_real_parent() test is a bit confusing, I think.  If we're testing
> for duplicates by testing ->real_parent against ->parent then the
> actual delivery should be also be using ->real_parent and ->parent,
> so...

Hmm. So I misunderstood you. And still can't understand...

Could you explain how should we change do_notify_parent_cldstop()?
I am just curious.

Oleg.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/8] job control: Don't set group_stop exit_code if re-entering job control stop
  2011-03-23 17:02             ` Tejun Heo
@ 2011-03-23 17:09               ` Oleg Nesterov
  2011-03-23 17:22                 ` Tejun Heo
  0 siblings, 1 reply; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-23 17:09 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/23, Tejun Heo wrote:
>
> Hey,
>
> On Wed, Mar 23, 2011 at 05:40:14PM +0100, Oleg Nesterov wrote:
> > > Hmmm... There are two competing signals in that case - SIGTTIN sent by
> > > the parent and SIGSTOP sent by someone else.
> >
> > "someone else" can be PTRACE_CONT(SIGSTOP) from the debugger.
>
> The debugger should be doing PTRACE_CONT(notified_signo).

Why? it can change the signal.

> Am I
> missing something?  Anyways, this doesn't really matter. I kinda like
> just testing STOP_STOPPED both in stop initiation and notification
> paths but well it's no biggie either way.

OK, agreed, please forget.

Oleg.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace
  2011-03-23 17:20                   ` Tejun Heo
@ 2011-03-23 17:17                     ` Oleg Nesterov
  0 siblings, 0 replies; 47+ messages in thread
From: Oleg Nesterov @ 2011-03-23 17:17 UTC (permalink / raw)
  To: Tejun Heo
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

On 03/23, Tejun Heo wrote:
>
> > Hmm. So I misunderstood you. And still can't understand...
> >
> > Could you explain how should we change do_notify_parent_cldstop()?
> > I am just curious.
>
> As you suggested above. :-) The inconsistency I'm concernced about is
> the following.

Ah! got it. I misread you email as if I missed your initial point, sorry.

Oleg.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace
  2011-03-23 17:07                 ` Oleg Nesterov
@ 2011-03-23 17:20                   ` Tejun Heo
  2011-03-23 17:17                     ` Oleg Nesterov
  0 siblings, 1 reply; 47+ messages in thread
From: Tejun Heo @ 2011-03-23 17:20 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

Hello,

On Wed, Mar 23, 2011 at 06:07:08PM +0100, Oleg Nesterov wrote:
> On 03/23, Tejun Heo wrote:
> > On Wed, Mar 23, 2011 at 05:46:57PM +0100, Oleg Nesterov wrote:
> > > > Ooh, on a related note, we probably want to change
> > > > do_notify_parent_cldstop() too.  tsk->group_leader->real_parent is
> > > > used as the delivery target when !@for_ptracer.  This is the same with
> > > > tsk->real_parent and the code has been like this for a long time but
> > > > is a bit confusing.
> > >
> > > Yes, although in this case we do
> > >
> > > 		tsk = tsk->group_leader;
> > > 		parent = tsk->real_parent;
> > >
> > > We need to change tsk to report the correct si_pid. But we could do
> > >
> > > 		parent = tsk->real_parent;
> > > 		tsk = tsk->group_leader;
> > >
> > > not sure this looks less confusing.
> >
> > They themselves are about the same but the inconsistency with the
> > is_real_parent() test is a bit confusing, I think.  If we're testing
> > for duplicates by testing ->real_parent against ->parent then the
> > actual delivery should be also be using ->real_parent and ->parent,
> > so...
> 
> Hmm. So I misunderstood you. And still can't understand...
> 
> Could you explain how should we change do_notify_parent_cldstop()?
> I am just curious.

As you suggested above. :-) The inconsistency I'm concernced about is
the following.

 /*
  * Test whether the target task of the usual cldstop notification - the
  * real_parent of @child - is in the same group as the ptracer.
  */
 static bool real_parent_is_ptracer(struct task_struct *child)
 {
	 return same_thread_group(child->parent, child->real_parent);
 }

So, here, we're saying that if child->parent and child->real_parent
are in the same thread group, they're gonna be duplicates, but in
do_notify_parent_cldstop(), we target tsk->group_leader->real_parent,
which should be the same as tsk->real_parent but confusing
nonetheless.  That's actually the reason why I used
child->group_leader->real_parent in the first place.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/8] job control: Don't set group_stop exit_code if re-entering job control stop
  2011-03-23 17:09               ` Oleg Nesterov
@ 2011-03-23 17:22                 ` Tejun Heo
  0 siblings, 0 replies; 47+ messages in thread
From: Tejun Heo @ 2011-03-23 17:22 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: roland, jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan

Hello,

On Wed, Mar 23, 2011 at 06:09:20PM +0100, Oleg Nesterov wrote:
> On 03/23, Tejun Heo wrote:
> > > "someone else" can be PTRACE_CONT(SIGSTOP) from the debugger.
> >
> > The debugger should be doing PTRACE_CONT(notified_signo).
> 
> Why? it can change the signal.

Yeah, sure, it can but then it gets to live with the consequences,
which is the changed signal being reported.  In fact, in such case, I
think reporting the changed signal is the right thing to do(tm).

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2011-03-23 17:26 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-03-08 19:56 [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
2011-03-08 19:56 ` [PATCH 1/8] job control: Don't set group_stop exit_code if re-entering job control stop Tejun Heo
2011-03-21 13:20   ` Oleg Nesterov
2011-03-21 15:52     ` Tejun Heo
2011-03-22 18:44       ` Oleg Nesterov
2011-03-23  8:44         ` Tejun Heo
2011-03-23 16:40           ` Oleg Nesterov
2011-03-23 17:02             ` Tejun Heo
2011-03-23 17:09               ` Oleg Nesterov
2011-03-23 17:22                 ` Tejun Heo
2011-03-08 19:56 ` [PATCH 2/8] job control: Small reorganization of wait_consider_task() Tejun Heo
2011-03-08 19:56 ` [PATCH 3/8] job control: Fix ptracer wait(2) hang and explain notask_error clearing Tejun Heo
2011-03-21 15:19   ` Oleg Nesterov
2011-03-21 16:09     ` Oleg Nesterov
2011-03-21 16:12     ` Tejun Heo
2011-03-22 19:08       ` Oleg Nesterov
2011-03-22 10:51   ` [PATCH UPDATED " Tejun Heo
2011-03-08 19:56 ` [PATCH 4/8] job control: Allow access to job control events through ptracees Tejun Heo
2011-03-21 16:39   ` Oleg Nesterov
2011-03-21 17:20     ` Tejun Heo
2011-03-22 11:10   ` [PATCH UPDATED " Tejun Heo
2011-03-08 19:56 ` [PATCH 5/8] job control: Add @for_ptrace to do_notify_parent_cldstop() Tejun Heo
2011-03-08 19:56 ` [PATCH 6/8] job control: Job control stop notifications should always go to the real parent Tejun Heo
2011-03-21 17:12   ` Oleg Nesterov
2011-03-08 19:56 ` [PATCH 7/8] job control: Notify the real parent of job control events regardless of ptrace Tejun Heo
2011-03-21 17:43   ` Oleg Nesterov
2011-03-22  8:04     ` Tejun Heo
2011-03-22 19:44       ` Oleg Nesterov
2011-03-23  9:17         ` Tejun Heo
2011-03-23  9:24           ` Tejun Heo
2011-03-23 16:46             ` Oleg Nesterov
2011-03-23 16:59               ` Tejun Heo
2011-03-23 17:07                 ` Oleg Nesterov
2011-03-23 17:20                   ` Tejun Heo
2011-03-23 17:17                     ` Oleg Nesterov
2011-03-22 11:30   ` [PATCH UPDATED " Tejun Heo
2011-03-08 19:56 ` [PATCH 8/8] job control: Don't send duplicate job control stop notification while ptraced Tejun Heo
2011-03-21 17:48   ` Oleg Nesterov
2011-03-08 20:01 ` [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent " Linus Torvalds
2011-03-09 16:50 ` Oleg Nesterov
2011-03-22 10:20 ` [PATCH 0.1/8] ptrace: Collapse ptrace_untrace() into __ptrace_unlink() Tejun Heo
2011-03-22 10:20 ` [PATCH 0.2/8] ptrace: Always put ptracee into appropriate execution state Tejun Heo
2011-03-22 20:33   ` Oleg Nesterov
2011-03-23  8:00     ` Tejun Heo
2011-03-22 13:11 ` [RFC PATCHSET] ptrace,signal: Fix notifications to the real parent while ptraced Tejun Heo
2011-03-22 20:59   ` Oleg Nesterov
2011-03-23  8:48     ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).