linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/3] signal: Make signal_wake_up() take @sig_type instead of @resume
@ 2011-03-29 14:46 Tejun Heo
  2011-03-29 14:46 ` [PATCH 2/3] signal, ptrace: Add SIGTRAP signal_wake_up() Tejun Heo
  2011-03-29 18:27 ` [PATCH 1/3] signal: Make signal_wake_up() take @sig_type instead of @resume Oleg Nesterov
  0 siblings, 2 replies; 16+ messages in thread
From: Tejun Heo @ 2011-03-29 14:46 UTC (permalink / raw)
  To: oleg, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, roland

signal_wake_up() currently takes boolean @resume which indicates that
the target task should be woken up with %TASK_WAKEKILL.  Replace the
argument with @sig_type and use %SIGKILL to indicate %TASK_WAKEKILL
wakeups.

This is to prepare for adding more signal wake up types.  All users
are converted to use %SIGKILL instead of 1 and this patch doesn't
cause any behavior difference.

While at it, convert to docbook function comment.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
Hello, guys.

These three patches implement fix for the SIGCONT notification corner
case Oleg pointed out in the following thread.

 http://thread.gmane.org/gmane.linux.kernel/1116692/focus=1117000

In the process, signal_wake_up() is beefed up to handle most of child
wake up paths in signal and ptrace and I think the resulting code is
easier to comprehend and slightly less error-prone.

It may call in child into signal delivery path where it doesn't need
to but such cases are by no means common and I don't think there's any
performance implication.  Please read the patch description on the
third patch for details.

Oleg, what do you think?

Thanks.

 fs/exec.c             |    2 +-
 include/linux/sched.h |    2 +-
 kernel/ptrace.c       |    8 +++---
 kernel/signal.c       |   66 ++++++++++++++++++++++++++++++++-----------------
 4 files changed, 49 insertions(+), 29 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 8328beb..63e726d 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1662,7 +1662,7 @@ static int zap_process(struct task_struct *start, int exit_code)
 		task_clear_group_stop_pending(t);
 		if (t != current && t->mm) {
 			sigaddset(&t->pending.signal, SIGKILL);
-			signal_wake_up(t, 1);
+			signal_wake_up(t, SIGKILL);
 			nr++;
 		}
 	} while_each_thread(start, t);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 456d80e..4c30c00 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2488,7 +2488,7 @@ static inline void thread_group_cputime_init(struct signal_struct *sig)
 extern void recalc_sigpending_and_wake(struct task_struct *t);
 extern void recalc_sigpending(void);
 
-extern void signal_wake_up(struct task_struct *t, int resume_stopped);
+extern void signal_wake_up(struct task_struct *t, int sig_type);
 
 /*
  * Wrappers for p->thread_info->cpu access. No-op on UP.
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 4348586..ec8cce6 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -86,12 +86,12 @@ void __ptrace_unlink(struct task_struct *child)
 
 	/*
 	 * If transition to TASK_STOPPED is pending or in TASK_TRACED, kick
-	 * @child in the butt.  Note that @resume should be used iff @child
-	 * is in TASK_TRACED; otherwise, we might unduly disrupt
+	 * @child in the butt.  Note that %SIGKILL wake up should be used
+	 * iff @child is in TASK_TRACED; otherwise, we might unduly disrupt
 	 * TASK_KILLABLE sleeps.
 	 */
 	if (child->group_stop & GROUP_STOP_PENDING || task_is_traced(child))
-		signal_wake_up(child, task_is_traced(child));
+		signal_wake_up(child, task_is_traced(child) ? SIGKILL : 0);
 
 	spin_unlock(&child->sighand->siglock);
 }
@@ -243,7 +243,7 @@ static int ptrace_attach(struct task_struct *task)
 	 */
 	if (task_is_stopped(task)) {
 		task->group_stop |= GROUP_STOP_PENDING | GROUP_STOP_TRAPPING;
-		signal_wake_up(task, 1);
+		signal_wake_up(task, SIGKILL);
 		wait_trap = true;
 	}
 
diff --git a/kernel/signal.c b/kernel/signal.c
index f799a05..837070c 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -618,33 +618,53 @@ int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info)
 	return signr;
 }
 
-/*
- * Tell a process that it has a new active signal..
+/**
+ * signal_wake_up - tell a task that it has a new active signal
+ * @t: the target task
+ * @sig_type: the type of the new signal (0 or SIGKILL)
+ *
+ * This function makes sure that @t is woken up and/or brought into signal
+ * delivery path as necessary accodring to @sig_type.  @sig_type can be one
+ * of the followings.
+ *
+ * 0		Normal signal delivery.  If @t is executing in userland it
+ *		should be brought in to deliver the signal.  When @t is in
+ *		kernel, wake it up iff it's in interruptible sleep.
  *
- * NOTE! we rely on the previous spin_lock to
- * lock interrupts for us! We can only be called with
- * "siglock" held, and the local interrupt must
- * have been disabled when that got acquired!
+ * %SIGKILL	@t is being killed.  In addition to the usual kicking,
+ *		interrupt KILLABLE, STOPPED and TRACED sleeps using
+ *		%TASK_WAKEKILL.
  *
- * No need to set need_resched since signal event passing
- * goes through ->blocked
+ * CONTEXT:
+ * Must be called with @t->sighand->siglock held.
  */
-void signal_wake_up(struct task_struct *t, int resume)
+void signal_wake_up(struct task_struct *t, int sig_type)
 {
-	unsigned int mask;
+	unsigned int uninitialized_var(mask);
 
 	set_tsk_thread_flag(t, TIF_SIGPENDING);
 
-	/*
-	 * For SIGKILL, we want to wake it up in the stopped/traced/killable
-	 * case. We don't check t->state here because there is a race with it
-	 * executing another processor and just now entering stopped state.
-	 * By using wake_up_state, we ensure the process will wake up and
-	 * handle its death signal.
-	 */
-	mask = TASK_INTERRUPTIBLE;
-	if (resume)
-		mask |= TASK_WAKEKILL;
+	switch (sig_type) {
+	case 0:
+		mask = TASK_INTERRUPTIBLE;
+		break;
+
+	case SIGKILL:
+		/*
+		 * For SIGKILL, we want to wake it up in the stopped /
+		 * traced / killable case.  We don't check t->state here
+		 * because there is a race with it executing another
+		 * processor and just now entering stopped state.  By using
+		 * wake_up_state, we ensure the process will wake up and
+		 * handle its death signal.
+		 */
+		mask |= TASK_INTERRUPTIBLE | TASK_WAKEKILL;
+		break;
+
+	default:
+		BUG();
+	}
+
 	if (!wake_up_state(t, mask))
 		kick_process(t);
 }
@@ -941,7 +961,7 @@ static void complete_signal(int sig, struct task_struct *p, int group)
 			do {
 				task_clear_group_stop_pending(t);
 				sigaddset(&t->pending.signal, SIGKILL);
-				signal_wake_up(t, 1);
+				signal_wake_up(t, SIGKILL);
 			} while_each_thread(p, t);
 			return;
 		}
@@ -951,7 +971,7 @@ static void complete_signal(int sig, struct task_struct *p, int group)
 	 * The signal is already in the shared-pending queue.
 	 * Tell the chosen thread to wake up and dequeue it.
 	 */
-	signal_wake_up(t, sig == SIGKILL);
+	signal_wake_up(t, sig == SIGKILL ? SIGKILL : 0);
 	return;
 }
 
@@ -1180,7 +1200,7 @@ int zap_other_threads(struct task_struct *p)
 		if (t->exit_state)
 			continue;
 		sigaddset(&t->pending.signal, SIGKILL);
-		signal_wake_up(t, 1);
+		signal_wake_up(t, SIGKILL);
 	}
 
 	return count;
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/3] signal, ptrace: Add SIGTRAP signal_wake_up()
  2011-03-29 14:46 [PATCH 1/3] signal: Make signal_wake_up() take @sig_type instead of @resume Tejun Heo
@ 2011-03-29 14:46 ` Tejun Heo
  2011-03-29 14:47   ` [PATCH 3/3] signal, ptrace: Fix delayed CONTINUED notification when ptraced Tejun Heo
  2011-03-29 18:27 ` [PATCH 1/3] signal: Make signal_wake_up() take @sig_type instead of @resume Oleg Nesterov
  1 sibling, 1 reply; 16+ messages in thread
From: Tejun Heo @ 2011-03-29 14:46 UTC (permalink / raw)
  To: oleg, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, roland

ptrace needs the ability to kick tasks in STOPPED or TRACED states and
was selectively using %SIGKILL when calling signal_wake_up() to
achieve it.

Add %SIGTRAP wake up mode to signal_wake_up() which wakes up tasks in
interruptible sleep, stopped or traced state and use it in
__ptrace_unlink() and ptrace_attach().

Although this changes the used wakeup mask, it doesn't cause any
behavior change as the target task is already known to be in specific
states.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/ptrace.c |    8 +++-----
 kernel/signal.c |    7 +++++++
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index ec8cce6..3989a7e 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -86,12 +86,10 @@ void __ptrace_unlink(struct task_struct *child)
 
 	/*
 	 * If transition to TASK_STOPPED is pending or in TASK_TRACED, kick
-	 * @child in the butt.  Note that %SIGKILL wake up should be used
-	 * iff @child is in TASK_TRACED; otherwise, we might unduly disrupt
-	 * TASK_KILLABLE sleeps.
+	 * @child in the butt.
 	 */
 	if (child->group_stop & GROUP_STOP_PENDING || task_is_traced(child))
-		signal_wake_up(child, task_is_traced(child) ? SIGKILL : 0);
+		signal_wake_up(child, SIGTRAP);
 
 	spin_unlock(&child->sighand->siglock);
 }
@@ -243,7 +241,7 @@ static int ptrace_attach(struct task_struct *task)
 	 */
 	if (task_is_stopped(task)) {
 		task->group_stop |= GROUP_STOP_PENDING | GROUP_STOP_TRAPPING;
-		signal_wake_up(task, SIGKILL);
+		signal_wake_up(task, SIGTRAP);
 		wait_trap = true;
 	}
 
diff --git a/kernel/signal.c b/kernel/signal.c
index 837070c..ff63459 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -631,6 +631,9 @@ int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info)
  *		should be brought in to deliver the signal.  When @t is in
  *		kernel, wake it up iff it's in interruptible sleep.
  *
+ * %SIGTRAP	Used by ptrace.  In addition to the usual kicking,
+ *		interrupt STOPPED and TRACED sleeps.
+ *
  * %SIGKILL	@t is being killed.  In addition to the usual kicking,
  *		interrupt KILLABLE, STOPPED and TRACED sleeps using
  *		%TASK_WAKEKILL.
@@ -649,6 +652,10 @@ void signal_wake_up(struct task_struct *t, int sig_type)
 		mask = TASK_INTERRUPTIBLE;
 		break;
 
+	case SIGTRAP:
+		mask = TASK_INTERRUPTIBLE | __TASK_STOPPED | __TASK_TRACED;
+		break;
+
 	case SIGKILL:
 		/*
 		 * For SIGKILL, we want to wake it up in the stopped /
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/3] signal, ptrace: Fix delayed CONTINUED notification when ptraced
  2011-03-29 14:46 ` [PATCH 2/3] signal, ptrace: Add SIGTRAP signal_wake_up() Tejun Heo
@ 2011-03-29 14:47   ` Tejun Heo
  2011-03-30 19:29     ` Oleg Nesterov
  0 siblings, 1 reply; 16+ messages in thread
From: Tejun Heo @ 2011-03-29 14:47 UTC (permalink / raw)
  To: oleg, jan.kratochvil, vda.linux
  Cc: linux-kernel, torvalds, akpm, indan, roland

When SIGCONT is resuming a process from a in-progress or completed
group stop, it is guaranteed that at least one task of the target
group will travel signal delivery path and thus will notice
SIGNAL_CLD_MASK set by prepare_signal() and deliver it.  This is why
prepare_signal() could use wake_up_state() instead of
signal_wake_up().

However, because a ptraced task in group stopped process is allowed to
escape the group stop and execute beneath it, the task may not notice
SIGNAL_CLD_MASK until the next signal delivery.

Fix it by making job control continuation path use signal_wake_up()
instead of wake_up_state() and recalc_sigpending_tsk() consider
SIGNAL_CLD_MASK.  This makes pending CONTINUED notification treated
similarly to pending signals and ensures that the target task is
brought into signal delivery path regardless of its current state.

After this patch, TIF_SIGPENDING is set for cases where it isn't
strictly necessary; however, the only case which it makes any
difference is when the SIGCONT is received while group stop is still
in progress, which hardly is a case to optimize for.  Just use
signal_wake_up() unconditionally for simplicity's sake.

This problem has been identified and the solution suggested by Oleg
Nesterov.

Test case follows.

  #include <stdio.h>
  #include <unistd.h>
  #include <time.h>
  #include <signal.h>
  #include <sys/types.h>
  #include <sys/ptrace.h>
  #include <sys/wait.h>

  static const struct timespec ts100ms = { .tv_nsec = 100000000 };

  static void sigchld_sigaction(int signo, siginfo_t *si, void *ucxt)
  {
	  printf("SIG  status=%02d code=%02d\n", si->si_status, si->si_code);
  }

  int main(void)
  {
	  const struct sigaction chld_sa = { .sa_sigaction = sigchld_sigaction,
					     .sa_flags = SA_SIGINFO|SA_RESTART };
	  pid_t tracee;
	  siginfo_t si;

	  tracee = fork();
	  if (tracee == 0) {
		  sigset_t sigset;
		  sigemptyset(&sigset);
		  sigaddset(&sigset, SIGCONT);
		  while (1)
			  sigprocmask(SIG_BLOCK, &sigset, NULL);
	  }

	  if (!fork()) {
		  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
		  waitid(P_PID, tracee, &si, WSTOPPED);
		  ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
		  waitid(P_PID, tracee, &si, WSTOPPED);
		  ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
		  while (1)
			  pause();
	  }

	  waitid(P_ALL, 0, &si, WSTOPPED);
	  nanosleep(&ts100ms, NULL);
	  sigaction(SIGCHLD, &chld_sa, NULL);
	  kill(tracee, SIGCONT);
	  while (1)
		  pause();
	  return 0;
  }

The sigprocmask() loop in tracee is there to make it continuously
re-evaluate TIF_SIGPENDING such that it doesn't stay set from previous
events.  Before the patch, as the SIGCONT doesn't call in the tracee
into signal delivery path, SIGCHLD is never generated and there's no
output.

After the patch, the tracee is forced into signal delivery path and
SIGCHLD is properly generated.

  SIG  status=18 code=06
  ^C

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/signal.c |   44 +++++++++++++++++++++-----------------------
 1 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index ff63459..fcab849 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -125,6 +125,7 @@ static inline int has_pending_signals(sigset_t *signal, sigset_t *blocked)
 static int recalc_sigpending_tsk(struct task_struct *t)
 {
 	if ((t->group_stop & GROUP_STOP_PENDING) ||
+	    (t->signal->flags & SIGNAL_CLD_MASK) ||
 	    PENDING(&t->pending, &t->blocked) ||
 	    PENDING(&t->signal->shared_pending, &t->blocked)) {
 		set_tsk_thread_flag(t, TIF_SIGPENDING);
@@ -631,6 +632,10 @@ int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info)
  *		should be brought in to deliver the signal.  When @t is in
  *		kernel, wake it up iff it's in interruptible sleep.
  *
+ * %SIGCONT	@t is being resumed from job control stop.  Wake up if
+ *		%__TASK_STOPPED.  If there's a custom signal handler to
+ *		%run, interrupt %TASK_INTERRUPTIBLE sleeps too.
+ *
  * %SIGTRAP	Used by ptrace.  In addition to the usual kicking,
  *		interrupt STOPPED and TRACED sleeps.
  *
@@ -652,6 +657,21 @@ void signal_wake_up(struct task_struct *t, int sig_type)
 		mask = TASK_INTERRUPTIBLE;
 		break;
 
+	case SIGCONT:
+		/*
+		 * If there is a handler for SIGCONT, we must make sure
+		 * that no thread returns to user mode before we post the
+		 * signal, in case it was the only thread eligible to run
+		 * the signal handler--then it must not do anything between
+		 * resuming and running the handler.  This is ensured by
+		 * setting TIF_SIGPENDING before waking up.
+		 */
+		mask = __TASK_STOPPED;
+		if (sig_user_defined(t, SIGCONT) &&
+		    !sigismember(&t->blocked, SIGCONT))
+			mask |= TASK_INTERRUPTIBLE;
+		break;
+
 	case SIGTRAP:
 		mask = TASK_INTERRUPTIBLE | __TASK_STOPPED | __TASK_TRACED;
 		break;
@@ -821,31 +841,9 @@ static int prepare_signal(int sig, struct task_struct *p, int from_ancestor_ns)
 		rm_from_queue(SIG_KERNEL_STOP_MASK, &signal->shared_pending);
 		t = p;
 		do {
-			unsigned int state;
-
 			task_clear_group_stop_pending(t);
-
 			rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending);
-			/*
-			 * If there is a handler for SIGCONT, we must make
-			 * sure that no thread returns to user mode before
-			 * we post the signal, in case it was the only
-			 * thread eligible to run the signal handler--then
-			 * it must not do anything between resuming and
-			 * running the handler.  With the TIF_SIGPENDING
-			 * flag set, the thread will pause and acquire the
-			 * siglock that we hold now and until we've queued
-			 * the pending signal.
-			 *
-			 * Wake up the stopped thread _after_ setting
-			 * TIF_SIGPENDING
-			 */
-			state = __TASK_STOPPED;
-			if (sig_user_defined(t, SIGCONT) && !sigismember(&t->blocked, SIGCONT)) {
-				set_tsk_thread_flag(t, TIF_SIGPENDING);
-				state |= TASK_INTERRUPTIBLE;
-			}
-			wake_up_state(t, state);
+			signal_wake_up(t, SIGCONT);
 		} while_each_thread(p, t);
 
 		/*
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] signal: Make signal_wake_up() take @sig_type instead of @resume
  2011-03-29 14:46 [PATCH 1/3] signal: Make signal_wake_up() take @sig_type instead of @resume Tejun Heo
  2011-03-29 14:46 ` [PATCH 2/3] signal, ptrace: Add SIGTRAP signal_wake_up() Tejun Heo
@ 2011-03-29 18:27 ` Oleg Nesterov
  1 sibling, 0 replies; 16+ messages in thread
From: Oleg Nesterov @ 2011-03-29 18:27 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan, roland

On 03/29, Tejun Heo wrote:
>
> -void signal_wake_up(struct task_struct *t, int resume)
> +void signal_wake_up(struct task_struct *t, int sig_type)
>  {
> -	unsigned int mask;
> +	unsigned int uninitialized_var(mask);
>
>  	set_tsk_thread_flag(t, TIF_SIGPENDING);
> 
> -	/*
> -	 * For SIGKILL, we want to wake it up in the stopped/traced/killable
> -	 * case. We don't check t->state here because there is a race with it
> -	 * executing another processor and just now entering stopped state.
> -	 * By using wake_up_state, we ensure the process will wake up and
> -	 * handle its death signal.
> -	 */
> -	mask = TASK_INTERRUPTIBLE;
> -	if (resume)
> -		mask |= TASK_WAKEKILL;
> +	switch (sig_type) {
> +	case 0:
> +		mask = TASK_INTERRUPTIBLE;
> +		break;
> +
> +	case SIGKILL:
> +		/*
> +		 * For SIGKILL, we want to wake it up in the stopped /
> +		 * traced / killable case.  We don't check t->state here
> +		 * because there is a race with it executing another
> +		 * processor and just now entering stopped state.  By using
> +		 * wake_up_state, we ensure the process will wake up and
> +		 * handle its death signal.
> +		 */
> +		mask |= TASK_INTERRUPTIBLE | TASK_WAKEKILL;
> +		break;

Interesting... Yes, I was thinking about changing signal_wake_up()
too, my intent was to pass TASK_* mask directly.

But your approach looks more clean. So, to me 1-2 look as the nice
cleanups in any case.

But let me think more about 3/3. I still think we do not want this.
But I need the fresh head to undestand what I actually have in mind.
Perhaps nothing, just the wrong feeling.

Oleg.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] signal, ptrace: Fix delayed CONTINUED notification when ptraced
  2011-03-29 14:47   ` [PATCH 3/3] signal, ptrace: Fix delayed CONTINUED notification when ptraced Tejun Heo
@ 2011-03-30 19:29     ` Oleg Nesterov
  2011-03-31  7:29       ` Tejun Heo
  0 siblings, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2011-03-30 19:29 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan, roland

On 03/29, Tejun Heo wrote:
>
> Fix it by making job control continuation path use signal_wake_up()
> instead of wake_up_state() and recalc_sigpending_tsk() consider
> SIGNAL_CLD_MASK.

I still can't agree with this patch...

For the moment, please forget about ptrace. In this case this patch
is not needed, but it adds the small pessimization. OK, it is really
small, recalc_sigpending_tsk() is not called very often.

Also,

> After this patch, TIF_SIGPENDING is set for cases where it isn't
> strictly necessary; however, the only case which it makes any
> difference is when the SIGCONT is received while group stop is still
> in progress, which hardly is a case to optimize for.

Agreed. But, the current code is simply wrong even if the problem
is minor, and afaics your patch makes the things worse.

> @@ -652,6 +657,21 @@ void signal_wake_up(struct task_struct *t, int sig_type)
>  		mask = TASK_INTERRUPTIBLE;
>  		break;
>
> +	case SIGCONT:
> +		/*
> +		 * If there is a handler for SIGCONT, we must make sure
> +		 * that no thread returns to user mode before we post the
> +		 * signal, in case it was the only thread eligible to run
> +		 * the signal handler--then it must not do anything between
> +		 * resuming and running the handler.  This is ensured by
> +		 * setting TIF_SIGPENDING before waking up.
> +		 */
> +		mask = __TASK_STOPPED;
> +		if (sig_user_defined(t, SIGCONT) &&
> +		    !sigismember(&t->blocked, SIGCONT))
> +			mask |= TASK_INTERRUPTIBLE;
> +		break;
> +
>  	case SIGTRAP:
>  		mask = TASK_INTERRUPTIBLE | __TASK_STOPPED | __TASK_TRACED;
>  		break;
> @@ -821,31 +841,9 @@ static int prepare_signal(int sig, struct task_struct *p, int from_ancestor_ns)
>  		rm_from_queue(SIG_KERNEL_STOP_MASK, &signal->shared_pending);
>  		t = p;
>  		do {
> [...snip...]
> -			wake_up_state(t, state);
> +			signal_wake_up(t, SIGCONT);
>  		} while_each_thread(p, t);

I don't think this logic should be moved into signal_wake_up(). Simply
because it is wrong.

Forget about SIGSTOP. Suppose that the thread group is running, and we
are doing tkill(SIGCONT). Now, why every thread gets TIF_SIGPENDING +
wakeup? This is not correct. Even the kill(SIGCONT) case is not exactly
right. This should be handled by complete_signal(). I think we had to
set TIF_SIGPENDING before, now the locking is different. I'll try to
make the patch.


Now, lets recall about ptrace. But please forget about the lost
notification. We are going to implement the new trap anyway. Ignoring
all details, how we can implement this? I think the most natural and
approach is something like this:

	- add the new GROUP_STOP_CONTTRAP (the name is random), it
	  lives in task->group_stop

	- this bit should be considered by recalc_sigpending_tsk()
	  along with GROUP_STOP_PENDING

	- it should be set by prepare_signal(SIGCONT) if the task
	  is traced

	- this bit should be checked by get_signal_to_deliver() to
	  trigger ptrace_stop()

Does this make any sense?

If yes, then everything works "automagically" after that. The tracee
will call get_signal_to_deliver() if it is running, and check
SIGNAL_CLD_MASK.

What do you think?

Oleg.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] signal, ptrace: Fix delayed CONTINUED notification when ptraced
  2011-03-30 19:29     ` Oleg Nesterov
@ 2011-03-31  7:29       ` Tejun Heo
  2011-03-31 15:15         ` Oleg Nesterov
  0 siblings, 1 reply; 16+ messages in thread
From: Tejun Heo @ 2011-03-31  7:29 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan, roland

Hello, Oleg.

On Wed, Mar 30, 2011 at 09:29:18PM +0200, Oleg Nesterov wrote:
> > @@ -652,6 +657,21 @@ void signal_wake_up(struct task_struct *t, int sig_type)
> >  		mask = TASK_INTERRUPTIBLE;
> >  		break;
> >
> > +	case SIGCONT:
> > +		/*
> > +		 * If there is a handler for SIGCONT, we must make sure
> > +		 * that no thread returns to user mode before we post the
> > +		 * signal, in case it was the only thread eligible to run
> > +		 * the signal handler--then it must not do anything between
> > +		 * resuming and running the handler.  This is ensured by
> > +		 * setting TIF_SIGPENDING before waking up.
> > +		 */
> > +		mask = __TASK_STOPPED;
> > +		if (sig_user_defined(t, SIGCONT) &&
> > +		    !sigismember(&t->blocked, SIGCONT))
> > +			mask |= TASK_INTERRUPTIBLE;
> > +		break;
> > +
> >  	case SIGTRAP:
> >  		mask = TASK_INTERRUPTIBLE | __TASK_STOPPED | __TASK_TRACED;
> >  		break;
> > @@ -821,31 +841,9 @@ static int prepare_signal(int sig, struct task_struct *p, int from_ancestor_ns)
> >  		rm_from_queue(SIG_KERNEL_STOP_MASK, &signal->shared_pending);
> >  		t = p;
> >  		do {
> > [...snip...]
> > -			wake_up_state(t, state);
> > +			signal_wake_up(t, SIGCONT);
> >  		} while_each_thread(p, t);
> 
> I don't think this logic should be moved into signal_wake_up(). Simply
> because it is wrong.
>
> Forget about SIGSTOP. Suppose that the thread group is running, and we
> are doing tkill(SIGCONT). Now, why every thread gets TIF_SIGPENDING +
> wakeup? This is not correct.

Right, while running, it generates spurious TIF_SIGPENDINGs.  It can
be trivially worked around by testing if (group_stop_count ||
STOP_STOPPED) before calling signal_wake_up().

> Even the kill(SIGCONT) case is not exactly right. This should be
> handled by complete_signal(). I think we had to set TIF_SIGPENDING
> before, now the locking is different. I'll try to make the patch.

The wake up is for continuation from group stop, which should be
handled simiarly to signal delivery but isn't exactly the same thing.
Maybe moving it to complete_signal() is cleaner but I don't think it's
a problem of being right or wrong.

> Now, lets recall about ptrace. But please forget about the lost
> notification. We are going to implement the new trap anyway. Ignoring
> all details, how we can implement this? I think the most natural and
> approach is something like this:
> 
> 	- add the new GROUP_STOP_CONTTRAP (the name is random), it
> 	  lives in task->group_stop
> 
> 	- this bit should be considered by recalc_sigpending_tsk()
> 	  along with GROUP_STOP_PENDING
> 
> 	- it should be set by prepare_signal(SIGCONT) if the task
> 	  is traced
> 
> 	- this bit should be checked by get_signal_to_deliver() to
> 	  trigger ptrace_stop()
> 
> Does this make any sense?

Yeah, that sounds reasonable.  I was thinking more along the line of
just adding another ptrace_stop() call after the one for group stop
but the flag based approach solves the notification to running tracees
automatically, so it sounds good.

> If yes, then everything works "automagically" after that. The tracee
> will call get_signal_to_deliver() if it is running, and check
> SIGNAL_CLD_MASK.
> 
> What do you think?

I think it's good but I don't think it necessarily changes a lot about
this patch.  We'll be calling signal_wake_up() for that trap from
prepare_signal() after all, right?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] signal, ptrace: Fix delayed CONTINUED notification when ptraced
  2011-03-31  7:29       ` Tejun Heo
@ 2011-03-31 15:15         ` Oleg Nesterov
  2011-03-31 16:34           ` Tejun Heo
  0 siblings, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2011-03-31 15:15 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan, roland

On 03/31, Tejun Heo wrote:
>
> On Wed, Mar 30, 2011 at 09:29:18PM +0200, Oleg Nesterov wrote:
> > I don't think this logic should be moved into signal_wake_up(). Simply
> > because it is wrong.
> >
> > Forget about SIGSTOP. Suppose that the thread group is running, and we
> > are doing tkill(SIGCONT). Now, why every thread gets TIF_SIGPENDING +
> > wakeup? This is not correct.
>
> Right, while running, it generates spurious TIF_SIGPENDINGs.  It can
> be trivially worked around by testing if (group_stop_count ||
> STOP_STOPPED) before calling signal_wake_up().

Please look at the patch below. I can't believe it is that simple, I'll
recheck and rediff against your tree. But, it seems, this code is simply
bogus nowadays and can be removed. Or I missed something...

> > Even the kill(SIGCONT) case is not exactly right. This should be
> > handled by complete_signal(). I think we had to set TIF_SIGPENDING
> > before, now the locking is different. I'll try to make the patch.
>
> The wake up is for continuation from group stop, which should be
> handled simiarly to signal delivery but isn't exactly the same thing.
> Maybe moving it to complete_signal() is cleaner but I don't think it's
> a problem of being right or wrong.

No, sorry, I was unclear. I meant, complete_signal() should take care and
set TIF_SIGPENDING properly. But, unless I missed something, it already
does all we need.

> > What do you think?
>
> I think it's good but I don't think it necessarily changes a lot about
> this patch.  We'll be calling signal_wake_up() for that trap from
> prepare_signal() after all, right?

No, I think prepare_signal(SIGCONT) should do this, but only if the
thread is ptraced.

My main objection was, signal_wake_up() shouldn't play with
sig_user_defined/blocked at all. And, we shouldn't blindly set
TIF_SIGPENDING as the current code does.

So. IMHO we need the patch below first, then we should think about the
further changes.

What do you think?

Oleg.

---------------------------------------------------------------------

prepare_signal(SIGCONT) should never set TIF_SIGPENDING or wake up
the TASK_INTERRUPTIBLE threads. We are going to call complete_signal()
which should pick the right thread correctly. All we need is to wake
up the TASK_STOPPED threads.

If the task was stopped, it can't return to usermode without taking
->siglock. Otherwise we don't care, and the spurious TIF_SIGPENDING
can't be useful.

The comment says:

	* If there is a handler for SIGCONT, we must make
	* sure that no thread returns to user mode before
	* we post the signal

It is not clear what this means. But in any case, even if this SIGCONT
is not private, only one thread can dequeue SIGCONT, other threads can
happily return to user mode before before that thread handles this signal.

Note also that wake_up_state(t, __TASK_STOPPED) can't race with the
task which changes its state, TASK_STOPPED state is protected by
->siglock as well.

--- x/kernel/signal.c
+++ x/kernel/signal.c
@@ -726,34 +726,13 @@ static int prepare_signal(int sig, struc
 	} else if (sig == SIGCONT) {
 		unsigned int why;
 		/*
-		 * Remove all stop signals from all queues,
-		 * and wake all threads.
+		 * Remove all stop signals from all queues, wake all threads.
 		 */
 		rm_from_queue(SIG_KERNEL_STOP_MASK, &signal->shared_pending);
 		t = p;
 		do {
-			unsigned int state;
 			rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending);
-			/*
-			 * If there is a handler for SIGCONT, we must make
-			 * sure that no thread returns to user mode before
-			 * we post the signal, in case it was the only
-			 * thread eligible to run the signal handler--then
-			 * it must not do anything between resuming and
-			 * running the handler.  With the TIF_SIGPENDING
-			 * flag set, the thread will pause and acquire the
-			 * siglock that we hold now and until we've queued
-			 * the pending signal.
-			 *
-			 * Wake up the stopped thread _after_ setting
-			 * TIF_SIGPENDING
-			 */
-			state = __TASK_STOPPED;
-			if (sig_user_defined(t, SIGCONT) && !sigismember(&t->blocked, SIGCONT)) {
-				set_tsk_thread_flag(t, TIF_SIGPENDING);
-				state |= TASK_INTERRUPTIBLE;
-			}
-			wake_up_state(t, state);
+			wake_up_state(t, __TASK_STOPPED);
 		} while_each_thread(p, t);
 
 		/*


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] signal, ptrace: Fix delayed CONTINUED notification when ptraced
  2011-03-31 15:15         ` Oleg Nesterov
@ 2011-03-31 16:34           ` Tejun Heo
  2011-03-31 16:35             ` Tejun Heo
  2011-03-31 17:29             ` Oleg Nesterov
  0 siblings, 2 replies; 16+ messages in thread
From: Tejun Heo @ 2011-03-31 16:34 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan, roland

Hello,

On Thu, Mar 31, 2011 at 05:15:49PM +0200, Oleg Nesterov wrote:
> No, I think prepare_signal(SIGCONT) should do this, but only if the
> thread is ptraced.
> 
> My main objection was, signal_wake_up() shouldn't play with
> sig_user_defined/blocked at all. And, we shouldn't blindly set
> TIF_SIGPENDING as the current code does.

Hmmm... okay.

> prepare_signal(SIGCONT) should never set TIF_SIGPENDING or wake up
> the TASK_INTERRUPTIBLE threads. We are going to call complete_signal()
> which should pick the right thread correctly. All we need is to wake
> up the TASK_STOPPED threads.
> 
> If the task was stopped, it can't return to usermode without taking
> ->siglock. Otherwise we don't care, and the spurious TIF_SIGPENDING
> can't be useful.
> 
> The comment says:
> 
> 	* If there is a handler for SIGCONT, we must make
> 	* sure that no thread returns to user mode before
> 	* we post the signal

I interpreted it as "when there's only single thread, it should not
return to userland before executing the signal handler".

> It is not clear what this means. But in any case, even if this SIGCONT
> is not private, only one thread can dequeue SIGCONT, other threads can
> happily return to user mode before before that thread handles this signal.
> 
> Note also that wake_up_state(t, __TASK_STOPPED) can't race with the
> task which changes its state, TASK_STOPPED state is protected by
> ->siglock as well.
> 
> --- x/kernel/signal.c
> +++ x/kernel/signal.c
> @@ -726,34 +726,13 @@ static int prepare_signal(int sig, struc
>  	} else if (sig == SIGCONT) {
>  		unsigned int why;
>  		/*
> -		 * Remove all stop signals from all queues,
> -		 * and wake all threads.
> +		 * Remove all stop signals from all queues, wake all threads.
>  		 */
>  		rm_from_queue(SIG_KERNEL_STOP_MASK, &signal->shared_pending);
>  		t = p;
>  		do {
> -			unsigned int state;
>  			rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending);
> -			/*
> -			 * If there is a handler for SIGCONT, we must make
> -			 * sure that no thread returns to user mode before
> -			 * we post the signal, in case it was the only
> -			 * thread eligible to run the signal handler--then
> -			 * it must not do anything between resuming and
> -			 * running the handler.  With the TIF_SIGPENDING
> -			 * flag set, the thread will pause and acquire the
> -			 * siglock that we hold now and until we've queued
> -			 * the pending signal.
> -			 *
> -			 * Wake up the stopped thread _after_ setting
> -			 * TIF_SIGPENDING
> -			 */
> -			state = __TASK_STOPPED;
> -			if (sig_user_defined(t, SIGCONT) && !sigismember(&t->blocked, SIGCONT)) {
> -				set_tsk_thread_flag(t, TIF_SIGPENDING);
> -				state |= TASK_INTERRUPTIBLE;
> -			}
> -			wake_up_state(t, state);
> +			wake_up_state(t, __TASK_STOPPED);
>  		} while_each_thread(p, t);

This is awesome and, at the first glance, yeah, this seems to be the
right thing to do.  That part is pure signal delivery after all.

* As wants_signal() doesn't take uninterruptible sleeps into
  consideration, the signal might get delivered later with the change
  but I don't think it's problematic in any way.

* Interruptible sleeps won't be disturbed on SIGCONT generation, which
  is a visible behavior change, but, I agree, this is more of a bug
  fix.

I'll mull over it a bit more but please go ahead and create a proper
patch.  I'll apply it to the ptrace branch with the previous two
patches.  (Can I add your Acked-by's there?)

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] signal, ptrace: Fix delayed CONTINUED notification when ptraced
  2011-03-31 16:34           ` Tejun Heo
@ 2011-03-31 16:35             ` Tejun Heo
  2011-03-31 17:29             ` Oleg Nesterov
  1 sibling, 0 replies; 16+ messages in thread
From: Tejun Heo @ 2011-03-31 16:35 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan, roland

Small correction.

On Thu, Mar 31, 2011 at 06:34:41PM +0200, Tejun Heo wrote:
> * Interruptible sleeps won't be disturbed on SIGCONT generation, which
                                 ^
				 spuriously
>   is a visible behavior change, but, I agree, this is more of a bug
>   fix.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] signal, ptrace: Fix delayed CONTINUED notification when ptraced
  2011-03-31 16:34           ` Tejun Heo
  2011-03-31 16:35             ` Tejun Heo
@ 2011-03-31 17:29             ` Oleg Nesterov
  2011-04-01 18:11               ` [PATCH 0/4] Was: " Oleg Nesterov
  1 sibling, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2011-03-31 17:29 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan, roland

On 03/31, Tejun Heo wrote:
>
> Hello,
>
> On Thu, Mar 31, 2011 at 05:15:49PM +0200, Oleg Nesterov wrote:
> >
> > The comment says:
> >
> > 	* If there is a handler for SIGCONT, we must make
> > 	* sure that no thread returns to user mode before
> > 	* we post the signal
>
> I interpreted it as "when there's only single thread, it should not
> return to userland before executing the signal handler".

Yes... probably.

> >  		rm_from_queue(SIG_KERNEL_STOP_MASK, &signal->shared_pending);
> >  		t = p;
> >  		do {
> > -			unsigned int state;
> >  			rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending);
> > -			/*
> > -			 * If there is a handler for SIGCONT, we must make
> > -			 * sure that no thread returns to user mode before
> > -			 * we post the signal, in case it was the only
> > -			 * thread eligible to run the signal handler--then
> > -			 * it must not do anything between resuming and
> > -			 * running the handler.  With the TIF_SIGPENDING
> > -			 * flag set, the thread will pause and acquire the
> > -			 * siglock that we hold now and until we've queued
> > -			 * the pending signal.
> > -			 *
> > -			 * Wake up the stopped thread _after_ setting
> > -			 * TIF_SIGPENDING
> > -			 */
> > -			state = __TASK_STOPPED;
> > -			if (sig_user_defined(t, SIGCONT) && !sigismember(&t->blocked, SIGCONT)) {
> > -				set_tsk_thread_flag(t, TIF_SIGPENDING);
> > -				state |= TASK_INTERRUPTIBLE;
> > -			}
> > -			wake_up_state(t, state);
> > +			wake_up_state(t, __TASK_STOPPED);
> >  		} while_each_thread(p, t);
>
> This is awesome and, at the first glance, yeah, this seems to be the
> right thing to do.  That part is pure signal delivery after all.

Great.

> * As wants_signal() doesn't take uninterruptible sleeps into
>   consideration,

Yes! And I already thought about this before (regardless of this change).
This is not really good imho, we can improve the ->curr_target logic a
bit, this looks simple.

> the signal might get delivered later with the change
>   but I don't think it's problematic in any way.

Agreed,

> * Interruptible sleeps won't be disturbed on SIGCONT generation, which
>   is a visible behavior change, but, I agree, this is more of a bug
>   fix.

Yes, agreed. I'll try to make the test-case which shows the difference.

> I'll mull over it a bit more but please go ahead and create a proper
> patch.

Yes, will do tomorrow (and it needs the trivial re-diff against your tree).

I spent too many time today trying to understand what was the original
reason for this code. Looks like, it could die a loooooong ago. Perhaps
the only reason was: handle_stop_signal() could drop ->siglock to notify
the parent. I am not sure, that is why I am a bit nervous and want to
recheck once again.

> I'll apply it to the ptrace branch with the previous two
> patches.  (Can I add your Acked-by's there?)

Yes, thanks Tejun.

Oleg.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 0/4] Was: signal, ptrace: Fix delayed CONTINUED notification when ptraced
  2011-03-31 17:29             ` Oleg Nesterov
@ 2011-04-01 18:11               ` Oleg Nesterov
  2011-04-01 18:11                 ` [PATCH 1/4] signal: prepare_signal(SIGCONT) shouldn't play with TIF_SIGPENDING Oleg Nesterov
                                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Oleg Nesterov @ 2011-04-01 18:11 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan, roland

On 03/31, Oleg Nesterov wrote:
>
> > I'll mull over it a bit more but please go ahead and create a proper
> > patch.
>
> Yes, will do tomorrow (and it needs the trivial re-diff against your tree).

OK, please consider this series for your tree.

4/3 is a bit off-topic, but imho makes sense right now.

Oleg.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] signal: prepare_signal(SIGCONT) shouldn't play with TIF_SIGPENDING
  2011-04-01 18:11               ` [PATCH 0/4] Was: " Oleg Nesterov
@ 2011-04-01 18:11                 ` Oleg Nesterov
  2011-04-01 18:12                 ` [PATCH 2/4] signal: do_signal_stop: remove the unneeded task_clear_group_stop_pending() Oleg Nesterov
                                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Oleg Nesterov @ 2011-04-01 18:11 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan, roland

prepare_signal(SIGCONT) should never set TIF_SIGPENDING or wake up
the TASK_INTERRUPTIBLE threads. We are going to call complete_signal()
which should pick the right thread correctly. All we need is to wake
up the TASK_STOPPED threads.

If the task was stopped, it can't return to usermode without taking
->siglock. Otherwise we don't care, and the spurious TIF_SIGPENDING
can't be useful.

The comment says:

	* If there is a handler for SIGCONT, we must make
	* sure that no thread returns to user mode before
	* we post the signal

It is not clear what this means. Probably, "when there's only a single
thread" and this continues to be true. Otherwise, even if this SIGCONT
is not private, with or without this change only one thread can dequeue
SIGCONT, other threads can happily return to user mode before before
that thread handles this signal.

Note also that wake_up_state(t, __TASK_STOPPED) can't race with the task
which changes its state, TASK_STOPPED state is protected by ->siglock as
well.

In short: when it comes to signal delivery, SIGCONT is the normal signal
and does not need any special support.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 kernel/signal.c |   27 ++-------------------------
 1 file changed, 2 insertions(+), 25 deletions(-)

--- ptrace/kernel/signal.c~1_cont_remove_sigwake	2011-04-01 16:36:29.000000000 +0200
+++ ptrace/kernel/signal.c	2011-04-01 17:12:07.000000000 +0200
@@ -788,37 +788,14 @@ static int prepare_signal(int sig, struc
 	} else if (sig == SIGCONT) {
 		unsigned int why;
 		/*
-		 * Remove all stop signals from all queues,
-		 * and wake all threads.
+		 * Remove all stop signals from all queues, wake all threads.
 		 */
 		rm_from_queue(SIG_KERNEL_STOP_MASK, &signal->shared_pending);
 		t = p;
 		do {
-			unsigned int state;
-
 			task_clear_group_stop_pending(t);
-
 			rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending);
-			/*
-			 * If there is a handler for SIGCONT, we must make
-			 * sure that no thread returns to user mode before
-			 * we post the signal, in case it was the only
-			 * thread eligible to run the signal handler--then
-			 * it must not do anything between resuming and
-			 * running the handler.  With the TIF_SIGPENDING
-			 * flag set, the thread will pause and acquire the
-			 * siglock that we hold now and until we've queued
-			 * the pending signal.
-			 *
-			 * Wake up the stopped thread _after_ setting
-			 * TIF_SIGPENDING
-			 */
-			state = __TASK_STOPPED;
-			if (sig_user_defined(t, SIGCONT) && !sigismember(&t->blocked, SIGCONT)) {
-				set_tsk_thread_flag(t, TIF_SIGPENDING);
-				state |= TASK_INTERRUPTIBLE;
-			}
-			wake_up_state(t, state);
+			wake_up_state(t, __TASK_STOPPED);
 		} while_each_thread(p, t);
 
 		/*


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 2/4] signal: do_signal_stop: remove the unneeded task_clear_group_stop_pending()
  2011-04-01 18:11               ` [PATCH 0/4] Was: " Oleg Nesterov
  2011-04-01 18:11                 ` [PATCH 1/4] signal: prepare_signal(SIGCONT) shouldn't play with TIF_SIGPENDING Oleg Nesterov
@ 2011-04-01 18:12                 ` Oleg Nesterov
  2011-04-01 18:12                 ` [PATCH 3/4] signal: turn SIGNAL_STOP_DEQUEUED into GROUP_STOP_DEQUEUED Oleg Nesterov
                                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Oleg Nesterov @ 2011-04-01 18:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan, roland

PF_EXITING or TASK_STOPPED has already called task_participate_group_stop()
and cleared its ->group_stop. No need to do task_clear_group_stop_pending()
when we start the new group stop.

Add a small comment to explain the !task_is_stopped() check. Note that this
check is not exactly right and it can lead to unnecessary stop later if the
thread is TASK_PTRACED. What we need is task_participated_in_group_stop(),
this will be solved later.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 kernel/signal.c |    2 --
 1 file changed, 2 deletions(-)

--- ptrace/kernel/signal.c~2_dss_kill_tcgsp	2011-04-01 17:12:07.000000000 +0200
+++ ptrace/kernel/signal.c	2011-04-01 18:10:18.000000000 +0200
@@ -1866,7 +1866,8 @@ static int do_signal_stop(int signr)
 		 * still in effect and then receive a stop signal and
 		 * initiate another group stop.  This deviates from the
 		 * usual behavior as two consecutive stop signals can't
-		 * cause two group stops when !ptraced.
+		 * cause two group stops when !ptraced. That is why we
+		 * also check !task_is_stopped(t) below.
 		 *
 		 * The condition can be distinguished by testing whether
 		 * SIGNAL_STOP_STOPPED is already set.  Don't generate
@@ -1896,8 +1897,6 @@ static int do_signal_stop(int signr)
 				t->group_stop |= signr | gstop;
 				sig->group_stop_count++;
 				signal_wake_up(t, 0);
-			} else {
-				task_clear_group_stop_pending(t);
 			}
 		}
 	}


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 3/4] signal: turn SIGNAL_STOP_DEQUEUED into GROUP_STOP_DEQUEUED
  2011-04-01 18:11               ` [PATCH 0/4] Was: " Oleg Nesterov
  2011-04-01 18:11                 ` [PATCH 1/4] signal: prepare_signal(SIGCONT) shouldn't play with TIF_SIGPENDING Oleg Nesterov
  2011-04-01 18:12                 ` [PATCH 2/4] signal: do_signal_stop: remove the unneeded task_clear_group_stop_pending() Oleg Nesterov
@ 2011-04-01 18:12                 ` Oleg Nesterov
  2011-04-01 18:13                 ` [PATCH 4/4] ptrace: ptrace_check_attach() should not do s/STOPPED/TRACED/ Oleg Nesterov
  2011-04-04  0:11                 ` [PATCH 0/4] Was: signal, ptrace: Fix delayed CONTINUED notification when ptraced Tejun Heo
  4 siblings, 0 replies; 16+ messages in thread
From: Oleg Nesterov @ 2011-04-01 18:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan, roland

This patch moves SIGNAL_STOP_DEQUEUED from signal_struct->flags to
task_struct->group_stop, and thus makes it per-thread.

Like SIGNAL_STOP_DEQUEUED, GROUP_STOP_DEQUEUED can be false-positive
after return from get_signal_to_deliver, this is fine. The only purpose
of this bit is: we can drop ->siglock after __dequeue_signal() returns
the sig_kernel_stop() signal and before we call do_signal_stop(), in
this case we must not miss SIGCONT if it comes in between.

But, unlike SIGNAL_STOP_DEQUEUED, GROUP_STOP_DEQUEUED can not be
false-positive in do_signal_stop() if multiple threads dequeue the
sig_kernel_stop() signal at the same time.

Consider two threads T1 and T2, SIGTTIN has a hanlder.

	- T1 dequeues SIGTSTP and sets SIGNAL_STOP_DEQUEUED, then
	  it drops ->siglock

	- SIGCONT comes and clears SIGNAL_STOP_DEQUEUED, SIGTSTP
	  should be cancelled.

	- T2 dequeues SIGTTIN and sets SIGNAL_STOP_DEQUEUED again.
	  Since we have a handler we should not stop, T2 returns
	  to usermode to run the handler.

	- T1 continues, calls do_signal_stop() and wrongly starts
	  the group stop because SIGNAL_STOP_DEQUEUED was restored
	  in between.

With or without this change:

	- we need to do something with ptrace_signal() which can
	  return SIGSTOP, but this needs another discussion

	- SIGSTOP can be lost if it races with the mt exec, will
	  be fixed later.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 include/linux/sched.h |    6 +++---
 kernel/signal.c       |   14 ++++----------
 2 files changed, 7 insertions(+), 13 deletions(-)

--- ptrace/include/linux/sched.h~3_make_stop_dequeued_per_thread	2011-04-01 16:36:29.000000000 +0200
+++ ptrace/include/linux/sched.h	2011-04-01 18:50:38.000000000 +0200
@@ -652,9 +652,8 @@ struct signal_struct {
  * Bits in flags field of signal_struct.
  */
 #define SIGNAL_STOP_STOPPED	0x00000001 /* job control stop in effect */
-#define SIGNAL_STOP_DEQUEUED	0x00000002 /* stop signal dequeued */
-#define SIGNAL_STOP_CONTINUED	0x00000004 /* SIGCONT since WCONTINUED reap */
-#define SIGNAL_GROUP_EXIT	0x00000008 /* group exit in progress */
+#define SIGNAL_STOP_CONTINUED	0x00000002 /* SIGCONT since WCONTINUED reap */
+#define SIGNAL_GROUP_EXIT	0x00000004 /* group exit in progress */
 /*
  * Pending notifications to parent.
  */
@@ -1779,6 +1778,7 @@ extern void thread_group_times(struct ta
 #define GROUP_STOP_PENDING	(1 << 16) /* task should stop for group stop */
 #define GROUP_STOP_CONSUME	(1 << 17) /* consume group stop count */
 #define GROUP_STOP_TRAPPING	(1 << 18) /* switching from STOPPED to TRACED */
+#define GROUP_STOP_DEQUEUED	(1 << 19) /* stop signal dequeued */
 
 extern void task_clear_group_stop_pending(struct task_struct *task);
 
--- ptrace/kernel/signal.c~3_make_stop_dequeued_per_thread	2011-04-01 18:10:18.000000000 +0200
+++ ptrace/kernel/signal.c	2011-04-01 18:58:41.000000000 +0200
@@ -254,7 +254,8 @@ static void task_clear_group_stop_trappi
  */
 void task_clear_group_stop_pending(struct task_struct *task)
 {
-	task->group_stop &= ~(GROUP_STOP_PENDING | GROUP_STOP_CONSUME);
+	task->group_stop &= ~(GROUP_STOP_PENDING | GROUP_STOP_CONSUME |
+				GROUP_STOP_DEQUEUED);
 }
 
 /**
@@ -602,7 +603,7 @@ int dequeue_signal(struct task_struct *t
 		 * is to alert stop-signal processing code when another
 		 * processor has come along and cleared the flag.
 		 */
-		tsk->signal->flags |= SIGNAL_STOP_DEQUEUED;
+		current->group_stop |= GROUP_STOP_DEQUEUED;
 	}
 	if ((info->si_code & __SI_MASK) == __SI_TIMER && info->si_sys_private) {
 		/*
@@ -821,13 +822,6 @@ static int prepare_signal(int sig, struc
 			signal->flags = why | SIGNAL_STOP_CONTINUED;
 			signal->group_stop_count = 0;
 			signal->group_exit_code = 0;
-		} else {
-			/*
-			 * We are not stopped, but there could be a stop
-			 * signal in the middle of being processed after
-			 * being removed from the queue.  Clear that too.
-			 */
-			signal->flags &= ~SIGNAL_STOP_DEQUEUED;
 		}
 	}
 
@@ -1855,7 +1849,7 @@ static int do_signal_stop(int signr)
 		/* signr will be recorded in task->group_stop for retries */
 		WARN_ON_ONCE(signr & ~GROUP_STOP_SIGMASK);
 
-		if (!likely(sig->flags & SIGNAL_STOP_DEQUEUED) ||
+		if (!likely(current->group_stop & GROUP_STOP_DEQUEUED) ||
 		    unlikely(signal_group_exit(sig)))
 			return 0;
 		/*


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 4/4] ptrace: ptrace_check_attach() should not do s/STOPPED/TRACED/
  2011-04-01 18:11               ` [PATCH 0/4] Was: " Oleg Nesterov
                                   ` (2 preceding siblings ...)
  2011-04-01 18:12                 ` [PATCH 3/4] signal: turn SIGNAL_STOP_DEQUEUED into GROUP_STOP_DEQUEUED Oleg Nesterov
@ 2011-04-01 18:13                 ` Oleg Nesterov
  2011-04-04  0:11                 ` [PATCH 0/4] Was: signal, ptrace: Fix delayed CONTINUED notification when ptraced Tejun Heo
  4 siblings, 0 replies; 16+ messages in thread
From: Oleg Nesterov @ 2011-04-01 18:13 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan, roland

After "ptrace: Clean transitions between TASK_STOPPED and TRACED"
d79fdd6d96f46fabb779d86332e3677c6f5c2a4f, ptrace_check_attach()
should never see a TASK_STOPPED tracee and s/STOPPED/TRACED/ is
no longer legal. Add the warning.

Note: ptrace_check_attach() can be greatly simplified, in particular
it doesn't need tasklist. But I'd prefer another patch for that.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 kernel/ptrace.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

--- ptrace/kernel/ptrace.c~4_ck_attach_cant_be_stopped	2011-04-01 16:36:29.000000000 +0200
+++ ptrace/kernel/ptrace.c	2011-04-01 19:54:05.000000000 +0200
@@ -112,16 +112,14 @@ int ptrace_check_attach(struct task_stru
 	 */
 	read_lock(&tasklist_lock);
 	if ((child->ptrace & PT_PTRACED) && child->parent == current) {
-		ret = 0;
 		/*
 		 * child->sighand can't be NULL, release_task()
 		 * does ptrace_unlink() before __exit_signal().
 		 */
 		spin_lock_irq(&child->sighand->siglock);
-		if (task_is_stopped(child))
-			child->state = TASK_TRACED;
-		else if (!task_is_traced(child) && !kill)
-			ret = -ESRCH;
+		WARN_ON_ONCE(task_is_stopped(child));
+		if (task_is_traced(child) || kill)
+			ret = 0;
 		spin_unlock_irq(&child->sighand->siglock);
 	}
 	read_unlock(&tasklist_lock);


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/4] Was: signal, ptrace: Fix delayed CONTINUED notification when ptraced
  2011-04-01 18:11               ` [PATCH 0/4] Was: " Oleg Nesterov
                                   ` (3 preceding siblings ...)
  2011-04-01 18:13                 ` [PATCH 4/4] ptrace: ptrace_check_attach() should not do s/STOPPED/TRACED/ Oleg Nesterov
@ 2011-04-04  0:11                 ` Tejun Heo
  4 siblings, 0 replies; 16+ messages in thread
From: Tejun Heo @ 2011-04-04  0:11 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: jan.kratochvil, vda.linux, linux-kernel, torvalds, akpm, indan, roland

On Fri, Apr 01, 2011 at 08:11:23PM +0200, Oleg Nesterov wrote:
> On 03/31, Oleg Nesterov wrote:
> >
> > > I'll mull over it a bit more but please go ahead and create a proper
> > > patch.
> >
> > Yes, will do tomorrow (and it needs the trivial re-diff against your tree).
> 
> OK, please consider this series for your tree.
> 
> 4/3 is a bit off-topic, but imho makes sense right now.

Applied 1-4 to misc:ptrace.  Thanks Oleg.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2011-04-04  0:12 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-03-29 14:46 [PATCH 1/3] signal: Make signal_wake_up() take @sig_type instead of @resume Tejun Heo
2011-03-29 14:46 ` [PATCH 2/3] signal, ptrace: Add SIGTRAP signal_wake_up() Tejun Heo
2011-03-29 14:47   ` [PATCH 3/3] signal, ptrace: Fix delayed CONTINUED notification when ptraced Tejun Heo
2011-03-30 19:29     ` Oleg Nesterov
2011-03-31  7:29       ` Tejun Heo
2011-03-31 15:15         ` Oleg Nesterov
2011-03-31 16:34           ` Tejun Heo
2011-03-31 16:35             ` Tejun Heo
2011-03-31 17:29             ` Oleg Nesterov
2011-04-01 18:11               ` [PATCH 0/4] Was: " Oleg Nesterov
2011-04-01 18:11                 ` [PATCH 1/4] signal: prepare_signal(SIGCONT) shouldn't play with TIF_SIGPENDING Oleg Nesterov
2011-04-01 18:12                 ` [PATCH 2/4] signal: do_signal_stop: remove the unneeded task_clear_group_stop_pending() Oleg Nesterov
2011-04-01 18:12                 ` [PATCH 3/4] signal: turn SIGNAL_STOP_DEQUEUED into GROUP_STOP_DEQUEUED Oleg Nesterov
2011-04-01 18:13                 ` [PATCH 4/4] ptrace: ptrace_check_attach() should not do s/STOPPED/TRACED/ Oleg Nesterov
2011-04-04  0:11                 ` [PATCH 0/4] Was: signal, ptrace: Fix delayed CONTINUED notification when ptraced Tejun Heo
2011-03-29 18:27 ` [PATCH 1/3] signal: Make signal_wake_up() take @sig_type instead of @resume Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).