All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
@ 2011-05-29 23:12 Tejun Heo
  2011-05-29 23:12 ` [PATCH 01/17] ptrace: remove silly wait_trap variable from ptrace_attach() Tejun Heo
                   ` (17 more replies)
  0 siblings, 18 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

Hello,

This is the fourth take of PTRACE_SEIZE/INTERRUPT and group stop
notification patchset.  This patchset contains both the prep and the
actual implementation patches.

I'm traveling from tomorrow and wanted to send out the revised version
before leaving, so I didn't spend enough time testing the notification
part.  I haven't tested things like INTERRUPT cancelling listening or
immediate re-trap triggering on LISTEN.  But the basics are tested and
it should be enough to decide on direction.  BTW, Oleg, nice
suggestions.  I much prefer this over the TRAPPING one. :)

Notable changes from the third take[1] are,

* PTRACE_TRAP_NOTIFY is now cleared on the next STOP trap instead of
  PTRACE_GETSIGINFO as suggested by Oleg.

* PTRACE_LISTEN added.  This puts tracee into quasi-active state where
  wait(2) and sync ptrace requests fail and tracee is allowed to
  re-trap into STOP to notify an async event.  This makes re-trapping
  protection with TRAPPING unnecessary and makes the implementation
  noticeably simpler.  Also suggested by Oleg.

Per patch changes:

* The following patches are dropped.

  0008-ptrace-move-JOBCTL_TRAPPING-wait-to-wait-2-and-ptrac.patch
  0009-ptrace-make-TRAPPING-wait-interruptible.patch
  0014-ptrace-restructure-ptrace_getsiginfo.patch
  0018-ptrace-add-JOBCTL_BLOCK_NOTIFY.patch
  0019-ptrace-implement-group-stop-notification-for-ptracer.patch

* The following patches are added
  
  0001-ptrace-remove-silly-wait_trap-variable-from-ptrace_a.patch
  0016-ptrace-implement-TRAP_NOTIFY-and-use-it-for-group-st.patch
  0017-ptrace-implement-PTRACE_LISTEN.patch

* 0002-job-control-rename-signal-group_stop-and-flags-to-jo.patch

  JOBCTL_*_BIT macros defined to replace ilog2() usage as suggested by
  Linus.

* 0010-job-control-introduce-JOBCTL_TRAP_STOP-and-use-it-fo.patch

  PTRACE_TRAP_MASK handling in get_signal_delivery() is relocated
  after group stop participation inside for(;;) and do_signal_stop()
  now returns %false after scheduling STOP trap without releasing
  siglock.  This removes the extra re-locking and also avoids taking
  consecutive traps by consuming group stop first.

* 0014-ptrace-make-group-stop-state-visible-via-PTRACE_GETS.patch

  si_pt_flags and si_signo are set while taking trap instead of on
  PTRACE_GETSIGINFO.  Accordingly, NOTIFY is also cleared on STOP
  trap.

This patchset contains the following 17 patches.

  0001-ptrace-remove-silly-wait_trap-variable-from-ptrace_a.patch
  0002-job-control-rename-signal-group_stop-and-flags-to-jo.patch
  0003-ptrace-ptrace_check_attach-rename-kill-to-ignore_sta.patch
  0004-ptrace-relocate-set_current_state-TASK_TRACED-in-ptr.patch
  0005-job-control-introduce-JOBCTL_PENDING_MASK-and-task_c.patch
  0006-job-control-make-task_clear_jobctl_pending-clear-TRA.patch
  0007-job-control-introduce-task_set_jobctl_pending.patch
  0008-ptrace-use-bit_waitqueue-for-TRAPPING-instead-of-wai.patch
  0009-signal-remove-three-noop-tracehooks.patch
  0010-job-control-introduce-JOBCTL_TRAP_STOP-and-use-it-fo.patch
  0011-ptrace-implement-PTRACE_SEIZE.patch
  0012-ptrace-implement-PTRACE_INTERRUPT.patch
  0013-ptrace-add-siginfo.si_pt_flags.patch
  0014-ptrace-make-group-stop-state-visible-via-PTRACE_GETS.patch
  0015-ptrace-don-t-let-PTRACE_SETSIGINFO-override-__SI_TRA.patch
  0016-ptrace-implement-TRAP_NOTIFY-and-use-it-for-group-st.patch
  0017-ptrace-implement-PTRACE_LISTEN.patch

and available in the following git branch.

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-ptrace-seize

The HEAD is a48ca3deab (ptrace: implement PTRACE_LISTEN).  If you see
older branch, please retry after a while (korg is still syncing).

The patchset is on top of today's (20110530) mainline - 139f37f5e1
(Merge branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin).

diffstat follows.

 arch/ia64/include/asm/siginfo.h       |    7 
 arch/ia64/kernel/signal.c             |    5 
 arch/mips/include/asm/compat-signal.h |    7 
 arch/mips/include/asm/siginfo.h       |    7 
 arch/mips/kernel/signal32.c           |    5 
 arch/parisc/kernel/signal32.c         |    5 
 arch/parisc/kernel/signal32.h         |    7 
 arch/powerpc/kernel/ppc32.h           |    7 
 arch/powerpc/kernel/signal_32.c       |    5 
 arch/s390/kernel/compat_linux.h       |    7 
 arch/s390/kernel/compat_signal.c      |    5 
 arch/sparc/kernel/signal32.c          |   12 +
 arch/tile/kernel/compat_signal.c      |   11 +
 arch/x86/ia32/ia32_signal.c           |    4 
 arch/x86/include/asm/ia32.h           |    7 
 fs/exec.c                             |    2 
 include/asm-generic/siginfo.h         |   10 
 include/linux/ptrace.h                |   16 +
 include/linux/sched.h                 |   37 ++-
 include/linux/tracehook.h             |   52 -----
 kernel/exit.c                         |    2 
 kernel/ptrace.c                       |  190 +++++++++++++++---
 kernel/signal.c                       |  351 ++++++++++++++++++++++------------
 23 files changed, 543 insertions(+), 218 deletions(-)

Thanks.

--
tejun

[1] http://thread.gmane.org/gmane.linux.kernel/1144997

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 01/17] ptrace: remove silly wait_trap variable from ptrace_attach()
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-06-01 18:47   ` Oleg Nesterov
  2011-06-02 11:39   ` [PATCH UPDATED " Tejun Heo
  2011-05-29 23:12 ` [PATCH 02/17] job control: rename signal->group_stop and flags to jobctl and update them Tejun Heo
                   ` (16 subsequent siblings)
  17 siblings, 2 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

Local variable wait_trap is used to determine whether to call
wait_event() on TRAPPING or not, which doesn't change or optimize
anything.  Remove it.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/ptrace.c |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 2df1157..389497b 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -184,7 +184,6 @@ bool ptrace_may_access(struct task_struct *task, unsigned int mode)
 
 static int ptrace_attach(struct task_struct *task)
 {
-	bool wait_trap = false;
 	int retval;
 
 	audit_ptrace(task);
@@ -246,7 +245,6 @@ static int ptrace_attach(struct task_struct *task)
 	if (task_is_stopped(task)) {
 		task->group_stop |= GROUP_STOP_PENDING | GROUP_STOP_TRAPPING;
 		signal_wake_up(task, 1);
-		wait_trap = true;
 	}
 
 	spin_unlock(&task->sighand->siglock);
@@ -257,9 +255,8 @@ unlock_tasklist:
 unlock_creds:
 	mutex_unlock(&task->signal->cred_guard_mutex);
 out:
-	if (wait_trap)
-		wait_event(current->signal->wait_chldexit,
-			   !(task->group_stop & GROUP_STOP_TRAPPING));
+	wait_event(current->signal->wait_chldexit,
+		   !(task->group_stop & GROUP_STOP_TRAPPING));
 	return retval;
 }
 
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 02/17] job control: rename signal->group_stop and flags to jobctl and update them
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
  2011-05-29 23:12 ` [PATCH 01/17] ptrace: remove silly wait_trap variable from ptrace_attach() Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-05-29 23:12 ` [PATCH 03/17] ptrace: ptrace_check_attach(): rename @kill to @ignore_state and add comments Tejun Heo
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

signal->group_stop currently hosts mostly group stop related flags;
however, it's gonna be used for wider purposes and the GROUP_STOP_
flag prefix becomes confusing.  Rename signal->group_stop to
signal->jobctl and rename all GROUP_STOP_* flags to JOBCTL_*.

Bit position macros JOBCTL_*_BIT are defined and JOBCTL_* flags are
defined in terms of them to allow using bitops later.

While at it, reassign JOBCTL_TRAPPING to bit 22 to better accomodate
future additions.

This doesn't cause any functional change.

-v2: JOBCTL_*_BIT macros added as suggested by Linus.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
 fs/exec.c             |    2 +-
 include/linux/sched.h |   22 +++++++----
 kernel/ptrace.c       |   12 +++---
 kernel/signal.c       |   91 +++++++++++++++++++++++++------------------------
 4 files changed, 67 insertions(+), 60 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index ea5f748..8986bb0 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1772,7 +1772,7 @@ static int zap_process(struct task_struct *start, int exit_code)
 
 	t = start;
 	do {
-		task_clear_group_stop_pending(t);
+		task_clear_jobctl_stop_pending(t);
 		if (t != current && t->mm) {
 			sigaddset(&t->pending.signal, SIGKILL);
 			signal_wake_up(t, 1);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index bcddd01..f4537d5 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1282,7 +1282,7 @@ struct task_struct {
 	int exit_state;
 	int exit_code, exit_signal;
 	int pdeath_signal;  /*  The signal sent when the parent dies  */
-	unsigned int group_stop;	/* GROUP_STOP_*, siglock protected */
+	unsigned int jobctl;	/* JOBCTL_*, siglock protected */
 	/* ??? */
 	unsigned int personality;
 	unsigned did_exec:1;
@@ -1803,15 +1803,21 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 #define used_math() tsk_used_math(current)
 
 /*
- * task->group_stop flags
+ * task->jobctl flags
  */
-#define GROUP_STOP_SIGMASK	0xffff    /* signr of the last group stop */
-#define GROUP_STOP_PENDING	(1 << 16) /* task should stop for group stop */
-#define GROUP_STOP_CONSUME	(1 << 17) /* consume group stop count */
-#define GROUP_STOP_TRAPPING	(1 << 18) /* switching from STOPPED to TRACED */
-#define GROUP_STOP_DEQUEUED	(1 << 19) /* stop signal dequeued */
+#define JOBCTL_STOP_SIGMASK	0xffff	/* signr of the last group stop */
 
-extern void task_clear_group_stop_pending(struct task_struct *task);
+#define JOBCTL_STOP_DEQUEUED_BIT 16	/* stop signal dequeued */
+#define JOBCTL_STOP_PENDING_BIT	17	/* task should stop for group stop */
+#define JOBCTL_STOP_CONSUME_BIT	18	/* consume group stop count */
+#define JOBCTL_TRAPPING_BIT	21	/* switching to TRACED */
+
+#define JOBCTL_STOP_DEQUEUED	(1 << JOBCTL_STOP_DEQUEUED_BIT)
+#define JOBCTL_STOP_PENDING	(1 << JOBCTL_STOP_PENDING_BIT)
+#define JOBCTL_STOP_CONSUME	(1 << JOBCTL_STOP_CONSUME_BIT)
+#define JOBCTL_TRAPPING		(1 << JOBCTL_TRAPPING_BIT)
+
+extern void task_clear_jobctl_stop_pending(struct task_struct *task);
 
 #ifdef CONFIG_PREEMPT_RCU
 
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 389497b..da14bd2 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -77,13 +77,13 @@ void __ptrace_unlink(struct task_struct *child)
 	spin_lock(&child->sighand->siglock);
 
 	/*
-	 * Reinstate GROUP_STOP_PENDING if group stop is in effect and
+	 * Reinstate JOBCTL_STOP_PENDING if group stop is in effect and
 	 * @child isn't dead.
 	 */
 	if (!(child->flags & PF_EXITING) &&
 	    (child->signal->flags & SIGNAL_STOP_STOPPED ||
 	     child->signal->group_stop_count))
-		child->group_stop |= GROUP_STOP_PENDING;
+		child->jobctl |= JOBCTL_STOP_PENDING;
 
 	/*
 	 * If transition to TASK_STOPPED is pending or in TASK_TRACED, kick
@@ -91,7 +91,7 @@ void __ptrace_unlink(struct task_struct *child)
 	 * is in TASK_TRACED; otherwise, we might unduly disrupt
 	 * TASK_KILLABLE sleeps.
 	 */
-	if (child->group_stop & GROUP_STOP_PENDING || task_is_traced(child))
+	if (child->jobctl & JOBCTL_STOP_PENDING || task_is_traced(child))
 		signal_wake_up(child, task_is_traced(child));
 
 	spin_unlock(&child->sighand->siglock);
@@ -226,7 +226,7 @@ static int ptrace_attach(struct task_struct *task)
 	spin_lock(&task->sighand->siglock);
 
 	/*
-	 * If the task is already STOPPED, set GROUP_STOP_PENDING and
+	 * If the task is already STOPPED, set JOBCTL_STOP_PENDING and
 	 * TRAPPING, and kick it so that it transits to TRACED.  TRAPPING
 	 * will be cleared if the child completes the transition or any
 	 * event which clears the group stop states happens.  We'll wait
@@ -243,7 +243,7 @@ static int ptrace_attach(struct task_struct *task)
 	 * in and out of STOPPED are protected by siglock.
 	 */
 	if (task_is_stopped(task)) {
-		task->group_stop |= GROUP_STOP_PENDING | GROUP_STOP_TRAPPING;
+		task->jobctl |= JOBCTL_STOP_PENDING | JOBCTL_TRAPPING;
 		signal_wake_up(task, 1);
 	}
 
@@ -256,7 +256,7 @@ unlock_creds:
 	mutex_unlock(&task->signal->cred_guard_mutex);
 out:
 	wait_event(current->signal->wait_chldexit,
-		   !(task->group_stop & GROUP_STOP_TRAPPING));
+			   !(task->jobctl & JOBCTL_TRAPPING));
 	return retval;
 }
 
diff --git a/kernel/signal.c b/kernel/signal.c
index 86c32b8..ab6851c 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -124,7 +124,7 @@ static inline int has_pending_signals(sigset_t *signal, sigset_t *blocked)
 
 static int recalc_sigpending_tsk(struct task_struct *t)
 {
-	if ((t->group_stop & GROUP_STOP_PENDING) ||
+	if ((t->jobctl & JOBCTL_STOP_PENDING) ||
 	    PENDING(&t->pending, &t->blocked) ||
 	    PENDING(&t->signal->shared_pending, &t->blocked)) {
 		set_tsk_thread_flag(t, TIF_SIGPENDING);
@@ -224,27 +224,28 @@ static inline void print_dropped_signal(int sig)
 }
 
 /**
- * task_clear_group_stop_trapping - clear group stop trapping bit
+ * task_clear_jobctl_trapping - clear jobctl trapping bit
  * @task: target task
  *
- * If GROUP_STOP_TRAPPING is set, a ptracer is waiting for us.  Clear it
- * and wake up the ptracer.  Note that we don't need any further locking.
- * @task->siglock guarantees that @task->parent points to the ptracer.
+ * If JOBCTL_TRAPPING is set, a ptracer is waiting for us to enter TRACED.
+ * Clear it and wake up the ptracer.  Note that we don't need any further
+ * locking.  @task->siglock guarantees that @task->parent points to the
+ * ptracer.
  *
  * CONTEXT:
  * Must be called with @task->sighand->siglock held.
  */
-static void task_clear_group_stop_trapping(struct task_struct *task)
+static void task_clear_jobctl_trapping(struct task_struct *task)
 {
-	if (unlikely(task->group_stop & GROUP_STOP_TRAPPING)) {
-		task->group_stop &= ~GROUP_STOP_TRAPPING;
+	if (unlikely(task->jobctl & JOBCTL_TRAPPING)) {
+		task->jobctl &= ~JOBCTL_TRAPPING;
 		__wake_up_sync_key(&task->parent->signal->wait_chldexit,
 				   TASK_UNINTERRUPTIBLE, 1, task);
 	}
 }
 
 /**
- * task_clear_group_stop_pending - clear pending group stop
+ * task_clear_jobctl_stop_pending - clear pending group stop
  * @task: target task
  *
  * Clear group stop states for @task.
@@ -252,19 +253,19 @@ static void task_clear_group_stop_trapping(struct task_struct *task)
  * CONTEXT:
  * Must be called with @task->sighand->siglock held.
  */
-void task_clear_group_stop_pending(struct task_struct *task)
+void task_clear_jobctl_stop_pending(struct task_struct *task)
 {
-	task->group_stop &= ~(GROUP_STOP_PENDING | GROUP_STOP_CONSUME |
-			      GROUP_STOP_DEQUEUED);
+	task->jobctl &= ~(JOBCTL_STOP_PENDING | JOBCTL_STOP_CONSUME |
+			  JOBCTL_STOP_DEQUEUED);
 }
 
 /**
  * task_participate_group_stop - participate in a group stop
  * @task: task participating in a group stop
  *
- * @task has GROUP_STOP_PENDING set and is participating in a group stop.
+ * @task has %JOBCTL_STOP_PENDING set and is participating in a group stop.
  * Group stop states are cleared and the group stop count is consumed if
- * %GROUP_STOP_CONSUME was set.  If the consumption completes the group
+ * %JOBCTL_STOP_CONSUME was set.  If the consumption completes the group
  * stop, the appropriate %SIGNAL_* flags are set.
  *
  * CONTEXT:
@@ -277,11 +278,11 @@ void task_clear_group_stop_pending(struct task_struct *task)
 static bool task_participate_group_stop(struct task_struct *task)
 {
 	struct signal_struct *sig = task->signal;
-	bool consume = task->group_stop & GROUP_STOP_CONSUME;
+	bool consume = task->jobctl & JOBCTL_STOP_CONSUME;
 
-	WARN_ON_ONCE(!(task->group_stop & GROUP_STOP_PENDING));
+	WARN_ON_ONCE(!(task->jobctl & JOBCTL_STOP_PENDING));
 
-	task_clear_group_stop_pending(task);
+	task_clear_jobctl_stop_pending(task);
 
 	if (!consume)
 		return false;
@@ -604,7 +605,7 @@ int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info)
 		 * is to alert stop-signal processing code when another
 		 * processor has come along and cleared the flag.
 		 */
-		current->group_stop |= GROUP_STOP_DEQUEUED;
+		current->jobctl |= JOBCTL_STOP_DEQUEUED;
 	}
 	if ((info->si_code & __SI_MASK) == __SI_TIMER && info->si_sys_private) {
 		/*
@@ -809,7 +810,7 @@ static int prepare_signal(int sig, struct task_struct *p, int from_ancestor_ns)
 		rm_from_queue(SIG_KERNEL_STOP_MASK, &signal->shared_pending);
 		t = p;
 		do {
-			task_clear_group_stop_pending(t);
+			task_clear_jobctl_stop_pending(t);
 			rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending);
 			wake_up_state(t, __TASK_STOPPED);
 		} while_each_thread(p, t);
@@ -925,7 +926,7 @@ static void complete_signal(int sig, struct task_struct *p, int group)
 			signal->group_stop_count = 0;
 			t = p;
 			do {
-				task_clear_group_stop_pending(t);
+				task_clear_jobctl_stop_pending(t);
 				sigaddset(&t->pending.signal, SIGKILL);
 				signal_wake_up(t, 1);
 			} while_each_thread(p, t);
@@ -1160,7 +1161,7 @@ int zap_other_threads(struct task_struct *p)
 	p->signal->group_stop_count = 0;
 
 	while_each_thread(p, t) {
-		task_clear_group_stop_pending(t);
+		task_clear_jobctl_stop_pending(t);
 		count++;
 
 		/* Don't bother with already dead threads */
@@ -1738,7 +1739,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	 * clear now.  We act as if SIGCONT is received after TASK_TRACED
 	 * is entered - ignore it.
 	 */
-	if (why == CLD_STOPPED && (current->group_stop & GROUP_STOP_PENDING))
+	if (why == CLD_STOPPED && (current->jobctl & JOBCTL_STOP_PENDING))
 		gstop_done = task_participate_group_stop(current);
 
 	current->last_siginfo = info;
@@ -1751,12 +1752,12 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	set_current_state(TASK_TRACED);
 
 	/*
-	 * We're committing to trapping.  Clearing GROUP_STOP_TRAPPING and
+	 * We're committing to trapping.  Clearing JOBCTL_TRAPPING and
 	 * transition to TASK_TRACED should be atomic with respect to
-	 * siglock.  This hsould be done after the arch hook as siglock is
+	 * siglock.  This should be done after the arch hook as siglock is
 	 * released and regrabbed across it.
 	 */
-	task_clear_group_stop_trapping(current);
+	task_clear_jobctl_trapping(current);
 
 	spin_unlock_irq(&current->sighand->siglock);
 	read_lock(&tasklist_lock);
@@ -1792,9 +1793,9 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 		 *
 		 * If @gstop_done, the ptracer went away between group stop
 		 * completion and here.  During detach, it would have set
-		 * GROUP_STOP_PENDING on us and we'll re-enter TASK_STOPPED
-		 * in do_signal_stop() on return, so notifying the real
-		 * parent of the group stop completion is enough.
+		 * JOBCTL_STOP_PENDING on us and we'll re-enter
+		 * TASK_STOPPED in do_signal_stop() on return, so notifying
+		 * the real parent of the group stop completion is enough.
 		 */
 		if (gstop_done)
 			do_notify_parent_cldstop(current, false, why);
@@ -1856,14 +1857,14 @@ static int do_signal_stop(int signr)
 {
 	struct signal_struct *sig = current->signal;
 
-	if (!(current->group_stop & GROUP_STOP_PENDING)) {
-		unsigned int gstop = GROUP_STOP_PENDING | GROUP_STOP_CONSUME;
+	if (!(current->jobctl & JOBCTL_STOP_PENDING)) {
+		unsigned int gstop = JOBCTL_STOP_PENDING | JOBCTL_STOP_CONSUME;
 		struct task_struct *t;
 
-		/* signr will be recorded in task->group_stop for retries */
-		WARN_ON_ONCE(signr & ~GROUP_STOP_SIGMASK);
+		/* signr will be recorded in task->jobctl for retries */
+		WARN_ON_ONCE(signr & ~JOBCTL_STOP_SIGMASK);
 
-		if (!likely(current->group_stop & GROUP_STOP_DEQUEUED) ||
+		if (!likely(current->jobctl & JOBCTL_STOP_DEQUEUED) ||
 		    unlikely(signal_group_exit(sig)))
 			return 0;
 		/*
@@ -1890,19 +1891,19 @@ static int do_signal_stop(int signr)
 		else
 			WARN_ON_ONCE(!task_ptrace(current));
 
-		current->group_stop &= ~GROUP_STOP_SIGMASK;
-		current->group_stop |= signr | gstop;
+		current->jobctl &= ~JOBCTL_STOP_SIGMASK;
+		current->jobctl |= signr | gstop;
 		sig->group_stop_count = 1;
 		for (t = next_thread(current); t != current;
 		     t = next_thread(t)) {
-			t->group_stop &= ~GROUP_STOP_SIGMASK;
+			t->jobctl &= ~JOBCTL_STOP_SIGMASK;
 			/*
 			 * Setting state to TASK_STOPPED for a group
 			 * stop is always done with the siglock held,
 			 * so this check has no races.
 			 */
 			if (!(t->flags & PF_EXITING) && !task_is_stopped(t)) {
-				t->group_stop |= signr | gstop;
+				t->jobctl |= signr | gstop;
 				sig->group_stop_count++;
 				signal_wake_up(t, 0);
 			}
@@ -1943,23 +1944,23 @@ retry:
 
 		spin_lock_irq(&current->sighand->siglock);
 	} else {
-		ptrace_stop(current->group_stop & GROUP_STOP_SIGMASK,
+		ptrace_stop(current->jobctl & JOBCTL_STOP_SIGMASK,
 			    CLD_STOPPED, 0, NULL);
 		current->exit_code = 0;
 	}
 
 	/*
-	 * GROUP_STOP_PENDING could be set if another group stop has
+	 * JOBCTL_STOP_PENDING could be set if another group stop has
 	 * started since being woken up or ptrace wants us to transit
 	 * between TASK_STOPPED and TRACED.  Retry group stop.
 	 */
-	if (current->group_stop & GROUP_STOP_PENDING) {
-		WARN_ON_ONCE(!(current->group_stop & GROUP_STOP_SIGMASK));
+	if (current->jobctl & JOBCTL_STOP_PENDING) {
+		WARN_ON_ONCE(!(current->jobctl & JOBCTL_STOP_SIGMASK));
 		goto retry;
 	}
 
 	/* PTRACE_ATTACH might have raced with task killing, clear trapping */
-	task_clear_group_stop_trapping(current);
+	task_clear_jobctl_trapping(current);
 
 	spin_unlock_irq(&current->sighand->siglock);
 
@@ -2078,8 +2079,8 @@ relock:
 		if (unlikely(signr != 0))
 			ka = return_ka;
 		else {
-			if (unlikely(current->group_stop &
-				     GROUP_STOP_PENDING) && do_signal_stop(0))
+			if (unlikely(current->jobctl & JOBCTL_STOP_PENDING) &&
+			    do_signal_stop(0))
 				goto relock;
 
 			signr = dequeue_signal(current, &current->blocked,
@@ -2253,7 +2254,7 @@ void exit_signals(struct task_struct *tsk)
 	signotset(&unblocked);
 	retarget_shared_pending(tsk, &unblocked);
 
-	if (unlikely(tsk->group_stop & GROUP_STOP_PENDING) &&
+	if (unlikely(tsk->jobctl & JOBCTL_STOP_PENDING) &&
 	    task_participate_group_stop(tsk))
 		group_stop = CLD_STOPPED;
 out:
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 03/17] ptrace: ptrace_check_attach(): rename @kill to @ignore_state and add comments
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
  2011-05-29 23:12 ` [PATCH 01/17] ptrace: remove silly wait_trap variable from ptrace_attach() Tejun Heo
  2011-05-29 23:12 ` [PATCH 02/17] job control: rename signal->group_stop and flags to jobctl and update them Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-05-29 23:12 ` [PATCH 04/17] ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop() Tejun Heo
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

PTRACE_INTERRUPT is going to be added which should also skip
task_is_traced() check in ptrace_check_attach().  Rename @kill to
@ignore_state and make it bool.  Add function comment while at it.

This patch doesn't introduce any behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/ptrace.h |    2 +-
 kernel/ptrace.c        |   24 +++++++++++++++++++-----
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 9178d5c..e93ef1a 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -105,7 +105,7 @@ extern long arch_ptrace(struct task_struct *child, long request,
 extern int ptrace_readdata(struct task_struct *tsk, unsigned long src, char __user *dst, int len);
 extern int ptrace_writedata(struct task_struct *tsk, char __user *src, unsigned long dst, int len);
 extern void ptrace_disable(struct task_struct *);
-extern int ptrace_check_attach(struct task_struct *task, int kill);
+extern int ptrace_check_attach(struct task_struct *task, bool ignore_state);
 extern int ptrace_request(struct task_struct *child, long request,
 			  unsigned long addr, unsigned long data);
 extern void ptrace_notify(int exit_code);
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index da14bd2..cbda3c2 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -97,10 +97,24 @@ void __ptrace_unlink(struct task_struct *child)
 	spin_unlock(&child->sighand->siglock);
 }
 
-/*
- * Check that we have indeed attached to the thing..
+/**
+ * ptrace_check_attach - check whether ptracee is ready for ptrace operation
+ * @child: ptracee to check for
+ * @ignore_state: don't check whether @child is currently %TASK_TRACED
+ *
+ * Check whether @child is being ptraced by %current and ready for further
+ * ptrace operations.  If @ignore_state is %false, @child also should be in
+ * %TASK_TRACED state and on return the child is guaranteed to be traced
+ * and not executing.  If @ignore_state is %true, @child can be in any
+ * state.
+ *
+ * CONTEXT:
+ * Grabs and releases tasklist_lock and @child->sighand->siglock.
+ *
+ * RETURNS:
+ * 0 on success, -ESRCH if %child is not ready.
  */
-int ptrace_check_attach(struct task_struct *child, int kill)
+int ptrace_check_attach(struct task_struct *child, bool ignore_state)
 {
 	int ret = -ESRCH;
 
@@ -119,13 +133,13 @@ int ptrace_check_attach(struct task_struct *child, int kill)
 		 */
 		spin_lock_irq(&child->sighand->siglock);
 		WARN_ON_ONCE(task_is_stopped(child));
-		if (task_is_traced(child) || kill)
+		if (task_is_traced(child) || ignore_state)
 			ret = 0;
 		spin_unlock_irq(&child->sighand->siglock);
 	}
 	read_unlock(&tasklist_lock);
 
-	if (!ret && !kill)
+	if (!ret && !ignore_state)
 		ret = wait_task_inactive(child, TASK_TRACED) ? 0 : -ESRCH;
 
 	/* All systems go.. */
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 04/17] ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop()
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
                   ` (2 preceding siblings ...)
  2011-05-29 23:12 ` [PATCH 03/17] ptrace: ptrace_check_attach(): rename @kill to @ignore_state and add comments Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-05-29 23:12 ` [PATCH 05/17] job control: introduce JOBCTL_PENDING_MASK and task_clear_jobctl_pending() Tejun Heo
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

In ptrace_stop(), after arch hook is done, the task state and jobctl
bits are updated while holding siglock.  The ordering requirement
there is that TASK_TRACED is set before JOBCTL_TRAPPING is cleared to
prevent ptracer waiting on TRAPPING doesn't end up waking up TRACED is
actually set and sees TASK_RUNNING in wait(2).

Move set_current_state(TASK_TRACED) to the top of the block and
reorganize comments.  This makes the ordering more obvious
(TASK_TRACED before other updates) and helps future updates to group
stop participation.

This patch doesn't cause any functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/signal.c |   28 +++++++++++++---------------
 1 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index ab6851c..62a6c3b 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1733,6 +1733,18 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	}
 
 	/*
+	 * We're committing to trapping.  TRACED should be visible before
+	 * TRAPPING is cleared; otherwise, the tracer might fail do_wait().
+	 * Also, transition to TRACED and updates to ->jobctl should be
+	 * atomic with respect to siglock and should be done after the arch
+	 * hook as siglock is released and regrabbed across it.
+	 */
+	set_current_state(TASK_TRACED);
+
+	current->last_siginfo = info;
+	current->exit_code = exit_code;
+
+	/*
 	 * If @why is CLD_STOPPED, we're trapping to participate in a group
 	 * stop.  Do the bookkeeping.  Note that if SIGCONT was delievered
 	 * while siglock was released for the arch hook, PENDING could be
@@ -1742,21 +1754,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	if (why == CLD_STOPPED && (current->jobctl & JOBCTL_STOP_PENDING))
 		gstop_done = task_participate_group_stop(current);
 
-	current->last_siginfo = info;
-	current->exit_code = exit_code;
-
-	/*
-	 * TRACED should be visible before TRAPPING is cleared; otherwise,
-	 * the tracer might fail do_wait().
-	 */
-	set_current_state(TASK_TRACED);
-
-	/*
-	 * We're committing to trapping.  Clearing JOBCTL_TRAPPING and
-	 * transition to TASK_TRACED should be atomic with respect to
-	 * siglock.  This should be done after the arch hook as siglock is
-	 * released and regrabbed across it.
-	 */
+	/* entering a trap, clear TRAPPING */
 	task_clear_jobctl_trapping(current);
 
 	spin_unlock_irq(&current->sighand->siglock);
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 05/17] job control: introduce JOBCTL_PENDING_MASK and task_clear_jobctl_pending()
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
                   ` (3 preceding siblings ...)
  2011-05-29 23:12 ` [PATCH 04/17] ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop() Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-05-29 23:12 ` [PATCH 06/17] job control: make task_clear_jobctl_pending() clear TRAPPING automatically Tejun Heo
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

This patch introduces JOBCTL_PENDING_MASK and replaces
task_clear_jobctl_stop_pending() with task_clear_jobctl_pending()
which takes an extra @mask argument.

JOBCTL_PENDING_MASK is currently equal to JOBCTL_STOP_PENDING but
future patches will add more bits.  recalc_sigpending_tsk() is updated
to use JOBCTL_PENDING_MASK instead.

task_clear_jobctl_pending() takes @mask which in subset of
JOBCTL_PENDING_MASK and clears the relevant jobctl bits.  If
JOBCTL_STOP_PENDING is set, other STOP bits are cleared together.  All
task_clear_jobctl_stop_pending() users are updated to call
task_clear_jobctl_pending() with JOBCTL_STOP_PENDING which is
functionally identical to task_clear_jobctl_stop_pending().

This patch doesn't cause any functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/exec.c             |    2 +-
 include/linux/sched.h |    5 ++++-
 kernel/signal.c       |   27 +++++++++++++++++----------
 3 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 8986bb0..4402105 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1772,7 +1772,7 @@ static int zap_process(struct task_struct *start, int exit_code)
 
 	t = start;
 	do {
-		task_clear_jobctl_stop_pending(t);
+		task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
 		if (t != current && t->mm) {
 			sigaddset(&t->pending.signal, SIGKILL);
 			signal_wake_up(t, 1);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index f4537d5..11791dd 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1817,7 +1817,10 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 #define JOBCTL_STOP_CONSUME	(1 << JOBCTL_STOP_CONSUME_BIT)
 #define JOBCTL_TRAPPING		(1 << JOBCTL_TRAPPING_BIT)
 
-extern void task_clear_jobctl_stop_pending(struct task_struct *task);
+#define JOBCTL_PENDING_MASK	JOBCTL_STOP_PENDING
+
+extern void task_clear_jobctl_pending(struct task_struct *task,
+				      unsigned int mask);
 
 #ifdef CONFIG_PREEMPT_RCU
 
diff --git a/kernel/signal.c b/kernel/signal.c
index 62a6c3b..288d952 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -124,7 +124,7 @@ static inline int has_pending_signals(sigset_t *signal, sigset_t *blocked)
 
 static int recalc_sigpending_tsk(struct task_struct *t)
 {
-	if ((t->jobctl & JOBCTL_STOP_PENDING) ||
+	if ((t->jobctl & JOBCTL_PENDING_MASK) ||
 	    PENDING(&t->pending, &t->blocked) ||
 	    PENDING(&t->signal->shared_pending, &t->blocked)) {
 		set_tsk_thread_flag(t, TIF_SIGPENDING);
@@ -245,18 +245,25 @@ static void task_clear_jobctl_trapping(struct task_struct *task)
 }
 
 /**
- * task_clear_jobctl_stop_pending - clear pending group stop
+ * task_clear_jobctl_pending - clear jobctl pending bits
  * @task: target task
+ * @mask: pending bits to clear
  *
- * Clear group stop states for @task.
+ * Clear @mask from @task->jobctl.  @mask must be subset of
+ * %JOBCTL_PENDING_MASK.  If %JOBCTL_STOP_PENDING is being cleared, other
+ * STOP bits are cleared together.
  *
  * CONTEXT:
  * Must be called with @task->sighand->siglock held.
  */
-void task_clear_jobctl_stop_pending(struct task_struct *task)
+void task_clear_jobctl_pending(struct task_struct *task, unsigned int mask)
 {
-	task->jobctl &= ~(JOBCTL_STOP_PENDING | JOBCTL_STOP_CONSUME |
-			  JOBCTL_STOP_DEQUEUED);
+	BUG_ON(mask & ~JOBCTL_PENDING_MASK);
+
+	if (mask & JOBCTL_STOP_PENDING)
+		mask |= JOBCTL_STOP_CONSUME | JOBCTL_STOP_DEQUEUED;
+
+	task->jobctl &= ~mask;
 }
 
 /**
@@ -282,7 +289,7 @@ static bool task_participate_group_stop(struct task_struct *task)
 
 	WARN_ON_ONCE(!(task->jobctl & JOBCTL_STOP_PENDING));
 
-	task_clear_jobctl_stop_pending(task);
+	task_clear_jobctl_pending(task, JOBCTL_STOP_PENDING);
 
 	if (!consume)
 		return false;
@@ -810,7 +817,7 @@ static int prepare_signal(int sig, struct task_struct *p, int from_ancestor_ns)
 		rm_from_queue(SIG_KERNEL_STOP_MASK, &signal->shared_pending);
 		t = p;
 		do {
-			task_clear_jobctl_stop_pending(t);
+			task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
 			rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending);
 			wake_up_state(t, __TASK_STOPPED);
 		} while_each_thread(p, t);
@@ -926,7 +933,7 @@ static void complete_signal(int sig, struct task_struct *p, int group)
 			signal->group_stop_count = 0;
 			t = p;
 			do {
-				task_clear_jobctl_stop_pending(t);
+				task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
 				sigaddset(&t->pending.signal, SIGKILL);
 				signal_wake_up(t, 1);
 			} while_each_thread(p, t);
@@ -1161,7 +1168,7 @@ int zap_other_threads(struct task_struct *p)
 	p->signal->group_stop_count = 0;
 
 	while_each_thread(p, t) {
-		task_clear_jobctl_stop_pending(t);
+		task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
 		count++;
 
 		/* Don't bother with already dead threads */
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 06/17] job control: make task_clear_jobctl_pending() clear TRAPPING automatically
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
                   ` (4 preceding siblings ...)
  2011-05-29 23:12 ` [PATCH 05/17] job control: introduce JOBCTL_PENDING_MASK and task_clear_jobctl_pending() Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-05-29 23:12 ` [PATCH 07/17] job control: introduce task_set_jobctl_pending() Tejun Heo
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

JOBCTL_TRAPPING indicates that ptracer is waiting for tracee to
(re)transit into TRACED.  task_clear_jobctl_pending() must be called
when either tracee enters TRACED or the transition is cancelled for
some reason.  The former is achieved by explicitly calling
task_clear_jobctl_pending() in ptrace_stop() and the latter by calling
it at the end of do_signal_stop().

Calling task_clear_jobctl_trapping() at the end of do_signal_stop()
limits the scope TRAPPING can be used and is fragile in that seemingly
unrelated changes to tracee's control flow can lead to stuck TRAPPING.

We already have task_clear_jobctl_pending() calls on those cancelling
events to clear JOBCTL_STOP_PENDING.  Cancellations can be handled by
making those call sites use JOBCTL_PENDING_MASK instead and updating
task_clear_jobctl_pending() such that task_clear_jobctl_trapping() is
called automatically if no stop/trap is pending.

This patch makes the above changes and removes the fallback
task_clear_jobctl_trapping() call from do_signal_stop().

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/exec.c       |    2 +-
 kernel/signal.c |   13 ++++++++-----
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 4402105..a9f2b36 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1772,7 +1772,7 @@ static int zap_process(struct task_struct *start, int exit_code)
 
 	t = start;
 	do {
-		task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
+		task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
 		if (t != current && t->mm) {
 			sigaddset(&t->pending.signal, SIGKILL);
 			signal_wake_up(t, 1);
diff --git a/kernel/signal.c b/kernel/signal.c
index 288d952..637a171 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -253,6 +253,9 @@ static void task_clear_jobctl_trapping(struct task_struct *task)
  * %JOBCTL_PENDING_MASK.  If %JOBCTL_STOP_PENDING is being cleared, other
  * STOP bits are cleared together.
  *
+ * If clearing of @mask leaves no stop or trap pending, this function calls
+ * task_clear_jobctl_trapping().
+ *
  * CONTEXT:
  * Must be called with @task->sighand->siglock held.
  */
@@ -264,6 +267,9 @@ void task_clear_jobctl_pending(struct task_struct *task, unsigned int mask)
 		mask |= JOBCTL_STOP_CONSUME | JOBCTL_STOP_DEQUEUED;
 
 	task->jobctl &= ~mask;
+
+	if (!(task->jobctl & JOBCTL_PENDING_MASK))
+		task_clear_jobctl_trapping(task);
 }
 
 /**
@@ -933,7 +939,7 @@ static void complete_signal(int sig, struct task_struct *p, int group)
 			signal->group_stop_count = 0;
 			t = p;
 			do {
-				task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
+				task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
 				sigaddset(&t->pending.signal, SIGKILL);
 				signal_wake_up(t, 1);
 			} while_each_thread(p, t);
@@ -1168,7 +1174,7 @@ int zap_other_threads(struct task_struct *p)
 	p->signal->group_stop_count = 0;
 
 	while_each_thread(p, t) {
-		task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
+		task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
 		count++;
 
 		/* Don't bother with already dead threads */
@@ -1964,9 +1970,6 @@ retry:
 		goto retry;
 	}
 
-	/* PTRACE_ATTACH might have raced with task killing, clear trapping */
-	task_clear_jobctl_trapping(current);
-
 	spin_unlock_irq(&current->sighand->siglock);
 
 	tracehook_finish_jctl();
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 07/17] job control: introduce task_set_jobctl_pending()
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
                   ` (5 preceding siblings ...)
  2011-05-29 23:12 ` [PATCH 06/17] job control: make task_clear_jobctl_pending() clear TRAPPING automatically Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-05-29 23:12 ` [PATCH 08/17] ptrace: use bit_waitqueue for TRAPPING instead of wait_chldexit Tejun Heo
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

task->jobctl currently hosts JOBCTL_STOP_PENDING and will host TRAP
pending bits too.  Setting pending conditions on a dying task may make
the task unkillable.  Currently, each setting site is responsible for
checking for the condition but with to-be-added job control traps this
becomes too fragile.

This patch adds task_set_jobctl_pending() which should be used when
setting task->jobctl bits to schedule a stop or trap.  The function
performs the followings to ease setting pending bits.

* Sanity checks.

* If fatal signal is pending or PF_EXITING is set, no bit is set.

* STOP_SIGMASK is automatically cleared if new value is being set.

do_signal_stop() and ptrace_attach() are updated to use
task_set_jobctl_pending() instead of setting STOP_PENDING explicitly.
The surrounding structures around setting are changed to fit
task_set_jobctl_pending() better but there should be no userland
visible behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/sched.h |    2 ++
 kernel/ptrace.c       |    6 +++---
 kernel/signal.c       |   46 ++++++++++++++++++++++++++++++++++++++++------
 3 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 11791dd..bc7a380 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1819,6 +1819,8 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 
 #define JOBCTL_PENDING_MASK	JOBCTL_STOP_PENDING
 
+extern bool task_set_jobctl_pending(struct task_struct *task,
+				    unsigned int mask);
 extern void task_clear_jobctl_pending(struct task_struct *task,
 				      unsigned int mask);
 
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index cbda3c2..1506556 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -256,10 +256,10 @@ static int ptrace_attach(struct task_struct *task)
 	 * The following task_is_stopped() test is safe as both transitions
 	 * in and out of STOPPED are protected by siglock.
 	 */
-	if (task_is_stopped(task)) {
-		task->jobctl |= JOBCTL_STOP_PENDING | JOBCTL_TRAPPING;
+	if (task_is_stopped(task) &&
+	    task_set_jobctl_pending(task,
+				    JOBCTL_STOP_PENDING | JOBCTL_TRAPPING))
 		signal_wake_up(task, 1);
-	}
 
 	spin_unlock(&task->sighand->siglock);
 
diff --git a/kernel/signal.c b/kernel/signal.c
index 637a171..9ab91c5 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -224,6 +224,39 @@ static inline void print_dropped_signal(int sig)
 }
 
 /**
+ * task_set_jobctl_pending - set jobctl pending bits
+ * @task: target task
+ * @mask: pending bits to set
+ *
+ * Clear @mask from @task->jobctl.  @mask must be subset of
+ * %JOBCTL_PENDING_MASK | %JOBCTL_STOP_CONSUME | %JOBCTL_STOP_SIGMASK |
+ * %JOBCTL_TRAPPING.  If stop signo is being set, the existing signo is
+ * cleared.  If @task is already being killed or exiting, this function
+ * becomes noop.
+ *
+ * CONTEXT:
+ * Must be called with @task->sighand->siglock held.
+ *
+ * RETURNS:
+ * %true if @mask is set, %false if made noop because @task was dying.
+ */
+bool task_set_jobctl_pending(struct task_struct *task, unsigned int mask)
+{
+	BUG_ON(mask & ~(JOBCTL_PENDING_MASK | JOBCTL_STOP_CONSUME |
+			JOBCTL_STOP_SIGMASK | JOBCTL_TRAPPING));
+	BUG_ON((mask & JOBCTL_TRAPPING) && !(mask & JOBCTL_PENDING_MASK));
+
+	if (unlikely(fatal_signal_pending(task) || (task->flags & PF_EXITING)))
+		return false;
+
+	if (mask & JOBCTL_STOP_SIGMASK)
+		task->jobctl &= ~JOBCTL_STOP_SIGMASK;
+
+	task->jobctl |= mask;
+	return true;
+}
+
+/**
  * task_clear_jobctl_trapping - clear jobctl trapping bit
  * @task: target task
  *
@@ -1902,19 +1935,20 @@ static int do_signal_stop(int signr)
 		else
 			WARN_ON_ONCE(!task_ptrace(current));
 
-		current->jobctl &= ~JOBCTL_STOP_SIGMASK;
-		current->jobctl |= signr | gstop;
-		sig->group_stop_count = 1;
+		sig->group_stop_count = 0;
+
+		if (task_set_jobctl_pending(current, signr | gstop))
+			sig->group_stop_count++;
+
 		for (t = next_thread(current); t != current;
 		     t = next_thread(t)) {
-			t->jobctl &= ~JOBCTL_STOP_SIGMASK;
 			/*
 			 * Setting state to TASK_STOPPED for a group
 			 * stop is always done with the siglock held,
 			 * so this check has no races.
 			 */
-			if (!(t->flags & PF_EXITING) && !task_is_stopped(t)) {
-				t->jobctl |= signr | gstop;
+			if (!task_is_stopped(t) &&
+			    task_set_jobctl_pending(t, signr | gstop)) {
 				sig->group_stop_count++;
 				signal_wake_up(t, 0);
 			}
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 08/17] ptrace: use bit_waitqueue for TRAPPING instead of wait_chldexit
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
                   ` (6 preceding siblings ...)
  2011-05-29 23:12 ` [PATCH 07/17] job control: introduce task_set_jobctl_pending() Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-06-02 11:41   ` [PATCH UPDATED " Tejun Heo
  2011-05-29 23:12 ` [PATCH 09/17] signal: remove three noop tracehooks Tejun Heo
                   ` (9 subsequent siblings)
  17 siblings, 1 reply; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

ptracer->signal->wait_chldexit was used to wait for TRAPPING; however,
->wait_chldexit was already complicated with waker-side filtering
without adding TRAPPING wait on top of it.  Also, it unnecessarily
made TRAPPING clearing depend on the current ptrace relationship - if
the ptracee is detached, wakeup is lost.

There is no reason to use signal->wait_chldexit here.  We're just
waiting for JOBCTL_TRAPPING bit to clear and given the relatively
infrequent use of ptrace, bit_waitqueue can serve it perfectly.

This patch makes JOBCTL_TRAPPING wait use bit_waitqueue instead of
signal->wait_chldexit.

-v2: Use JOBCTL_*_BIT macros instead of ilog2() as suggested by Linus.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
 kernel/ptrace.c |   10 ++++++++--
 kernel/signal.c |    3 +--
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 1506556..82f566b 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -25,6 +25,12 @@
 #include <linux/hw_breakpoint.h>
 
 
+static int ptrace_trapping_sleep_fn(void *flags)
+{
+	schedule();
+	return 0;
+}
+
 /*
  * ptrace a task: make the debugger its new parent and
  * move it to the ptrace list.
@@ -269,8 +275,8 @@ unlock_tasklist:
 unlock_creds:
 	mutex_unlock(&task->signal->cred_guard_mutex);
 out:
-	wait_event(current->signal->wait_chldexit,
-			   !(task->jobctl & JOBCTL_TRAPPING));
+	wait_on_bit(&task->jobctl, JOBCTL_TRAPPING_BIT,
+			    ptrace_trapping_sleep_fn, TASK_UNINTERRUPTIBLE);
 	return retval;
 }
 
diff --git a/kernel/signal.c b/kernel/signal.c
index 9ab91c5..172a4c7 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -272,8 +272,7 @@ static void task_clear_jobctl_trapping(struct task_struct *task)
 {
 	if (unlikely(task->jobctl & JOBCTL_TRAPPING)) {
 		task->jobctl &= ~JOBCTL_TRAPPING;
-		__wake_up_sync_key(&task->parent->signal->wait_chldexit,
-				   TASK_UNINTERRUPTIBLE, 1, task);
+		wake_up_bit(&task->jobctl, JOBCTL_TRAPPING_BIT);
 	}
 }
 
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 09/17] signal: remove three noop tracehooks
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
                   ` (7 preceding siblings ...)
  2011-05-29 23:12 ` [PATCH 08/17] ptrace: use bit_waitqueue for TRAPPING instead of wait_chldexit Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-05-29 23:12 ` [PATCH 10/17] job control: introduce JOBCTL_TRAP_STOP and use it for group stop trap Tejun Heo
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

Remove the following three noop tracehooks in signals.c.

* tracehook_force_sigpending()
* tracehook_get_signal()
* tracehook_finish_jctl()

The code area is about to be updated and these hooks don't do anything
other than obfuscating the logic.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/tracehook.h |   52 ---------------------------------------------
 kernel/signal.c           |   44 ++++++++++++--------------------------
 2 files changed, 14 insertions(+), 82 deletions(-)

diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index e95f523..15745cd 100644
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -425,58 +425,6 @@ static inline int tracehook_consider_fatal_signal(struct task_struct *task,
 	return (task_ptrace(task) & PT_PTRACED) != 0;
 }
 
-/**
- * tracehook_force_sigpending - let tracing force signal_pending(current) on
- *
- * Called when recomputing our signal_pending() flag.  Return nonzero
- * to force the signal_pending() flag on, so that tracehook_get_signal()
- * will be called before the next return to user mode.
- *
- * Called with @current->sighand->siglock held.
- */
-static inline int tracehook_force_sigpending(void)
-{
-	return 0;
-}
-
-/**
- * tracehook_get_signal - deliver synthetic signal to traced task
- * @task:		@current
- * @regs:		task_pt_regs(@current)
- * @info:		details of synthetic signal
- * @return_ka:		sigaction for synthetic signal
- *
- * Return zero to check for a real pending signal normally.
- * Return -1 after releasing the siglock to repeat the check.
- * Return a signal number to induce an artificial signal delivery,
- * setting *@info and *@return_ka to specify its details and behavior.
- *
- * The @return_ka->sa_handler value controls the disposition of the
- * signal, no matter the signal number.  For %SIG_DFL, the return value
- * is a representative signal to indicate the behavior (e.g. %SIGTERM
- * for death, %SIGQUIT for core dump, %SIGSTOP for job control stop,
- * %SIGTSTP for stop unless in an orphaned pgrp), but the signal number
- * reported will be @info->si_signo instead.
- *
- * Called with @task->sighand->siglock held, before dequeuing pending signals.
- */
-static inline int tracehook_get_signal(struct task_struct *task,
-				       struct pt_regs *regs,
-				       siginfo_t *info,
-				       struct k_sigaction *return_ka)
-{
-	return 0;
-}
-
-/**
- * tracehook_finish_jctl - report about return from job control stop
- *
- * This is called by do_signal_stop() after wakeup.
- */
-static inline void tracehook_finish_jctl(void)
-{
-}
-
 #define DEATH_REAP			-1
 #define DEATH_DELAYED_GROUP_LEADER	-2
 
diff --git a/kernel/signal.c b/kernel/signal.c
index 172a4c7..c99b8b5 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -150,9 +150,7 @@ void recalc_sigpending_and_wake(struct task_struct *t)
 
 void recalc_sigpending(void)
 {
-	if (unlikely(tracehook_force_sigpending()))
-		set_thread_flag(TIF_SIGPENDING);
-	else if (!recalc_sigpending_tsk(current) && !freezing(current))
+	if (!recalc_sigpending_tsk(current) && !freezing(current))
 		clear_thread_flag(TIF_SIGPENDING);
 
 }
@@ -2005,8 +2003,6 @@ retry:
 
 	spin_unlock_irq(&current->sighand->siglock);
 
-	tracehook_finish_jctl();
-
 	return 1;
 }
 
@@ -2109,37 +2105,25 @@ relock:
 
 	for (;;) {
 		struct k_sigaction *ka;
-		/*
-		 * Tracing can induce an artificial signal and choose sigaction.
-		 * The return value in @signr determines the default action,
-		 * but @info->si_signo is the signal number we will report.
-		 */
-		signr = tracehook_get_signal(current, regs, info, return_ka);
-		if (unlikely(signr < 0))
+
+		if (unlikely(current->jobctl & JOBCTL_STOP_PENDING) &&
+		    do_signal_stop(0))
 			goto relock;
-		if (unlikely(signr != 0))
-			ka = return_ka;
-		else {
-			if (unlikely(current->jobctl & JOBCTL_STOP_PENDING) &&
-			    do_signal_stop(0))
-				goto relock;
 
-			signr = dequeue_signal(current, &current->blocked,
-					       info);
+		signr = dequeue_signal(current, &current->blocked, info);
 
-			if (!signr)
-				break; /* will return 0 */
+		if (!signr)
+			break; /* will return 0 */
 
-			if (signr != SIGKILL) {
-				signr = ptrace_signal(signr, info,
-						      regs, cookie);
-				if (!signr)
-					continue;
-			}
-
-			ka = &sighand->action[signr-1];
+		if (signr != SIGKILL) {
+			signr = ptrace_signal(signr, info,
+					      regs, cookie);
+			if (!signr)
+				continue;
 		}
 
+		ka = &sighand->action[signr-1];
+
 		/* Trace actually delivered signals. */
 		trace_signal_deliver(signr, info, ka);
 
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 10/17] job control: introduce JOBCTL_TRAP_STOP and use it for group stop trap
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
                   ` (8 preceding siblings ...)
  2011-05-29 23:12 ` [PATCH 09/17] signal: remove three noop tracehooks Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-05-29 23:12 ` [PATCH 11/17] ptrace: implement PTRACE_SEIZE Tejun Heo
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

do_signal_stop() implemented both normal group stop and trap for group
stop while ptraced.  This approach has been enough but scheduled
changes require trap mechanism which can be used in more generic
manner and using group stop trap for generic trap site simplifies both
userland visible interface and implementation.

This patch adds a new jobctl flag - JOBCTL_TRAP_STOP.  When set, it
triggers a trap site, which behaves like group stop trap, in
get_signal_to_deliver() after checking for pending signals.  While
ptraced, do_signal_stop() doesn't stop itself.  It initiates group
stop if requested and schedules JOBCTL_TRAP_STOP and returns.  The
caller - get_signal_to_deliver() - is responsible for checking whether
TRAP_STOP is pending afterwards and handling it.

ptrace_attach() is updated to use JOBCTL_TRAP_STOP instead of
JOBCTL_STOP_PENDING and __ptrace_unlink() to clear all pending trap
bits and TRAPPING so that TRAP_STOP and future trap bits don't linger
after detach.

While at it, add proper function comment to do_signal_stop() and make
it return bool.

-v2: __ptrace_unlink() updated to clear JOBCTL_TRAP_MASK and TRAPPING
     instead of JOBCTL_PENDING_MASK.  This avoids accidentally
     clearing JOBCTL_STOP_CONSUME.  Spotted by Oleg.

-v3: do_signal_stop() updated to return %false without dropping
     siglock while ptraced and TRAP_STOP check moved inside for(;;)
     loop after group stop participation.  This avoids unnecessary
     relocking and also will help avoiding unnecessary traps by
     consuming group stop before handling pending traps.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/sched.h |    6 +++-
 kernel/ptrace.c       |   12 +++++--
 kernel/signal.c       |   83 ++++++++++++++++++++++++++++++------------------
 3 files changed, 66 insertions(+), 35 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index bc7a380..20f392b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1810,17 +1810,21 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 #define JOBCTL_STOP_DEQUEUED_BIT 16	/* stop signal dequeued */
 #define JOBCTL_STOP_PENDING_BIT	17	/* task should stop for group stop */
 #define JOBCTL_STOP_CONSUME_BIT	18	/* consume group stop count */
+#define JOBCTL_TRAP_STOP_BIT	19	/* trap for STOP */
 #define JOBCTL_TRAPPING_BIT	21	/* switching to TRACED */
 
 #define JOBCTL_STOP_DEQUEUED	(1 << JOBCTL_STOP_DEQUEUED_BIT)
 #define JOBCTL_STOP_PENDING	(1 << JOBCTL_STOP_PENDING_BIT)
 #define JOBCTL_STOP_CONSUME	(1 << JOBCTL_STOP_CONSUME_BIT)
+#define JOBCTL_TRAP_STOP	(1 << JOBCTL_TRAP_STOP_BIT)
 #define JOBCTL_TRAPPING		(1 << JOBCTL_TRAPPING_BIT)
 
-#define JOBCTL_PENDING_MASK	JOBCTL_STOP_PENDING
+#define JOBCTL_TRAP_MASK	JOBCTL_TRAP_STOP
+#define JOBCTL_PENDING_MASK	(JOBCTL_STOP_PENDING | JOBCTL_TRAP_MASK)
 
 extern bool task_set_jobctl_pending(struct task_struct *task,
 				    unsigned int mask);
+extern void task_clear_jobctl_trapping(struct task_struct *task);
 extern void task_clear_jobctl_pending(struct task_struct *task,
 				      unsigned int mask);
 
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 82f566b..2ac7260 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -83,6 +83,13 @@ void __ptrace_unlink(struct task_struct *child)
 	spin_lock(&child->sighand->siglock);
 
 	/*
+	 * Clear all pending traps and TRAPPING.  TRAPPING should be
+	 * cleared regardless of JOBCTL_STOP_PENDING.  Do it explicitly.
+	 */
+	task_clear_jobctl_pending(child, JOBCTL_TRAP_MASK);
+	task_clear_jobctl_trapping(child);
+
+	/*
 	 * Reinstate JOBCTL_STOP_PENDING if group stop is in effect and
 	 * @child isn't dead.
 	 */
@@ -246,7 +253,7 @@ static int ptrace_attach(struct task_struct *task)
 	spin_lock(&task->sighand->siglock);
 
 	/*
-	 * If the task is already STOPPED, set JOBCTL_STOP_PENDING and
+	 * If the task is already STOPPED, set JOBCTL_TRAP_STOP and
 	 * TRAPPING, and kick it so that it transits to TRACED.  TRAPPING
 	 * will be cleared if the child completes the transition or any
 	 * event which clears the group stop states happens.  We'll wait
@@ -263,8 +270,7 @@ static int ptrace_attach(struct task_struct *task)
 	 * in and out of STOPPED are protected by siglock.
 	 */
 	if (task_is_stopped(task) &&
-	    task_set_jobctl_pending(task,
-				    JOBCTL_STOP_PENDING | JOBCTL_TRAPPING))
+	    task_set_jobctl_pending(task, JOBCTL_TRAP_STOP | JOBCTL_TRAPPING))
 		signal_wake_up(task, 1);
 
 	spin_unlock(&task->sighand->siglock);
diff --git a/kernel/signal.c b/kernel/signal.c
index c99b8b5..262bb6c 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -266,7 +266,7 @@ bool task_set_jobctl_pending(struct task_struct *task, unsigned int mask)
  * CONTEXT:
  * Must be called with @task->sighand->siglock held.
  */
-static void task_clear_jobctl_trapping(struct task_struct *task)
+void task_clear_jobctl_trapping(struct task_struct *task)
 {
 	if (unlikely(task->jobctl & JOBCTL_TRAPPING)) {
 		task->jobctl &= ~JOBCTL_TRAPPING;
@@ -1790,13 +1790,16 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	/*
 	 * If @why is CLD_STOPPED, we're trapping to participate in a group
 	 * stop.  Do the bookkeeping.  Note that if SIGCONT was delievered
-	 * while siglock was released for the arch hook, PENDING could be
-	 * clear now.  We act as if SIGCONT is received after TASK_TRACED
-	 * is entered - ignore it.
+	 * across siglock relocks since INTERRUPT was scheduled, PENDING
+	 * could be clear now.  We act as if SIGCONT is received after
+	 * TASK_TRACED is entered - ignore it.
 	 */
 	if (why == CLD_STOPPED && (current->jobctl & JOBCTL_STOP_PENDING))
 		gstop_done = task_participate_group_stop(current);
 
+	/* any trap clears pending STOP trap */
+	task_clear_jobctl_pending(current, JOBCTL_TRAP_STOP);
+
 	/* entering a trap, clear TRAPPING */
 	task_clear_jobctl_trapping(current);
 
@@ -1888,13 +1891,30 @@ void ptrace_notify(int exit_code)
 	spin_unlock_irq(&current->sighand->siglock);
 }
 
-/*
- * This performs the stopping for SIGSTOP and other stop signals.
- * We have to stop all threads in the thread group.
- * Returns non-zero if we've actually stopped and released the siglock.
- * Returns zero if we didn't stop and still hold the siglock.
+/**
+ * do_signal_stop - handle group stop for SIGSTOP and other stop signals
+ * @signr: signr causing group stop if initiating
+ *
+ * If %JOBCTL_STOP_PENDING is not set yet, initiate group stop with @signr
+ * and participate in it.  If already set, participate in the existing
+ * group stop.  If participated in a group stop (and thus slept), %true is
+ * returned with siglock released.
+ *
+ * If ptraced, this function doesn't handle stop itself.  Instead,
+ * %JOBCTL_TRAP_STOP is scheduled and %false is returned with siglock
+ * untouched.  The caller must ensure that INTERRUPT trap handling takes
+ * places afterwards.
+ *
+ * CONTEXT:
+ * Must be called with @current->sighand->siglock held, which is released
+ * on %true return.
+ *
+ * RETURNS:
+ * %false if group stop is already cancelled or ptrace trap is scheduled.
+ * %true if participated in group stop.
  */
-static int do_signal_stop(int signr)
+static bool do_signal_stop(int signr)
+	__releases(&current->sighand->siglock)
 {
 	struct signal_struct *sig = current->signal;
 
@@ -1907,7 +1927,7 @@ static int do_signal_stop(int signr)
 
 		if (!likely(current->jobctl & JOBCTL_STOP_DEQUEUED) ||
 		    unlikely(signal_group_exit(sig)))
-			return 0;
+			return false;
 		/*
 		 * There is no group stop already in progress.  We must
 		 * initiate one now.
@@ -1951,7 +1971,7 @@ static int do_signal_stop(int signr)
 			}
 		}
 	}
-retry:
+
 	if (likely(!task_ptrace(current))) {
 		int notify = 0;
 
@@ -1983,27 +2003,15 @@ retry:
 
 		/* Now we don't run again until woken by SIGCONT or SIGKILL */
 		schedule();
-
-		spin_lock_irq(&current->sighand->siglock);
+		return true;
 	} else {
-		ptrace_stop(current->jobctl & JOBCTL_STOP_SIGMASK,
-			    CLD_STOPPED, 0, NULL);
-		current->exit_code = 0;
-	}
-
-	/*
-	 * JOBCTL_STOP_PENDING could be set if another group stop has
-	 * started since being woken up or ptrace wants us to transit
-	 * between TASK_STOPPED and TRACED.  Retry group stop.
-	 */
-	if (current->jobctl & JOBCTL_STOP_PENDING) {
-		WARN_ON_ONCE(!(current->jobctl & JOBCTL_STOP_SIGMASK));
-		goto retry;
+		/*
+		 * While ptraced, group stop is handled by STOP trap.
+		 * Schedule it and let the caller deal with it.
+		 */
+		task_set_jobctl_pending(current, JOBCTL_TRAP_STOP);
+		return false;
 	}
-
-	spin_unlock_irq(&current->sighand->siglock);
-
-	return 1;
 }
 
 static int ptrace_signal(int signr, siginfo_t *info,
@@ -2110,6 +2118,19 @@ relock:
 		    do_signal_stop(0))
 			goto relock;
 
+		/*
+		 * Take care of ptrace jobctl traps.  It is currently used
+		 * only to trap for group stop while ptraced.
+		 */
+		if (unlikely(current->jobctl & JOBCTL_TRAP_MASK)) {
+			signr = current->jobctl & JOBCTL_STOP_SIGMASK;
+			WARN_ON_ONCE(!signr);
+			ptrace_stop(signr, CLD_STOPPED, 0, NULL);
+			current->exit_code = 0;
+			spin_unlock_irq(&sighand->siglock);
+			goto relock;
+		}
+
 		signr = dequeue_signal(current, &current->blocked, info);
 
 		if (!signr)
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 11/17] ptrace: implement PTRACE_SEIZE
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
                   ` (9 preceding siblings ...)
  2011-05-29 23:12 ` [PATCH 10/17] job control: introduce JOBCTL_TRAP_STOP and use it for group stop trap Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-06-01 19:01   ` Oleg Nesterov
  2011-06-02 11:43   ` [PATCH UPDATED " Tejun Heo
  2011-05-29 23:12 ` [PATCH 12/17] ptrace: implement PTRACE_INTERRUPT Tejun Heo
                   ` (6 subsequent siblings)
  17 siblings, 2 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

PTRACE_ATTACH implicitly issues SIGSTOP on attach which has side
effects on tracee signal and job control states.  This patch
implements a new ptrace request PTRACE_SEIZE which attaches and traps
tracee without affecting its signal and job control states.

The usage is the same with PTRACE_ATTACH but it takes PTRACE_SEIZE_*
flags in @data.  Currently, the only defined flag is
PTRACE_SEIZE_DEVEL which is a temporary flag to enable PTRACE_SEIZE.
PTRACE_SEIZE will change ptrace behaviors outside of attach itself.
The changes will be implemented gradually and the DEVEL flag is to
prevent programs which expect full SEIZE behavior from using it before
all the behavior modifications are complete while allowing unit
testing.  The flag will be removed once SEIZE behaviors are completely
implemented.

* PTRACE_SEIZE, unlike ATTACH, doesn't force tracee to trap.  After
  attaching tracee continues to run unless a trap condition occurs.

* PTRACE_SEIZE doesn't affect signal or group stop state.

* If PTRACE_SEIZE'd, group stop uses PTRACE_EVENT_STOP trap which uses
  exit_code of (SIGTRAP | PTRACE_EVENT_STOP << 8) instead of the
  stopping signal number and returns usual trap siginfo on
  PTRACE_GETSIGINFO instead of NULL.

Note that there currently is no way to find out the stopping signal
number while seized.  This will be improved by future patches.

Seizing sets PT_SEIZED in ->ptrace of the tracee.  This flag will be
used to determine whether new SEIZE behaviors should be enabled.

Test program follows.

  #define PTRACE_SEIZE		0x4206
  #define PTRACE_SEIZE_DEVEL	0x80000000

  static const struct timespec ts100ms = { .tv_nsec = 100000000 };
  static const struct timespec ts1s = { .tv_sec = 1 };
  static const struct timespec ts3s = { .tv_sec = 3 };

  int main(int argc, char **argv)
  {
	  pid_t tracee;

	  tracee = fork();
	  if (tracee == 0) {
		  nanosleep(&ts100ms, NULL);
		  while (1) {
			  printf("tracee: alive\n");
			  nanosleep(&ts1s, NULL);
		  }
	  }

	  if (argc > 1)
		  kill(tracee, SIGSTOP);

	  nanosleep(&ts100ms, NULL);

	  ptrace(PTRACE_SEIZE, tracee, NULL,
		 (void *)(unsigned long)PTRACE_SEIZE_DEVEL);
	  if (argc > 1) {
		  waitid(P_PID, tracee, NULL, WSTOPPED);
		  ptrace(PTRACE_CONT, tracee, NULL, NULL);
	  }
	  nanosleep(&ts3s, NULL);
	  printf("tracer: exiting\n");
	  return 0;
  }

When the above program is called w/o argument, tracee is seized while
running and remains running.  When tracer exits, tracee continues to
run and print out messages.

  # ./test-seize-simple
  tracee: alive
  tracee: alive
  tracee: alive
  tracer: exiting
  tracee: alive
  tracee: alive

When called with an argument, tracee is seized from stopped state and
continued, and returns to stopped state when tracer exits.

  # ./test-seize
  tracee: alive
  tracee: alive
  tracee: alive
  tracer: exiting
  # ps -el|grep test-seize
  1 T     0  4720     1  0  80   0 -   941 signal ttyS0    00:00:00 test-seize

-v2: SEIZE doesn't schedule TRAP_STOP and leaves tracee running as Jan
     suggested.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
---
 include/linux/ptrace.h |    7 +++++++
 kernel/ptrace.c        |   35 +++++++++++++++++++++++++++++------
 kernel/signal.c        |   32 ++++++++++++++++++++++++--------
 3 files changed, 60 insertions(+), 14 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index e93ef1a..67ad3f1 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -47,6 +47,11 @@
 #define PTRACE_GETREGSET	0x4204
 #define PTRACE_SETREGSET	0x4205
 
+#define PTRACE_SEIZE		0x4206
+
+/* flags in @data for PTRACE_SEIZE */
+#define PTRACE_SEIZE_DEVEL	0x80000000 /* temp flag for development */
+
 /* options set using PTRACE_SETOPTIONS */
 #define PTRACE_O_TRACESYSGOOD	0x00000001
 #define PTRACE_O_TRACEFORK	0x00000002
@@ -65,6 +70,7 @@
 #define PTRACE_EVENT_EXEC	4
 #define PTRACE_EVENT_VFORK_DONE	5
 #define PTRACE_EVENT_EXIT	6
+#define PTRACE_EVENT_STOP	7
 
 #include <asm/ptrace.h>
 
@@ -77,6 +83,7 @@
  * flags.  When the a task is stopped the ptracer owns task->ptrace.
  */
 
+#define PT_SEIZED	0x00010000	/* SEIZE used, enable new behavior */
 #define PT_PTRACED	0x00000001
 #define PT_DTRACE	0x00000002	/* delayed trace (used on m68k, i386) */
 #define PT_TRACESYSGOOD	0x00000004
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 2ac7260..07505aa 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -209,10 +209,28 @@ bool ptrace_may_access(struct task_struct *task, unsigned int mode)
 	return !err;
 }
 
-static int ptrace_attach(struct task_struct *task)
+static int ptrace_attach(struct task_struct *task, long request,
+			 unsigned long flags)
 {
+	bool seize = (request == PTRACE_SEIZE);
 	int retval;
 
+	/*
+	 * SEIZE will enable new ptrace behaviors which will be implemented
+	 * gradually.  SEIZE_DEVEL is used to prevent applications
+	 * expecting full SEIZE behaviors trapping on kernel commits which
+	 * are still in the process of implementing them.
+	 *
+	 * Only test programs for new ptrace behaviors being implemented
+	 * should set SEIZE_DEVEL.  If unset, SEIZE will fail with -EIO.
+	 *
+	 * Once SEIZE behaviors are completely implemented, this flag and
+	 * the following test will be removed.
+	 */
+	retval = -EIO;
+	if (seize && !(flags & PTRACE_SEIZE_DEVEL))
+		goto out;
+
 	audit_ptrace(task);
 
 	retval = -EPERM;
@@ -244,11 +262,16 @@ static int ptrace_attach(struct task_struct *task)
 		goto unlock_tasklist;
 
 	task->ptrace = PT_PTRACED;
+	if (seize)
+		task->ptrace |= PT_SEIZED;
 	if (task_ns_capable(task, CAP_SYS_PTRACE))
 		task->ptrace |= PT_PTRACE_CAP;
 
 	__ptrace_link(task, current);
-	send_sig_info(SIGSTOP, SEND_SIG_FORCED, task);
+
+	/* SEIZE doesn't trap tracee on attach */
+	if (!seize)
+		send_sig_info(SIGSTOP, SEND_SIG_FORCED, task);
 
 	spin_lock(&task->sighand->siglock);
 
@@ -784,8 +807,8 @@ SYSCALL_DEFINE4(ptrace, long, request, long, pid, unsigned long, addr,
 		goto out;
 	}
 
-	if (request == PTRACE_ATTACH) {
-		ret = ptrace_attach(child);
+	if (request == PTRACE_ATTACH || request == PTRACE_SEIZE) {
+		ret = ptrace_attach(child, request, data);
 		/*
 		 * Some architectures need to do book-keeping after
 		 * a ptrace attach.
@@ -926,8 +949,8 @@ asmlinkage long compat_sys_ptrace(compat_long_t request, compat_long_t pid,
 		goto out;
 	}
 
-	if (request == PTRACE_ATTACH) {
-		ret = ptrace_attach(child);
+	if (request == PTRACE_ATTACH || request == PTRACE_SEIZE) {
+		ret = ptrace_attach(child, request, data);
 		/*
 		 * Some architectures need to do book-keeping after
 		 * a ptrace attach.
diff --git a/kernel/signal.c b/kernel/signal.c
index 262bb6c..434243e 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1873,7 +1873,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	recalc_sigpending_tsk(current);
 }
 
-void ptrace_notify(int exit_code)
+static void ptrace_do_notify(int exit_code, int why)
 {
 	siginfo_t info;
 
@@ -1886,8 +1886,13 @@ void ptrace_notify(int exit_code)
 	info.si_uid = current_uid();
 
 	/* Let the debugger run.  */
+	ptrace_stop(exit_code, why, 1, &info);
+}
+
+void ptrace_notify(int exit_code)
+{
 	spin_lock_irq(&current->sighand->siglock);
-	ptrace_stop(exit_code, CLD_TRAPPED, 1, &info);
+	ptrace_do_notify(exit_code, CLD_TRAPPED);
 	spin_unlock_irq(&current->sighand->siglock);
 }
 
@@ -2119,14 +2124,25 @@ relock:
 			goto relock;
 
 		/*
-		 * Take care of ptrace jobctl traps.  It is currently used
-		 * only to trap for group stop while ptraced.
+		 * Take care of ptrace jobctl traps.
+		 *
+		 * When PT_SEIZED, it's used for both group stop and
+		 * explicit SEIZE/INTERRUPT traps.  Both generate
+		 * PTRACE_EVENT_STOP trap with accompanying siginfo.
+		 *
+		 * When !PT_SEIZED, it's used only for group stop trap with
+		 * stop signal number as exit_code and no siginfo.
 		 */
 		if (unlikely(current->jobctl & JOBCTL_TRAP_MASK)) {
-			signr = current->jobctl & JOBCTL_STOP_SIGMASK;
-			WARN_ON_ONCE(!signr);
-			ptrace_stop(signr, CLD_STOPPED, 0, NULL);
-			current->exit_code = 0;
+			if (current->ptrace & PT_SEIZED) {
+				ptrace_do_notify(SIGTRAP | PTRACE_EVENT_STOP<<8,
+						 CLD_STOPPED);
+			} else {
+				signr = current->jobctl & JOBCTL_STOP_SIGMASK;
+				WARN_ON_ONCE(!signr);
+				ptrace_stop(signr, CLD_STOPPED, 0, NULL);
+				current->exit_code = 0;
+			}
 			spin_unlock_irq(&sighand->siglock);
 			goto relock;
 		}
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 12/17] ptrace: implement PTRACE_INTERRUPT
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
                   ` (10 preceding siblings ...)
  2011-05-29 23:12 ` [PATCH 11/17] ptrace: implement PTRACE_SEIZE Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-05-29 23:12 ` [PATCH 13/17] ptrace: add siginfo.si_pt_flags Tejun Heo
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

Currently, there's no way to trap a running ptracee short of sending a
signal which has various side effects.  This patch implements
PTRACE_INTERRUPT which traps ptracee without any signal or job control
related side effect.

The implementation is almost trivial.  It uses the group stop trap -
SIGTRAP | PTRACE_EVENT_STOP << 8.  A new trap flag
JOBCTL_TRAP_INTERRUPT is added, which is set on PTRACE_INTERRUPT and
cleared when any trap happens.  As INTERRUPT should be useable
regardless of the current state of tracee, task_is_traced() test in
ptrace_check_attach() is skipped for INTERRUPT.

PTRACE_INTERRUPT is available iff tracee is attached with
PTRACE_SEIZE.

Test program follows.

  #define PTRACE_SEIZE		0x4206
  #define PTRACE_INTERRUPT	0x4207

  #define PTRACE_SEIZE_DEVEL	0x80000000

  static const struct timespec ts100ms = { .tv_nsec = 100000000 };
  static const struct timespec ts1s = { .tv_sec = 1 };
  static const struct timespec ts3s = { .tv_sec = 3 };

  int main(int argc, char **argv)
  {
	  pid_t tracee;

	  tracee = fork();
	  if (tracee == 0) {
		  nanosleep(&ts100ms, NULL);
		  while (1) {
			  printf("tracee: alive pid=%d\n", getpid());
			  nanosleep(&ts1s, NULL);
		  }
	  }

	  if (argc > 1)
		  kill(tracee, SIGSTOP);

	  nanosleep(&ts100ms, NULL);

	  ptrace(PTRACE_SEIZE, tracee, NULL,
		 (void *)(unsigned long)PTRACE_SEIZE_DEVEL);
	  if (argc > 1) {
		  waitid(P_PID, tracee, NULL, WSTOPPED);
		  ptrace(PTRACE_CONT, tracee, NULL, NULL);
	  }
	  nanosleep(&ts3s, NULL);

	  printf("tracer: INTERRUPT and DETACH\n");
	  ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL);
	  waitid(P_PID, tracee, NULL, WSTOPPED);
	  ptrace(PTRACE_DETACH, tracee, NULL, NULL);
	  nanosleep(&ts3s, NULL);

	  printf("tracer: exiting\n");
	  kill(tracee, SIGKILL);
	  return 0;
  }

When called without argument, tracee is seized from running state,
interrupted and then detached back to running state.

  # ./test-interrupt
  tracee: alive pid=4546
  tracee: alive pid=4546
  tracee: alive pid=4546
  tracer: INTERRUPT and DETACH
  tracee: alive pid=4546
  tracee: alive pid=4546
  tracee: alive pid=4546
  tracer: exiting

When called with argument, tracee is seized from stopped state,
continued, interrupted and then detached back to stopped state.

  # ./test-interrupt  1
  tracee: alive pid=4548
  tracee: alive pid=4548
  tracee: alive pid=4548
  tracer: INTERRUPT and DETACH
  tracer: exiting

Before PTRACE_INTERRUPT, once the tracee was running, there was no way
to trap tracee and do PTRACE_DETACH without causing side effect.

-v2: Updated to use task_set_jobctl_pending() so that it doesn't end
     up scheduling TRAP_STOP if child is dying which may make the
     child unkillable.  Spotted by Oleg.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/ptrace.h |    1 +
 kernel/ptrace.c        |   29 +++++++++++++++++++++++++++--
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 67ad3f1..ad754d1 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -48,6 +48,7 @@
 #define PTRACE_SETREGSET	0x4205
 
 #define PTRACE_SEIZE		0x4206
+#define PTRACE_INTERRUPT	0x4207
 
 /* flags in @data for PTRACE_SEIZE */
 #define PTRACE_SEIZE_DEVEL	0x80000000 /* temp flag for development */
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 07505aa..d5e38ac 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -657,10 +657,12 @@ static int ptrace_regset(struct task_struct *task, int req, unsigned int type,
 int ptrace_request(struct task_struct *child, long request,
 		   unsigned long addr, unsigned long data)
 {
+	bool seized = child->ptrace & PT_SEIZED;
 	int ret = -EIO;
 	siginfo_t siginfo;
 	void __user *datavp = (void __user *) data;
 	unsigned long __user *datalp = datavp;
+	unsigned long flags;
 
 	switch (request) {
 	case PTRACE_PEEKTEXT:
@@ -693,6 +695,27 @@ int ptrace_request(struct task_struct *child, long request,
 			ret = ptrace_setsiginfo(child, &siginfo);
 		break;
 
+	case PTRACE_INTERRUPT:
+		/*
+		 * Stop tracee without any side-effect on signal or job
+		 * control.  At least one trap is guaranteed to happen
+		 * after this request.  If @child is already trapped, the
+		 * current trap is not disturbed and another trap will
+		 * happen after the current trap is ended with PTRACE_CONT.
+		 *
+		 * The actual trap might not be PTRACE_EVENT_STOP trap but
+		 * the pending condition is cleared regardless.
+		 */
+		if (unlikely(!seized || !lock_task_sighand(child, &flags)))
+			break;
+
+		if (likely(task_set_jobctl_pending(child, JOBCTL_TRAP_STOP)))
+			signal_wake_up(child, 0);
+
+		unlock_task_sighand(child, &flags);
+		ret = 0;
+		break;
+
 	case PTRACE_DETACH:	 /* detach a process that was attached. */
 		ret = ptrace_detach(child, data);
 		break;
@@ -818,7 +841,8 @@ SYSCALL_DEFINE4(ptrace, long, request, long, pid, unsigned long, addr,
 		goto out_put_task_struct;
 	}
 
-	ret = ptrace_check_attach(child, request == PTRACE_KILL);
+	ret = ptrace_check_attach(child, request == PTRACE_KILL ||
+				  request == PTRACE_INTERRUPT);
 	if (ret < 0)
 		goto out_put_task_struct;
 
@@ -960,7 +984,8 @@ asmlinkage long compat_sys_ptrace(compat_long_t request, compat_long_t pid,
 		goto out_put_task_struct;
 	}
 
-	ret = ptrace_check_attach(child, request == PTRACE_KILL);
+	ret = ptrace_check_attach(child, request == PTRACE_KILL ||
+				  request == PTRACE_INTERRUPT);
 	if (!ret)
 		ret = compat_arch_ptrace(child, request, addr, data);
 
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 13/17] ptrace: add siginfo.si_pt_flags
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
                   ` (11 preceding siblings ...)
  2011-05-29 23:12 ` [PATCH 12/17] ptrace: implement PTRACE_INTERRUPT Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-05-29 23:12 ` [PATCH 14/17] ptrace: make group stop state visible via PTRACE_GETSIGINFO Tejun Heo
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo, Tony Luck, Fenghua Yu, Ralf Baechle,
	Kyle McMartin, Helge Deller, James E.J. Bottomley,
	Benjamin Herrenschmidt, Paul Mackerras, Martin Schwidefsky,
	Heiko Carstens, David S. Miller, x86

This essentially is a simple addition of a flag field but seems
complicated thanks to siginfo_t convolution.  _sigtrap struct, which
contains all the fields used by ptrace_notify[_locked]() and the new
_pt_flags, is added to siginfo._sifields union along with the field
abbreviation macro si_pt_flags; then, __SI_TRAP is defined to
implement copying of the new field to userland.

Two architectures - ia64 and mips - define their own versions of
siginfo_t and ia64 implements its own copy_siginfo_to_user().  Also,
x86, mips, parisc, powerpc, s390, sparc and tile have compat_siginfo_t
and copy_siginfo_to_user32() for 32bit compatibility.  All are updated
such that [compat_]siginfo_t have _sigtrap and all the fields are
copied out.

x86 is tested.  Affected code in mips, powerpc, s390 and sparc are
compile tested.  mips and tile are untested.

This patch doesn't actually make use of the new field.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: x86@kernel.org
---
 arch/ia64/include/asm/siginfo.h       |    7 +++++++
 arch/ia64/kernel/signal.c             |    5 +++++
 arch/mips/include/asm/compat-signal.h |    7 +++++++
 arch/mips/include/asm/siginfo.h       |    7 +++++++
 arch/mips/kernel/signal32.c           |    5 +++++
 arch/parisc/kernel/signal32.c         |    5 +++++
 arch/parisc/kernel/signal32.h         |    7 +++++++
 arch/powerpc/kernel/ppc32.h           |    7 +++++++
 arch/powerpc/kernel/signal_32.c       |    5 +++++
 arch/s390/kernel/compat_linux.h       |    7 +++++++
 arch/s390/kernel/compat_signal.c      |    5 +++++
 arch/sparc/kernel/signal32.c          |   12 ++++++++++++
 arch/tile/kernel/compat_signal.c      |   11 +++++++++++
 arch/x86/ia32/ia32_signal.c           |    4 ++++
 arch/x86/include/asm/ia32.h           |    7 +++++++
 include/asm-generic/siginfo.h         |   10 ++++++++++
 kernel/signal.c                       |    7 ++++++-
 17 files changed, 117 insertions(+), 1 deletions(-)

diff --git a/arch/ia64/include/asm/siginfo.h b/arch/ia64/include/asm/siginfo.h
index c8fcaa2..2cff1ce 100644
--- a/arch/ia64/include/asm/siginfo.h
+++ b/arch/ia64/include/asm/siginfo.h
@@ -70,6 +70,13 @@ typedef struct siginfo {
 			long _band;	/* POLL_IN, POLL_OUT, POLL_MSG (XPG requires a "long") */
 			int _fd;
 		} _sigpoll;
+
+		/* SIGTRAP */
+		struct {
+			pid_t _pid;		/* sender's pid */
+			uid_t _uid;		/* sender's uid */
+			unsigned int _pt_flags;
+		} _sigtrap;
 	} _sifields;
 } siginfo_t;
 
diff --git a/arch/ia64/kernel/signal.c b/arch/ia64/kernel/signal.c
index 7bdafc8..ee18366 100644
--- a/arch/ia64/kernel/signal.c
+++ b/arch/ia64/kernel/signal.c
@@ -142,6 +142,11 @@ copy_siginfo_to_user (siginfo_t __user *to, siginfo_t *from)
 			err |= __put_user(from->si_addr, &to->si_addr);
 			err |= __put_user(from->si_imm, &to->si_imm);
 			break;
+		      case __SI_TRAP >> 16:
+			err |= __put_user(from->si_uid, &to->si_uid);
+			err |= __put_user(from->si_pid, &to->si_pid);
+			err |= __put_user(from->si_pt_flags, &to->si_pt_flags);
+			break;
 		      case __SI_TIMER >> 16:
 			err |= __put_user(from->si_tid, &to->si_tid);
 			err |= __put_user(from->si_overrun, &to->si_overrun);
diff --git a/arch/mips/include/asm/compat-signal.h b/arch/mips/include/asm/compat-signal.h
index 368a99e..47b2e4f 100644
--- a/arch/mips/include/asm/compat-signal.h
+++ b/arch/mips/include/asm/compat-signal.h
@@ -54,6 +54,13 @@ typedef struct compat_siginfo {
 			int _fd;
 		} _sigpoll;
 
+		/* SIGTRAP */
+		struct {
+			compat_pid_t _pid;	/* sender's pid */
+			compat_uid_t _uid;	/* sender's uid */
+			unsigned int _pt_flags;
+		} _sigtrap;
+
 		/* POSIX.1b timers */
 		struct {
 			timer_t _tid;		/* timer id */
diff --git a/arch/mips/include/asm/siginfo.h b/arch/mips/include/asm/siginfo.h
index 20ebeb8..6e8f0d6 100644
--- a/arch/mips/include/asm/siginfo.h
+++ b/arch/mips/include/asm/siginfo.h
@@ -96,6 +96,13 @@ typedef struct siginfo {
 			__ARCH_SI_BAND_T _band;	/* POLL_IN, POLL_OUT, POLL_MSG */
 			int _fd;
 		} _sigpoll;
+
+		/* SIGTRAP */
+		struct {
+			pid_t _pid;		/* sender's pid */
+			__ARCH_SI_UID_T _uid;	/* sender's uid */
+			unsigned int _pt_flags;
+		} _sigtrap;
 	} _sifields;
 } siginfo_t;
 
diff --git a/arch/mips/kernel/signal32.c b/arch/mips/kernel/signal32.c
index aae9866..7e392e1 100644
--- a/arch/mips/kernel/signal32.c
+++ b/arch/mips/kernel/signal32.c
@@ -452,6 +452,11 @@ int copy_siginfo_to_user32(compat_siginfo_t __user *to, siginfo_t *from)
 			err |= __put_user(from->si_band, &to->si_band);
 			err |= __put_user(from->si_fd, &to->si_fd);
 			break;
+		case __SI_TRAP >> 16:
+			err |= __put_user(from->si_pid, &to->si_pid);
+			err |= __put_user(from->si_uid, &to->si_uid);
+			err |= __put_user(from->si_pt_flags, &to->si_pt_flags);
+			break;
 		case __SI_RT >> 16: /* This is not generated by the kernel as of now.  */
 		case __SI_MESGQ >> 16:
 			err |= __put_user(from->si_pid, &to->si_pid);
diff --git a/arch/parisc/kernel/signal32.c b/arch/parisc/kernel/signal32.c
index e141324..ead8ca4 100644
--- a/arch/parisc/kernel/signal32.c
+++ b/arch/parisc/kernel/signal32.c
@@ -482,6 +482,11 @@ copy_siginfo_to_user32 (compat_siginfo_t __user *to, siginfo_t *from)
 			err |= __put_user(from->si_band, &to->si_band);
 			err |= __put_user(from->si_fd, &to->si_fd);
 			break;
+		case __SI_TRAP >> 16:
+			err |= __put_user(from->si_pid, &to->si_pid);
+			err |= __put_user(from->si_uid, &to->si_uid);
+			err |= __put_user(from->si_pt_flags, &to->si_pt_flags);
+			break;
 		case __SI_TIMER >> 16:
 			err |= __put_user(from->si_tid, &to->si_tid);
 			err |= __put_user(from->si_overrun, &to->si_overrun);
diff --git a/arch/parisc/kernel/signal32.h b/arch/parisc/kernel/signal32.h
index c780084..8016f51 100644
--- a/arch/parisc/kernel/signal32.h
+++ b/arch/parisc/kernel/signal32.h
@@ -104,6 +104,13 @@ typedef struct compat_siginfo {
                         int _band;      /* POLL_IN, POLL_OUT, POLL_MSG */
                         int _fd;
                 } _sigpoll;
+
+		/* SIGTRAP */
+		struct {
+			unsigned int _pid;      /* sender's pid */
+			unsigned int _uid;      /* sender's uid */
+			unsigned int _pt_flags;
+		} _sigtrap;
         } _sifields;
 } compat_siginfo_t;
 
diff --git a/arch/powerpc/kernel/ppc32.h b/arch/powerpc/kernel/ppc32.h
index dc16aef..4293542 100644
--- a/arch/powerpc/kernel/ppc32.h
+++ b/arch/powerpc/kernel/ppc32.h
@@ -64,6 +64,13 @@ typedef struct compat_siginfo {
 			int _band;	/* POLL_IN, POLL_OUT, POLL_MSG */
 			int _fd;
 		} _sigpoll;
+
+		/* SIGTRAP */
+		struct {
+			compat_pid_t _pid;		/* sender's pid */
+			compat_uid_t _uid;		/* sender's uid */
+			unsigned int _pt_flags;
+		} _sigtrap;
 	} _sifields;
 } compat_siginfo_t;
 
diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index b96a3a0..d072458 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -716,6 +716,11 @@ int copy_siginfo_to_user32(struct compat_siginfo __user *d, siginfo_t *s)
 		err |= __put_user(s->si_band, &d->si_band);
 		err |= __put_user(s->si_fd, &d->si_fd);
 		break;
+	case __SI_TRAP:
+		err |= __put_user(s->si_pid, &d->si_pid);
+		err |= __put_user(s->si_uid, &d->si_uid);
+		err |= __put_user(s->si_pt_flags, &d->si_pt_flags);
+		break;
 	case __SI_TIMER >> 16:
 		err |= __put_user(s->si_tid, &d->si_tid);
 		err |= __put_user(s->si_overrun, &d->si_overrun);
diff --git a/arch/s390/kernel/compat_linux.h b/arch/s390/kernel/compat_linux.h
index 9635d75..f8c973f 100644
--- a/arch/s390/kernel/compat_linux.h
+++ b/arch/s390/kernel/compat_linux.h
@@ -72,6 +72,13 @@ typedef struct compat_siginfo {
 			int	_band;	/* POLL_IN, POLL_OUT, POLL_MSG */
 			int	_fd;
 		} _sigpoll;
+
+		/* SIGTRAP */
+		struct {
+			pid_t		_pid;	/* sender's pid */
+			uid_t		_uid;	/* sender's uid */
+			unsigned int _pt_flags;
+		} _sigtrap;
 	} _sifields;
 } compat_siginfo_t;
 
diff --git a/arch/s390/kernel/compat_signal.c b/arch/s390/kernel/compat_signal.c
index eee9998..b3c9f6b 100644
--- a/arch/s390/kernel/compat_signal.c
+++ b/arch/s390/kernel/compat_signal.c
@@ -96,6 +96,11 @@ int copy_siginfo_to_user32(compat_siginfo_t __user *to, siginfo_t *from)
 			err |= __put_user(from->si_band, &to->si_band);
 			err |= __put_user(from->si_fd, &to->si_fd);
 			break;
+		case __SI_TRAP >> 16:
+			err |= __put_user(from->si_pid, &to->si_pid);
+			err |= __put_user(from->si_uid, &to->si_uid);
+			err |= __put_user(from->si_pt_flags, &to->si_pt_flags);
+			break;
 		case __SI_TIMER >> 16:
 			err |= __put_user(from->si_tid, &to->si_tid);
 			err |= __put_user(from->si_overrun, &to->si_overrun);
diff --git a/arch/sparc/kernel/signal32.c b/arch/sparc/kernel/signal32.c
index 75fad42..c545212 100644
--- a/arch/sparc/kernel/signal32.c
+++ b/arch/sparc/kernel/signal32.c
@@ -102,6 +102,13 @@ typedef struct compat_siginfo{
 			int _band;	/* POLL_IN, POLL_OUT, POLL_MSG */
 			int _fd;
 		} _sigpoll;
+
+		/* SIGTRAP */
+		struct {
+			compat_pid_t _pid;		/* sender's pid */
+			unsigned int _uid;		/* sender's uid */
+			unsigned int _pt_flags;
+		} _sigtrap;
 	} _sifields;
 }compat_siginfo_t;
 
@@ -165,6 +172,11 @@ int copy_siginfo_to_user32(compat_siginfo_t __user *to, siginfo_t *from)
 			err |= __put_user(from->si_band, &to->si_band);
 			err |= __put_user(from->si_fd, &to->si_fd);
 			break;
+		case __SI_TRAP >> 16:
+			err |= __put_user(from->si_pid, &to->si_pid);
+			err |= __put_user(from->si_uid, &to->si_uid);
+			err |= __put_user(from->si_pt_flags, &to->si_pt_flags);
+			break;
 		case __SI_RT >> 16: /* This is not generated by the kernel as of now.  */
 		case __SI_MESGQ >> 16:
 			err |= __put_user(from->si_pid, &to->si_pid);
diff --git a/arch/tile/kernel/compat_signal.c b/arch/tile/kernel/compat_signal.c
index a7869ad..6a45c4f 100644
--- a/arch/tile/kernel/compat_signal.c
+++ b/arch/tile/kernel/compat_signal.c
@@ -109,6 +109,13 @@ struct compat_siginfo {
 			int _band;	/* POLL_IN, POLL_OUT, POLL_MSG */
 			int _fd;
 		} _sigpoll;
+
+		/* SIGTRAP */
+		struct {
+			unsigned int _pid;	/* sender's pid */
+			unsigned int _uid;	/* sender's uid */
+			unsigned int _pt_flags;
+		} _sigtrap;
 	} _sifields;
 };
 
@@ -219,6 +226,10 @@ int copy_siginfo_to_user32(struct compat_siginfo __user *to, siginfo_t *from)
 		case __SI_POLL >> 16:
 			err |= __put_user(from->si_fd, &to->si_fd);
 			break;
+		case __SI_TRAP >> 16:
+			err |= __put_user(from->si_uid, &to->si_uid);
+			err |= __put_user(from->si_pt_flags, &to->si_pt_flags);
+			break;
 		case __SI_TIMER >> 16:
 			err |= __put_user(from->si_overrun, &to->si_overrun);
 			err |= __put_user(ptr_to_compat(from->si_ptr),
diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 588a7aa..1df88cc 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -85,6 +85,10 @@ int copy_siginfo_to_user32(compat_siginfo_t __user *to, siginfo_t *from)
 			case __SI_POLL >> 16:
 				put_user_ex(from->si_fd, &to->si_fd);
 				break;
+			case __SI_TRAP:
+				put_user_ex(from->si_uid, &to->si_uid);
+				put_user_ex(from->si_pt_flags, &to->si_pt_flags);
+				break;
 			case __SI_TIMER >> 16:
 				put_user_ex(from->si_overrun, &to->si_overrun);
 				put_user_ex(ptr_to_compat(from->si_ptr),
diff --git a/arch/x86/include/asm/ia32.h b/arch/x86/include/asm/ia32.h
index 1f7e625..7eab27a 100644
--- a/arch/x86/include/asm/ia32.h
+++ b/arch/x86/include/asm/ia32.h
@@ -126,6 +126,13 @@ typedef struct compat_siginfo {
 			int _band;	/* POLL_IN, POLL_OUT, POLL_MSG */
 			int _fd;
 		} _sigpoll;
+
+		/* SIGTRAP */
+		struct {
+			unsigned int _pid;	/* sender's pid */
+			unsigned int _uid;	/* sender's uid */
+			unsigned int _pt_flags;
+		} _sigtrap;
 	} _sifields;
 } compat_siginfo_t;
 
diff --git a/include/asm-generic/siginfo.h b/include/asm-generic/siginfo.h
index 0dd4e87..9ecabdf 100644
--- a/include/asm-generic/siginfo.h
+++ b/include/asm-generic/siginfo.h
@@ -90,6 +90,13 @@ typedef struct siginfo {
 			__ARCH_SI_BAND_T _band;	/* POLL_IN, POLL_OUT, POLL_MSG */
 			int _fd;
 		} _sigpoll;
+
+		/* SIGTRAP */
+		struct {
+			__kernel_pid_t _pid;	/* sender's pid */
+			__ARCH_SI_UID_T _uid;	/* sender's uid */
+			unsigned int _pt_flags;
+		} _sigtrap;
 	} _sifields;
 } siginfo_t;
 
@@ -116,6 +123,7 @@ typedef struct siginfo {
 #define si_addr_lsb	_sifields._sigfault._addr_lsb
 #define si_band		_sifields._sigpoll._band
 #define si_fd		_sifields._sigpoll._fd
+#define si_pt_flags	_sifields._sigtrap._pt_flags
 
 #ifdef __KERNEL__
 #define __SI_MASK	0xffff0000u
@@ -126,6 +134,7 @@ typedef struct siginfo {
 #define __SI_CHLD	(4 << 16)
 #define __SI_RT		(5 << 16)
 #define __SI_MESGQ	(6 << 16)
+#define __SI_TRAP	(7 << 16)
 #define __SI_CODE(T,N)	((T) | ((N) & 0xffff))
 #else
 #define __SI_KILL	0
@@ -135,6 +144,7 @@ typedef struct siginfo {
 #define __SI_CHLD	0
 #define __SI_RT		0
 #define __SI_MESGQ	0
+#define __SI_TRAP	0
 #define __SI_CODE(T,N)	(N)
 #endif
 
diff --git a/kernel/signal.c b/kernel/signal.c
index 434243e..3f131be 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1881,7 +1881,7 @@ static void ptrace_do_notify(int exit_code, int why)
 
 	memset(&info, 0, sizeof info);
 	info.si_signo = SIGTRAP;
-	info.si_code = exit_code;
+	info.si_code = __SI_TRAP | exit_code;
 	info.si_pid = task_pid_vnr(current);
 	info.si_uid = current_uid();
 
@@ -2534,6 +2534,11 @@ int copy_siginfo_to_user(siginfo_t __user *to, siginfo_t *from)
 		err |= __put_user(from->si_band, &to->si_band);
 		err |= __put_user(from->si_fd, &to->si_fd);
 		break;
+	case __SI_TRAP:
+		err |= __put_user(from->si_pid, &to->si_pid);
+		err |= __put_user(from->si_uid, &to->si_uid);
+		err |= __put_user(from->si_pt_flags, &to->si_pt_flags);
+		break;
 	case __SI_FAULT:
 		err |= __put_user(from->si_addr, &to->si_addr);
 #ifdef __ARCH_SI_TRAPNO
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 14/17] ptrace: make group stop state visible via PTRACE_GETSIGINFO
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
                   ` (12 preceding siblings ...)
  2011-05-29 23:12 ` [PATCH 13/17] ptrace: add siginfo.si_pt_flags Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-05-29 23:12 ` [PATCH 15/17] ptrace: don't let PTRACE_SETSIGINFO override __SI_TRAP siginfo Tejun Heo
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

If tracee is SEIZED, PTRACE_EVENT_STOP trap is used for group stop;
however, there currently is no way to find out which signal initiated
the group stop or if the group stop is still in effect.

This patch updates ptrace_do_notify() such that it reports group stop
information via si.si_signo and si.si_pt_flags - both are stored in
siginfo on trap.  Note that it's only available if tracee was seized.

This doesn't address notification and tracer has to repeatedly put the
tracee into trap and poll the flag.  Later patches will deal with
notification and trap transition.

Test program follows.

  #define PTRACE_SEIZE		0x4206
  #define PTRACE_INTERRUPT	0x4207

  #define PTRACE_SEIZE_DEVEL	0x80000000

  static const struct timespec ts1s = { .tv_sec = 1 };

  int main(int argc, char **argv)
  {
	  pid_t tracee, tracer;
	  int i;

	  tracee = fork();
	  if (!tracee)
		  while (1)
			  nanosleep(&ts1s, NULL);

	  tracer = fork();
	  if (!tracer) {
		  int last_stopped = 0, stopped;
		  siginfo_t si;

		  ptrace(PTRACE_SEIZE, tracee, NULL,
			 (void *)(unsigned long)PTRACE_SEIZE_DEVEL);
	  repeat:
		  ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL);
		  waitid(P_PID, tracee, NULL, WSTOPPED);

		  ptrace(PTRACE_GETSIGINFO, tracee, NULL, &si);
		  if (si.si_code) {
			  stopped = !!si.si_status;
			  if (stopped != last_stopped)
				  printf("tracer: stopped=%d signo=%d\n",
					 stopped, si.si_signo);
			  last_stopped = stopped;
			  ptrace(PTRACE_CONT, tracee, NULL, NULL);
		  } else {
			  printf("tracer: SIG %d\n", si.si_signo);
			  ptrace(PTRACE_CONT, tracee, NULL,
				 (void *)(unsigned long)si.si_signo);
		  }
		  goto repeat;
	  }

	  for (i = 0; i < 3; i++) {
		  nanosleep(&ts1s, NULL);
		  printf("mother: SIGSTOP\n");
		  kill(tracee, SIGSTOP);
		  nanosleep(&ts1s, NULL);
		  printf("mother: SIGCONT\n");
		  kill(tracee, SIGCONT);
	  }
	  nanosleep(&ts1s, NULL);

	  kill(tracer, SIGKILL);
	  kill(tracee, SIGKILL);
	  return 0;
  }

Tracer delivers signal, resumes group stop, induces INTERRUPT traps
and reports group stop state change in busy loop.  Mother sends
SIGSTOP or CONT to tracee on each second.  Note that si_pt_flags and
flag testing are replaced with si_status testing.  si_status occupies
the same offset as si_pt_flags and PTRACE_SI_STOPPED is the only flag
defined, so I took a dirty short cut.

  # ./test-stopped
  mother: SIGSTOP
  tracer: SIG 19
  tracer: stopped=1 signo=19
  mother: SIGCONT
  tracer: stopped=0 signo=5
  tracer: SIG 18
  mother: SIGSTOP
  tracer: SIG 19
  tracer: stopped=1 signo=19
  mother: SIGCONT
  tracer: stopped=0 signo=5
  tracer: SIG 18
  mother: SIGSTOP
  tracer: SIG 19
  tracer: stopped=1 signo=19
  mother: SIGCONT
  tracer: SIG 18
  tracer: stopped=0 signo=5

-v2: Local variable sig defined inside if() block it's used in as
     suggested by Oleg.

-v3: siginfo is filled in on trap rather than on PTRACE_GETSIGINFO as
     suggested by Oleg.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/ptrace.h |    3 +++
 kernel/signal.c        |    8 ++++++++
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index ad754d1..f16a63f 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -73,6 +73,9 @@
 #define PTRACE_EVENT_EXIT	6
 #define PTRACE_EVENT_STOP	7
 
+/* flags in siginfo.si_pt_flags from PTRACE_GETSIGINFO */
+#define PTRACE_SI_STOPPED	0x00000001 /* tracee is job control stopped */
+
 #include <asm/ptrace.h>
 
 #ifdef __KERNEL__
diff --git a/kernel/signal.c b/kernel/signal.c
index 3f131be..c6ebdfe 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1875,6 +1875,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 
 static void ptrace_do_notify(int exit_code, int why)
 {
+	struct signal_struct *sig = current->signal;
 	siginfo_t info;
 
 	BUG_ON((exit_code & (0x7f | ~0xffff)) != SIGTRAP);
@@ -1885,6 +1886,13 @@ static void ptrace_do_notify(int exit_code, int why)
 	info.si_pid = task_pid_vnr(current);
 	info.si_uid = current_uid();
 
+	if ((current->ptrace & PT_SEIZED) &&
+	    (sig->group_stop_count || sig->flags & SIGNAL_STOP_STOPPED)) {
+		info.si_pt_flags |= PTRACE_SI_STOPPED;
+		info.si_signo = current->jobctl & JOBCTL_STOP_SIGMASK;
+		WARN_ON_ONCE(!info.si_signo);
+	}
+
 	/* Let the debugger run.  */
 	ptrace_stop(exit_code, why, 1, &info);
 }
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 15/17] ptrace: don't let PTRACE_SETSIGINFO override __SI_TRAP siginfo
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
                   ` (13 preceding siblings ...)
  2011-05-29 23:12 ` [PATCH 14/17] ptrace: make group stop state visible via PTRACE_GETSIGINFO Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-05-29 23:12 ` [PATCH 16/17] ptrace: implement TRAP_NOTIFY and use it for group stop events Tejun Heo
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

__SI_TRAP siginfo is special in the operation of ptrace.  It reports
group stop related information and will also interact with
notification retraps.  Don't let userland mess with it.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/ptrace.c |   31 ++++++++++++++++++++++---------
 1 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index d5e38ac..f1efe07 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -545,16 +545,29 @@ static int ptrace_getsiginfo(struct task_struct *child, siginfo_t *info)
 static int ptrace_setsiginfo(struct task_struct *child, const siginfo_t *info)
 {
 	unsigned long flags;
-	int error = -ESRCH;
+	int error;
 
-	if (lock_task_sighand(child, &flags)) {
-		error = -EINVAL;
-		if (likely(child->last_siginfo != NULL)) {
-			*child->last_siginfo = *info;
-			error = 0;
-		}
-		unlock_task_sighand(child, &flags);
-	}
+	if (!lock_task_sighand(child, &flags))
+		return -ESRCH;
+
+	error = -EINVAL;
+	if (unlikely(!child->last_siginfo))
+		goto out_unlock;
+
+	/*
+	 * If seized, __SI_TRAP siginfo is used to communicate information
+	 * regarding traps and contains dynamic information generated on
+	 * GETSIGINFO.  Don't let userland override or fake it.
+	 */
+	if ((child->ptrace & PT_SEIZED) &&
+	    unlikely((child->last_siginfo->si_code & __SI_MASK) == __SI_TRAP ||
+		     (info->si_code & __SI_MASK) == __SI_TRAP))
+		goto out_unlock;
+
+	*child->last_siginfo = *info;
+	error = 0;
+out_unlock:
+	unlock_task_sighand(child, &flags);
 	return error;
 }
 
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 16/17] ptrace: implement TRAP_NOTIFY and use it for group stop events
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
                   ` (14 preceding siblings ...)
  2011-05-29 23:12 ` [PATCH 15/17] ptrace: don't let PTRACE_SETSIGINFO override __SI_TRAP siginfo Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-05-29 23:12 ` [PATCH 17/17] ptrace: implement PTRACE_LISTEN Tejun Heo
  2011-05-30 15:42 ` [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Oleg Nesterov
  17 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

Currently there's no way for ptracer to find out whether group stop
finished other than polling with INTERRUPT - GETSIGINFO - CONT
sequence This patch implements group stop notification for ptracer
using STOP traps.

When group stop state of a seized tracee changes, JOBCTL_TRAP_NOTIFY
is set, which schedules a STOP trap which is sticky - it isn't cleared
by other traps and at least one STOP trap will happen eventually.
STOP trap is synchronization point for event notification and ptracer
is required to fetch group stop information with PTRACE_GETSIGINFO on
each STOP trap if it intends to keep track of group stop changes.

Notifications are generated both on start and end of group stops but,
because group stop participation always happens before STOP trap, this
doesn't cause an extra trap while tracee is participating in group
stop.  The symmetry will be useful later.

Note that this notification works iff tracee is not trapped.
Currently there is no way to be notified of group stop state changes
while tracee is trapped.  This will be addressed by a later patch.

An example program follows.

  #define PTRACE_SEIZE		0x4206
  #define PTRACE_INTERRUPT	0x4207

  #define PTRACE_SEIZE_DEVEL	0x80000000

  static const struct timespec ts1s = { .tv_sec = 1 };

  int main(int argc, char **argv)
  {
	  pid_t tracee, tracer;
	  int i;

	  tracee = fork();
	  if (!tracee)
		  while (1)
			  pause();

	  tracer = fork();
	  if (!tracer) {
		  int stopped;
		  siginfo_t si;

		  ptrace(PTRACE_SEIZE, tracee, NULL,
			 (void *)(unsigned long)PTRACE_SEIZE_DEVEL);
		  ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL);
	  repeat:
		  waitid(P_PID, tracee, NULL, WSTOPPED);

		  ptrace(PTRACE_GETSIGINFO, tracee, NULL, &si);
		  if (!si.si_code) {
			  printf("tracer: SIG %d\n", si.si_signo);
			  ptrace(PTRACE_CONT, tracee, NULL,
				 (void *)(unsigned long)si.si_signo);
			  goto repeat;
		  }
		  stopped = !!si.si_status;
		  printf("tracer: stopped=%d signo=%d\n", stopped, si.si_signo);
		  ptrace(PTRACE_CONT, tracee, NULL, NULL);
		  goto repeat;
	  }

	  for (i = 0; i < 3; i++) {
		  nanosleep(&ts1s, NULL);
		  printf("mother: SIGSTOP\n");
		  kill(tracee, SIGSTOP);
		  nanosleep(&ts1s, NULL);
		  printf("mother: SIGCONT\n");
		  kill(tracee, SIGCONT);
	  }
	  nanosleep(&ts1s, NULL);

	  kill(tracer, SIGKILL);
	  kill(tracee, SIGKILL);
	  return 0;
  }

In the above program, tracer keeps tracee running and gets
notification of each group stop state changes.

  # ./test-notify
  tracer: stopped=0 signo=5
  mother: SIGSTOP
  tracer: SIG 19
  tracer: stopped=1 signo=19
  mother: SIGCONT
  tracer: stopped=0 signo=5
  tracer: SIG 18
  mother: SIGSTOP
  tracer: SIG 19
  tracer: stopped=1 signo=19
  mother: SIGCONT
  tracer: stopped=0 signo=5
  tracer: SIG 18
  mother: SIGSTOP
  tracer: SIG 19
  tracer: stopped=1 signo=19
  mother: SIGCONT
  tracer: stopped=0 signo=5
  tracer: SIG 18

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/ptrace.h |    2 ++
 include/linux/sched.h  |    4 +++-
 kernel/signal.c        |   38 +++++++++++++++++++++++++++++++++++---
 3 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index f16a63f..43f762d 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -73,6 +73,8 @@
 #define PTRACE_EVENT_EXIT	6
 #define PTRACE_EVENT_STOP	7
 
+#define PTRACE_STOP_SI_CODE	(__SI_TRAP | SIGTRAP | PTRACE_EVENT_STOP << 8)
+
 /* flags in siginfo.si_pt_flags from PTRACE_GETSIGINFO */
 #define PTRACE_SI_STOPPED	0x00000001 /* tracee is job control stopped */
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 20f392b..161658f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1811,15 +1811,17 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 #define JOBCTL_STOP_PENDING_BIT	17	/* task should stop for group stop */
 #define JOBCTL_STOP_CONSUME_BIT	18	/* consume group stop count */
 #define JOBCTL_TRAP_STOP_BIT	19	/* trap for STOP */
+#define JOBCTL_TRAP_NOTIFY_BIT	20	/* trap for NOTIFY */
 #define JOBCTL_TRAPPING_BIT	21	/* switching to TRACED */
 
 #define JOBCTL_STOP_DEQUEUED	(1 << JOBCTL_STOP_DEQUEUED_BIT)
 #define JOBCTL_STOP_PENDING	(1 << JOBCTL_STOP_PENDING_BIT)
 #define JOBCTL_STOP_CONSUME	(1 << JOBCTL_STOP_CONSUME_BIT)
 #define JOBCTL_TRAP_STOP	(1 << JOBCTL_TRAP_STOP_BIT)
+#define JOBCTL_TRAP_NOTIFY	(1 << JOBCTL_TRAP_NOTIFY_BIT)
 #define JOBCTL_TRAPPING		(1 << JOBCTL_TRAPPING_BIT)
 
-#define JOBCTL_TRAP_MASK	JOBCTL_TRAP_STOP
+#define JOBCTL_TRAP_MASK	(JOBCTL_TRAP_STOP | JOBCTL_TRAP_NOTIFY)
 #define JOBCTL_PENDING_MASK	(JOBCTL_STOP_PENDING | JOBCTL_TRAP_MASK)
 
 extern bool task_set_jobctl_pending(struct task_struct *task,
diff --git a/kernel/signal.c b/kernel/signal.c
index c6ebdfe..5a72324 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -817,6 +817,30 @@ static int check_kill_permission(int sig, struct siginfo *info,
 	return security_task_kill(t, info, sig, 0);
 }
 
+/**
+ * ptrace_trap_notify - schedule trap to notify ptracer
+ * @t: tracee wanting to notify tracer
+ *
+ * This function schedules sticky ptrace trap which is cleared on the next
+ * TRAP_STOP to notify ptracer of an event.  @t must have been seized by
+ * ptracer.
+ *
+ * If @t is running, STOP trap will be taken.  If already trapped, STOP
+ * trap will be eventually taken without returning to userland after the
+ * existing traps are finished by PTRACE_CONT.
+ *
+ * CONTEXT:
+ * Must be called with @task->sighand->siglock held.
+ */
+static void ptrace_trap_notify(struct task_struct *t)
+{
+	WARN_ON_ONCE(!(t->ptrace & PT_SEIZED));
+	assert_spin_locked(&t->sighand->siglock);
+
+	task_set_jobctl_pending(t, JOBCTL_TRAP_NOTIFY);
+	signal_wake_up(t, 0);
+}
+
 /*
  * Handle magic process-wide effects of stop/continue signals. Unlike
  * the signal actions, these happen immediately at signal-generation
@@ -855,7 +879,10 @@ static int prepare_signal(int sig, struct task_struct *p, int from_ancestor_ns)
 		do {
 			task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
 			rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending);
-			wake_up_state(t, __TASK_STOPPED);
+			if (likely(!(t->ptrace & PT_SEIZED)))
+				wake_up_state(t, __TASK_STOPPED);
+			else
+				ptrace_trap_notify(t);
 		} while_each_thread(p, t);
 
 		/*
@@ -1797,8 +1824,10 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	if (why == CLD_STOPPED && (current->jobctl & JOBCTL_STOP_PENDING))
 		gstop_done = task_participate_group_stop(current);
 
-	/* any trap clears pending STOP trap */
+	/* any trap clears pending STOP trap, STOP trap clears NOTIFY */
 	task_clear_jobctl_pending(current, JOBCTL_TRAP_STOP);
+	if (info && info->si_code == PTRACE_STOP_SI_CODE)
+		task_clear_jobctl_pending(current, JOBCTL_TRAP_NOTIFY);
 
 	/* entering a trap, clear TRAPPING */
 	task_clear_jobctl_trapping(current);
@@ -1980,7 +2009,10 @@ static bool do_signal_stop(int signr)
 			if (!task_is_stopped(t) &&
 			    task_set_jobctl_pending(t, signr | gstop)) {
 				sig->group_stop_count++;
-				signal_wake_up(t, 0);
+				if (likely(!(t->ptrace & PT_SEIZED)))
+					signal_wake_up(t, 0);
+				else
+					ptrace_trap_notify(t);
 			}
 		}
 	}
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 17/17] ptrace: implement PTRACE_LISTEN
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
                   ` (15 preceding siblings ...)
  2011-05-29 23:12 ` [PATCH 16/17] ptrace: implement TRAP_NOTIFY and use it for group stop events Tejun Heo
@ 2011-05-29 23:12 ` Tejun Heo
  2011-06-02 17:33   ` Oleg Nesterov
  2011-05-30 15:42 ` [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Oleg Nesterov
  17 siblings, 1 reply; 50+ messages in thread
From: Tejun Heo @ 2011-05-29 23:12 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro, Tejun Heo

The previous patch implemented async notification for ptrace but it
only worked while trace is running.  This patch introduces
PTRACE_LISTEN which is suggested by Oleg Nestrov.

It's allowed iff tracee is in STOP trap and puts tracee into
quasi-running state - tracee never really runs but wait(2) and
ptrace(2) consider it to be running.  While ptracer is listening,
tracee is allowed to re-enter STOP to notify an async event.
Listening state is cleared on the first notification.  Ptracer can
also clear it by issuing INTERRUPT - tracee will re-trap into STOP
with listening state cleared.

This allows ptracer to monitor group stop state without running tracee
- use INTERRUPT to put tracee into STOP trap, issue LISTEN and then
wait(2) to wait for the next group stop event.  When it happens,
PTRACE_GETSIGINFO provides information to determine the current state.

Test program follows.

  #define PTRACE_SEIZE		0x4206
  #define PTRACE_INTERRUPT	0x4207
  #define PTRACE_LISTEN		0x4208

  #define PTRACE_SEIZE_DEVEL	0x80000000

  static const struct timespec ts1s = { .tv_sec = 1 };

  int main(int argc, char **argv)
  {
	  pid_t tracee, tracer;
	  int i;

	  tracee = fork();
	  if (!tracee)
		  while (1)
			  pause();

	  tracer = fork();
	  if (!tracer) {
		  int stopped;
		  siginfo_t si;

		  ptrace(PTRACE_SEIZE, tracee, NULL,
			 (void *)(unsigned long)PTRACE_SEIZE_DEVEL);
		  ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL);
	  repeat:
		  waitid(P_PID, tracee, NULL, WSTOPPED);

		  ptrace(PTRACE_GETSIGINFO, tracee, NULL, &si);
		  if (!si.si_code) {
			  printf("tracer: SIG %d\n", si.si_signo);
			  ptrace(PTRACE_CONT, tracee, NULL,
				 (void *)(unsigned long)si.si_signo);
			  goto repeat;
		  }
		  stopped = !!si.si_status;
		  printf("tracer: stopped=%d signo=%d\n", stopped, si.si_signo);
		  if (stopped)
			  ptrace(PTRACE_LISTEN, tracee, NULL, NULL);
		  else
			  ptrace(PTRACE_CONT, tracee, NULL, NULL);
		  goto repeat;
	  }

	  for (i = 0; i < 3; i++) {
		  nanosleep(&ts1s, NULL);
		  printf("mother: SIGSTOP\n");
		  kill(tracee, SIGSTOP);
		  nanosleep(&ts1s, NULL);
		  printf("mother: SIGCONT\n");
		  kill(tracee, SIGCONT);
	  }
	  nanosleep(&ts1s, NULL);

	  kill(tracer, SIGKILL);
	  kill(tracee, SIGKILL);
	  return 0;
  }

This is identical to the program to test TRAP_NOTIFY except that
tracee is PTRACE_LISTEN'd instead of PTRACE_CONT'd when group stopped.
This allows ptracer to monitor when group stop ends without running
tracee.

  # ./test-listen
  tracer: stopped=0 signo=5
  mother: SIGSTOP
  tracer: SIG 19
  tracer: stopped=1 signo=19
  mother: SIGCONT
  tracer: stopped=0 signo=5
  tracer: SIG 18
  mother: SIGSTOP
  tracer: SIG 19
  tracer: stopped=1 signo=19
  mother: SIGCONT
  tracer: stopped=0 signo=5
  tracer: SIG 18
  mother: SIGSTOP
  tracer: SIG 19
  tracer: stopped=1 signo=19
  mother: SIGCONT
  tracer: stopped=0 signo=5
  tracer: SIG 18

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/ptrace.h |    1 +
 include/linux/sched.h  |    2 ++
 kernel/exit.c          |    2 +-
 kernel/ptrace.c        |   42 +++++++++++++++++++++++++++++++++++++++---
 kernel/signal.c        |   13 +++++++++----
 5 files changed, 52 insertions(+), 8 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 43f762d..abcfb2b 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -49,6 +49,7 @@
 
 #define PTRACE_SEIZE		0x4206
 #define PTRACE_INTERRUPT	0x4207
+#define PTRACE_LISTEN		0x4208
 
 /* flags in @data for PTRACE_SEIZE */
 #define PTRACE_SEIZE_DEVEL	0x80000000 /* temp flag for development */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 161658f..58ee03e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1813,6 +1813,7 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 #define JOBCTL_TRAP_STOP_BIT	19	/* trap for STOP */
 #define JOBCTL_TRAP_NOTIFY_BIT	20	/* trap for NOTIFY */
 #define JOBCTL_TRAPPING_BIT	21	/* switching to TRACED */
+#define JOBCTL_LISTENING_BIT	22	/* ptracer is listening for events */
 
 #define JOBCTL_STOP_DEQUEUED	(1 << JOBCTL_STOP_DEQUEUED_BIT)
 #define JOBCTL_STOP_PENDING	(1 << JOBCTL_STOP_PENDING_BIT)
@@ -1820,6 +1821,7 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 #define JOBCTL_TRAP_STOP	(1 << JOBCTL_TRAP_STOP_BIT)
 #define JOBCTL_TRAP_NOTIFY	(1 << JOBCTL_TRAP_NOTIFY_BIT)
 #define JOBCTL_TRAPPING		(1 << JOBCTL_TRAPPING_BIT)
+#define JOBCTL_LISTENING	(1 << JOBCTL_LISTENING_BIT)
 
 #define JOBCTL_TRAP_MASK	(JOBCTL_TRAP_STOP | JOBCTL_TRAP_NOTIFY)
 #define JOBCTL_PENDING_MASK	(JOBCTL_STOP_PENDING | JOBCTL_TRAP_MASK)
diff --git a/kernel/exit.c b/kernel/exit.c
index 20a4064..b3bb3c2 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1416,7 +1416,7 @@ static int wait_task_stopped(struct wait_opts *wo,
 	spin_lock_irq(&p->sighand->siglock);
 
 	p_code = task_stopped_code(p, ptrace);
-	if (unlikely(!p_code))
+	if (unlikely(!p_code) || p->jobctl & JOBCTL_LISTENING)
 		goto unlock_sig;
 
 	exit_code = *p_code;
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index f1efe07..3862142 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -146,7 +146,8 @@ int ptrace_check_attach(struct task_struct *child, bool ignore_state)
 		 */
 		spin_lock_irq(&child->sighand->siglock);
 		WARN_ON_ONCE(task_is_stopped(child));
-		if (task_is_traced(child) || ignore_state)
+		if (ignore_state || (task_is_traced(child) &&
+				     !(child->jobctl & JOBCTL_LISTENING)))
 			ret = 0;
 		spin_unlock_irq(&child->sighand->siglock);
 	}
@@ -672,7 +673,7 @@ int ptrace_request(struct task_struct *child, long request,
 {
 	bool seized = child->ptrace & PT_SEIZED;
 	int ret = -EIO;
-	siginfo_t siginfo;
+	siginfo_t siginfo, *si;
 	void __user *datavp = (void __user *) data;
 	unsigned long __user *datalp = datavp;
 	unsigned long flags;
@@ -722,8 +723,43 @@ int ptrace_request(struct task_struct *child, long request,
 		if (unlikely(!seized || !lock_task_sighand(child, &flags)))
 			break;
 
+		/*
+		 * INTERRUPT doesn't disturb existing trap sans one
+		 * exception.  If ptracer issued LISTEN for the current
+		 * STOP, this INTERRUPT should clear LISTEN and re-trap
+		 * tracee into STOP.
+		 */
 		if (likely(task_set_jobctl_pending(child, JOBCTL_TRAP_STOP)))
-			signal_wake_up(child, 0);
+			signal_wake_up(child, child->jobctl & JOBCTL_LISTENING);
+
+		unlock_task_sighand(child, &flags);
+		ret = 0;
+		break;
+
+	case PTRACE_LISTEN:
+		/*
+		 * Listen for events.  Tracee must be in STOP.  It's not
+		 * resumed per-se but is not considered to be in TRACED by
+		 * wait(2) or ptrace(2).  If an async event (e.g. group
+		 * stop state change) happens, tracee will enter STOP trap
+		 * again.  Alternatively, ptracer can issue INTERRUPT to
+		 * finish listening and re-trap tracee into STOP.
+		 */
+		if (unlikely(!seized || !lock_task_sighand(child, &flags)))
+			break;
+
+		si = child->last_siginfo;
+		if (unlikely(!si || si->si_code != PTRACE_STOP_SI_CODE))
+			break;
+
+		child->jobctl |= JOBCTL_LISTENING;
+
+		/*
+		 * If NOTIFY is set, it means event happened between start
+		 * of this trap and now.  Trigger re-trap immediately.
+		 */
+		if (child->jobctl & JOBCTL_TRAP_NOTIFY)
+			signal_wake_up(child, true);
 
 		unlock_task_sighand(child, &flags);
 		ret = 0;
diff --git a/kernel/signal.c b/kernel/signal.c
index 5a72324..2ec1d08 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -825,9 +825,11 @@ static int check_kill_permission(int sig, struct siginfo *info,
  * TRAP_STOP to notify ptracer of an event.  @t must have been seized by
  * ptracer.
  *
- * If @t is running, STOP trap will be taken.  If already trapped, STOP
- * trap will be eventually taken without returning to userland after the
- * existing traps are finished by PTRACE_CONT.
+ * If @t is running, STOP trap will be taken.  If trapped for STOP and
+ * ptracer is listening for events, tracee is woken up so that it can
+ * re-trap for the new event.  If trapped otherwise, STOP trap will be
+ * eventually taken without returning to userland after the existing traps
+ * are finished by PTRACE_CONT.
  *
  * CONTEXT:
  * Must be called with @task->sighand->siglock held.
@@ -838,7 +840,7 @@ static void ptrace_trap_notify(struct task_struct *t)
 	assert_spin_locked(&t->sighand->siglock);
 
 	task_set_jobctl_pending(t, JOBCTL_TRAP_NOTIFY);
-	signal_wake_up(t, 0);
+	signal_wake_up(t, t->jobctl & JOBCTL_LISTENING);
 }
 
 /*
@@ -1894,6 +1896,9 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	spin_lock_irq(&current->sighand->siglock);
 	current->last_siginfo = NULL;
 
+	/* LISTENING can be set only during STOP traps, clear it */
+	current->jobctl &= ~JOBCTL_LISTENING;
+
 	/*
 	 * Queued signals ignored us while we were stopped for tracing.
 	 * So check for any that we should take before resuming user mode.
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
                   ` (16 preceding siblings ...)
  2011-05-29 23:12 ` [PATCH 17/17] ptrace: implement PTRACE_LISTEN Tejun Heo
@ 2011-05-30 15:42 ` Oleg Nesterov
  2011-06-01  5:39   ` Tejun Heo
  17 siblings, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2011-05-30 15:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

Hi Tejun,

On 05/30, Tejun Heo wrote:
>
> * PTRACE_LISTEN added.  This puts tracee into quasi-active state where
>   wait(2) and sync ptrace requests fail and tracee is allowed to
>   re-trap into STOP to notify an async event.

Nice!

Personally, I think the whole series is fine. So far I only read it
inline (without applying). I'll try to read it carefully patch-by-patch,
it is always good to try to re-check the details, but I do not expect
I'll find something which should delay this series from merging into
ptrace tree.

Thanks Tejun ;)

Oleg.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-05-30 15:42 ` [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Oleg Nesterov
@ 2011-06-01  5:39   ` Tejun Heo
  2011-06-02 12:31     ` Tejun Heo
  0 siblings, 1 reply; 50+ messages in thread
From: Tejun Heo @ 2011-06-01  5:39 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

Hey,

On Mon, May 30, 2011 at 05:42:11PM +0200, Oleg Nesterov wrote:
> On 05/30, Tejun Heo wrote:
> >
> > * PTRACE_LISTEN added.  This puts tracee into quasi-active state where
> >   wait(2) and sync ptrace requests fail and tracee is allowed to
> >   re-trap into STOP to notify an async event.
> 
> Nice!
> 
> Personally, I think the whole series is fine. So far I only read it
> inline (without applying). I'll try to read it carefully patch-by-patch,
> it is always good to try to re-check the details, but I do not expect
> I'll find something which should delay this series from merging into
> ptrace tree.

Yeah, yeah, PTRACE_LISTEN turned out to be pretty easy to implement.
I didn't really have to change too much.  I'll test different corner
cases (different thread doing waking/sleeping, race against INTERRUPT
kind of things) and let you know how it goes.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 01/17] ptrace: remove silly wait_trap variable from ptrace_attach()
  2011-05-29 23:12 ` [PATCH 01/17] ptrace: remove silly wait_trap variable from ptrace_attach() Tejun Heo
@ 2011-06-01 18:47   ` Oleg Nesterov
  2011-06-02  5:03     ` Tejun Heo
  2011-06-02 11:39   ` [PATCH UPDATED " Tejun Heo
  1 sibling, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2011-06-01 18:47 UTC (permalink / raw)
  To: Tejun Heo
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

On 05/30, Tejun Heo wrote:
>
> Local variable wait_trap is used to determine whether to call
> wait_event() on TRAPPING or not, which doesn't change or optimize
> anything.  Remove it.
>
> ...
>
> -	if (wait_trap)
> -		wait_event(current->signal->wait_chldexit,
> -			   !(task->group_stop & GROUP_STOP_TRAPPING));
> +	wait_event(current->signal->wait_chldexit,
> +		   !(task->group_stop & GROUP_STOP_TRAPPING));
>  	return retval;

Well, it doesn't change anything, but only if ptrace_attach() succeeds.
The caller should not wait if STOP_TRAPPING was already set by another
tracer and we are going to fail. Afaics, nothing really bad can happen
but still this doesn't look very clean.

And. Please note that this patch is buggy until 8/17 "use bit_waitqueue
for TRAPPING", wait_event(current->signal->wait_chldexit) can hang forever
in this case since we are not ->parent.

I agree, wait_trap should go away. We can hit STOP_TRAPPING after attach
if we change detach to set this bit. But perhaps it would be more clean
to not call wait_event/wait_bit unconditionally anyway, we can check
retval == 0.

Oleg.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 11/17] ptrace: implement PTRACE_SEIZE
  2011-05-29 23:12 ` [PATCH 11/17] ptrace: implement PTRACE_SEIZE Tejun Heo
@ 2011-06-01 19:01   ` Oleg Nesterov
  2011-06-01 19:55     ` Oleg Nesterov
  2011-06-02  5:13     ` Tejun Heo
  2011-06-02 11:43   ` [PATCH UPDATED " Tejun Heo
  1 sibling, 2 replies; 50+ messages in thread
From: Oleg Nesterov @ 2011-06-01 19:01 UTC (permalink / raw)
  To: Tejun Heo
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

On 05/30, Tejun Heo wrote:
>
> implements a new ptrace request PTRACE_SEIZE which attaches and traps
> tracee

it doesn't trap the tracee ;)

> * PTRACE_SEIZE, unlike ATTACH, doesn't force tracee to trap.  After
>   attaching tracee continues to run unless a trap condition occurs.

OK.

Just to remind, tracehook_report_clone() shouldn't send SIGSTOP if
the auto-attached tracee is PT_SEIZED.

> * If PTRACE_SEIZE'd, group stop uses PTRACE_EVENT_STOP trap which uses
>   exit_code of (SIGTRAP | PTRACE_EVENT_STOP << 8) instead of the
>   stopping signal number

Hmm. May be it would be better to report stopping_signal | PTRACE_EVENT_STOP
instead... I am not sure yet, but it seems this way we can avoid the
PTRACE_GETSIGINFO changes. I'll try to explain later, when I finsh the
reading.

Oleg.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 11/17] ptrace: implement PTRACE_SEIZE
  2011-06-01 19:01   ` Oleg Nesterov
@ 2011-06-01 19:55     ` Oleg Nesterov
  2011-06-02  5:13     ` Tejun Heo
  1 sibling, 0 replies; 50+ messages in thread
From: Oleg Nesterov @ 2011-06-01 19:55 UTC (permalink / raw)
  To: Tejun Heo
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

On 06/01, Oleg Nesterov wrote:
>
> On 05/30, Tejun Heo wrote:
> >
> > * If PTRACE_SEIZE'd, group stop uses PTRACE_EVENT_STOP trap which uses
> >   exit_code of (SIGTRAP | PTRACE_EVENT_STOP << 8) instead of the
> >   stopping signal number
>
> Hmm. May be it would be better to report stopping_signal | PTRACE_EVENT_STOP
> instead... I am not sure yet, but it seems this way we can avoid the
> PTRACE_GETSIGINFO changes. I'll try to explain later, when I finsh the
> reading.

Tomorrow. I do not see any problems, the whole series looks fine to me.
Except I have some minor concerns which I need to re-check and summarize.

Oleg.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 01/17] ptrace: remove silly wait_trap variable from ptrace_attach()
  2011-06-01 18:47   ` Oleg Nesterov
@ 2011-06-02  5:03     ` Tejun Heo
  0 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-06-02  5:03 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

Hello,

On Wed, Jun 01, 2011 at 08:47:43PM +0200, Oleg Nesterov wrote:
> On 05/30, Tejun Heo wrote:
> > -	if (wait_trap)
> > -		wait_event(current->signal->wait_chldexit,
> > -			   !(task->group_stop & GROUP_STOP_TRAPPING));
> > +	wait_event(current->signal->wait_chldexit,
> > +		   !(task->group_stop & GROUP_STOP_TRAPPING));
> >  	return retval;
> 
> Well, it doesn't change anything, but only if ptrace_attach() succeeds.
> The caller should not wait if STOP_TRAPPING was already set by another
> tracer and we are going to fail. Afaics, nothing really bad can happen
> but still this doesn't look very clean.
> 
> And. Please note that this patch is buggy until 8/17 "use bit_waitqueue
> for TRAPPING", wait_event(current->signal->wait_chldexit) can hang forever
> in this case since we are not ->parent.

Ah, darn it.  It was part of later patch which got dropped and I just
salavaged this part and put it at the front forgetting about the
chldexit conversion.

> I agree, wait_trap should go away. We can hit STOP_TRAPPING after attach
> if we change detach to set this bit. But perhaps it would be more clean
> to not call wait_event/wait_bit unconditionally anyway, we can check
> retval == 0.

Yeap, that sounds about right.  Will post updated patch.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 11/17] ptrace: implement PTRACE_SEIZE
  2011-06-01 19:01   ` Oleg Nesterov
  2011-06-01 19:55     ` Oleg Nesterov
@ 2011-06-02  5:13     ` Tejun Heo
  1 sibling, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-06-02  5:13 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

Hey,

On Wed, Jun 01, 2011 at 09:01:17PM +0200, Oleg Nesterov wrote:
> On 05/30, Tejun Heo wrote:
> > implements a new ptrace request PTRACE_SEIZE which attaches and traps
> > tracee
> 
> it doesn't trap the tracee ;)

Ah, missed that one.  Will update.

> > * PTRACE_SEIZE, unlike ATTACH, doesn't force tracee to trap.  After
> >   attaching tracee continues to run unless a trap condition occurs.
> 
> OK.
> 
> Just to remind, tracehook_report_clone() shouldn't send SIGSTOP if
> the auto-attached tracee is PT_SEIZED.

Yeap, let's deal with it (and others) later.

> > * If PTRACE_SEIZE'd, group stop uses PTRACE_EVENT_STOP trap which uses
> >   exit_code of (SIGTRAP | PTRACE_EVENT_STOP << 8) instead of the
> >   stopping signal number
> 
> Hmm. May be it would be better to report stopping_signal | PTRACE_EVENT_STOP
> instead... I am not sure yet, but it seems this way we can avoid the
> PTRACE_GETSIGINFO changes. I'll try to explain later, when I finsh the
> reading.

Maybe, unsure.  Currently all the existing PTRACE_EVENT_* codes use
SIGTRAP and requiring GETSIGINFO on TRAP_STOP seems reasonable enough.
So, (stopping signo) | EVENT_STOP on stops, SIGTRAP | EVENT_STOP on
INTERRUPT and SIGCONT | EVENT_STOP (hmm....) on continued?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH UPDATED 01/17] ptrace: remove silly wait_trap variable from ptrace_attach()
  2011-05-29 23:12 ` [PATCH 01/17] ptrace: remove silly wait_trap variable from ptrace_attach() Tejun Heo
  2011-06-01 18:47   ` Oleg Nesterov
@ 2011-06-02 11:39   ` Tejun Heo
  1 sibling, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-06-02 11:39 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

>From ead8735f43a21993c78e6e4ed5627da6ec4911bd Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Thu, 2 Jun 2011 11:13:59 +0200

Remove local variable wait_trap which determines whether to wait for
!TRAPPING or not and simply wait for it if attach was successful.

-v2: Oleg pointed out wait should happen iff attach was successful.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 kernel/ptrace.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 2df1157..4f689cb 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -184,7 +184,6 @@ bool ptrace_may_access(struct task_struct *task, unsigned int mode)
 
 static int ptrace_attach(struct task_struct *task)
 {
-	bool wait_trap = false;
 	int retval;
 
 	audit_ptrace(task);
@@ -246,7 +245,6 @@ static int ptrace_attach(struct task_struct *task)
 	if (task_is_stopped(task)) {
 		task->group_stop |= GROUP_STOP_PENDING | GROUP_STOP_TRAPPING;
 		signal_wake_up(task, 1);
-		wait_trap = true;
 	}
 
 	spin_unlock(&task->sighand->siglock);
@@ -257,7 +255,7 @@ unlock_tasklist:
 unlock_creds:
 	mutex_unlock(&task->signal->cred_guard_mutex);
 out:
-	if (wait_trap)
+	if (!retval)
 		wait_event(current->signal->wait_chldexit,
 			   !(task->group_stop & GROUP_STOP_TRAPPING));
 	return retval;
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH UPDATED 08/17] ptrace: use bit_waitqueue for TRAPPING instead of wait_chldexit
  2011-05-29 23:12 ` [PATCH 08/17] ptrace: use bit_waitqueue for TRAPPING instead of wait_chldexit Tejun Heo
@ 2011-06-02 11:41   ` Tejun Heo
  0 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-06-02 11:41 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

>From abf3702a74e4e9f6fe4250bc82b80f8421b1dee3 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Thu, 2 Jun 2011 11:14:00 +0200

ptracer->signal->wait_chldexit was used to wait for TRAPPING; however,
->wait_chldexit was already complicated with waker-side filtering
without adding TRAPPING wait on top of it.  Also, it unnecessarily
made TRAPPING clearing depend on the current ptrace relationship - if
the ptracee is detached, wakeup is lost.

There is no reason to use signal->wait_chldexit here.  We're just
waiting for JOBCTL_TRAPPING bit to clear and given the relatively
infrequent use of ptrace, bit_waitqueue can serve it perfectly.

This patch makes JOBCTL_TRAPPING wait use bit_waitqueue instead of
signal->wait_chldexit.

-v2: Use JOBCTL_*_BIT macros instead of ilog2() as suggested by Linus.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
Regenerated to apply on top of the updated first patch.  Other than
that, no change.

Thanks.

 kernel/ptrace.c |   10 ++++++++--
 kernel/signal.c |    3 +--
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 0c37d99..7f05f3a 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -25,6 +25,12 @@
 #include <linux/hw_breakpoint.h>
 
 
+static int ptrace_trapping_sleep_fn(void *flags)
+{
+	schedule();
+	return 0;
+}
+
 /*
  * ptrace a task: make the debugger its new parent and
  * move it to the ptrace list.
@@ -270,8 +276,8 @@ unlock_creds:
 	mutex_unlock(&task->signal->cred_guard_mutex);
 out:
 	if (!retval)
-		wait_event(current->signal->wait_chldexit,
-			   !(task->jobctl & JOBCTL_TRAPPING));
+		wait_on_bit(&task->jobctl, JOBCTL_TRAPPING_BIT,
+			    ptrace_trapping_sleep_fn, TASK_UNINTERRUPTIBLE);
 	return retval;
 }
 
diff --git a/kernel/signal.c b/kernel/signal.c
index 9ab91c5..172a4c7 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -272,8 +272,7 @@ static void task_clear_jobctl_trapping(struct task_struct *task)
 {
 	if (unlikely(task->jobctl & JOBCTL_TRAPPING)) {
 		task->jobctl &= ~JOBCTL_TRAPPING;
-		__wake_up_sync_key(&task->parent->signal->wait_chldexit,
-				   TASK_UNINTERRUPTIBLE, 1, task);
+		wake_up_bit(&task->jobctl, JOBCTL_TRAPPING_BIT);
 	}
 }
 
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH UPDATED 11/17] ptrace: implement PTRACE_SEIZE
  2011-05-29 23:12 ` [PATCH 11/17] ptrace: implement PTRACE_SEIZE Tejun Heo
  2011-06-01 19:01   ` Oleg Nesterov
@ 2011-06-02 11:43   ` Tejun Heo
  1 sibling, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-06-02 11:43 UTC (permalink / raw)
  To: oleg
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

>From df66a98de92fe06bef9db97f3925b89f9f30a79a Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Thu, 2 Jun 2011 11:14:00 +0200

PTRACE_ATTACH implicitly issues SIGSTOP on attach which has side
effects on tracee signal and job control states.  This patch
implements a new ptrace request PTRACE_SEIZE which attaches a tracee
without trapping it or affecting its signal and job control states.

The usage is the same with PTRACE_ATTACH but it takes PTRACE_SEIZE_*
flags in @data.  Currently, the only defined flag is
PTRACE_SEIZE_DEVEL which is a temporary flag to enable PTRACE_SEIZE.
PTRACE_SEIZE will change ptrace behaviors outside of attach itself.
The changes will be implemented gradually and the DEVEL flag is to
prevent programs which expect full SEIZE behavior from using it before
all the behavior modifications are complete while allowing unit
testing.  The flag will be removed once SEIZE behaviors are completely
implemented.

* PTRACE_SEIZE, unlike ATTACH, doesn't force tracee to trap.  After
  attaching tracee continues to run unless a trap condition occurs.

* PTRACE_SEIZE doesn't affect signal or group stop state.

* If PTRACE_SEIZE'd, group stop uses PTRACE_EVENT_STOP trap which uses
  exit_code of (SIGTRAP | PTRACE_EVENT_STOP << 8) instead of the
  stopping signal number and returns usual trap siginfo on
  PTRACE_GETSIGINFO instead of NULL.

Note that there currently is no way to find out the stopping signal
number while seized.  This will be improved by future patches.

Seizing sets PT_SEIZED in ->ptrace of the tracee.  This flag will be
used to determine whether new SEIZE behaviors should be enabled.

Test program follows.

  #define PTRACE_SEIZE		0x4206
  #define PTRACE_SEIZE_DEVEL	0x80000000

  static const struct timespec ts100ms = { .tv_nsec = 100000000 };
  static const struct timespec ts1s = { .tv_sec = 1 };
  static const struct timespec ts3s = { .tv_sec = 3 };

  int main(int argc, char **argv)
  {
	  pid_t tracee;

	  tracee = fork();
	  if (tracee == 0) {
		  nanosleep(&ts100ms, NULL);
		  while (1) {
			  printf("tracee: alive\n");
			  nanosleep(&ts1s, NULL);
		  }
	  }

	  if (argc > 1)
		  kill(tracee, SIGSTOP);

	  nanosleep(&ts100ms, NULL);

	  ptrace(PTRACE_SEIZE, tracee, NULL,
		 (void *)(unsigned long)PTRACE_SEIZE_DEVEL);
	  if (argc > 1) {
		  waitid(P_PID, tracee, NULL, WSTOPPED);
		  ptrace(PTRACE_CONT, tracee, NULL, NULL);
	  }
	  nanosleep(&ts3s, NULL);
	  printf("tracer: exiting\n");
	  return 0;
  }

When the above program is called w/o argument, tracee is seized while
running and remains running.  When tracer exits, tracee continues to
run and print out messages.

  # ./test-seize-simple
  tracee: alive
  tracee: alive
  tracee: alive
  tracer: exiting
  tracee: alive
  tracee: alive

When called with an argument, tracee is seized from stopped state and
continued, and returns to stopped state when tracer exits.

  # ./test-seize
  tracee: alive
  tracee: alive
  tracee: alive
  tracer: exiting
  # ps -el|grep test-seize
  1 T     0  4720     1  0  80   0 -   941 signal ttyS0    00:00:00 test-seize

-v2: SEIZE doesn't schedule TRAP_STOP and leaves tracee running as Jan
     suggested.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
---
Patch description updated.  Other than that, nothing changed.

Thanks.

 include/linux/ptrace.h |    7 +++++++
 kernel/ptrace.c        |   35 +++++++++++++++++++++++++++++------
 kernel/signal.c        |   32 ++++++++++++++++++++++++--------
 3 files changed, 60 insertions(+), 14 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index e93ef1a..67ad3f1 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -47,6 +47,11 @@
 #define PTRACE_GETREGSET	0x4204
 #define PTRACE_SETREGSET	0x4205
 
+#define PTRACE_SEIZE		0x4206
+
+/* flags in @data for PTRACE_SEIZE */
+#define PTRACE_SEIZE_DEVEL	0x80000000 /* temp flag for development */
+
 /* options set using PTRACE_SETOPTIONS */
 #define PTRACE_O_TRACESYSGOOD	0x00000001
 #define PTRACE_O_TRACEFORK	0x00000002
@@ -65,6 +70,7 @@
 #define PTRACE_EVENT_EXEC	4
 #define PTRACE_EVENT_VFORK_DONE	5
 #define PTRACE_EVENT_EXIT	6
+#define PTRACE_EVENT_STOP	7
 
 #include <asm/ptrace.h>
 
@@ -77,6 +83,7 @@
  * flags.  When the a task is stopped the ptracer owns task->ptrace.
  */
 
+#define PT_SEIZED	0x00010000	/* SEIZE used, enable new behavior */
 #define PT_PTRACED	0x00000001
 #define PT_DTRACE	0x00000002	/* delayed trace (used on m68k, i386) */
 #define PT_TRACESYSGOOD	0x00000004
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 45a8a4c..dcf9f97 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -209,10 +209,28 @@ bool ptrace_may_access(struct task_struct *task, unsigned int mode)
 	return !err;
 }
 
-static int ptrace_attach(struct task_struct *task)
+static int ptrace_attach(struct task_struct *task, long request,
+			 unsigned long flags)
 {
+	bool seize = (request == PTRACE_SEIZE);
 	int retval;
 
+	/*
+	 * SEIZE will enable new ptrace behaviors which will be implemented
+	 * gradually.  SEIZE_DEVEL is used to prevent applications
+	 * expecting full SEIZE behaviors trapping on kernel commits which
+	 * are still in the process of implementing them.
+	 *
+	 * Only test programs for new ptrace behaviors being implemented
+	 * should set SEIZE_DEVEL.  If unset, SEIZE will fail with -EIO.
+	 *
+	 * Once SEIZE behaviors are completely implemented, this flag and
+	 * the following test will be removed.
+	 */
+	retval = -EIO;
+	if (seize && !(flags & PTRACE_SEIZE_DEVEL))
+		goto out;
+
 	audit_ptrace(task);
 
 	retval = -EPERM;
@@ -244,11 +262,16 @@ static int ptrace_attach(struct task_struct *task)
 		goto unlock_tasklist;
 
 	task->ptrace = PT_PTRACED;
+	if (seize)
+		task->ptrace |= PT_SEIZED;
 	if (task_ns_capable(task, CAP_SYS_PTRACE))
 		task->ptrace |= PT_PTRACE_CAP;
 
 	__ptrace_link(task, current);
-	send_sig_info(SIGSTOP, SEND_SIG_FORCED, task);
+
+	/* SEIZE doesn't trap tracee on attach */
+	if (!seize)
+		send_sig_info(SIGSTOP, SEND_SIG_FORCED, task);
 
 	spin_lock(&task->sighand->siglock);
 
@@ -785,8 +808,8 @@ SYSCALL_DEFINE4(ptrace, long, request, long, pid, unsigned long, addr,
 		goto out;
 	}
 
-	if (request == PTRACE_ATTACH) {
-		ret = ptrace_attach(child);
+	if (request == PTRACE_ATTACH || request == PTRACE_SEIZE) {
+		ret = ptrace_attach(child, request, data);
 		/*
 		 * Some architectures need to do book-keeping after
 		 * a ptrace attach.
@@ -927,8 +950,8 @@ asmlinkage long compat_sys_ptrace(compat_long_t request, compat_long_t pid,
 		goto out;
 	}
 
-	if (request == PTRACE_ATTACH) {
-		ret = ptrace_attach(child);
+	if (request == PTRACE_ATTACH || request == PTRACE_SEIZE) {
+		ret = ptrace_attach(child, request, data);
 		/*
 		 * Some architectures need to do book-keeping after
 		 * a ptrace attach.
diff --git a/kernel/signal.c b/kernel/signal.c
index 262bb6c..434243e 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1873,7 +1873,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, siginfo_t *info)
 	recalc_sigpending_tsk(current);
 }
 
-void ptrace_notify(int exit_code)
+static void ptrace_do_notify(int exit_code, int why)
 {
 	siginfo_t info;
 
@@ -1886,8 +1886,13 @@ void ptrace_notify(int exit_code)
 	info.si_uid = current_uid();
 
 	/* Let the debugger run.  */
+	ptrace_stop(exit_code, why, 1, &info);
+}
+
+void ptrace_notify(int exit_code)
+{
 	spin_lock_irq(&current->sighand->siglock);
-	ptrace_stop(exit_code, CLD_TRAPPED, 1, &info);
+	ptrace_do_notify(exit_code, CLD_TRAPPED);
 	spin_unlock_irq(&current->sighand->siglock);
 }
 
@@ -2119,14 +2124,25 @@ relock:
 			goto relock;
 
 		/*
-		 * Take care of ptrace jobctl traps.  It is currently used
-		 * only to trap for group stop while ptraced.
+		 * Take care of ptrace jobctl traps.
+		 *
+		 * When PT_SEIZED, it's used for both group stop and
+		 * explicit SEIZE/INTERRUPT traps.  Both generate
+		 * PTRACE_EVENT_STOP trap with accompanying siginfo.
+		 *
+		 * When !PT_SEIZED, it's used only for group stop trap with
+		 * stop signal number as exit_code and no siginfo.
 		 */
 		if (unlikely(current->jobctl & JOBCTL_TRAP_MASK)) {
-			signr = current->jobctl & JOBCTL_STOP_SIGMASK;
-			WARN_ON_ONCE(!signr);
-			ptrace_stop(signr, CLD_STOPPED, 0, NULL);
-			current->exit_code = 0;
+			if (current->ptrace & PT_SEIZED) {
+				ptrace_do_notify(SIGTRAP | PTRACE_EVENT_STOP<<8,
+						 CLD_STOPPED);
+			} else {
+				signr = current->jobctl & JOBCTL_STOP_SIGMASK;
+				WARN_ON_ONCE(!signr);
+				ptrace_stop(signr, CLD_STOPPED, 0, NULL);
+				current->exit_code = 0;
+			}
 			spin_unlock_irq(&sighand->siglock);
 			goto relock;
 		}
-- 
1.7.5.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-01  5:39   ` Tejun Heo
@ 2011-06-02 12:31     ` Tejun Heo
  2011-06-02 14:51       ` Denys Vlasenko
  2011-06-02 18:27       ` Oleg Nesterov
  0 siblings, 2 replies; 50+ messages in thread
From: Tejun Heo @ 2011-06-02 12:31 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

Hello, Oleg.

On Wed, Jun 01, 2011 at 02:39:03PM +0900, Tejun Heo wrote:
> I'll test different corner cases (different thread doing
> waking/sleeping, race against INTERRUPT kind of things) and let you
> know how it goes.

I've tested threaded one and INTERRUPT immediate re-triggering and at
least the apparent cases work fine.  I've re-generated the git tree on
top of 3.0-rc1 with the updated patches.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-ptrace-seize

The commit ID should be 7eae14288e745cfaabeb035fa67f89eb63a1aff7.  As
git.korg seems to be taking quite a while to sync, it might be a
better idea to pull from master.korg.

 ssh://master.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-ptrace-seize

IIRC, pending issues are...

* Which signo to use in exit_code on STOP traps.

* Implicit signal on clone.

* What to do about events which are reported by genuine SIGTRAP
  generation?

* Group leader exit issue.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-02 12:31     ` Tejun Heo
@ 2011-06-02 14:51       ` Denys Vlasenko
  2011-06-03  1:24         ` Tejun Heo
  2011-06-02 18:27       ` Oleg Nesterov
  1 sibling, 1 reply; 50+ messages in thread
From: Denys Vlasenko @ 2011-06-02 14:51 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, jan.kratochvil, linux-kernel, torvalds, akpm,
	indan, bdonlan, pedro

On Thu, Jun 2, 2011 at 2:31 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Oleg.
>
> On Wed, Jun 01, 2011 at 02:39:03PM +0900, Tejun Heo wrote:
>> I'll test different corner cases (different thread doing
>> waking/sleeping, race against INTERRUPT kind of things) and let you
>> know how it goes.
>
> I've tested threaded one and INTERRUPT immediate re-triggering and at
> least the apparent cases work fine.  I've re-generated the git tree on
> top of 3.0-rc1 with the updated patches.
>
>  git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-ptrace-seize
>
> The commit ID should be 7eae14288e745cfaabeb035fa67f89eb63a1aff7.  As
> git.korg seems to be taking quite a while to sync, it might be a
> better idea to pull from master.korg.
>
>  ssh://master.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-ptrace-seize
>
> IIRC, pending issues are...
>
> * Which signo to use in exit_code on STOP traps.

SIGSTOP/TSTP/TTIN/TTOU on entering group-stop,
SIGCONT on leaving group-stop.

> * Implicit signal on clone.

Best if it is converted to STOP trap (the same is one caused by INTERRUPT).

I guess this may be optionally changed
(similar to how PTRACE_O_TRACEEXEC
changes post-execve SIGTRAP into PTRACE_EVENT_EXEC).

Why not turn it on *unconditionally* on SEIZE?
Because otherwise ptrace users will turn into

if (we_used_SEIZE)
    do_something;
else
    do_something_else;

maze, which is maintenance nightmare.
It's possible users will opt to not use new functionality at all
instead of going that way.

If everything is monolithically tied into SEIZE, users won't be able
to opt to use only easy parts of new functionality (such as
PTRACE_INTERRUPT and PTRACE_LISTEN) if this *forces* them
to also use harder parts of new functionality, in this case
forces them to double and obfuscate their existing code
which handles SIGSTOP-on-child-auto-attach. They don't really
want to, since this SIGSTOP *in practice* isn't that problematic.


> * What to do about events which are reported by genuine SIGTRAP
>  generation?

I don't understand what you meant here. Example(s)?


> * Group leader exit issue.

Ohhh this is an ugly one. It turns out it is linked to the question
of "how execve works under ptrace", in a non-obvious way.
I will respond in the second thread, with an example of current
kernel's breakage.

-- 
vda

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 17/17] ptrace: implement PTRACE_LISTEN
  2011-05-29 23:12 ` [PATCH 17/17] ptrace: implement PTRACE_LISTEN Tejun Heo
@ 2011-06-02 17:33   ` Oleg Nesterov
  2011-06-13 14:10     ` Tejun Heo
  0 siblings, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2011-06-02 17:33 UTC (permalink / raw)
  To: Tejun Heo
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

On 05/30, Tejun Heo wrote:
>
> This allows ptracer to monitor group stop state without running tracee
> - use INTERRUPT to put tracee into STOP trap, issue LISTEN and then
> wait(2) to wait for the next group stop event.  When it happens,
> PTRACE_GETSIGINFO provides information to determine the current state.

Great. Just a couple of questions,

> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -1416,7 +1416,7 @@ static int wait_task_stopped(struct wait_opts *wo,
>  	spin_lock_irq(&p->sighand->siglock);
>
>  	p_code = task_stopped_code(p, ptrace);
> -	if (unlikely(!p_code))
> +	if (unlikely(!p_code) || p->jobctl & JOBCTL_LISTENING)
>  		goto unlock_sig;

Up to you, but perhaps this JOBCTL_LISTENING check should go into
task_stopped_code() ? Or do you think we can't check it without
siglock?

> +	case PTRACE_LISTEN:
> +		/*
> +		 * Listen for events.  Tracee must be in STOP.  It's not
> +		 * resumed per-se but is not considered to be in TRACED by
> +		 * wait(2) or ptrace(2).  If an async event (e.g. group
> +		 * stop state change) happens, tracee will enter STOP trap
> +		 * again.  Alternatively, ptracer can issue INTERRUPT to
> +		 * finish listening and re-trap tracee into STOP.
> +		 */
> +		if (unlikely(!seized || !lock_task_sighand(child, &flags)))
> +			break;
> +
> +		si = child->last_siginfo;
> +		if (unlikely(!si || si->si_code != PTRACE_STOP_SI_CODE))
> +			break;
> +
> +		child->jobctl |= JOBCTL_LISTENING;
> +
> +		/*
> +		 * If NOTIFY is set, it means event happened between start
> +		 * of this trap and now.  Trigger re-trap immediately.
> +		 */
> +		if (child->jobctl & JOBCTL_TRAP_NOTIFY)
> +			signal_wake_up(child, true);

Again, I won't insist if you prefer signal_wake_up(), but afaics
wake_up_state(__TASK_TRACED) should be enough.

> @@ -838,7 +840,7 @@ static void ptrace_trap_notify(struct task_struct *t)
>  	assert_spin_locked(&t->sighand->siglock);
>
>  	task_set_jobctl_pending(t, JOBCTL_TRAP_NOTIFY);
> -	signal_wake_up(t, 0);
> +	signal_wake_up(t, t->jobctl & JOBCTL_LISTENING);
>  }

OK. The only thing I can't understand is why prepare_signal(SIGCONT)
calls ptrace_trap_notify() unconditionally. How about

		if (likely(!(t->ptrace & PT_SEIZED)))
			wake_up_state(t, __TASK_STOPPED);
	-	else
	+	else if (why)
			ptrace_trap_notify(t);

?

Oleg.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-02 12:31     ` Tejun Heo
  2011-06-02 14:51       ` Denys Vlasenko
@ 2011-06-02 18:27       ` Oleg Nesterov
  2011-06-02 21:09         ` Denys Vlasenko
  1 sibling, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2011-06-02 18:27 UTC (permalink / raw)
  To: Tejun Heo
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

Hi Tejun,

On 06/02, Tejun Heo wrote:
>
> I've tested threaded one and INTERRUPT immediate re-triggering and at
> least the apparent cases work fine.  I've re-generated the git tree on
> top of 3.0-rc1 with the updated patches.
>
>  git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-ptrace-seize

Everything looks fine to me.

I feel we can cleanup this code a little bit, but we can do this later.

Only one thing. I think it makes sense to discuss the idea from Denys,

> * Which signo to use in exit_code on STOP traps.

Yes. other pending issues can be solved later.

So,

	static void ptrace_do_notify(int exit_code, int why)
	{
		info.si_signo = SIGTRAP;
		info.si_code = __SI_TRAP | exit_code;

		if ((current->ptrace & PT_SEIZED) &&
		    (sig->group_stop_count || sig->flags & SIGNAL_STOP_STOPPED)) {
			info.si_pt_flags |= PTRACE_SI_STOPPED;
			info.si_signo = current->jobctl & JOBCTL_STOP_SIGMASK;
			WARN_ON_ONCE(!info.si_signo);
		}

	}

Can't we set si_pt_flags only if PTRACE_EVENT_STOP? Afaics, this
should be enough to support the jobctl tracking.

If yes, then can't we encode si_pt_flags in task->exit_code which
is "visible" to wait?

IOW, ptrace_do_notify(PTRACE_EVENT_STOP) path should use

	exit_code = signr | PTRACE_EVENT_STOP<<8;

and signr is roughly calculated as

	if (group_stop_count || SIGNAL_STOP_STOPPED)
		signr = jobctl & JOBCTL_STOP_SIGMASK;
	else if (JOBCTL_TRAP_NOTIFY)
		signr = SIGCONT;
	else
		signr =  SIGTRAP; // PTRACE_INTERRUPT

In this case we can avoid all siginfo changes. The tracer does wait(status)
anyway, it can see the state without GETSIGINFO. The only problem, the tracer
should be careful to avoid the confusion with ptrace_signal(), it should
check status & (PTRACE_EVENT_STOP << 16).

What do you think?

Oleg.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-02 18:27       ` Oleg Nesterov
@ 2011-06-02 21:09         ` Denys Vlasenko
  2011-06-03  1:34           ` Tejun Heo
  0 siblings, 1 reply; 50+ messages in thread
From: Denys Vlasenko @ 2011-06-02 21:09 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Tejun Heo, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

On Thursday 02 June 2011 20:27, Oleg Nesterov wrote:
> Hi Tejun,
> 
> On 06/02, Tejun Heo wrote:
> >
> > I've tested threaded one and INTERRUPT immediate re-triggering and at
> > least the apparent cases work fine.  I've re-generated the git tree on
> > top of 3.0-rc1 with the updated patches.
> >
> >  git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-ptrace-seize
> 
> Everything looks fine to me.
> 
> I feel we can cleanup this code a little bit, but we can do this later.
> 
> Only one thing. I think it makes sense to discuss the idea from Denys,
> 
> > * Which signo to use in exit_code on STOP traps.
> 
> Yes. other pending issues can be solved later.
> 
> So,
> 
> 	static void ptrace_do_notify(int exit_code, int why)
> 	{
> 		info.si_signo = SIGTRAP;
> 		info.si_code = __SI_TRAP | exit_code;
> 
> 		if ((current->ptrace & PT_SEIZED) &&
> 		    (sig->group_stop_count || sig->flags & SIGNAL_STOP_STOPPED)) {
> 			info.si_pt_flags |= PTRACE_SI_STOPPED;
> 			info.si_signo = current->jobctl & JOBCTL_STOP_SIGMASK;
> 			WARN_ON_ONCE(!info.si_signo);
> 		}
> 
> 	}
> 
> Can't we set si_pt_flags only if PTRACE_EVENT_STOP? Afaics, this
> should be enough to support the jobctl tracking.
> 
> If yes, then can't we encode si_pt_flags in task->exit_code which
> is "visible" to wait?
> 
> IOW, ptrace_do_notify(PTRACE_EVENT_STOP) path should use
> 
> 	exit_code = signr | PTRACE_EVENT_STOP<<8;
> 
> and signr is roughly calculated as
> 
> 	if (group_stop_count || SIGNAL_STOP_STOPPED)
> 		signr = jobctl & JOBCTL_STOP_SIGMASK;
> 	else if (JOBCTL_TRAP_NOTIFY)
> 		signr = SIGCONT;
> 	else
> 		signr =  SIGTRAP; // PTRACE_INTERRUPT
> 
> In this case we can avoid all siginfo changes. The tracer does wait(status)
> anyway, it can see the state without GETSIGINFO. The only problem, the tracer
> should be careful to avoid the confusion with ptrace_signal(), it should
> check status & (PTRACE_EVENT_STOP << 16).
> 
> What do you think?

This should alleviate Linus' concerns that we are suffering from unnecessary
featuritis - that we invent API which is significantly different
from existing one, and as such users will not use it.

Existing API doesn't use GETSIGINFO data per se to distinguish group-stop from
signal-delivery-stop, it uses the fact that GETSIGINFO fails on group-stop.
(Well, arguably it's not a "designed" API, more like "accidentally created API",
but nevertheless it exists right now). Oleg's proposal means that the new way
may be makde to work very similarly.

The less we deverge in handling of group-stop from existing API while fixing it,
the better. If the only thing strace needs to change is to issue PTRACE_LISTEN
instead of PTRACE_CONT on group-stop, then it's wonderful.

I understand that this patchset doesn't do exactly that yet, but it appears
it can be achieved relatively easy by a future change. Don't take this
as a request to respin the patchset yet again.

-- 
vda

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-02 14:51       ` Denys Vlasenko
@ 2011-06-03  1:24         ` Tejun Heo
  2011-06-03 10:25           ` Pedro Alves
  2011-06-03 11:57           ` Denys Vlasenko
  0 siblings, 2 replies; 50+ messages in thread
From: Tejun Heo @ 2011-06-03  1:24 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Oleg Nesterov, jan.kratochvil, linux-kernel, torvalds, akpm,
	indan, bdonlan, pedro

Hey, Denys.

On Thu, Jun 02, 2011 at 04:51:48PM +0200, Denys Vlasenko wrote:
> SIGSTOP/TSTP/TTIN/TTOU on entering group-stop,
> SIGCONT on leaving group-stop.

Yeah and SIGTRAP on INTERRUPT.  It makes sense.

> > * Implicit signal on clone.
> 
> Best if it is converted to STOP trap (the same is one caused by INTERRUPT).
> 
> I guess this may be optionally changed
> (similar to how PTRACE_O_TRACEEXEC
> changes post-execve SIGTRAP into PTRACE_EVENT_EXEC).
> 
> Why not turn it on *unconditionally* on SEIZE?
> Because otherwise ptrace users will turn into
> 
> if (we_used_SEIZE)
>     do_something;
> else
>     do_something_else;
> 
> maze, which is maintenance nightmare.
> It's possible users will opt to not use new functionality at all
> instead of going that way.

Hmmm... I see.  The other side of the argument is that some level of
"if (SEIZEd)" is inevitable anyway and in the longer run we would be
better off defaulting to the better behavior than making things
optional.

> If everything is monolithically tied into SEIZE, users won't be able
> to opt to use only easy parts of new functionality (such as
> PTRACE_INTERRUPT and PTRACE_LISTEN) if this *forces* them
> to also use harder parts of new functionality, in this case
> forces them to double and obfuscate their existing code
> which handles SIGSTOP-on-child-auto-attach. They don't really
> want to, since this SIGSTOP *in practice* isn't that problematic.

Anyways, let's think about that, but SIGSTOP on clone is closely
linked to why SEIZE is used in the first place and I currently lean
toward tying it to SEIZE.

> > * What to do about events which are reported by genuine SIGTRAP
> >  generation?
> 
> I don't understand what you meant here. Example(s)?

The syscall, breakpoint, single step SIGTRAPs which can't be
distinguished from userland generated SIGTRAPs and may be mixed and/or
lost.  Maybe it's best to leave them alone or maybe we can add some
way to distinguish them which is mostly backward compatible (which is
enabled w/ SEIZE and hopefully doesn't require noticeable userland
changes).

> > * Group leader exit issue.
> 
> Ohhh this is an ugly one. It turns out it is linked to the question
> of "how execve works under ptrace", in a non-obvious way.
> I will respond in the second thread, with an example of current
> kernel's breakage.

I haven't followed the thread yet.  Let's talk about it in that
thread.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-02 21:09         ` Denys Vlasenko
@ 2011-06-03  1:34           ` Tejun Heo
  2011-06-03 11:37             ` Denys Vlasenko
  2011-06-03 15:37             ` Oleg Nesterov
  0 siblings, 2 replies; 50+ messages in thread
From: Tejun Heo @ 2011-06-03  1:34 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Oleg Nesterov, jan.kratochvil, linux-kernel, torvalds, akpm,
	indan, bdonlan, pedro

Hello, Oleg, Denys.

On Thu, Jun 02, 2011 at 11:09:44PM +0200, Denys Vlasenko wrote:
> > and signr is roughly calculated as
> > 
> > 	if (group_stop_count || SIGNAL_STOP_STOPPED)
> > 		signr = jobctl & JOBCTL_STOP_SIGMASK;
> > 	else if (JOBCTL_TRAP_NOTIFY)
> > 		signr = SIGCONT;
> > 	else
> > 		signr =  SIGTRAP; // PTRACE_INTERRUPT
> > 
> > In this case we can avoid all siginfo changes. The tracer does wait(status)
> > anyway, it can see the state without GETSIGINFO. The only problem, the tracer
> > should be careful to avoid the confusion with ptrace_signal(), it should
> > check status & (PTRACE_EVENT_STOP << 16).
> > 
> > What do you think?

Yeap, it makes sense.

> This should alleviate Linus' concerns that we are suffering from unnecessary
> featuritis - that we invent API which is significantly different
> from existing one, and as such users will not use it.
> 
> Existing API doesn't use GETSIGINFO data per se to distinguish group-stop from
> signal-delivery-stop, it uses the fact that GETSIGINFO fails on group-stop.
> (Well, arguably it's not a "designed" API, more like "accidentally created API",
> but nevertheless it exists right now). Oleg's proposal means that the new way
> may be makde to work very similarly.

Okay, you want to preserve %NULL SIGINFO on all STOP traps.  Dropping
si_pt_flags and using exit_code makes sense but I think we'll be
better off enabling GETSIGINFO.  The affected part of code has to be
changed anyway and with %NULL GETSIGINFO we effectively would have
driven ourselves into corner if more information needs to be added
later on.  Don't current users unconditionally issue CONT(sig) anyway?

> The less we deverge in handling of group-stop from existing API while fixing it,
> the better. If the only thing strace needs to change is to issue PTRACE_LISTEN
> instead of PTRACE_CONT on group-stop, then it's wonderful.
> 
> I understand that this patchset doesn't do exactly that yet, but it appears
> it can be achieved relatively easy by a future change. Don't take this
> as a request to respin the patchset yet again.

Heh, don't worry about re-spinning.  It's only take#4 now after all. :)

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-03  1:24         ` Tejun Heo
@ 2011-06-03 10:25           ` Pedro Alves
  2011-06-16  8:38             ` Tejun Heo
  2011-06-03 11:57           ` Denys Vlasenko
  1 sibling, 1 reply; 50+ messages in thread
From: Pedro Alves @ 2011-06-03 10:25 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Denys Vlasenko, Oleg Nesterov, jan.kratochvil, linux-kernel,
	torvalds, akpm, indan, bdonlan

On Friday 03 June 2011 02:24:45, Tejun Heo wrote:
> > > * What to do about events which are reported by genuine SIGTRAP
> > >  generation?
> > 
> > I don't understand what you meant here. Example(s)?
> 
> The syscall, breakpoint, single step SIGTRAPs which can't be
> distinguished from userland generated SIGTRAPs and may be mixed and/or
> lost.  Maybe it's best to leave them alone or maybe we can add some
> way to distinguish them which is mostly backward compatible (which is
> enabled w/ SEIZE and hopefully doesn't require noticeable userland
> changes).

Some archs started encoding some of that stuff on SIGTRAPs si_code.
E.g., on a ppc box I got here, I see:

/* `si_code' values for SIGTRAP signal.  */
enum
{
  TRAP_BRKPT = 1,               /* Process breakpoint.  */
# define TRAP_BRKPT     TRAP_BRKPT
  TRAP_TRACE                    /* Process trace trap.  */
# define TRAP_TRACE     TRAP_TRACE
};

It'd be _very_ useful for x86 (and others) to have
something like TRAP_BRKPT for int3.  Both for ptracers
and in-process self debuggers.

I'd be super happy to be told we have that already
in recent kernels and I missed it.  :-)

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-03  1:34           ` Tejun Heo
@ 2011-06-03 11:37             ` Denys Vlasenko
  2011-06-03 11:58               ` Denys Vlasenko
  2011-06-03 15:37             ` Oleg Nesterov
  1 sibling, 1 reply; 50+ messages in thread
From: Denys Vlasenko @ 2011-06-03 11:37 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, jan.kratochvil, linux-kernel, torvalds, akpm,
	indan, bdonlan, pedro

On Friday 03 June 2011 03:34, Tejun Heo wrote:
> > This should alleviate Linus' concerns that we are suffering from unnecessary
> > featuritis - that we invent API which is significantly different
> > from existing one, and as such users will not use it.
> > 
> > Existing API doesn't use GETSIGINFO data per se to distinguish group-stop from
> > signal-delivery-stop, it uses the fact that GETSIGINFO fails on group-stop.
> > (Well, arguably it's not a "designed" API, more like "accidentally created API",
> > but nevertheless it exists right now). Oleg's proposal means that the new way
> > may be makde to work very similarly.
> 
> Okay, you want to preserve %NULL SIGINFO on all STOP traps.  Dropping
> si_pt_flags and using exit_code makes sense but I think we'll be
> better off enabling GETSIGINFO.  The affected part of code has to be
> changed anyway

Well, not exactly. The affected part of code in strace *has been changed
already*. Latest version of strace finally does show "SIGSTOP delivered"
and "group-stop happened" events differently.

Now strace, ideally, can have a minimal change

-	ptrace(PTRACE_LISTEN, pid, 0, sig);
+	if (this_is_group_stop)
+		ptrace(PTRACE_LISTEN, pid, 0, 0);
+	else
+		ptrace(PTRACE_SYSCALL, pid, 0, sig);

and boom, group-stops are handled correctly!

> and with %NULL GETSIGINFO we effectively would have 
> driven ourselves into corner if more information needs to be added
> later on.

Oleg thinks that GETSIGINFO may be not the best way to provide additional data.
Think about it. It's "get _signal_ info" op. By overloading it to mean something
else, we are creating yet another ptrace quirk: "sometimes GETSIGINFO will
return something else than signal info".

Oleg says he prefers to add a separate GETfoo op, with cleaner meaning.


> Don't current users unconditionally issue CONT(sig) anyway? 

Yes.

-- 
vda

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-03  1:24         ` Tejun Heo
  2011-06-03 10:25           ` Pedro Alves
@ 2011-06-03 11:57           ` Denys Vlasenko
  2011-06-03 12:11             ` Pedro Alves
  2011-06-03 15:46             ` Oleg Nesterov
  1 sibling, 2 replies; 50+ messages in thread
From: Denys Vlasenko @ 2011-06-03 11:57 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, jan.kratochvil, linux-kernel, torvalds, akpm,
	indan, bdonlan, pedro

On Friday 03 June 2011 03:24, Tejun Heo wrote:
> > > * Implicit signal on clone.
> > 
> > Best if it is converted to STOP trap (the same is one caused by INTERRUPT).
> > 
> > I guess this may be optionally changed
> > (similar to how PTRACE_O_TRACEEXEC
> > changes post-execve SIGTRAP into PTRACE_EVENT_EXEC).
> > 
> > Why not turn it on *unconditionally* on SEIZE?
> > Because otherwise ptrace users will turn into
> > 
> > if (we_used_SEIZE)
> >     do_something;
> > else
> >     do_something_else;
> > 
> > maze, which is maintenance nightmare.
> > It's possible users will opt to not use new functionality at all
> > instead of going that way.
> 
> Hmmm... I see.  The other side of the argument is that some level of
> "if (SEIZEd)" is inevitable anyway and in the longer run we would be
> better off defaulting to the better behavior than making things
> optional.
> 
> > If everything is monolithically tied into SEIZE, users won't be able
> > to opt to use only easy parts of new functionality (such as
> > PTRACE_INTERRUPT and PTRACE_LISTEN) if this *forces* them
> > to also use harder parts of new functionality, in this case
> > forces them to double and obfuscate their existing code
> > which handles SIGSTOP-on-child-auto-attach. They don't really
> > want to, since this SIGSTOP *in practice* isn't that problematic.
> 
> Anyways, let's think about that, but SIGSTOP on clone is closely
> linked to why SEIZE is used in the first place and I currently lean
> toward tying it to SEIZE.

Ok.

SIGSTOP on clone is less problematic because in practice it's
rather hard to send a real SIGSTOP to a thread which is _just_ created.
So the race window is mostly theoretical - unlike the much more realistic
race on initial attach to an already running process.

But if tracer already uses SEIZE on initial attach, it ought to be
willing to handle the new way of post-clone stop too.
Therefore I'm ok with this idea.


> > > * What to do about events which are reported by genuine SIGTRAP
> > >  generation?
> > 
> > I don't understand what you meant here. Example(s)?
> 
> The syscall, breakpoint, single step SIGTRAPs which can't be
> distinguished from userland generated SIGTRAPs and may be mixed and/or
> lost.  Maybe it's best to leave them alone or maybe we can add some
> way to distinguish them which is mostly backward compatible (which is
> enabled w/ SEIZE and hopefully doesn't require noticeable userland
> changes).

Currently, all PTRACE_EVENTs are enabled with ptrace options.

I propose using the same way instead of using something different.
It works. It's not problematic. Just add more PTRACE_O_foo bits.
Then user who really want it will use it, and users which
would rather use their existing code with less changes aren't
forced to change.

I am not insisting on a separate bit per event,
single blanket PTRACE_O_FLAGTRAPS bit which converts all of these
to PTRACE_EVENTs all once is ok with me.

-- 
vda

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-03 11:37             ` Denys Vlasenko
@ 2011-06-03 11:58               ` Denys Vlasenko
  0 siblings, 0 replies; 50+ messages in thread
From: Denys Vlasenko @ 2011-06-03 11:58 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, jan.kratochvil, linux-kernel, torvalds, akpm,
	indan, bdonlan, pedro

On Friday 03 June 2011 13:37, Denys Vlasenko wrote:
> On Friday 03 June 2011 03:34, Tejun Heo wrote:
> > > This should alleviate Linus' concerns that we are suffering from unnecessary
> > > featuritis - that we invent API which is significantly different
> > > from existing one, and as such users will not use it.
> > > 
> > > Existing API doesn't use GETSIGINFO data per se to distinguish group-stop from
> > > signal-delivery-stop, it uses the fact that GETSIGINFO fails on group-stop.
> > > (Well, arguably it's not a "designed" API, more like "accidentally created API",
> > > but nevertheless it exists right now). Oleg's proposal means that the new way
> > > may be makde to work very similarly.
> > 
> > Okay, you want to preserve %NULL SIGINFO on all STOP traps.  Dropping
> > si_pt_flags and using exit_code makes sense but I think we'll be
> > better off enabling GETSIGINFO.  The affected part of code has to be
> > changed anyway
> 
> Well, not exactly. The affected part of code in strace *has been changed
> already*. Latest version of strace finally does show "SIGSTOP delivered"
> and "group-stop happened" events differently.
> 
> Now strace, ideally, can have a minimal change
> 
> -	ptrace(PTRACE_LISTEN, pid, 0, sig);
> +	if (this_is_group_stop)
> +		ptrace(PTRACE_LISTEN, pid, 0, 0);
> +	else
> +		ptrace(PTRACE_SYSCALL, pid, 0, sig);
> 
> and boom, group-stops are handled correctly!

I meant

-	ptrace(PTRACE_SYSCALL, pid, 0, sig);
+	if (this_is_group_stop)
+		ptrace(PTRACE_LISTEN, pid, 0, 0);
+	else
+		ptrace(PTRACE_SYSCALL, pid, 0, sig);

-- 
vda

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-03 11:57           ` Denys Vlasenko
@ 2011-06-03 12:11             ` Pedro Alves
  2011-06-03 14:12               ` Denys Vlasenko
  2011-06-03 15:46             ` Oleg Nesterov
  1 sibling, 1 reply; 50+ messages in thread
From: Pedro Alves @ 2011-06-03 12:11 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Tejun Heo, Oleg Nesterov, jan.kratochvil, linux-kernel, torvalds,
	akpm, indan, bdonlan

On Friday 03 June 2011 12:57:02, Denys Vlasenko wrote:
> On Friday 03 June 2011 03:24, Tejun Heo wrote:
> > > > * Implicit signal on clone.
> > > 
> > > Best if it is converted to STOP trap (the same is one caused by INTERRUPT).
> > > 
> > > I guess this may be optionally changed
> > > (similar to how PTRACE_O_TRACEEXEC
> > > changes post-execve SIGTRAP into PTRACE_EVENT_EXEC).
> > > 
> > > Why not turn it on *unconditionally* on SEIZE?
> > > Because otherwise ptrace users will turn into
> > > 
> > > if (we_used_SEIZE)
> > >     do_something;
> > > else
> > >     do_something_else;
> > > 
> > > maze, which is maintenance nightmare.
> > > It's possible users will opt to not use new functionality at all
> > > instead of going that way.
> > 
> > Hmmm... I see.  The other side of the argument is that some level of
> > "if (SEIZEd)" is inevitable anyway and in the longer run we would be
> > better off defaulting to the better behavior than making things
> > optional.
> > 
> > > If everything is monolithically tied into SEIZE, users won't be able
> > > to opt to use only easy parts of new functionality (such as
> > > PTRACE_INTERRUPT and PTRACE_LISTEN) if this *forces* them
> > > to also use harder parts of new functionality, in this case
> > > forces them to double and obfuscate their existing code
> > > which handles SIGSTOP-on-child-auto-attach. They don't really
> > > want to, since this SIGSTOP *in practice* isn't that problematic.
> > 
> > Anyways, let's think about that, but SIGSTOP on clone is closely
> > linked to why SEIZE is used in the first place and I currently lean
> > toward tying it to SEIZE.
> 
> Ok.
> 
> SIGSTOP on clone is less problematic because in practice it's
> rather hard to send a real SIGSTOP to a thread which is _just_ created.
> So the race window is mostly theoretical - unlike the much more realistic
> race on initial attach to an already running process.
> 
> But if tracer already uses SEIZE on initial attach, it ought to be
> willing to handle the new way of post-clone stop too.
> Therefore I'm ok with this idea.
> 
> 
> > > > * What to do about events which are reported by genuine SIGTRAP
> > > >  generation?
> > > 
> > > I don't understand what you meant here. Example(s)?
> > 
> > The syscall, breakpoint, single step SIGTRAPs which can't be
> > distinguished from userland generated SIGTRAPs and may be mixed and/or
> > lost.  Maybe it's best to leave them alone or maybe we can add some
> > way to distinguish them which is mostly backward compatible (which is
> > enabled w/ SEIZE and hopefully doesn't require noticeable userland
> > changes).
> 
> Currently, all PTRACE_EVENTs are enabled with ptrace options.
> 
> I propose using the same way instead of using something different.
> It works. It's not problematic. Just add more PTRACE_O_foo bits.
> Then user who really want it will use it, and users which
> would rather use their existing code with less changes aren't
> forced to change.

I'm in principle against this.  What realistic good does it
bring over making exception/syscall SIGTRAPs distinguisheable
on the siginfo?  Userspace should see these SIGTRAPs if a tracer
isn't there anyway, and, even if a tracer _is_ there, you
may want to forward breakpoint/step SIGTRAPs to the
tracee, just as if a ptracer wasn't there --- I do that often to
debug an in-process self debugger.  Being able to distinguish the
cause of the SIGTRAP is useful for self debuggers as well,
which leads to putting the info in siginfo anyway.

> I am not insisting on a separate bit per event,
> single blanket PTRACE_O_FLAGTRAPS bit which converts all of these
> to PTRACE_EVENTs all once is ok with me.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-03 12:11             ` Pedro Alves
@ 2011-06-03 14:12               ` Denys Vlasenko
  2011-06-03 15:24                 ` Pedro Alves
  0 siblings, 1 reply; 50+ messages in thread
From: Denys Vlasenko @ 2011-06-03 14:12 UTC (permalink / raw)
  To: Pedro Alves
  Cc: Tejun Heo, Oleg Nesterov, jan.kratochvil, linux-kernel, torvalds,
	akpm, indan, bdonlan

On Fri, Jun 3, 2011 at 2:11 PM, Pedro Alves <pedro@codesourcery.com> wrote:
>> > > > * What to do about events which are reported by genuine SIGTRAP
>> > > >  generation?
>> > >
>> > > I don't understand what you meant here. Example(s)?
>> >
>> > The syscall, breakpoint, single step SIGTRAPs which can't be
>> > distinguished from userland generated SIGTRAPs and may be mixed and/or
>> > lost.  Maybe it's best to leave them alone or maybe we can add some
>> > way to distinguish them which is mostly backward compatible (which is
>> > enabled w/ SEIZE and hopefully doesn't require noticeable userland
>> > changes).
>>
>> Currently, all PTRACE_EVENTs are enabled with ptrace options.
>>
>> I propose using the same way instead of using something different.
>> It works. It's not problematic. Just add more PTRACE_O_foo bits.
>> Then user who really want it will use it, and users which
>> would rather use their existing code with less changes aren't
>> forced to change.
>
> I'm in principle against this.  What realistic good does it
> bring over making exception/syscall SIGTRAPs distinguisheable
> on the siginfo?

Requiring GETSIGINFO on every syscall trap is (1) a bit expensive
and (2) was never needed before.

>  Userspace should see these SIGTRAPs if a tracer
> isn't there anyway, and, even if a tracer _is_ there, you
> may want to forward breakpoint/step SIGTRAPs to the
> tracee, just as if a ptracer wasn't there --- I do that often to
> debug an in-process self debugger.

Ah, you are talking about int3 SIGTRAP? That one, IMHO,
should be treated as "real" signal and passed down to the program
by strace. For debuggers which use int3 as breakpoint insn,
well, I don't know. Perhaps they don't cope well with user-sent
SIGTRAPs anyway?

>  Being able to distinguish the
> cause of the SIGTRAP is useful for self debuggers as well,
> which leads to putting the info in siginfo anyway.

I don't understand what use case you have in mind.

-- 
vda

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-03 14:12               ` Denys Vlasenko
@ 2011-06-03 15:24                 ` Pedro Alves
  0 siblings, 0 replies; 50+ messages in thread
From: Pedro Alves @ 2011-06-03 15:24 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Tejun Heo, Oleg Nesterov, jan.kratochvil, linux-kernel, torvalds,
	akpm, indan, bdonlan

On Friday 03 June 2011 15:12:55, Denys Vlasenko wrote:
> On Fri, Jun 3, 2011 at 2:11 PM, Pedro Alves <pedro@codesourcery.com> wrote:
> >> > > > * What to do about events which are reported by genuine SIGTRAP
> >> > > >  generation?
> >> > >
> >> > > I don't understand what you meant here. Example(s)?
> >> >
> >> > The syscall, breakpoint, single step SIGTRAPs which can't be
> >> > distinguished from userland generated SIGTRAPs and may be mixed and/or
> >> > lost.  Maybe it's best to leave them alone or maybe we can add some
> >> > way to distinguish them which is mostly backward compatible (which is
> >> > enabled w/ SEIZE and hopefully doesn't require noticeable userland
> >> > changes).
> >>
> >> Currently, all PTRACE_EVENTs are enabled with ptrace options.
> >>
> >> I propose using the same way instead of using something different.
> >> It works. It's not problematic. Just add more PTRACE_O_foo bits.
> >> Then user who really want it will use it, and users which
> >> would rather use their existing code with less changes aren't
> >> forced to change.
> >
> > I'm in principle against this.  What realistic good does it
> > bring over making exception/syscall SIGTRAPs distinguisheable
> > on the siginfo?
> 
> Requiring GETSIGINFO on every syscall trap is (1) a bit expensive
> and (2) was never needed before.

On (1) - a tracer can enable the sysgood bit today, no need to
GETSIGINFO in that case.  syscall traps are not generated if
a thread is not traced.  This leaves user-sent SIGTRAPs, breakpoints,
single-step/trace, data breakpoints/watchpoints.  The first
three are generatable without a tracer installed.  The latter,
not possible on x86 to tweak the debug registers from userspace
today (unless a ptracer tweaks the debug registers, and then
detaches, without clearing them), but possible to workaround with
a kernel module.  Likely possible to trigger on other archs though.

Does strace end up passing non-syscall SIGTRAPs down to
the tracee that it shouldn't today, due to bad ptrace API?  What
are those SIGTRAPs?

On (2) - Tracers who need to handle breakpoints and single-steps
need to do other ptrace calls on a SIGTRAP to handle the event.
Read registers (read current PC, unless you get that from ... siginfo),
read debug registers --- check whether you've hit a hardware
breakpoint or data breakpoint/watchpoint.  On PPC, for example, GDB
reads siginfo on every SIGTRAP to check for TRAP_HWBKPT on si_code,
and check its si_errno for which hwbreak triggered, so "never needed
before" is false.

And a flurry of user-sent SIGTRAPs is not something anything
sane would do.  :-)

> 
> >  Userspace should see these SIGTRAPs if a tracer
> > isn't there anyway, and, even if a tracer _is_ there, you
> > may want to forward breakpoint/step SIGTRAPs to the
> > tracee, just as if a ptracer wasn't there --- I do that often to
> > debug an in-process self debugger.
> 
> Ah, you are talking about int3 SIGTRAP? That one, IMHO,
> should be treated as "real" signal and passed down to the program
> by strace.

Of course.  int1's (single-step's, caused by TF (trace flag)
set on x86) should be treated the same.  These are real exceptions
that should be forwarded to a program as signals.  What are
the SIGTRAPs that shouldn't be passed down to the program
by strace (excluding the current PTRACE_EVENT_XXX SIGTRAPs)?

> For debuggers which use int3 as breakpoint insn,
> well, I don't know. 

They need to resort to heuristics based on their own state machine
logic (did I set the thread stepping?, is the PC the same as it was
before?, etc., is there an int3 at PC - 1?), to infer whether
a SIGTRAP corresponds to a known breakpoint hit or not,
and unwind the PC if so.  All that is heuristic, and can fail, e.g,
if you overwrite the int3 in memory, before waitpid'ing the SIGTRAP.
Failing to unwind the PC or unwinding it when you shouldn't is
disastrous.

> Perhaps they don't cope well with user-sent
> SIGTRAPs anyway?

Given that breakpoint/trace SIGTRAPs should be passed down to
the program, the fact that user-sent SIGTRAPs don't queue
with those, is not a ptrace problem --- it's a unix signals
problem that can happen without ptrace involved.

But who sends these anyway?  :-P

> >  Being able to distinguish the
> > cause of the SIGTRAP is useful for self debuggers as well,
> > which leads to putting the info in siginfo anyway.
> 
> I don't understand what use case you have in mind.

There are programs that implement a sort of self debugger.
A "ptracer" without ptrace.  We've written one recently for
x86/x86-64.  Install signal handlers for all signals, reserve
one thread for the "debugger", and a signal to interrupt other
threads.  Handle breakpoints and single-steps by handling SIGTRAP.
You need to take some trade offs, but it works.

If we have that info already available on siginfo (at least
some archs do, and I hope x86 gets it too), then it looks
like adding more special ptrace stuff wouldn't do a
significant good.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-03  1:34           ` Tejun Heo
  2011-06-03 11:37             ` Denys Vlasenko
@ 2011-06-03 15:37             ` Oleg Nesterov
  1 sibling, 0 replies; 50+ messages in thread
From: Oleg Nesterov @ 2011-06-03 15:37 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Denys Vlasenko, jan.kratochvil, linux-kernel, torvalds, akpm,
	indan, bdonlan, pedro

On 06/03, Tejun Heo wrote:
>
> Okay, you want to preserve %NULL SIGINFO on all STOP traps.  Dropping
> si_pt_flags and using exit_code makes sense but I think we'll be
> better off enabling GETSIGINFO.

Agreed. I do not see why GETSIGINFO should fail.

Oleg.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-03 11:57           ` Denys Vlasenko
  2011-06-03 12:11             ` Pedro Alves
@ 2011-06-03 15:46             ` Oleg Nesterov
  1 sibling, 0 replies; 50+ messages in thread
From: Oleg Nesterov @ 2011-06-03 15:46 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Tejun Heo, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

On 06/03, Denys Vlasenko wrote:
>
> On Friday 03 June 2011 03:24, Tejun Heo wrote:
> >
> > Anyways, let's think about that, but SIGSTOP on clone is closely
> > linked to why SEIZE is used in the first place and I currently lean
> > toward tying it to SEIZE.
>
> Ok.
>
> SIGSTOP on clone is less problematic because in practice it's
> rather hard to send a real SIGSTOP to a thread which is _just_ created.
> So the race window is mostly theoretical

It is not. The new child can dequeue a signal which was sent to the
process (thread group). How the tracer can distinguish? Also, SIGSTOP
can be sent to pgrp and the new tracee can be forked.

I think that SEIZE should never send SIGSTOP on auto-attach, without
any options.

Oleg.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 17/17] ptrace: implement PTRACE_LISTEN
  2011-06-02 17:33   ` Oleg Nesterov
@ 2011-06-13 14:10     ` Tejun Heo
  2011-06-13 20:33       ` Oleg Nesterov
  0 siblings, 1 reply; 50+ messages in thread
From: Tejun Heo @ 2011-06-13 14:10 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

Hello, Oleg.

On Thu, Jun 02, 2011 at 07:33:30PM +0200, Oleg Nesterov wrote:
> >  	p_code = task_stopped_code(p, ptrace);
> > -	if (unlikely(!p_code))
> > +	if (unlikely(!p_code) || p->jobctl & JOBCTL_LISTENING)
> >  		goto unlock_sig;
> 
> Up to you, but perhaps this JOBCTL_LISTENING check should go into
> task_stopped_code() ? Or do you think we can't check it without
> siglock?

So updated.  I don't think it's gonna introduce any new race
condition.

> > +		/*
> > +		 * If NOTIFY is set, it means event happened between start
> > +		 * of this trap and now.  Trigger re-trap immediately.
> > +		 */
> > +		if (child->jobctl & JOBCTL_TRAP_NOTIFY)
> > +			signal_wake_up(child, true);
> 
> Again, I won't insist if you prefer signal_wake_up(), but afaics
> wake_up_state(__TASK_TRACED) should be enough.

Re-trapping from attach/detach paths are already using
signal_wake_up() and I think it would be better to keep it consistent.

> > @@ -838,7 +840,7 @@ static void ptrace_trap_notify(struct task_struct *t)
> >  	assert_spin_locked(&t->sighand->siglock);
> >
> >  	task_set_jobctl_pending(t, JOBCTL_TRAP_NOTIFY);
> > -	signal_wake_up(t, 0);
> > +	signal_wake_up(t, t->jobctl & JOBCTL_LISTENING);
> >  }
> 
> OK. The only thing I can't understand is why prepare_signal(SIGCONT)
> calls ptrace_trap_notify() unconditionally. How about
> 
> 		if (likely(!(t->ptrace & PT_SEIZED)))
> 			wake_up_state(t, __TASK_STOPPED);
> 	-	else
> 	+	else if (why)
> 			ptrace_trap_notify(t);
> 
> ?

I'm having a Deja Vu.  Did I reply to this already?  Anyways, here are
my rationales.

* Tracer should be able to handle seemingly spurious notifications.
  e.g. rapid SIGSTOP/CONT sequence may generate seemingly spurious
  notifications even when it actually isn't spurious.

  SIGCONT always generating notification is correct and I don't see
  good reasons to optimize it.  Moreover, I think it doesn't hurt to
  have a way to reliably trigger spurious notification.

* If we're gonna optimize out SIGCONT processing if the target process
  doesn't need it, the proper way would be testing stopped state and
  exit before walking through the group list.  However, I think it's
  done the current way for a reason - always trying to wake up on
  SIGCONT is more robust in case something went out of sync &&
  optimizing spurious SIGCONT doesn't really help anyone.

So, I'd like to keep this one as it currently is.  It's more robust
and useful this way.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 17/17] ptrace: implement PTRACE_LISTEN
  2011-06-13 14:10     ` Tejun Heo
@ 2011-06-13 20:33       ` Oleg Nesterov
  2011-06-14  6:45         ` Tejun Heo
  0 siblings, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2011-06-13 20:33 UTC (permalink / raw)
  To: Tejun Heo
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

Hello Tejun,

On 06/13, Tejun Heo wrote:
>
> On Thu, Jun 02, 2011 at 07:33:30PM +0200, Oleg Nesterov wrote:
> > > +		/*
> > > +		 * If NOTIFY is set, it means event happened between start
> > > +		 * of this trap and now.  Trigger re-trap immediately.
> > > +		 */
> > > +		if (child->jobctl & JOBCTL_TRAP_NOTIFY)
> > > +			signal_wake_up(child, true);
> >
> > Again, I won't insist if you prefer signal_wake_up(), but afaics
> > wake_up_state(__TASK_TRACED) should be enough.
>
> Re-trapping from attach/detach paths are already using
> signal_wake_up()

because attach sets TRAP_STOP which contributes to recalc_sigpending().

If JOBCTL_TRAP_NOTIFY is set, TIF_SIGPENDING should be already set as
well by the same reason. And in any case ptrace_stop() does
recalc_sigpending_tsk() before return, TIF_SIGPENDING is never really
needed when we are going to wake up the TASK_TRACED task.

However,

> and I think it would be better to keep it consistent.

OK, I don't really mind, up to you.

> > OK. The only thing I can't understand is why prepare_signal(SIGCONT)
> > calls ptrace_trap_notify() unconditionally. How about
> >
> > 		if (likely(!(t->ptrace & PT_SEIZED)))
> > 			wake_up_state(t, __TASK_STOPPED);
> > 	-	else
> > 	+	else if (why)
> > 			ptrace_trap_notify(t);
> >
> > ?
>
> I'm having a Deja Vu.

Me too...

> Did I reply to this already?  Anyways, here are
> my rationales.
>
> * Tracer should be able to handle seemingly spurious notifications.
> ...
> SIGCONT always generating notification is correct

Yes, I didn't say this is wrong.

>   and I don't see
>   good reasons to optimize it.  Moreover, I think it doesn't hurt to
>   have a way to reliably trigger spurious notification.

Well. I don't really understand why, but OK. Let's keep it this way.

> * If we're gonna optimize out SIGCONT processing if the target process
>   doesn't need it, the proper way would be testing stopped state and
>   exit before walking through the group list.

We can't, at least we need rm_from_queue(SIG_KERNEL_STOP_MASK) and
task_clear_jobctl_pending().


> However, I think it's
>   done the current way for a reason - always trying to wake up on
>   SIGCONT is more robust in case something went out of sync

Hmm. I am wondering if we can ever see why == 0 && __TASK_STOPPED with
the recent fixes...

> So, I'd like to keep this one as it currently is.

OK.

Oleg.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 17/17] ptrace: implement PTRACE_LISTEN
  2011-06-13 20:33       ` Oleg Nesterov
@ 2011-06-14  6:45         ` Tejun Heo
  0 siblings, 0 replies; 50+ messages in thread
From: Tejun Heo @ 2011-06-14  6:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: vda.linux, jan.kratochvil, linux-kernel, torvalds, akpm, indan,
	bdonlan, pedro

Hey, Oleg.

On Mon, Jun 13, 2011 at 10:33:41PM +0200, Oleg Nesterov wrote:
> >   and I don't see
> >   good reasons to optimize it.  Moreover, I think it doesn't hurt to
> >   have a way to reliably trigger spurious notification.
> 
> Well. I don't really understand why, but OK. Let's keep it this way.

So that we can just send SIGCONT to test how applications hold up
against spurious notifications.  ie. making corner case a bit more
common and easier to reproduce so that it's more discoverable /
debuggable.

> > However, I think it's
> >   done the current way for a reason - always trying to wake up on
> >   SIGCONT is more robust in case something went out of sync
> 
> Hmm. I am wondering if we can ever see why == 0 && __TASK_STOPPED with
> the recent fixes...

Probably not but I still think it's a better style to make SIGCONT
_always_ wake up.  <obligatory inaccurate car analogy> It's like a car
brake.  If transmission is in parking, gear interlocking should keep
the wheels from spinning and brake pedal input may be bypassed safely
for most cases, but, still, it's better to guarantee that stepping on
the brake pedal always makes the brake pads squeezed. </a>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-03 10:25           ` Pedro Alves
@ 2011-06-16  8:38             ` Tejun Heo
  2011-06-16  9:56               ` Pedro Alves
  0 siblings, 1 reply; 50+ messages in thread
From: Tejun Heo @ 2011-06-16  8:38 UTC (permalink / raw)
  To: Pedro Alves
  Cc: Denys Vlasenko, Oleg Nesterov, jan.kratochvil, linux-kernel,
	torvalds, akpm, indan, bdonlan

Hello, Pedro.  Sorry about late reply.

On Fri, Jun 03, 2011 at 11:25:15AM +0100, Pedro Alves wrote:
> Some archs started encoding some of that stuff on SIGTRAPs si_code.
> E.g., on a ppc box I got here, I see:
> 
> /* `si_code' values for SIGTRAP signal.  */
> enum
> {
>   TRAP_BRKPT = 1,               /* Process breakpoint.  */
> # define TRAP_BRKPT     TRAP_BRKPT
>   TRAP_TRACE                    /* Process trace trap.  */
> # define TRAP_TRACE     TRAP_TRACE
> };

Yeap, several archs seem to have grown their own tricks.

> It'd be _very_ useful for x86 (and others) to have
> something like TRAP_BRKPT for int3.  Both for ptracers
> and in-process self debuggers.
> 
> I'd be super happy to be told we have that already
> in recent kernels and I missed it.  :-)

Unfortunately, AFAICS, x86 doesn't have it nor is there a generic
mechanism, but this should be solvable without disrupting existing
users.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-16  8:38             ` Tejun Heo
@ 2011-06-16  9:56               ` Pedro Alves
  2011-06-17 19:08                 ` Oleg Nesterov
  0 siblings, 1 reply; 50+ messages in thread
From: Pedro Alves @ 2011-06-16  9:56 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Denys Vlasenko, Oleg Nesterov, jan.kratochvil, linux-kernel,
	torvalds, akpm, indan, bdonlan

Hi Tejun,

On Thursday 16 June 2011 09:38:28, Tejun Heo wrote:
> On Fri, Jun 03, 2011 at 11:25:15AM +0100, Pedro Alves wrote:
> > It'd be _very_ useful for x86 (and others) to have
> > something like TRAP_BRKPT for int3.  Both for ptracers
> > and in-process self debuggers.
> > 
> > I'd be super happy to be told we have that already
> > in recent kernels and I missed it.  :-)
> 
> Unfortunately, AFAICS, x86 doesn't have it nor is there a generic
> mechanism, but this should be solvable without disrupting existing
> users.

Hmm, looks like there was an earlier attempt:

<http://lkml.indiana.edu/hypermail/linux/kernel/0809.2/1898.html>

Reading that thread, there were some minor revisions, but I didn't
see any objection.  It seems that patch made it 2.6.28 (the only
sources I have handy), as I see get_si_code calls in arch/x86/kernel/traps.c,
and arch/x86/include/asm/traps.h.

But, I'm running 2.6.38, x86-64, and 

#include <stdio.h>
#include <string.h>
#include <signal.h>

static void
handler (int sig, siginfo_t *info, void *context)
{
  printf ("si_code = %x\n", info->si_code);
}

int
main (void)
{
  struct sigaction action;

  memset (&action, 0, sizeof (action));
  action.sa_sigaction = handler;
  action.sa_flags |= SA_SIGINFO;
  sigaction (SIGTRAP, &action, NULL);

  asm volatile ("int3");

  return 0;
}

still prints:

$ ./trap 
si_code = 80

Hmm...  I don't have any other kernel version sources or
git checkout handy at the moment to check the history
behind this.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4
  2011-06-16  9:56               ` Pedro Alves
@ 2011-06-17 19:08                 ` Oleg Nesterov
  0 siblings, 0 replies; 50+ messages in thread
From: Oleg Nesterov @ 2011-06-17 19:08 UTC (permalink / raw)
  To: Pedro Alves
  Cc: Tejun Heo, Denys Vlasenko, jan.kratochvil, linux-kernel,
	torvalds, akpm, indan, bdonlan

On 06/16, Pedro Alves wrote:
>
> On Thursday 16 June 2011 09:38:28, Tejun Heo wrote:
> > On Fri, Jun 03, 2011 at 11:25:15AM +0100, Pedro Alves wrote:
> > > It'd be _very_ useful for x86 (and others) to have
> > > something like TRAP_BRKPT for int3.  Both for ptracers
> > > and in-process self debuggers.
> > >
> > > I'd be super happy to be told we have that already
> > > in recent kernels and I missed it.  :-)
> >
> > Unfortunately, AFAICS, x86 doesn't have it nor is there a generic
> > mechanism, but this should be solvable without disrupting existing
> > users.
>
> Hmm, looks like there was an earlier attempt:
>
> <http://lkml.indiana.edu/hypermail/linux/kernel/0809.2/1898.html>
>
> Reading that thread, there were some minor revisions, but I didn't
> see any objection.  It seems that patch made it 2.6.28 (the only
> sources I have handy), as I see get_si_code calls in arch/x86/kernel/traps.c,
> and arch/x86/include/asm/traps.h.

Yes, this works (I think). But this handles the X86_EFLAGS_TF case, for
example PTRACE_SINGLESTEP.

> But, I'm running 2.6.38, x86-64, and
>
> #include <stdio.h>
> #include <string.h>
> #include <signal.h>
>
> static void
> handler (int sig, siginfo_t *info, void *context)
> {
>   printf ("si_code = %x\n", info->si_code);
> }
>
> int
> main (void)
> {
>   struct sigaction action;
>
>   memset (&action, 0, sizeof (action));
>   action.sa_sigaction = handler;
>   action.sa_flags |= SA_SIGINFO;
>   sigaction (SIGTRAP, &action, NULL);
>
>   asm volatile ("int3");
>
>   return 0;
> }
>
> still prints:
>
> $ ./trap
> si_code = 80

Yes. do_int3() simply calls force_sig(SIGTRAP). Trivial to change, I think.

Oleg.


^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2011-06-17 19:10 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-29 23:12 [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Tejun Heo
2011-05-29 23:12 ` [PATCH 01/17] ptrace: remove silly wait_trap variable from ptrace_attach() Tejun Heo
2011-06-01 18:47   ` Oleg Nesterov
2011-06-02  5:03     ` Tejun Heo
2011-06-02 11:39   ` [PATCH UPDATED " Tejun Heo
2011-05-29 23:12 ` [PATCH 02/17] job control: rename signal->group_stop and flags to jobctl and update them Tejun Heo
2011-05-29 23:12 ` [PATCH 03/17] ptrace: ptrace_check_attach(): rename @kill to @ignore_state and add comments Tejun Heo
2011-05-29 23:12 ` [PATCH 04/17] ptrace: relocate set_current_state(TASK_TRACED) in ptrace_stop() Tejun Heo
2011-05-29 23:12 ` [PATCH 05/17] job control: introduce JOBCTL_PENDING_MASK and task_clear_jobctl_pending() Tejun Heo
2011-05-29 23:12 ` [PATCH 06/17] job control: make task_clear_jobctl_pending() clear TRAPPING automatically Tejun Heo
2011-05-29 23:12 ` [PATCH 07/17] job control: introduce task_set_jobctl_pending() Tejun Heo
2011-05-29 23:12 ` [PATCH 08/17] ptrace: use bit_waitqueue for TRAPPING instead of wait_chldexit Tejun Heo
2011-06-02 11:41   ` [PATCH UPDATED " Tejun Heo
2011-05-29 23:12 ` [PATCH 09/17] signal: remove three noop tracehooks Tejun Heo
2011-05-29 23:12 ` [PATCH 10/17] job control: introduce JOBCTL_TRAP_STOP and use it for group stop trap Tejun Heo
2011-05-29 23:12 ` [PATCH 11/17] ptrace: implement PTRACE_SEIZE Tejun Heo
2011-06-01 19:01   ` Oleg Nesterov
2011-06-01 19:55     ` Oleg Nesterov
2011-06-02  5:13     ` Tejun Heo
2011-06-02 11:43   ` [PATCH UPDATED " Tejun Heo
2011-05-29 23:12 ` [PATCH 12/17] ptrace: implement PTRACE_INTERRUPT Tejun Heo
2011-05-29 23:12 ` [PATCH 13/17] ptrace: add siginfo.si_pt_flags Tejun Heo
2011-05-29 23:12 ` [PATCH 14/17] ptrace: make group stop state visible via PTRACE_GETSIGINFO Tejun Heo
2011-05-29 23:12 ` [PATCH 15/17] ptrace: don't let PTRACE_SETSIGINFO override __SI_TRAP siginfo Tejun Heo
2011-05-29 23:12 ` [PATCH 16/17] ptrace: implement TRAP_NOTIFY and use it for group stop events Tejun Heo
2011-05-29 23:12 ` [PATCH 17/17] ptrace: implement PTRACE_LISTEN Tejun Heo
2011-06-02 17:33   ` Oleg Nesterov
2011-06-13 14:10     ` Tejun Heo
2011-06-13 20:33       ` Oleg Nesterov
2011-06-14  6:45         ` Tejun Heo
2011-05-30 15:42 ` [PATCHSET ptrace] ptrace: implement PTRACE_SEIZE/INTERRUPT and group stop notification, take#4 Oleg Nesterov
2011-06-01  5:39   ` Tejun Heo
2011-06-02 12:31     ` Tejun Heo
2011-06-02 14:51       ` Denys Vlasenko
2011-06-03  1:24         ` Tejun Heo
2011-06-03 10:25           ` Pedro Alves
2011-06-16  8:38             ` Tejun Heo
2011-06-16  9:56               ` Pedro Alves
2011-06-17 19:08                 ` Oleg Nesterov
2011-06-03 11:57           ` Denys Vlasenko
2011-06-03 12:11             ` Pedro Alves
2011-06-03 14:12               ` Denys Vlasenko
2011-06-03 15:24                 ` Pedro Alves
2011-06-03 15:46             ` Oleg Nesterov
2011-06-02 18:27       ` Oleg Nesterov
2011-06-02 21:09         ` Denys Vlasenko
2011-06-03  1:34           ` Tejun Heo
2011-06-03 11:37             ` Denys Vlasenko
2011-06-03 11:58               ` Denys Vlasenko
2011-06-03 15:37             ` Oleg Nesterov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.