All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7][v4] Container-init signal semantics
@ 2008-12-24 11:44 Sukadev Bhattiprolu
  2008-12-24 11:50 ` [RFC][PATCH 1/7][v4] Remove 'handler' parameter to tracehook functions Sukadev Bhattiprolu
                   ` (6 more replies)
  0 siblings, 7 replies; 27+ messages in thread
From: Sukadev Bhattiprolu @ 2008-12-24 11:44 UTC (permalink / raw)
  To: oleg, ebiederm, roland, bastian; +Cc: daniel, xemul, containers, linux-kernel


Container-init must behave like global-init to processes within the
container and hence it must be immune to unhandled fatal signals from
within the container (i.e SIG_DFL signals that terminate the process).

But the same container-init must behave like a normal process to 
processes in ancestor namespaces and so if it receives the same fatal
signal from a process in ancestor namespace, the signal must be
processed.

Implementing these semantics requires that send_signal() determine pid
namespace of the sender but since signals can originate from workqueues/
interrupt-handlers, determining pid namespace of sender may not always
be possible or safe.

This patchset implements the design/simplified semantics suggested by
Oleg Nesterov.  The simplified semantics for container-init are:

	- container-init must never be terminated by a signal from a
	  descendant process.

	- container-init must never be immune to SIGKILL from an ancestor
	  namespace (so a process in parent namespace must always be able
	  to terminate a descendant container).

	- container-init may be immune to unhandled fatal signals (like
	  SIGUSR1) even if they are from ancestor namespace (SIGKILL is
	  the only reliable signal from ancestor namespace).

Patches in this set:

	[PATCH 1/7] Remove 'handler' parameter to tracehook functions
	[PATCH 2/7] Protect init from unwanted signals more
	[PATCH 3/7] Define siginfo_from_ancestor_ns()
	[PATCH 4/7] Protect cinit from unblocked SIG_DFL signals
	[PATCH 5/7] Protect cinit from blocked fatal signals
	[PATCH 6/7] SI_USER: Masquerade si_pid when crossing pid ns boundary
	[PATCH 7/7] SI_TKILL: Masquerade si_pid when crossing pid ns boundary

Changelog[v4]:
	- Remove SIGNAL_UNKILLABLE_FROM_NS flag and simplify logic as
	  suggested by Oleg Nesterov.
	- Check ns == NULL in siginfo_from_ancestor_ns() (Patch 3/7). 
	  Although http://lkml.org/lkml/2008/12/16/502 makes it less likely
	  that ns == NULL, looks like an explicit check won't hurt ?
	- Dropped patch that set SIGNAL_UNKILLABLE_FROM_NS and set
	  SIGNAL_UNKILLABLE in patch 5/7 to be bisect-safe.
	- Add a warning in rt_sigqueueinfo() if SI_ASYNCIO is used
	  (patch 3/7)
	- Added two patches (6/7 and 7/7) to masquerade si_pid for
	  SI_USER and SI_TKILL


Changelog[v3]:
	Changes based on discussions of previous version:
		http://lkml.org/lkml/2008/11/25/458

	Major changes:

	- Define SIGNAL_UNKILLABLE_FROM_NS and use in container-inits to
	  skip fatal signals from same namespace but process SIGKILL/SIGSTOP
	  from ancestor namespace.
	- Use SI_FROMUSER() and si_code != SI_ASYNCIO to determine if
	  it is safe to dereference pid-namespace of caller. Highly
	  experimental :-)
	- Masquerading si_pid when crossing namespace boundary: relevant
	  patches merged in -mm and dropped from this set.

	Minor changes:

	- Remove 'handler' parameter to tracehook functions
	- Update sig_ignored() to drop SIG_DFL signals to global init early
	  (tried to address Roland's  and Oleg's comments)
	- Use 'same_ns' flag to drop SIGKILL/SIGSTOP to cinit from same
	  namespace


TODO:
	- Use sig_task_unkillable() in fs/proc/array.c:task_sig()
	  to correctly report ignored signals for container/global
	  init.


Limitations/side-effects of current design

	- Container-init is immune to suicide - kill(getpid(), SIGKILL) is
	  ignored. Use exit() :-)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [RFC][PATCH 1/7][v4] Remove 'handler' parameter to tracehook functions
  2008-12-24 11:44 [PATCH 0/7][v4] Container-init signal semantics Sukadev Bhattiprolu
@ 2008-12-24 11:50 ` Sukadev Bhattiprolu
  2008-12-24 11:50 ` [RFC][PATCH 2/7][v4] Protect init from unwanted signals more Sukadev Bhattiprolu
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 27+ messages in thread
From: Sukadev Bhattiprolu @ 2008-12-24 11:50 UTC (permalink / raw)
  To: oleg, ebiederm, roland, bastian; +Cc: daniel, xemul, containers, linux-kernel


From: Oleg Nesterov <oleg@redhat.com>
Subject: [RFC][PATCH 1/7][v4] Remove 'handler' parameter to tracehook functions

Based on an earlier patch submitted by Oleg Nesterov and comments
from Roland McGrath (http://lkml.org/lkml/2008/11/19/258).

The handler parameter is currently unused in the tracehook functions.
Besides, the tracehook functions are called with siglock held, so the
functions can check the handler if they later need to.

Removing the parameter simiplifies changes to sig_ignored() in a follow-on
patch.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Roland McGrath <roland@redhat.com>
---
 arch/x86/kernel/ptrace.c  |    2 +-
 include/linux/tracehook.h |   13 ++++---------
 kernel/signal.c           |    6 +++---
 3 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 0a6d8c1..d6ef716 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -1585,6 +1585,6 @@ asmregparm void syscall_trace_leave(struct pt_regs *regs)
 	 * system call instruction.
 	 */
 	if (test_thread_flag(TIF_SINGLESTEP) &&
-	    tracehook_consider_fatal_signal(current, SIGTRAP, SIG_DFL))
+	    tracehook_consider_fatal_signal(current, SIGTRAP))
 		send_sigtrap(current, regs, 0, TRAP_BRKPT);
 }
diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index 6186a78..eb4c654 100644
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -388,17 +388,14 @@ static inline void tracehook_signal_handler(int sig, siginfo_t *info,
  * tracehook_consider_ignored_signal - suppress short-circuit of ignored signal
  * @task:		task receiving the signal
  * @sig:		signal number being sent
- * @handler:		%SIG_IGN or %SIG_DFL
  *
  * Return zero iff tracing doesn't care to examine this ignored signal,
  * so it can short-circuit normal delivery and never even get queued.
- * Either @handler is %SIG_DFL and @sig's default is ignore, or it's %SIG_IGN.
  *
  * Called with @task->sighand->siglock held.
  */
 static inline int tracehook_consider_ignored_signal(struct task_struct *task,
-						    int sig,
-						    void __user *handler)
+						    int sig)
 {
 	return (task_ptrace(task) & PT_PTRACED) != 0;
 }
@@ -407,19 +404,17 @@ static inline int tracehook_consider_ignored_signal(struct task_struct *task,
  * tracehook_consider_fatal_signal - suppress special handling of fatal signal
  * @task:		task receiving the signal
  * @sig:		signal number being sent
- * @handler:		%SIG_DFL or %SIG_IGN
  *
  * Return nonzero to prevent special handling of this termination signal.
- * Normally @handler is %SIG_DFL.  It can be %SIG_IGN if @sig is ignored,
- * in which case force_sig() is about to reset it to %SIG_DFL.
+ * Normally handler for signal is %SIG_DFL.  It can be %SIG_IGN if @sig is
+ * ignored, in which case force_sig() is about to reset it to %SIG_DFL.
  * When this returns zero, this signal might cause a quick termination
  * that does not give the debugger a chance to intercept the signal.
  *
  * Called with or without @task->sighand->siglock held.
  */
 static inline int tracehook_consider_fatal_signal(struct task_struct *task,
-						  int sig,
-						  void __user *handler)
+						  int sig)
 {
 	return (task_ptrace(task) & PT_PTRACED) != 0;
 }
diff --git a/kernel/signal.c b/kernel/signal.c
index 2a64304..7945e71 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -72,7 +72,7 @@ static int sig_ignored(struct task_struct *t, int sig)
 	/*
 	 * Tracers may want to know about even ignored signals.
 	 */
-	return !tracehook_consider_ignored_signal(t, sig, handler);
+	return !tracehook_consider_ignored_signal(t, sig);
 }
 
 /*
@@ -316,7 +316,7 @@ int unhandled_signal(struct task_struct *tsk, int sig)
 		return 1;
 	if (handler != SIG_IGN && handler != SIG_DFL)
 		return 0;
-	return !tracehook_consider_fatal_signal(tsk, sig, handler);
+	return !tracehook_consider_fatal_signal(tsk, sig);
 }
 
 
@@ -775,7 +775,7 @@ static void complete_signal(int sig, struct task_struct *p, int group)
 	    !(signal->flags & (SIGNAL_UNKILLABLE | SIGNAL_GROUP_EXIT)) &&
 	    !sigismember(&t->real_blocked, sig) &&
 	    (sig == SIGKILL ||
-	     !tracehook_consider_fatal_signal(t, sig, SIG_DFL))) {
+	     !tracehook_consider_fatal_signal(t, sig))) {
 		/*
 		 * This signal will be fatal to the whole group.
 		 */
-- 
1.5.2.5

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC][PATCH 2/7][v4] Protect init from unwanted signals more
  2008-12-24 11:44 [PATCH 0/7][v4] Container-init signal semantics Sukadev Bhattiprolu
  2008-12-24 11:50 ` [RFC][PATCH 1/7][v4] Remove 'handler' parameter to tracehook functions Sukadev Bhattiprolu
@ 2008-12-24 11:50 ` Sukadev Bhattiprolu
       [not found]   ` <20081224115047.GB8020-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2008-12-24 11:51 ` [RFC][PATCH 3/7][v4] Define siginfo_from_ancestor_ns() Sukadev Bhattiprolu
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 27+ messages in thread
From: Sukadev Bhattiprolu @ 2008-12-24 11:50 UTC (permalink / raw)
  To: oleg, ebiederm, roland, bastian; +Cc: daniel, xemul, containers, linux-kernel


From: Oleg Nesterov <oleg@redhat.com>
Subject: [RFC][PATCH 2/7][v4] Protect init from unwanted signals more

(This is a modified version of the patch submitted by Oleg Nesterov
http://lkml.org/lkml/2008/11/18/249 and tries to address comments
that came up in that discussion)

init ignores the SIG_DFL signals but we queue them anyway, including
SIGKILL. This is mostly OK, the signal will be dropped silently when
dequeued, but the pending SIGKILL has 2 bad implications:

        - it implies fatal_signal_pending(), so we confuse things
          like wait_for_completion_killable/lock_page_killable.

        - for the sub-namespace inits, the pending SIGKILL can
          mask (legacy_queue) the subsequent SIGKILL from the
          parent namespace which must kill cinit reliably.
          (preparation, cinits don't have SIGNAL_UNKILLABLE yet)

The patch can't help when init is ptraced, but ptracing of init is
not "safe" anyway.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Roland McGrath <roland@redhat.com>
---
 kernel/signal.c |   16 +++++++++++++---
 1 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 7945e71..55f41b6 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -53,10 +53,21 @@ static int sig_handler_ignored(void __user *handler, int sig)
 		(handler == SIG_DFL && sig_kernel_ignore(sig));
 }
 
-static int sig_ignored(struct task_struct *t, int sig)
+static int sig_task_ignored(struct task_struct *t, int sig)
 {
 	void __user *handler;
 
+	handler = sig_handler(t, sig);
+
+	if (unlikely(t->signal->flags & SIGNAL_UNKILLABLE) &&
+			(handler == SIG_IGN || handler == SIG_DFL))
+		return 1;
+	
+	return sig_handler_ignored(handler, sig);
+}
+
+static int sig_ignored(struct task_struct *t, int sig)
+{
 	/*
 	 * Blocked signals are never ignored, since the
 	 * signal handler may change by the time it is
@@ -65,8 +76,7 @@ static int sig_ignored(struct task_struct *t, int sig)
 	if (sigismember(&t->blocked, sig) || sigismember(&t->real_blocked, sig))
 		return 0;
 
-	handler = sig_handler(t, sig);
-	if (!sig_handler_ignored(handler, sig))
+	if (!sig_task_ignored(t, sig))
 		return 0;
 
 	/*
-- 
1.5.2.5

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC][PATCH 3/7][v4] Define siginfo_from_ancestor_ns()
  2008-12-24 11:44 [PATCH 0/7][v4] Container-init signal semantics Sukadev Bhattiprolu
  2008-12-24 11:50 ` [RFC][PATCH 1/7][v4] Remove 'handler' parameter to tracehook functions Sukadev Bhattiprolu
  2008-12-24 11:50 ` [RFC][PATCH 2/7][v4] Protect init from unwanted signals more Sukadev Bhattiprolu
@ 2008-12-24 11:51 ` Sukadev Bhattiprolu
  2008-12-24 16:28   ` Oleg Nesterov
  2008-12-24 11:51 ` [RFC][PATCH 4/7][v4] Protect cinit from unblocked SIG_DFL signals Sukadev Bhattiprolu
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 27+ messages in thread
From: Sukadev Bhattiprolu @ 2008-12-24 11:51 UTC (permalink / raw)
  To: oleg, ebiederm, roland, bastian; +Cc: daniel, xemul, containers, linux-kernel


From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Subject: [RFC][PATCH 3/7][v4] Define siginfo_from_ancestor_ns()

Determine if sender of a signal is from an ancestor namespace. This
function will be used in a follow-on patch.

This is an early/lightly tested RFC patch. Would it be safe to implement
siginfo_from_user() as below and then use it dereference the pid
namespace of sender ?

This is based on discussions on the patch from Oleg Nesterov and me
http://lkml.org/lkml/2008/11/25/462.

Changelog[v2]:
	- siginfo_from_ancestor_ns() is fairly clean and it does not need
	  to be under CONFIG_PID_NS. Only siginfo_from_user() needs to be.
	- Warn if rt_sigqueueinfo() uses SI_ASYNCIO.
	- Added a check for pid-ns of receiver being NULL (in case it is
	  exiting).

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
 kernel/signal.c |   67 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 67 insertions(+), 0 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 55f41b6..0011f99 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -820,6 +820,56 @@ static inline int legacy_queue(struct sigpending *signals, int sig)
 {
 	return (sig < SIGRTMIN) && sigismember(&signals->signal, sig);
 }
+/*
+ * Return 1 if this signal originated directly from a user process (i.e via
+ * kill(), tkill(), sigqueue()).  Return 0 otherwise.
+ *
+ * TODO:
+ * 	  To make this less hacky, make SI_ASYNCIO a kernel signal (and
+ * 	  remove the warning in sys_rt_sigqueueinfo()).
+ */
+#ifdef CONFIG_PID_NS
+static inline int siginfo_from_user(siginfo_t *info)
+{
+	if (!is_si_special(info) && SI_FROMUSER(info) &&
+				info->si_code != SI_ASYNCIO)
+		return 1;
+
+	return 0;
+}
+#else
+static inline int siginfo_from_user(siginfo_t *info)
+{
+	return 0;
+}
+#endif
+
+static inline int siginfo_from_ancestor_ns(struct task_struct *t,
+                       siginfo_t *info)
+{
+	struct pid_namespace *ns;
+
+	/*
+	 * Ensure signal is from user-space before checking pid namespace
+	 */
+	if (siginfo_from_user(info)) {
+		/*
+		 * If we do not have a pid in the receiver's namespace,
+		 * we must be an ancestor of the receiver. 
+		 *
+		 * CHECK:
+		 * 	If receiver is exiting, ns == NULL, signal will be
+		 * 	queued but ignored (wants_signal() is FALSES). For
+		 * 	compatibility with current behavior, assume it is
+		 * 	from ancestor and queue the signal anyway ?
+		 */
+		ns = task_active_pid_ns(t);
+		if (!ns || task_pid_nr_ns(current, ns) <= 0)
+			return 1;
+	}
+
+	return 0;
+}
 
 static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
 			int group)
@@ -2312,6 +2362,20 @@ sys_tkill(pid_t pid, int sig)
 	return do_tkill(0, pid, sig);
 }
 
+#ifdef CONFIG_PID_NS
+/*
+ * siginfo_from_user() assumes that si_code SI_ASYNCIO comes only from
+ * within the kernel. If an application is passing in SI_ASYNCIO we 
+ * want to know about it.
+ */
+static void warn_on_asyncio(siginfo_t *info)
+{
+	WARN_ON_ONCE(info->si_code == SI_ASYNCIO);
+}
+#else
+#define warn_on_asyncio(info)	{}
+#endif
+
 asmlinkage long
 sys_rt_sigqueueinfo(pid_t pid, int sig, siginfo_t __user *uinfo)
 {
@@ -2324,6 +2388,9 @@ sys_rt_sigqueueinfo(pid_t pid, int sig, siginfo_t __user *uinfo)
 	   Nor can they impersonate a kill(), which adds source info.  */
 	if (info.si_code >= 0)
 		return -EPERM;
+
+	warn_on_asyncio(&info);
+
 	info.si_signo = sig;
 
 	/* POSIX.1b doesn't mention process groups.  */
-- 
1.5.2.5

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC][PATCH 4/7][v4] Protect cinit from unblocked SIG_DFL signals
  2008-12-24 11:44 [PATCH 0/7][v4] Container-init signal semantics Sukadev Bhattiprolu
                   ` (2 preceding siblings ...)
  2008-12-24 11:51 ` [RFC][PATCH 3/7][v4] Define siginfo_from_ancestor_ns() Sukadev Bhattiprolu
@ 2008-12-24 11:51 ` Sukadev Bhattiprolu
  2008-12-24 11:52 ` [RFC][PATCH 5/7][v4] Protect cinit from blocked fatal signals Sukadev Bhattiprolu
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 27+ messages in thread
From: Sukadev Bhattiprolu @ 2008-12-24 11:51 UTC (permalink / raw)
  To: oleg, ebiederm, roland, bastian; +Cc: daniel, xemul, containers, linux-kernel


From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Subject: [RFC][PATCH 4/7][v4] Protect cinit from unblocked SIG_DFL signals

Drop early any SIG_DFL or SIG_IGN signals to container-init from within
the same container. But queue SIGSTOP and SIGKILL to the container-init
if they are from an ancestor container.

Blocked, fatal signals (i.e when SIG_DFL is to terminate) from within the
container can still terminate the container-init. That will be addressed
in the next patch.

Note:	To be bisect-safe, SIGNAL_UNKILLABLE will be set for container-inits
   	in a follow-on patch. Until then, this patch is just a preparatory
	setp.

Changelog[v2]:
	- (Oleg Nesterov) Remove SIGNAL_UNKILLABLE_FROM_NS and rename
	  'same_ns' to 'from_ancestor_ns'.
	- SIGNAL_UNKILLABLE is not yet set for container-inits (will be
	  set in follow-on patch).

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
 kernel/signal.c |   36 +++++++++++++++++++++++++++---------
 1 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 0011f99..5c4374f 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -53,20 +53,34 @@ static int sig_handler_ignored(void __user *handler, int sig)
 		(handler == SIG_DFL && sig_kernel_ignore(sig));
 }
 
-static int sig_task_ignored(struct task_struct *t, int sig)
+/*
+ * Return 1 if task @t is either the global init or the container-init
+ * of the sender's container. Return 0 otherwise.
+ */
+static int sig_task_unkillable(struct task_struct *t, int from_ancestor_ns)
+{
+	int flags = t->signal->flags;
+
+	if (unlikely(flags & SIGNAL_UNKILLABLE) && !from_ancestor_ns)
+		return 1;
+
+	return 0;
+}
+
+static int sig_task_ignored(struct task_struct *t, int sig, int from_ancestor_ns)
 {
 	void __user *handler;
 
 	handler = sig_handler(t, sig);
 
-	if (unlikely(t->signal->flags & SIGNAL_UNKILLABLE) &&
+	if (sig_task_unkillable(t, from_ancestor_ns) &&
 			(handler == SIG_IGN || handler == SIG_DFL))
 		return 1;
-	
+
 	return sig_handler_ignored(handler, sig);
 }
 
-static int sig_ignored(struct task_struct *t, int sig)
+static int sig_ignored(struct task_struct *t, int sig, int from_ancestor_ns)
 {
 	/*
 	 * Blocked signals are never ignored, since the
@@ -76,7 +90,7 @@ static int sig_ignored(struct task_struct *t, int sig)
 	if (sigismember(&t->blocked, sig) || sigismember(&t->real_blocked, sig))
 		return 0;
 
-	if (!sig_task_ignored(t, sig))
+	if (!sig_task_ignored(t, sig, from_ancestor_ns))
 		return 0;
 
 	/*
@@ -632,7 +646,7 @@ static int check_kill_permission(int sig, struct siginfo *info,
  * Returns true if the signal should be actually delivered, otherwise
  * it should be dropped.
  */
-static int prepare_signal(int sig, struct task_struct *p)
+static int prepare_signal(int sig, struct task_struct *p, int from_ancestor_ns)
 {
 	struct signal_struct *signal = p->signal;
 	struct task_struct *t;
@@ -716,7 +730,7 @@ static int prepare_signal(int sig, struct task_struct *p)
 		}
 	}
 
-	return !sig_ignored(p, sig);
+	return !sig_ignored(p, sig, from_ancestor_ns);
 }
 
 /*
@@ -876,11 +890,15 @@ static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
 {
 	struct sigpending *pending;
 	struct sigqueue *q;
+	int from_ancestor_ns;
 
 	trace_sched_signal_send(sig, t);
 
 	assert_spin_locked(&t->sighand->siglock);
-	if (!prepare_signal(sig, t))
+
+	from_ancestor_ns = siginfo_from_ancestor_ns(t, info);
+
+	if (!prepare_signal(sig, t, from_ancestor_ns))
 		return 0;
 
 	pending = group ? &t->signal->shared_pending : &t->pending;
@@ -1375,7 +1393,7 @@ int send_sigqueue(struct sigqueue *q, struct task_struct *t, int group)
 		goto ret;
 
 	ret = 1; /* the signal is ignored */
-	if (!prepare_signal(sig, t))
+	if (!prepare_signal(sig, t, 1))
 		goto out;
 
 	ret = 0;
-- 
1.5.2.5

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC][PATCH 5/7][v4] Protect cinit from blocked fatal signals
  2008-12-24 11:44 [PATCH 0/7][v4] Container-init signal semantics Sukadev Bhattiprolu
                   ` (3 preceding siblings ...)
  2008-12-24 11:51 ` [RFC][PATCH 4/7][v4] Protect cinit from unblocked SIG_DFL signals Sukadev Bhattiprolu
@ 2008-12-24 11:52 ` Sukadev Bhattiprolu
  2008-12-24 16:09   ` Oleg Nesterov
  2008-12-24 11:53 ` [RFC][PATCH 6/7][v4] SI_USER: Masquerade si_pid when crossing pid ns boundary Sukadev Bhattiprolu
  2008-12-24 11:53 ` [RFC][PATCH 7/7][v4] SI_TKILL: " Sukadev Bhattiprolu
  6 siblings, 1 reply; 27+ messages in thread
From: Sukadev Bhattiprolu @ 2008-12-24 11:52 UTC (permalink / raw)
  To: oleg, ebiederm, roland, bastian; +Cc: daniel, xemul, containers, linux-kernel


From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Subject: [RFC][PATCH 5/7][v4] Protect cinit from blocked fatal signals

Normally SIG_DFL signals to global and container-init are dropped early.
But if a signal is blocked when it is posted, we cannot drop the signal
since the receiver may install a handler before unblocking the signal.
Once this signal is queued however, the receiver container-init has
no way of knowing if the signal was sent from an ancestor or descendant
namespace.  This patch ensures that contianer-init drops all SIG_DFL
signals in get_signal_to_deliver() except SIGKILL/SIGSTOP.

If SIGSTOP/SIGKILL originate from a descendant of container-init they
are never queued (i.e dropped in sig_ignored() in an earler patch).

If SIGSTOP/SIGKILL originate from parent namespace, the signal is queued
and container-init processes the signal.

See comments in patch below for details.

Changelog[v2]:
	- Rename sig_unkillable() to unkillable_by_sig()
	- Remove SIGNAL_UNKILLABLE_FROM_NS flag and simplify (Oleg Nesterov)
	- Set SIGNAL_UNKILLABLE for container-init in this patch.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
 kernel/fork.c   |    2 ++
 kernel/signal.c |   41 +++++++++++++++++++++++++++++++++++++++--
 2 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index dba2d3f..d3e93ef 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -812,6 +812,8 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
 	atomic_set(&sig->live, 1);
 	init_waitqueue_head(&sig->wait_chldexit);
 	sig->flags = 0;
+	if (clone_flags & CLONE_NEWPID)
+		sig->flags |= SIGNAL_UNKILLABLE;
 	sig->group_exit_code = 0;
 	sig->group_exit_task = NULL;
 	sig->group_stop_count = 0;
diff --git a/kernel/signal.c b/kernel/signal.c
index 5c4374f..660fadd 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1818,6 +1818,41 @@ static int ptrace_signal(int signr, siginfo_t *info,
 	return signr;
 }
 
+/*
+ * Return 1 if the process owning @signal should NOT terminate as a result of
+ * the signal @signr. Return 0 otherwise.
+ *
+ * Specifically if process owning @signal is
+ * 	- neither global nor a container-init, return 0
+ *	- the global-init, return 1.
+ *	- container-init, return 0 if signal is SIGKILL or SIGSTOP. Return
+ *	  1 otherwise.
+ *
+ * sig_ignored() drops any unblocked fatal signals to global/container-init
+ * from within the same namespace. This of course includes SIGKILL/SIGSTOP
+ * which can never be blocked.  sig_ignored() does not drop the SIGKILL/
+ * SIGSTOP if the are from an ancestor namespace.
+ *
+ * So, @signal is for a container-init and if @signr is either SIGKILL or
+ * SIGSTOP, it must have come from an ancestor namespace. So container-init
+ * should be killable (return 0).
+ *
+ * If @signal refers to a container-init and @signr is neither SIGKILL nor
+ * SIGSTOP, it was queued because it was blocked when it was posted. The
+ * signal may have come from same container - hence it should not be
+ * killable (return 1).
+ *
+ * Note:
+ * 	This means that SIGKILL is the only sure way to terminate a
+ * 	container-init even from ancestor namespace.
+ */
+static int unkillable_by_sig(struct signal_struct *signal, int signr)
+{
+	if ((signal->flags & SIGNAL_UNKILLABLE) && !sig_kernel_only(signr))
+		return 1;
+	return 0;
+}
+
 int get_signal_to_deliver(siginfo_t *info, struct k_sigaction *return_ka,
 			  struct pt_regs *regs, void *cookie)
 {
@@ -1909,9 +1944,11 @@ relock:
 
 		/*
 		 * Global init gets no signals it doesn't want.
+		 * Container-init gets no signals it doesn't want from same
+		 * container.
 		 */
-		if (unlikely(signal->flags & SIGNAL_UNKILLABLE) &&
-		    !signal_group_exit(signal))
+		if (unkillable_by_sig(signal, signr) &&
+				!signal_group_exit(signal))
 			continue;
 
 		if (sig_kernel_stop(signr)) {
-- 
1.5.2.5

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC][PATCH 6/7][v4] SI_USER: Masquerade si_pid when crossing pid ns boundary
  2008-12-24 11:44 [PATCH 0/7][v4] Container-init signal semantics Sukadev Bhattiprolu
                   ` (4 preceding siblings ...)
  2008-12-24 11:52 ` [RFC][PATCH 5/7][v4] Protect cinit from blocked fatal signals Sukadev Bhattiprolu
@ 2008-12-24 11:53 ` Sukadev Bhattiprolu
  2008-12-24 15:43   ` Oleg Nesterov
  2008-12-24 11:53 ` [RFC][PATCH 7/7][v4] SI_TKILL: " Sukadev Bhattiprolu
  6 siblings, 1 reply; 27+ messages in thread
From: Sukadev Bhattiprolu @ 2008-12-24 11:53 UTC (permalink / raw)
  To: oleg, ebiederm, roland, bastian; +Cc: daniel, xemul, containers, linux-kernel


From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Subject: [RFC][PATCH 6/7][v4] SI_USER: Masquerade si_pid when crossing pid ns boundary

When sending a signal to a descendant namespace, we must set ->si_pid to
0 since the sender does not have a valid pid in the descendant namespace.
But we cannot do this in sys_kill() since a) we do not yet know the pid
namespace of receiver b) the pid number in sys_kill() may correspond to
a process group and processes in the process group may be in different
namespaces. So masquerade si_pid at the lower level, send_signal()
function.

Note:
	- If rt_sigqueueinfo() sets si_code to SI_USER when sending a
	  signal across a pid namespace boundary, the value in ->si_pid
	  will be cleared to 0.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
 kernel/signal.c |   32 +++++++++++++++++++++++++++++++-
 1 files changed, 31 insertions(+), 1 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 660fadd..5a6aea8 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -885,6 +885,34 @@ static inline int siginfo_from_ancestor_ns(struct task_struct *t,
 	return 0;
 }
 
+/*
+ * masquerade_si_pid():
+ *
+ * 	Called when signal is crossing pid namespace boundary. Compute
+ * 	pid of sender in receiver's pid namespace.
+ *
+ * 	We only care about SI_USER here. Other si_codes are addressed
+ * 	either at the 'origin' of signal or set to a default value in
+ * 	send_signal(). But we can't address SI_USER at 'origin', i.e .
+ * 	in sys_kill(), since we may have a process group, rather than
+ * 	a specific process there. And different processes in same pgrp
+ * 	can be in different namespaces.
+ */
+static void masquerade_si_pid(struct task_struct *t, siginfo_t *info)
+{
+	if (is_si_special(info) || SI_FROMKERNEL(info))
+		return;
+
+	/*
+	 * When crossing pid namespace boundary, SI_USER signal can only
+	 * go from ancestor to descendant ns but not the other way. So,
+	 * just ->si_pid to 0 since, the sender will not have a pid in
+	 * the receiver's namespace.
+	 */
+	if (info->si_code == SI_USER)
+		info->si_pid = 0;
+}
+
 static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
 			int group)
 {
@@ -946,6 +974,8 @@ static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
 			break;
 		default:
 			copy_siginfo(&q->info, info);
+			if (from_ancestor_ns)
+				masquerade_si_pid(t, &q->info);
 			break;
 		}
 	} else if (!is_si_special(info)) {
@@ -2343,7 +2373,7 @@ sys_kill(pid_t pid, int sig)
 	info.si_signo = sig;
 	info.si_errno = 0;
 	info.si_code = SI_USER;
-	info.si_pid = task_tgid_vnr(current);
+	info.si_pid = 0;	/* masquerade in send_signal() */
 	info.si_uid = current_uid();
 
 	return kill_something_info(sig, &info, pid);
-- 
1.5.2.5

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC][PATCH 7/7][v4] SI_TKILL: Masquerade si_pid when crossing pid ns boundary
  2008-12-24 11:44 [PATCH 0/7][v4] Container-init signal semantics Sukadev Bhattiprolu
                   ` (5 preceding siblings ...)
  2008-12-24 11:53 ` [RFC][PATCH 6/7][v4] SI_USER: Masquerade si_pid when crossing pid ns boundary Sukadev Bhattiprolu
@ 2008-12-24 11:53 ` Sukadev Bhattiprolu
  2008-12-24 15:55   ` Oleg Nesterov
  6 siblings, 1 reply; 27+ messages in thread
From: Sukadev Bhattiprolu @ 2008-12-24 11:53 UTC (permalink / raw)
  To: oleg, ebiederm, roland, bastian; +Cc: daniel, xemul, containers, linux-kernel


From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Subject: [RFC][PATCH 7/7][v4] SI_TKILL: Masquerade si_pid when crossing pid ns boundary

When sending a signal to a process in a descendant namespace, ->si_pid
must be set to the pid of sender in receiver's namespace. But if the
receiver is exiting (theoretically, ns == NULL), just set ->si_pid to
active pid namespace of sender (receiver won't see the signal anyway ?).

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
 kernel/signal.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 5a6aea8..e263c23 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2385,17 +2385,22 @@ static int do_tkill(pid_t tgid, pid_t pid, int sig)
 	struct siginfo info;
 	struct task_struct *p;
 	unsigned long flags;
+	struct pid_namespace *ns;
 
 	error = -ESRCH;
 	info.si_signo = sig;
 	info.si_errno = 0;
 	info.si_code = SI_TKILL;
-	info.si_pid = task_tgid_vnr(current);
 	info.si_uid = current_uid();
 
 	rcu_read_lock();
 	p = find_task_by_vpid(pid);
 	if (p && (tgid <= 0 || task_tgid_vnr(p) == tgid)) {
+		ns = task_active_pid_ns(p);
+		if (ns)
+			info.si_pid = task_tgid_nr_ns(current, ns);
+		else
+			info.si_pid = task_tgid_vnr(current);
 		error = check_kill_permission(sig, &info, p);
 		/*
 		 * The null signal is a permissions and process existence
-- 
1.5.2.5

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 6/7][v4] SI_USER: Masquerade si_pid when crossing pid ns boundary
  2008-12-24 11:53 ` [RFC][PATCH 6/7][v4] SI_USER: Masquerade si_pid when crossing pid ns boundary Sukadev Bhattiprolu
@ 2008-12-24 15:43   ` Oleg Nesterov
  2008-12-24 16:18     ` Oleg Nesterov
  2008-12-24 21:08     ` Sukadev Bhattiprolu
  0 siblings, 2 replies; 27+ messages in thread
From: Oleg Nesterov @ 2008-12-24 15:43 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: ebiederm, roland, bastian, daniel, xemul, containers, linux-kernel

On 12/24, Sukadev Bhattiprolu wrote:
>
> +static void masquerade_si_pid(struct task_struct *t, siginfo_t *info)
> +{
> +	if (is_si_special(info) || SI_FROMKERNEL(info))
> +		return;
> +
> +	/*
> +	 * When crossing pid namespace boundary, SI_USER signal can only
> +	 * go from ancestor to descendant ns but not the other way. So,
> +	 * just ->si_pid to 0 since, the sender will not have a pid in
> +	 * the receiver's namespace.
> +	 */
> +	if (info->si_code == SI_USER)
> +		info->si_pid = 0;
> +}
> +
>  static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
>  			int group)
>  {
> @@ -946,6 +974,8 @@ static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
>  			break;
>  		default:
>  			copy_siginfo(&q->info, info);
> +			if (from_ancestor_ns)
> +				masquerade_si_pid(t, &q->info);
>  			break;
>  		}
>  	} else if (!is_si_special(info)) {
> @@ -2343,7 +2373,7 @@ sys_kill(pid_t pid, int sig)
>  	info.si_signo = sig;
>  	info.si_errno = 0;
>  	info.si_code = SI_USER;
> -	info.si_pid = task_tgid_vnr(current);
> +	info.si_pid = 0;	/* masquerade in send_signal() */
>  	info.si_uid = current_uid();

Can't understand this patch. First of all, it looks wrong. Looks like
we never set .si_pid != 0 when the signal is set by sys_kill() ?

But more importantly, unless I missed something, this patch is unnecessary
complication.

We call masquerade_si_pid() only when from_ancestor_ns == T, this is correct.
But this means that (!is_si_special(info) && SI_FROMUSER(info)) == T, why
do we re-check in masquerade_si_pid() ?

And why can't we just do

	 default:
	 	copy_siginfo(&q->info, info);
	 	if (from_ancestor_ns)
	 		info->si_pid = 0;

? Why should we check SI_USER and change sys_kill() ?

see also the comment for the next 7/7 patch.

Oleg.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 7/7][v4] SI_TKILL: Masquerade si_pid when crossing pid ns boundary
  2008-12-24 11:53 ` [RFC][PATCH 7/7][v4] SI_TKILL: " Sukadev Bhattiprolu
@ 2008-12-24 15:55   ` Oleg Nesterov
  2008-12-24 21:04     ` Sukadev Bhattiprolu
  0 siblings, 1 reply; 27+ messages in thread
From: Oleg Nesterov @ 2008-12-24 15:55 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: ebiederm, roland, bastian, daniel, xemul, containers, linux-kernel

On 12/24, Sukadev Bhattiprolu wrote:
>
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2385,17 +2385,22 @@ static int do_tkill(pid_t tgid, pid_t pid, int sig)
>  	struct siginfo info;
>  	struct task_struct *p;
>  	unsigned long flags;
> +	struct pid_namespace *ns;
>
>  	error = -ESRCH;
>  	info.si_signo = sig;
>  	info.si_errno = 0;
>  	info.si_code = SI_TKILL;
> -	info.si_pid = task_tgid_vnr(current);
>  	info.si_uid = current_uid();
>
>  	rcu_read_lock();
>  	p = find_task_by_vpid(pid);
>  	if (p && (tgid <= 0 || task_tgid_vnr(p) == tgid)) {
> +		ns = task_active_pid_ns(p);
> +		if (ns)
> +			info.si_pid = task_tgid_nr_ns(current, ns);
> +		else
> +			info.si_pid = task_tgid_vnr(current);

if ns == 0, "p" won't see the signal anyway, so all we need is

	-	info.si_pid = task_tgid_vnr(current);
	+	info.si_pid = task_tgid_nr_ns(current, task_active_ns(p));

like we do in __do_notify().


But. this of course doesn't work for sys_kill(). Can't we change the helpers
which send SI_FROMUSER() signals so that they do not fill .si_pid at all?
Then send_signal() can do:

	default:
		copy_siginfo(&q->info, info);
		info.si_pid = 0;
		if (!from_ancestot_ns)
			info.si_pid = task_tgid_nr_ns(current, ...);

?

Yes, we use "current". But we already used it in siginfo_from_ancestor_ns().

Oleg.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 5/7][v4] Protect cinit from blocked fatal signals
  2008-12-24 11:52 ` [RFC][PATCH 5/7][v4] Protect cinit from blocked fatal signals Sukadev Bhattiprolu
@ 2008-12-24 16:09   ` Oleg Nesterov
  2008-12-24 21:25     ` Sukadev Bhattiprolu
  2008-12-31  0:04     ` Roland McGrath
  0 siblings, 2 replies; 27+ messages in thread
From: Oleg Nesterov @ 2008-12-24 16:09 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: ebiederm, roland, bastian, daniel, xemul, containers, linux-kernel

On 12/24, Sukadev Bhattiprolu wrote:
>
> + * So, @signal is for a container-init and if @signr is either SIGKILL or
> + * SIGSTOP, it must have come from an ancestor namespace.

This is wrong. SIGKILL can be sent "internally", for example by
do_group_exit().

> + * If @signal refers to a container-init and @signr is neither SIGKILL nor
> + * SIGSTOP, it was queued because it was blocked when it was posted.

This is not right too. It is possible that init had a handler when
the signal was sent, and the handler was set to SIG_DFL before the
signal was dequeued.

> +static int unkillable_by_sig(struct signal_struct *signal, int signr)
> +{
> +	if ((signal->flags & SIGNAL_UNKILLABLE) && !sig_kernel_only(signr))
> +		return 1;
> +	return 0;
> +}
> +
>  int get_signal_to_deliver(siginfo_t *info, struct k_sigaction *return_ka,
>  			  struct pt_regs *regs, void *cookie)
>  {
> @@ -1909,9 +1944,11 @@ relock:
>  
>  		/*
>  		 * Global init gets no signals it doesn't want.
> +		 * Container-init gets no signals it doesn't want from same
> +		 * container.
>  		 */
> -		if (unlikely(signal->flags & SIGNAL_UNKILLABLE) &&
> -		    !signal_group_exit(signal))
> +		if (unkillable_by_sig(signal, signr) &&
> +				!signal_group_exit(signal))

No need to check signal_group_exit(signal). It was needed to
handle SIGKILL when it is sent by do_group_exit()/de_thread().
With this patch this is covered by sig_kernel_only().

Personally, I'd prefer to retain this check inline with a small
comment

		/* SMALL COMMENT ;) */
		if (unlikely(signal->flags & SIGNAL_UNKILLABLE) &&
		    !sig_kernel_only(signr))
			continue;

Oleg.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 6/7][v4] SI_USER: Masquerade si_pid when crossing pid ns boundary
  2008-12-24 15:43   ` Oleg Nesterov
@ 2008-12-24 16:18     ` Oleg Nesterov
  2008-12-24 21:08     ` Sukadev Bhattiprolu
  1 sibling, 0 replies; 27+ messages in thread
From: Oleg Nesterov @ 2008-12-24 16:18 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: ebiederm, roland, bastian, daniel, xemul, containers, linux-kernel

On 12/24, Oleg Nesterov wrote:
>
> ? Why should we check SI_USER and change sys_kill() ?
                                ^^^^^^^^^^^^^^^^^^^
I'm afraid I wasn't clear...

> see also the comment for the next 7/7 patch.

yes, please see the next comment.

Oleg.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 3/7][v4] Define siginfo_from_ancestor_ns()
  2008-12-24 11:51 ` [RFC][PATCH 3/7][v4] Define siginfo_from_ancestor_ns() Sukadev Bhattiprolu
@ 2008-12-24 16:28   ` Oleg Nesterov
       [not found]     ` <20081224162823.GE11593-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 27+ messages in thread
From: Oleg Nesterov @ 2008-12-24 16:28 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: ebiederm, roland, bastian, daniel, xemul, containers, linux-kernel

On 12/24, Sukadev Bhattiprolu wrote:
>
> +#ifdef CONFIG_PID_NS
> +static inline int siginfo_from_user(siginfo_t *info)
> +{
> +	if (!is_si_special(info) && SI_FROMUSER(info) &&
> +				info->si_code != SI_ASYNCIO)
> +		return 1;
> +
> +	return 0;
> +}
> +#else
> +static inline int siginfo_from_user(siginfo_t *info)
> +{
> +	return 0;
> +}
> +#endif
> +
> +static inline int siginfo_from_ancestor_ns(struct task_struct *t,
> +                       siginfo_t *info)
> +{
> +	struct pid_namespace *ns;
> +
> +	/*
> +	 * Ensure signal is from user-space before checking pid namespace
> +	 */
> +	if (siginfo_from_user(info)) {
> +		/*
> +		 * If we do not have a pid in the receiver's namespace,
> +		 * we must be an ancestor of the receiver. 
> +		 *
> +		 * CHECK:
> +		 * 	If receiver is exiting, ns == NULL, signal will be
> +		 * 	queued but ignored (wants_signal() is FALSES). For
> +		 * 	compatibility with current behavior, assume it is
> +		 * 	from ancestor and queue the signal anyway ?
> +		 */
> +		ns = task_active_pid_ns(t);
> +		if (!ns || task_pid_nr_ns(current, ns) <= 0)
> +			return 1;
> +	}
> +
> +	return 0;
> +}

Small nit... siginfo_from_user() is only called by siginfo_from_ancestor_ns().
The first helper depends on CONFIG_PID_NS, the second is not. A bit strange.

Isn't it cleaner to do

	#ifdef CONFIG_PID_NS
	static inline int siginfo_from_user(siginfo_t *info)
	{
		...
	}
	static inline int siginfo_from_ancestor_ns(...)
	{
		...
	}
	#else
	static inline int siginfo_from_ancestor_ns(...)
	{
		return 0;
	}
	#endif

?

> +#ifdef CONFIG_PID_NS
> +/*
> + * siginfo_from_user() assumes that si_code SI_ASYNCIO comes only from
> + * within the kernel. If an application is passing in SI_ASYNCIO we 
> + * want to know about it.
> + */
> +static void warn_on_asyncio(siginfo_t *info)
> +{
> +	WARN_ON_ONCE(info->si_code == SI_ASYNCIO);
> +}
> +#else
> +#define warn_on_asyncio(info)	{}
> +#endif
> +
>  asmlinkage long
>  sys_rt_sigqueueinfo(pid_t pid, int sig, siginfo_t __user *uinfo)
>  {
> @@ -2324,6 +2388,9 @@ sys_rt_sigqueueinfo(pid_t pid, int sig, siginfo_t __user *uinfo)
>  	   Nor can they impersonate a kill(), which adds source info.  */
>  	if (info.si_code >= 0)
>  		return -EPERM;
> +
> +	warn_on_asyncio(&info);

Hmm... why do you want this? The user-space can use any si_code >= 0,
why should we uglify the code?

And, SI_ASYNCIO only matters when we send the signal to the subnamespace,
and in that case we will probably mangle .si_pid. So why don't we warn
when .si_code == SI_USER?

Oleg.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 2/7][v4] Protect init from unwanted signals more
  2008-12-24 11:50 ` [RFC][PATCH 2/7][v4] Protect init from unwanted signals more Sukadev Bhattiprolu
@ 2008-12-24 16:35       ` Oleg Nesterov
  0 siblings, 0 replies; 27+ messages in thread
From: Oleg Nesterov @ 2008-12-24 16:35 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: bastian-yyjItF7Rl6lg9hUCZPvPmw,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	containers-qjLDD68F18O7TbgM5vRIOg, roland-H+wXaHxf7aLQT0dZR+AlfA,
	xemul-GEFAQzZX7r8dnm+yROfE0A

On 12/24, Sukadev Bhattiprolu wrote:
>
> -static int sig_ignored(struct task_struct *t, int sig)
> +static int sig_task_ignored(struct task_struct *t, int sig)
>  {
>  	void __user *handler;
>
> +	handler = sig_handler(t, sig);
> +
> +	if (unlikely(t->signal->flags & SIGNAL_UNKILLABLE) &&
> +			(handler == SIG_IGN || handler == SIG_DFL))
> +		return 1;
> +	
> +	return sig_handler_ignored(handler, sig);

Well, really minor nit, but can't resist ;)

if we check both SIG_IGN and SIG_DFL, then why do we call
sig_handler_ignored() ? We can do

	if (unlikely(t->signal->flags & SIGNAL_UNKILLABLE))
		return handler == SIG_IGN || handler == SIG_DFL;
	return sig_handler_ignored(handler, sig);

Or, we can do

	if (unlikely(t->signal->flags & SIGNAL_UNKILLABLE) &&
			 handler == SIG_DFL)
		return 1;
	return sig_handler_ignored(handler, sig);

because sig_handler_ignored() checks SIG_IGN too.

Of course, this is a matter of taste only...

Oleg.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 2/7][v4] Protect init from unwanted signals more
@ 2008-12-24 16:35       ` Oleg Nesterov
  0 siblings, 0 replies; 27+ messages in thread
From: Oleg Nesterov @ 2008-12-24 16:35 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: ebiederm, roland, bastian, daniel, xemul, containers, linux-kernel

On 12/24, Sukadev Bhattiprolu wrote:
>
> -static int sig_ignored(struct task_struct *t, int sig)
> +static int sig_task_ignored(struct task_struct *t, int sig)
>  {
>  	void __user *handler;
>
> +	handler = sig_handler(t, sig);
> +
> +	if (unlikely(t->signal->flags & SIGNAL_UNKILLABLE) &&
> +			(handler == SIG_IGN || handler == SIG_DFL))
> +		return 1;
> +	
> +	return sig_handler_ignored(handler, sig);

Well, really minor nit, but can't resist ;)

if we check both SIG_IGN and SIG_DFL, then why do we call
sig_handler_ignored() ? We can do

	if (unlikely(t->signal->flags & SIGNAL_UNKILLABLE))
		return handler == SIG_IGN || handler == SIG_DFL;
	return sig_handler_ignored(handler, sig);

Or, we can do

	if (unlikely(t->signal->flags & SIGNAL_UNKILLABLE) &&
			 handler == SIG_DFL)
		return 1;
	return sig_handler_ignored(handler, sig);

because sig_handler_ignored() checks SIG_IGN too.

Of course, this is a matter of taste only...

Oleg.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 7/7][v4] SI_TKILL: Masquerade si_pid when crossing pid ns boundary
  2008-12-24 15:55   ` Oleg Nesterov
@ 2008-12-24 21:04     ` Sukadev Bhattiprolu
  2008-12-24 21:34       ` Oleg Nesterov
  0 siblings, 1 reply; 27+ messages in thread
From: Sukadev Bhattiprolu @ 2008-12-24 21:04 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: ebiederm, roland, bastian, daniel, xemul, containers, linux-kernel

Oleg Nesterov [oleg@redhat.com] wrote:
| On 12/24, Sukadev Bhattiprolu wrote:
| >
| > --- a/kernel/signal.c
| > +++ b/kernel/signal.c
| > @@ -2385,17 +2385,22 @@ static int do_tkill(pid_t tgid, pid_t pid, int sig)
| >  	struct siginfo info;
| >  	struct task_struct *p;
| >  	unsigned long flags;
| > +	struct pid_namespace *ns;
| >
| >  	error = -ESRCH;
| >  	info.si_signo = sig;
| >  	info.si_errno = 0;
| >  	info.si_code = SI_TKILL;
| > -	info.si_pid = task_tgid_vnr(current);
| >  	info.si_uid = current_uid();
| >
| >  	rcu_read_lock();
| >  	p = find_task_by_vpid(pid);
| >  	if (p && (tgid <= 0 || task_tgid_vnr(p) == tgid)) {
| > +		ns = task_active_pid_ns(p);
| > +		if (ns)
| > +			info.si_pid = task_tgid_nr_ns(current, ns);
| > +		else
| > +			info.si_pid = task_tgid_vnr(current);
| 
| if ns == 0, "p" won't see the signal anyway, so all we need is

Yes, p won't see the signal, but task_tgid_nr_ns() is not safe if ns == NULL.
We should either check ns or make task_tgid_nr_ns() and friends safe with
ns == NULL ?

| 
| 	-	info.si_pid = task_tgid_vnr(current);
| 	+	info.si_pid = task_tgid_nr_ns(current, task_active_ns(p));
| 
| like we do in __do_notify().

Yes, I had a question about ns == 0 in this patch and was wondering if I
should add a check in __do_notify() too.

| 
| 
| But. this of course doesn't work for sys_kill(). Can't we change the helpers
| which send SI_FROMUSER() signals so that they do not fill .si_pid at all?

SI_FROMUSER() basically comes down to SI_USER and SI_TKILL (SI_QUEUE,
SI_SIGIO, SI_DETHREAD are unused ?)  SI_USER has to be masqueraded in
send_signal(). That leaves us with SI_TKILL. 

I was trying to have all si_pid settings done at origin and so the change
here for SI_TKILL. But yes, SI_USER (sys_kill() case) can't be done at
origin hence the special case for it in send_signal().

| Then send_signal() can do:
| 
| 	default:
| 		copy_siginfo(&q->info, info);
| 		info.si_pid = 0;
| 		if (!from_ancestot_ns)
| 			info.si_pid = task_tgid_nr_ns(current, ...);
| 
| ?


My preference was to address SI_TKILL also at origin, but am not
particular.  Yes, that will work too.


| 
| Yes, we use "current". But we already used it in siginfo_from_ancestor_ns().
| 
| Oleg.
| 
| --
| To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
| the body of a message to majordomo@vger.kernel.org
| More majordomo info at  http://vger.kernel.org/majordomo-info.html
| Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 6/7][v4] SI_USER: Masquerade si_pid when crossing pid ns boundary
  2008-12-24 15:43   ` Oleg Nesterov
  2008-12-24 16:18     ` Oleg Nesterov
@ 2008-12-24 21:08     ` Sukadev Bhattiprolu
  1 sibling, 0 replies; 27+ messages in thread
From: Sukadev Bhattiprolu @ 2008-12-24 21:08 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: ebiederm, roland, bastian, daniel, xemul, containers, linux-kernel

Oleg Nesterov [oleg@redhat.com] wrote:
| On 12/24, Sukadev Bhattiprolu wrote:
| >
| > +static void masquerade_si_pid(struct task_struct *t, siginfo_t *info)
| > +{
| > +	if (is_si_special(info) || SI_FROMKERNEL(info))
| > +		return;

Grr, This needs to go. I started with a general function but then
got specific to SI_USER.

| > +
| > +	/*
| > +	 * When crossing pid namespace boundary, SI_USER signal can only
| > +	 * go from ancestor to descendant ns but not the other way. So,
| > +	 * just ->si_pid to 0 since, the sender will not have a pid in
| > +	 * the receiver's namespace.
| > +	 */
| > +	if (info->si_code == SI_USER)
| > +		info->si_pid = 0;
| > +}
| > +
| >  static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
| >  			int group)
| >  {
| > @@ -946,6 +974,8 @@ static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
| >  			break;
| >  		default:
| >  			copy_siginfo(&q->info, info);
| > +			if (from_ancestor_ns)
| > +				masquerade_si_pid(t, &q->info);
| >  			break;
| >  		}
| >  	} else if (!is_si_special(info)) {
| > @@ -2343,7 +2373,7 @@ sys_kill(pid_t pid, int sig)
| >  	info.si_signo = sig;
| >  	info.si_errno = 0;
| >  	info.si_code = SI_USER;
| > -	info.si_pid = task_tgid_vnr(current);
| > +	info.si_pid = 0;	/* masquerade in send_signal() */
| >  	info.si_uid = current_uid();

Yes, I will undo this...

| 
| Can't understand this patch. First of all, it looks wrong. Looks like
| we never set .si_pid != 0 when the signal is set by sys_kill() ?
| 
| But more importantly, unless I missed something, this patch is unnecessary
| complication.
| 
| We call masquerade_si_pid() only when from_ancestor_ns == T, this is correct.
| But this means that (!is_si_special(info) && SI_FROMUSER(info)) == T, why
| do we re-check in masquerade_si_pid() ?
| 
| And why can't we just do
| 
| 	 default:
| 	 	copy_siginfo(&q->info, info);
| 	 	if (from_ancestor_ns)
| 	 		info->si_pid = 0;

and go with this.

| 
| ? Why should we check SI_USER and change sys_kill() ?
| 
| see also the comment for the next 7/7 patch.
| 
| Oleg.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 2/7][v4] Protect init from unwanted signals more
  2008-12-24 16:35       ` Oleg Nesterov
  (?)
@ 2008-12-24 21:11       ` Sukadev Bhattiprolu
  -1 siblings, 0 replies; 27+ messages in thread
From: Sukadev Bhattiprolu @ 2008-12-24 21:11 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: ebiederm, roland, bastian, daniel, xemul, containers, linux-kernel

Oleg Nesterov [oleg@redhat.com] wrote:
| Or, we can do
| 
| 	if (unlikely(t->signal->flags & SIGNAL_UNKILLABLE) &&
| 			 handler == SIG_DFL)
| 		return 1;
| 	return sig_handler_ignored(handler, sig);
| 
| because sig_handler_ignored() checks SIG_IGN too.

Yes it more optimal. I will change.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 3/7][v4] Define siginfo_from_ancestor_ns()
  2008-12-24 16:28   ` Oleg Nesterov
@ 2008-12-24 21:24         ` Sukadev Bhattiprolu
  0 siblings, 0 replies; 27+ messages in thread
From: Sukadev Bhattiprolu @ 2008-12-24 21:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: bastian-yyjItF7Rl6lg9hUCZPvPmw,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	containers-qjLDD68F18O7TbgM5vRIOg, roland-H+wXaHxf7aLQT0dZR+AlfA,
	xemul-GEFAQzZX7r8dnm+yROfE0A

Oleg Nesterov [oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org] wrote:
| Small nit... siginfo_from_user() is only called by siginfo_from_ancestor_ns().
| The first helper depends on CONFIG_PID_NS, the second is not. A bit strange.

| 
| Isn't it cleaner to do
| 
| 	#ifdef CONFIG_PID_NS
| 	static inline int siginfo_from_user(siginfo_t *info)
| 	{
| 		...
| 	}
| 	static inline int siginfo_from_ancestor_ns(...)
| 	{
| 		...
| 	}
| 	#else
| 	static inline int siginfo_from_ancestor_ns(...)
| 	{
| 		return 0;
| 	}
| 	#endif
| 
| ?

Yes, it was that way in the earlier version, but I thought we introduced
CONFIG_PID_NS only to hide the ugliness resulting from pid-ns. Ok. I
will revert.

| 
| > +#ifdef CONFIG_PID_NS
| > +/*
| > + * siginfo_from_user() assumes that si_code SI_ASYNCIO comes only from
| > + * within the kernel. If an application is passing in SI_ASYNCIO we 
| > + * want to know about it.
| > + */
| > +static void warn_on_asyncio(siginfo_t *info)
| > +{
| > +	WARN_ON_ONCE(info->si_code == SI_ASYNCIO);
| > +}
| > +#else
| > +#define warn_on_asyncio(info)	{}
| > +#endif
| > +
| >  asmlinkage long
| >  sys_rt_sigqueueinfo(pid_t pid, int sig, siginfo_t __user *uinfo)
| >  {
| > @@ -2324,6 +2388,9 @@ sys_rt_sigqueueinfo(pid_t pid, int sig, siginfo_t __user *uinfo)
| >  	   Nor can they impersonate a kill(), which adds source info.  */
| >  	if (info.si_code >= 0)
| >  		return -EPERM;
| > +
| > +	warn_on_asyncio(&info);
| 
| Hmm... why do you want this? The user-space can use any si_code >= 0,
| why should we uglify the code?

I thought losing a SIGKILL, however twisted the path, was serious enough
to justify the ugliness. Again, I am not particular.

| 
| And, SI_ASYNCIO only matters when we send the signal to the subnamespace,
| and in that case we will probably mangle .si_pid. So why don't we warn
| when .si_code == SI_USER?

I was wondering if I should there too :-) But what do you think ? 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 3/7][v4] Define siginfo_from_ancestor_ns()
@ 2008-12-24 21:24         ` Sukadev Bhattiprolu
  0 siblings, 0 replies; 27+ messages in thread
From: Sukadev Bhattiprolu @ 2008-12-24 21:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: ebiederm, roland, bastian, daniel, xemul, containers, linux-kernel

Oleg Nesterov [oleg@redhat.com] wrote:
| Small nit... siginfo_from_user() is only called by siginfo_from_ancestor_ns().
| The first helper depends on CONFIG_PID_NS, the second is not. A bit strange.

| 
| Isn't it cleaner to do
| 
| 	#ifdef CONFIG_PID_NS
| 	static inline int siginfo_from_user(siginfo_t *info)
| 	{
| 		...
| 	}
| 	static inline int siginfo_from_ancestor_ns(...)
| 	{
| 		...
| 	}
| 	#else
| 	static inline int siginfo_from_ancestor_ns(...)
| 	{
| 		return 0;
| 	}
| 	#endif
| 
| ?

Yes, it was that way in the earlier version, but I thought we introduced
CONFIG_PID_NS only to hide the ugliness resulting from pid-ns. Ok. I
will revert.

| 
| > +#ifdef CONFIG_PID_NS
| > +/*
| > + * siginfo_from_user() assumes that si_code SI_ASYNCIO comes only from
| > + * within the kernel. If an application is passing in SI_ASYNCIO we 
| > + * want to know about it.
| > + */
| > +static void warn_on_asyncio(siginfo_t *info)
| > +{
| > +	WARN_ON_ONCE(info->si_code == SI_ASYNCIO);
| > +}
| > +#else
| > +#define warn_on_asyncio(info)	{}
| > +#endif
| > +
| >  asmlinkage long
| >  sys_rt_sigqueueinfo(pid_t pid, int sig, siginfo_t __user *uinfo)
| >  {
| > @@ -2324,6 +2388,9 @@ sys_rt_sigqueueinfo(pid_t pid, int sig, siginfo_t __user *uinfo)
| >  	   Nor can they impersonate a kill(), which adds source info.  */
| >  	if (info.si_code >= 0)
| >  		return -EPERM;
| > +
| > +	warn_on_asyncio(&info);
| 
| Hmm... why do you want this? The user-space can use any si_code >= 0,
| why should we uglify the code?

I thought losing a SIGKILL, however twisted the path, was serious enough
to justify the ugliness. Again, I am not particular.

| 
| And, SI_ASYNCIO only matters when we send the signal to the subnamespace,
| and in that case we will probably mangle .si_pid. So why don't we warn
| when .si_code == SI_USER?

I was wondering if I should there too :-) But what do you think ? 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 5/7][v4] Protect cinit from blocked fatal signals
  2008-12-24 16:09   ` Oleg Nesterov
@ 2008-12-24 21:25     ` Sukadev Bhattiprolu
  2008-12-31  0:04     ` Roland McGrath
  1 sibling, 0 replies; 27+ messages in thread
From: Sukadev Bhattiprolu @ 2008-12-24 21:25 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: ebiederm, roland, bastian, daniel, xemul, containers, linux-kernel

| Personally, I'd prefer to retain this check inline with a small
| comment
| 
| 		/* SMALL COMMENT ;) */
| 		if (unlikely(signal->flags & SIGNAL_UNKILLABLE) &&
| 		    !sig_kernel_only(signr))
| 			continue;

Ok. :-)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 7/7][v4] SI_TKILL: Masquerade si_pid when crossing pid ns boundary
  2008-12-24 21:04     ` Sukadev Bhattiprolu
@ 2008-12-24 21:34       ` Oleg Nesterov
  0 siblings, 0 replies; 27+ messages in thread
From: Oleg Nesterov @ 2008-12-24 21:34 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: ebiederm, roland, bastian, daniel, xemul, containers, linux-kernel

On 12/24, Sukadev Bhattiprolu wrote:
>
> Oleg Nesterov [oleg@redhat.com] wrote:
>
> | >  	if (p && (tgid <= 0 || task_tgid_vnr(p) == tgid)) {
> | > +		ns = task_active_pid_ns(p);
> | > +		if (ns)
> | > +			info.si_pid = task_tgid_nr_ns(current, ns);
> | > +		else
> | > +			info.si_pid = task_tgid_vnr(current);
> |
> | if ns == 0, "p" won't see the signal anyway, so all we need is
>
> Yes, p won't see the signal, but task_tgid_nr_ns() is not safe if ns == NULL.

Yes, I forgot that task_tgid_nr_ns() assumes ns != NULL.

> | like we do in __do_notify().
>
> Yes, I had a question about ns == 0 in this patch and was wondering if I
> should add a check in __do_notify() too.

I guess  __do_notify() is fine, ns_of_pid() can't be NULL.

> | But. this of course doesn't work for sys_kill(). Can't we change the helpers
> | which send SI_FROMUSER() signals so that they do not fill .si_pid at all?
>
> SI_FROMUSER() basically comes down to SI_USER and SI_TKILL (SI_QUEUE,
> SI_SIGIO, SI_DETHREAD are unused ?)

Yes, I was going to kill them many times but always forget.

> SI_USER has to be masqueraded in
> send_signal(). That leaves us with SI_TKILL.
>
> I was trying to have all si_pid settings done at origin and so the change
> here for SI_TKILL. But yes, SI_USER (sys_kill() case) can't be done at
> origin hence the special case for it in send_signal().
>
> | Then send_signal() can do:
> |
> | 	default:
> | 		copy_siginfo(&q->info, info);
> | 		info.si_pid = 0;
> | 		if (!from_ancestot_ns)
> | 			info.si_pid = task_tgid_nr_ns(current, ...);
> |
> | ?
>
> My preference was to address SI_TKILL also at origin, but am not
> particular.  Yes, that will work too.

... and the code becomes more clean and simple.

I really dislike the fact that sys_kill() relies on send_signal()
(this is correct), but do_tkill() and __do_notify() play with pid ns
themselves. And this complicates send_signal() too.

Unless the kernel has a user which sends the "strange" SI_FROMUSER()
signal without ->si_pid, of course...

Oleg.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 3/7][v4] Define siginfo_from_ancestor_ns()
  2008-12-24 21:24         ` Sukadev Bhattiprolu
@ 2008-12-24 22:03             ` Oleg Nesterov
  -1 siblings, 0 replies; 27+ messages in thread
From: Oleg Nesterov @ 2008-12-24 22:03 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: bastian-yyjItF7Rl6lg9hUCZPvPmw,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	containers-qjLDD68F18O7TbgM5vRIOg, roland-H+wXaHxf7aLQT0dZR+AlfA,
	xemul-GEFAQzZX7r8dnm+yROfE0A

On 12/24, Sukadev Bhattiprolu wrote:
>
> Oleg Nesterov [oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org] wrote:
> | And, SI_ASYNCIO only matters when we send the signal to the subnamespace,
> | and in that case we will probably mangle .si_pid. So why don't we warn
> | when .si_code == SI_USER?
>
> I was wondering if I should there too :-) But what do you think ?

Well, if you ask me, I'd suggest to document the problems with
sigqueueinfo() and forget. Whatever we do, we can't be always
right.

Oleg.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 3/7][v4] Define siginfo_from_ancestor_ns()
@ 2008-12-24 22:03             ` Oleg Nesterov
  0 siblings, 0 replies; 27+ messages in thread
From: Oleg Nesterov @ 2008-12-24 22:03 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: ebiederm, roland, bastian, daniel, xemul, containers, linux-kernel

On 12/24, Sukadev Bhattiprolu wrote:
>
> Oleg Nesterov [oleg@redhat.com] wrote:
> | And, SI_ASYNCIO only matters when we send the signal to the subnamespace,
> | and in that case we will probably mangle .si_pid. So why don't we warn
> | when .si_code == SI_USER?
>
> I was wondering if I should there too :-) But what do you think ?

Well, if you ask me, I'd suggest to document the problems with
sigqueueinfo() and forget. Whatever we do, we can't be always
right.

Oleg.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 3/7][v4] Define siginfo_from_ancestor_ns()
  2008-12-24 22:03             ` Oleg Nesterov
  (?)
@ 2008-12-27 20:38             ` Sukadev Bhattiprolu
  -1 siblings, 0 replies; 27+ messages in thread
From: Sukadev Bhattiprolu @ 2008-12-27 20:38 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: ebiederm, roland, bastian, daniel, xemul, containers, linux-kernel

Oleg Nesterov [oleg@redhat.com] wrote:
| On 12/24, Sukadev Bhattiprolu wrote:
| >
| > Oleg Nesterov [oleg@redhat.com] wrote:
| > | And, SI_ASYNCIO only matters when we send the signal to the subnamespace,
| > | and in that case we will probably mangle .si_pid. So why don't we warn
| > | when .si_code == SI_USER?
| >
| > I was wondering if I should there too :-) But what do you think ?
| 
| Well, if you ask me, I'd suggest to document the problems with
| sigqueueinfo() and forget. Whatever we do, we can't be always
| right.

Ok.  According to
http://www.kernel.org/doc/man-pages/online/pages/man2/rt_sigqueueinfo.2.html.
rt_sigqueueinfo() is not directly published to users and users can't be
passing in siginfo_t directly. So no man page update is needed either ? 

Sukadev

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 5/7][v4] Protect cinit from blocked fatal signals
  2008-12-24 16:09   ` Oleg Nesterov
  2008-12-24 21:25     ` Sukadev Bhattiprolu
@ 2008-12-31  0:04     ` Roland McGrath
  2009-01-05 12:24       ` Oleg Nesterov
  1 sibling, 1 reply; 27+ messages in thread
From: Roland McGrath @ 2008-12-31  0:04 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Sukadev Bhattiprolu, ebiederm, bastian, daniel, xemul,
	containers, linux-kernel

> > + * If @signal refers to a container-init and @signr is neither SIGKILL nor
> > + * SIGSTOP, it was queued because it was blocked when it was posted.
> 
> This is not right too. It is possible that init had a handler when
> the signal was sent, and the handler was set to SIG_DFL before the
> signal was dequeued.

do_sigaction's sig_handler_ignored() should exclude that.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC][PATCH 5/7][v4] Protect cinit from blocked fatal signals
  2008-12-31  0:04     ` Roland McGrath
@ 2009-01-05 12:24       ` Oleg Nesterov
  0 siblings, 0 replies; 27+ messages in thread
From: Oleg Nesterov @ 2009-01-05 12:24 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Sukadev Bhattiprolu, ebiederm, bastian, daniel, xemul,
	containers, linux-kernel

On 12/30, Roland McGrath wrote:
>
> > > + * If @signal refers to a container-init and @signr is neither SIGKILL nor
> > > + * SIGSTOP, it was queued because it was blocked when it was posted.
> >
> > This is not right too. It is possible that init had a handler when
> > the signal was sent, and the handler was set to SIG_DFL before the
> > signal was dequeued.
>
> do_sigaction's sig_handler_ignored() should exclude that.

Yes sure, but sig_handler_ignored(sig) can be false, it doesn't take
SIGNAL_UNKILLABLE into account.

Oleg.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2009-01-05 12:24 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-12-24 11:44 [PATCH 0/7][v4] Container-init signal semantics Sukadev Bhattiprolu
2008-12-24 11:50 ` [RFC][PATCH 1/7][v4] Remove 'handler' parameter to tracehook functions Sukadev Bhattiprolu
2008-12-24 11:50 ` [RFC][PATCH 2/7][v4] Protect init from unwanted signals more Sukadev Bhattiprolu
     [not found]   ` <20081224115047.GB8020-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-12-24 16:35     ` Oleg Nesterov
2008-12-24 16:35       ` Oleg Nesterov
2008-12-24 21:11       ` Sukadev Bhattiprolu
2008-12-24 11:51 ` [RFC][PATCH 3/7][v4] Define siginfo_from_ancestor_ns() Sukadev Bhattiprolu
2008-12-24 16:28   ` Oleg Nesterov
     [not found]     ` <20081224162823.GE11593-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2008-12-24 21:24       ` Sukadev Bhattiprolu
2008-12-24 21:24         ` Sukadev Bhattiprolu
     [not found]         ` <20081224212426.GD13502-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-12-24 22:03           ` Oleg Nesterov
2008-12-24 22:03             ` Oleg Nesterov
2008-12-27 20:38             ` Sukadev Bhattiprolu
2008-12-24 11:51 ` [RFC][PATCH 4/7][v4] Protect cinit from unblocked SIG_DFL signals Sukadev Bhattiprolu
2008-12-24 11:52 ` [RFC][PATCH 5/7][v4] Protect cinit from blocked fatal signals Sukadev Bhattiprolu
2008-12-24 16:09   ` Oleg Nesterov
2008-12-24 21:25     ` Sukadev Bhattiprolu
2008-12-31  0:04     ` Roland McGrath
2009-01-05 12:24       ` Oleg Nesterov
2008-12-24 11:53 ` [RFC][PATCH 6/7][v4] SI_USER: Masquerade si_pid when crossing pid ns boundary Sukadev Bhattiprolu
2008-12-24 15:43   ` Oleg Nesterov
2008-12-24 16:18     ` Oleg Nesterov
2008-12-24 21:08     ` Sukadev Bhattiprolu
2008-12-24 11:53 ` [RFC][PATCH 7/7][v4] SI_TKILL: " Sukadev Bhattiprolu
2008-12-24 15:55   ` Oleg Nesterov
2008-12-24 21:04     ` Sukadev Bhattiprolu
2008-12-24 21:34       ` Oleg Nesterov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.