* [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot @ 2011-08-11 20:23 Daniel Lezcano [not found] ` <1313094241-3674-1-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> ` (3 more replies) 0 siblings, 4 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-11 20:23 UTC (permalink / raw) To: akpm; +Cc: oleg, bonbons, containers, linux-kernel, serge From: Daniel Lezcano <dlezcano@fr.ibm.com> In the case of a VPS, when we shutdown/halt/reboot the container, the reboot utility will invoke the sys_reboot syscall which has the bad effect to reboot the host. The way to fix that is to drop the CAP_SYS_REBOOT capability in the container. In this case, the container shutdowns correctly but, at the end, the init process is waiting indefinitely and we have the containers stuck with one process (the init process). In order to fix that, we used a hypervisor process, parent of the container's init process, watching for the container's utmp file and detecting when the runlevel changes. When this runlevel change is detected we wait for the container to have one process left and then we kill the container's init. That works well if we modify the distro configuration files, we make /var/run to not be a tmpfs and we remove all the files inside this directory when the container boots. *But* as soon as we upgrade the container distro, all the tweaks are lost. So this method works but at the cost of tweaking the containers configuration files again and again, each time there is an update, which is not tolerable in a production environment. This patchset solves the problem by send a SIGCHLD signal to the process parent of the init process of the child pid namespace. By this way, we know when a pid namespace wanted to reboot/halt/shutdown and we can take advantage of that to kill, restart or suspend the container. Daniel Lezcano (2): add SA_CLDREBOOT flag Notify container-init parent a 'reboot' occured arch/alpha/include/asm/signal.h | 2 + arch/arm/include/asm/signal.h | 2 + arch/avr32/include/asm/signal.h | 2 + arch/cris/include/asm/signal.h | 2 + arch/h8300/include/asm/signal.h | 2 + arch/ia64/include/asm/signal.h | 2 + arch/m32r/include/asm/signal.h | 2 + arch/m68k/include/asm/signal.h | 2 + arch/mips/include/asm/signal.h | 2 + arch/mn10300/include/asm/signal.h | 2 + arch/parisc/include/asm/signal.h | 2 + arch/powerpc/include/asm/signal.h | 2 + arch/s390/include/asm/signal.h | 2 + arch/sparc/include/asm/signal.h | 2 +- arch/x86/include/asm/signal.h | 2 + arch/xtensa/include/asm/signal.h | 2 + include/asm-generic/siginfo.h | 3 +- include/asm-generic/signal.h | 2 + include/linux/sched.h | 1 + kernel/signal.c | 40 +++++++++++++++++++++++++++++++++++++ kernel/sys.c | 20 ++++++++++++++++- 21 files changed, 94 insertions(+), 4 deletions(-) -- 1.7.4.1 ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <1313094241-3674-1-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org>]
* [PATCH 1/2] add SA_CLDREBOOT flag [not found] ` <1313094241-3674-1-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> @ 2011-08-11 20:24 ` Daniel Lezcano 2011-08-11 20:24 ` [PATCH 2/2] Notify container-init parent a 'reboot' occured Daniel Lezcano 2011-08-14 16:17 ` [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot Oleg Nesterov 2 siblings, 0 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-11 20:24 UTC (permalink / raw) To: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, oleg-6lXkIZvqkOAvJsYlp49lxw, linux-kernel-u79uwXL29TY76Z2rM5mHXA Add the SA_CLDREBOOT flag to enable the CLD_REBOOT on SIGCHLD when a process of a child pid namespace calls the reboot syscall. Signed-off-by: Daniel Lezcano <daniel.lezcano-GANU6spQydw@public.gmane.org> --- arch/alpha/include/asm/signal.h | 2 ++ arch/arm/include/asm/signal.h | 2 ++ arch/avr32/include/asm/signal.h | 2 ++ arch/cris/include/asm/signal.h | 2 ++ arch/h8300/include/asm/signal.h | 2 ++ arch/ia64/include/asm/signal.h | 2 ++ arch/m32r/include/asm/signal.h | 2 ++ arch/m68k/include/asm/signal.h | 2 ++ arch/mips/include/asm/signal.h | 2 ++ arch/mn10300/include/asm/signal.h | 2 ++ arch/parisc/include/asm/signal.h | 2 ++ arch/powerpc/include/asm/signal.h | 2 ++ arch/s390/include/asm/signal.h | 2 ++ arch/sparc/include/asm/signal.h | 2 +- arch/x86/include/asm/signal.h | 2 ++ arch/xtensa/include/asm/signal.h | 2 ++ include/asm-generic/signal.h | 2 ++ 17 files changed, 33 insertions(+), 1 deletions(-) diff --git a/arch/alpha/include/asm/signal.h b/arch/alpha/include/asm/signal.h index a938830..40a2285 100644 --- a/arch/alpha/include/asm/signal.h +++ b/arch/alpha/include/asm/signal.h @@ -82,6 +82,7 @@ typedef unsigned long sigset_t; * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * * SA_ONESHOT and SA_NOMASK are the historical Linux names for the Single * Unix names RESETHAND and NODEFER respectively. @@ -94,6 +95,7 @@ typedef unsigned long sigset_t; #define SA_RESETHAND 0x00000010 #define SA_NOCLDWAIT 0x00000020 #define SA_SIGINFO 0x00000040 +#define SA_CLDREBOOT 0x00000080 #define SA_ONESHOT SA_RESETHAND #define SA_NOMASK SA_NODEFER diff --git a/arch/arm/include/asm/signal.h b/arch/arm/include/asm/signal.h index 43ba0fb..2b585ec 100644 --- a/arch/arm/include/asm/signal.h +++ b/arch/arm/include/asm/signal.h @@ -78,6 +78,7 @@ typedef unsigned long sigset_t; * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_SIGINFO deliver the signal with SIGINFO structs + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_THIRTYTWO delivers the signal in 32-bit mode, even if the task * is running in 26-bit. * SA_ONSTACK allows alternate signal stacks (see sigaltstack(2)). @@ -91,6 +92,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000008 #define SA_THIRTYTWO 0x02000000 #define SA_RESTORER 0x04000000 #define SA_ONSTACK 0x08000000 diff --git a/arch/avr32/include/asm/signal.h b/arch/avr32/include/asm/signal.h index 8790dfc..a025546 100644 --- a/arch/avr32/include/asm/signal.h +++ b/arch/avr32/include/asm/signal.h @@ -83,6 +83,7 @@ typedef unsigned long sigset_t; * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_SIGINFO deliver the signal with SIGINFO structs + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NODEFER prevents the current signal from being masked in the handler. @@ -94,6 +95,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000008 #define SA_RESTORER 0x04000000 #define SA_ONSTACK 0x08000000 #define SA_RESTART 0x10000000 diff --git a/arch/cris/include/asm/signal.h b/arch/cris/include/asm/signal.h index ea6af9a..fa804f0 100644 --- a/arch/cris/include/asm/signal.h +++ b/arch/cris/include/asm/signal.h @@ -76,6 +76,7 @@ typedef unsigned long sigset_t; * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -87,6 +88,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001u #define SA_NOCLDWAIT 0x00000002u #define SA_SIGINFO 0x00000004u +#define SA_CLDREBOOT 0x00000008u #define SA_ONSTACK 0x08000000u #define SA_RESTART 0x10000000u #define SA_NODEFER 0x40000000u diff --git a/arch/h8300/include/asm/signal.h b/arch/h8300/include/asm/signal.h index fd8b66e..13b4b5a 100644 --- a/arch/h8300/include/asm/signal.h +++ b/arch/h8300/include/asm/signal.h @@ -76,6 +76,7 @@ typedef unsigned long sigset_t; * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -86,6 +87,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 /* not supported yet */ #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000008 #define SA_ONSTACK 0x08000000 #define SA_RESTART 0x10000000 #define SA_NODEFER 0x40000000 diff --git a/arch/ia64/include/asm/signal.h b/arch/ia64/include/asm/signal.h index b166248..30245a1 100644 --- a/arch/ia64/include/asm/signal.h +++ b/arch/ia64/include/asm/signal.h @@ -58,6 +58,7 @@ * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -68,6 +69,7 @@ #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000008 #define SA_ONSTACK 0x08000000 #define SA_RESTART 0x10000000 #define SA_NODEFER 0x40000000 diff --git a/arch/m32r/include/asm/signal.h b/arch/m32r/include/asm/signal.h index b2eeb0d..c547212 100644 --- a/arch/m32r/include/asm/signal.h +++ b/arch/m32r/include/asm/signal.h @@ -78,6 +78,7 @@ typedef unsigned long sigset_t; * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -88,6 +89,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001u #define SA_NOCLDWAIT 0x00000002u #define SA_SIGINFO 0x00000004u +#define SA_CLDREBOOT 0x00000008u #define SA_ONSTACK 0x08000000u #define SA_RESTART 0x10000000u #define SA_NODEFER 0x40000000u diff --git a/arch/m68k/include/asm/signal.h b/arch/m68k/include/asm/signal.h index 60e8866..14334b2 100644 --- a/arch/m68k/include/asm/signal.h +++ b/arch/m68k/include/asm/signal.h @@ -76,6 +76,7 @@ typedef unsigned long sigset_t; * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -86,6 +87,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000008 #define SA_ONSTACK 0x08000000 #define SA_RESTART 0x10000000 #define SA_NODEFER 0x40000000 diff --git a/arch/mips/include/asm/signal.h b/arch/mips/include/asm/signal.h index c783f36..acf7494 100644 --- a/arch/mips/include/asm/signal.h +++ b/arch/mips/include/asm/signal.h @@ -66,6 +66,7 @@ typedef unsigned long old_sigset_t; /* at least 32 bits */ * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -80,6 +81,7 @@ typedef unsigned long old_sigset_t; /* at least 32 bits */ #define SA_NODEFER 0x40000000 #define SA_NOCLDWAIT 0x00010000 #define SA_NOCLDSTOP 0x00000001 +#define SA_CLDREBOOT 0x00000002 #define SA_NOMASK SA_NODEFER #define SA_ONESHOT SA_RESETHAND diff --git a/arch/mn10300/include/asm/signal.h b/arch/mn10300/include/asm/signal.h index 1865d72..ad1f454 100644 --- a/arch/mn10300/include/asm/signal.h +++ b/arch/mn10300/include/asm/signal.h @@ -86,6 +86,7 @@ typedef unsigned long sigset_t; * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -96,6 +97,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001U #define SA_NOCLDWAIT 0x00000002U #define SA_SIGINFO 0x00000004U +#define SA_CLDREBOOT 0x00000008U #define SA_ONSTACK 0x08000000U #define SA_RESTART 0x10000000U #define SA_NODEFER 0x40000000U diff --git a/arch/parisc/include/asm/signal.h b/arch/parisc/include/asm/signal.h index c203563..85c2fec 100644 --- a/arch/parisc/include/asm/signal.h +++ b/arch/parisc/include/asm/signal.h @@ -50,6 +50,7 @@ * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -58,6 +59,7 @@ * Unix names RESETHAND and NODEFER respectively. */ #define SA_ONSTACK 0x00000001 +#define SA_CLDREBOOT 0x00000002 #define SA_RESETHAND 0x00000004 #define SA_NOCLDSTOP 0x00000008 #define SA_SIGINFO 0x00000010 diff --git a/arch/powerpc/include/asm/signal.h b/arch/powerpc/include/asm/signal.h index 3eb13be..f96e2d5 100644 --- a/arch/powerpc/include/asm/signal.h +++ b/arch/powerpc/include/asm/signal.h @@ -65,6 +65,7 @@ typedef struct { * SA_ONSTACK is not currently supported, but will allow sigaltstack(2). * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -75,6 +76,7 @@ typedef struct { #define SA_NOCLDSTOP 0x00000001U #define SA_NOCLDWAIT 0x00000002U #define SA_SIGINFO 0x00000004U +#define SA_CLDREBOOT 0x00000008U #define SA_ONSTACK 0x08000000U #define SA_RESTART 0x10000000U #define SA_NODEFER 0x40000000U diff --git a/arch/s390/include/asm/signal.h b/arch/s390/include/asm/signal.h index cdf5cb2..449a2dc 100644 --- a/arch/s390/include/asm/signal.h +++ b/arch/s390/include/asm/signal.h @@ -86,6 +86,7 @@ typedef unsigned long sigset_t; * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -96,6 +97,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000008 #define SA_ONSTACK 0x08000000 #define SA_RESTART 0x10000000 #define SA_NODEFER 0x40000000 diff --git a/arch/sparc/include/asm/signal.h b/arch/sparc/include/asm/signal.h index e49b828..26cbd7e 100644 --- a/arch/sparc/include/asm/signal.h +++ b/arch/sparc/include/asm/signal.h @@ -146,7 +146,7 @@ struct sigstack { #define SA_NOMASK 0x20u #define SA_NOCLDWAIT 0x100u #define SA_SIGINFO 0x200u - +#define SA_CLDREBOOT 0x400u #define SIG_BLOCK 0x01 /* for blocking signals */ #define SIG_UNBLOCK 0x02 /* for unblocking signals */ diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h index 598457c..83696a5 100644 --- a/arch/x86/include/asm/signal.h +++ b/arch/x86/include/asm/signal.h @@ -90,6 +90,7 @@ typedef unsigned long sigset_t; * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_NODEFER prevents the current signal from being masked in the handler. * * SA_ONESHOT and SA_NOMASK are the historical Linux names for the Single @@ -98,6 +99,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001u #define SA_NOCLDWAIT 0x00000002u #define SA_SIGINFO 0x00000004u +#define SA_CLDREBOOT 0x00000008u #define SA_ONSTACK 0x08000000u #define SA_RESTART 0x10000000u #define SA_NODEFER 0x40000000u diff --git a/arch/xtensa/include/asm/signal.h b/arch/xtensa/include/asm/signal.h index 633ba73..5961f12 100644 --- a/arch/xtensa/include/asm/signal.h +++ b/arch/xtensa/include/asm/signal.h @@ -77,6 +77,7 @@ typedef struct { * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -87,6 +88,7 @@ typedef struct { #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 /* not supported yet */ #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000008 #define SA_ONSTACK 0x08000000 #define SA_RESTART 0x10000000 #define SA_NODEFER 0x40000000 diff --git a/include/asm-generic/signal.h b/include/asm-generic/signal.h index 555c0ae..504f910 100644 --- a/include/asm-generic/signal.h +++ b/include/asm-generic/signal.h @@ -57,6 +57,7 @@ * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_REBOOT flag to turn on SIGCHLD when pid namespace reboots * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -67,6 +68,7 @@ #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000004 #define SA_ONSTACK 0x08000000 #define SA_RESTART 0x10000000 #define SA_NODEFER 0x40000000 -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 63+ messages in thread
* [PATCH 2/2] Notify container-init parent a 'reboot' occured [not found] ` <1313094241-3674-1-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> 2011-08-11 20:24 ` [PATCH 1/2] add SA_CLDREBOOT flag Daniel Lezcano @ 2011-08-11 20:24 ` Daniel Lezcano 2011-08-14 16:17 ` [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot Oleg Nesterov 2 siblings, 0 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-11 20:24 UTC (permalink / raw) To: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, oleg-6lXkIZvqkOAvJsYlp49lxw, linux-kernel-u79uwXL29TY76Z2rM5mHXA When the reboot syscall is called and the pid namespace where the calling process belongs to is not from the init pidns, we send a SIGCHLD with CLD_REBOOTED to the parent of this pid namespace. Signed-off-by: Daniel Lezcano <daniel.lezcano-GANU6spQydw@public.gmane.org> --- include/asm-generic/siginfo.h | 3 ++- include/linux/sched.h | 1 + kernel/signal.c | 40 ++++++++++++++++++++++++++++++++++++++++ kernel/sys.c | 20 ++++++++++++++++++-- 4 files changed, 61 insertions(+), 3 deletions(-) diff --git a/include/asm-generic/siginfo.h b/include/asm-generic/siginfo.h index 0dd4e87..9bff4a2 100644 --- a/include/asm-generic/siginfo.h +++ b/include/asm-generic/siginfo.h @@ -218,7 +218,8 @@ typedef struct siginfo { #define CLD_TRAPPED (__SI_CHLD|4) /* traced child has trapped */ #define CLD_STOPPED (__SI_CHLD|5) /* child has stopped */ #define CLD_CONTINUED (__SI_CHLD|6) /* stopped child has continued */ -#define NSIGCHLD 6 +#define CLD_REBOOTED (__SI_CHLD|7) /* process was killed by a reboot */ +#define NSIGCHLD 7 /* * SIGPOLL si_codes diff --git a/include/linux/sched.h b/include/linux/sched.h index 20b03bf..c62dc9e 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -2170,6 +2170,7 @@ extern int kill_pgrp(struct pid *pid, int sig, int priv); extern int kill_pid(struct pid *pid, int sig, int priv); extern int kill_proc_info(int, struct siginfo *, pid_t); extern __must_check bool do_notify_parent(struct task_struct *, int); +extern void do_notify_parent_cldreboot(struct task_struct *, int, char *); extern void __wake_up_parent(struct task_struct *p, struct task_struct *parent); extern void force_sig(int, struct task_struct *); extern int send_sig(int, struct task_struct *, int); diff --git a/kernel/signal.c b/kernel/signal.c index 291c970..7d3d44c 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -1668,6 +1668,46 @@ bool do_notify_parent(struct task_struct *tsk, int sig) return autoreap; } +void do_notify_parent_cldreboot(struct task_struct *tsk, int why, char *buffer) +{ + struct siginfo info = { }; + struct task_struct *parent; + struct sighand_struct *sighand; + unsigned long flags; + + if (tsk->ptrace) + parent = tsk->parent; + else { + tsk = tsk->group_leader; + parent = tsk->real_parent; + } + + info.si_signo = SIGCHLD; + info.si_errno = 0; + info.si_status = why; + + rcu_read_lock(); + info.si_pid = task_pid_nr_ns(tsk, parent->nsproxy->pid_ns); + info.si_uid = __task_cred(tsk)->uid; + rcu_read_unlock(); + + info.si_utime = cputime_to_clock_t(tsk->utime); + info.si_stime = cputime_to_clock_t(tsk->stime); + + info.si_code = CLD_REBOOTED; + + sighand = parent->sighand; + spin_lock_irqsave(&sighand->siglock, flags); + if (sighand->action[SIGCHLD-1].sa.sa_handler != SIG_IGN && + sighand->action[SIGCHLD-1].sa.sa_flags & SA_CLDREBOOT) + __group_send_sig_info(SIGCHLD, &info, parent); + /* + * Even if SIGCHLD is not generated, we must wake up wait4 calls. + */ + __wake_up_parent(tsk, parent); + spin_unlock_irqrestore(&sighand->siglock, flags); +} + /** * do_notify_parent_cldstop - notify parent of stopped/continued state change * @tsk: task reporting the state change diff --git a/kernel/sys.c b/kernel/sys.c index a101ba3..8f7a9ed 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -42,6 +42,7 @@ #include <linux/syscalls.h> #include <linux/kprobes.h> #include <linux/user_namespace.h> +#include <linux/sched.h> #include <linux/kmsg_dump.h> @@ -411,6 +412,13 @@ void kernel_power_off(void) } EXPORT_SYMBOL_GPL(kernel_power_off); +static void pid_namespace_reboot(struct pid_namespace *pid_ns, + int cmd, char *buffer) +{ + struct task_struct *tsk = pid_ns->child_reaper; + do_notify_parent_cldreboot(tsk, cmd, buffer); +} + static DEFINE_MUTEX(reboot_mutex); /* @@ -426,10 +434,18 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, { char buffer[256]; int ret = 0; + struct pid_namespace *pid_ns = current->nsproxy->pid_ns; + + /* We only trust the superuser with rebooting the system. */ + if (!capable(CAP_SYS_BOOT)) { + /* If we are not in the initial pid namespace, we send a signal + * to the parent of this init pid namespace, notifying a shutdown + * occured */ + if (pid_ns != &init_pid_ns) + pid_namespace_reboot(pid_ns, cmd, buffer); - /* We only trust the superuser with rebooting the system. */ - if (!capable(CAP_SYS_BOOT)) return -EPERM; + } /* For safety, we require "magic" arguments. */ if (magic1 != LINUX_REBOOT_MAGIC1 || -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <1313094241-3674-1-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> 2011-08-11 20:24 ` [PATCH 1/2] add SA_CLDREBOOT flag Daniel Lezcano 2011-08-11 20:24 ` [PATCH 2/2] Notify container-init parent a 'reboot' occured Daniel Lezcano @ 2011-08-14 16:17 ` Oleg Nesterov 2 siblings, 0 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-14 16:17 UTC (permalink / raw) To: Daniel Lezcano Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, linux-kernel-u79uwXL29TY76Z2rM5mHXA On 08/11, Daniel Lezcano wrote: > > In the case of a VPS, when we shutdown/halt/reboot the container, the > reboot utility will invoke the sys_reboot syscall which has the bad > effect to reboot the host. Stupid question. Can't sys_reboot() simply kill init (and thus the whole pid_ns) in this case? Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
* [PATCH 1/2] add SA_CLDREBOOT flag 2011-08-11 20:23 [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot Daniel Lezcano [not found] ` <1313094241-3674-1-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> @ 2011-08-11 20:24 ` Daniel Lezcano [not found] ` <1313094241-3674-2-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> 2011-08-14 16:15 ` Oleg Nesterov 2011-08-11 20:24 ` [PATCH 2/2] Notify container-init parent a 'reboot' occured Daniel Lezcano 2011-08-14 16:17 ` [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot Oleg Nesterov 3 siblings, 2 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-11 20:24 UTC (permalink / raw) To: akpm; +Cc: oleg, bonbons, containers, linux-kernel, serge Add the SA_CLDREBOOT flag to enable the CLD_REBOOT on SIGCHLD when a process of a child pid namespace calls the reboot syscall. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> --- arch/alpha/include/asm/signal.h | 2 ++ arch/arm/include/asm/signal.h | 2 ++ arch/avr32/include/asm/signal.h | 2 ++ arch/cris/include/asm/signal.h | 2 ++ arch/h8300/include/asm/signal.h | 2 ++ arch/ia64/include/asm/signal.h | 2 ++ arch/m32r/include/asm/signal.h | 2 ++ arch/m68k/include/asm/signal.h | 2 ++ arch/mips/include/asm/signal.h | 2 ++ arch/mn10300/include/asm/signal.h | 2 ++ arch/parisc/include/asm/signal.h | 2 ++ arch/powerpc/include/asm/signal.h | 2 ++ arch/s390/include/asm/signal.h | 2 ++ arch/sparc/include/asm/signal.h | 2 +- arch/x86/include/asm/signal.h | 2 ++ arch/xtensa/include/asm/signal.h | 2 ++ include/asm-generic/signal.h | 2 ++ 17 files changed, 33 insertions(+), 1 deletions(-) diff --git a/arch/alpha/include/asm/signal.h b/arch/alpha/include/asm/signal.h index a938830..40a2285 100644 --- a/arch/alpha/include/asm/signal.h +++ b/arch/alpha/include/asm/signal.h @@ -82,6 +82,7 @@ typedef unsigned long sigset_t; * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * * SA_ONESHOT and SA_NOMASK are the historical Linux names for the Single * Unix names RESETHAND and NODEFER respectively. @@ -94,6 +95,7 @@ typedef unsigned long sigset_t; #define SA_RESETHAND 0x00000010 #define SA_NOCLDWAIT 0x00000020 #define SA_SIGINFO 0x00000040 +#define SA_CLDREBOOT 0x00000080 #define SA_ONESHOT SA_RESETHAND #define SA_NOMASK SA_NODEFER diff --git a/arch/arm/include/asm/signal.h b/arch/arm/include/asm/signal.h index 43ba0fb..2b585ec 100644 --- a/arch/arm/include/asm/signal.h +++ b/arch/arm/include/asm/signal.h @@ -78,6 +78,7 @@ typedef unsigned long sigset_t; * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_SIGINFO deliver the signal with SIGINFO structs + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_THIRTYTWO delivers the signal in 32-bit mode, even if the task * is running in 26-bit. * SA_ONSTACK allows alternate signal stacks (see sigaltstack(2)). @@ -91,6 +92,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000008 #define SA_THIRTYTWO 0x02000000 #define SA_RESTORER 0x04000000 #define SA_ONSTACK 0x08000000 diff --git a/arch/avr32/include/asm/signal.h b/arch/avr32/include/asm/signal.h index 8790dfc..a025546 100644 --- a/arch/avr32/include/asm/signal.h +++ b/arch/avr32/include/asm/signal.h @@ -83,6 +83,7 @@ typedef unsigned long sigset_t; * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_SIGINFO deliver the signal with SIGINFO structs + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NODEFER prevents the current signal from being masked in the handler. @@ -94,6 +95,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000008 #define SA_RESTORER 0x04000000 #define SA_ONSTACK 0x08000000 #define SA_RESTART 0x10000000 diff --git a/arch/cris/include/asm/signal.h b/arch/cris/include/asm/signal.h index ea6af9a..fa804f0 100644 --- a/arch/cris/include/asm/signal.h +++ b/arch/cris/include/asm/signal.h @@ -76,6 +76,7 @@ typedef unsigned long sigset_t; * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -87,6 +88,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001u #define SA_NOCLDWAIT 0x00000002u #define SA_SIGINFO 0x00000004u +#define SA_CLDREBOOT 0x00000008u #define SA_ONSTACK 0x08000000u #define SA_RESTART 0x10000000u #define SA_NODEFER 0x40000000u diff --git a/arch/h8300/include/asm/signal.h b/arch/h8300/include/asm/signal.h index fd8b66e..13b4b5a 100644 --- a/arch/h8300/include/asm/signal.h +++ b/arch/h8300/include/asm/signal.h @@ -76,6 +76,7 @@ typedef unsigned long sigset_t; * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -86,6 +87,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 /* not supported yet */ #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000008 #define SA_ONSTACK 0x08000000 #define SA_RESTART 0x10000000 #define SA_NODEFER 0x40000000 diff --git a/arch/ia64/include/asm/signal.h b/arch/ia64/include/asm/signal.h index b166248..30245a1 100644 --- a/arch/ia64/include/asm/signal.h +++ b/arch/ia64/include/asm/signal.h @@ -58,6 +58,7 @@ * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -68,6 +69,7 @@ #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000008 #define SA_ONSTACK 0x08000000 #define SA_RESTART 0x10000000 #define SA_NODEFER 0x40000000 diff --git a/arch/m32r/include/asm/signal.h b/arch/m32r/include/asm/signal.h index b2eeb0d..c547212 100644 --- a/arch/m32r/include/asm/signal.h +++ b/arch/m32r/include/asm/signal.h @@ -78,6 +78,7 @@ typedef unsigned long sigset_t; * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -88,6 +89,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001u #define SA_NOCLDWAIT 0x00000002u #define SA_SIGINFO 0x00000004u +#define SA_CLDREBOOT 0x00000008u #define SA_ONSTACK 0x08000000u #define SA_RESTART 0x10000000u #define SA_NODEFER 0x40000000u diff --git a/arch/m68k/include/asm/signal.h b/arch/m68k/include/asm/signal.h index 60e8866..14334b2 100644 --- a/arch/m68k/include/asm/signal.h +++ b/arch/m68k/include/asm/signal.h @@ -76,6 +76,7 @@ typedef unsigned long sigset_t; * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -86,6 +87,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000008 #define SA_ONSTACK 0x08000000 #define SA_RESTART 0x10000000 #define SA_NODEFER 0x40000000 diff --git a/arch/mips/include/asm/signal.h b/arch/mips/include/asm/signal.h index c783f36..acf7494 100644 --- a/arch/mips/include/asm/signal.h +++ b/arch/mips/include/asm/signal.h @@ -66,6 +66,7 @@ typedef unsigned long old_sigset_t; /* at least 32 bits */ * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -80,6 +81,7 @@ typedef unsigned long old_sigset_t; /* at least 32 bits */ #define SA_NODEFER 0x40000000 #define SA_NOCLDWAIT 0x00010000 #define SA_NOCLDSTOP 0x00000001 +#define SA_CLDREBOOT 0x00000002 #define SA_NOMASK SA_NODEFER #define SA_ONESHOT SA_RESETHAND diff --git a/arch/mn10300/include/asm/signal.h b/arch/mn10300/include/asm/signal.h index 1865d72..ad1f454 100644 --- a/arch/mn10300/include/asm/signal.h +++ b/arch/mn10300/include/asm/signal.h @@ -86,6 +86,7 @@ typedef unsigned long sigset_t; * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -96,6 +97,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001U #define SA_NOCLDWAIT 0x00000002U #define SA_SIGINFO 0x00000004U +#define SA_CLDREBOOT 0x00000008U #define SA_ONSTACK 0x08000000U #define SA_RESTART 0x10000000U #define SA_NODEFER 0x40000000U diff --git a/arch/parisc/include/asm/signal.h b/arch/parisc/include/asm/signal.h index c203563..85c2fec 100644 --- a/arch/parisc/include/asm/signal.h +++ b/arch/parisc/include/asm/signal.h @@ -50,6 +50,7 @@ * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -58,6 +59,7 @@ * Unix names RESETHAND and NODEFER respectively. */ #define SA_ONSTACK 0x00000001 +#define SA_CLDREBOOT 0x00000002 #define SA_RESETHAND 0x00000004 #define SA_NOCLDSTOP 0x00000008 #define SA_SIGINFO 0x00000010 diff --git a/arch/powerpc/include/asm/signal.h b/arch/powerpc/include/asm/signal.h index 3eb13be..f96e2d5 100644 --- a/arch/powerpc/include/asm/signal.h +++ b/arch/powerpc/include/asm/signal.h @@ -65,6 +65,7 @@ typedef struct { * SA_ONSTACK is not currently supported, but will allow sigaltstack(2). * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -75,6 +76,7 @@ typedef struct { #define SA_NOCLDSTOP 0x00000001U #define SA_NOCLDWAIT 0x00000002U #define SA_SIGINFO 0x00000004U +#define SA_CLDREBOOT 0x00000008U #define SA_ONSTACK 0x08000000U #define SA_RESTART 0x10000000U #define SA_NODEFER 0x40000000U diff --git a/arch/s390/include/asm/signal.h b/arch/s390/include/asm/signal.h index cdf5cb2..449a2dc 100644 --- a/arch/s390/include/asm/signal.h +++ b/arch/s390/include/asm/signal.h @@ -86,6 +86,7 @@ typedef unsigned long sigset_t; * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -96,6 +97,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000008 #define SA_ONSTACK 0x08000000 #define SA_RESTART 0x10000000 #define SA_NODEFER 0x40000000 diff --git a/arch/sparc/include/asm/signal.h b/arch/sparc/include/asm/signal.h index e49b828..26cbd7e 100644 --- a/arch/sparc/include/asm/signal.h +++ b/arch/sparc/include/asm/signal.h @@ -146,7 +146,7 @@ struct sigstack { #define SA_NOMASK 0x20u #define SA_NOCLDWAIT 0x100u #define SA_SIGINFO 0x200u - +#define SA_CLDREBOOT 0x400u #define SIG_BLOCK 0x01 /* for blocking signals */ #define SIG_UNBLOCK 0x02 /* for unblocking signals */ diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h index 598457c..83696a5 100644 --- a/arch/x86/include/asm/signal.h +++ b/arch/x86/include/asm/signal.h @@ -90,6 +90,7 @@ typedef unsigned long sigset_t; * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_NODEFER prevents the current signal from being masked in the handler. * * SA_ONESHOT and SA_NOMASK are the historical Linux names for the Single @@ -98,6 +99,7 @@ typedef unsigned long sigset_t; #define SA_NOCLDSTOP 0x00000001u #define SA_NOCLDWAIT 0x00000002u #define SA_SIGINFO 0x00000004u +#define SA_CLDREBOOT 0x00000008u #define SA_ONSTACK 0x08000000u #define SA_RESTART 0x10000000u #define SA_NODEFER 0x40000000u diff --git a/arch/xtensa/include/asm/signal.h b/arch/xtensa/include/asm/signal.h index 633ba73..5961f12 100644 --- a/arch/xtensa/include/asm/signal.h +++ b/arch/xtensa/include/asm/signal.h @@ -77,6 +77,7 @@ typedef struct { * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_CLDREBOOT flag to turn on SIGCHLD when pid namespace calls reboot * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -87,6 +88,7 @@ typedef struct { #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 /* not supported yet */ #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000008 #define SA_ONSTACK 0x08000000 #define SA_RESTART 0x10000000 #define SA_NODEFER 0x40000000 diff --git a/include/asm-generic/signal.h b/include/asm-generic/signal.h index 555c0ae..504f910 100644 --- a/include/asm-generic/signal.h +++ b/include/asm-generic/signal.h @@ -57,6 +57,7 @@ * SA_ONSTACK indicates that a registered stack_t will be used. * SA_RESTART flag to get restarting signals (which were the default long ago) * SA_NOCLDSTOP flag to turn off SIGCHLD when children stop. + * SA_REBOOT flag to turn on SIGCHLD when pid namespace reboots * SA_RESETHAND clears the handler when the signal is delivered. * SA_NOCLDWAIT flag on SIGCHLD to inhibit zombies. * SA_NODEFER prevents the current signal from being masked in the handler. @@ -67,6 +68,7 @@ #define SA_NOCLDSTOP 0x00000001 #define SA_NOCLDWAIT 0x00000002 #define SA_SIGINFO 0x00000004 +#define SA_CLDREBOOT 0x00000004 #define SA_ONSTACK 0x08000000 #define SA_RESTART 0x10000000 #define SA_NODEFER 0x40000000 -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 63+ messages in thread
[parent not found: <1313094241-3674-2-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org>]
* Re: [PATCH 1/2] add SA_CLDREBOOT flag [not found] ` <1313094241-3674-2-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> @ 2011-08-14 16:15 ` Oleg Nesterov 0 siblings, 0 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-14 16:15 UTC (permalink / raw) To: Daniel Lezcano Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, linux-kernel-u79uwXL29TY76Z2rM5mHXA On 08/11, Daniel Lezcano wrote: > > Add the SA_CLDREBOOT flag to enable the CLD_REBOOT on SIGCHLD > when a process of a child pid namespace calls the reboot syscall. I doubt this is really useful in general. SIGCHLD doesn't queue, if the parent of init already has the pending SIGCHLD it won't see CLD_REBOOTED anyway. Anyway, you do not need a special helper to fill siginfo. And personally I strongly dislike SA_CLDREBOOT. For what? Imho, this is unnecessary complication. If we want SIGCHLD-on-reboot, just send it unconditionally. Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 1/2] add SA_CLDREBOOT flag 2011-08-11 20:24 ` [PATCH 1/2] add SA_CLDREBOOT flag Daniel Lezcano [not found] ` <1313094241-3674-2-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> @ 2011-08-14 16:15 ` Oleg Nesterov [not found] ` <20110814161532.GA30846-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2011-08-14 16:36 ` Bruno Prémont 1 sibling, 2 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-14 16:15 UTC (permalink / raw) To: Daniel Lezcano; +Cc: akpm, bonbons, containers, linux-kernel, serge On 08/11, Daniel Lezcano wrote: > > Add the SA_CLDREBOOT flag to enable the CLD_REBOOT on SIGCHLD > when a process of a child pid namespace calls the reboot syscall. I doubt this is really useful in general. SIGCHLD doesn't queue, if the parent of init already has the pending SIGCHLD it won't see CLD_REBOOTED anyway. Anyway, you do not need a special helper to fill siginfo. And personally I strongly dislike SA_CLDREBOOT. For what? Imho, this is unnecessary complication. If we want SIGCHLD-on-reboot, just send it unconditionally. Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <20110814161532.GA30846-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 1/2] add SA_CLDREBOOT flag [not found] ` <20110814161532.GA30846-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2011-08-14 16:36 ` Bruno Prémont 0 siblings, 0 replies; 63+ messages in thread From: Bruno Prémont @ 2011-08-14 16:36 UTC (permalink / raw) To: Oleg Nesterov Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, linux-kernel-u79uwXL29TY76Z2rM5mHXA On Sun, 14 August 2011 Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > On 08/11, Daniel Lezcano wrote: > > > > Add the SA_CLDREBOOT flag to enable the CLD_REBOOT on SIGCHLD > > when a process of a child pid namespace calls the reboot syscall. > > I doubt this is really useful in general. SIGCHLD doesn't queue, > if the parent of init already has the pending SIGCHLD it won't see > CLD_REBOOTED anyway. Anyway, you do not need a special helper to > fill siginfo. > > And personally I strongly dislike SA_CLDREBOOT. For what? Imho, this > is unnecessary complication. If we want SIGCHLD-on-reboot, just send > it unconditionally. The information that is needed is what options the container did pass to sys_reboot(). Did the container want a reboot or a power-off (or maybe just touch the LINUX_REBOOT_CMD_CAD_* status)? As long as that information can reach parent/reaper of container's init the needed functionality should be there. Bruno ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 1/2] add SA_CLDREBOOT flag 2011-08-14 16:15 ` Oleg Nesterov [not found] ` <20110814161532.GA30846-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2011-08-14 16:36 ` Bruno Prémont 2011-08-14 17:10 ` Oleg Nesterov [not found] ` <20110814183611.05937c96-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> 1 sibling, 2 replies; 63+ messages in thread From: Bruno Prémont @ 2011-08-14 16:36 UTC (permalink / raw) To: Oleg Nesterov; +Cc: Daniel Lezcano, containers, akpm, linux-kernel On Sun, 14 August 2011 Oleg Nesterov <oleg@redhat.com> wrote: > On 08/11, Daniel Lezcano wrote: > > > > Add the SA_CLDREBOOT flag to enable the CLD_REBOOT on SIGCHLD > > when a process of a child pid namespace calls the reboot syscall. > > I doubt this is really useful in general. SIGCHLD doesn't queue, > if the parent of init already has the pending SIGCHLD it won't see > CLD_REBOOTED anyway. Anyway, you do not need a special helper to > fill siginfo. > > And personally I strongly dislike SA_CLDREBOOT. For what? Imho, this > is unnecessary complication. If we want SIGCHLD-on-reboot, just send > it unconditionally. The information that is needed is what options the container did pass to sys_reboot(). Did the container want a reboot or a power-off (or maybe just touch the LINUX_REBOOT_CMD_CAD_* status)? As long as that information can reach parent/reaper of container's init the needed functionality should be there. Bruno ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 1/2] add SA_CLDREBOOT flag 2011-08-14 16:36 ` Bruno Prémont @ 2011-08-14 17:10 ` Oleg Nesterov [not found] ` <20110814183611.05937c96-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> 1 sibling, 0 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-14 17:10 UTC (permalink / raw) To: Bruno Prémont; +Cc: Daniel Lezcano, containers, akpm, linux-kernel On 08/14, Bruno Prémont wrote: > > On Sun, 14 August 2011 Oleg Nesterov <oleg@redhat.com> wrote: > > On 08/11, Daniel Lezcano wrote: > > > > > > Add the SA_CLDREBOOT flag to enable the CLD_REBOOT on SIGCHLD > > > when a process of a child pid namespace calls the reboot syscall. > > > > I doubt this is really useful in general. SIGCHLD doesn't queue, > > if the parent of init already has the pending SIGCHLD it won't see > > CLD_REBOOTED anyway. Anyway, you do not need a special helper to > > fill siginfo. > > > > And personally I strongly dislike SA_CLDREBOOT. For what? Imho, this > > is unnecessary complication. If we want SIGCHLD-on-reboot, just send > > it unconditionally. > > The information that is needed is what options the container did pass > to sys_reboot(). > Did the container want a reboot or a power-off (or maybe just touch the > LINUX_REBOOT_CMD_CAD_* status)? > > As long as that information can reach parent/reaper of container's init > the needed functionality should be there. Sorry, can't understand. Once again, why do you want SA_CLDREBOOT? OK, you may be need to pass some info via siginfo (but note that it can be lost, and you do not need a special helper for that). But why do you want this hack in sa_flags? Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <20110814183611.05937c96-hY15tx4IgV39zxVx7UNMDg@public.gmane.org>]
* Re: [PATCH 1/2] add SA_CLDREBOOT flag [not found] ` <20110814183611.05937c96-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> @ 2011-08-14 17:10 ` Oleg Nesterov 0 siblings, 0 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-14 17:10 UTC (permalink / raw) To: Bruno Prémont Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, linux-kernel-u79uwXL29TY76Z2rM5mHXA On 08/14, Bruno Prémont wrote: > > On Sun, 14 August 2011 Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > > On 08/11, Daniel Lezcano wrote: > > > > > > Add the SA_CLDREBOOT flag to enable the CLD_REBOOT on SIGCHLD > > > when a process of a child pid namespace calls the reboot syscall. > > > > I doubt this is really useful in general. SIGCHLD doesn't queue, > > if the parent of init already has the pending SIGCHLD it won't see > > CLD_REBOOTED anyway. Anyway, you do not need a special helper to > > fill siginfo. > > > > And personally I strongly dislike SA_CLDREBOOT. For what? Imho, this > > is unnecessary complication. If we want SIGCHLD-on-reboot, just send > > it unconditionally. > > The information that is needed is what options the container did pass > to sys_reboot(). > Did the container want a reboot or a power-off (or maybe just touch the > LINUX_REBOOT_CMD_CAD_* status)? > > As long as that information can reach parent/reaper of container's init > the needed functionality should be there. Sorry, can't understand. Once again, why do you want SA_CLDREBOOT? OK, you may be need to pass some info via siginfo (but note that it can be lost, and you do not need a special helper for that). But why do you want this hack in sa_flags? Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
* [PATCH 2/2] Notify container-init parent a 'reboot' occured 2011-08-11 20:23 [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot Daniel Lezcano [not found] ` <1313094241-3674-1-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> 2011-08-11 20:24 ` [PATCH 1/2] add SA_CLDREBOOT flag Daniel Lezcano @ 2011-08-11 20:24 ` Daniel Lezcano 2011-08-11 21:09 ` Serge Hallyn ` (3 more replies) 2011-08-14 16:17 ` [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot Oleg Nesterov 3 siblings, 4 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-11 20:24 UTC (permalink / raw) To: akpm; +Cc: oleg, bonbons, containers, linux-kernel, serge When the reboot syscall is called and the pid namespace where the calling process belongs to is not from the init pidns, we send a SIGCHLD with CLD_REBOOTED to the parent of this pid namespace. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> --- include/asm-generic/siginfo.h | 3 ++- include/linux/sched.h | 1 + kernel/signal.c | 40 ++++++++++++++++++++++++++++++++++++++++ kernel/sys.c | 20 ++++++++++++++++++-- 4 files changed, 61 insertions(+), 3 deletions(-) diff --git a/include/asm-generic/siginfo.h b/include/asm-generic/siginfo.h index 0dd4e87..9bff4a2 100644 --- a/include/asm-generic/siginfo.h +++ b/include/asm-generic/siginfo.h @@ -218,7 +218,8 @@ typedef struct siginfo { #define CLD_TRAPPED (__SI_CHLD|4) /* traced child has trapped */ #define CLD_STOPPED (__SI_CHLD|5) /* child has stopped */ #define CLD_CONTINUED (__SI_CHLD|6) /* stopped child has continued */ -#define NSIGCHLD 6 +#define CLD_REBOOTED (__SI_CHLD|7) /* process was killed by a reboot */ +#define NSIGCHLD 7 /* * SIGPOLL si_codes diff --git a/include/linux/sched.h b/include/linux/sched.h index 20b03bf..c62dc9e 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -2170,6 +2170,7 @@ extern int kill_pgrp(struct pid *pid, int sig, int priv); extern int kill_pid(struct pid *pid, int sig, int priv); extern int kill_proc_info(int, struct siginfo *, pid_t); extern __must_check bool do_notify_parent(struct task_struct *, int); +extern void do_notify_parent_cldreboot(struct task_struct *, int, char *); extern void __wake_up_parent(struct task_struct *p, struct task_struct *parent); extern void force_sig(int, struct task_struct *); extern int send_sig(int, struct task_struct *, int); diff --git a/kernel/signal.c b/kernel/signal.c index 291c970..7d3d44c 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -1668,6 +1668,46 @@ bool do_notify_parent(struct task_struct *tsk, int sig) return autoreap; } +void do_notify_parent_cldreboot(struct task_struct *tsk, int why, char *buffer) +{ + struct siginfo info = { }; + struct task_struct *parent; + struct sighand_struct *sighand; + unsigned long flags; + + if (tsk->ptrace) + parent = tsk->parent; + else { + tsk = tsk->group_leader; + parent = tsk->real_parent; + } + + info.si_signo = SIGCHLD; + info.si_errno = 0; + info.si_status = why; + + rcu_read_lock(); + info.si_pid = task_pid_nr_ns(tsk, parent->nsproxy->pid_ns); + info.si_uid = __task_cred(tsk)->uid; + rcu_read_unlock(); + + info.si_utime = cputime_to_clock_t(tsk->utime); + info.si_stime = cputime_to_clock_t(tsk->stime); + + info.si_code = CLD_REBOOTED; + + sighand = parent->sighand; + spin_lock_irqsave(&sighand->siglock, flags); + if (sighand->action[SIGCHLD-1].sa.sa_handler != SIG_IGN && + sighand->action[SIGCHLD-1].sa.sa_flags & SA_CLDREBOOT) + __group_send_sig_info(SIGCHLD, &info, parent); + /* + * Even if SIGCHLD is not generated, we must wake up wait4 calls. + */ + __wake_up_parent(tsk, parent); + spin_unlock_irqrestore(&sighand->siglock, flags); +} + /** * do_notify_parent_cldstop - notify parent of stopped/continued state change * @tsk: task reporting the state change diff --git a/kernel/sys.c b/kernel/sys.c index a101ba3..8f7a9ed 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -42,6 +42,7 @@ #include <linux/syscalls.h> #include <linux/kprobes.h> #include <linux/user_namespace.h> +#include <linux/sched.h> #include <linux/kmsg_dump.h> @@ -411,6 +412,13 @@ void kernel_power_off(void) } EXPORT_SYMBOL_GPL(kernel_power_off); +static void pid_namespace_reboot(struct pid_namespace *pid_ns, + int cmd, char *buffer) +{ + struct task_struct *tsk = pid_ns->child_reaper; + do_notify_parent_cldreboot(tsk, cmd, buffer); +} + static DEFINE_MUTEX(reboot_mutex); /* @@ -426,10 +434,18 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, { char buffer[256]; int ret = 0; + struct pid_namespace *pid_ns = current->nsproxy->pid_ns; + + /* We only trust the superuser with rebooting the system. */ + if (!capable(CAP_SYS_BOOT)) { + /* If we are not in the initial pid namespace, we send a signal + * to the parent of this init pid namespace, notifying a shutdown + * occured */ + if (pid_ns != &init_pid_ns) + pid_namespace_reboot(pid_ns, cmd, buffer); - /* We only trust the superuser with rebooting the system. */ - if (!capable(CAP_SYS_BOOT)) return -EPERM; + } /* For safety, we require "magic" arguments. */ if (magic1 != LINUX_REBOOT_MAGIC1 || -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 63+ messages in thread
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured 2011-08-11 20:24 ` [PATCH 2/2] Notify container-init parent a 'reboot' occured Daniel Lezcano @ 2011-08-11 21:09 ` Serge Hallyn 2011-08-11 21:30 ` Daniel Lezcano 2011-08-11 21:30 ` Daniel Lezcano [not found] ` <1313094241-3674-3-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> ` (2 subsequent siblings) 3 siblings, 2 replies; 63+ messages in thread From: Serge Hallyn @ 2011-08-11 21:09 UTC (permalink / raw) To: Daniel Lezcano; +Cc: akpm, containers, bonbons, oleg, linux-kernel Quoting Daniel Lezcano (daniel.lezcano@free.fr): > When the reboot syscall is called and the pid namespace where the calling > process belongs to is not from the init pidns, we send a SIGCHLD with CLD_REBOOTED > to the parent of this pid namespace. > > Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> ... > +void do_notify_parent_cldreboot(struct task_struct *tsk, int why, char *buffer) > +{ > + struct siginfo info = { }; > + struct task_struct *parent; > + struct sighand_struct *sighand; > + unsigned long flags; > + > + if (tsk->ptrace) > + parent = tsk->parent; > + else { > + tsk = tsk->group_leader; > + parent = tsk->real_parent; > + } > + > + info.si_signo = SIGCHLD; > + info.si_errno = 0; > + info.si_status = why; > + > + rcu_read_lock(); > + info.si_pid = task_pid_nr_ns(tsk, parent->nsproxy->pid_ns); > + info.si_uid = __task_cred(tsk)->uid; This eventually should become: info.si_uid = user_ns_map_uid(task_cred_xxx(t, user_ns), current_cred(), current_uid()); I've got a first-stab patch at converting the rest of kernel/signal.c in http://kernel.ubuntu.com/git?p=serge/userns-2.6.git > + rcu_read_unlock(); > + > + info.si_utime = cputime_to_clock_t(tsk->utime); > + info.si_stime = cputime_to_clock_t(tsk->stime); > + > + info.si_code = CLD_REBOOTED; > + > + sighand = parent->sighand; > + spin_lock_irqsave(&sighand->siglock, flags); > + if (sighand->action[SIGCHLD-1].sa.sa_handler != SIG_IGN && > + sighand->action[SIGCHLD-1].sa.sa_flags & SA_CLDREBOOT) > + __group_send_sig_info(SIGCHLD, &info, parent); > + /* > + * Even if SIGCHLD is not generated, we must wake up wait4 calls. > + */ > + __wake_up_parent(tsk, parent); > + spin_unlock_irqrestore(&sighand->siglock, flags); > +} ... > @@ -426,10 +434,18 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, > { > char buffer[256]; > int ret = 0; > + struct pid_namespace *pid_ns = current->nsproxy->pid_ns; > + > + /* We only trust the superuser with rebooting the system. */ > + if (!capable(CAP_SYS_BOOT)) { Doesn't this mean that an unprivileged task in a container can shut down the container? The pidns->user_ns patch I sent earlier today gives you what you need so that you can add if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT) return -EPERM; right here to prevent that. > + /* If we are not in the initial pid namespace, we send a signal > + * to the parent of this init pid namespace, notifying a shutdown > + * occured */ > + if (pid_ns != &init_pid_ns) > + pid_namespace_reboot(pid_ns, cmd, buffer); > > - /* We only trust the superuser with rebooting the system. */ > - if (!capable(CAP_SYS_BOOT)) > return -EPERM; > + } > > /* For safety, we require "magic" arguments. */ > if (magic1 != LINUX_REBOOT_MAGIC1 || > -- > 1.7.4.1 > > _______________________________________________ > Containers mailing list > Containers@lists.linux-foundation.org > https://lists.linux-foundation.org/mailman/listinfo/containers ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured 2011-08-11 21:09 ` Serge Hallyn @ 2011-08-11 21:30 ` Daniel Lezcano 2011-08-11 21:30 ` Daniel Lezcano 1 sibling, 0 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-11 21:30 UTC (permalink / raw) To: Serge Hallyn Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, oleg-6lXkIZvqkOAvJsYlp49lxw On 08/11/2011 11:09 PM, Serge Hallyn wrote: > Quoting Daniel Lezcano (daniel.lezcano-GANU6spQydw@public.gmane.org): >> When the reboot syscall is called and the pid namespace where the calling >> process belongs to is not from the init pidns, we send a SIGCHLD with CLD_REBOOTED >> to the parent of this pid namespace. >> >> Signed-off-by: Daniel Lezcano <daniel.lezcano-GANU6spQydw@public.gmane.org> > ... > >> +void do_notify_parent_cldreboot(struct task_struct *tsk, int why, char *buffer) >> +{ >> + struct siginfo info = { }; >> + struct task_struct *parent; >> + struct sighand_struct *sighand; >> + unsigned long flags; >> + >> + if (tsk->ptrace) >> + parent = tsk->parent; >> + else { >> + tsk = tsk->group_leader; >> + parent = tsk->real_parent; >> + } >> + >> + info.si_signo = SIGCHLD; >> + info.si_errno = 0; >> + info.si_status = why; >> + >> + rcu_read_lock(); >> + info.si_pid = task_pid_nr_ns(tsk, parent->nsproxy->pid_ns); >> + info.si_uid = __task_cred(tsk)->uid; > > This eventually should become: > > info.si_uid = user_ns_map_uid(task_cred_xxx(t, user_ns), > current_cred(), current_uid()); > > I've got a first-stab patch at converting the rest of > kernel/signal.c in http://kernel.ubuntu.com/git?p=serge/userns-2.6.git Ok, thanks. >> + rcu_read_unlock(); >> + >> + info.si_utime = cputime_to_clock_t(tsk->utime); >> + info.si_stime = cputime_to_clock_t(tsk->stime); >> + >> + info.si_code = CLD_REBOOTED; >> + >> + sighand = parent->sighand; >> + spin_lock_irqsave(&sighand->siglock, flags); >> + if (sighand->action[SIGCHLD-1].sa.sa_handler != SIG_IGN && >> + sighand->action[SIGCHLD-1].sa.sa_flags & SA_CLDREBOOT) >> + __group_send_sig_info(SIGCHLD, &info, parent); >> + /* >> + * Even if SIGCHLD is not generated, we must wake up wait4 calls. >> + */ >> + __wake_up_parent(tsk, parent); >> + spin_unlock_irqrestore(&sighand->siglock, flags); >> +} > ... > >> @@ -426,10 +434,18 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, >> { >> char buffer[256]; >> int ret = 0; >> + struct pid_namespace *pid_ns = current->nsproxy->pid_ns; >> + >> + /* We only trust the superuser with rebooting the system. */ >> + if (!capable(CAP_SYS_BOOT)) { > Doesn't this mean that an unprivileged task in a container can shut > down the container? Ha ha ! Right, good catch :) Yes, rethinking about it, we can do what initially proposed Bruno by just preventing to reboot when we are not in the init_pid_ns. Actually, the sys_reboot occurs after the services shutdown and "kill -1 SIGTERM" and "kill -1 SIGKILL", and would not make sense to do that in a child pid namespace, except if we are in a container where we don't want to reboot :) So IMO, it is safe to do: if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT)) return -EPERM; if (pid_ns != &init_pid_ns) return pid_namespace_reboot(pid_ns, cmd, buffer); > The pidns->user_ns patch I sent earlier today gives you what you need > so that you can add > > if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT) > return -EPERM; > > right here to prevent that. > >> + /* If we are not in the initial pid namespace, we send a signal >> + * to the parent of this init pid namespace, notifying a shutdown >> + * occured */ >> + if (pid_ns != &init_pid_ns) >> + pid_namespace_reboot(pid_ns, cmd, buffer); >> >> - /* We only trust the superuser with rebooting the system. */ >> - if (!capable(CAP_SYS_BOOT)) >> return -EPERM; >> + } >> >> /* For safety, we require "magic" arguments. */ >> if (magic1 != LINUX_REBOOT_MAGIC1 || >> -- >> 1.7.4.1 >> >> _______________________________________________ >> Containers mailing list >> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org >> https://lists.linux-foundation.org/mailman/listinfo/containers ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured 2011-08-11 21:09 ` Serge Hallyn 2011-08-11 21:30 ` Daniel Lezcano @ 2011-08-11 21:30 ` Daniel Lezcano [not found] ` <4E444A04.3070403-GANU6spQydw@public.gmane.org> 2011-08-11 21:50 ` Serge Hallyn 1 sibling, 2 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-11 21:30 UTC (permalink / raw) To: Serge Hallyn; +Cc: akpm, containers, bonbons, oleg, linux-kernel On 08/11/2011 11:09 PM, Serge Hallyn wrote: > Quoting Daniel Lezcano (daniel.lezcano@free.fr): >> When the reboot syscall is called and the pid namespace where the calling >> process belongs to is not from the init pidns, we send a SIGCHLD with CLD_REBOOTED >> to the parent of this pid namespace. >> >> Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> > ... > >> +void do_notify_parent_cldreboot(struct task_struct *tsk, int why, char *buffer) >> +{ >> + struct siginfo info = { }; >> + struct task_struct *parent; >> + struct sighand_struct *sighand; >> + unsigned long flags; >> + >> + if (tsk->ptrace) >> + parent = tsk->parent; >> + else { >> + tsk = tsk->group_leader; >> + parent = tsk->real_parent; >> + } >> + >> + info.si_signo = SIGCHLD; >> + info.si_errno = 0; >> + info.si_status = why; >> + >> + rcu_read_lock(); >> + info.si_pid = task_pid_nr_ns(tsk, parent->nsproxy->pid_ns); >> + info.si_uid = __task_cred(tsk)->uid; > > This eventually should become: > > info.si_uid = user_ns_map_uid(task_cred_xxx(t, user_ns), > current_cred(), current_uid()); > > I've got a first-stab patch at converting the rest of > kernel/signal.c in http://kernel.ubuntu.com/git?p=serge/userns-2.6.git Ok, thanks. >> + rcu_read_unlock(); >> + >> + info.si_utime = cputime_to_clock_t(tsk->utime); >> + info.si_stime = cputime_to_clock_t(tsk->stime); >> + >> + info.si_code = CLD_REBOOTED; >> + >> + sighand = parent->sighand; >> + spin_lock_irqsave(&sighand->siglock, flags); >> + if (sighand->action[SIGCHLD-1].sa.sa_handler != SIG_IGN && >> + sighand->action[SIGCHLD-1].sa.sa_flags & SA_CLDREBOOT) >> + __group_send_sig_info(SIGCHLD, &info, parent); >> + /* >> + * Even if SIGCHLD is not generated, we must wake up wait4 calls. >> + */ >> + __wake_up_parent(tsk, parent); >> + spin_unlock_irqrestore(&sighand->siglock, flags); >> +} > ... > >> @@ -426,10 +434,18 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, >> { >> char buffer[256]; >> int ret = 0; >> + struct pid_namespace *pid_ns = current->nsproxy->pid_ns; >> + >> + /* We only trust the superuser with rebooting the system. */ >> + if (!capable(CAP_SYS_BOOT)) { > Doesn't this mean that an unprivileged task in a container can shut > down the container? Ha ha ! Right, good catch :) Yes, rethinking about it, we can do what initially proposed Bruno by just preventing to reboot when we are not in the init_pid_ns. Actually, the sys_reboot occurs after the services shutdown and "kill -1 SIGTERM" and "kill -1 SIGKILL", and would not make sense to do that in a child pid namespace, except if we are in a container where we don't want to reboot :) So IMO, it is safe to do: if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT)) return -EPERM; if (pid_ns != &init_pid_ns) return pid_namespace_reboot(pid_ns, cmd, buffer); > The pidns->user_ns patch I sent earlier today gives you what you need > so that you can add > > if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT) > return -EPERM; > > right here to prevent that. > >> + /* If we are not in the initial pid namespace, we send a signal >> + * to the parent of this init pid namespace, notifying a shutdown >> + * occured */ >> + if (pid_ns != &init_pid_ns) >> + pid_namespace_reboot(pid_ns, cmd, buffer); >> >> - /* We only trust the superuser with rebooting the system. */ >> - if (!capable(CAP_SYS_BOOT)) >> return -EPERM; >> + } >> >> /* For safety, we require "magic" arguments. */ >> if (magic1 != LINUX_REBOOT_MAGIC1 || >> -- >> 1.7.4.1 >> >> _______________________________________________ >> Containers mailing list >> Containers@lists.linux-foundation.org >> https://lists.linux-foundation.org/mailman/listinfo/containers ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <4E444A04.3070403-GANU6spQydw@public.gmane.org>]
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured [not found] ` <4E444A04.3070403-GANU6spQydw@public.gmane.org> @ 2011-08-11 21:50 ` Serge Hallyn 2011-08-12 16:29 ` Serge Hallyn 1 sibling, 0 replies; 63+ messages in thread From: Serge Hallyn @ 2011-08-11 21:50 UTC (permalink / raw) To: Daniel Lezcano Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, oleg-6lXkIZvqkOAvJsYlp49lxw Quoting Daniel Lezcano (daniel.lezcano-GANU6spQydw@public.gmane.org): > On 08/11/2011 11:09 PM, Serge Hallyn wrote: > > Quoting Daniel Lezcano (daniel.lezcano-GANU6spQydw@public.gmane.org): > >> When the reboot syscall is called and the pid namespace where the calling > >> process belongs to is not from the init pidns, we send a SIGCHLD with CLD_REBOOTED > >> to the parent of this pid namespace. > >> > >> Signed-off-by: Daniel Lezcano <daniel.lezcano-GANU6spQydw@public.gmane.org> > > ... > > > >> +void do_notify_parent_cldreboot(struct task_struct *tsk, int why, char *buffer) > >> +{ > >> + struct siginfo info = { }; > >> + struct task_struct *parent; > >> + struct sighand_struct *sighand; > >> + unsigned long flags; > >> + > >> + if (tsk->ptrace) > >> + parent = tsk->parent; > >> + else { > >> + tsk = tsk->group_leader; > >> + parent = tsk->real_parent; > >> + } > >> + > >> + info.si_signo = SIGCHLD; > >> + info.si_errno = 0; > >> + info.si_status = why; > >> + > >> + rcu_read_lock(); > >> + info.si_pid = task_pid_nr_ns(tsk, parent->nsproxy->pid_ns); > >> + info.si_uid = __task_cred(tsk)->uid; > > > > This eventually should become: > > > > info.si_uid = user_ns_map_uid(task_cred_xxx(t, user_ns), > > current_cred(), current_uid()); > > > > I've got a first-stab patch at converting the rest of > > kernel/signal.c in http://kernel.ubuntu.com/git?p=serge/userns-2.6.git > > Ok, thanks. > > >> + rcu_read_unlock(); > >> + > >> + info.si_utime = cputime_to_clock_t(tsk->utime); > >> + info.si_stime = cputime_to_clock_t(tsk->stime); > >> + > >> + info.si_code = CLD_REBOOTED; > >> + > >> + sighand = parent->sighand; > >> + spin_lock_irqsave(&sighand->siglock, flags); > >> + if (sighand->action[SIGCHLD-1].sa.sa_handler != SIG_IGN && > >> + sighand->action[SIGCHLD-1].sa.sa_flags & SA_CLDREBOOT) > >> + __group_send_sig_info(SIGCHLD, &info, parent); > >> + /* > >> + * Even if SIGCHLD is not generated, we must wake up wait4 calls. > >> + */ > >> + __wake_up_parent(tsk, parent); > >> + spin_unlock_irqrestore(&sighand->siglock, flags); > >> +} > > ... > > > >> @@ -426,10 +434,18 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, > >> { > >> char buffer[256]; > >> int ret = 0; > >> + struct pid_namespace *pid_ns = current->nsproxy->pid_ns; > >> + > >> + /* We only trust the superuser with rebooting the system. */ > >> + if (!capable(CAP_SYS_BOOT)) { > > Doesn't this mean that an unprivileged task in a container can shut > > down the container? > > Ha ha ! Right, good catch :) > > Yes, rethinking about it, we can do what initially proposed Bruno by > just preventing to reboot when we are not in the init_pid_ns. Actually, > the sys_reboot occurs after the services shutdown and "kill -1 SIGTERM" > and "kill -1 SIGKILL", and would not make sense to do that in a child > pid namespace, except if we are in a container where we don't want to > reboot :) > > So IMO, it is safe to do: > > if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT)) > return -EPERM; That sounds good. Until the pid_ns->user_ns patch goes in, just capable(CAP_SYS_BOOT) works too. Actually, if this is the only thing CAP_SYS_BOOT grants you, and if it is always fully namespaced, then I'm not sure there'll ever be a reason to switch this to ns_capable(). thanks, -serge > if (pid_ns != &init_pid_ns) > return pid_namespace_reboot(pid_ns, cmd, buffer); > > > > The pidns->user_ns patch I sent earlier today gives you what you need > > so that you can add > > > > if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT) > > return -EPERM; > > > > right here to prevent that. > > > >> + /* If we are not in the initial pid namespace, we send a signal > >> + * to the parent of this init pid namespace, notifying a shutdown > >> + * occured */ > >> + if (pid_ns != &init_pid_ns) > >> + pid_namespace_reboot(pid_ns, cmd, buffer); > >> > >> - /* We only trust the superuser with rebooting the system. */ > >> - if (!capable(CAP_SYS_BOOT)) > >> return -EPERM; > >> + } > >> > >> /* For safety, we require "magic" arguments. */ > >> if (magic1 != LINUX_REBOOT_MAGIC1 || > >> -- > >> 1.7.4.1 > >> > >> _______________________________________________ > >> Containers mailing list > >> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org > >> https://lists.linux-foundation.org/mailman/listinfo/containers > ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured 2011-08-11 21:30 ` Daniel Lezcano @ 2011-08-12 16:29 ` Serge Hallyn 2011-08-11 21:50 ` Serge Hallyn 1 sibling, 0 replies; 63+ messages in thread From: Serge Hallyn @ 2011-08-12 16:29 UTC (permalink / raw) To: Daniel Lezcano Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, oleg-6lXkIZvqkOAvJsYlp49lxw Quoting Daniel Lezcano (daniel.lezcano-GANU6spQydw@public.gmane.org): ... > > Doesn't this mean that an unprivileged task in a container can shut > > down the container? > > Ha ha ! Right, good catch :) > > Yes, rethinking about it, we can do what initially proposed Bruno by > just preventing to reboot when we are not in the init_pid_ns. Actually, > the sys_reboot occurs after the services shutdown and "kill -1 SIGTERM" > and "kill -1 SIGKILL", and would not make sense to do that in a child > pid namespace, except if we are in a container where we don't want to > reboot :) > > So IMO, it is safe to do: > > if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT)) > return -EPERM; > > if (pid_ns != &init_pid_ns) > return pid_namespace_reboot(pid_ns, cmd, buffer); So I don't know if you want to prepend http://kernel.ubuntu.com/git?p=serge/linux-syslogns.git;a=commit;h=63556e9a39bcd75ec4a88333425800905013c73e to your patchset, or just check nsown_capable(CAP_SYS_BOOT) for now, but as soon as you resend with that I'll happily, nay, ecstatically ack. thanks, -serge ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured @ 2011-08-12 16:29 ` Serge Hallyn 0 siblings, 0 replies; 63+ messages in thread From: Serge Hallyn @ 2011-08-12 16:29 UTC (permalink / raw) To: Daniel Lezcano; +Cc: akpm, containers, bonbons, oleg, linux-kernel Quoting Daniel Lezcano (daniel.lezcano@free.fr): ... > > Doesn't this mean that an unprivileged task in a container can shut > > down the container? > > Ha ha ! Right, good catch :) > > Yes, rethinking about it, we can do what initially proposed Bruno by > just preventing to reboot when we are not in the init_pid_ns. Actually, > the sys_reboot occurs after the services shutdown and "kill -1 SIGTERM" > and "kill -1 SIGKILL", and would not make sense to do that in a child > pid namespace, except if we are in a container where we don't want to > reboot :) > > So IMO, it is safe to do: > > if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT)) > return -EPERM; > > if (pid_ns != &init_pid_ns) > return pid_namespace_reboot(pid_ns, cmd, buffer); So I don't know if you want to prepend http://kernel.ubuntu.com/git?p=serge/linux-syslogns.git;a=commit;h=63556e9a39bcd75ec4a88333425800905013c73e to your patchset, or just check nsown_capable(CAP_SYS_BOOT) for now, but as soon as you resend with that I'll happily, nay, ecstatically ack. thanks, -serge ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured 2011-08-12 16:29 ` Serge Hallyn (?) @ 2011-08-12 20:42 ` Daniel Lezcano [not found] ` <4E45904F.60904-GANU6spQydw@public.gmane.org> 2011-08-12 21:13 ` Serge Hallyn -1 siblings, 2 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-12 20:42 UTC (permalink / raw) To: Serge Hallyn; +Cc: akpm, containers, bonbons, oleg, linux-kernel On 08/12/2011 06:29 PM, Serge Hallyn wrote: > Quoting Daniel Lezcano (daniel.lezcano@free.fr): > ... >>> Doesn't this mean that an unprivileged task in a container can shut >>> down the container? >> Ha ha ! Right, good catch :) >> >> Yes, rethinking about it, we can do what initially proposed Bruno by >> just preventing to reboot when we are not in the init_pid_ns. Actually, >> the sys_reboot occurs after the services shutdown and "kill -1 SIGTERM" >> and "kill -1 SIGKILL", and would not make sense to do that in a child >> pid namespace, except if we are in a container where we don't want to >> reboot :) >> >> So IMO, it is safe to do: >> >> if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT)) >> return -EPERM; >> >> if (pid_ns != &init_pid_ns) >> return pid_namespace_reboot(pid_ns, cmd, buffer); > So I don't know if you want to prepend > http://kernel.ubuntu.com/git?p=serge/linux-syslogns.git;a=commit;h=63556e9a39bcd75ec4a88333425800905013c73e > to your patchset, or just check nsown_capable(CAP_SYS_BOOT) for now, > but as soon as you resend with that I'll happily, nay, > ecstatically ack. Ok, in order to not mix the functionnalities, I will send in a separate patch the nsown_capable change. ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <4E45904F.60904-GANU6spQydw@public.gmane.org>]
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured [not found] ` <4E45904F.60904-GANU6spQydw@public.gmane.org> @ 2011-08-12 21:13 ` Serge Hallyn 0 siblings, 0 replies; 63+ messages in thread From: Serge Hallyn @ 2011-08-12 21:13 UTC (permalink / raw) To: Daniel Lezcano Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, oleg-6lXkIZvqkOAvJsYlp49lxw Quoting Daniel Lezcano (daniel.lezcano-GANU6spQydw@public.gmane.org): > On 08/12/2011 06:29 PM, Serge Hallyn wrote: > > Quoting Daniel Lezcano (daniel.lezcano-GANU6spQydw@public.gmane.org): > > ... > >>> Doesn't this mean that an unprivileged task in a container can shut > >>> down the container? > >> Ha ha ! Right, good catch :) > >> > >> Yes, rethinking about it, we can do what initially proposed Bruno by > >> just preventing to reboot when we are not in the init_pid_ns. Actually, > >> the sys_reboot occurs after the services shutdown and "kill -1 SIGTERM" > >> and "kill -1 SIGKILL", and would not make sense to do that in a child > >> pid namespace, except if we are in a container where we don't want to > >> reboot :) > >> > >> So IMO, it is safe to do: > >> > >> if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT)) > >> return -EPERM; > >> > >> if (pid_ns != &init_pid_ns) > >> return pid_namespace_reboot(pid_ns, cmd, buffer); > > So I don't know if you want to prepend > > http://kernel.ubuntu.com/git?p=serge/linux-syslogns.git;a=commit;h=63556e9a39bcd75ec4a88333425800905013c73e > > to your patchset, or just check nsown_capable(CAP_SYS_BOOT) for now, > > but as soon as you resend with that I'll happily, nay, > > ecstatically ack. > > Ok, in order to not mix the functionnalities, I will send in a separate > patch the nsown_capable change. Actually the nsown_capable was a temp hack fix anyway, for a problem we don't actually have. I forgot you are not using user namespaces yet anyway. So you can just leave it as if (!capable(CAP_SYS_BOOT)) return -EPERM; for now. Since you're root in the init user namespace, that'll pass (as soon as you don't drop it from bounding set of course) thanks, -serge ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured 2011-08-12 20:42 ` Daniel Lezcano [not found] ` <4E45904F.60904-GANU6spQydw@public.gmane.org> @ 2011-08-12 21:13 ` Serge Hallyn 1 sibling, 0 replies; 63+ messages in thread From: Serge Hallyn @ 2011-08-12 21:13 UTC (permalink / raw) To: Daniel Lezcano; +Cc: akpm, containers, bonbons, oleg, linux-kernel Quoting Daniel Lezcano (daniel.lezcano@free.fr): > On 08/12/2011 06:29 PM, Serge Hallyn wrote: > > Quoting Daniel Lezcano (daniel.lezcano@free.fr): > > ... > >>> Doesn't this mean that an unprivileged task in a container can shut > >>> down the container? > >> Ha ha ! Right, good catch :) > >> > >> Yes, rethinking about it, we can do what initially proposed Bruno by > >> just preventing to reboot when we are not in the init_pid_ns. Actually, > >> the sys_reboot occurs after the services shutdown and "kill -1 SIGTERM" > >> and "kill -1 SIGKILL", and would not make sense to do that in a child > >> pid namespace, except if we are in a container where we don't want to > >> reboot :) > >> > >> So IMO, it is safe to do: > >> > >> if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT)) > >> return -EPERM; > >> > >> if (pid_ns != &init_pid_ns) > >> return pid_namespace_reboot(pid_ns, cmd, buffer); > > So I don't know if you want to prepend > > http://kernel.ubuntu.com/git?p=serge/linux-syslogns.git;a=commit;h=63556e9a39bcd75ec4a88333425800905013c73e > > to your patchset, or just check nsown_capable(CAP_SYS_BOOT) for now, > > but as soon as you resend with that I'll happily, nay, > > ecstatically ack. > > Ok, in order to not mix the functionnalities, I will send in a separate > patch the nsown_capable change. Actually the nsown_capable was a temp hack fix anyway, for a problem we don't actually have. I forgot you are not using user namespaces yet anyway. So you can just leave it as if (!capable(CAP_SYS_BOOT)) return -EPERM; for now. Since you're root in the init user namespace, that'll pass (as soon as you don't drop it from bounding set of course) thanks, -serge ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured 2011-08-12 16:29 ` Serge Hallyn (?) (?) @ 2011-08-12 20:42 ` Daniel Lezcano -1 siblings, 0 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-12 20:42 UTC (permalink / raw) To: Serge Hallyn Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, oleg-6lXkIZvqkOAvJsYlp49lxw On 08/12/2011 06:29 PM, Serge Hallyn wrote: > Quoting Daniel Lezcano (daniel.lezcano-GANU6spQydw@public.gmane.org): > ... >>> Doesn't this mean that an unprivileged task in a container can shut >>> down the container? >> Ha ha ! Right, good catch :) >> >> Yes, rethinking about it, we can do what initially proposed Bruno by >> just preventing to reboot when we are not in the init_pid_ns. Actually, >> the sys_reboot occurs after the services shutdown and "kill -1 SIGTERM" >> and "kill -1 SIGKILL", and would not make sense to do that in a child >> pid namespace, except if we are in a container where we don't want to >> reboot :) >> >> So IMO, it is safe to do: >> >> if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT)) >> return -EPERM; >> >> if (pid_ns != &init_pid_ns) >> return pid_namespace_reboot(pid_ns, cmd, buffer); > So I don't know if you want to prepend > http://kernel.ubuntu.com/git?p=serge/linux-syslogns.git;a=commit;h=63556e9a39bcd75ec4a88333425800905013c73e > to your patchset, or just check nsown_capable(CAP_SYS_BOOT) for now, > but as soon as you resend with that I'll happily, nay, > ecstatically ack. Ok, in order to not mix the functionnalities, I will send in a separate patch the nsown_capable change. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured 2011-08-11 21:30 ` Daniel Lezcano [not found] ` <4E444A04.3070403-GANU6spQydw@public.gmane.org> @ 2011-08-11 21:50 ` Serge Hallyn 1 sibling, 0 replies; 63+ messages in thread From: Serge Hallyn @ 2011-08-11 21:50 UTC (permalink / raw) To: Daniel Lezcano; +Cc: akpm, containers, bonbons, oleg, linux-kernel Quoting Daniel Lezcano (daniel.lezcano@free.fr): > On 08/11/2011 11:09 PM, Serge Hallyn wrote: > > Quoting Daniel Lezcano (daniel.lezcano@free.fr): > >> When the reboot syscall is called and the pid namespace where the calling > >> process belongs to is not from the init pidns, we send a SIGCHLD with CLD_REBOOTED > >> to the parent of this pid namespace. > >> > >> Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> > > ... > > > >> +void do_notify_parent_cldreboot(struct task_struct *tsk, int why, char *buffer) > >> +{ > >> + struct siginfo info = { }; > >> + struct task_struct *parent; > >> + struct sighand_struct *sighand; > >> + unsigned long flags; > >> + > >> + if (tsk->ptrace) > >> + parent = tsk->parent; > >> + else { > >> + tsk = tsk->group_leader; > >> + parent = tsk->real_parent; > >> + } > >> + > >> + info.si_signo = SIGCHLD; > >> + info.si_errno = 0; > >> + info.si_status = why; > >> + > >> + rcu_read_lock(); > >> + info.si_pid = task_pid_nr_ns(tsk, parent->nsproxy->pid_ns); > >> + info.si_uid = __task_cred(tsk)->uid; > > > > This eventually should become: > > > > info.si_uid = user_ns_map_uid(task_cred_xxx(t, user_ns), > > current_cred(), current_uid()); > > > > I've got a first-stab patch at converting the rest of > > kernel/signal.c in http://kernel.ubuntu.com/git?p=serge/userns-2.6.git > > Ok, thanks. > > >> + rcu_read_unlock(); > >> + > >> + info.si_utime = cputime_to_clock_t(tsk->utime); > >> + info.si_stime = cputime_to_clock_t(tsk->stime); > >> + > >> + info.si_code = CLD_REBOOTED; > >> + > >> + sighand = parent->sighand; > >> + spin_lock_irqsave(&sighand->siglock, flags); > >> + if (sighand->action[SIGCHLD-1].sa.sa_handler != SIG_IGN && > >> + sighand->action[SIGCHLD-1].sa.sa_flags & SA_CLDREBOOT) > >> + __group_send_sig_info(SIGCHLD, &info, parent); > >> + /* > >> + * Even if SIGCHLD is not generated, we must wake up wait4 calls. > >> + */ > >> + __wake_up_parent(tsk, parent); > >> + spin_unlock_irqrestore(&sighand->siglock, flags); > >> +} > > ... > > > >> @@ -426,10 +434,18 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, > >> { > >> char buffer[256]; > >> int ret = 0; > >> + struct pid_namespace *pid_ns = current->nsproxy->pid_ns; > >> + > >> + /* We only trust the superuser with rebooting the system. */ > >> + if (!capable(CAP_SYS_BOOT)) { > > Doesn't this mean that an unprivileged task in a container can shut > > down the container? > > Ha ha ! Right, good catch :) > > Yes, rethinking about it, we can do what initially proposed Bruno by > just preventing to reboot when we are not in the init_pid_ns. Actually, > the sys_reboot occurs after the services shutdown and "kill -1 SIGTERM" > and "kill -1 SIGKILL", and would not make sense to do that in a child > pid namespace, except if we are in a container where we don't want to > reboot :) > > So IMO, it is safe to do: > > if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT)) > return -EPERM; That sounds good. Until the pid_ns->user_ns patch goes in, just capable(CAP_SYS_BOOT) works too. Actually, if this is the only thing CAP_SYS_BOOT grants you, and if it is always fully namespaced, then I'm not sure there'll ever be a reason to switch this to ns_capable(). thanks, -serge > if (pid_ns != &init_pid_ns) > return pid_namespace_reboot(pid_ns, cmd, buffer); > > > > The pidns->user_ns patch I sent earlier today gives you what you need > > so that you can add > > > > if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT) > > return -EPERM; > > > > right here to prevent that. > > > >> + /* If we are not in the initial pid namespace, we send a signal > >> + * to the parent of this init pid namespace, notifying a shutdown > >> + * occured */ > >> + if (pid_ns != &init_pid_ns) > >> + pid_namespace_reboot(pid_ns, cmd, buffer); > >> > >> - /* We only trust the superuser with rebooting the system. */ > >> - if (!capable(CAP_SYS_BOOT)) > >> return -EPERM; > >> + } > >> > >> /* For safety, we require "magic" arguments. */ > >> if (magic1 != LINUX_REBOOT_MAGIC1 || > >> -- > >> 1.7.4.1 > >> > >> _______________________________________________ > >> Containers mailing list > >> Containers@lists.linux-foundation.org > >> https://lists.linux-foundation.org/mailman/listinfo/containers > ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <1313094241-3674-3-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org>]
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured [not found] ` <1313094241-3674-3-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> @ 2011-08-11 21:09 ` Serge Hallyn 2011-08-13 0:19 ` Matt Helsley 2011-08-14 16:01 ` Oleg Nesterov 2 siblings, 0 replies; 63+ messages in thread From: Serge Hallyn @ 2011-08-11 21:09 UTC (permalink / raw) To: Daniel Lezcano Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, oleg-6lXkIZvqkOAvJsYlp49lxw Quoting Daniel Lezcano (daniel.lezcano-GANU6spQydw@public.gmane.org): > When the reboot syscall is called and the pid namespace where the calling > process belongs to is not from the init pidns, we send a SIGCHLD with CLD_REBOOTED > to the parent of this pid namespace. > > Signed-off-by: Daniel Lezcano <daniel.lezcano-GANU6spQydw@public.gmane.org> ... > +void do_notify_parent_cldreboot(struct task_struct *tsk, int why, char *buffer) > +{ > + struct siginfo info = { }; > + struct task_struct *parent; > + struct sighand_struct *sighand; > + unsigned long flags; > + > + if (tsk->ptrace) > + parent = tsk->parent; > + else { > + tsk = tsk->group_leader; > + parent = tsk->real_parent; > + } > + > + info.si_signo = SIGCHLD; > + info.si_errno = 0; > + info.si_status = why; > + > + rcu_read_lock(); > + info.si_pid = task_pid_nr_ns(tsk, parent->nsproxy->pid_ns); > + info.si_uid = __task_cred(tsk)->uid; This eventually should become: info.si_uid = user_ns_map_uid(task_cred_xxx(t, user_ns), current_cred(), current_uid()); I've got a first-stab patch at converting the rest of kernel/signal.c in http://kernel.ubuntu.com/git?p=serge/userns-2.6.git > + rcu_read_unlock(); > + > + info.si_utime = cputime_to_clock_t(tsk->utime); > + info.si_stime = cputime_to_clock_t(tsk->stime); > + > + info.si_code = CLD_REBOOTED; > + > + sighand = parent->sighand; > + spin_lock_irqsave(&sighand->siglock, flags); > + if (sighand->action[SIGCHLD-1].sa.sa_handler != SIG_IGN && > + sighand->action[SIGCHLD-1].sa.sa_flags & SA_CLDREBOOT) > + __group_send_sig_info(SIGCHLD, &info, parent); > + /* > + * Even if SIGCHLD is not generated, we must wake up wait4 calls. > + */ > + __wake_up_parent(tsk, parent); > + spin_unlock_irqrestore(&sighand->siglock, flags); > +} ... > @@ -426,10 +434,18 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, > { > char buffer[256]; > int ret = 0; > + struct pid_namespace *pid_ns = current->nsproxy->pid_ns; > + > + /* We only trust the superuser with rebooting the system. */ > + if (!capable(CAP_SYS_BOOT)) { Doesn't this mean that an unprivileged task in a container can shut down the container? The pidns->user_ns patch I sent earlier today gives you what you need so that you can add if (!ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT) return -EPERM; right here to prevent that. > + /* If we are not in the initial pid namespace, we send a signal > + * to the parent of this init pid namespace, notifying a shutdown > + * occured */ > + if (pid_ns != &init_pid_ns) > + pid_namespace_reboot(pid_ns, cmd, buffer); > > - /* We only trust the superuser with rebooting the system. */ > - if (!capable(CAP_SYS_BOOT)) > return -EPERM; > + } > > /* For safety, we require "magic" arguments. */ > if (magic1 != LINUX_REBOOT_MAGIC1 || > -- > 1.7.4.1 > > _______________________________________________ > Containers mailing list > Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org > https://lists.linux-foundation.org/mailman/listinfo/containers ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured [not found] ` <1313094241-3674-3-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> 2011-08-11 21:09 ` Serge Hallyn @ 2011-08-13 0:19 ` Matt Helsley 2011-08-14 16:01 ` Oleg Nesterov 2 siblings, 0 replies; 63+ messages in thread From: Matt Helsley @ 2011-08-13 0:19 UTC (permalink / raw) To: Daniel Lezcano Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, oleg-6lXkIZvqkOAvJsYlp49lxw On Thu, Aug 11, 2011 at 10:24:01PM +0200, Daniel Lezcano wrote: > When the reboot syscall is called and the pid namespace where the calling > process belongs to is not from the init pidns, we send a SIGCHLD with CLD_REBOOTED > to the parent of this pid namespace. Shouldn't we honor the exit_signal set via clone() here rather than hardcode SIGCHLD? More below... > > Signed-off-by: Daniel Lezcano <daniel.lezcano-GANU6spQydw@public.gmane.org> > --- > include/asm-generic/siginfo.h | 3 ++- > include/linux/sched.h | 1 + > kernel/signal.c | 40 ++++++++++++++++++++++++++++++++++++++++ > kernel/sys.c | 20 ++++++++++++++++++-- > 4 files changed, 61 insertions(+), 3 deletions(-) > > diff --git a/include/asm-generic/siginfo.h b/include/asm-generic/siginfo.h > index 0dd4e87..9bff4a2 100644 > --- a/include/asm-generic/siginfo.h > +++ b/include/asm-generic/siginfo.h > @@ -218,7 +218,8 @@ typedef struct siginfo { > #define CLD_TRAPPED (__SI_CHLD|4) /* traced child has trapped */ > #define CLD_STOPPED (__SI_CHLD|5) /* child has stopped */ > #define CLD_CONTINUED (__SI_CHLD|6) /* stopped child has continued */ > -#define NSIGCHLD 6 > +#define CLD_REBOOTED (__SI_CHLD|7) /* process was killed by a reboot */ > +#define NSIGCHLD 7 > > /* > * SIGPOLL si_codes > diff --git a/include/linux/sched.h b/include/linux/sched.h > index 20b03bf..c62dc9e 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -2170,6 +2170,7 @@ extern int kill_pgrp(struct pid *pid, int sig, int priv); > extern int kill_pid(struct pid *pid, int sig, int priv); > extern int kill_proc_info(int, struct siginfo *, pid_t); > extern __must_check bool do_notify_parent(struct task_struct *, int); > +extern void do_notify_parent_cldreboot(struct task_struct *, int, char *); > extern void __wake_up_parent(struct task_struct *p, struct task_struct *parent); > extern void force_sig(int, struct task_struct *); > extern int send_sig(int, struct task_struct *, int); > diff --git a/kernel/signal.c b/kernel/signal.c > index 291c970..7d3d44c 100644 > --- a/kernel/signal.c > +++ b/kernel/signal.c > @@ -1668,6 +1668,46 @@ bool do_notify_parent(struct task_struct *tsk, int sig) > return autoreap; > } > > +void do_notify_parent_cldreboot(struct task_struct *tsk, int why, char *buffer) > +{ > + struct siginfo info = { }; > + struct task_struct *parent; > + struct sighand_struct *sighand; > + unsigned long flags; > + > + if (tsk->ptrace) > + parent = tsk->parent; > + else { > + tsk = tsk->group_leader; > + parent = tsk->real_parent; > + } > + > + info.si_signo = SIGCHLD; should this be: info.si_signo = tsk->exit_signal == -1 ? SIGCHLD : tsk->exit_signal; ? > + info.si_errno = 0; > + info.si_status = why; > + > + rcu_read_lock(); > + info.si_pid = task_pid_nr_ns(tsk, parent->nsproxy->pid_ns); > + info.si_uid = __task_cred(tsk)->uid; > + rcu_read_unlock(); > + > + info.si_utime = cputime_to_clock_t(tsk->utime); > + info.si_stime = cputime_to_clock_t(tsk->stime); > + > + info.si_code = CLD_REBOOTED; > + > + sighand = parent->sighand; > + spin_lock_irqsave(&sighand->siglock, flags); > + if (sighand->action[SIGCHLD-1].sa.sa_handler != SIG_IGN && > + sighand->action[SIGCHLD-1].sa.sa_flags & SA_CLDREBOOT) > + __group_send_sig_info(SIGCHLD, &info, parent); (with corresponding changes above...) > + /* > + * Even if SIGCHLD is not generated, we must wake up wait4 calls. > + */ ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured [not found] ` <1313094241-3674-3-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> 2011-08-11 21:09 ` Serge Hallyn 2011-08-13 0:19 ` Matt Helsley @ 2011-08-14 16:01 ` Oleg Nesterov 2 siblings, 0 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-14 16:01 UTC (permalink / raw) To: Daniel Lezcano Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, linux-kernel-u79uwXL29TY76Z2rM5mHXA On 08/11, Daniel Lezcano wrote: > > When the reboot syscall is called and the pid namespace where the calling > process belongs to is not from the init pidns, we send a SIGCHLD with CLD_REBOOTED > to the parent of this pid namespace. OK, but why you can't simply send the signal? Why do you need the strange do_notify_parent_cldreboot() which tries to mimic do_notify_parent() for (afaics) no reason ? > +void do_notify_parent_cldreboot(struct task_struct *tsk, int why, char *buffer) buffer is not used. Why? > + if (tsk->ptrace) > + parent = tsk->parent; > + else { > + tsk = tsk->group_leader; > + parent = tsk->real_parent; > + } For what? I simply can't understand this... > + sighand = parent->sighand; > + spin_lock_irqsave(&sighand->siglock, flags); this is unsafe, we can't trust ->sighand and parent. > + * Even if SIGCHLD is not generated, we must wake up wait4 calls. but not in this case, afaics? > + __wake_up_parent(tsk, parent); Why do you need __wake_up_parent()? > +static void pid_namespace_reboot(struct pid_namespace *pid_ns, > + int cmd, char *buffer) > +{ > + struct task_struct *tsk = pid_ns->child_reaper; > + do_notify_parent_cldreboot(tsk, cmd, buffer); nothing prevents ->child_reaper from exiting it it is multithreaded, this can crash the kernel. > @@ -426,10 +434,18 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, > { > char buffer[256]; again, it is not used. > int ret = 0; > + struct pid_namespace *pid_ns = current->nsproxy->pid_ns; > + > + /* We only trust the superuser with rebooting the system. */ > + if (!capable(CAP_SYS_BOOT)) { > + /* If we are not in the initial pid namespace, we send a signal > + * to the parent of this init pid namespace, notifying a shutdown > + * occured */ > + if (pid_ns != &init_pid_ns) > + pid_namespace_reboot(pid_ns, cmd, buffer); Hmm. Looks like pid_ns should be checked after CAP_SYS_BOOT? Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured 2011-08-11 20:24 ` [PATCH 2/2] Notify container-init parent a 'reboot' occured Daniel Lezcano 2011-08-11 21:09 ` Serge Hallyn [not found] ` <1313094241-3674-3-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> @ 2011-08-13 0:19 ` Matt Helsley 2011-08-13 14:41 ` Daniel Lezcano [not found] ` <20110813001959.GB5777-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org> 2011-08-14 16:01 ` Oleg Nesterov 3 siblings, 2 replies; 63+ messages in thread From: Matt Helsley @ 2011-08-13 0:19 UTC (permalink / raw) To: Daniel Lezcano; +Cc: akpm, containers, bonbons, oleg, linux-kernel On Thu, Aug 11, 2011 at 10:24:01PM +0200, Daniel Lezcano wrote: > When the reboot syscall is called and the pid namespace where the calling > process belongs to is not from the init pidns, we send a SIGCHLD with CLD_REBOOTED > to the parent of this pid namespace. Shouldn't we honor the exit_signal set via clone() here rather than hardcode SIGCHLD? More below... > > Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> > --- > include/asm-generic/siginfo.h | 3 ++- > include/linux/sched.h | 1 + > kernel/signal.c | 40 ++++++++++++++++++++++++++++++++++++++++ > kernel/sys.c | 20 ++++++++++++++++++-- > 4 files changed, 61 insertions(+), 3 deletions(-) > > diff --git a/include/asm-generic/siginfo.h b/include/asm-generic/siginfo.h > index 0dd4e87..9bff4a2 100644 > --- a/include/asm-generic/siginfo.h > +++ b/include/asm-generic/siginfo.h > @@ -218,7 +218,8 @@ typedef struct siginfo { > #define CLD_TRAPPED (__SI_CHLD|4) /* traced child has trapped */ > #define CLD_STOPPED (__SI_CHLD|5) /* child has stopped */ > #define CLD_CONTINUED (__SI_CHLD|6) /* stopped child has continued */ > -#define NSIGCHLD 6 > +#define CLD_REBOOTED (__SI_CHLD|7) /* process was killed by a reboot */ > +#define NSIGCHLD 7 > > /* > * SIGPOLL si_codes > diff --git a/include/linux/sched.h b/include/linux/sched.h > index 20b03bf..c62dc9e 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -2170,6 +2170,7 @@ extern int kill_pgrp(struct pid *pid, int sig, int priv); > extern int kill_pid(struct pid *pid, int sig, int priv); > extern int kill_proc_info(int, struct siginfo *, pid_t); > extern __must_check bool do_notify_parent(struct task_struct *, int); > +extern void do_notify_parent_cldreboot(struct task_struct *, int, char *); > extern void __wake_up_parent(struct task_struct *p, struct task_struct *parent); > extern void force_sig(int, struct task_struct *); > extern int send_sig(int, struct task_struct *, int); > diff --git a/kernel/signal.c b/kernel/signal.c > index 291c970..7d3d44c 100644 > --- a/kernel/signal.c > +++ b/kernel/signal.c > @@ -1668,6 +1668,46 @@ bool do_notify_parent(struct task_struct *tsk, int sig) > return autoreap; > } > > +void do_notify_parent_cldreboot(struct task_struct *tsk, int why, char *buffer) > +{ > + struct siginfo info = { }; > + struct task_struct *parent; > + struct sighand_struct *sighand; > + unsigned long flags; > + > + if (tsk->ptrace) > + parent = tsk->parent; > + else { > + tsk = tsk->group_leader; > + parent = tsk->real_parent; > + } > + > + info.si_signo = SIGCHLD; should this be: info.si_signo = tsk->exit_signal == -1 ? SIGCHLD : tsk->exit_signal; ? > + info.si_errno = 0; > + info.si_status = why; > + > + rcu_read_lock(); > + info.si_pid = task_pid_nr_ns(tsk, parent->nsproxy->pid_ns); > + info.si_uid = __task_cred(tsk)->uid; > + rcu_read_unlock(); > + > + info.si_utime = cputime_to_clock_t(tsk->utime); > + info.si_stime = cputime_to_clock_t(tsk->stime); > + > + info.si_code = CLD_REBOOTED; > + > + sighand = parent->sighand; > + spin_lock_irqsave(&sighand->siglock, flags); > + if (sighand->action[SIGCHLD-1].sa.sa_handler != SIG_IGN && > + sighand->action[SIGCHLD-1].sa.sa_flags & SA_CLDREBOOT) > + __group_send_sig_info(SIGCHLD, &info, parent); (with corresponding changes above...) > + /* > + * Even if SIGCHLD is not generated, we must wake up wait4 calls. > + */ ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured 2011-08-13 0:19 ` Matt Helsley @ 2011-08-13 14:41 ` Daniel Lezcano [not found] ` <20110813001959.GB5777-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org> 1 sibling, 0 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-13 14:41 UTC (permalink / raw) To: Matt Helsley; +Cc: akpm, containers, bonbons, oleg, linux-kernel On 08/13/2011 02:19 AM, Matt Helsley wrote: > On Thu, Aug 11, 2011 at 10:24:01PM +0200, Daniel Lezcano wrote: >> When the reboot syscall is called and the pid namespace where the calling >> process belongs to is not from the init pidns, we send a SIGCHLD with CLD_REBOOTED >> to the parent of this pid namespace. > Shouldn't we honor the exit_signal set via clone() here rather than > hardcode SIGCHLD? More below... Hmm, I don't think so, that does not really make sense. [ ... ] >> +void do_notify_parent_cldreboot(struct task_struct *tsk, int why, char *buffer) >> +{ >> + struct siginfo info = { }; >> + struct task_struct *parent; >> + struct sighand_struct *sighand; >> + unsigned long flags; >> + >> + if (tsk->ptrace) >> + parent = tsk->parent; >> + else { >> + tsk = tsk->group_leader; >> + parent = tsk->real_parent; >> + } >> + >> + info.si_signo = SIGCHLD; > should this be: > > info.si_signo = tsk->exit_signal == -1 ? SIGCHLD : tsk->exit_signal; No, because this code is always called for SIGHLD. The userspace will receive this signal with CLD_REBOOT if it sets the SA_CLDREBOOT flag option which is not set by default. >> + info.si_errno = 0; >> + info.si_status = why; >> + >> + rcu_read_lock(); >> + info.si_pid = task_pid_nr_ns(tsk, parent->nsproxy->pid_ns); >> + info.si_uid = __task_cred(tsk)->uid; >> + rcu_read_unlock(); >> + >> + info.si_utime = cputime_to_clock_t(tsk->utime); >> + info.si_stime = cputime_to_clock_t(tsk->stime); >> + >> + info.si_code = CLD_REBOOTED; >> + >> + sighand = parent->sighand; >> + spin_lock_irqsave(&sighand->siglock, flags); >> + if (sighand->action[SIGCHLD-1].sa.sa_handler != SIG_IGN && >> + sighand->action[SIGCHLD-1].sa.sa_flags & SA_CLDREBOOT) >> + __group_send_sig_info(SIGCHLD, &info, parent); > (with corresponding changes above...) > >> + /* >> + * Even if SIGCHLD is not generated, we must wake up wait4 calls. >> + */ ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <20110813001959.GB5777-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>]
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured [not found] ` <20110813001959.GB5777-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org> @ 2011-08-13 14:41 ` Daniel Lezcano 0 siblings, 0 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-13 14:41 UTC (permalink / raw) To: Matt Helsley Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, oleg-6lXkIZvqkOAvJsYlp49lxw On 08/13/2011 02:19 AM, Matt Helsley wrote: > On Thu, Aug 11, 2011 at 10:24:01PM +0200, Daniel Lezcano wrote: >> When the reboot syscall is called and the pid namespace where the calling >> process belongs to is not from the init pidns, we send a SIGCHLD with CLD_REBOOTED >> to the parent of this pid namespace. > Shouldn't we honor the exit_signal set via clone() here rather than > hardcode SIGCHLD? More below... Hmm, I don't think so, that does not really make sense. [ ... ] >> +void do_notify_parent_cldreboot(struct task_struct *tsk, int why, char *buffer) >> +{ >> + struct siginfo info = { }; >> + struct task_struct *parent; >> + struct sighand_struct *sighand; >> + unsigned long flags; >> + >> + if (tsk->ptrace) >> + parent = tsk->parent; >> + else { >> + tsk = tsk->group_leader; >> + parent = tsk->real_parent; >> + } >> + >> + info.si_signo = SIGCHLD; > should this be: > > info.si_signo = tsk->exit_signal == -1 ? SIGCHLD : tsk->exit_signal; No, because this code is always called for SIGHLD. The userspace will receive this signal with CLD_REBOOT if it sets the SA_CLDREBOOT flag option which is not set by default. >> + info.si_errno = 0; >> + info.si_status = why; >> + >> + rcu_read_lock(); >> + info.si_pid = task_pid_nr_ns(tsk, parent->nsproxy->pid_ns); >> + info.si_uid = __task_cred(tsk)->uid; >> + rcu_read_unlock(); >> + >> + info.si_utime = cputime_to_clock_t(tsk->utime); >> + info.si_stime = cputime_to_clock_t(tsk->stime); >> + >> + info.si_code = CLD_REBOOTED; >> + >> + sighand = parent->sighand; >> + spin_lock_irqsave(&sighand->siglock, flags); >> + if (sighand->action[SIGCHLD-1].sa.sa_handler != SIG_IGN && >> + sighand->action[SIGCHLD-1].sa.sa_flags & SA_CLDREBOOT) >> + __group_send_sig_info(SIGCHLD, &info, parent); > (with corresponding changes above...) > >> + /* >> + * Even if SIGCHLD is not generated, we must wake up wait4 calls. >> + */ ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 2/2] Notify container-init parent a 'reboot' occured 2011-08-11 20:24 ` [PATCH 2/2] Notify container-init parent a 'reboot' occured Daniel Lezcano ` (2 preceding siblings ...) 2011-08-13 0:19 ` Matt Helsley @ 2011-08-14 16:01 ` Oleg Nesterov 3 siblings, 0 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-14 16:01 UTC (permalink / raw) To: Daniel Lezcano; +Cc: akpm, bonbons, containers, linux-kernel, serge On 08/11, Daniel Lezcano wrote: > > When the reboot syscall is called and the pid namespace where the calling > process belongs to is not from the init pidns, we send a SIGCHLD with CLD_REBOOTED > to the parent of this pid namespace. OK, but why you can't simply send the signal? Why do you need the strange do_notify_parent_cldreboot() which tries to mimic do_notify_parent() for (afaics) no reason ? > +void do_notify_parent_cldreboot(struct task_struct *tsk, int why, char *buffer) buffer is not used. Why? > + if (tsk->ptrace) > + parent = tsk->parent; > + else { > + tsk = tsk->group_leader; > + parent = tsk->real_parent; > + } For what? I simply can't understand this... > + sighand = parent->sighand; > + spin_lock_irqsave(&sighand->siglock, flags); this is unsafe, we can't trust ->sighand and parent. > + * Even if SIGCHLD is not generated, we must wake up wait4 calls. but not in this case, afaics? > + __wake_up_parent(tsk, parent); Why do you need __wake_up_parent()? > +static void pid_namespace_reboot(struct pid_namespace *pid_ns, > + int cmd, char *buffer) > +{ > + struct task_struct *tsk = pid_ns->child_reaper; > + do_notify_parent_cldreboot(tsk, cmd, buffer); nothing prevents ->child_reaper from exiting it it is multithreaded, this can crash the kernel. > @@ -426,10 +434,18 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, > { > char buffer[256]; again, it is not used. > int ret = 0; > + struct pid_namespace *pid_ns = current->nsproxy->pid_ns; > + > + /* We only trust the superuser with rebooting the system. */ > + if (!capable(CAP_SYS_BOOT)) { > + /* If we are not in the initial pid namespace, we send a signal > + * to the parent of this init pid namespace, notifying a shutdown > + * occured */ > + if (pid_ns != &init_pid_ns) > + pid_namespace_reboot(pid_ns, cmd, buffer); Hmm. Looks like pid_ns should be checked after CAP_SYS_BOOT? Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-11 20:23 [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot Daniel Lezcano ` (2 preceding siblings ...) 2011-08-11 20:24 ` [PATCH 2/2] Notify container-init parent a 'reboot' occured Daniel Lezcano @ 2011-08-14 16:17 ` Oleg Nesterov 2011-08-14 21:36 ` Serge E. Hallyn [not found] ` <20110814161707.GB30846-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 3 siblings, 2 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-14 16:17 UTC (permalink / raw) To: Daniel Lezcano; +Cc: akpm, bonbons, containers, linux-kernel, serge On 08/11, Daniel Lezcano wrote: > > In the case of a VPS, when we shutdown/halt/reboot the container, the > reboot utility will invoke the sys_reboot syscall which has the bad > effect to reboot the host. Stupid question. Can't sys_reboot() simply kill init (and thus the whole pid_ns) in this case? Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-14 16:17 ` [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot Oleg Nesterov @ 2011-08-14 21:36 ` Serge E. Hallyn 2011-08-15 14:47 ` Oleg Nesterov [not found] ` <20110814213642.GB13799-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> [not found] ` <20110814161707.GB30846-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 1 sibling, 2 replies; 63+ messages in thread From: Serge E. Hallyn @ 2011-08-14 21:36 UTC (permalink / raw) To: Oleg Nesterov; +Cc: Daniel Lezcano, akpm, bonbons, containers, linux-kernel Quoting Oleg Nesterov (oleg@redhat.com): > On 08/11, Daniel Lezcano wrote: > > > > In the case of a VPS, when we shutdown/halt/reboot the container, the > > reboot utility will invoke the sys_reboot syscall which has the bad > > effect to reboot the host. > > Stupid question. Can't sys_reboot() simply kill init (and thus the whole > pid_ns) in this case? The goal is to be able to distinguish a request for reboot from shutdown. If we just kill the init, then the parent of init (the container monitor) cannot restart the container to emulate reboot. [ sorry if I misunderstand what you're asking ] -serge ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-14 21:36 ` Serge E. Hallyn @ 2011-08-15 14:47 ` Oleg Nesterov 2011-08-15 17:39 ` Serge E. Hallyn ` (2 more replies) [not found] ` <20110814213642.GB13799-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> 1 sibling, 3 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-15 14:47 UTC (permalink / raw) To: Serge E. Hallyn; +Cc: Daniel Lezcano, akpm, bonbons, containers, linux-kernel On 08/14, Serge E. Hallyn wrote: > > Quoting Oleg Nesterov (oleg@redhat.com): > > On 08/11, Daniel Lezcano wrote: > > > > > > In the case of a VPS, when we shutdown/halt/reboot the container, the > > > reboot utility will invoke the sys_reboot syscall which has the bad > > > effect to reboot the host. > > > > Stupid question. Can't sys_reboot() simply kill init (and thus the whole > > pid_ns) in this case? > > The goal is to be able to distinguish a request for reboot from shutdown. > If we just kill the init, then the parent of init (the container monitor) > cannot restart the container to emulate reboot. OK, thanks. What if init reports the reason it was killed? Ignoring LINUX_REBOOT_CMD_CAD_/etc, I mean, roughly, - add "int reboot_cmd" into struct pid_namespace - sys_reboot(cmd) does if (!global_namespace) { task_active_pid_ns(current)->reboot_cmd = cmd; sigkill_my_init(); } - change zap_pid_ns_processes() to do if (pid_ns->reboot_cmd) // approximately current->exit_state = pid_ns->reboot_cmd; Then its parent can look at status after wait(&status). Not that I think this is very nice, but signals are not reliable. And once again, SIGCHLD doesn't queue. And, perhaps this doesn't matter, but sys_reboot() sends SIGCHLD and returns -EPERM, this can confuse the container. In any case. If you want to send a signal, please do not introduce SA_CLDREBOOT. Please do not play with ptrace or __wake_up_parent, this is meaningless. Just fill siginfo and send SIGCHLD unconditionally. Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-15 14:47 ` Oleg Nesterov @ 2011-08-15 17:39 ` Serge E. Hallyn 2011-08-15 17:50 ` Daniel Lezcano [not found] ` <20110815173940.GA19620-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> [not found] ` <20110815144744.GA9660-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2011-08-18 23:46 ` Daniel Lezcano 2 siblings, 2 replies; 63+ messages in thread From: Serge E. Hallyn @ 2011-08-15 17:39 UTC (permalink / raw) To: Oleg Nesterov; +Cc: Daniel Lezcano, akpm, bonbons, containers, linux-kernel Quoting Oleg Nesterov (oleg@redhat.com): > On 08/14, Serge E. Hallyn wrote: > > > > Quoting Oleg Nesterov (oleg@redhat.com): > > > On 08/11, Daniel Lezcano wrote: > > > > > > > > In the case of a VPS, when we shutdown/halt/reboot the container, the > > > > reboot utility will invoke the sys_reboot syscall which has the bad > > > > effect to reboot the host. > > > > > > Stupid question. Can't sys_reboot() simply kill init (and thus the whole > > > pid_ns) in this case? > > > > The goal is to be able to distinguish a request for reboot from shutdown. > > If we just kill the init, then the parent of init (the container monitor) > > cannot restart the container to emulate reboot. > > OK, thanks. > > What if init reports the reason it was killed? > > Ignoring LINUX_REBOOT_CMD_CAD_/etc, I mean, roughly, > > - add "int reboot_cmd" into struct pid_namespace > > - sys_reboot(cmd) does > > if (!global_namespace) { > task_active_pid_ns(current)->reboot_cmd = cmd; > sigkill_my_init(); > } > > - change zap_pid_ns_processes() to do > > if (pid_ns->reboot_cmd) // approximately > current->exit_state = pid_ns->reboot_cmd; > > Then its parent can look at status after wait(&status). That looks good to me. Daniel, is there a reason this wouldn't work? > Not that I think this is very nice, but signals are not reliable. > And once again, SIGCHLD doesn't queue. And, perhaps this doesn't > matter, but sys_reboot() sends SIGCHLD and returns -EPERM, this > can confuse the container. I think you're looking at an older (Aug 11) version. Daniel's newer patch should do the right thing (but I don't seem to have it in front of me atm) > In any case. If you want to send a signal, please do not introduce > SA_CLDREBOOT. Please do not play with ptrace or __wake_up_parent, this > is meaningless. Just fill siginfo and send SIGCHLD unconditionally. > > Oleg. Thanks for the help, Oleg. -serge ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-15 17:39 ` Serge E. Hallyn @ 2011-08-15 17:50 ` Daniel Lezcano [not found] ` <20110815173940.GA19620-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> 1 sibling, 0 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-15 17:50 UTC (permalink / raw) To: Serge E. Hallyn; +Cc: Oleg Nesterov, akpm, bonbons, containers, linux-kernel On 08/15/2011 07:39 PM, Serge E. Hallyn wrote: > Quoting Oleg Nesterov (oleg@redhat.com): >> On 08/14, Serge E. Hallyn wrote: >>> Quoting Oleg Nesterov (oleg@redhat.com): >>>> On 08/11, Daniel Lezcano wrote: >>>>> In the case of a VPS, when we shutdown/halt/reboot the container, the >>>>> reboot utility will invoke the sys_reboot syscall which has the bad >>>>> effect to reboot the host. >>>> Stupid question. Can't sys_reboot() simply kill init (and thus the whole >>>> pid_ns) in this case? >>> The goal is to be able to distinguish a request for reboot from shutdown. >>> If we just kill the init, then the parent of init (the container monitor) >>> cannot restart the container to emulate reboot. >> OK, thanks. >> >> What if init reports the reason it was killed? >> >> Ignoring LINUX_REBOOT_CMD_CAD_/etc, I mean, roughly, >> >> - add "int reboot_cmd" into struct pid_namespace >> >> - sys_reboot(cmd) does >> >> if (!global_namespace) { >> task_active_pid_ns(current)->reboot_cmd = cmd; >> sigkill_my_init(); >> } >> >> - change zap_pid_ns_processes() to do >> >> if (pid_ns->reboot_cmd) // approximately >> current->exit_state = pid_ns->reboot_cmd; >> >> Then its parent can look at status after wait(&status). > That looks good to me. Daniel, is there a reason this wouldn't work? No, it is an alternative we can try. Oleg raise a good point about the SIGCHLD not being queue. Let me write this patch and test it. -- Daniel ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <20110815173940.GA19620-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <20110815173940.GA19620-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> @ 2011-08-15 17:50 ` Daniel Lezcano 0 siblings, 0 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-15 17:50 UTC (permalink / raw) To: Serge E. Hallyn Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, Oleg Nesterov, linux-kernel-u79uwXL29TY76Z2rM5mHXA On 08/15/2011 07:39 PM, Serge E. Hallyn wrote: > Quoting Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): >> On 08/14, Serge E. Hallyn wrote: >>> Quoting Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): >>>> On 08/11, Daniel Lezcano wrote: >>>>> In the case of a VPS, when we shutdown/halt/reboot the container, the >>>>> reboot utility will invoke the sys_reboot syscall which has the bad >>>>> effect to reboot the host. >>>> Stupid question. Can't sys_reboot() simply kill init (and thus the whole >>>> pid_ns) in this case? >>> The goal is to be able to distinguish a request for reboot from shutdown. >>> If we just kill the init, then the parent of init (the container monitor) >>> cannot restart the container to emulate reboot. >> OK, thanks. >> >> What if init reports the reason it was killed? >> >> Ignoring LINUX_REBOOT_CMD_CAD_/etc, I mean, roughly, >> >> - add "int reboot_cmd" into struct pid_namespace >> >> - sys_reboot(cmd) does >> >> if (!global_namespace) { >> task_active_pid_ns(current)->reboot_cmd = cmd; >> sigkill_my_init(); >> } >> >> - change zap_pid_ns_processes() to do >> >> if (pid_ns->reboot_cmd) // approximately >> current->exit_state = pid_ns->reboot_cmd; >> >> Then its parent can look at status after wait(&status). > That looks good to me. Daniel, is there a reason this wouldn't work? No, it is an alternative we can try. Oleg raise a good point about the SIGCHLD not being queue. Let me write this patch and test it. -- Daniel ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <20110815144744.GA9660-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <20110815144744.GA9660-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2011-08-15 17:39 ` Serge E. Hallyn 2011-08-18 23:46 ` Daniel Lezcano 1 sibling, 0 replies; 63+ messages in thread From: Serge E. Hallyn @ 2011-08-15 17:39 UTC (permalink / raw) To: Oleg Nesterov Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, linux-kernel-u79uwXL29TY76Z2rM5mHXA Quoting Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > On 08/14, Serge E. Hallyn wrote: > > > > Quoting Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > > > On 08/11, Daniel Lezcano wrote: > > > > > > > > In the case of a VPS, when we shutdown/halt/reboot the container, the > > > > reboot utility will invoke the sys_reboot syscall which has the bad > > > > effect to reboot the host. > > > > > > Stupid question. Can't sys_reboot() simply kill init (and thus the whole > > > pid_ns) in this case? > > > > The goal is to be able to distinguish a request for reboot from shutdown. > > If we just kill the init, then the parent of init (the container monitor) > > cannot restart the container to emulate reboot. > > OK, thanks. > > What if init reports the reason it was killed? > > Ignoring LINUX_REBOOT_CMD_CAD_/etc, I mean, roughly, > > - add "int reboot_cmd" into struct pid_namespace > > - sys_reboot(cmd) does > > if (!global_namespace) { > task_active_pid_ns(current)->reboot_cmd = cmd; > sigkill_my_init(); > } > > - change zap_pid_ns_processes() to do > > if (pid_ns->reboot_cmd) // approximately > current->exit_state = pid_ns->reboot_cmd; > > Then its parent can look at status after wait(&status). That looks good to me. Daniel, is there a reason this wouldn't work? > Not that I think this is very nice, but signals are not reliable. > And once again, SIGCHLD doesn't queue. And, perhaps this doesn't > matter, but sys_reboot() sends SIGCHLD and returns -EPERM, this > can confuse the container. I think you're looking at an older (Aug 11) version. Daniel's newer patch should do the right thing (but I don't seem to have it in front of me atm) > In any case. If you want to send a signal, please do not introduce > SA_CLDREBOOT. Please do not play with ptrace or __wake_up_parent, this > is meaningless. Just fill siginfo and send SIGCHLD unconditionally. > > Oleg. Thanks for the help, Oleg. -serge ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <20110815144744.GA9660-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2011-08-15 17:39 ` Serge E. Hallyn @ 2011-08-18 23:46 ` Daniel Lezcano 1 sibling, 0 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-18 23:46 UTC (permalink / raw) To: Oleg Nesterov Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, linux-kernel-u79uwXL29TY76Z2rM5mHXA On 08/15/2011 04:47 PM, Oleg Nesterov wrote: > On 08/14, Serge E. Hallyn wrote: >> Quoting Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): >>> On 08/11, Daniel Lezcano wrote: >>>> In the case of a VPS, when we shutdown/halt/reboot the container, the >>>> reboot utility will invoke the sys_reboot syscall which has the bad >>>> effect to reboot the host. >>> Stupid question. Can't sys_reboot() simply kill init (and thus the whole >>> pid_ns) in this case? >> The goal is to be able to distinguish a request for reboot from shutdown. >> If we just kill the init, then the parent of init (the container monitor) >> cannot restart the container to emulate reboot. > OK, thanks. > > What if init reports the reason it was killed? > > Ignoring LINUX_REBOOT_CMD_CAD_/etc, I mean, roughly, > > - add "int reboot_cmd" into struct pid_namespace > > - sys_reboot(cmd) does > > if (!global_namespace) { > task_active_pid_ns(current)->reboot_cmd = cmd; > sigkill_my_init(); > } Hi Oleg, what would be your advice to get rid of from_ancestor_ns which prevent the signal to be delivered to the init process ? Thanks -- Daniel ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-15 14:47 ` Oleg Nesterov 2011-08-15 17:39 ` Serge E. Hallyn [not found] ` <20110815144744.GA9660-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2011-08-18 23:46 ` Daniel Lezcano 2011-08-19 15:24 ` Oleg Nesterov [not found] ` <4E4DA461.8030006-GANU6spQydw@public.gmane.org> 2 siblings, 2 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-18 23:46 UTC (permalink / raw) To: Oleg Nesterov; +Cc: Serge E. Hallyn, akpm, bonbons, containers, linux-kernel On 08/15/2011 04:47 PM, Oleg Nesterov wrote: > On 08/14, Serge E. Hallyn wrote: >> Quoting Oleg Nesterov (oleg@redhat.com): >>> On 08/11, Daniel Lezcano wrote: >>>> In the case of a VPS, when we shutdown/halt/reboot the container, the >>>> reboot utility will invoke the sys_reboot syscall which has the bad >>>> effect to reboot the host. >>> Stupid question. Can't sys_reboot() simply kill init (and thus the whole >>> pid_ns) in this case? >> The goal is to be able to distinguish a request for reboot from shutdown. >> If we just kill the init, then the parent of init (the container monitor) >> cannot restart the container to emulate reboot. > OK, thanks. > > What if init reports the reason it was killed? > > Ignoring LINUX_REBOOT_CMD_CAD_/etc, I mean, roughly, > > - add "int reboot_cmd" into struct pid_namespace > > - sys_reboot(cmd) does > > if (!global_namespace) { > task_active_pid_ns(current)->reboot_cmd = cmd; > sigkill_my_init(); > } Hi Oleg, what would be your advice to get rid of from_ancestor_ns which prevent the signal to be delivered to the init process ? Thanks -- Daniel ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-18 23:46 ` Daniel Lezcano @ 2011-08-19 15:24 ` Oleg Nesterov [not found] ` <20110819152416.GA17034-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2011-08-22 12:28 ` Daniel Lezcano [not found] ` <4E4DA461.8030006-GANU6spQydw@public.gmane.org> 1 sibling, 2 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-19 15:24 UTC (permalink / raw) To: Daniel Lezcano; +Cc: Serge E. Hallyn, akpm, bonbons, containers, linux-kernel On 08/19, Daniel Lezcano wrote: > > On 08/15/2011 04:47 PM, Oleg Nesterov wrote: > > > > - sys_reboot(cmd) does > > > > if (!global_namespace) { > > task_active_pid_ns(current)->reboot_cmd = cmd; > > sigkill_my_init(); > > } > > Hi Oleg, > > what would be your advice to get rid of from_ancestor_ns which prevent > the signal to be delivered to the init process ? Sure, a plain kill can't work. You can do force_sig_info(), this clears SIGNAL_UNKILLABLE. Hmm. But now I seem to recall we have other reasons to make the new sigkill_task() helper... We will see. Anyway, force_ should work afaics. Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <20110819152416.GA17034-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <20110819152416.GA17034-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2011-08-22 12:28 ` Daniel Lezcano 0 siblings, 0 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-22 12:28 UTC (permalink / raw) To: Oleg Nesterov Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, linux-kernel-u79uwXL29TY76Z2rM5mHXA On 08/19/2011 05:24 PM, Oleg Nesterov wrote: > On 08/19, Daniel Lezcano wrote: >> On 08/15/2011 04:47 PM, Oleg Nesterov wrote: >>> - sys_reboot(cmd) does >>> >>> if (!global_namespace) { >>> task_active_pid_ns(current)->reboot_cmd = cmd; >>> sigkill_my_init(); >>> } >> Hi Oleg, >> >> what would be your advice to get rid of from_ancestor_ns which prevent >> the signal to be delivered to the init process ? > Sure, a plain kill can't work. You can do force_sig_info(), this clears > SIGNAL_UNKILLABLE. > > Hmm. But now I seem to recall we have other reasons to make the new > sigkill_task() helper... We will see. Anyway, force_ should work afaics. Thanks Oleg. I wrote the patch by sending a signal to the init process of the pid namespace using force_sig_info. That works fine, thanks for the hint. I am wondering what is the best way to transmit the reason of the reboot to the parent of the container's init. If we pass the reason to the exit_code of the init process, that will be a bit weird as the process is signaled and did not exited no ? Furthermore, how to differentiate an application container (eg. a script) exiting with an error with the same value of a reboot reason ? Wouldn't make sense to let the user to specify a signal via prctl where the si_code is filled with the reason ? Without invoking the prctl, the init process is simply killed by the kernel, otherwise we send the signal to the container's init. From userspace: void sigreboot(int sig, siginfo_t *si, void *private) { switch(si->si_code) { case LINUX_REBOOT_CMD_RESTART: reboot_container(); break; case LINUX_REBOOT_CMD_HALT: halt_container(); break; ... } } struct sigaction sa = { .sa_sigaction = sigreboot, .sa_flags = SA_SIGINFO; } sigaction(SIGUSR1, &sa, NULL); prctl(PR_SIGREBOOT, SIGUSR1); ----- and from the kernel (called from sys_reboot): int kill_pid_ns(struct pid_namespace *pid_ns, int reason) { struct task_struct *tsk; struct siginfo info; if (pid_ns->notifier) { info.si_signo = SIGKILL; info.si_errno = 0; info.si_code = reason; info.si_pid = 0; info.si_uid = 0; return force_sig_info(notifier->sig, &info, notifier->tsk); } write_lock_irq(&tasklist_lock); tsk = pid_ns->child_reaper; write_unlock_irq(&tasklist_lock); info.si_signo = SIGKILL; info.si_errno = 0; info.si_code = SI_KERNEL; info.si_pid = 0; info.si_uid = 0; return force_sig_info(SIGKILL, &info, tsk); } Roughly, assuming pid_ns->notifier is reseted when we reparent to the init_pid_ns.init. What do you think ? Thanks -- Daniel ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-19 15:24 ` Oleg Nesterov [not found] ` <20110819152416.GA17034-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2011-08-22 12:28 ` Daniel Lezcano 2011-08-22 15:44 ` Oleg Nesterov [not found] ` <4E524B73.3050704-GANU6spQydw@public.gmane.org> 1 sibling, 2 replies; 63+ messages in thread From: Daniel Lezcano @ 2011-08-22 12:28 UTC (permalink / raw) To: Oleg Nesterov; +Cc: Serge E. Hallyn, akpm, bonbons, containers, linux-kernel On 08/19/2011 05:24 PM, Oleg Nesterov wrote: > On 08/19, Daniel Lezcano wrote: >> On 08/15/2011 04:47 PM, Oleg Nesterov wrote: >>> - sys_reboot(cmd) does >>> >>> if (!global_namespace) { >>> task_active_pid_ns(current)->reboot_cmd = cmd; >>> sigkill_my_init(); >>> } >> Hi Oleg, >> >> what would be your advice to get rid of from_ancestor_ns which prevent >> the signal to be delivered to the init process ? > Sure, a plain kill can't work. You can do force_sig_info(), this clears > SIGNAL_UNKILLABLE. > > Hmm. But now I seem to recall we have other reasons to make the new > sigkill_task() helper... We will see. Anyway, force_ should work afaics. Thanks Oleg. I wrote the patch by sending a signal to the init process of the pid namespace using force_sig_info. That works fine, thanks for the hint. I am wondering what is the best way to transmit the reason of the reboot to the parent of the container's init. If we pass the reason to the exit_code of the init process, that will be a bit weird as the process is signaled and did not exited no ? Furthermore, how to differentiate an application container (eg. a script) exiting with an error with the same value of a reboot reason ? Wouldn't make sense to let the user to specify a signal via prctl where the si_code is filled with the reason ? Without invoking the prctl, the init process is simply killed by the kernel, otherwise we send the signal to the container's init. >From userspace: void sigreboot(int sig, siginfo_t *si, void *private) { switch(si->si_code) { case LINUX_REBOOT_CMD_RESTART: reboot_container(); break; case LINUX_REBOOT_CMD_HALT: halt_container(); break; ... } } struct sigaction sa = { .sa_sigaction = sigreboot, .sa_flags = SA_SIGINFO; } sigaction(SIGUSR1, &sa, NULL); prctl(PR_SIGREBOOT, SIGUSR1); ----- and from the kernel (called from sys_reboot): int kill_pid_ns(struct pid_namespace *pid_ns, int reason) { struct task_struct *tsk; struct siginfo info; if (pid_ns->notifier) { info.si_signo = SIGKILL; info.si_errno = 0; info.si_code = reason; info.si_pid = 0; info.si_uid = 0; return force_sig_info(notifier->sig, &info, notifier->tsk); } write_lock_irq(&tasklist_lock); tsk = pid_ns->child_reaper; write_unlock_irq(&tasklist_lock); info.si_signo = SIGKILL; info.si_errno = 0; info.si_code = SI_KERNEL; info.si_pid = 0; info.si_uid = 0; return force_sig_info(SIGKILL, &info, tsk); } Roughly, assuming pid_ns->notifier is reseted when we reparent to the init_pid_ns.init. What do you think ? Thanks -- Daniel ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-22 12:28 ` Daniel Lezcano @ 2011-08-22 15:44 ` Oleg Nesterov [not found] ` <20110822154448.GA8527-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2011-08-22 16:31 ` Bruno Prémont [not found] ` <4E524B73.3050704-GANU6spQydw@public.gmane.org> 1 sibling, 2 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-22 15:44 UTC (permalink / raw) To: Daniel Lezcano; +Cc: Serge E. Hallyn, akpm, bonbons, containers, linux-kernel On 08/22, Daniel Lezcano wrote: > > If we pass the reason to the exit_code of the init process, that will be > a bit weird as the process is signaled and did not exited no ? Just in case, you shouldn't change ->exit_code blindly. We should only change it if init was a) SIGKILL'ed and b) pid_ns->reboot_cmd is set. In this case we can assume that it was killed by sys_reboot. Now. I didn't really mean exit_state should be equal to sys_reboot's cmd arg. I thought about something like swicth (reboot_cmd) { case LINUX_REBOOT_CMD_RESTART: code = SIGHUP; break; case LINUX_REBOOT_CMD_HALT: code = SIGINT; // doesn't really matter what we report ... } we know that init can't be killed by SIGHUP/SIGINT, and this can't be confused with the case when init does exit(exit_code). But in fact I do not not think that WIFSIGNALED() is that important. init shouldn't exit anyway. > Furthermore, how to differentiate an application container (eg. a > script) exiting with an error with the same value of a reboot reason ? Well, I think it is better to fix the script than the kernel. Daniel, I am not arguing. I agree that this looks like the hack anyway. Just I think that other approaches are even worse imho. We should try to make the kernel change as simple as possible. > Wouldn't make sense to let the user to specify a signal via prctl where > the si_code is filled with the reason ? Sorry, I don't quite understand the idea... And, iiuc, the point was to "fix" sys_reboot() so that we do not need to mofify the distro/userspace? In short. Please do what you like more. But I'd like you to know, I'll argue with any complications which (afaics!) we can avoid, I promise ;) Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <20110822154448.GA8527-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <20110822154448.GA8527-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2011-08-22 16:31 ` Bruno Prémont 0 siblings, 0 replies; 63+ messages in thread From: Bruno Prémont @ 2011-08-22 16:31 UTC (permalink / raw) To: Oleg Nesterov Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, linux-kernel-u79uwXL29TY76Z2rM5mHXA On Mon, 22 August 2011 Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > On 08/22, Daniel Lezcano wrote: > > > > If we pass the reason to the exit_code of the init process, that will be > > a bit weird as the process is signaled and did not exited no ? > > Just in case, you shouldn't change ->exit_code blindly. We should only > change it if init was a) SIGKILL'ed and b) pid_ns->reboot_cmd is set. > In this case we can assume that it was killed by sys_reboot. > > Now. I didn't really mean exit_state should be equal to sys_reboot's > cmd arg. I thought about something like > > swicth (reboot_cmd) { > case LINUX_REBOOT_CMD_RESTART: > code = SIGHUP; > break; > case LINUX_REBOOT_CMD_HALT: > code = SIGINT; // doesn't really matter what we report > ... > } Isn't it possible to add the two cases to si_code possible values, e.g. CDL_RESTART, CDL_HALT (or CDL_SYS_RESTART, CDL_SYS_HALT to avoid possible confusion with CDL_STOPPED)? Then on sys_reboot() flag container init and kill it (this way sys_reboot() preserves its "will not return on success for restart/halt" scematic)? Then container init would see CLD_KILLED replaced with matching reboot reason. > we know that init can't be killed by SIGHUP/SIGINT, and this can't be > confused with the case when init does exit(exit_code). > > But in fact I do not not think that WIFSIGNALED() is that important. > init shouldn't exit anyway. Playing with the exit code is probably more problematic (maybe even more once a process can unshare PID_NS as then any process spawning children may see the exit codes with special meaning and mis-reporting the child as having failed when return code is non-zero). > > Furthermore, how to differentiate an application container (eg. a > > script) exiting with an error with the same value of a reboot reason ? > > Well, I think it is better to fix the script than the kernel. IMHO this depends a lot on the point-of-view regarding pid_ns versus container where technically both may be the same though user might have completely different setup in mind. For container init should never exit by itself, for simple segmentation the basic script approach makes a lot of sense and gets invisible to parent once unshare(PID_NS) gets possible. > Daniel, I am not arguing. I agree that this looks like the hack anyway. > Just I think that other approaches are even worse imho. We should try > to make the kernel change as simple as possible. > > > > Wouldn't make sense to let the user to specify a signal via prctl where > > the si_code is filled with the reason ? > > Sorry, I don't quite understand the idea... > > And, iiuc, the point was to "fix" sys_reboot() so that we do not need > to mofify the distro/userspace? That's definitely the goal (not modify distro/userspace running inside container). Bruno ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-22 15:44 ` Oleg Nesterov [not found] ` <20110822154448.GA8527-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2011-08-22 16:31 ` Bruno Prémont [not found] ` <20110822183134.10390b46-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> 2011-08-22 17:39 ` Oleg Nesterov 1 sibling, 2 replies; 63+ messages in thread From: Bruno Prémont @ 2011-08-22 16:31 UTC (permalink / raw) To: Oleg Nesterov Cc: Daniel Lezcano, Serge E. Hallyn, akpm, containers, linux-kernel On Mon, 22 August 2011 Oleg Nesterov <oleg@redhat.com> wrote: > On 08/22, Daniel Lezcano wrote: > > > > If we pass the reason to the exit_code of the init process, that will be > > a bit weird as the process is signaled and did not exited no ? > > Just in case, you shouldn't change ->exit_code blindly. We should only > change it if init was a) SIGKILL'ed and b) pid_ns->reboot_cmd is set. > In this case we can assume that it was killed by sys_reboot. > > Now. I didn't really mean exit_state should be equal to sys_reboot's > cmd arg. I thought about something like > > swicth (reboot_cmd) { > case LINUX_REBOOT_CMD_RESTART: > code = SIGHUP; > break; > case LINUX_REBOOT_CMD_HALT: > code = SIGINT; // doesn't really matter what we report > ... > } Isn't it possible to add the two cases to si_code possible values, e.g. CDL_RESTART, CDL_HALT (or CDL_SYS_RESTART, CDL_SYS_HALT to avoid possible confusion with CDL_STOPPED)? Then on sys_reboot() flag container init and kill it (this way sys_reboot() preserves its "will not return on success for restart/halt" scematic)? Then container init would see CLD_KILLED replaced with matching reboot reason. > we know that init can't be killed by SIGHUP/SIGINT, and this can't be > confused with the case when init does exit(exit_code). > > But in fact I do not not think that WIFSIGNALED() is that important. > init shouldn't exit anyway. Playing with the exit code is probably more problematic (maybe even more once a process can unshare PID_NS as then any process spawning children may see the exit codes with special meaning and mis-reporting the child as having failed when return code is non-zero). > > Furthermore, how to differentiate an application container (eg. a > > script) exiting with an error with the same value of a reboot reason ? > > Well, I think it is better to fix the script than the kernel. IMHO this depends a lot on the point-of-view regarding pid_ns versus container where technically both may be the same though user might have completely different setup in mind. For container init should never exit by itself, for simple segmentation the basic script approach makes a lot of sense and gets invisible to parent once unshare(PID_NS) gets possible. > Daniel, I am not arguing. I agree that this looks like the hack anyway. > Just I think that other approaches are even worse imho. We should try > to make the kernel change as simple as possible. > > > > Wouldn't make sense to let the user to specify a signal via prctl where > > the si_code is filled with the reason ? > > Sorry, I don't quite understand the idea... > > And, iiuc, the point was to "fix" sys_reboot() so that we do not need > to mofify the distro/userspace? That's definitely the goal (not modify distro/userspace running inside container). Bruno ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <20110822183134.10390b46-hY15tx4IgV39zxVx7UNMDg@public.gmane.org>]
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <20110822183134.10390b46-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> @ 2011-08-22 17:39 ` Oleg Nesterov 0 siblings, 0 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-22 17:39 UTC (permalink / raw) To: Bruno Prémont Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, linux-kernel-u79uwXL29TY76Z2rM5mHXA On 08/22, Bruno Prémont wrote: > > On Mon, 22 August 2011 Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > > On 08/22, Daniel Lezcano wrote: > > > > > > If we pass the reason to the exit_code of the init process, that will be > > > a bit weird as the process is signaled and did not exited no ? > > > > Just in case, you shouldn't change ->exit_code blindly. We should only > > change it if init was a) SIGKILL'ed and b) pid_ns->reboot_cmd is set. > > In this case we can assume that it was killed by sys_reboot. > > > > Now. I didn't really mean exit_state should be equal to sys_reboot's > > cmd arg. I thought about something like > > > > swicth (reboot_cmd) { > > case LINUX_REBOOT_CMD_RESTART: > > code = SIGHUP; > > break; > > case LINUX_REBOOT_CMD_HALT: > > code = SIGINT; // doesn't really matter what we report > > ... > > } > > Isn't it possible to add the two cases to si_code possible values, e.g. > CDL_RESTART, CDL_HALT (or CDL_SYS_RESTART, CDL_SYS_HALT How? You should change do_wait() paths then. Even if we could, personally I'd strongly object ;) Look, you have the very specific problem. The kernel can't do everything to make everyone happy. There is tradeoff. But if you really meant siginfo->si_code, I do not understand at all what you actually mean. This info is not preserved when the task exits. > to avoid possible > confusion with CDL_STOPPED)? How it is possible to confuse this with CDL_STOPPED? > Then on sys_reboot() flag container init and kill it (this way sys_reboot() > preserves its "will not return on success for restart/halt" scematic)? This is what I suggested... > Then container init would see CLD_KILLED replaced with matching reboot > reason. For what? its parent need this info, not container init. I guess I got lost completely. > Playing with the exit code is probably more problematic OK, then please do something else. I do not pretend I really understand what do you really need to solve your problem. But please do not forget the kernel is already very complex and full of misc hacks ;) > > And, iiuc, the point was to "fix" sys_reboot() so that we do not need > > to mofify the distro/userspace? > > That's definitely the goal (not modify distro/userspace running inside > container). In this case I do not understand how prctl() can help. But please do not try to convince me, this is simply unnecessary ;) Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-22 16:31 ` Bruno Prémont [not found] ` <20110822183134.10390b46-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> @ 2011-08-22 17:39 ` Oleg Nesterov [not found] ` <20110822173949.GA13242-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2011-08-22 19:17 ` Bruno Prémont 1 sibling, 2 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-22 17:39 UTC (permalink / raw) To: Bruno Prémont Cc: Daniel Lezcano, Serge E. Hallyn, akpm, containers, linux-kernel On 08/22, Bruno Prémont wrote: > > On Mon, 22 August 2011 Oleg Nesterov <oleg@redhat.com> wrote: > > On 08/22, Daniel Lezcano wrote: > > > > > > If we pass the reason to the exit_code of the init process, that will be > > > a bit weird as the process is signaled and did not exited no ? > > > > Just in case, you shouldn't change ->exit_code blindly. We should only > > change it if init was a) SIGKILL'ed and b) pid_ns->reboot_cmd is set. > > In this case we can assume that it was killed by sys_reboot. > > > > Now. I didn't really mean exit_state should be equal to sys_reboot's > > cmd arg. I thought about something like > > > > swicth (reboot_cmd) { > > case LINUX_REBOOT_CMD_RESTART: > > code = SIGHUP; > > break; > > case LINUX_REBOOT_CMD_HALT: > > code = SIGINT; // doesn't really matter what we report > > ... > > } > > Isn't it possible to add the two cases to si_code possible values, e.g. > CDL_RESTART, CDL_HALT (or CDL_SYS_RESTART, CDL_SYS_HALT How? You should change do_wait() paths then. Even if we could, personally I'd strongly object ;) Look, you have the very specific problem. The kernel can't do everything to make everyone happy. There is tradeoff. But if you really meant siginfo->si_code, I do not understand at all what you actually mean. This info is not preserved when the task exits. > to avoid possible > confusion with CDL_STOPPED)? How it is possible to confuse this with CDL_STOPPED? > Then on sys_reboot() flag container init and kill it (this way sys_reboot() > preserves its "will not return on success for restart/halt" scematic)? This is what I suggested... > Then container init would see CLD_KILLED replaced with matching reboot > reason. For what? its parent need this info, not container init. I guess I got lost completely. > Playing with the exit code is probably more problematic OK, then please do something else. I do not pretend I really understand what do you really need to solve your problem. But please do not forget the kernel is already very complex and full of misc hacks ;) > > And, iiuc, the point was to "fix" sys_reboot() so that we do not need > > to mofify the distro/userspace? > > That's definitely the goal (not modify distro/userspace running inside > container). In this case I do not understand how prctl() can help. But please do not try to convince me, this is simply unnecessary ;) Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <20110822173949.GA13242-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <20110822173949.GA13242-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2011-08-22 19:17 ` Bruno Prémont 0 siblings, 0 replies; 63+ messages in thread From: Bruno Prémont @ 2011-08-22 19:17 UTC (permalink / raw) To: Oleg Nesterov Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, linux-kernel-u79uwXL29TY76Z2rM5mHXA On Mon, 22 August 2011 Oleg Nesterov wrote: > On 08/22, Bruno Prémont wrote: > > > > On Mon, 22 August 2011 Oleg Nesterov <oleg@redhat.com> wrote: > > > On 08/22, Daniel Lezcano wrote: > > > > > > > > If we pass the reason to the exit_code of the init process, that will be > > > > a bit weird as the process is signaled and did not exited no ? > > > > > > Just in case, you shouldn't change ->exit_code blindly. We should only > > > change it if init was a) SIGKILL'ed and b) pid_ns->reboot_cmd is set. > > > In this case we can assume that it was killed by sys_reboot. > > > > > > Now. I didn't really mean exit_state should be equal to sys_reboot's > > > cmd arg. I thought about something like > > > > > > swicth (reboot_cmd) { > > > case LINUX_REBOOT_CMD_RESTART: > > > code = SIGHUP; > > > break; > > > case LINUX_REBOOT_CMD_HALT: > > > code = SIGINT; // doesn't really matter what we report > > > ... > > > } > > > > Isn't it possible to add the two cases to si_code possible values, e.g. > > CDL_RESTART, CDL_HALT (or CDL_SYS_RESTART, CDL_SYS_HALT > > How? You should change do_wait() paths then. Even if we could, personally > I'd strongly object ;) Look, you have the very specific problem. The kernel > can't do everything to make everyone happy. There is tradeoff. > > But if you really meant siginfo->si_code, I do not understand at all what > you actually mean. This info is not preserved when the task exits. I've been reading do_wait() code a bit, it decides between CLD_KILLED and CLD_DUMPED based on a bit of (struct task_struct).exit_code. So struct_task IS still available (how about it's namespace references? If the namespaces are not the reboot reason would need to be stored somewhere inside of struct task which might be some overhead too much) So as long as container init's task_struct exists the reboot reason could be preserved and used to replace CLD_DUMPED/CLD_KILLED siginfo->si_code. So yes, tradeoff then is where to adjust code in order to get the information along. > > to avoid possible > > confusion with CDL_STOPPED)? > > How it is possible to confuse this with CDL_STOPPED? Confusion is usually reading too quickly! (e.g. skipping, overseeing, not realizing the details) > > Then on sys_reboot() flag container init and kill it (this way sys_reboot() > > preserves its "will not return on success for restart/halt" scematic)? > > This is what I suggested... > > > Then container init would see CLD_KILLED replaced with matching reboot > > reason. > > For what? its parent need this info, not container init. I guess I got > lost completely. I meant CLD_KILLED as visible to parent in siginfo. Sure, as container's init gets killed it doesn't see anything itself. > > Playing with the exit code is probably more problematic > > OK, then please do something else. I do not pretend I really understand > what do you really need to solve your problem. But please do not forget > the kernel is already very complex and full of misc hacks ;) > > > > And, iiuc, the point was to "fix" sys_reboot() so that we do not need > > > to mofify the distro/userspace? > > > > That's definitely the goal (not modify distro/userspace running inside > > container). > > In this case I do not understand how prctl() can help. I'm not talking about prctl() - as I understand Daniel the prctl() part is for the process outside of the container, not the one inside. So for container hypervisor to say if it wants to get informed or not. > But please do not try to convince me, this is simply unnecessary ;) No, trying to know what's reasonably possible without having to do a second iteration when a new (foreseeable) features arrives. And also hopefully having the same understanding of the issue. Bruno _______________________________________________ Containers mailing list Containers@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/containers ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-22 17:39 ` Oleg Nesterov [not found] ` <20110822173949.GA13242-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2011-08-22 19:17 ` Bruno Prémont 2011-08-23 13:33 ` Oleg Nesterov [not found] ` <20110822211716.7c141d5c-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> 1 sibling, 2 replies; 63+ messages in thread From: Bruno Prémont @ 2011-08-22 19:17 UTC (permalink / raw) To: Oleg Nesterov Cc: Daniel Lezcano, Serge E. Hallyn, akpm, containers, linux-kernel On Mon, 22 August 2011 Oleg Nesterov wrote: > On 08/22, Bruno Prémont wrote: > > > > On Mon, 22 August 2011 Oleg Nesterov <oleg@redhat.com> wrote: > > > On 08/22, Daniel Lezcano wrote: > > > > > > > > If we pass the reason to the exit_code of the init process, that will be > > > > a bit weird as the process is signaled and did not exited no ? > > > > > > Just in case, you shouldn't change ->exit_code blindly. We should only > > > change it if init was a) SIGKILL'ed and b) pid_ns->reboot_cmd is set. > > > In this case we can assume that it was killed by sys_reboot. > > > > > > Now. I didn't really mean exit_state should be equal to sys_reboot's > > > cmd arg. I thought about something like > > > > > > swicth (reboot_cmd) { > > > case LINUX_REBOOT_CMD_RESTART: > > > code = SIGHUP; > > > break; > > > case LINUX_REBOOT_CMD_HALT: > > > code = SIGINT; // doesn't really matter what we report > > > ... > > > } > > > > Isn't it possible to add the two cases to si_code possible values, e.g. > > CDL_RESTART, CDL_HALT (or CDL_SYS_RESTART, CDL_SYS_HALT > > How? You should change do_wait() paths then. Even if we could, personally > I'd strongly object ;) Look, you have the very specific problem. The kernel > can't do everything to make everyone happy. There is tradeoff. > > But if you really meant siginfo->si_code, I do not understand at all what > you actually mean. This info is not preserved when the task exits. I've been reading do_wait() code a bit, it decides between CLD_KILLED and CLD_DUMPED based on a bit of (struct task_struct).exit_code. So struct_task IS still available (how about it's namespace references? If the namespaces are not the reboot reason would need to be stored somewhere inside of struct task which might be some overhead too much) So as long as container init's task_struct exists the reboot reason could be preserved and used to replace CLD_DUMPED/CLD_KILLED siginfo->si_code. So yes, tradeoff then is where to adjust code in order to get the information along. > > to avoid possible > > confusion with CDL_STOPPED)? > > How it is possible to confuse this with CDL_STOPPED? Confusion is usually reading too quickly! (e.g. skipping, overseeing, not realizing the details) > > Then on sys_reboot() flag container init and kill it (this way sys_reboot() > > preserves its "will not return on success for restart/halt" scematic)? > > This is what I suggested... > > > Then container init would see CLD_KILLED replaced with matching reboot > > reason. > > For what? its parent need this info, not container init. I guess I got > lost completely. I meant CLD_KILLED as visible to parent in siginfo. Sure, as container's init gets killed it doesn't see anything itself. > > Playing with the exit code is probably more problematic > > OK, then please do something else. I do not pretend I really understand > what do you really need to solve your problem. But please do not forget > the kernel is already very complex and full of misc hacks ;) > > > > And, iiuc, the point was to "fix" sys_reboot() so that we do not need > > > to mofify the distro/userspace? > > > > That's definitely the goal (not modify distro/userspace running inside > > container). > > In this case I do not understand how prctl() can help. I'm not talking about prctl() - as I understand Daniel the prctl() part is for the process outside of the container, not the one inside. So for container hypervisor to say if it wants to get informed or not. > But please do not try to convince me, this is simply unnecessary ;) No, trying to know what's reasonably possible without having to do a second iteration when a new (foreseeable) features arrives. And also hopefully having the same understanding of the issue. Bruno ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-22 19:17 ` Bruno Prémont @ 2011-08-23 13:33 ` Oleg Nesterov 2011-08-23 14:09 ` Greg Kurz [not found] ` <20110823133302.GA19582-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> [not found] ` <20110822211716.7c141d5c-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> 1 sibling, 2 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-23 13:33 UTC (permalink / raw) To: Bruno Prémont Cc: Daniel Lezcano, Serge E. Hallyn, akpm, containers, linux-kernel On 08/22, Bruno Prémont wrote: > > On Mon, 22 August 2011 Oleg Nesterov wrote: > > On 08/22, Bruno Prémont wrote: > > > > > > Isn't it possible to add the two cases to si_code possible values, e.g. > > > CDL_RESTART, CDL_HALT (or CDL_SYS_RESTART, CDL_SYS_HALT > > > > How? You should change do_wait() paths then. Even if we could, personally > > I'd strongly object ;) Look, you have the very specific problem. The kernel > > can't do everything to make everyone happy. There is tradeoff. > > > > But if you really meant siginfo->si_code, I do not understand at all what > > you actually mean. This info is not preserved when the task exits. > > I've been reading do_wait() code a bit, it decides between CLD_KILLED and > CLD_DUMPED based on a bit of (struct task_struct).exit_code. Yes. But we were talking about the CLD_EXITED/CLD_KILLED difference. And in fact this CLD_ doesn't matter at all. sys_waitid(info) can see it, but you can simply look at "status". There is no additional info. > So struct_task IS still available Sure. But I do not understand why do you mention this... And, I think, in this discussion we can pretend that only task->exit_code is still available. > (how about it's namespace references? If > the namespaces are not the reboot reason would need to be stored somewhere > inside of struct task which might be some overhead too much) > > So as long as container init's task_struct exists the reboot reason could > be preserved and used to replace CLD_DUMPED/CLD_KILLED siginfo->si_code. At least now I understand why did you mention si_code/CLD before. You meant waitid(). I thought you were talking about the death-notifications which can't report CLD_ you need. I strongly object. We shouldn't uglify wait_task_zombie() to solve the very specific problem. And once again. sub_init->parent does wiat(&status) and sees WIFSIGNALED() && WTERMSIG(status) == SIGHUP. This can only mean that sys_reboot(LINUX_REBOOT_CMD_RESTART) was called. It _can not_ be really killed by SIGHUP, it must be CMD_RESTART. Why this can't work? Why do you want the additional complications? > > > > And, iiuc, the point was to "fix" sys_reboot() so that we do not need > > > > to mofify the distro/userspace? > > > > > > That's definitely the goal (not modify distro/userspace running inside > > > container). > > > > In this case I do not understand how prctl() can help. > > I'm not talking about prctl() - as I understand Daniel the prctl() part is > for the process outside of the container, not the one inside. > So for container hypervisor to say if it wants to get informed or not. OK, I certainly misunderstood him. And still can't understand what exactly was suggested ;) > > But please do not try to convince me, this is simply unnecessary ;) > > No, trying to know what's reasonably possible Can't resist... IMHO, imho, imho, but in this case I believe "reasonably possible" == "simplest" ;) Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-23 13:33 ` Oleg Nesterov @ 2011-08-23 14:09 ` Greg Kurz [not found] ` <1314108566.3465.29.camel-GiB8zCg7hOfDOqzlkpFKJg@public.gmane.org> 2011-08-23 14:29 ` Oleg Nesterov [not found] ` <20110823133302.GA19582-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 1 sibling, 2 replies; 63+ messages in thread From: Greg Kurz @ 2011-08-23 14:09 UTC (permalink / raw) To: Oleg Nesterov; +Cc: Bruno Prémont, containers, akpm, linux-kernel On Tue, 2011-08-23 at 15:33 +0200, Oleg Nesterov wrote: > [...] > At least now I understand why did you mention si_code/CLD before. You > meant waitid(). I thought you were talking about the death-notifications > which can't report CLD_ you need. > > I strongly object. We shouldn't uglify wait_task_zombie() to solve the > very specific problem. > > And once again. sub_init->parent does wiat(&status) and sees > WIFSIGNALED() && WTERMSIG(status) == SIGHUP. This can only mean that > sys_reboot(LINUX_REBOOT_CMD_RESTART) was called. It _can not_ be really > killed by SIGHUP, it must be CMD_RESTART. > > Why this can't work? Why do you want the additional complications? > I don't see either what could go wrong with you approach. It doesn't mess with critical wait() or signal paths. It's definitely the way to go. Thanks. -- Gregory Kurz gkurz@fr.ibm.com Software Engineer @ IBM/Meiosys http://www.ibm.com Tel +33 (0)534 638 479 Fax +33 (0)561 400 420 "Anarchy is about taking complete responsibility for yourself." Alan Moore. ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <1314108566.3465.29.camel-GiB8zCg7hOfDOqzlkpFKJg@public.gmane.org>]
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <1314108566.3465.29.camel-GiB8zCg7hOfDOqzlkpFKJg@public.gmane.org> @ 2011-08-23 14:29 ` Oleg Nesterov 0 siblings, 0 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-23 14:29 UTC (permalink / raw) To: Greg Kurz Cc: Bruno Prémont, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, linux-kernel-u79uwXL29TY76Z2rM5mHXA On 08/23, Greg Kurz wrote: > > On Tue, 2011-08-23 at 15:33 +0200, Oleg Nesterov wrote: > > [...] > > At least now I understand why did you mention si_code/CLD before. You > > meant waitid(). I thought you were talking about the death-notifications > > which can't report CLD_ you need. > > > > I strongly object. We shouldn't uglify wait_task_zombie() to solve the > > very specific problem. > > > > And once again. sub_init->parent does wiat(&status) and sees > > WIFSIGNALED() && WTERMSIG(status) == SIGHUP. This can only mean that > > sys_reboot(LINUX_REBOOT_CMD_RESTART) was called. It _can not_ be really > > killed by SIGHUP, it must be CMD_RESTART. > > > > Why this can't work? Why do you want the additional complications? > > > > I don't see either what could go wrong with you approach. Thanks ;) Just in case... instead of WIFSIGNALED/WTERMSIG we can also report the exit code in the upper bits. I mean, switch (reboot_cmd) { case LINUX_REBOOT_CMD_RESTART: code = 1 << 16; break; case LINUX_REBOOT_CMD_HALT: code = 2 << 16; break; } this can't be confused with the normal exit(code), just the parent should be careful, I am not sure this can't confuse WIFEXITED/WEXITSTATUS user-space macroses. wait(&status) takes "int *", we have a room for additional info, and wait_task_zombie() simply copies exit_code. Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-23 14:09 ` Greg Kurz [not found] ` <1314108566.3465.29.camel-GiB8zCg7hOfDOqzlkpFKJg@public.gmane.org> @ 2011-08-23 14:29 ` Oleg Nesterov [not found] ` <20110823142914.GA22593-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2011-08-24 19:44 ` Bruno Prémont 1 sibling, 2 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-23 14:29 UTC (permalink / raw) To: Greg Kurz; +Cc: Bruno Prémont, containers, akpm, linux-kernel On 08/23, Greg Kurz wrote: > > On Tue, 2011-08-23 at 15:33 +0200, Oleg Nesterov wrote: > > [...] > > At least now I understand why did you mention si_code/CLD before. You > > meant waitid(). I thought you were talking about the death-notifications > > which can't report CLD_ you need. > > > > I strongly object. We shouldn't uglify wait_task_zombie() to solve the > > very specific problem. > > > > And once again. sub_init->parent does wiat(&status) and sees > > WIFSIGNALED() && WTERMSIG(status) == SIGHUP. This can only mean that > > sys_reboot(LINUX_REBOOT_CMD_RESTART) was called. It _can not_ be really > > killed by SIGHUP, it must be CMD_RESTART. > > > > Why this can't work? Why do you want the additional complications? > > > > I don't see either what could go wrong with you approach. Thanks ;) Just in case... instead of WIFSIGNALED/WTERMSIG we can also report the exit code in the upper bits. I mean, switch (reboot_cmd) { case LINUX_REBOOT_CMD_RESTART: code = 1 << 16; break; case LINUX_REBOOT_CMD_HALT: code = 2 << 16; break; } this can't be confused with the normal exit(code), just the parent should be careful, I am not sure this can't confuse WIFEXITED/WEXITSTATUS user-space macroses. wait(&status) takes "int *", we have a room for additional info, and wait_task_zombie() simply copies exit_code. Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <20110823142914.GA22593-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <20110823142914.GA22593-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2011-08-24 19:44 ` Bruno Prémont 0 siblings, 0 replies; 63+ messages in thread From: Bruno Prémont @ 2011-08-24 19:44 UTC (permalink / raw) To: Oleg Nesterov Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, linux-kernel-u79uwXL29TY76Z2rM5mHXA On Tue, 23 August 2011 Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > On 08/23, Greg Kurz wrote: > > > > On Tue, 2011-08-23 at 15:33 +0200, Oleg Nesterov wrote: > > > [...] > > > At least now I understand why did you mention si_code/CLD before. You > > > meant waitid(). I thought you were talking about the death-notifications > > > which can't report CLD_ you need. > > > > > > I strongly object. We shouldn't uglify wait_task_zombie() to solve the > > > very specific problem. > > > > > > And once again. sub_init->parent does wiat(&status) and sees > > > WIFSIGNALED() && WTERMSIG(status) == SIGHUP. This can only mean that > > > sys_reboot(LINUX_REBOOT_CMD_RESTART) was called. It _can not_ be really > > > killed by SIGHUP, it must be CMD_RESTART. > > > > > > Why this can't work? Why do you want the additional complications? > > > > > > > I don't see either what could go wrong with you approach. > > Thanks ;) > > Just in case... instead of WIFSIGNALED/WTERMSIG we can also report > the exit code in the upper bits. I mean, > > switch (reboot_cmd) { > case LINUX_REBOOT_CMD_RESTART: > code = 1 << 16; > break; > case LINUX_REBOOT_CMD_HALT: > code = 2 << 16; > break; > } That looks nice and simple! > this can't be confused with the normal exit(code), just the parent > should be careful, I am not sure this can't confuse WIFEXITED/WEXITSTATUS > user-space macroses. For recent glibc (2.12.2): sys/wait.h and stdlib.h: # define WIFEXITED(status) __WIFEXITED (__WAIT_INT (status)) # define WEXITSTATUS(status) __WEXITSTATUS (__WAIT_INT (status)) bits/waitstatus.h: #define __WIFEXITED(status) (__WTERMSIG(status) == 0) #define __WTERMSIG(status) ((status) & 0x7f) #define __WEXITSTATUS(status) (((status) & 0xff00) >> 8) For this version of glibc on x86 and amd64 it looks fine. Would have to look at other versions and other libcs to make sure there are no surprises. Bruno > wait(&status) takes "int *", we have a room for additional info, and > wait_task_zombie() simply copies exit_code. > > Oleg. > ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-23 14:29 ` Oleg Nesterov [not found] ` <20110823142914.GA22593-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2011-08-24 19:44 ` Bruno Prémont 2011-08-25 15:37 ` Oleg Nesterov [not found] ` <20110824214418.474b24c6-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> 1 sibling, 2 replies; 63+ messages in thread From: Bruno Prémont @ 2011-08-24 19:44 UTC (permalink / raw) To: Oleg Nesterov; +Cc: Greg Kurz, containers, akpm, linux-kernel On Tue, 23 August 2011 Oleg Nesterov <oleg@redhat.com> wrote: > On 08/23, Greg Kurz wrote: > > > > On Tue, 2011-08-23 at 15:33 +0200, Oleg Nesterov wrote: > > > [...] > > > At least now I understand why did you mention si_code/CLD before. You > > > meant waitid(). I thought you were talking about the death-notifications > > > which can't report CLD_ you need. > > > > > > I strongly object. We shouldn't uglify wait_task_zombie() to solve the > > > very specific problem. > > > > > > And once again. sub_init->parent does wiat(&status) and sees > > > WIFSIGNALED() && WTERMSIG(status) == SIGHUP. This can only mean that > > > sys_reboot(LINUX_REBOOT_CMD_RESTART) was called. It _can not_ be really > > > killed by SIGHUP, it must be CMD_RESTART. > > > > > > Why this can't work? Why do you want the additional complications? > > > > > > > I don't see either what could go wrong with you approach. > > Thanks ;) > > Just in case... instead of WIFSIGNALED/WTERMSIG we can also report > the exit code in the upper bits. I mean, > > switch (reboot_cmd) { > case LINUX_REBOOT_CMD_RESTART: > code = 1 << 16; > break; > case LINUX_REBOOT_CMD_HALT: > code = 2 << 16; > break; > } That looks nice and simple! > this can't be confused with the normal exit(code), just the parent > should be careful, I am not sure this can't confuse WIFEXITED/WEXITSTATUS > user-space macroses. For recent glibc (2.12.2): sys/wait.h and stdlib.h: # define WIFEXITED(status) __WIFEXITED (__WAIT_INT (status)) # define WEXITSTATUS(status) __WEXITSTATUS (__WAIT_INT (status)) bits/waitstatus.h: #define __WIFEXITED(status) (__WTERMSIG(status) == 0) #define __WTERMSIG(status) ((status) & 0x7f) #define __WEXITSTATUS(status) (((status) & 0xff00) >> 8) For this version of glibc on x86 and amd64 it looks fine. Would have to look at other versions and other libcs to make sure there are no surprises. Bruno > wait(&status) takes "int *", we have a room for additional info, and > wait_task_zombie() simply copies exit_code. > > Oleg. > ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot 2011-08-24 19:44 ` Bruno Prémont @ 2011-08-25 15:37 ` Oleg Nesterov [not found] ` <20110824214418.474b24c6-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> 1 sibling, 0 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-25 15:37 UTC (permalink / raw) To: Bruno Prémont; +Cc: Greg Kurz, containers, akpm, linux-kernel On 08/24, Bruno Prémont wrote: > > On Tue, 23 August 2011 Oleg Nesterov <oleg@redhat.com> wrote: > > > > Just in case... instead of WIFSIGNALED/WTERMSIG we can also report > > the exit code in the upper bits. I mean, > > > > switch (reboot_cmd) { > > case LINUX_REBOOT_CMD_RESTART: > > code = 1 << 16; > > break; > > case LINUX_REBOOT_CMD_HALT: > > code = 2 << 16; > > break; > > } > > That looks nice and simple! Great. To me, WIFSIGNALED() looks better, but this is subjective and in any case this is up to you. > > this can't be confused with the normal exit(code), just the parent > > should be careful, I am not sure this can't confuse WIFEXITED/WEXITSTATUS > > user-space macroses. > > For recent glibc (2.12.2): > sys/wait.h and stdlib.h: > # define WIFEXITED(status) __WIFEXITED (__WAIT_INT (status)) > # define WEXITSTATUS(status) __WEXITSTATUS (__WAIT_INT (status)) > bits/waitstatus.h: > #define __WIFEXITED(status) (__WTERMSIG(status) == 0) > #define __WTERMSIG(status) ((status) & 0x7f) > #define __WEXITSTATUS(status) (((status) & 0xff00) >> 8) ... and note that __WEXITSTATUS() does "& 0xff00", it uses the lower 16 bits. So the parent should read status "by hand". Not a problem, I think. Say, a traced task reports the additional info this way. Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <20110824214418.474b24c6-hY15tx4IgV39zxVx7UNMDg@public.gmane.org>]
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <20110824214418.474b24c6-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> @ 2011-08-25 15:37 ` Oleg Nesterov 0 siblings, 0 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-25 15:37 UTC (permalink / raw) To: Bruno Prémont Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, linux-kernel-u79uwXL29TY76Z2rM5mHXA On 08/24, Bruno Prémont wrote: > > On Tue, 23 August 2011 Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > > > > Just in case... instead of WIFSIGNALED/WTERMSIG we can also report > > the exit code in the upper bits. I mean, > > > > switch (reboot_cmd) { > > case LINUX_REBOOT_CMD_RESTART: > > code = 1 << 16; > > break; > > case LINUX_REBOOT_CMD_HALT: > > code = 2 << 16; > > break; > > } > > That looks nice and simple! Great. To me, WIFSIGNALED() looks better, but this is subjective and in any case this is up to you. > > this can't be confused with the normal exit(code), just the parent > > should be careful, I am not sure this can't confuse WIFEXITED/WEXITSTATUS > > user-space macroses. > > For recent glibc (2.12.2): > sys/wait.h and stdlib.h: > # define WIFEXITED(status) __WIFEXITED (__WAIT_INT (status)) > # define WEXITSTATUS(status) __WEXITSTATUS (__WAIT_INT (status)) > bits/waitstatus.h: > #define __WIFEXITED(status) (__WTERMSIG(status) == 0) > #define __WTERMSIG(status) ((status) & 0x7f) > #define __WEXITSTATUS(status) (((status) & 0xff00) >> 8) ... and note that __WEXITSTATUS() does "& 0xff00", it uses the lower 16 bits. So the parent should read status "by hand". Not a problem, I think. Say, a traced task reports the additional info this way. Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <20110823133302.GA19582-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <20110823133302.GA19582-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2011-08-23 14:09 ` Greg Kurz 0 siblings, 0 replies; 63+ messages in thread From: Greg Kurz @ 2011-08-23 14:09 UTC (permalink / raw) To: Oleg Nesterov Cc: Bruno Prémont, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, linux-kernel-u79uwXL29TY76Z2rM5mHXA On Tue, 2011-08-23 at 15:33 +0200, Oleg Nesterov wrote: > [...] > At least now I understand why did you mention si_code/CLD before. You > meant waitid(). I thought you were talking about the death-notifications > which can't report CLD_ you need. > > I strongly object. We shouldn't uglify wait_task_zombie() to solve the > very specific problem. > > And once again. sub_init->parent does wiat(&status) and sees > WIFSIGNALED() && WTERMSIG(status) == SIGHUP. This can only mean that > sys_reboot(LINUX_REBOOT_CMD_RESTART) was called. It _can not_ be really > killed by SIGHUP, it must be CMD_RESTART. > > Why this can't work? Why do you want the additional complications? > I don't see either what could go wrong with you approach. It doesn't mess with critical wait() or signal paths. It's definitely the way to go. Thanks. -- Gregory Kurz gkurz-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org Software Engineer @ IBM/Meiosys http://www.ibm.com Tel +33 (0)534 638 479 Fax +33 (0)561 400 420 "Anarchy is about taking complete responsibility for yourself." Alan Moore. ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <20110822211716.7c141d5c-hY15tx4IgV39zxVx7UNMDg@public.gmane.org>]
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <20110822211716.7c141d5c-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> @ 2011-08-23 13:33 ` Oleg Nesterov 0 siblings, 0 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-23 13:33 UTC (permalink / raw) To: Bruno Prémont Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, linux-kernel-u79uwXL29TY76Z2rM5mHXA On 08/22, Bruno Prémont wrote: > > On Mon, 22 August 2011 Oleg Nesterov wrote: > > On 08/22, Bruno Prémont wrote: > > > > > > Isn't it possible to add the two cases to si_code possible values, e.g. > > > CDL_RESTART, CDL_HALT (or CDL_SYS_RESTART, CDL_SYS_HALT > > > > How? You should change do_wait() paths then. Even if we could, personally > > I'd strongly object ;) Look, you have the very specific problem. The kernel > > can't do everything to make everyone happy. There is tradeoff. > > > > But if you really meant siginfo->si_code, I do not understand at all what > > you actually mean. This info is not preserved when the task exits. > > I've been reading do_wait() code a bit, it decides between CLD_KILLED and > CLD_DUMPED based on a bit of (struct task_struct).exit_code. Yes. But we were talking about the CLD_EXITED/CLD_KILLED difference. And in fact this CLD_ doesn't matter at all. sys_waitid(info) can see it, but you can simply look at "status". There is no additional info. > So struct_task IS still available Sure. But I do not understand why do you mention this... And, I think, in this discussion we can pretend that only task->exit_code is still available. > (how about it's namespace references? If > the namespaces are not the reboot reason would need to be stored somewhere > inside of struct task which might be some overhead too much) > > So as long as container init's task_struct exists the reboot reason could > be preserved and used to replace CLD_DUMPED/CLD_KILLED siginfo->si_code. At least now I understand why did you mention si_code/CLD before. You meant waitid(). I thought you were talking about the death-notifications which can't report CLD_ you need. I strongly object. We shouldn't uglify wait_task_zombie() to solve the very specific problem. And once again. sub_init->parent does wiat(&status) and sees WIFSIGNALED() && WTERMSIG(status) == SIGHUP. This can only mean that sys_reboot(LINUX_REBOOT_CMD_RESTART) was called. It _can not_ be really killed by SIGHUP, it must be CMD_RESTART. Why this can't work? Why do you want the additional complications? > > > > And, iiuc, the point was to "fix" sys_reboot() so that we do not need > > > > to mofify the distro/userspace? > > > > > > That's definitely the goal (not modify distro/userspace running inside > > > container). > > > > In this case I do not understand how prctl() can help. > > I'm not talking about prctl() - as I understand Daniel the prctl() part is > for the process outside of the container, not the one inside. > So for container hypervisor to say if it wants to get informed or not. OK, I certainly misunderstood him. And still can't understand what exactly was suggested ;) > > But please do not try to convince me, this is simply unnecessary ;) > > No, trying to know what's reasonably possible Can't resist... IMHO, imho, imho, but in this case I believe "reasonably possible" == "simplest" ;) Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <4E524B73.3050704-GANU6spQydw@public.gmane.org>]
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <4E524B73.3050704-GANU6spQydw@public.gmane.org> @ 2011-08-22 15:44 ` Oleg Nesterov 0 siblings, 0 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-22 15:44 UTC (permalink / raw) To: Daniel Lezcano Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, linux-kernel-u79uwXL29TY76Z2rM5mHXA On 08/22, Daniel Lezcano wrote: > > If we pass the reason to the exit_code of the init process, that will be > a bit weird as the process is signaled and did not exited no ? Just in case, you shouldn't change ->exit_code blindly. We should only change it if init was a) SIGKILL'ed and b) pid_ns->reboot_cmd is set. In this case we can assume that it was killed by sys_reboot. Now. I didn't really mean exit_state should be equal to sys_reboot's cmd arg. I thought about something like swicth (reboot_cmd) { case LINUX_REBOOT_CMD_RESTART: code = SIGHUP; break; case LINUX_REBOOT_CMD_HALT: code = SIGINT; // doesn't really matter what we report ... } we know that init can't be killed by SIGHUP/SIGINT, and this can't be confused with the case when init does exit(exit_code). But in fact I do not not think that WIFSIGNALED() is that important. init shouldn't exit anyway. > Furthermore, how to differentiate an application container (eg. a > script) exiting with an error with the same value of a reboot reason ? Well, I think it is better to fix the script than the kernel. Daniel, I am not arguing. I agree that this looks like the hack anyway. Just I think that other approaches are even worse imho. We should try to make the kernel change as simple as possible. > Wouldn't make sense to let the user to specify a signal via prctl where > the si_code is filled with the reason ? Sorry, I don't quite understand the idea... And, iiuc, the point was to "fix" sys_reboot() so that we do not need to mofify the distro/userspace? In short. Please do what you like more. But I'd like you to know, I'll argue with any complications which (afaics!) we can avoid, I promise ;) Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <4E4DA461.8030006-GANU6spQydw@public.gmane.org>]
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <4E4DA461.8030006-GANU6spQydw@public.gmane.org> @ 2011-08-19 15:24 ` Oleg Nesterov 0 siblings, 0 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-19 15:24 UTC (permalink / raw) To: Daniel Lezcano Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, linux-kernel-u79uwXL29TY76Z2rM5mHXA On 08/19, Daniel Lezcano wrote: > > On 08/15/2011 04:47 PM, Oleg Nesterov wrote: > > > > - sys_reboot(cmd) does > > > > if (!global_namespace) { > > task_active_pid_ns(current)->reboot_cmd = cmd; > > sigkill_my_init(); > > } > > Hi Oleg, > > what would be your advice to get rid of from_ancestor_ns which prevent > the signal to be delivered to the init process ? Sure, a plain kill can't work. You can do force_sig_info(), this clears SIGNAL_UNKILLABLE. Hmm. But now I seem to recall we have other reasons to make the new sigkill_task() helper... We will see. Anyway, force_ should work afaics. Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <20110814213642.GB13799-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <20110814213642.GB13799-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> @ 2011-08-15 14:47 ` Oleg Nesterov 0 siblings, 0 replies; 63+ messages in thread From: Oleg Nesterov @ 2011-08-15 14:47 UTC (permalink / raw) To: Serge E. Hallyn Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, linux-kernel-u79uwXL29TY76Z2rM5mHXA On 08/14, Serge E. Hallyn wrote: > > Quoting Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > > On 08/11, Daniel Lezcano wrote: > > > > > > In the case of a VPS, when we shutdown/halt/reboot the container, the > > > reboot utility will invoke the sys_reboot syscall which has the bad > > > effect to reboot the host. > > > > Stupid question. Can't sys_reboot() simply kill init (and thus the whole > > pid_ns) in this case? > > The goal is to be able to distinguish a request for reboot from shutdown. > If we just kill the init, then the parent of init (the container monitor) > cannot restart the container to emulate reboot. OK, thanks. What if init reports the reason it was killed? Ignoring LINUX_REBOOT_CMD_CAD_/etc, I mean, roughly, - add "int reboot_cmd" into struct pid_namespace - sys_reboot(cmd) does if (!global_namespace) { task_active_pid_ns(current)->reboot_cmd = cmd; sigkill_my_init(); } - change zap_pid_ns_processes() to do if (pid_ns->reboot_cmd) // approximately current->exit_state = pid_ns->reboot_cmd; Then its parent can look at status after wait(&status). Not that I think this is very nice, but signals are not reliable. And once again, SIGCHLD doesn't queue. And, perhaps this doesn't matter, but sys_reboot() sends SIGCHLD and returns -EPERM, this can confuse the container. In any case. If you want to send a signal, please do not introduce SA_CLDREBOOT. Please do not play with ptrace or __wake_up_parent, this is meaningless. Just fill siginfo and send SIGCHLD unconditionally. Oleg. ^ permalink raw reply [flat|nested] 63+ messages in thread
[parent not found: <20110814161707.GB30846-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot [not found] ` <20110814161707.GB30846-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2011-08-14 21:36 ` Serge E. Hallyn 0 siblings, 0 replies; 63+ messages in thread From: Serge E. Hallyn @ 2011-08-14 21:36 UTC (permalink / raw) To: Oleg Nesterov Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, bonbons-ud5FBsm0p/xEiooADzr8i9i2O/JbrIOy, linux-kernel-u79uwXL29TY76Z2rM5mHXA Quoting Oleg Nesterov (oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > On 08/11, Daniel Lezcano wrote: > > > > In the case of a VPS, when we shutdown/halt/reboot the container, the > > reboot utility will invoke the sys_reboot syscall which has the bad > > effect to reboot the host. > > Stupid question. Can't sys_reboot() simply kill init (and thus the whole > pid_ns) in this case? The goal is to be able to distinguish a request for reboot from shutdown. If we just kill the init, then the parent of init (the container monitor) cannot restart the container to emulate reboot. [ sorry if I misunderstand what you're asking ] -serge ^ permalink raw reply [flat|nested] 63+ messages in thread
end of thread, other threads:[~2011-08-25 15:41 UTC | newest] Thread overview: 63+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-08-11 20:23 [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot Daniel Lezcano [not found] ` <1313094241-3674-1-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> 2011-08-11 20:24 ` [PATCH 1/2] add SA_CLDREBOOT flag Daniel Lezcano 2011-08-11 20:24 ` [PATCH 2/2] Notify container-init parent a 'reboot' occured Daniel Lezcano 2011-08-14 16:17 ` [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot Oleg Nesterov 2011-08-11 20:24 ` [PATCH 1/2] add SA_CLDREBOOT flag Daniel Lezcano [not found] ` <1313094241-3674-2-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> 2011-08-14 16:15 ` Oleg Nesterov 2011-08-14 16:15 ` Oleg Nesterov [not found] ` <20110814161532.GA30846-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2011-08-14 16:36 ` Bruno Prémont 2011-08-14 16:36 ` Bruno Prémont 2011-08-14 17:10 ` Oleg Nesterov [not found] ` <20110814183611.05937c96-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> 2011-08-14 17:10 ` Oleg Nesterov 2011-08-11 20:24 ` [PATCH 2/2] Notify container-init parent a 'reboot' occured Daniel Lezcano 2011-08-11 21:09 ` Serge Hallyn 2011-08-11 21:30 ` Daniel Lezcano 2011-08-11 21:30 ` Daniel Lezcano [not found] ` <4E444A04.3070403-GANU6spQydw@public.gmane.org> 2011-08-11 21:50 ` Serge Hallyn 2011-08-12 16:29 ` Serge Hallyn 2011-08-12 16:29 ` Serge Hallyn 2011-08-12 20:42 ` Daniel Lezcano [not found] ` <4E45904F.60904-GANU6spQydw@public.gmane.org> 2011-08-12 21:13 ` Serge Hallyn 2011-08-12 21:13 ` Serge Hallyn 2011-08-12 20:42 ` Daniel Lezcano 2011-08-11 21:50 ` Serge Hallyn [not found] ` <1313094241-3674-3-git-send-email-daniel.lezcano-GANU6spQydw@public.gmane.org> 2011-08-11 21:09 ` Serge Hallyn 2011-08-13 0:19 ` Matt Helsley 2011-08-14 16:01 ` Oleg Nesterov 2011-08-13 0:19 ` Matt Helsley 2011-08-13 14:41 ` Daniel Lezcano [not found] ` <20110813001959.GB5777-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org> 2011-08-13 14:41 ` Daniel Lezcano 2011-08-14 16:01 ` Oleg Nesterov 2011-08-14 16:17 ` [PATCH 0/2] Send a SIGCHLD to the init's pid namespace parent when reboot Oleg Nesterov 2011-08-14 21:36 ` Serge E. Hallyn 2011-08-15 14:47 ` Oleg Nesterov 2011-08-15 17:39 ` Serge E. Hallyn 2011-08-15 17:50 ` Daniel Lezcano [not found] ` <20110815173940.GA19620-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> 2011-08-15 17:50 ` Daniel Lezcano [not found] ` <20110815144744.GA9660-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2011-08-15 17:39 ` Serge E. Hallyn 2011-08-18 23:46 ` Daniel Lezcano 2011-08-18 23:46 ` Daniel Lezcano 2011-08-19 15:24 ` Oleg Nesterov [not found] ` <20110819152416.GA17034-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2011-08-22 12:28 ` Daniel Lezcano 2011-08-22 12:28 ` Daniel Lezcano 2011-08-22 15:44 ` Oleg Nesterov [not found] ` <20110822154448.GA8527-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2011-08-22 16:31 ` Bruno Prémont 2011-08-22 16:31 ` Bruno Prémont [not found] ` <20110822183134.10390b46-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> 2011-08-22 17:39 ` Oleg Nesterov 2011-08-22 17:39 ` Oleg Nesterov [not found] ` <20110822173949.GA13242-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2011-08-22 19:17 ` Bruno Prémont 2011-08-22 19:17 ` Bruno Prémont 2011-08-23 13:33 ` Oleg Nesterov 2011-08-23 14:09 ` Greg Kurz [not found] ` <1314108566.3465.29.camel-GiB8zCg7hOfDOqzlkpFKJg@public.gmane.org> 2011-08-23 14:29 ` Oleg Nesterov 2011-08-23 14:29 ` Oleg Nesterov [not found] ` <20110823142914.GA22593-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2011-08-24 19:44 ` Bruno Prémont 2011-08-24 19:44 ` Bruno Prémont 2011-08-25 15:37 ` Oleg Nesterov [not found] ` <20110824214418.474b24c6-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> 2011-08-25 15:37 ` Oleg Nesterov [not found] ` <20110823133302.GA19582-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2011-08-23 14:09 ` Greg Kurz [not found] ` <20110822211716.7c141d5c-hY15tx4IgV39zxVx7UNMDg@public.gmane.org> 2011-08-23 13:33 ` Oleg Nesterov [not found] ` <4E524B73.3050704-GANU6spQydw@public.gmane.org> 2011-08-22 15:44 ` Oleg Nesterov [not found] ` <4E4DA461.8030006-GANU6spQydw@public.gmane.org> 2011-08-19 15:24 ` Oleg Nesterov [not found] ` <20110814213642.GB13799-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> 2011-08-15 14:47 ` Oleg Nesterov [not found] ` <20110814161707.GB30846-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2011-08-14 21:36 ` Serge E. Hallyn
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.