* SIGCHLD signal sometimes sent with si_pid==0 (Linux 5.6.5) @ 2020-04-19 20:13 Christof Meerwald 2020-04-20 17:05 ` [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent Eric W. Biederman 2020-04-21 14:59 ` SIGCHLD signal sometimes sent with si_pid==0 (Linux 5.6.5) Eric W. Biederman 0 siblings, 2 replies; 22+ messages in thread From: Christof Meerwald @ 2020-04-19 20:13 UTC (permalink / raw) To: linux-kernel; +Cc: Eric W. Biederman Hi, this is probably related to commit 7a0cf094944e2540758b7f957eb6846d5126f535 (signal: Correct namespace fixups of si_pid and si_uid). With a 5.6.5 kernel I am seeing SIGCHLD signals that don't include a properly set si_pid field - this seems to happen for multi-threaded child processes. A simple test program (based on the sample from the signalfd man page): #include <sys/signalfd.h> #include <signal.h> #include <unistd.h> #include <spawn.h> #include <stdlib.h> #include <stdio.h> #define handle_error(msg) \ do { perror(msg); exit(EXIT_FAILURE); } while (0) int main(int argc, char *argv[]) { sigset_t mask; int sfd; struct signalfd_siginfo fdsi; ssize_t s; sigemptyset(&mask); sigaddset(&mask, SIGCHLD); if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1) handle_error("sigprocmask"); pid_t chldpid; char *chldargv[] = { "./sfdclient", NULL }; posix_spawn(&chldpid, "./sfdclient", NULL, NULL, chldargv, NULL); sfd = signalfd(-1, &mask, 0); if (sfd == -1) handle_error("signalfd"); for (;;) { s = read(sfd, &fdsi, sizeof(struct signalfd_siginfo)); if (s != sizeof(struct signalfd_siginfo)) handle_error("read"); if (fdsi.ssi_signo == SIGCHLD) { printf("Got SIGCHLD %d %d %d %d\n", fdsi.ssi_status, fdsi.ssi_code, fdsi.ssi_uid, fdsi.ssi_pid); return 0; } else { printf("Read unexpected signal\n"); } } } and a multi-threaded client to test with: #include <unistd.h> #include <pthread.h> void *f(void *arg) { sleep(100); } int main() { pthread_t t[8]; for (int i = 0; i != 8; ++i) { pthread_create(&t[i], NULL, f, NULL); } } I tried to do a bit of debugging and what seems to be happening is that /* From an ancestor pid namespace? */ if (!task_pid_nr_ns(current, task_active_pid_ns(t))) { fails inside task_pid_nr_ns because the check for "pid_alive" fails. This code seems to be called from do_notify_parent and there we actually have "tsk != current" (I am assuming both are threads of the current process?) Christof -- http://cmeerw.org sip:cmeerw at cmeerw.org mailto:cmeerw at cmeerw.org xmpp:cmeerw at cmeerw.org ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent 2020-04-19 20:13 SIGCHLD signal sometimes sent with si_pid==0 (Linux 5.6.5) Christof Meerwald @ 2020-04-20 17:05 ` Eric W. Biederman 2020-04-21 8:30 ` Christian Brauner 2020-04-21 9:04 ` Oleg Nesterov 2020-04-21 14:59 ` SIGCHLD signal sometimes sent with si_pid==0 (Linux 5.6.5) Eric W. Biederman 1 sibling, 2 replies; 22+ messages in thread From: Eric W. Biederman @ 2020-04-20 17:05 UTC (permalink / raw) To: linux-kernel; +Cc: Christof Meerwald, Linux Containers, Oleg Nesterov Christof Meerwald <cmeerw@cmeerw.org> writes: > Hi, > > this is probably related to commit > 7a0cf094944e2540758b7f957eb6846d5126f535 (signal: Correct namespace > fixups of si_pid and si_uid). > > With a 5.6.5 kernel I am seeing SIGCHLD signals that don't include a > properly set si_pid field - this seems to happen for multi-threaded > child processes. > > A simple test program (based on the sample from the signalfd man page): > > #include <sys/signalfd.h> > #include <signal.h> > #include <unistd.h> > #include <spawn.h> > #include <stdlib.h> > #include <stdio.h> > > #define handle_error(msg) \ > do { perror(msg); exit(EXIT_FAILURE); } while (0) > > int main(int argc, char *argv[]) > { > sigset_t mask; > int sfd; > struct signalfd_siginfo fdsi; > ssize_t s; > > sigemptyset(&mask); > sigaddset(&mask, SIGCHLD); > > if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1) > handle_error("sigprocmask"); > > pid_t chldpid; > char *chldargv[] = { "./sfdclient", NULL }; > posix_spawn(&chldpid, "./sfdclient", NULL, NULL, chldargv, NULL); > > sfd = signalfd(-1, &mask, 0); > if (sfd == -1) > handle_error("signalfd"); > > for (;;) { > s = read(sfd, &fdsi, sizeof(struct signalfd_siginfo)); > if (s != sizeof(struct signalfd_siginfo)) > handle_error("read"); > > if (fdsi.ssi_signo == SIGCHLD) { > printf("Got SIGCHLD %d %d %d %d\n", > fdsi.ssi_status, fdsi.ssi_code, > fdsi.ssi_uid, fdsi.ssi_pid); > return 0; > } else { > printf("Read unexpected signal\n"); > } > } > } > > > and a multi-threaded client to test with: > > #include <unistd.h> > #include <pthread.h> > > void *f(void *arg) > { > sleep(100); > } > > int main() > { > pthread_t t[8]; > > for (int i = 0; i != 8; ++i) > { > pthread_create(&t[i], NULL, f, NULL); > } > } > > I tried to do a bit of debugging and what seems to be happening is > that > > /* From an ancestor pid namespace? */ > if (!task_pid_nr_ns(current, task_active_pid_ns(t))) { > > fails inside task_pid_nr_ns because the check for "pid_alive" fails. > > This code seems to be called from do_notify_parent and there we > actually have "tsk != current" (I am assuming both are threads of the > current process?) I instrumented the code with a warning and received the following backtrace: > WARNING: CPU: 0 PID: 777 at kernel/pid.c:501 __task_pid_nr_ns.cold.6+0xc/0x15 > Modules linked in: > CPU: 0 PID: 777 Comm: sfdclient Not tainted 5.7.0-rc1userns+ #2924 > Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > RIP: 0010:__task_pid_nr_ns.cold.6+0xc/0x15 > Code: ff 66 90 48 83 ec 08 89 7c 24 04 48 8d 7e 08 48 8d 74 24 04 e8 9a b6 44 00 48 83 c4 08 c3 48 c7 c7 59 9f ac 82 e8 c2 c4 04 00 <0f> 0b e9 3fd > RSP: 0018:ffffc9000042fbf8 EFLAGS: 00010046 > RAX: 000000000000000c RBX: 0000000000000000 RCX: ffffc9000042faf4 > RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81193d29 > RBP: ffffc9000042fc18 R08: 0000000000000000 R09: 0000000000000001 > R10: 000000100f938416 R11: 0000000000000309 R12: ffff8880b941c140 > R13: 0000000000000000 R14: 0000000000000000 R15: ffff8880b941c140 > FS: 0000000000000000(0000) GS:ffff8880bca00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007f2e8c0a32e0 CR3: 0000000002e10000 CR4: 00000000000006f0 > Call Trace: > send_signal+0x1c8/0x310 > do_notify_parent+0x50f/0x550 > release_task.part.21+0x4fd/0x620 > do_exit+0x6f6/0xaf0 > do_group_exit+0x42/0xb0 > get_signal+0x13b/0xbb0 > do_signal+0x2b/0x670 > ? __audit_syscall_exit+0x24d/0x2b0 > ? rcu_read_lock_sched_held+0x4d/0x60 > ? kfree+0x24c/0x2b0 > do_syscall_64+0x176/0x640 > ? trace_hardirqs_off_thunk+0x1a/0x1c > entry_SYSCALL_64_after_hwframe+0x49/0xb3 The immediate problem is as Christof noticed that "pid_alive(current) == false". This happens because do_notify_parent is called from the last thread to exit in a process after that thread has been reaped. The bigger issue is that do_notify_parent can be called from any process that manages to wait on a thread of a multi-threaded process from wait_task_zombie. So any logic based upon current for do_notify_parent is just nonsense, as current can be pretty much anything. So change do_notify_parent to call __send_signal directly. Inspecting the code it appears this problem has existed since the pid namespace support started handling this case in 2.6.30. This fix only backports to 7a0cf094944e ("signal: Correct namespace fixups of si_pid and si_uid") where the problem logic was moved out of __send_signal and into send_signal. Cc: stable@vger.kernel.org Fixes: 6588c1e3ff01 ("signals: SI_USER: Masquerade si_pid when crossing pid ns boundary Ref: 921cf9f63089 ("signals: protect cinit from unblocked SIG_DFL signals") Link: https://lore.kernel.org/lkml/20200419201336.GI22017@edge.cmeerw.net/ Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> --- Unless someone has an objection I will apply this one and send it to Linus. kernel/signal.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/kernel/signal.c b/kernel/signal.c index 9899c5f91ee1..a88a89422227 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -1993,8 +1993,12 @@ bool do_notify_parent(struct task_struct *tsk, int sig) if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN) sig = 0; } + /* + * Bypass send_signal as the si_pid and si_uid values have + * been generated in the parent's namespaces. + */ if (valid_signal(sig) && sig) - __group_send_sig_info(sig, &info, tsk->parent); + __send_signal(sig, &info, tsk->parent, PIDTYPE_TGID, false); __wake_up_parent(tsk, tsk->parent); spin_unlock_irqrestore(&psig->siglock, flags); -- 2.20.1 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent 2020-04-20 17:05 ` [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent Eric W. Biederman @ 2020-04-21 8:30 ` Christian Brauner 2020-04-21 9:28 ` Oleg Nesterov ` (2 more replies) 2020-04-21 9:04 ` Oleg Nesterov 1 sibling, 3 replies; 22+ messages in thread From: Christian Brauner @ 2020-04-21 8:30 UTC (permalink / raw) To: Eric W. Biederman Cc: linux-kernel, Linux Containers, Christof Meerwald, Oleg Nesterov On Mon, Apr 20, 2020 at 12:05:38PM -0500, Eric W. Biederman wrote: > > Christof Meerwald <cmeerw@cmeerw.org> writes: > > Hi, > > > > this is probably related to commit > > 7a0cf094944e2540758b7f957eb6846d5126f535 (signal: Correct namespace > > fixups of si_pid and si_uid). > > > > With a 5.6.5 kernel I am seeing SIGCHLD signals that don't include a > > properly set si_pid field - this seems to happen for multi-threaded > > child processes. > > > > A simple test program (based on the sample from the signalfd man page): > > > > #include <sys/signalfd.h> > > #include <signal.h> > > #include <unistd.h> > > #include <spawn.h> > > #include <stdlib.h> > > #include <stdio.h> > > > > #define handle_error(msg) \ > > do { perror(msg); exit(EXIT_FAILURE); } while (0) > > > > int main(int argc, char *argv[]) > > { > > sigset_t mask; > > int sfd; > > struct signalfd_siginfo fdsi; > > ssize_t s; > > > > sigemptyset(&mask); > > sigaddset(&mask, SIGCHLD); > > > > if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1) > > handle_error("sigprocmask"); > > > > pid_t chldpid; > > char *chldargv[] = { "./sfdclient", NULL }; > > posix_spawn(&chldpid, "./sfdclient", NULL, NULL, chldargv, NULL); > > > > sfd = signalfd(-1, &mask, 0); > > if (sfd == -1) > > handle_error("signalfd"); > > > > for (;;) { > > s = read(sfd, &fdsi, sizeof(struct signalfd_siginfo)); > > if (s != sizeof(struct signalfd_siginfo)) > > handle_error("read"); > > > > if (fdsi.ssi_signo == SIGCHLD) { > > printf("Got SIGCHLD %d %d %d %d\n", > > fdsi.ssi_status, fdsi.ssi_code, > > fdsi.ssi_uid, fdsi.ssi_pid); > > return 0; > > } else { > > printf("Read unexpected signal\n"); > > } > > } > > } > > > > > > and a multi-threaded client to test with: > > > > #include <unistd.h> > > #include <pthread.h> > > > > void *f(void *arg) > > { > > sleep(100); > > } > > > > int main() > > { > > pthread_t t[8]; > > > > for (int i = 0; i != 8; ++i) > > { > > pthread_create(&t[i], NULL, f, NULL); > > } > > } > > > > I tried to do a bit of debugging and what seems to be happening is > > that > > > > /* From an ancestor pid namespace? */ > > if (!task_pid_nr_ns(current, task_active_pid_ns(t))) { > > > > fails inside task_pid_nr_ns because the check for "pid_alive" fails. > > > > This code seems to be called from do_notify_parent and there we > > actually have "tsk != current" (I am assuming both are threads of the > > current process?) > > I instrumented the code with a warning and received the following backtrace: > > WARNING: CPU: 0 PID: 777 at kernel/pid.c:501 __task_pid_nr_ns.cold.6+0xc/0x15 > > Modules linked in: > > CPU: 0 PID: 777 Comm: sfdclient Not tainted 5.7.0-rc1userns+ #2924 > > Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > > RIP: 0010:__task_pid_nr_ns.cold.6+0xc/0x15 > > Code: ff 66 90 48 83 ec 08 89 7c 24 04 48 8d 7e 08 48 8d 74 24 04 e8 9a b6 44 00 48 83 c4 08 c3 48 c7 c7 59 9f ac 82 e8 c2 c4 04 00 <0f> 0b e9 3fd > > RSP: 0018:ffffc9000042fbf8 EFLAGS: 00010046 > > RAX: 000000000000000c RBX: 0000000000000000 RCX: ffffc9000042faf4 > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81193d29 > > RBP: ffffc9000042fc18 R08: 0000000000000000 R09: 0000000000000001 > > R10: 000000100f938416 R11: 0000000000000309 R12: ffff8880b941c140 > > R13: 0000000000000000 R14: 0000000000000000 R15: ffff8880b941c140 > > FS: 0000000000000000(0000) GS:ffff8880bca00000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 00007f2e8c0a32e0 CR3: 0000000002e10000 CR4: 00000000000006f0 > > Call Trace: > > send_signal+0x1c8/0x310 > > do_notify_parent+0x50f/0x550 > > release_task.part.21+0x4fd/0x620 > > do_exit+0x6f6/0xaf0 > > do_group_exit+0x42/0xb0 > > get_signal+0x13b/0xbb0 > > do_signal+0x2b/0x670 > > ? __audit_syscall_exit+0x24d/0x2b0 > > ? rcu_read_lock_sched_held+0x4d/0x60 > > ? kfree+0x24c/0x2b0 > > do_syscall_64+0x176/0x640 > > ? trace_hardirqs_off_thunk+0x1a/0x1c > > entry_SYSCALL_64_after_hwframe+0x49/0xb3 > > The immediate problem is as Christof noticed that "pid_alive(current) == false". > This happens because do_notify_parent is called from the last thread to exit > in a process after that thread has been reaped. > > The bigger issue is that do_notify_parent can be called from any > process that manages to wait on a thread of a multi-threaded process > from wait_task_zombie. So any logic based upon current for > do_notify_parent is just nonsense, as current can be pretty much > anything. > > So change do_notify_parent to call __send_signal directly. > > Inspecting the code it appears this problem has existed since the pid > namespace support started handling this case in 2.6.30. This fix only > backports to 7a0cf094944e ("signal: Correct namespace fixups of si_pid and si_uid") > where the problem logic was moved out of __send_signal and into send_signal. > > Cc: stable@vger.kernel.org > Fixes: 6588c1e3ff01 ("signals: SI_USER: Masquerade si_pid when crossing pid ns boundary > Ref: 921cf9f63089 ("signals: protect cinit from unblocked SIG_DFL signals") > Link: https://lore.kernel.org/lkml/20200419201336.GI22017@edge.cmeerw.net/ > Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> > --- > > Unless someone has an objection I will apply this one and send it to > Linus. > > kernel/signal.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/kernel/signal.c b/kernel/signal.c > index 9899c5f91ee1..a88a89422227 100644 > --- a/kernel/signal.c > +++ b/kernel/signal.c > @@ -1993,8 +1993,12 @@ bool do_notify_parent(struct task_struct *tsk, int sig) > if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN) > sig = 0; > } > + /* > + * Bypass send_signal as the si_pid and si_uid values have > + * been generated in the parent's namespaces. > + */ At first I misread that comment as saying that we're skipping sending a signal not that it relates to a specific function (and I won't admit that I wrote a whole long paragraph on why I'm confused we're skipping sending signals on invalid si_pid and si_uid...). I think it would be worth to say something that simply states the facts such as: "If we're not autoreaping, send a signal with si_pid and si_uid set as generated in the parent's namespaces." which imho is way clearer then pointing to out that we're skipping send_signal(). The logic here and the comment in its current form are hard to correlate, especially since send_signal() was never called directly here but is rather called by __group_send_sig_info(). The details of why we switched from __group_send_sign_info() to __send_signal() could just go into the commit message. It's more confusing in the code. > if (valid_signal(sig) && sig) (Not related to you patch but odd ordering of the if-branch here. I would've expected this to read if (sig && valid_signal(sig)) especially since "sig" can be set to 0 right before. > - __group_send_sig_info(sig, &info, tsk->parent); > + __send_signal(sig, &info, tsk->parent, PIDTYPE_TGID, false); So below you switch to __send_signal() but set the "force" argument to to "false". Before that, if the signal was generated from another pid namespace and we fixed up si_pid and si_uid the "force" argument was set to "true", i.e. __send_signal(..., ..., ..., ..., true) Even though from reading through the code I think that change is safe to make let me verify that this was an intentional change? Thanks! Christian ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent 2020-04-21 8:30 ` Christian Brauner @ 2020-04-21 9:28 ` Oleg Nesterov 2020-04-21 10:21 ` Christian Brauner 2020-04-21 10:28 ` Christian Brauner 2020-04-21 14:57 ` Eric W. Biederman 2 siblings, 1 reply; 22+ messages in thread From: Oleg Nesterov @ 2020-04-21 9:28 UTC (permalink / raw) To: Christian Brauner Cc: Eric W. Biederman, linux-kernel, Linux Containers, Christof Meerwald On 04/21, Christian Brauner wrote: > > > - __group_send_sig_info(sig, &info, tsk->parent); > > + __send_signal(sig, &info, tsk->parent, PIDTYPE_TGID, false); > > So below you switch to __send_signal() but set the "force" argument to > to "false". it must be false, the signal is generated from the parent's namespace or its descendant > Before that, if the signal was generated from another pid > namespace and we fixed up si_pid and si_uid the "force" argument was set > to "true", before that the "force" argument could be falsely true by the same reason, task_pid_nr_ns(tsk, tsk->parent) can return 0 because "tsk" no longer have pids after __unhash_process(). Oleg. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent 2020-04-21 9:28 ` Oleg Nesterov @ 2020-04-21 10:21 ` Christian Brauner 2020-04-21 11:11 ` Oleg Nesterov 0 siblings, 1 reply; 22+ messages in thread From: Christian Brauner @ 2020-04-21 10:21 UTC (permalink / raw) To: Oleg Nesterov Cc: Eric W. Biederman, linux-kernel, Linux Containers, Christof Meerwald On Tue, Apr 21, 2020 at 11:28:47AM +0200, Oleg Nesterov wrote: > On 04/21, Christian Brauner wrote: > > > > > - __group_send_sig_info(sig, &info, tsk->parent); > > > + __send_signal(sig, &info, tsk->parent, PIDTYPE_TGID, false); > > > > So below you switch to __send_signal() but set the "force" argument to > > to "false". > > it must be false, the signal is generated from the parent's namespace or > its descendant > > > Before that, if the signal was generated from another pid > > namespace and we fixed up si_pid and si_uid the "force" argument was set > > to "true", > > before that the "force" argument could be falsely true by the same reason, > task_pid_nr_ns(tsk, tsk->parent) can return 0 because "tsk" no longer have > pids after __unhash_process(). As I said in my mail, looking at the codepath it seems safe. My question was whether it is _always_ incorrectly false which I do think it is because child subreapers can't come from outside the pid namespace. If they could you could create a scenario where the signal is generated from a sibling pid namespace in which case it would be correctly set to true. Christian ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent 2020-04-21 10:21 ` Christian Brauner @ 2020-04-21 11:11 ` Oleg Nesterov 2020-04-21 11:26 ` Christian Brauner 2020-04-21 11:28 ` Oleg Nesterov 0 siblings, 2 replies; 22+ messages in thread From: Oleg Nesterov @ 2020-04-21 11:11 UTC (permalink / raw) To: Christian Brauner Cc: Eric W. Biederman, linux-kernel, Linux Containers, Christof Meerwald Sorry Christian, I don't understand... On 04/21, Christian Brauner wrote: > > On Tue, Apr 21, 2020 at 11:28:47AM +0200, Oleg Nesterov wrote: > > On 04/21, Christian Brauner wrote: > > > > > > > - __group_send_sig_info(sig, &info, tsk->parent); > > > > + __send_signal(sig, &info, tsk->parent, PIDTYPE_TGID, false); > > > > > > So below you switch to __send_signal() but set the "force" argument to > > > to "false". > > > > it must be false, the signal is generated from the parent's namespace or > > its descendant > > > > > Before that, if the signal was generated from another pid > > > namespace and we fixed up si_pid and si_uid the "force" argument was set > > > to "true", > > > > before that the "force" argument could be falsely true by the same reason, > > task_pid_nr_ns(tsk, tsk->parent) can return 0 because "tsk" no longer have > > pids after __unhash_process(). > > As I said in my mail, looking at the codepath it seems safe. My question > was whether it is _always_ incorrectly false which I do think it is Again, it must be always "false", it can be incorrectly "true" and this too is fixed by Eric's patch. > because child subreapers can't come from outside the pid namespace. If > they could you could create a scenario where the signal is generated > from a sibling pid namespace in which case it would be correctly set to > true. not sure I understand, but probably the answer is "yes"... task and task->parent either live in the same namespace or the child's namespace is the descendant of task->parent's namespace. In both cases task_pid_nr_ns(tsk, tsk->parent) should return the valid pid_nr and "force" must be false. The corner case is release_task() when the last exiting sub-thread sends a signal on behalf of its ->group_leader, and at this point all the tsk's pid pointers are NULL, that is why "force" can be falsely "true". Oleg. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent 2020-04-21 11:11 ` Oleg Nesterov @ 2020-04-21 11:26 ` Christian Brauner 2020-04-21 12:17 ` Oleg Nesterov 2020-04-21 11:28 ` Oleg Nesterov 1 sibling, 1 reply; 22+ messages in thread From: Christian Brauner @ 2020-04-21 11:26 UTC (permalink / raw) To: Oleg Nesterov Cc: Linux Containers, Christof Meerwald, Eric W. Biederman, linux-kernel On Tue, Apr 21, 2020 at 01:11:40PM +0200, Oleg Nesterov wrote: > Sorry Christian, I don't understand... In my original mail, it was really just a clarification question. I said the patch is correct from looking at the codepaths. :) I was just trying to see whether there was a potential corner-case we're missing where "force" could be _validly_ true. > > > because child subreapers can't come from outside the pid namespace. If > > they could you could create a scenario where the signal is generated > > from a sibling pid namespace in which case it would be correctly set to > > true. > > not sure I understand, but probably the answer is "yes"... (This is really purely academic now since it isn't possible, but for pure amusement assume that a child subreaper could cross namespace boundaries (which they don't). A marks itself as a subreaper and creates a new process B in a new pid namespace <pidnsB>, process B setnses into <pidnsC> which is a sibling pid namespace, B clones a new proces in <pidnsC> which is now a full member of <pidnsC>, B dies and C is reparented to A, B exits and then you'd be getting a sigchld from a pid in a pid namespace in which you have no pid nr.) ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent 2020-04-21 11:26 ` Christian Brauner @ 2020-04-21 12:17 ` Oleg Nesterov 2020-04-21 12:59 ` Christian Brauner 0 siblings, 1 reply; 22+ messages in thread From: Oleg Nesterov @ 2020-04-21 12:17 UTC (permalink / raw) To: Christian Brauner Cc: Linux Containers, Christof Meerwald, Eric W. Biederman, linux-kernel On 04/21, Christian Brauner wrote: > > process B setnses into > <pidnsC> which is a sibling pid namespace, please see pidns_install(), it verifies that * Only allow entering the current active pid namespace * or a child of the current active pid namespace. Oleg. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent 2020-04-21 12:17 ` Oleg Nesterov @ 2020-04-21 12:59 ` Christian Brauner 2020-04-21 13:42 ` Eric W. Biederman 0 siblings, 1 reply; 22+ messages in thread From: Christian Brauner @ 2020-04-21 12:59 UTC (permalink / raw) To: Oleg Nesterov Cc: Linux Containers, Christof Meerwald, Eric W. Biederman, linux-kernel On Tue, Apr 21, 2020 at 02:17:22PM +0200, Oleg Nesterov wrote: > On 04/21, Christian Brauner wrote: > > > > process B setnses into > > <pidnsC> which is a sibling pid namespace, > > please see pidns_install(), it verifies that > > * Only allow entering the current active pid namespace > * or a child of the current active pid namespace. I forgot about that. Though, don't we have the same problem in: static void do_notify_parent_cldstop(struct task_struct *tsk, bool for_ptracer, int why) at least for the for_ptrace is false case? Christian ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent 2020-04-21 12:59 ` Christian Brauner @ 2020-04-21 13:42 ` Eric W. Biederman 0 siblings, 0 replies; 22+ messages in thread From: Eric W. Biederman @ 2020-04-21 13:42 UTC (permalink / raw) To: Christian Brauner Cc: Oleg Nesterov, Linux Containers, Christof Meerwald, linux-kernel Christian Brauner <christian.brauner@ubuntu.com> writes: > On Tue, Apr 21, 2020 at 02:17:22PM +0200, Oleg Nesterov wrote: >> On 04/21, Christian Brauner wrote: >> > >> > process B setnses into >> > <pidnsC> which is a sibling pid namespace, >> >> please see pidns_install(), it verifies that >> >> * Only allow entering the current active pid namespace >> * or a child of the current active pid namespace. > > I forgot about that. > > Though, don't we have the same problem in: > > static void do_notify_parent_cldstop(struct task_struct *tsk, > bool for_ptracer, int why) > > at least for the for_ptrace is false case? The same problem does not exist with do_notify_parent_cldstop because do_notify_parent_cldstop is always called from current (there is one case of current->group_leader but that is close enough calculations made against current are true). However because do_notify_parent_cldstop calculates si_pid and si_uid of the target parent process I think we can still get the wrong si_uid. So it probably makes sense to generalize the fixup code in send_signal and make do_notify_parent_cldstop just generate ids in the current namespace and let the fixup code do it's job. Eric ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent 2020-04-21 11:11 ` Oleg Nesterov 2020-04-21 11:26 ` Christian Brauner @ 2020-04-21 11:28 ` Oleg Nesterov 2020-04-21 11:38 ` Christian Brauner 1 sibling, 1 reply; 22+ messages in thread From: Oleg Nesterov @ 2020-04-21 11:28 UTC (permalink / raw) To: Christian Brauner Cc: Eric W. Biederman, linux-kernel, Linux Containers, Christof Meerwald On 04/21, Oleg Nesterov wrote: > > The corner case is release_task() when the last exiting sub-thread sends > a signal on behalf of its ->group_leader, and at this point all the tsk's > pid pointers are NULL, that is why "force" can be falsely "true". Or do_notify_parent() can be called by debugger from the parent namespace, in this case "force" can be falsely "true" too. Oleg. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent 2020-04-21 11:28 ` Oleg Nesterov @ 2020-04-21 11:38 ` Christian Brauner 0 siblings, 0 replies; 22+ messages in thread From: Christian Brauner @ 2020-04-21 11:38 UTC (permalink / raw) To: Oleg Nesterov Cc: Eric W. Biederman, linux-kernel, Linux Containers, Christof Meerwald On Tue, Apr 21, 2020 at 01:28:31PM +0200, Oleg Nesterov wrote: > On 04/21, Oleg Nesterov wrote: > > > > The corner case is release_task() when the last exiting sub-thread sends > > a signal on behalf of its ->group_leader, and at this point all the tsk's > > pid pointers are NULL, that is why "force" can be falsely "true". > > Or do_notify_parent() can be called by debugger from the parent namespace, > in this case "force" can be falsely "true" too. That's an interesting scenario to think about as well. Cross-pid-namespace interactions are fun... That's why the cross-pid-namespace-signal sending aspects we discussed a while back on-list though pretty nice to have at some point are somewhat scary. Christian ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent 2020-04-21 8:30 ` Christian Brauner 2020-04-21 9:28 ` Oleg Nesterov @ 2020-04-21 10:28 ` Christian Brauner 2020-04-21 14:57 ` Eric W. Biederman 2 siblings, 0 replies; 22+ messages in thread From: Christian Brauner @ 2020-04-21 10:28 UTC (permalink / raw) To: Eric W. Biederman Cc: Linux Containers, Christof Meerwald, linux-kernel, Oleg Nesterov On Tue, Apr 21, 2020 at 10:30:31AM +0200, Christian Brauner wrote: > On Mon, Apr 20, 2020 at 12:05:38PM -0500, Eric W. Biederman wrote: > > > > Christof Meerwald <cmeerw@cmeerw.org> writes: > > > Hi, > > > > > > this is probably related to commit > > > 7a0cf094944e2540758b7f957eb6846d5126f535 (signal: Correct namespace > > > fixups of si_pid and si_uid). > > > > > > With a 5.6.5 kernel I am seeing SIGCHLD signals that don't include a > > > properly set si_pid field - this seems to happen for multi-threaded > > > child processes. > > > > > > A simple test program (based on the sample from the signalfd man page): > > > > > > #include <sys/signalfd.h> > > > #include <signal.h> > > > #include <unistd.h> > > > #include <spawn.h> > > > #include <stdlib.h> > > > #include <stdio.h> > > > > > > #define handle_error(msg) \ > > > do { perror(msg); exit(EXIT_FAILURE); } while (0) > > > > > > int main(int argc, char *argv[]) > > > { > > > sigset_t mask; > > > int sfd; > > > struct signalfd_siginfo fdsi; > > > ssize_t s; > > > > > > sigemptyset(&mask); > > > sigaddset(&mask, SIGCHLD); > > > > > > if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1) > > > handle_error("sigprocmask"); > > > > > > pid_t chldpid; > > > char *chldargv[] = { "./sfdclient", NULL }; > > > posix_spawn(&chldpid, "./sfdclient", NULL, NULL, chldargv, NULL); > > > > > > sfd = signalfd(-1, &mask, 0); > > > if (sfd == -1) > > > handle_error("signalfd"); > > > > > > for (;;) { > > > s = read(sfd, &fdsi, sizeof(struct signalfd_siginfo)); > > > if (s != sizeof(struct signalfd_siginfo)) > > > handle_error("read"); > > > > > > if (fdsi.ssi_signo == SIGCHLD) { > > > printf("Got SIGCHLD %d %d %d %d\n", > > > fdsi.ssi_status, fdsi.ssi_code, > > > fdsi.ssi_uid, fdsi.ssi_pid); > > > return 0; > > > } else { > > > printf("Read unexpected signal\n"); > > > } > > > } > > > } > > > > > > > > > and a multi-threaded client to test with: > > > > > > #include <unistd.h> > > > #include <pthread.h> > > > > > > void *f(void *arg) > > > { > > > sleep(100); > > > } > > > > > > int main() > > > { > > > pthread_t t[8]; > > > > > > for (int i = 0; i != 8; ++i) > > > { > > > pthread_create(&t[i], NULL, f, NULL); > > > } > > > } > > > > > > I tried to do a bit of debugging and what seems to be happening is > > > that > > > > > > /* From an ancestor pid namespace? */ > > > if (!task_pid_nr_ns(current, task_active_pid_ns(t))) { > > > > > > fails inside task_pid_nr_ns because the check for "pid_alive" fails. > > > > > > This code seems to be called from do_notify_parent and there we > > > actually have "tsk != current" (I am assuming both are threads of the > > > current process?) > > > > I instrumented the code with a warning and received the following backtrace: > > > WARNING: CPU: 0 PID: 777 at kernel/pid.c:501 __task_pid_nr_ns.cold.6+0xc/0x15 > > > Modules linked in: > > > CPU: 0 PID: 777 Comm: sfdclient Not tainted 5.7.0-rc1userns+ #2924 > > > Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > > > RIP: 0010:__task_pid_nr_ns.cold.6+0xc/0x15 > > > Code: ff 66 90 48 83 ec 08 89 7c 24 04 48 8d 7e 08 48 8d 74 24 04 e8 9a b6 44 00 48 83 c4 08 c3 48 c7 c7 59 9f ac 82 e8 c2 c4 04 00 <0f> 0b e9 3fd > > > RSP: 0018:ffffc9000042fbf8 EFLAGS: 00010046 > > > RAX: 000000000000000c RBX: 0000000000000000 RCX: ffffc9000042faf4 > > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81193d29 > > > RBP: ffffc9000042fc18 R08: 0000000000000000 R09: 0000000000000001 > > > R10: 000000100f938416 R11: 0000000000000309 R12: ffff8880b941c140 > > > R13: 0000000000000000 R14: 0000000000000000 R15: ffff8880b941c140 > > > FS: 0000000000000000(0000) GS:ffff8880bca00000(0000) knlGS:0000000000000000 > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > CR2: 00007f2e8c0a32e0 CR3: 0000000002e10000 CR4: 00000000000006f0 > > > Call Trace: > > > send_signal+0x1c8/0x310 > > > do_notify_parent+0x50f/0x550 > > > release_task.part.21+0x4fd/0x620 > > > do_exit+0x6f6/0xaf0 > > > do_group_exit+0x42/0xb0 > > > get_signal+0x13b/0xbb0 > > > do_signal+0x2b/0x670 > > > ? __audit_syscall_exit+0x24d/0x2b0 > > > ? rcu_read_lock_sched_held+0x4d/0x60 > > > ? kfree+0x24c/0x2b0 > > > do_syscall_64+0x176/0x640 > > > ? trace_hardirqs_off_thunk+0x1a/0x1c > > > entry_SYSCALL_64_after_hwframe+0x49/0xb3 > > > > The immediate problem is as Christof noticed that "pid_alive(current) == false". > > This happens because do_notify_parent is called from the last thread to exit > > in a process after that thread has been reaped. > > > > The bigger issue is that do_notify_parent can be called from any > > process that manages to wait on a thread of a multi-threaded process > > from wait_task_zombie. So any logic based upon current for > > do_notify_parent is just nonsense, as current can be pretty much > > anything. > > > > So change do_notify_parent to call __send_signal directly. > > > > Inspecting the code it appears this problem has existed since the pid > > namespace support started handling this case in 2.6.30. This fix only > > backports to 7a0cf094944e ("signal: Correct namespace fixups of si_pid and si_uid") > > where the problem logic was moved out of __send_signal and into send_signal. > > > > Cc: stable@vger.kernel.org > > Fixes: 6588c1e3ff01 ("signals: SI_USER: Masquerade si_pid when crossing pid ns boundary > > Ref: 921cf9f63089 ("signals: protect cinit from unblocked SIG_DFL signals") > > Link: https://lore.kernel.org/lkml/20200419201336.GI22017@edge.cmeerw.net/ > > Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> > > --- > > > > Unless someone has an objection I will apply this one and send it to > > Linus. > > > > kernel/signal.c | 6 +++++- > > 1 file changed, 5 insertions(+), 1 deletion(-) > > > > diff --git a/kernel/signal.c b/kernel/signal.c > > index 9899c5f91ee1..a88a89422227 100644 > > --- a/kernel/signal.c > > +++ b/kernel/signal.c > > @@ -1993,8 +1993,12 @@ bool do_notify_parent(struct task_struct *tsk, int sig) > > if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN) > > sig = 0; > > } > > + /* > > + * Bypass send_signal as the si_pid and si_uid values have > > + * been generated in the parent's namespaces. > > + */ > > At first I misread that comment as saying that we're skipping sending a > signal not that it relates to a specific function (and I won't admit that > I wrote a whole long paragraph on why I'm confused we're skipping > sending signals on invalid si_pid and si_uid...). > > I think it would be worth to say something that simply states the facts > such as: > "If we're not autoreaping, send a signal with si_pid and si_uid set > as generated in the parent's namespaces." > which imho is way clearer then pointing to out that we're skipping > send_signal(). The logic here and the comment in its current form are > hard to correlate, especially since send_signal() was never called > directly here but is rather called by __group_send_sig_info(). > The details of why we switched from __group_send_sign_info() to > __send_signal() could just go into the commit message. It's more > confusing in the code. Forgot before, otherwise: Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Thanks! Christian ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent 2020-04-21 8:30 ` Christian Brauner 2020-04-21 9:28 ` Oleg Nesterov 2020-04-21 10:28 ` Christian Brauner @ 2020-04-21 14:57 ` Eric W. Biederman 2020-04-21 15:08 ` Christian Brauner 2 siblings, 1 reply; 22+ messages in thread From: Eric W. Biederman @ 2020-04-21 14:57 UTC (permalink / raw) To: Christian Brauner Cc: linux-kernel, Linux Containers, Christof Meerwald, Oleg Nesterov Christian Brauner <christian.brauner@ubuntu.com> writes: > On Mon, Apr 20, 2020 at 12:05:38PM -0500, Eric W. Biederman wrote: >> diff --git a/kernel/signal.c b/kernel/signal.c >> index 9899c5f91ee1..a88a89422227 100644 >> --- a/kernel/signal.c >> +++ b/kernel/signal.c >> @@ -1993,8 +1993,12 @@ bool do_notify_parent(struct task_struct *tsk, int sig) >> if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN) >> sig = 0; >> } >> + /* >> + * Bypass send_signal as the si_pid and si_uid values have >> + * been generated in the parent's namespaces. >> + */ > > At first I misread that comment as saying that we're skipping sending a > signal not that it relates to a specific function (and I won't admit that > I wrote a whole long paragraph on why I'm confused we're skipping > sending signals on invalid si_pid and si_uid...). I have updated the comment to read: + /* + * Send with __send_signal as si_pid and si_uid are in the + * parent's namespaces. + */ That should be enough of a hint for someone to read the code and figure out what is going on. Eric ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent 2020-04-21 14:57 ` Eric W. Biederman @ 2020-04-21 15:08 ` Christian Brauner 0 siblings, 0 replies; 22+ messages in thread From: Christian Brauner @ 2020-04-21 15:08 UTC (permalink / raw) To: Eric W. Biederman Cc: Linux Containers, Christof Meerwald, linux-kernel, Oleg Nesterov On Tue, Apr 21, 2020 at 09:57:09AM -0500, Eric W. Biederman wrote: > Christian Brauner <christian.brauner@ubuntu.com> writes: > > > On Mon, Apr 20, 2020 at 12:05:38PM -0500, Eric W. Biederman wrote: > >> diff --git a/kernel/signal.c b/kernel/signal.c > >> index 9899c5f91ee1..a88a89422227 100644 > >> --- a/kernel/signal.c > >> +++ b/kernel/signal.c > >> @@ -1993,8 +1993,12 @@ bool do_notify_parent(struct task_struct *tsk, int sig) > >> if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN) > >> sig = 0; > >> } > >> + /* > >> + * Bypass send_signal as the si_pid and si_uid values have > >> + * been generated in the parent's namespaces. > >> + */ > > > > At first I misread that comment as saying that we're skipping sending a > > signal not that it relates to a specific function (and I won't admit that > > I wrote a whole long paragraph on why I'm confused we're skipping > > sending signals on invalid si_pid and si_uid...). > > I have updated the comment to read: > + /* > + * Send with __send_signal as si_pid and si_uid are in the > + * parent's namespaces. > + */ > > That should be enough of a hint for someone to read the code and figure > out what is going on. Perfect, thanks! Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Christian ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent 2020-04-20 17:05 ` [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent Eric W. Biederman 2020-04-21 8:30 ` Christian Brauner @ 2020-04-21 9:04 ` Oleg Nesterov 2020-04-21 10:19 ` [PATCH] remove the no longer needed pid_alive() check in __task_pid_nr_ns() Oleg Nesterov 1 sibling, 1 reply; 22+ messages in thread From: Oleg Nesterov @ 2020-04-21 9:04 UTC (permalink / raw) To: Eric W. Biederman; +Cc: linux-kernel, Christof Meerwald, Linux Containers On 04/20, Eric W. Biederman wrote: > > The immediate problem is as Christof noticed that "pid_alive(current) == false". this is slightly offtopic, but we can probably remove this "pid_alive" check, pid_nr_ns() checks pid != NULL anyway. > Inspecting the code it appears this problem has existed since the pid > namespace support started handling this case Agreed... > @@ -1993,8 +1993,12 @@ bool do_notify_parent(struct task_struct *tsk, int sig) > if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN) > sig = 0; > } > + /* > + * Bypass send_signal as the si_pid and si_uid values have > + * been generated in the parent's namespaces. > + */ > if (valid_signal(sig) && sig) > - __group_send_sig_info(sig, &info, tsk->parent); > + __send_signal(sig, &info, tsk->parent, PIDTYPE_TGID, false); Acked-by: Oleg Nesterov <oleg@redhat.com> ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH] remove the no longer needed pid_alive() check in __task_pid_nr_ns() 2020-04-21 9:04 ` Oleg Nesterov @ 2020-04-21 10:19 ` Oleg Nesterov 2020-04-21 10:50 ` Christian Brauner 2020-04-21 15:05 ` Eric W. Biederman 0 siblings, 2 replies; 22+ messages in thread From: Oleg Nesterov @ 2020-04-21 10:19 UTC (permalink / raw) To: Eric W. Biederman; +Cc: linux-kernel, Christof Meerwald, Linux Containers Starting from 2c4704756cab ("pids: Move the pgrp and session pid pointers from task_struct to signal_struct") __task_pid_nr_ns() doesn't dereference task->group_leader, we can remove the pid_alive() check. pid_nr_ns() has to check pid != NULL anyway, pid_alive() just adds the unnecessary confusion. Signed-off-by: Oleg Nesterov <oleg@redhat.com> --- kernel/pid.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/kernel/pid.c b/kernel/pid.c index bc21c0fb26d8..47221db038e2 100644 --- a/kernel/pid.c +++ b/kernel/pid.c @@ -475,8 +475,7 @@ pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type, rcu_read_lock(); if (!ns) ns = task_active_pid_ns(current); - if (likely(pid_alive(task))) - nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns); + nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns); rcu_read_unlock(); return nr; -- 2.25.1.362.g51ebf55 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH] remove the no longer needed pid_alive() check in __task_pid_nr_ns() 2020-04-21 10:19 ` [PATCH] remove the no longer needed pid_alive() check in __task_pid_nr_ns() Oleg Nesterov @ 2020-04-21 10:50 ` Christian Brauner 2020-04-21 15:05 ` Eric W. Biederman 1 sibling, 0 replies; 22+ messages in thread From: Christian Brauner @ 2020-04-21 10:50 UTC (permalink / raw) To: Oleg Nesterov Cc: Eric W. Biederman, Linux Containers, Christof Meerwald, linux-kernel On Tue, Apr 21, 2020 at 12:19:04PM +0200, Oleg Nesterov wrote: > Starting from 2c4704756cab ("pids: Move the pgrp and session pid pointers > from task_struct to signal_struct") __task_pid_nr_ns() doesn't dereference > task->group_leader, we can remove the pid_alive() check. > > pid_nr_ns() has to check pid != NULL anyway, pid_alive() just adds the > unnecessary confusion. > > Signed-off-by: Oleg Nesterov <oleg@redhat.com> > --- Acked-by: Christian Brauner <christian.brauner@ubuntu.com> ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] remove the no longer needed pid_alive() check in __task_pid_nr_ns() 2020-04-21 10:19 ` [PATCH] remove the no longer needed pid_alive() check in __task_pid_nr_ns() Oleg Nesterov 2020-04-21 10:50 ` Christian Brauner @ 2020-04-21 15:05 ` Eric W. Biederman 2020-04-24 18:05 ` Oleg Nesterov 1 sibling, 1 reply; 22+ messages in thread From: Eric W. Biederman @ 2020-04-21 15:05 UTC (permalink / raw) To: Oleg Nesterov; +Cc: linux-kernel, Christof Meerwald, Linux Containers Oleg Nesterov <oleg@redhat.com> writes: > Starting from 2c4704756cab ("pids: Move the pgrp and session pid pointers > from task_struct to signal_struct") __task_pid_nr_ns() doesn't dereference > task->group_leader, we can remove the pid_alive() check. > > pid_nr_ns() has to check pid != NULL anyway, pid_alive() just adds the > unnecessary confusion. > > Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> > --- Good catch that does simplify things. Eric > kernel/pid.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/kernel/pid.c b/kernel/pid.c > index bc21c0fb26d8..47221db038e2 100644 > --- a/kernel/pid.c > +++ b/kernel/pid.c > @@ -475,8 +475,7 @@ pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type, > rcu_read_lock(); > if (!ns) > ns = task_active_pid_ns(current); > - if (likely(pid_alive(task))) > - nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns); > + nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns); > rcu_read_unlock(); > > return nr; ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] remove the no longer needed pid_alive() check in __task_pid_nr_ns() 2020-04-21 15:05 ` Eric W. Biederman @ 2020-04-24 18:05 ` Oleg Nesterov 2020-04-24 19:54 ` Eric W. Biederman 0 siblings, 1 reply; 22+ messages in thread From: Oleg Nesterov @ 2020-04-24 18:05 UTC (permalink / raw) To: Eric W. Biederman; +Cc: linux-kernel, Christof Meerwald, Linux Containers On 04/21, Eric W. Biederman wrote: > > Oleg Nesterov <oleg@redhat.com> writes: > > > Starting from 2c4704756cab ("pids: Move the pgrp and session pid pointers > > from task_struct to signal_struct") __task_pid_nr_ns() doesn't dereference > > task->group_leader, we can remove the pid_alive() check. > > > > pid_nr_ns() has to check pid != NULL anyway, pid_alive() just adds the > > unnecessary confusion. > > > > Signed-off-by: Oleg Nesterov <oleg@redhat.com> > Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Thanks, can you take this patch? Oleg. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] remove the no longer needed pid_alive() check in __task_pid_nr_ns() 2020-04-24 18:05 ` Oleg Nesterov @ 2020-04-24 19:54 ` Eric W. Biederman 0 siblings, 0 replies; 22+ messages in thread From: Eric W. Biederman @ 2020-04-24 19:54 UTC (permalink / raw) To: Oleg Nesterov; +Cc: linux-kernel, Christof Meerwald, Linux Containers Oleg Nesterov <oleg@redhat.com> writes: > On 04/21, Eric W. Biederman wrote: >> >> Oleg Nesterov <oleg@redhat.com> writes: >> >> > Starting from 2c4704756cab ("pids: Move the pgrp and session pid pointers >> > from task_struct to signal_struct") __task_pid_nr_ns() doesn't dereference >> > task->group_leader, we can remove the pid_alive() check. >> > >> > pid_nr_ns() has to check pid != NULL anyway, pid_alive() just adds the >> > unnecessary confusion. >> > >> > Signed-off-by: Oleg Nesterov <oleg@redhat.com> >> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> > > Thanks, can you take this patch? I plan to. Probably on a topic branch for signals. I am sorting that out now. Eric ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: SIGCHLD signal sometimes sent with si_pid==0 (Linux 5.6.5) 2020-04-19 20:13 SIGCHLD signal sometimes sent with si_pid==0 (Linux 5.6.5) Christof Meerwald 2020-04-20 17:05 ` [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent Eric W. Biederman @ 2020-04-21 14:59 ` Eric W. Biederman 1 sibling, 0 replies; 22+ messages in thread From: Eric W. Biederman @ 2020-04-21 14:59 UTC (permalink / raw) To: Christof Meerwald; +Cc: linux-kernel Christof Meerwald <cmeerw@cmeerw.org> writes: > Hi, > > this is probably related to commit > 7a0cf094944e2540758b7f957eb6846d5126f535 (signal: Correct namespace > fixups of si_pid and si_uid). > > With a 5.6.5 kernel I am seeing SIGCHLD signals that don't include a > properly set si_pid field - this seems to happen for multi-threaded > child processes. Christof I want to say very good spotting and reporting of this issue. Thank you. Eric ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2020-04-24 19:58 UTC | newest] Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-04-19 20:13 SIGCHLD signal sometimes sent with si_pid==0 (Linux 5.6.5) Christof Meerwald 2020-04-20 17:05 ` [PATCH] signal: Avoid corrupting si_pid and si_uid in do_notify_parent Eric W. Biederman 2020-04-21 8:30 ` Christian Brauner 2020-04-21 9:28 ` Oleg Nesterov 2020-04-21 10:21 ` Christian Brauner 2020-04-21 11:11 ` Oleg Nesterov 2020-04-21 11:26 ` Christian Brauner 2020-04-21 12:17 ` Oleg Nesterov 2020-04-21 12:59 ` Christian Brauner 2020-04-21 13:42 ` Eric W. Biederman 2020-04-21 11:28 ` Oleg Nesterov 2020-04-21 11:38 ` Christian Brauner 2020-04-21 10:28 ` Christian Brauner 2020-04-21 14:57 ` Eric W. Biederman 2020-04-21 15:08 ` Christian Brauner 2020-04-21 9:04 ` Oleg Nesterov 2020-04-21 10:19 ` [PATCH] remove the no longer needed pid_alive() check in __task_pid_nr_ns() Oleg Nesterov 2020-04-21 10:50 ` Christian Brauner 2020-04-21 15:05 ` Eric W. Biederman 2020-04-24 18:05 ` Oleg Nesterov 2020-04-24 19:54 ` Eric W. Biederman 2020-04-21 14:59 ` SIGCHLD signal sometimes sent with si_pid==0 (Linux 5.6.5) Eric W. Biederman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).