On 2018-11-20, Aleksa Sarai wrote: > On 2018-11-19, Christian Brauner wrote: > > On Tue, Nov 20, 2018 at 07:28:57AM +1100, Aleksa Sarai wrote: > > > On 2018-11-19, Christian Brauner wrote: > > > > + if (info) { > > > > + ret = __copy_siginfo_from_user(sig, &kinfo, info); > > > > + if (unlikely(ret)) > > > > + goto err; > > > > + /* > > > > + * Not even root can pretend to send signals from the kernel. > > > > + * Nor can they impersonate a kill()/tgkill(), which adds > > > > + * source info. > > > > + */ > > > > + ret = -EPERM; > > > > + if ((kinfo.si_code >= 0 || kinfo.si_code == SI_TKILL) && > > > > + (task_pid(current) != pid)) > > > > + goto err; > > > > + } else { > > > > + prepare_kill_siginfo(sig, &kinfo); > > > > + } > > > > > > I wonder whether we should also have a pidns restriction here, since > > > currently it isn't possible for a container process using a pidns to > > > signal processes outside its pidns. AFAICS, this isn't done through an > > > explicit check -- it's a side-effect of processes in a pidns not being > > > able to address non-descendant-pidns processes. > > > > > > But maybe it's reasonable to allow sending a procfd to a different pidns > > > and the same operations working on it? If we extend the procfd API to > > > > No, I don't think so. I really don't want any fancy semantics in here. > > Fancy doesn't get merged and fancy is hard to maintain. So we should do > > something like: > > > > if (proc_pid_ns() != current_pid_ns) > > return EINVAL > > This isn't quite sufficient. The key thing is that you have to be in an > *ancestor* (or same) pidns, not the *same* pidns. Ideally you can re-use > the check already in pidns_get_parent, and expose it. It would be > something as trivial as: > > bool pidns_is_descendant(struct pid_namespace *ns, > struct pid_namespace *ancestor) > { > for (;;) { > if (!ns) > return false; > if (ns == ancestor) > break; > ns = ns->parent; > } > return true; > } > > And you can rewrite pidns_get_parent to use it. So you would instead be > doing: > > if (pidns_is_descendant(proc_pid_ns, task_active_pid_ns(current))) > return -EPERM; Scratch the last bit, -EPERM is wrong here. I would argue that -EINVAL is *somewhat* wrong because arguable the more semantically consistent error (with kill(2)) would be -ESRCH -- but then you're mixing the "pid is dead" and "pid is not visible to you" cases. I'm not sure what the right errno would be here (I'm sure some of the LKML greybeards will have a better clue.) :P -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH