On 2018-11-19, Daniel Colascione wrote:
> On Mon, Nov 19, 2018 at 1:21 PM, Christian Brauner wrote:
> > That can be done without a loop by comparing the level counter for the
> > two pid namespaces.
> >
> >>
> >> And you can rewrite pidns_get_parent to use it. So you would instead be
> >> doing:
> >>
> >>     if (pidns_is_descendant(proc_pid_ns, task_active_pid_ns(current)))
> >>         return -EPERM;
> >>
> >> (Or you can just copy the 5-line loop into procfd_signal -- though I
> >> imagine we'll need this for all of the procfd_* APIs.)
>
> Why is any of this even necessary? Why does the child namespace we're
> considering even have a file descriptor to its ancestor's procfs? If
> it has one of these FDs, it can already *read* all sorts of
> information it really shouldn't be able to acquire, so the additional
> ability to send a signal (subject to the usual permission checks)
> feels like sticking a finger in a dike that's already well-perforated.
> IMHO, we shouldn't bother with this check. The patch would be simpler
> without it.

First of all, currently it isn't possible to signal processes in an
ancestor pidns. Given the long thread about exit code visibility
semantics, I'm sure you see why bringing up this question is reasonable.

Some people (stupidly) bind-mount / into containers. There were several
CVEs in both LXC and runc where you could access the host filesystem
(including the host /proc). I'd prefer to not provide a mechanism for
such escalations to start sending signals to host processes, since I
don't see a strong reason why it should be allowed (and allowing it
would add more cracks to the isolation of pidns).

I think there is a huge difference between having read access to /proc
and being able to use /proc to signal processes which you ordinarily
would not be able to signal.

And another important point is that of semantics.
If we move forward with procfd_new() and the rest of the API we are
discussing, I'd argue we'd want to allow passing an nsfs fd to specify
what pidns we want the process to be created in (for procfd_new()).
This will obviously require a permission check to make sure we aren't
creating processes in a parent pidns -- and so for consistency all
procfd_* operations should have similar checks.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
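
For illustration only, here is a minimal userspace sketch of the kind of
check being discussed. It is not the patch itself: struct pidns_model,
pidns_model_is_within() and procfd_model_may_signal() are hypothetical
names made up for this example, and the only assumption carried over from
the thread is that each pid namespace records its nesting level and its
parent. The level comparison gives an early exit; the short walk afterwards
is kept in this sketch because a deeper namespace can sit on an unrelated
branch of the hierarchy.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for a pid namespace. Only the two
 * fields this check needs are modelled.
 */
struct pidns_model {
	unsigned int level;               /* 0 for the initial namespace */
	const struct pidns_model *parent; /* NULL for the initial namespace */
};

/* Return true if 'ns' is 'ancestor' itself or one of its descendants. */
static bool pidns_model_is_within(const struct pidns_model *ns,
				  const struct pidns_model *ancestor)
{
	/* A namespace at a shallower level can never be a descendant. */
	if (ns->level < ancestor->level)
		return false;

	/* Walk up until both sit at the same level, then compare. */
	while (ns->level > ancestor->level)
		ns = ns->parent;

	return ns == ancestor;
}

/*
 * Hypothetical permission check: the caller may only act on a process
 * whose pidns is the caller's own pidns or a descendant of it.
 */
static int procfd_model_may_signal(const struct pidns_model *target_ns,
				   const struct pidns_model *caller_ns)
{
	if (!pidns_model_is_within(target_ns, caller_ns))
		return -EPERM;

	return 0;
}
```

With this model, a caller inside a container's pidns that somehow holds
a handle on a host process gets -EPERM, while signalling into its own or
a nested pidns still falls through to whatever permission checks come
next.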