On 2018-09-29, Andy Lutomirski wrote:
> >> On Sat, Sep 29, 2018 at 4:29 PM Aleksa Sarai wrote:
> >> The primary motivation for the need for this flag is container runtimes
> >> which have to interact with malicious root filesystems in the host
> >> namespaces. One of the first requirements for a container runtime to be
> >> secure against a malicious rootfs is that they correctly scope symlinks
> >> (that is, they should be scoped as though they are chroot(2)ed into the
> >> container's rootfs) and ".."-style paths. The already-existing AT_XDEV
> >> and AT_NO_PROCLINKS help defend against other potential attacks in a
> >> malicious rootfs scenario.
> >
> > So, I really like the concept for patch 1 of this series (but haven't
> > read the code yet); but I dislike this patch because of its footgun
> > potential.
> >
>
> The code could do it differently: do the path walk and then, before
> accepting the result, walk back up and make sure the result is under
> the starting point.
>
> This is *not* a full solution, though, since a walk above the root has
> side effects on timing, various caches, and possibly network traffic,
> so it’s open to Spectre-like attacks in which a malicious container
> could use a runtime-initiated AT_THIS_ROOT to infer the existence of
> directories outside the container.

I think that one way to solve this problem might be to have more strict
checks on nd->root in follow_dotdot(). The problem here (as far as I can
tell) is that ".." could end up skipping past the root because of a
rename; however, walking *down* into a path shouldn't be a problem (even
absolute symlinks shouldn't be a problem because they will nd_jump_root
and will land back in the root). However, I'm not entirely sure what
happens to nd->root if it gets renamed -- can you still safely do checks
against it? (We'd need to do some sort of is_descendant() check on the
current path before we handle ".." in follow_dotdot().)
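
To make the ".." check concrete, here is a toy userspace model in Python. It is only a sketch under stated assumptions, not kernel code: `Node` stands in for a dentry, and `is_descendant()`, `rename()`, `resolve_scoped()` and the `before_step` hook are hypothetical names invented for illustration. It says nothing about locking, RCU walk, or what really happens to nd->root on rename.

```python
class Node:
    """Toy stand-in for a dentry: a name, a parent pointer, and children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[name] = self


def is_descendant(node, root):
    """Return True if `node` is `root` or lies somewhere below it."""
    while node is not None:
        if node is root:
            return True
        node = node.parent
    return False


def rename(node, new_parent):
    """Move `node` under `new_parent`, like a directory rename."""
    del node.parent.children[node.name]
    node.parent = new_parent
    new_parent.children[node.name] = node


def resolve_scoped(root, start, components, before_step=None):
    """Walk `components` from `start`, never letting ".." leave `root`.

    `before_step(i)` is a hypothetical hook that lets a caller simulate
    a concurrent rename between two steps of the walk.
    """
    cur = start
    for i, comp in enumerate(components):
        if before_step is not None:
            before_step(i)
        if comp == "..":
            # Re-check containment *before* stepping up: if `cur` was
            # moved out from under `root`, halt the walk here.
            if not is_descendant(cur, root):
                raise PermissionError("walk escaped the root")
            if cur is not root:  # ".." at the root stays at the root
                cur = cur.parent
        else:
            cur = cur.children[comp]  # walking down is always in scope
    return cur
```

In this model the walk is refused at the ".." step itself, so it never visits a node outside the root. That differs from the walk-then-verify approach quoted above, which would already have touched outside nodes by the time it checks.
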
That way, we shouldn't have the Spectre-like attack problem (since the
attack would be halted at the ".." stage -- before the path walk can
proceed into host paths). Would this be sufficient or is there a more
serious issue I'm missing?

> But what’s the container usecase? Any sane container is based on
> pivot_root or similar, so the runtime can just do the walk in the
> container context. IOW I’m a bit confused as to the exact intended use
> of the whole series. Can you elaborate?

I went into this in my response to Jann.

--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH