On 2022-06-20, Amir Goldstein wrote: > > > > The goal would be that the semantics of fooat(, AT_EMPTY_PATH) and > > > > foo(/proc/self/fd/) should always be identical, and the current > > > > semantics of /proc/self/fd/ are too leaky so we shouldn't always > > > > assume that keeping them makes sense (the most obvious example is being > > > > able to do tricks to open /proc/$pid/exe as O_RDWR). > > > > > > Please make a note that I have applications relying on current magic symlink > > > semantics w.r.t setxattr() and other metadata operations, and the libselinux > > > commit linked from the patch commit message proves that magic symlink > > > semantics are used in the wild, so it is not likely that those semantics could > > > be changed, unless userspace breakage could be justified by fixing a serious > > > security issue (i.e. open /proc/$pid/exe as O_RDWR). > > > > Agreed. We also use magiclinks for similar TOCTOU-protection purposes in > > runc (as does lxc) as well as in libpathrs so I'm aware we need to be > > careful about changing existing behaviours. I would prefer to have the > > default be as restrictive as possible, but naturally back-compat is > > more important. > > > > > > I suspect that the long-term solution would be to have more upgrade > > > > masks so that userspace can opt-in to not allowing any kind of > > > > (metadata) write access through a particular file descriptor. You're > > > > quite right that we have several metadata write AT_EMPTY_PATH APIs, and > > > > so we can't retroactively block /everything/ but we should try to come > > > > up with less leaky rules by default if it won't break userspace. > > > > > > Ok, let me try to say this in my own words using an example to see that > > > we are all on the same page: > > > > > > - lsetxattr(PATH_TO_FILE,..) has inherent TOCTOU races > > > - fsetxattr(fd,...) is not applicable for symbolic links > > > > While I agree with Christian's concerns about making O_PATH descriptors > > more leaky, if userspace already relies on this through /proc/self/fd/$x > > then there's not much we can do about it other than having an opt-out > > available in openat2(2). Having the option to disable this stuff to > > avoid making O_PATH descriptors less safe as a mechanism for passing > > around "capability-less" file handles should make most people happy > > (with the note that ideally we would not be *adding* capabilities to > > O_PATH we don't need to). > > > > > - setxattr("/proc/self/fd/",...) is the current API to avoid TOCTOU races > > > when setting xattr on symbolic links > > > - setxattrat(o_path_fd, "", ..., AT_EMPTY_PATH) is proposed as a the > > > "new API" for setting xattr on symlinks (and special files) > > > > If this is a usecase we need to support then we may as well just re-use > > fsetxattr() since it's basically an *at(2) syscall already (and I don't > > see why we'd want to split up the capabilities between two similar > > *at(2)-like syscalls). Though this does come with the above caveats that > > we need to have the opt-outs available if we're going to enshrine this > > as intentional part of the ABI. > > > Christian preferred that new functionality be added with a new API > and I agree that this is nicer and more explicit. Fair enough -- I misread the man page, setxattrat(2) makes more sense. > The bigger question IMO is, whether fsomething() should stay identical > to somethingat(,,,AT_EMPTY_PATH). I don't think that it should. > > To me, open(path,O_PATH)+somethingat(,,,AT_EMPTY_PATH) is identical > to something(path) - it just breaks the path resolution and operation to two > distinguished steps. > > fsomething() was traditionally used for "really" open fds, so if we don't need > to, we better not relax it further by allowing O_PATH, but that's just one > opinion. Yeah, you're right -- it would be better to not muddle the two (even though they are conceptually very similar). > > > - The new API is going to be more strict than the old magic symlink API > > > - *If* it turns out to not break user applications, old API can also become > > > more strict to align with new API (unlikely the case for setxattr()) > > > - This will allow sandboxed containers to opt-out of the "old API", by > > > restricting access to /proc/self/fd and to implement more fine grained > > > control over which metadata operations are allowed on an O_PATH fd > > > > > > Did I understand the plan correctly? > > > > Yup, except I don't think we need setxattrat(2). > > > > > Do you agree with me that the plan to keep AT_EMPTY_PATH and magic > > > symlink semantics may not be realistic? > > > > To clarify -- my view is that if any current /proc/self/fd/$n semantic > > needs to be maintained then I would prefer that the proc-less method of > > doing it (such as through AT_EMPTY_PATH et al) would have the same > > capability and semantics. There are some cases where the current > > /proc/self/fd/$n semantics need to be fixed (such as the /proc/$pid/exe > > example) and in that case the proc-less semantics also need to be made > > safe. > > > > While I would like us to restrict O_PATH as much as possible, if > > userspace already depends on certain behaviour then we may not be able > > to do much about it. Having an opt-out would be very important since > > enshrining these leaky behaviours (which seem to have been overlooked) > > means we need to consider how userspace can opt out of them. > > > > Unfortunately, it should be noted that due to the "magical" nature of > > nd_jump_link(), I'm not sure how happy Al Viro will be with the kinds of > > restrictions necessary. Even my current (quite limited) upgrade-mask > > patchset has to do a fair bit of work to unify the semantics of > > magic-links and openat(O_EMPTYPATH) -- expanding this to all *at(2) > > syscalls might be quite painful. (There are also several handfuls of > > semantic questions which need to be answered about magic-link modes and > > whether for other *at(2) operations we may need even more complicated > > rules or even a re-thinking of my current approach.) > > The question remains, regarding the $SUBJECT patch, > is it fair to block it and deprive libselinux of a non buggy API > until such time that all the details around masking O_PATH fds > will be made clear and the new API implemented? > > There is no guarantee this will ever happen, so it does not seem > reasonable to me. > > To be a reasonable reaction to the currently broken API is > to either accept the patch as is or request that setxattrat() > will be added to provide the new functionality. Since the current functionality cannot be retroactively disabled as it is being used already through /proc/self/fd/$n, adding *xattrat(AT_EMPTY_PATH) doesn't really change what is currently possible by userspace. I would say we should add *xattrat(2) and then we can add an upgrade mask blocking it (and other operations) later. -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH