On 2020-02-13, Ross Zwisler wrote:
> On Thu, Feb 06, 2020 at 12:10:45PM -0700, Ross Zwisler wrote:
> > On Tue, Feb 4, 2020 at 8:45 PM Aleksa Sarai wrote:
> > > On 2020-02-04, Matthew Wilcox wrote:
> > > > On Tue, Feb 04, 2020 at 04:49:48PM -0700, Ross Zwisler wrote:
> > > > > On Tue, Feb 4, 2020 at 3:11 PM Ross Zwisler wrote:
> > > > > > On Tue, Feb 4, 2020 at 2:53 PM Raul Rangel wrote:
> > > > > > > > --- a/include/uapi/linux/mount.h
> > > > > > > > +++ b/include/uapi/linux/mount.h
> > > > > > > > @@ -34,6 +34,7 @@
> > > > > > > > #define MS_I_VERSION (1<<23) /* Update inode I_version field */
> > > > > > > > #define MS_STRICTATIME (1<<24) /* Always perform atime updates */
> > > > > > > > #define MS_LAZYTIME (1<<25) /* Update the on-disk [acm]times lazily */
> > > > > > > > +#define MS_NOSYMFOLLOW (1<<26) /* Do not follow symlinks */
> > > > > > > Doesn't this conflict with MS_SUBMOUNT below?
> > > > > > > >
> > > > > > > > /* These sb flags are internal to the kernel */
> > > > > > > > #define MS_SUBMOUNT (1<<26)
> > > > > >
> > > > > > Yep. Thanks for the catch, v6 on it's way.
> > > > >
> > > > > It actually looks like most of the flags which are internal to the
> > > > > kernel are actually unused (MS_SUBMOUNT, MS_NOREMOTELOCK, MS_NOSEC,
> > > > > MS_BORN and MS_ACTIVE). Several are unused completely, and the rest
> > > > > are just part of the AA_MS_IGNORE_MASK which masks them off in the
> > > > > apparmor LSM, but I'm pretty sure they couldn't have been set anyway.
> > > > >
> > > > > I'll just take over (1<<26) for MS_NOSYMFOLLOW, and remove the rest in
> > > > > a second patch.
> > > > >
> > > > > If someone thinks these flags are actually used by something and I'm
> > > > > just missing it, please let me know.
> > > >
> > > > Afraid you did miss it ...
> > > >
> > > > /*
> > > > * sb->s_flags. Note that these mirror the equivalent MS_* flags where
> > > > * represented in both.
> > > > */
> > > > ...
> > > > #define SB_SUBMOUNT (1<<26)
> > > >
> > > > It's not entirely clear to me why they need to be the same, but I haven't
> > > > been paying close attention to the separation of superblock and mount
> > > > flags, so someone else can probably explain the why of it.
> > >
> > > I could be wrong, but I believe this is historic and originates from the
> > > kernel setting certain flags internally (similar to the whole O_* flag,
> > > "internal" O_* flag, and FMODE_NOTIFY mixup).
> > >
> > > Also, one of the arguments for the new mount API was that we'd run out
> > > MS_* bits so it's possible that you have to enable this new mount option
> > > in the new mount API only. (Though Howells is the right person to talk
> > > to on this point.)
> >
> > As far as I can tell, SB_SUBMOUNT doesn't actually have any dependence on
> > MS_SUBMOUNT. Nothing ever sets or checks MS_SUBMOUNT from within the kernel,
> > and whether or not it's set from userspace has no bearing on how SB_SUBMOUNT
> > is used. SB_SUBMOUNT is set independently inside of the kernel in
> > vfs_submount().
> >
> > I agree that their association seems to be historical, introduced in this
> > commit from David Howells:
> >
> > e462ec50cb5fa VFS: Differentiate mount flags (MS_*) from internal superblock flags
> >
> > In that commit message David notes:
> >
> > (1) Some MS_* flags get translated to MNT_* flags (such as MS_NODEV ->
> > MNT_NODEV) without passing this on to the filesystem, but some
> > filesystems set such flags anyway.
> >
> > I think this is sort of what we are trying to do with MS_NOSYMFOLLOW: have a
> > userspace flag that translates to MNT_NOSYMFOLLOW, but which doesn't need an
> > associated SB_* flag. Is it okay to reclaim the bit currently owned by
> > MS_SUBMOUNT and use it for MS_NOSYMFOLLOW.
> >
> > A second option would be to choose one of the unused MS_* values from the
> > middle of the range, such as 256 or 512. Looking back as far as git will let
> > me, I don't think that these flags have been used for MS_* values at least
> > since v2.6.12:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/include/linux/fs.h?id=1da177e4c3f41524e886b7f1b8a0c1fc7321cac2
> >
> > I think maybe these used to be S_WRITE and S_APPEND, which weren't filesystem
> > mount flags?
> >
> > https://sites.uclouvain.be/SystInfo/usr/include/sys/mount.h.html
> >
> > A third option would be to create this flag using the new mount system:
> >
> > https://lwn.net/Articles/753473/
> > https://lwn.net/Articles/759499/
> >
> > My main concern with this option is that for Chrome OS we'd like to be able to
> > backport whatever solution we come up with to a variety of older kernels, and
> > if we go with the new mount system this would require us to backport the
> > entire new mount system to those kernels, which I think is infeasible.
> >
> > David, what are your thoughts on this? Of these three options for supporting
> > a new MS_NOSYMFOLLOW flag:
> >
> > 1) reclaim the bit currently used by MS_SUBMOUNT
> > 2) use a smaller unused value for the flag, 256 or 512
> > 3) implement the new flag only in the new mount system
> >
> > do you think either #1 or #2 are workable? If so, which would you prefer?
>
> Gentle ping on this - do either of the options using the existing mount API
> seem possible? Would it be useful for me to send out example patches in one
> of those directions? Or is it out of the question, and I should spend my time
> on making patches using the new mount system? Thanks!

I think (1) or (2) sound reasonable, but I'm not really the right person to ask.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH