All of lore.kernel.org
 help / color / mirror / Atom feed
* Use of unshare(CLONE_FS) in virtiofsd
@ 2022-11-04  7:50 ` Florian Weimer
  0 siblings, 0 replies; 6+ messages in thread
From: Florian Weimer @ 2022-11-04  7:50 UTC (permalink / raw)
  To: virtio-fs
  Cc: qemu-devel, Misono Tomohiro, Vivek Goyal, Dr. David Alan Gilbert,
	Miklos Szeredi, Stefan Hajnoczi

I've got a proposed extension for glibc's pthread_create which allows
the creation of threads with a dedicated current working
directory/umask/chroot:

  [PATCH 0/2] Introduce per-thread file system properties on Linux
  <https://sourceware.org/pipermail/libc-alpha/2022-October/142640.html>

I expect that glibc integration will work around the seccomp issue
mentioned in a comment (also brought up by the Samba people for their
use) because glibc will perform the unshare directly during the clone
system call, and not via a separate system call.

I see that unshare(CLONE_FS) was introduced in this commit:

commit bdfd66788349acc43cd3f1298718ad491663cfcc
Author: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Date:   Thu Feb 27 14:59:27 2020 +0900

    virtiofsd: Fix xattr operations
    
    Current virtiofsd has problems about xattr operations and
    they does not work properly for directory/symlink/special file.
    
    The fundamental cause is that virtiofsd uses openat() + f...xattr()
    systemcalls for xattr operation but we should not open symlink/special
    file in the daemon. Therefore the function is restricted.
    
    Fix this problem by:
     1. during setup of each thread, call unshare(CLONE_FS)
     2. in xattr operations (i.e. lo_getxattr), if inode is not a regular
        file or directory, use fchdir(proc_loot_fd) + ...xattr() +
        fchdir(root.fd) instead of openat() + f...xattr()
    
        (Note: for a regular file/directory openat() + f...xattr()
         is still used for performance reason)
    
    With this patch, xfstests generic/062 passes on virtiofs.
    
    This fix is suggested by Miklos Szeredi and Stefan Hajnoczi.
    The original discussion can be found here:
      https://www.redhat.com/archives/virtio-fs/2019-October/msg00046.html
    
    Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
    Message-Id: <20200227055927.24566-3-misono.tomohiro@jp.fujitsu.com>
    Acked-by: Vivek Goyal <vgoyal@redhat.com>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Now the question has come up on the libc-coord list why the *at
interfaces are not used in such cases:

  <https://www.openwall.com/lists/libc-coord/2022/10/24/3>

Clearly the kernel lacks support for fgetxattrat today.  The usual
recommendation for emulating it is to use openat with O_PATH, and then
use getxattr on the virtual /proc/self/fd path.  This needs an
additional system call (openat, getxattr, close instead of fchdir,
getxattr), but it avoids the unshare(CLONE_FS) call behind libc's back.
The directory entries in /proc/self/fd present as symbolic links, but
are not implemented as such by the kernel: there is no separate pathname
lookup for already-open O_PATH descriptors, so there is no race.

Thoughts?

Thanks,
Florian



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-11-04 17:45 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-04  7:50 Use of unshare(CLONE_FS) in virtiofsd Florian Weimer
2022-11-04  7:50 ` [Virtio-fs] " Florian Weimer
2022-11-04 13:00 ` Vivek Goyal
2022-11-04 13:00   ` [Virtio-fs] " Vivek Goyal
2022-11-04 17:44   ` Florian Weimer
2022-11-04 17:44     ` [Virtio-fs] " Florian Weimer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.