All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Colin Walters <walters@verbum.org>
Cc: qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH] virtiofsd: Use clone() and not unshare(), support non-root
Date: Mon, 4 May 2020 10:51:49 +0100	[thread overview]
Message-ID: <20200504095149.GG115875@redhat.com> (raw)
In-Reply-To: <348d4774-bd5f-4832-bd7e-a21491fdac8d@www.fastmail.com>

On Fri, May 01, 2020 at 02:25:48PM -0400, Colin Walters wrote:
> I'd like to make use of virtiofs as part of our tooling in
> https://github.com/coreos/coreos-assembler
> Most of the code runs as non-root today; qemu also runs as non-root.
> We use 9p right now.
> 
> virtiofsd's builtin sandboxing effectively assumes it runs as
> root.
> 
> First, change the code to use `clone()` and not `unshare()+fork()`.
> 
> Next, automatically use `CLONE_NEWUSER` if we're running as non root.

I'd suggest splitting these two, so that the re-factoring is separate
from introducing new functionality.

> 
> This is similar logic to that in https://github.com/containers/bubblewrap
> (Which...BTW, it could make sense for virtiofs to depend on bubblewrap
>  and re-exec itself rather than re-implementing the containerization
>  itself)
> 
> Signed-off-by: Colin Walters <walters@verbum.org>
> ---
>  tools/virtiofsd/passthrough_ll.c | 26 +++++++++++++++++++++-----
>  1 file changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
> index 4c35c95b25..468617f6d6 100644
> --- a/tools/virtiofsd/passthrough_ll.c
> +++ b/tools/virtiofsd/passthrough_ll.c
> @@ -2530,6 +2530,21 @@ static void print_capabilities(void)
>      printf("}\n");
>  }
>  
> +/* Copied from bubblewrap */
> +static int
> +raw_clone(unsigned long flags, void *child_stack)
> +{
> +#if defined(__s390__) || defined(__CRIS__)
> +  /*
> +   * On s390 and cris the order of the first and second arguments
> +   * of the raw clone() system call is reversed.
> +   */
> +    return (int) syscall(__NR_clone, child_stack, flags);
> +#else
> +    return (int) syscall(__NR_clone, flags, child_stack);
> +#endif
> +}

What's the reason for using the raw syscall ?  Was it just to avoid having
to allocate a new stack space ?


> +
>  /*
>   * Move to a new mount, net, and pid namespaces to isolate this process.
>   */
> @@ -2547,14 +2562,15 @@ static void setup_namespaces(struct lo_data *lo, struct fuse_session *se)
>       * an empty network namespace to prevent TCP/IP and other network
>       * activity in case this process is compromised.
>       */
> -    if (unshare(CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWNET) != 0) {
> -        fuse_log(FUSE_LOG_ERR, "unshare(CLONE_NEWPID | CLONE_NEWNS): %m\n");
> -        exit(1);
> +    int clone_flags = SIGCHLD | CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWNET;
> +    /* If we're non root, we need a new user namespace */
> +    if (getuid() != 0) {
> +        clone_flags |= CLONE_NEWUSER;
>      }

IIUC, with CLONE_NEWUSER we need to set a UID/GID mapping, otherwise all
file accesses will be squashed to the UID -1.  Or was it intentional
that you're only trying to provide read-only access to files that are
world-accessible ?



> -    child = fork();
> +    child = raw_clone(clone_flags, NULL);
>      if (child < 0) {
> -        fuse_log(FUSE_LOG_ERR, "fork() failed: %m\n");
> +        fuse_log(FUSE_LOG_ERR, "clone() failed: %m\n");
>          exit(1);
>      }
>      if (child > 0) {

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



  reply	other threads:[~2020-05-04  9:53 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-01 18:25 [PATCH] virtiofsd: Use clone() and not unshare(), support non-root Colin Walters
2020-05-04  9:51 ` Daniel P. Berrangé [this message]
2020-05-04 13:49 ` Stefan Hajnoczi
2020-05-04 13:49   ` [Virtio-fs] " Stefan Hajnoczi
2020-05-04 14:07 ` Marc-André Lureau
2020-05-04 14:20   ` Colin Walters
2020-05-04 15:43     ` Marc-André Lureau
2020-05-05 15:23   ` Stefan Hajnoczi
2020-05-05 15:32     ` Daniel P. Berrangé
2020-05-06 19:16 ` Dr. David Alan Gilbert
2020-05-07  9:28   ` Daniel P. Berrangé
2020-05-21 10:19     ` Stefan Hajnoczi
2020-05-21 10:43       ` Daniel P. Berrangé
2020-05-27 11:16         ` Stefan Hajnoczi
2020-06-02  9:55 ` Stefan Hajnoczi
2020-06-03  1:53   ` Colin Walters
2020-06-17 12:50     ` Stefan Hajnoczi
2020-06-17 12:55       ` Colin Walters
2020-06-23 12:34         ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200504095149.GG115875@redhat.com \
    --to=berrange@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=walters@verbum.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.