From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: vromanso@redhat.com,
	Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>,
	qemu-devel@nongnu.org, virtio-fs@redhat.com, rmohr@redhat.com,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH for-5.1 3/3] virtiofsd: probe unshare(CLONE_FS) and print an error
Date: Wed, 22 Jul 2020 18:03:28 +0100
Message-ID: <20200722170328.GU2324845@redhat.com>
In-Reply-To: <20200722130206.224898-4-stefanha@redhat.com>

On Wed, Jul 22, 2020 at 02:02:06PM +0100, Stefan Hajnoczi wrote:
> An assertion failure is raised during request processing if
> unshare(CLONE_FS) fails. Implement a probe at startup so the problem can
> be detected right away.
> 
> Unfortunately Docker/Moby does not include unshare in the seccomp.json
> list unless CAP_SYS_ADMIN is given. Other seccomp.json lists always
> include unshare (e.g. podman is unaffected):
> https://raw.githubusercontent.com/seccomp/containers-golang/master/seccomp.json
> 
> Use "docker run --security-opt seccomp=path/to/seccomp.json ..." if the
> default seccomp.json is missing unshare.
> 
> Cc: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  tools/virtiofsd/fuse_virtio.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/tools/virtiofsd/fuse_virtio.c b/tools/virtiofsd/fuse_virtio.c
> index 3b6d16a041..ebeb352514 100644
> --- a/tools/virtiofsd/fuse_virtio.c
> +++ b/tools/virtiofsd/fuse_virtio.c
> @@ -949,6 +949,19 @@ int virtio_session_mount(struct fuse_session *se)
>  {
>      int ret;
>  
> +    /*
> +     * Test that unshare(CLONE_FS) works. fv_queue_worker() will need it. It's
> +     * an unprivileged system call but some Docker/Moby versions are known to
> +     * reject it via seccomp when CAP_SYS_ADMIN is not given.
> +     */
> +    ret = unshare(CLONE_FS);
> +    if (ret == -1 && errno == EPERM) {
> +        fuse_log(FUSE_LOG_ERR, "unshare(CLONE_FS) failed with EPERM. If "
> +                "running in a container please check that the container "
> +                "runtime seccomp policy allows unshare.\n");
> +        return -1;
> +    }
> +

This describes the unshare() call as a "probe" and a "test", but that's
misleading IMHO. A "probe" / "test" implies that after it has completed,
there's no lingering side-effect, which isn't the case here.

This is actively changing the process' namespace environment in the
success case, and not putting it back how it was originally.

Maybe this is in fact OK, but if so I think the commit message and
comment should explain/justify why it's fine to have this lingering
side-effect.

If we want to avoid the side-effect then we need to fork() and run
unshare() in the child, and use a check of exit status of the child
to determine the result.
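
To illustrate, here is a minimal sketch of such a side-effect-free probe.
It is not taken from the patch: the helper name probe_unshare_clone_fs()
is hypothetical, and passing errno back through the child's exit status,
as well as treating a signal-killed child as EPERM, are assumptions made
for the example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): call unshare(CLONE_FS) in a
 * short-lived child so the calling process is left unchanged.
 *
 * Returns 0 if unshare() succeeded in the child, or -1 with errno set.
 */
static int probe_unshare_clone_fs(void)
{
    pid_t pid, r;
    int status;

    pid = fork();
    if (pid == -1) {
        return -1; /* errno set by fork() */
    }

    if (pid == 0) {
        /*
         * Child: report the result via the exit status. Assumption: the
         * errno values of interest (e.g. EPERM) fit in the 8-bit status.
         */
        _exit(unshare(CLONE_FS) == 0 ? 0 : errno);
    }

    /* Parent: wait for the child, retrying if interrupted by a signal */
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);
    if (r == -1) {
        return -1;
    }

    if (WIFEXITED(status)) {
        if (WEXITSTATUS(status) == 0) {
            return 0;
        }
        errno = WEXITSTATUS(status);
        return -1;
    }

    /*
     * Child was killed by a signal. Assumption: report this as EPERM,
     * since a seccomp policy may kill the task instead of failing the call.
     */
    errno = EPERM;
    return -1;
}
```

virtio_session_mount() could then check the helper's return value and
errno in place of the direct unshare() call and log the same error
message, without altering its own process state.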

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


