On Thu, Jul 30, 2020 at 06:21:34PM -0400, Daniel Walsh wrote:
> On 7/29/20 10:40, Stefan Hajnoczi wrote:
> > On Wed, Jul 29, 2020 at 09:59:01AM +0200, Roman Mohr wrote:
> >> On Tue, Jul 28, 2020 at 3:13 PM Vivek Goyal wrote:
> >>
> >>> On Tue, Jul 28, 2020 at 12:00:20PM +0200, Roman Mohr wrote:
> >>>> On Tue, Jul 28, 2020 at 3:07 AM misono.tomohiro@fujitsu.com <
> >>>> misono.tomohiro@fujitsu.com> wrote:
> >>>>
> >>>>>> Subject: [PATCH v2 3/3] virtiofsd: probe unshare(CLONE_FS) and print an error
> >> Yes they can run as root. I can tell you what we plan to do with the
> >> containerized virtiofsd: We run it as part of the user-owned pod (a set of
> >> containers).
> >> One of our main goals at the moment is to run VMs in a user-owned pod
> >> without additional privileges.
> >> So that in case the user (VM-creator/owner) enters the pod or something
> >> breaks out of the VM they are just in the unprivileged container sandbox.
> >> As part of that we try to get also rid of running containers in the
> >> user-context with the root user.
> >>
> >> One possible scenario which I could think of as being desirable from a
> >> kubevirt perspective:
> >> We would run the VM in one container and have an unprivileged
> >> virtiofsd container in parallel.
> >> This container already has its own mount namespace and it is not that
> >> critical if something manages to enter this sandbox.
> >>
> >> But we are not as far yet as getting completely rid of root right now in
> >> kubevirt, so if as a temporary step it needs root, the current proposed
> >> changes would still be very useful for us.
> > What is the issue with root in user namespaces?
> >
> > I remember a few years ago it was seen as a major security issue but
> > don't remember if container runtimes were already using user namespaces
> > back then.
> >
> > I guess the goal might be simply to minimize Linux capabilities as much
> > as possible?
> >
> > virtiofsd could nominally run with an arbitrary uid/gid but it still
> > needs the Linux capabilities that allow it to change uid/gid and
> > override file system permission checks just like the root user. Not sure
> > if there is any advantage to running with uid 1000 when you still have
> > these Linux capabilities.
> >
> > Stefan
>
> When you run in a user namespace, virtiofsd would only have
> setuid/setgid over the range of UIDs mapped into the user namespace. So
> if UID=0 on the host is not mapped, then the container can not create
> real UID=0 files on disk.
>
> Similarly you can protect the user directories and any content by
> running the containers in a really high UID Mapping.

Roman, do user namespaces address your concerns about uid 0 in containers?

Stefan
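
For illustration, here is a minimal sketch of the UID mapping described in
the quoted text above. It assumes the three-column format of
/proc/<pid>/uid_map (first UID inside the namespace, first UID outside,
range length). The mapping "0 100000 65536" and all function names are
hypothetical, chosen only for the example; this is not virtiofsd code.

```python
from typing import List, Optional, Tuple

# One entry per uid_map line:
# (first UID inside the namespace, first UID outside, length of the range)
UidMap = List[Tuple[int, int, int]]


def parse_uid_map(text: str) -> UidMap:
    """Parse text in the /proc/<pid>/uid_map format into a list of ranges."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, count = (int(f) for f in fields)
        entries.append((inside, outside, count))
    return entries


def to_host_uid(uid_map: UidMap, ns_uid: int) -> Optional[int]:
    """Translate a UID seen inside the namespace to the UID outside it.

    Returns None when ns_uid falls outside every mapped range.
    """
    for inside, outside, count in uid_map:
        if inside <= ns_uid < inside + count:
            return outside + (ns_uid - inside)
    return None


def host_uid_is_mapped(uid_map: UidMap, host_uid: int) -> bool:
    """Return True if some UID inside the namespace corresponds to host_uid."""
    return any(outside <= host_uid < outside + count
               for _, outside, count in uid_map)


# Hypothetical mapping: namespace UIDs 0..65535 -> host UIDs 100000..165535.
EXAMPLE_MAP = parse_uid_map("0 100000 65536\n")

if __name__ == "__main__":
    print("namespace uid 0 on the host:", to_host_uid(EXAMPLE_MAP, 0))
    print("host uid 0 reachable:", host_uid_is_mapped(EXAMPLE_MAP, 0))
```

With a mapping like this, uid 0 inside the namespace corresponds to an
unprivileged high UID on the host, and host uid 0 has no counterpart inside
the namespace at all.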