Message-ID: <1463080006.2380.39.camel@HansenPartnership.com>
Subject: [RFC 0/1] shiftfs: uid/gid shifting filesystem
From: James Bottomley
To: Djalal Harouni, Chris Mason, tytso@mit.edu, Serge Hallyn,
    Josh Triplett, "Eric W. Biederman", Andy Lutomirski, Seth Forshee,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-security-module@vger.kernel.org, Dongsu Park, David Herrmann,
    Miklos Szeredi, Alban Crequy, Al Viro
Date: Thu, 12 May 2016 12:06:46 -0700

This is currently an RFC because the patch applies to Linus head, but
needs altering for the vfs tree, so I'll respin and resend after the
merge window closes.

My use case for this is that I run a lot of unprivileged architectural
emulation containers on my system using user namespaces. Details here:

http://blog.hansenpartnership.com/unprivileged-build-containers/

They're mostly for building non-x86 stuff (like aarch64 and arm secure
boot and mips images). For builds, I have all the environments in my
home directory with downshifted uids; however, sometimes I need to use
them to administer real images that run on systems, meaning the uids
are the usual privileged ones not the downshifted ones. The only
current choice I have is to start the emulation as root so the uid/gids
match.

The reason for this filesystem is to use my standard unprivileged
containers to maintain these images. The way I do this is to crack the
image with a loop mount and then shift the uids before bringing up the
container. I usually loop mount into /var/tmp/images/, so it's owned
by real root there:

    jarvis:~ # ls -l /var/tmp/images/mips|head -4
    total 0
    drwxr-xr-x 1 root root 8192 May 12 08:33 bin
    drwxr-xr-x 1 root root    6 May 12 08:33 boot
    drwxr-xr-x 1 root root  167 May 12 08:33 dev

And I usually run my build containers with a uid_map of

    0 100000 1000
    1000 1000 1
    65534 101000 1

(maps 0-999 shifted, then shifts nobody [65534] to 101000 and keeps my
uid [1000] fixed so I can mount my home directory into the namespace)
and something similar with gid_map. So I shift mount the mips image
with

    mount -t shiftfs -o uidmap=0:100000:1000,uidmap=65534:101000:1,gidmap=0:100000:100,gidmap=101:100101:899,gidmap=65533:101000:2 /var/tmp/images/mips /home/jejb/containers/mips

and I now see it as

    jejb@jarvis:~> ls -l containers/mips|head -4
    total 0
    drwxr-xr-x 1 100000 100000 8192 May 12 08:33 bin/
    drwxr-xr-x 1 100000 100000    6 May 12 08:33 boot/
    drwxr-xr-x 1 100000 100000  167 May 12 08:33 dev/

like my usual unprivileged build roots, and I can now use an
unprivileged container to enter and administer the image.

It seems like a lot of container systems need to do something similar
when they try to provide unprivileged access to standard images. Right
at the moment, the security mechanism only allows root in the host to
use this, but it's not impossible to come up with a scheme for marking
trees that can safely be shift mounted by unprivileged user namespaces.

James

---
 fs/Kconfig                 |   8 +
 fs/Makefile                |   1 +
 fs/shiftfs.c               | 833 +++++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/magic.h |   2 +
 4 files changed, 844 insertions(+)
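
To illustrate the arithmetic behind the uidmap/gidmap options in the mount command above, here is a small userspace sketch in Python. It is not taken from the patch: the helper names parse_map_options and shift_id are hypothetical, the from:to:count reading of each option is inferred from the example (on-disk id 0 showing up as 100000), and returning None for an id that no extent covers is an assumption, since the cover letter does not say how shiftfs treats unmapped ids.

```python
from typing import List, Optional, Tuple

# One extent: (first on-disk id, first shifted id, number of ids covered).
Extent = Tuple[int, int, int]


def parse_map_options(options: str, key: str) -> List[Extent]:
    """Collect every key=from:to:count entry from a comma-separated option string."""
    extents: List[Extent] = []
    for opt in options.split(","):
        name, _, value = opt.partition("=")
        if name == key:
            first, target, count = (int(field) for field in value.split(":"))
            extents.append((first, target, count))
    return extents


def shift_id(ident: int, extents: List[Extent]) -> Optional[int]:
    """Return the shifted id, or None if no extent covers it (assumed behaviour)."""
    for first, target, count in extents:
        if first <= ident < first + count:
            return target + (ident - first)
    return None


# Option string copied from the mount command in the message.
OPTIONS = (
    "uidmap=0:100000:1000,uidmap=65534:101000:1,"
    "gidmap=0:100000:100,gidmap=101:100101:899,gidmap=65533:101000:2"
)

uid_extents = parse_map_options(OPTIONS, "uidmap")
gid_extents = parse_map_options(OPTIONS, "gidmap")

# Files owned by root (uid 0, gid 0) on the loop mount appear as 100000:100000.
print(shift_id(0, uid_extents), shift_id(0, gid_extents))
```

With these extents, uid 65534 (nobody) lands on 101000, while gid 100 falls in the gap between the first two gidmap extents and is left unmapped in this sketch.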