From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755790AbeARO6t (ORCPT ); Thu, 18 Jan 2018 09:58:49 -0500 Received: from mail-pf0-f171.google.com ([209.85.192.171]:40126 "EHLO mail-pf0-f171.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754789AbeARO6n (ORCPT ); Thu, 18 Jan 2018 09:58:43 -0500 X-Google-Smtp-Source: ACJfBovqkDwmWp1OQzBDdmTze8AXRkOINxfM4gKZ5ismGKITQAYZysCMC7eJ4R3zIUPxMzkOtMDqBJauqJvzCz4h/84= MIME-Version: 1.0 In-Reply-To: References: <877etbcmnd.fsf@xmission.com> From: Alban Crequy Date: Thu, 18 Jan 2018 15:58:41 +0100 Message-ID: Subject: Re: [PATCH v5 00/11] FUSE mounts from non-init user namespaces To: "Eric W. Biederman" Cc: Dongsu Park , LKML , Linux Containers , Miklos Szeredi , Seth Forshee , Sargun Dhillon Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jan 9, 2018 at 4:05 PM, Dongsu Park wrote: > Hi, > > On Mon, Dec 25, 2017 at 8:05 AM, Eric W. Biederman > wrote: >> Dongsu Park writes: >> >>> This patchset v5 is based on work by Seth Forshee and Eric Biederman. >>> The latest patchset was v4: >>> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1132206.html >>> >>> At the moment, filesystems backed by physical medium can only be mounted >>> by real root in the initial user namespace. This restriction exists >>> because if it's allowed for root user in non-init user namespaces to >>> mount the filesystem, then it effectively allows the user to control the >>> underlying source of the filesystem. In case of FUSE, the source would >>> mean any underlying device. >>> >>> However, in many use cases such as containers, it's necessary to allow >>> filesystems to be mounted from non-init user namespaces. Goal of this >>> patchset is to allow FUSE filesystems to be mounted from non-init user >>> namespaces. Support for other filesystems like ext4 are not in the >>> scope of this patchset. >>> >>> Let me describe how to test mounting from non-init user namespaces. It's >>> assumed that tests are done via sshfs, a userspace filesystem based on >>> FUSE with ssh as backend. Testing system is Fedora 27. >> >> In general I am for this work, and more bodies and more eyes on it is >> generally better. >> >> I will review this after the New Year, I am out for the holidays right >> now. > > Thanks. I'll wait for your review. Hi Eric, Do you have some cycles for this now that it is the new year? A review on the associated ima issue would also be appreciated: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1587678.html Cheers, Alban >>> ==== >>> $ sudo dnf install -y sshfs >>> $ sudo mkdir -p /mnt/userns >>> >>> ### workaround to get the sshfs permission checks >>> $ sudo chown -R $UID:$UID /etc/ssh/ssh_config.d /usr/share/crypto-policies >>> >>> $ unshare -U -r -m >>> # sshfs root@localhost: /mnt/userns >>> >>> ### You can see sshfs being mounted from a non-init user namespace >>> # mount | grep sshfs >>> root@localhost: on /mnt/userns type fuse.sshfs >>> (rw,nosuid,nodev,relatime,user_id=0,group_id=0) >>> >>> # touch /mnt/userns/test >>> # ls -l /mnt/userns/test >>> -rw-r--r-- 1 root root 0 Dec 11 19:01 /mnt/userns/test >>> ==== >>> >>> Open another terminal, check the mountpoint from outside the namespace. >>> >>> ==== >>> $ grep userns /proc/$(pidof sshfs)/mountinfo >>> 131 102 0:35 / /mnt/userns rw,nosuid,nodev,relatime - fuse.sshfs >>> root@localhost: rw,user_id=0,group_id=0 >>> ==== >>> >>> After all tests are done, you can unmount the filesystem >>> inside the namespace. >>> >>> ==== >>> # fusermount -u /mnt/userns >>> ==== >>> >>> Changes since v4: >>> * Remove other parts like ext4 to keep the patchset minimal for FUSE >>> * Add and change commit messages >>> * Describe how to test non-init user namespaces >>> >>> TODO: >>> * Think through potential security implications. There are 2 patches >>> being prepared for security issues. One is "ima: define a new policy >>> option named force" by Mimi Zohar, which adds an option to specify >>> that the results should not be cached: >>> https://marc.info/?l=linux-integrity&m=151275680115856&w=2 >>> The other one is to basically prevent FUSE results from being cached, >>> which is still in progress. >>> >>> * Test IMA/LSMs. Details are written in >>> https://github.com/kinvolk/fuse-userns-patches/blob/master/tests/TESTING_INTEGRITY.md >>> >>> Patches 1-2 deal with an additional flag of lookup_bdev() to check for >>> additional inode permission. >>> >>> Patches 3-7 allow the superblock owner to change ownership of inodes, and >>> deal with additional capability checks w.r.t user namespaces. >>> >>> Patches 8-10 allow FUSE filesystems to be mounted outside of the init >>> user namespace. >>> >>> Patch 11 handles a corner case of non-root users in EVM. >>> >>> The patchset is also available in our github repo: >>> https://github.com/kinvolk/linux/tree/dongsu/fuse-userns-v5-1 >>> >>> >>> Eric W. Biederman (1): >>> fs: Allow superblock owner to change ownership of inodes >>> >>> Seth Forshee (10): >>> block_dev: Support checking inode permissions in lookup_bdev() >>> mtd: Check permissions towards mtd block device inode when mounting >>> fs: Don't remove suid for CAP_FSETID for userns root >>> fs: Allow superblock owner to access do_remount_sb() >>> capabilities: Allow privileged user in s_user_ns to set security.* >>> xattrs >>> fs: Allow CAP_SYS_ADMIN in s_user_ns to freeze and thaw filesystems >>> fuse: Support fuse filesystems outside of init_user_ns >>> fuse: Restrict allow_other to the superblock's namespace or a >>> descendant >>> fuse: Allow user namespace mounts >>> evm: Don't update hmacs in user ns mounts >>> >>> drivers/md/bcache/super.c | 2 +- >>> drivers/md/dm-table.c | 2 +- >>> drivers/mtd/mtdsuper.c | 6 +++++- >>> fs/attr.c | 34 ++++++++++++++++++++++++++-------- >>> fs/block_dev.c | 13 ++++++++++--- >>> fs/fuse/cuse.c | 3 ++- >>> fs/fuse/dev.c | 11 ++++++++--- >>> fs/fuse/dir.c | 16 ++++++++-------- >>> fs/fuse/fuse_i.h | 6 +++++- >>> fs/fuse/inode.c | 35 +++++++++++++++++++++-------------- >>> fs/inode.c | 6 ++++-- >>> fs/ioctl.c | 4 ++-- >>> fs/namespace.c | 4 ++-- >>> fs/proc/base.c | 7 +++++++ >>> fs/proc/generic.c | 7 +++++++ >>> fs/proc/proc_sysctl.c | 7 +++++++ >>> fs/quota/quota.c | 2 +- >>> include/linux/fs.h | 2 +- >>> kernel/user_namespace.c | 1 + >>> security/commoncap.c | 8 ++++++-- >>> security/integrity/evm/evm_crypto.c | 3 ++- >>> 21 files changed, 127 insertions(+), 52 deletions(-)