All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Seth Forshee <sforshee@digitalocean.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel@vger.kernel.org,
	Christian Brauner <christian.brauner@ubuntu.com>
Subject: [PATCH 00/10] Extend and tweak mapping support
Date: Tue, 23 Nov 2021 12:42:17 +0100	[thread overview]
Message-ID: <20211123114227.3124056-1-brauner@kernel.org> (raw)

From: Christian Brauner <christian.brauner@ubuntu.com>

Hey,

This series extend the mapping infrastructure in order to support mapped
mounts of mapped filesystems in the future.

Currently we only support mapped mounts of filesystems mounted without an
idmapping. This was a consicous decision mentioned in multiple places. For
example, see [1].

In our mapping documentation in [3] we explained in detail that it is
perfectly fine to extend support for mapped mounts to filesystem's mounted
with an idmapping should the need arise. The need has been there for some
time now (cf. [2]).

Before we can port any such filesystem we need to first extend the mapping
helpers to account for the filesystem's idmapping in the remapping helpers.
This again, is explained at length in our documentation at [3].

Currently, the low-level mapping helpers implement the remapping algorithms
described in [3] in a simplified manner. Because we could rely on the fact
that all filesystems supporting mapped mounts are mounted without an
idmapping the translation step from or into the filesystem idmapping could
be skipped.

In order to support mapped mounts of filesystem's mountable with an
idmapping the translation step we were able to skip before cannot be
skipped anymore. A filesystem mounted with an idmapping is very likely to
not use an identity mapping and will instead use a non-identity mapping. So
the translation step from or into the filesystem's idmapping in the
remapping algorithm cannot be skipped for such filesystems. More details
with examples can be found in [3].

This series adds a few new as well as prepares and tweaks some already
existing low-level mapping helpers to perform the full translation
algorithm explained in [3]. The low-level helpers can be written in a way
that they only perform the additional translation step when the filesystem
is indeed mounted with an idmapping.

Since we don't yet support such a filesystem yet a kernel was compiled
carrying a trivial patch making ext4 mountable with an idmapping:

# We're located on the host with the initial idmapping.
ubuntu@f2-vm:~$ cat /proc/self/uid_map
         0          0 4294967295

# Mount an ext4 filesystem with the initial idmapping.
ubuntu@f2-vm:~$ sudo mount -t ext4 /dev/loop0 /mnt

# The filesystem contains two files. One owned by id 0 and another one owned by
# id 1000 in the initial idmapping.
ubuntu@f2-vm:~$ ls -al /mnt/
total 8
drwxrwxrwx  2 root   root   4096 Nov 22 17:04 .
drwxr-xr-x 24 root   root   4096 Nov 20 11:24 ..
-rw-r--r--  1 root   root      0 Nov 22 17:04 file_init_mapping_0
-rw-r--r--  1 ubuntu ubuntu    0 Nov 22 17:04 file_init_mapping_1000

# Umount it again so we we can mount it in another namespace later.
ubuntu@f2-vm:~$ sudo umount  /mnt

# Use the lxc-usernsexec binary to run a shell in a user and mount namespace
# with an idmapping of 0:10000:100000000.
#
# This idmapping will have the effect that files which are owned by i_{g,u}id
# 10000 and files that are owned by i_{g,u}id 11000 will be owned by {g,u}id
# 0 and {g,u}id 1000 with that namespace respectively.
ubuntu@f2-vm:~$ sudo lxc-usernsexec -m b:0:10000:100000000 -- bash

# Verify that we're really running with the expected idmapping.
root@f2-vm:/home/ubuntu# cat /proc/self/uid_map
         0      10000  100000000

# Mount the ext4 filesystem in the user and mountns with the idmapping
# 0:10000:100000000.
# 
# Note, that this requires a test kernel that makes ext4 mountable in a
# non-initial userns. The patch is simply:
#
# diff --git a/fs/ext4/super.c b/fs/ext4/super.c
# index 4e33b5eca694..0221e8211e5b 100644
# --- a/fs/ext4/super.c
# +++ b/fs/ext4/super.c
# @@ -6584,7 +6584,7 @@ static struct file_system_type ext4_fs_type = {
#         .name           = "ext4",
#         .mount          = ext4_mount,
#         .kill_sb        = kill_block_super,
# -       .fs_flags       = FS_REQUIRES_DEV | FS_ALLOW_IDMAP,
# +       .fs_flags       = FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_USERNS_MOUNT,
#  };
#  MODULE_ALIAS_FS("ext4");
root@f2-vm:/home/ubuntu# mount -t ext4 /dev/loop0 /mnt

# We verify that we can still interact with the files and that they are
# correctly owned from witin the namespace.
# 
# As mentioned before, a file owned by {g,u}id 0 within the userns is mapped to
# {g,u}id 10000 in the initial userns. So inode->i_{g,u}id will contain
# 10000. Similarly a file owned by {g,u}id 1000 within the userns means that
# inode->i_{g,u}id will contain 11000.
the inode->i_{g,u}id 
root@f2-vm:/home/ubuntu# ls -al /mnt/
total 8
drwxrwxrwx  2 root   root    4096 Nov 22 17:04 .
drwxr-xr-x 24 nobody nogroup 4096 Nov 20 11:24 ..
-rw-r--r--  1 root   root       0 Nov 22 17:04 file_init_mapping_0
-rw-r--r--  1 ubuntu ubuntu     0 Nov 22 17:04 file_init_mapping_1000

# Since we own the superblock we can also create idmapped mounts of the
# filesystem.
# First, we create an idmapped mount that maps {g,u}id 1000 to {g,u}id 0.
# Note that the idmapping 1000:0:1 as written from userspace is different from
# the idmapping the kernel actually stores. The idmapping is always translated
# into the initial userns. So the kernel actually stores the idmapping
# 1000:10000:1.
root@f2-vm:/home/ubuntu# /mount-idmapped --map-mount b:1000:0:1 /mnt /opt

# Verify that files owned by {g,u}id 1000 are now owned by {g,u}id 0 and all
# others are owned by the overflow{g,u}id.
root@f2-vm:/home/ubuntu# ls -al /opt
total 8
drwxrwxrwx  2 nobody nogroup 4096 Nov 22 17:04 .
drwxr-xr-x 24 nobody nogroup 4096 Nov 20 11:24 ..
-rw-r--r--  1 nobody nogroup    0 Nov 22 17:04 file_init_mapping_0
-rw-r--r--  1 root   root       0 Nov 22 17:04 file_init_mapping_1000

# Umount the idmapped mount.
root@f2-vm:/mnt# umount /opt

# Now create another idmapped mount with a mapping of 0:20000:1000000.
#
# Similar to what we said above the kernel doesn't store 0:20000:1000000 but
# 0:30000:1000000.
root@f2-vm:/mnt# /mount-idmapped --map-mount b:0:20000:1000000 /mnt /opt

# Verify that files owned by {g,u}id 0 in the filesystem's idmapping are owned
# by {g,u}id 20000 in the mount's idmapping and that files owned by {g,u}id
# 10000 in the filesystem's idmapping are owned by {g,u}id 21000 in the
# mount's idmapping.
root@f2-vm:/mnt# ls -al /opt
total 8
drwxrwxrwx  2  20000   20000 4096 Nov 22 17:04 .
drwxr-xr-x 24 nobody nogroup 4096 Nov 20 11:24 ..
-rw-r--r--  1  20000   20000    0 Nov 22 17:04 file_init_mapping_0
-rw-r--r--  1  21000   21000    0 Nov 22 17:04 file_init_mapping_1000

# Use the lxc-usernsexec binary to run a shell in a user and mount namespace
# with an idmapping of 0:20000:1000000.
#
# Similar to what we said above the kernel doesn't store 0:20000:1000000 but
# 0:30000:1000000.
root@f2-vm:/mnt# lxc-usernsexec -m b:0:20000:1000000 -- bash

# Verify that we managed to use the correct nested mapping.
root@f2-vm:/mnt# cat /proc/self/uid_map
         0      20000    1000000

# Verify that files owned by {g,u}id 20000 in the idmapped mount when accessed
# from the ancestor userns are now owned by {g,u}id 0 in the idmapped mount
# when accessed from the new namespace. Similarly, verify that files owned by
# {g,u}id 21000 in the idmapped mount when accessed from the ancestor userns
# are now owned by {g,u}id 1000 in the idmapped mount when accesed from the new
# namespace.
root@f2-vm:/mnt# ls -al /opt/
total 8
drwxrwxrwx  2 root   root    4096 Nov 22 17:04 .
drwxr-xr-x 24 nobody nogroup 4096 Nov 20 11:24 ..
-rw-r--r--  1 root   root       0 Nov 22 17:04 file_init_mapping_0
-rw-r--r--  1 ubuntu ubuntu     0 Nov 22 17:04 file_init_mapping_1000

# Create a file as id 0 in the current namespace through the idmapped mount.
root@f2-vm:/mnt# touch /opt/file_from_nested_ns

# Verify that the newly created file is owned by {g,u}id 0.
root@f2-vm:/mnt# ls -al /opt
total 8
drwxrwxrwx  2 root   root    4096 Nov 22 17:32 .
drwxr-xr-x 24 nobody nogroup 4096 Nov 20 11:24 ..
-rw-r--r--  1 root   root       0 Nov 22 17:32 file_from_nested_ns
-rw-r--r--  1 root   root       0 Nov 22 17:04 file_init_mapping_0
-rw-r--r--  1 ubuntu ubuntu     0 Nov 22 17:04 file_init_mapping_1000

[1]: commit 2ca4dcc4909d ("fs/mount_setattr: tighten permission checks")
[2]: https://github.com/containers/podman/issues/10374
[3]: Documentations/filesystems/idmappings.rst

Thanks!
Christian

Christian Brauner (10):
  fs: add is_mapped_mnt() helper
  fs: move mapping helpers
  fs: tweak fsuidgid_has_mapping()
  fs: account for filesystem mappings
  docs: update mapping documentation
  fs: use low-level mapping helpers
  fs: remove unused low-level mapping helpers
  fs: port higher-level mapping helpers
  fs: add i_user_ns() helper
  fs: support mapped mounts of mapped filesystems

 Documentation/filesystems/idmappings.rst |  72 -------
 fs/cachefiles/bind.c                     |   2 +-
 fs/ecryptfs/main.c                       |   2 +-
 fs/ksmbd/smbacl.c                        |  18 +-
 fs/ksmbd/smbacl.h                        |   4 +-
 fs/namespace.c                           |  38 ++--
 fs/nfsd/export.c                         |   2 +-
 fs/open.c                                |   7 +-
 fs/overlayfs/super.c                     |   2 +-
 fs/posix_acl.c                           |  16 +-
 fs/proc_namespace.c                      |   2 +-
 fs/xfs/xfs_inode.c                       |  10 +-
 fs/xfs/xfs_symlink.c                     |   5 +-
 include/linux/fs.h                       | 141 ++++----------
 include/linux/mnt_mapping.h              | 234 +++++++++++++++++++++++
 security/commoncap.c                     |  14 +-
 16 files changed, 339 insertions(+), 230 deletions(-)
 create mode 100644 include/linux/mnt_mapping.h


base-commit: 136057256686de39cc3a07c2e39ef6bc43003ff6
-- 
2.30.2


             reply	other threads:[~2021-11-23 11:42 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-23 11:42 Christian Brauner [this message]
2021-11-23 11:42 ` [PATCH 01/10] fs: add is_mapped_mnt() helper Christian Brauner
2021-11-30  6:25   ` Amir Goldstein
2021-11-30  8:48     ` Christian Brauner
2021-11-23 11:42 ` [PATCH 02/10] fs: move mapping helpers Christian Brauner
2021-11-30  6:35   ` Amir Goldstein
2021-11-30  8:53     ` Christian Brauner
2021-11-23 11:42 ` [PATCH 03/10] fs: tweak fsuidgid_has_mapping() Christian Brauner
2021-11-30  6:44   ` Amir Goldstein
2021-11-23 11:42 ` [PATCH 04/10] fs: account for filesystem mappings Christian Brauner
2021-11-23 11:42 ` [PATCH 05/10] docs: update mapping documentation Christian Brauner
2021-11-23 11:42 ` [PATCH 06/10] fs: use low-level mapping helpers Christian Brauner
2021-11-23 11:42 ` [PATCH 07/10] fs: remove unused " Christian Brauner
2021-11-30  6:53   ` Amir Goldstein
2021-11-23 11:42 ` [PATCH 08/10] fs: port higher-level " Christian Brauner
2021-11-30  7:15   ` Amir Goldstein
2021-11-30  8:52     ` Christian Brauner
2021-11-23 11:42 ` [PATCH 09/10] fs: add i_user_ns() helper Christian Brauner
2021-11-30  7:02   ` Amir Goldstein
2021-11-23 11:42 ` [PATCH 10/10] fs: support mapped mounts of mapped filesystems Christian Brauner
2021-11-29 10:36 ` [PATCH 00/10] Extend and tweak mapping support Christian Brauner
2021-11-30  5:51 ` Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211123114227.3124056-1-brauner@kernel.org \
    --to=brauner@kernel.org \
    --cc=christian.brauner@ubuntu.com \
    --cc=hch@lst.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=sforshee@digitalocean.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.