* [PATCH v2 00/21] btrfs: support idmapped mounts
@ 2021-07-19 11:10 Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 01/21] namei: add mapping aware lookup helper Christian Brauner
                   ` (21 more replies)
  0 siblings, 22 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner

From: Christian Brauner <christian.brauner@ubuntu.com>

Hey everyone,

This series enables the creation of idmapped mounts on btrfs. On the list of
filesystems btrfs was pretty high-up and requested quite often from userspace
(cf. [1]). This series requires just a few changes to the vfs for specific
lookup helpers that btrfs relies on to perform permission checking when looking
up an inode. The changes are required to port other filesystems as well.

The conversion of the necessary btrfs internals was fairly straightforward. No
invasive changes were needed. I've decided to split up the patchset into very
small individual patches. This hopefully makes the series more readable and
fairly easy to review. The overall changeset is quite small.

All non-filesystem wide ioctls that perform permission checking based on inodes
can be supported on idmapped mounts. There are really just a few restrictions.
They should only affect the deletion of subvolumes by subvolume id, which can
be used to delete any subvolume in the filesystem even though the caller might
not even be able to see the subvolume under their mount. Other than that,
behavior on idmapped and non-idmapped mounts is identical for all enabled
ioctls.

The changeset has an associated new testsuite specific to btrfs. The
core vfs operations that btrfs implements are covered by the generic
idmapped mount testsuite. For the ioctls a new testsuite was added. It
is sent alongside this patchset for ease of review but will very likely
be merged independent of it.

All patches are based on v5.14-rc2.

The series can be pulled from:
https://git.kernel.org/brauner/h/fs.idmapped.btrfs
https://github.com/brauner/linux/tree/fs.idmapped.btrfs

The xfstests can be pulled from:
https://git.kernel.org/brauner/xfstests-dev/h/fs.idmapped.btrfs
https://github.com/brauner/xfstests/tree/fs.idmapped.btrfs

Note, the new btrfs xfstests patch is on top of a branch of mine
containing a few more preliminary patches. So if you want to run the
tests, please simply pull the branch and build from there.

The series has been tested with xfstests including the newly added btrfs
specific test. All tests pass.
There were three unrelated failures that I observed: btrfs/219,
btrfs/2020 and btrfs/235. All three also fail on earlier kernels
without the patch series applied.

Thanks!
Christian

[1]: https://github.com/systemd/systemd/pull/19438#discussion_r622807165

Christian Brauner (20):
  namei: add mapping aware lookup helper
  btrfs/inode: handle idmaps in btrfs_new_inode()
  btrfs/inode: allow idmapped rename iop
  btrfs/inode: allow idmapped getattr iop
  btrfs/inode: allow idmapped mknod iop
  btrfs/inode: allow idmapped create iop
  btrfs/inode: allow idmapped mkdir iop
  btrfs/inode: allow idmapped symlink iop
  btrfs/inode: allow idmapped tmpfile iop
  btrfs/inode: allow idmapped setattr iop
  btrfs/inode: allow idmapped permission iop
  btrfs/ioctl: check whether fs{g,u}id are mapped during subvolume
    creation
  btrfs/inode: allow idmapped BTRFS_IOC_{SNAP,SUBVOL}_CREATE{_V2} ioctl
  btrfs/ioctl: allow idmapped BTRFS_IOC_SNAP_DESTROY{_V2} ioctl
  btrfs/ioctl: relax restrictions for BTRFS_IOC_SNAP_DESTROY_V2 with
    subvolids
  btrfs/ioctl: allow idmapped BTRFS_IOC_SET_RECEIVED_SUBVOL{_32} ioctl
  btrfs/ioctl: allow idmapped BTRFS_IOC_SUBVOL_SETFLAGS ioctl
  btrfs/ioctl: allow idmapped BTRFS_IOC_INO_LOOKUP_USER ioctl
  btrfs/acl: handle idmapped mounts
  btrfs/super: allow idmapped btrfs

 fs/btrfs/acl.c        | 11 ++---
 fs/btrfs/ctree.h      |  3 +-
 fs/btrfs/inode.c      | 62 ++++++++++++++++------------
 fs/btrfs/ioctl.c      | 96 ++++++++++++++++++++++++++++---------------
 fs/btrfs/super.c      |  2 +-
 fs/namei.c            | 44 +++++++++++++++++---
 include/linux/namei.h |  2 +
 7 files changed, 148 insertions(+), 72 deletions(-)


base-commit: 2734d6c1b1a089fb593ef6a23d4b70903526fe0c
-- 
2.30.2



* [PATCH v2 01/21] namei: add mapping aware lookup helper
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 02/21] btrfs/inode: handle idmaps in btrfs_new_inode() Christian Brauner
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig, linux-fsdevel

From: Christian Brauner <christian.brauner@ubuntu.com>

Various filesystems rely on the lookup_one_len() helper to lookup a single path
component relative to a well-known starting point. Allow such filesystems to
support idmapped mounts by adding a version of this helper to take the idmap
into account when calling inode_permission(). This change is required to let
btrfs (and other filesystems) support idmapped mounts.
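
The shape of this change can be sketched with a small userspace model. This is
only an illustration under stated assumptions: every toy_* name, the
single-range mapping, and the two-bit permission model are hypothetical and are
not kernel APIs. The real helpers operate on a struct user_namespace and on
dentries. The point is the structure: one common routine takes the mapping, the
existing entry point keeps passing an identity mapping, and a new entry point
lets the caller supply the mount's mapping.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_INVALID_ID UINT32_MAX

/* Hypothetical single-range mapping; not a kernel structure. */
struct toy_idmap {
	uint32_t fs_first;	/* first id as stored by the filesystem */
	uint32_t mnt_first;	/* the same id as seen through the mount */
	uint32_t count;		/* number of ids in the range */
};

/* Identity mapping, standing in for &init_user_ns in this model. */
const struct toy_idmap toy_identity = { 0, 0, UINT32_MAX };

/* Hypothetical directory with a drastically simplified permission model. */
struct toy_dir {
	uint32_t owner_uid;	/* owner as stored by the filesystem */
	bool owner_may_search;
	bool other_may_search;
};

/* Translate a filesystem id into the id visible through the mount. */
uint32_t toy_into_mnt(const struct toy_idmap *map, uint32_t fs_id)
{
	if (fs_id < map->fs_first || fs_id - map->fs_first >= map->count)
		return TOY_INVALID_ID;
	return map->mnt_first + (fs_id - map->fs_first);
}

/* Common routine: the search check on the base directory uses the mapping. */
int toy_lookup_common(const struct toy_idmap *map, const struct toy_dir *dir,
		      uint32_t caller_uid)
{
	uint32_t owner;
	bool allowed;

	assert(map && dir);

	owner = toy_into_mnt(map, dir->owner_uid);
	if (owner != TOY_INVALID_ID && owner == caller_uid)
		allowed = dir->owner_may_search;
	else
		allowed = dir->other_may_search;

	return allowed ? 0 : -EACCES;
}

/* Existing entry point: behavior unchanged, always the identity mapping. */
int toy_lookup_one(const struct toy_dir *dir, uint32_t caller_uid)
{
	return toy_lookup_common(&toy_identity, dir, caller_uid);
}

/* New entry point: the caller passes the mapping of its mount. */
int toy_lookup_mapped_one(const struct toy_idmap *map,
			  const struct toy_dir *dir, uint32_t caller_uid)
{
	return toy_lookup_common(map, dir, caller_uid);
}
```

In this model, a directory stored with owner 0 and searchable only by its owner
is rejected for a caller with uid 10000 through toy_lookup_one(), but accepted
through toy_lookup_mapped_one() when the mapping shifts filesystem ids 0..999
to 10000..10999.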

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
- Al Viro <viro@zeniv.linux.org.uk>:
  - Add a new lookup helper instead of changing the old ones.
---
 fs/namei.c            | 44 +++++++++++++++++++++++++++++++++++++------
 include/linux/namei.h |  2 ++
 2 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index bf6d8a738c59..8f416698ee34 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2575,8 +2575,9 @@ int vfs_path_lookup(struct dentry *dentry, struct vfsmount *mnt,
 }
 EXPORT_SYMBOL(vfs_path_lookup);
 
-static int lookup_one_len_common(const char *name, struct dentry *base,
-				 int len, struct qstr *this)
+static int lookup_one_len_common(struct user_namespace *mnt_userns,
+				 const char *name, struct dentry *base, int len,
+				 struct qstr *this)
 {
 	this->name = name;
 	this->len = len;
@@ -2604,7 +2605,7 @@ static int lookup_one_len_common(const char *name, struct dentry *base,
 			return err;
 	}
 
-	return inode_permission(&init_user_ns, base->d_inode, MAY_EXEC);
+	return inode_permission(mnt_userns, base->d_inode, MAY_EXEC);
 }
 
 /**
@@ -2628,7 +2629,7 @@ struct dentry *try_lookup_one_len(const char *name, struct dentry *base, int len
 
 	WARN_ON_ONCE(!inode_is_locked(base->d_inode));
 
-	err = lookup_one_len_common(name, base, len, &this);
+	err = lookup_one_len_common(&init_user_ns, name, base, len, &this);
 	if (err)
 		return ERR_PTR(err);
 
@@ -2655,7 +2656,7 @@ struct dentry *lookup_one_len(const char *name, struct dentry *base, int len)
 
 	WARN_ON_ONCE(!inode_is_locked(base->d_inode));
 
-	err = lookup_one_len_common(name, base, len, &this);
+	err = lookup_one_len_common(&init_user_ns, name, base, len, &this);
 	if (err)
 		return ERR_PTR(err);
 
@@ -2664,6 +2665,37 @@ struct dentry *lookup_one_len(const char *name, struct dentry *base, int len)
 }
 EXPORT_SYMBOL(lookup_one_len);
 
+/**
+ * lookup_mapped_one_len - filesystem helper to lookup single pathname component
+ * @mnt_userns:	user namespace of the mount the lookup is performed from
+ * @name:	pathname component to lookup
+ * @base:	base directory to lookup from
+ * @len:	maximum length @len should be interpreted to
+ *
+ * Note that this routine is purely a helper for filesystem usage and should
+ * not be called by generic code.
+ *
+ * The caller must hold base->i_mutex.
+ */
+struct dentry *lookup_mapped_one_len(struct user_namespace *mnt_userns,
+				     const char *name, struct dentry *base,
+				     int len)
+{
+	struct dentry *dentry;
+	struct qstr this;
+	int err;
+
+	WARN_ON_ONCE(!inode_is_locked(base->d_inode));
+
+	err = lookup_one_len_common(mnt_userns, name, base, len, &this);
+	if (err)
+		return ERR_PTR(err);
+
+	dentry = lookup_dcache(&this, base, 0);
+	return dentry ? dentry : __lookup_slow(&this, base, 0);
+}
+EXPORT_SYMBOL(lookup_mapped_one_len);
+
 /**
  * lookup_one_len_unlocked - filesystem helper to lookup single pathname component
  * @name:	pathname component to lookup
@@ -2683,7 +2715,7 @@ struct dentry *lookup_one_len_unlocked(const char *name,
 	int err;
 	struct dentry *ret;
 
-	err = lookup_one_len_common(name, base, len, &this);
+	err = lookup_one_len_common(&init_user_ns, name, base, len, &this);
 	if (err)
 		return ERR_PTR(err);
 
diff --git a/include/linux/namei.h b/include/linux/namei.h
index be9a2b349ca7..fd9d22128df6 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -68,6 +68,8 @@ extern struct dentry *try_lookup_one_len(const char *, struct dentry *, int);
 extern struct dentry *lookup_one_len(const char *, struct dentry *, int);
 extern struct dentry *lookup_one_len_unlocked(const char *, struct dentry *, int);
 extern struct dentry *lookup_positive_unlocked(const char *, struct dentry *, int);
+extern struct dentry *lookup_mapped_one_len(struct user_namespace *,
+					    const char *, struct dentry *, int);
 
 extern int follow_down_one(struct path *);
 extern int follow_down(struct path *);
-- 
2.30.2



* [PATCH v2 02/21] btrfs/inode: handle idmaps in btrfs_new_inode()
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 01/21] namei: add mapping aware lookup helper Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 03/21] btrfs/inode: allow idmapped rename iop Christian Brauner
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

Extend btrfs_new_inode() to take the idmapped mount into account when
initializing a new inode. This is just a matter of passing down the mount's
userns. The rest is taken care of in inode_init_owner(). This is a preliminary
patch to make the individual btrfs inode operations idmapped mount aware.
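
As a rough illustration of what passing the mount's userns buys here, the
following userspace model shows how the owner of a new inode could be derived
from the caller's ids. It is a sketch under assumptions, not the kernel
implementation: the toy_* names and the single-range mapping are hypothetical,
and details such as setgid directory inheritance are ignored. The direction
matters: the caller's id is translated from the mount's view back into the id
the filesystem stores.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_INVALID_ID UINT32_MAX

/* Hypothetical single-range mapping; not a kernel structure. */
struct toy_idmap {
	uint32_t fs_first;	/* first id as stored by the filesystem */
	uint32_t mnt_first;	/* the same id as seen through the mount */
	uint32_t count;		/* number of ids in the range */
};

/* Hypothetical inode holding only the ownership fields. */
struct toy_inode {
	uint32_t uid;
	uint32_t gid;
};

/* Translate an id seen through the mount into the id the filesystem stores. */
uint32_t toy_from_mnt(const struct toy_idmap *map, uint32_t mnt_id)
{
	if (mnt_id < map->mnt_first || mnt_id - map->mnt_first >= map->count)
		return TOY_INVALID_ID;
	return map->fs_first + (mnt_id - map->mnt_first);
}

/*
 * Initialize ownership of a new inode from the caller's fsuid/fsgid.
 * Ids without a mapping end up as TOY_INVALID_ID; in this model it is the
 * caller's job to check for that before creating anything.
 */
void toy_init_owner(const struct toy_idmap *map, struct toy_inode *inode,
		    uint32_t caller_fsuid, uint32_t caller_fsgid)
{
	assert(map && inode);

	inode->uid = toy_from_mnt(map, caller_fsuid);
	inode->gid = toy_from_mnt(map, caller_fsgid);
}
```

With a mapping of filesystem ids 0..999 to mount ids 10000..10999, a caller
with fsuid 10000 and fsgid 10005 creates an inode stored as uid 0 and gid 5.
With an identity mapping the stored ids equal the caller's ids, which matches
the behavior before this patch.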

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/inode.c | 34 +++++++++++++++++++---------------
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 8f60314c36c5..19e83afc7d45 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6370,6 +6370,7 @@ static void btrfs_inherit_iflags(struct inode *inode, struct inode *dir)
 
 static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 				     struct btrfs_root *root,
+				     struct user_namespace *mnt_userns,
 				     struct inode *dir,
 				     const char *name, int name_len,
 				     u64 ref_objectid, u64 objectid,
@@ -6479,7 +6480,7 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 	if (ret != 0)
 		goto fail_unlock;
 
-	inode_init_owner(&init_user_ns, inode, dir, mode);
+	inode_init_owner(mnt_userns, inode, dir, mode);
 	inode_set_bytes(inode, 0);
 
 	inode->i_mtime = current_time(inode);
@@ -6664,9 +6665,9 @@ static int btrfs_mknod(struct user_namespace *mnt_userns, struct inode *dir,
 	if (err)
 		goto out_unlock;
 
-	inode = btrfs_new_inode(trans, root, dir, dentry->d_name.name,
-			dentry->d_name.len, btrfs_ino(BTRFS_I(dir)), objectid,
-			mode, &index);
+	inode = btrfs_new_inode(trans, root, &init_user_ns, dir,
+			dentry->d_name.name, dentry->d_name.len,
+			btrfs_ino(BTRFS_I(dir)), objectid, mode, &index);
 	if (IS_ERR(inode)) {
 		err = PTR_ERR(inode);
 		inode = NULL;
@@ -6728,9 +6729,9 @@ static int btrfs_create(struct user_namespace *mnt_userns, struct inode *dir,
 	if (err)
 		goto out_unlock;
 
-	inode = btrfs_new_inode(trans, root, dir, dentry->d_name.name,
-			dentry->d_name.len, btrfs_ino(BTRFS_I(dir)), objectid,
-			mode, &index);
+	inode = btrfs_new_inode(trans, root, &init_user_ns, dir,
+			dentry->d_name.name, dentry->d_name.len,
+			btrfs_ino(BTRFS_I(dir)), objectid, mode, &index);
 	if (IS_ERR(inode)) {
 		err = PTR_ERR(inode);
 		inode = NULL;
@@ -6873,8 +6874,9 @@ static int btrfs_mkdir(struct user_namespace *mnt_userns, struct inode *dir,
 	if (err)
 		goto out_fail;
 
-	inode = btrfs_new_inode(trans, root, dir, dentry->d_name.name,
-			dentry->d_name.len, btrfs_ino(BTRFS_I(dir)), objectid,
+	inode = btrfs_new_inode(trans, root, &init_user_ns, dir,
+			dentry->d_name.name, dentry->d_name.len,
+			btrfs_ino(BTRFS_I(dir)), objectid,
 			S_IFDIR | mode, &index);
 	if (IS_ERR(inode)) {
 		err = PTR_ERR(inode);
@@ -8949,7 +8951,8 @@ int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
 	if (err < 0)
 		return err;
 
-	inode = btrfs_new_inode(trans, new_root, NULL, "..", 2, ino, ino,
+	inode = btrfs_new_inode(trans, new_root, &init_user_ns, NULL, "..", 2,
+				ino, ino,
 				S_IFDIR | (~current_umask() & S_IRWXUGO),
 				&index);
 	if (IS_ERR(inode))
@@ -9442,7 +9445,7 @@ static int btrfs_whiteout_for_rename(struct btrfs_trans_handle *trans,
 	if (ret)
 		return ret;
 
-	inode = btrfs_new_inode(trans, root, dir,
+	inode = btrfs_new_inode(trans, root, &init_user_ns, dir,
 				dentry->d_name.name,
 				dentry->d_name.len,
 				btrfs_ino(BTRFS_I(dir)),
@@ -9941,9 +9944,10 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	if (err)
 		goto out_unlock;
 
-	inode = btrfs_new_inode(trans, root, dir, dentry->d_name.name,
-				dentry->d_name.len, btrfs_ino(BTRFS_I(dir)),
-				objectid, S_IFLNK|S_IRWXUGO, &index);
+	inode = btrfs_new_inode(trans, root, &init_user_ns, dir,
+				dentry->d_name.name, dentry->d_name.len,
+				btrfs_ino(BTRFS_I(dir)), objectid,
+				S_IFLNK | S_IRWXUGO, &index);
 	if (IS_ERR(inode)) {
 		err = PTR_ERR(inode);
 		inode = NULL;
@@ -10292,7 +10296,7 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 	if (ret)
 		goto out;
 
-	inode = btrfs_new_inode(trans, root, dir, NULL, 0,
+	inode = btrfs_new_inode(trans, root, &init_user_ns, dir, NULL, 0,
 			btrfs_ino(BTRFS_I(dir)), objectid, mode, &index);
 	if (IS_ERR(inode)) {
 		ret = PTR_ERR(inode);
-- 
2.30.2



* [PATCH v2 03/21] btrfs/inode: allow idmapped rename iop
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 01/21] namei: add mapping aware lookup helper Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 02/21] btrfs/inode: handle idmaps in btrfs_new_inode() Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 04/21] btrfs/inode: allow idmapped getattr iop Christian Brauner
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

Enable btrfs_rename() to handle idmapped mounts. This is just a matter of
passing down the mount's userns.

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/inode.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 19e83afc7d45..6e4e9f4fbdf3 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9433,6 +9433,7 @@ static int btrfs_rename_exchange(struct inode *old_dir,
 
 static int btrfs_whiteout_for_rename(struct btrfs_trans_handle *trans,
 				     struct btrfs_root *root,
+				     struct user_namespace *mnt_userns,
 				     struct inode *dir,
 				     struct dentry *dentry)
 {
@@ -9445,7 +9446,7 @@ static int btrfs_whiteout_for_rename(struct btrfs_trans_handle *trans,
 	if (ret)
 		return ret;
 
-	inode = btrfs_new_inode(trans, root, &init_user_ns, dir,
+	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
 				dentry->d_name.name,
 				dentry->d_name.len,
 				btrfs_ino(BTRFS_I(dir)),
@@ -9482,9 +9483,10 @@ static int btrfs_whiteout_for_rename(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
-static int btrfs_rename(struct inode *old_dir, struct dentry *old_dentry,
-			   struct inode *new_dir, struct dentry *new_dentry,
-			   unsigned int flags)
+static int btrfs_rename(struct user_namespace *mnt_userns,
+			struct inode *old_dir, struct dentry *old_dentry,
+			struct inode *new_dir, struct dentry *new_dentry,
+			unsigned int flags)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(old_dir->i_sb);
 	struct btrfs_trans_handle *trans;
@@ -9657,8 +9659,8 @@ static int btrfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	}
 
 	if (flags & RENAME_WHITEOUT) {
-		ret = btrfs_whiteout_for_rename(trans, root, old_dir,
-						old_dentry);
+		ret = btrfs_whiteout_for_rename(trans, root, mnt_userns,
+						old_dir, old_dentry);
 
 		if (ret) {
 			btrfs_abort_transaction(trans, ret);
@@ -9708,7 +9710,8 @@ static int btrfs_rename2(struct user_namespace *mnt_userns, struct inode *old_di
 		return btrfs_rename_exchange(old_dir, old_dentry, new_dir,
 					  new_dentry);
 
-	return btrfs_rename(old_dir, old_dentry, new_dir, new_dentry, flags);
+	return btrfs_rename(mnt_userns, old_dir, old_dentry, new_dir,
+			    new_dentry, flags);
 }
 
 struct btrfs_delalloc_work {
-- 
2.30.2



* [PATCH v2 04/21] btrfs/inode: allow idmapped getattr iop
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (2 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 03/21] btrfs/inode: allow idmapped rename iop Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 05/21] btrfs/inode: allow idmapped mknod iop Christian Brauner
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

Enable btrfs_getattr() to handle idmapped mounts. This is just a matter of
passing down the mount's userns.
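
For getattr the translation runs in the opposite direction from inode creation:
the ids stored by the filesystem are reported as seen through the mount. A
minimal userspace sketch of that idea follows. The toy_* names and the
single-range mapping are hypothetical and only model the concept; they are not
the kernel's data structures.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_INVALID_ID UINT32_MAX

/* Hypothetical single-range mapping; not a kernel structure. */
struct toy_idmap {
	uint32_t fs_first;	/* first id as stored by the filesystem */
	uint32_t mnt_first;	/* the same id as seen through the mount */
	uint32_t count;		/* number of ids in the range */
};

/* Hypothetical ownership pair, used for both the inode and the stat result. */
struct toy_owner {
	uint32_t uid;
	uint32_t gid;
};

/* Translate a filesystem id into the id visible through the mount. */
uint32_t toy_into_mnt(const struct toy_idmap *map, uint32_t fs_id)
{
	if (fs_id < map->fs_first || fs_id - map->fs_first >= map->count)
		return TOY_INVALID_ID;
	return map->mnt_first + (fs_id - map->fs_first);
}

/* Fill the reported ownership from the stored ownership. */
void toy_fillattr(const struct toy_idmap *map, const struct toy_owner *inode,
		  struct toy_owner *stat)
{
	assert(map && inode && stat);

	stat->uid = toy_into_mnt(map, inode->uid);
	stat->gid = toy_into_mnt(map, inode->gid);
}
```

With filesystem ids 0..999 mapped to 10000..10999, an inode stored as uid 5 is
reported as uid 10005, while a stored gid of 2000 lies outside the range and is
reported as TOY_INVALID_ID in this model.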

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 6e4e9f4fbdf3..9b345af1c3e0 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9195,7 +9195,7 @@ static int btrfs_getattr(struct user_namespace *mnt_userns,
 				  STATX_ATTR_IMMUTABLE |
 				  STATX_ATTR_NODUMP);
 
-	generic_fillattr(&init_user_ns, inode, stat);
+	generic_fillattr(mnt_userns, inode, stat);
 	stat->dev = BTRFS_I(inode)->root->anon_dev;
 
 	spin_lock(&BTRFS_I(inode)->lock);
-- 
2.30.2



* [PATCH v2 05/21] btrfs/inode: allow idmapped mknod iop
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (3 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 04/21] btrfs/inode: allow idmapped getattr iop Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 06/21] btrfs/inode: allow idmapped create iop Christian Brauner
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

Enable btrfs_mknod() to handle idmapped mounts. This is just a matter of
passing down the mount's userns.

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 9b345af1c3e0..1a50a039dc43 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6665,7 +6665,7 @@ static int btrfs_mknod(struct user_namespace *mnt_userns, struct inode *dir,
 	if (err)
 		goto out_unlock;
 
-	inode = btrfs_new_inode(trans, root, &init_user_ns, dir,
+	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
 			dentry->d_name.name, dentry->d_name.len,
 			btrfs_ino(BTRFS_I(dir)), objectid, mode, &index);
 	if (IS_ERR(inode)) {
-- 
2.30.2



* [PATCH v2 06/21] btrfs/inode: allow idmapped create iop
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (4 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 05/21] btrfs/inode: allow idmapped mknod iop Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 07/21] btrfs/inode: allow idmapped mkdir iop Christian Brauner
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

Enable btrfs_create() to handle idmapped mounts. This is just a matter of
passing down the mount's userns.

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 1a50a039dc43..1fa290bb5272 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6729,7 +6729,7 @@ static int btrfs_create(struct user_namespace *mnt_userns, struct inode *dir,
 	if (err)
 		goto out_unlock;
 
-	inode = btrfs_new_inode(trans, root, &init_user_ns, dir,
+	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
 			dentry->d_name.name, dentry->d_name.len,
 			btrfs_ino(BTRFS_I(dir)), objectid, mode, &index);
 	if (IS_ERR(inode)) {
-- 
2.30.2



* [PATCH v2 07/21] btrfs/inode: allow idmapped mkdir iop
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (5 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 06/21] btrfs/inode: allow idmapped create iop Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 08/21] btrfs/inode: allow idmapped symlink iop Christian Brauner
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

Enable btrfs_mkdir() to handle idmapped mounts. This is just a matter of
passing down the mount's userns.

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 1fa290bb5272..9f0af257f89f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6874,7 +6874,7 @@ static int btrfs_mkdir(struct user_namespace *mnt_userns, struct inode *dir,
 	if (err)
 		goto out_fail;
 
-	inode = btrfs_new_inode(trans, root, &init_user_ns, dir,
+	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
 			dentry->d_name.name, dentry->d_name.len,
 			btrfs_ino(BTRFS_I(dir)), objectid,
 			S_IFDIR | mode, &index);
-- 
2.30.2



* [PATCH v2 08/21] btrfs/inode: allow idmapped symlink iop
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (6 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 07/21] btrfs/inode: allow idmapped mkdir iop Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 09/21] btrfs/inode: allow idmapped tmpfile iop Christian Brauner
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

Enable btrfs_symlink() to handle idmapped mounts. This is just a matter of
passing down the mount's userns.

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 9f0af257f89f..d7ccbd9a2723 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9947,7 +9947,7 @@ static int btrfs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	if (err)
 		goto out_unlock;
 
-	inode = btrfs_new_inode(trans, root, &init_user_ns, dir,
+	inode = btrfs_new_inode(trans, root, mnt_userns, dir,
 				dentry->d_name.name, dentry->d_name.len,
 				btrfs_ino(BTRFS_I(dir)), objectid,
 				S_IFLNK | S_IRWXUGO, &index);
-- 
2.30.2



* [PATCH v2 09/21] btrfs/inode: allow idmapped tmpfile iop
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (7 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 08/21] btrfs/inode: allow idmapped symlink iop Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 10/21] btrfs/inode: allow idmapped setattr iop Christian Brauner
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

Enable btrfs_tmpfile() to handle idmapped mounts. This is just a matter of
passing down the mount's userns.

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d7ccbd9a2723..dc368394722a 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -10299,7 +10299,7 @@ static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 	if (ret)
 		goto out;
 
-	inode = btrfs_new_inode(trans, root, &init_user_ns, dir, NULL, 0,
+	inode = btrfs_new_inode(trans, root, mnt_userns, dir, NULL, 0,
 			btrfs_ino(BTRFS_I(dir)), objectid, mode, &index);
 	if (IS_ERR(inode)) {
 		ret = PTR_ERR(inode);
-- 
2.30.2



* [PATCH v2 10/21] btrfs/inode: allow idmapped setattr iop
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (8 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 09/21] btrfs/inode: allow idmapped tmpfile iop Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 11/21] btrfs/inode: allow idmapped permission iop Christian Brauner
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

Enable btrfs_setattr() to handle idmapped mounts. This is just a matter of
passing down the mount's userns.

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/inode.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index dc368394722a..53b038029440 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5342,7 +5342,7 @@ static int btrfs_setattr(struct user_namespace *mnt_userns, struct dentry *dentr
 	if (btrfs_root_readonly(root))
 		return -EROFS;
 
-	err = setattr_prepare(&init_user_ns, dentry, attr);
+	err = setattr_prepare(mnt_userns, dentry, attr);
 	if (err)
 		return err;
 
@@ -5353,12 +5353,12 @@ static int btrfs_setattr(struct user_namespace *mnt_userns, struct dentry *dentr
 	}
 
 	if (attr->ia_valid) {
-		setattr_copy(&init_user_ns, inode, attr);
+		setattr_copy(mnt_userns, inode, attr);
 		inode_inc_iversion(inode);
 		err = btrfs_dirty_inode(inode);
 
 		if (!err && attr->ia_valid & ATTR_MODE)
-			err = posix_acl_chmod(&init_user_ns, inode,
+			err = posix_acl_chmod(mnt_userns, inode,
 					      inode->i_mode);
 	}
 
-- 
2.30.2



* [PATCH v2 11/21] btrfs/inode: allow idmapped permission iop
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (9 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 10/21] btrfs/inode: allow idmapped setattr iop Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 12/21] btrfs/ioctl: check whether fs{g,u}id are mapped during subvolume creation Christian Brauner
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

Enable btrfs_permission() to handle idmapped mounts. This is just a matter of
passing down the mount's userns.

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 53b038029440..a2a36a45998e 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -10274,7 +10274,7 @@ static int btrfs_permission(struct user_namespace *mnt_userns,
 		if (BTRFS_I(inode)->flags & BTRFS_INODE_READONLY)
 			return -EACCES;
 	}
-	return generic_permission(&init_user_ns, inode, mask);
+	return generic_permission(mnt_userns, inode, mask);
 }
 
 static int btrfs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 12/21] btrfs/ioctl: check whether fs{g,u}id are mapped during subvolume creation
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (10 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 11/21] btrfs/inode: allow idmapped permission iop Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 13/21] btrfs/inode: allow idmapped BTRFS_IOC_{SNAP,SUBVOL}_CREATE{_V2} ioctl Christian Brauner
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

When a new subvolume is created btrfs currently doesn't check whether the
fs{g,u}id of the caller actually have a mapping in the user namespace attached
to the filesystem. The vfs always checks this to make sure that the caller's
fs{g,u}id can be represented on-disk. This is most relevant for filesystems
that can be mounted inside user namespaces but it is in general a good
hardening measure to prevent unrepresentable {g,u}ids from being written to
disk.
Since follow-up patches add idmapped mount support to the btrfs ioctls that
create subvolumes, this check becomes important: the fs{g,u}id of the caller,
as mapped according to the idmapped mount, must be representable on-disk.
Simply add the missing fsuidgid_has_mapping() line from the vfs may_create()
version to btrfs_may_create().
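
To illustrate the idea behind the check, here is a simplified userspace model.
It is only a sketch and not the kernel implementation: struct id_extent,
struct id_map, map_id_down() and may_create_ids() are hypothetical names, and
a single map is used for both uids and gids, whereas the kernel keeps separate
maps and consults both the mount's and the filesystem's user namespace.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_ID ((uint32_t)-1)

/* Hypothetical extent: ids [first, first + count) map to
 * [lower_first, lower_first + count). */
struct id_extent {
	uint32_t first;
	uint32_t lower_first;
	uint32_t count;
};

/* Hypothetical idmapping: a flat array of extents. */
struct id_map {
	const struct id_extent *extents;
	size_t nr_extents;
};

/* Translate id through the map; INVALID_ID if no extent covers it. */
static uint32_t map_id_down(const struct id_map *map, uint32_t id)
{
	assert(map != NULL);

	for (size_t i = 0; i < map->nr_extents; i++) {
		const struct id_extent *e = &map->extents[i];

		if (id >= e->first && id - e->first < e->count)
			return e->lower_first + (id - e->first);
	}
	return INVALID_ID;
}

/* Model of the added check: refuse creation with -EOVERFLOW when either
 * id has no mapping and so could not be represented on-disk. */
static int may_create_ids(const struct id_map *map, uint32_t fsuid,
			  uint32_t fsgid)
{
	if (map_id_down(map, fsuid) == INVALID_ID ||
	    map_id_down(map, fsgid) == INVALID_ID)
		return -EOVERFLOW;
	return 0;
}
```

With a map that covers only ids 1000-1009, a caller with fsuid 1000 and fsgid
1005 passes, while a caller with fsuid 0 is rejected with -EOVERFLOW.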

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/ioctl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 0ba98e08a029..7a6a886df7c4 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -870,6 +870,8 @@ static inline int btrfs_may_create(struct inode *dir, struct dentry *child)
 		return -EEXIST;
 	if (IS_DEADDIR(dir))
 		return -ENOENT;
+	if (!fsuidgid_has_mapping(dir->i_sb, &init_user_ns))
+		return -EOVERFLOW;
 	return inode_permission(&init_user_ns, dir, MAY_WRITE | MAY_EXEC);
 }
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 13/21] btrfs/inode: allow idmapped BTRFS_IOC_{SNAP,SUBVOL}_CREATE{_V2} ioctl
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (11 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 12/21] btrfs/ioctl: check whether fs{g,u}id are mapped during subvolume creation Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 14/21] btrfs/ioctl: allow idmapped BTRFS_IOC_SNAP_DESTROY{_V2} ioctl Christian Brauner
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

Creating subvolumes and snapshots is one of the core features of btrfs and is
even available to unprivileged users. Make it possible to use subvolume and
snapshot creation on idmapped mounts. This is a fairly straightforward
operation since all the permission checking helpers are already capable of
handling idmapped mounts. So we just need to pass down the mount's userns.

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/ctree.h |  3 ++-
 fs/btrfs/inode.c |  5 +++--
 fs/btrfs/ioctl.c | 48 ++++++++++++++++++++++++++++--------------------
 3 files changed, 33 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index e5e53e592d4f..ee1876571b3f 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3145,7 +3145,8 @@ int btrfs_set_extent_delalloc(struct btrfs_inode *inode, u64 start, u64 end,
 			      struct extent_state **cached_state);
 int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
 			     struct btrfs_root *new_root,
-			     struct btrfs_root *parent_root);
+			     struct btrfs_root *parent_root,
+			     struct user_namespace *mnt_userns);
  void btrfs_set_delalloc_extent(struct inode *inode, struct extent_state *state,
 			       unsigned *bits);
 void btrfs_clear_delalloc_extent(struct inode *inode,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index a2a36a45998e..84f732b062dd 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8940,7 +8940,8 @@ static int btrfs_truncate(struct inode *inode, bool skip_writeback)
  */
 int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
 			     struct btrfs_root *new_root,
-			     struct btrfs_root *parent_root)
+			     struct btrfs_root *parent_root,
+			     struct user_namespace *mnt_userns)
 {
 	struct inode *inode;
 	int err;
@@ -8951,7 +8952,7 @@ int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
 	if (err < 0)
 		return err;
 
-	inode = btrfs_new_inode(trans, new_root, &init_user_ns, NULL, "..", 2,
+	inode = btrfs_new_inode(trans, new_root, mnt_userns, NULL, "..", 2,
 				ino, ino,
 				S_IFDIR | (~current_umask() & S_IRWXUGO),
 				&index);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 7a6a886df7c4..be52891ba571 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -492,8 +492,8 @@ int __pure btrfs_is_empty_uuid(u8 *uuid)
 	return 1;
 }
 
-static noinline int create_subvol(struct inode *dir,
-				  struct dentry *dentry,
+static noinline int create_subvol(struct user_namespace *mnt_userns,
+				  struct inode *dir, struct dentry *dentry,
 				  const char *name, int namelen,
 				  struct btrfs_qgroup_inherit *inherit)
 {
@@ -638,7 +638,7 @@ static noinline int create_subvol(struct inode *dir,
 		goto fail;
 	}
 
-	ret = btrfs_create_subvol_root(trans, new_root, root);
+	ret = btrfs_create_subvol_root(trans, new_root, root, mnt_userns);
 	btrfs_put_root(new_root);
 	if (ret) {
 		/* We potentially lose an unused inode item here */
@@ -864,15 +864,16 @@ static int btrfs_may_delete(struct inode *dir, struct dentry *victim, int isdir)
 }
 
 /* copy of may_create in fs/namei.c() */
-static inline int btrfs_may_create(struct inode *dir, struct dentry *child)
+static inline int btrfs_may_create(struct user_namespace *mnt_userns,
+				   struct inode *dir, struct dentry *child)
 {
 	if (d_really_is_positive(child))
 		return -EEXIST;
 	if (IS_DEADDIR(dir))
 		return -ENOENT;
-	if (!fsuidgid_has_mapping(dir->i_sb, &init_user_ns))
+	if (!fsuidgid_has_mapping(dir->i_sb, mnt_userns))
 		return -EOVERFLOW;
-	return inode_permission(&init_user_ns, dir, MAY_WRITE | MAY_EXEC);
+	return inode_permission(mnt_userns, dir, MAY_WRITE | MAY_EXEC);
 }
 
 /*
@@ -881,6 +882,7 @@ static inline int btrfs_may_create(struct inode *dir, struct dentry *child)
  * inside this filesystem so it's quite a bit simpler.
  */
 static noinline int btrfs_mksubvol(const struct path *parent,
+				   struct user_namespace *mnt_userns,
 				   const char *name, int namelen,
 				   struct btrfs_root *snap_src,
 				   bool readonly,
@@ -895,12 +897,13 @@ static noinline int btrfs_mksubvol(const struct path *parent,
 	if (error == -EINTR)
 		return error;
 
-	dentry = lookup_one_len(name, parent->dentry, namelen);
+	dentry = lookup_mapped_one_len(mnt_userns, name,
+				       parent->dentry, namelen);
 	error = PTR_ERR(dentry);
 	if (IS_ERR(dentry))
 		goto out_unlock;
 
-	error = btrfs_may_create(dir, dentry);
+	error = btrfs_may_create(mnt_userns, dir, dentry);
 	if (error)
 		goto out_dput;
 
@@ -922,7 +925,7 @@ static noinline int btrfs_mksubvol(const struct path *parent,
 	if (snap_src)
 		error = create_snapshot(snap_src, dir, dentry, readonly, inherit);
 	else
-		error = create_subvol(dir, dentry, name, namelen, inherit);
+		error = create_subvol(mnt_userns, dir, dentry, name, namelen, inherit);
 
 	if (!error)
 		fsnotify_mkdir(dir, dentry);
@@ -936,6 +939,7 @@ static noinline int btrfs_mksubvol(const struct path *parent,
 }
 
 static noinline int btrfs_mksnapshot(const struct path *parent,
+				   struct user_namespace *mnt_userns,
 				   const char *name, int namelen,
 				   struct btrfs_root *root,
 				   bool readonly,
@@ -965,7 +969,7 @@ static noinline int btrfs_mksnapshot(const struct path *parent,
 
 	btrfs_wait_ordered_extents(root, U64_MAX, 0, (u64)-1);
 
-	ret = btrfs_mksubvol(parent, name, namelen,
+	ret = btrfs_mksubvol(parent, mnt_userns, name, namelen,
 			     root, readonly, inherit);
 out:
 	if (snapshot_force_cow)
@@ -1794,6 +1798,7 @@ static noinline int btrfs_ioctl_resize(struct file *file,
 }
 
 static noinline int __btrfs_ioctl_snap_create(struct file *file,
+				struct user_namespace *mnt_userns,
 				const char *name, unsigned long fd, int subvol,
 				bool readonly,
 				struct btrfs_qgroup_inherit *inherit)
@@ -1821,8 +1826,8 @@ static noinline int __btrfs_ioctl_snap_create(struct file *file,
 	}
 
 	if (subvol) {
-		ret = btrfs_mksubvol(&file->f_path, name, namelen,
-				     NULL, readonly, inherit);
+		ret = btrfs_mksubvol(&file->f_path, mnt_userns, name,
+				     namelen, NULL, readonly, inherit);
 	} else {
 		struct fd src = fdget(fd);
 		struct inode *src_inode;
@@ -1836,16 +1841,17 @@ static noinline int __btrfs_ioctl_snap_create(struct file *file,
 			btrfs_info(BTRFS_I(file_inode(file))->root->fs_info,
 				   "Snapshot src from another FS");
 			ret = -EXDEV;
-		} else if (!inode_owner_or_capable(&init_user_ns, src_inode)) {
+		} else if (!inode_owner_or_capable(mnt_userns, src_inode)) {
 			/*
 			 * Subvolume creation is not restricted, but snapshots
 			 * are limited to own subvolumes only
 			 */
 			ret = -EPERM;
 		} else {
-			ret = btrfs_mksnapshot(&file->f_path, name, namelen,
-					     BTRFS_I(src_inode)->root,
-					     readonly, inherit);
+			ret = btrfs_mksnapshot(&file->f_path, mnt_userns,
+					       name, namelen,
+					       BTRFS_I(src_inode)->root,
+					       readonly, inherit);
 		}
 		fdput(src);
 	}
@@ -1869,8 +1875,9 @@ static noinline int btrfs_ioctl_snap_create(struct file *file,
 		return PTR_ERR(vol_args);
 	vol_args->name[BTRFS_PATH_NAME_MAX] = '\0';
 
-	ret = __btrfs_ioctl_snap_create(file, vol_args->name, vol_args->fd,
-					subvol, false, NULL);
+	ret = __btrfs_ioctl_snap_create(file, file_mnt_user_ns(file),
+					vol_args->name, vol_args->fd, subvol,
+					false, NULL);
 
 	kfree(vol_args);
 	return ret;
@@ -1928,8 +1935,9 @@ static noinline int btrfs_ioctl_snap_create_v2(struct file *file,
 		}
 	}
 
-	ret = __btrfs_ioctl_snap_create(file, vol_args->name, vol_args->fd,
-					subvol, readonly, inherit);
+	ret = __btrfs_ioctl_snap_create(file, file_mnt_user_ns(file),
+					vol_args->name, vol_args->fd, subvol,
+					readonly, inherit);
 	if (ret)
 		goto free_inherit;
 free_inherit:
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 14/21] btrfs/ioctl: allow idmapped BTRFS_IOC_SNAP_DESTROY{_V2} ioctl
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (12 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 13/21] btrfs/inode: allow idmapped BTRFS_IOC_{SNAP,SUBVOL}_CREATE{_V2} ioctl Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-21 14:15   ` David Sterba
  2021-07-19 11:10 ` [PATCH v2 15/21] btrfs/ioctl: relax restrictions for BTRFS_IOC_SNAP_DESTROY_V2 with subvolids Christian Brauner
                   ` (7 subsequent siblings)
  21 siblings, 1 reply; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

Destroying subvolumes and snapshots are important features of btrfs. Both
operations are available to unprivileged users if the filesystem has been
mounted with the "user_subvol_rm_allowed" mount option. Allow subvolume and
snapshot deletion on idmapped mounts. This is a fairly straightforward
operation since all the permission checking helpers are already capable of
handling idmapped mounts. So we just need to pass down the mount's userns.

In addition to regular subvolume or snapshot deletion by specifying the name of
the subvolume or snapshot the BTRFS_IOC_SNAP_DESTROY_V2 ioctl allows the
deletion of subvolumes and snapshots via subvolume and snapshot ids when the
BTRFS_SUBVOL_SPEC_BY_ID flag is raised.

This feature is blocked on idmapped mounts as this allows filesystem wide
subvolume deletions and thus can escape the scope of what's exposed under the
mount identified by the fd passed with the ioctl.

Here is an example where a btrfs subvolume is deleted through a subvolume mount
that does not expose the subvolume to be deleted. It can nevertheless be
deleted by using the subvolume id:

 /* Compile the following program as "delete_by_spec". */

 #define _GNU_SOURCE
 #include <fcntl.h>
 #include <inttypes.h>
 #include <linux/btrfs.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <sys/ioctl.h>
 #include <sys/stat.h>
 #include <sys/types.h>
 #include <unistd.h>

 static int rm_subvolume_by_id(int fd, uint64_t subvolid)
 {
 	struct btrfs_ioctl_vol_args_v2 args = {};
 	int ret;

 	args.flags = BTRFS_SUBVOL_SPEC_BY_ID;
 	args.subvolid = subvolid;

 	ret = ioctl(fd, BTRFS_IOC_SNAP_DESTROY_V2, &args);
 	if (ret < 0)
 		return -1;

 	return 0;
 }

 int main(int argc, char *argv[])
 {
 	int subvolid = 0;

 	if (argc < 3)
 		exit(1);

 	fprintf(stderr, "Opening %s\n", argv[1]);
 	int fd = open(argv[1], O_CLOEXEC | O_DIRECTORY);
 	if (fd < 0)
 		exit(2);

 	subvolid = atoi(argv[2]);

 	fprintf(stderr, "Deleting subvolume with subvolid %d\n", subvolid);
 	int ret = rm_subvolume_by_id(fd, subvolid);
 	if (ret < 0)
 		exit(3);

 	exit(0);
 }

 truncate -s 10G btrfs.img
 mkfs.btrfs btrfs.img
 export LOOPDEV=$(sudo losetup -f --show btrfs.img)
 sudo mount ${LOOPDEV} /mnt
 sudo chown $(id -u):$(id -g) /mnt
 btrfs subvolume create /mnt/A
 mkdir /mnt/B
 btrfs subvolume create /mnt/B/C
 # Get subvolume id via:
 sudo btrfs subvolume show /mnt/A
 # Save subvolid
 SUBVOLID=<nr>
 sudo umount /mnt
 sudo mount ${LOOPDEV} -o subvol=B/C,user_subvol_rm_allowed /mnt
 ./delete_by_spec /mnt ${SUBVOLID}

With idmapped mounts this can potentially be used by users to delete
subvolumes/snapshots they would otherwise not have access to as the idmapping
would be applied to an inode that is not exposed in the mount of the subvolume.

The fact that this is a filesystem wide operation suggests it might be a good
idea to expose this under a separate ioctl that clearly indicates this. In
essence, the file descriptor passed with the ioctl is merely used to identify
the filesystem on which to operate when BTRFS_SUBVOL_SPEC_BY_ID is used.

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/ioctl.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index be52891ba571..5416b0c0ee7a 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -830,7 +830,8 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
  *     nfs_async_unlink().
  */
 
-static int btrfs_may_delete(struct inode *dir, struct dentry *victim, int isdir)
+static int btrfs_may_delete(struct user_namespace *mnt_userns,
+			    struct inode *dir, struct dentry *victim, int isdir)
 {
 	int error;
 
@@ -840,12 +841,12 @@ static int btrfs_may_delete(struct inode *dir, struct dentry *victim, int isdir)
 	BUG_ON(d_inode(victim->d_parent) != dir);
 	audit_inode_child(dir, victim, AUDIT_TYPE_CHILD_DELETE);
 
-	error = inode_permission(&init_user_ns, dir, MAY_WRITE | MAY_EXEC);
+	error = inode_permission(mnt_userns, dir, MAY_WRITE | MAY_EXEC);
 	if (error)
 		return error;
 	if (IS_APPEND(dir))
 		return -EPERM;
-	if (check_sticky(&init_user_ns, dir, d_inode(victim)) ||
+	if (check_sticky(mnt_userns, dir, d_inode(victim)) ||
 	    IS_APPEND(d_inode(victim)) || IS_IMMUTABLE(d_inode(victim)) ||
 	    IS_SWAPFILE(d_inode(victim)))
 		return -EPERM;
@@ -2915,6 +2916,7 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
 	struct btrfs_root *dest = NULL;
 	struct btrfs_ioctl_vol_args *vol_args = NULL;
 	struct btrfs_ioctl_vol_args_v2 *vol_args2 = NULL;
+	struct user_namespace *mnt_userns = file_mnt_user_ns(file);
 	char *subvol_name, *subvol_name_ptr = NULL;
 	int subvol_namelen;
 	int err = 0;
@@ -2942,6 +2944,18 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
 			if (err)
 				goto out;
 		} else {
+			/*
+			 * Deleting by subvolume id can be used to delete
+			 * subvolumes/snapshots anywhere in the filesystem.
+			 * Ensure that users can't abuse idmapped mounts of
+			 * btrfs subvolumes/snapshots to perform operations in
+			 * the whole filesystem.
+			 */
+			if (mnt_userns != &init_user_ns) {
+				err = -EINVAL;
+				goto out;
+			}
+
 			if (vol_args2->subvolid < BTRFS_FIRST_FREE_OBJECTID) {
 				err = -EINVAL;
 				goto out;
@@ -3026,7 +3040,8 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
 	err = down_write_killable_nested(&dir->i_rwsem, I_MUTEX_PARENT);
 	if (err == -EINTR)
 		goto free_subvol_name;
-	dentry = lookup_one_len(subvol_name, parent, subvol_namelen);
+	dentry = lookup_mapped_one_len(mnt_userns, subvol_name,
+				       parent, subvol_namelen);
 	if (IS_ERR(dentry)) {
 		err = PTR_ERR(dentry);
 		goto out_unlock_dir;
@@ -3068,14 +3083,14 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
 		if (root == dest)
 			goto out_dput;
 
-		err = inode_permission(&init_user_ns, inode,
+		err = inode_permission(mnt_userns, inode,
 				       MAY_WRITE | MAY_EXEC);
 		if (err)
 			goto out_dput;
 	}
 
 	/* check if subvolume may be deleted by a user */
-	err = btrfs_may_delete(dir, dentry, 1);
+	err = btrfs_may_delete(mnt_userns, dir, dentry, 1);
 	if (err)
 		goto out_dput;
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 15/21] btrfs/ioctl: relax restrictions for BTRFS_IOC_SNAP_DESTROY_V2 with subvolids
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (13 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 14/21] btrfs/ioctl: allow idmapped BTRFS_IOC_SNAP_DESTROY{_V2} ioctl Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 16/21] btrfs/ioctl: allow idmapped BTRFS_IOC_SET_RECEIVED_SUBVOL{_32} ioctl Christian Brauner
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

So far we prevented the deletion of subvolumes and snapshots by subvolume id,
which is possible with the BTRFS_SUBVOL_SPEC_BY_ID flag, on idmapped mounts.
This restriction is necessary on idmapped mounts because deletion by subvolume
id allows filesystem wide subvolume and snapshot deletions and thus can escape
the scope of what's exposed under the mount identified by the fd passed with
the ioctl.

Deletion by subvolume id works by looking for an alias of the parent of the
subvolume or snapshot to be deleted. The parent alias can be anywhere in the
filesystem. However, as long as the alias of the parent that is found is the
same as the one identified by the file descriptor passed through the ioctl we
can allow the deletion.
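
The relaxed rule can be sketched as a small userspace model. This is only an
illustration: struct dir_model and check_delete_by_id() are hypothetical
stand-ins, and pointer identity stands in for comparing the inode behind the
fd with the parent found via the subvolume id.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a directory inode; only identity matters. */
struct dir_model {
	int unused;
};

/*
 * Model of the check: on an idmapped mount, deletion by subvolume id is
 * only allowed when the parent found via the id is the same directory
 * the caller passed in through the file descriptor.
 */
static int check_delete_by_id(const struct dir_model *fd_dir,
			      const struct dir_model *found_parent,
			      bool mount_is_idmapped)
{
	assert(fd_dir != NULL && found_parent != NULL);

	if (mount_is_idmapped && fd_dir != found_parent)
		return -EINVAL;
	return 0;
}
```

On a non-idmapped mount the model always returns 0, which matches the
behavior before this series.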

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/ioctl.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 5416b0c0ee7a..72045ae30d1c 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2944,17 +2944,7 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
 			if (err)
 				goto out;
 		} else {
-			/*
-			 * Deleting by subvolume id can be used to delete
-			 * subvolumes/snapshots anywhere in the filesystem.
-			 * Ensure that users can't abuse idmapped mounts of
-			 * btrfs subvolumes/snapshots to perform operations in
-			 * the whole filesystem.
-			 */
-			if (mnt_userns != &init_user_ns) {
-				err = -EINVAL;
-				goto out;
-			}
+			struct inode *old_dir;
 
 			if (vol_args2->subvolid < BTRFS_FIRST_FREE_OBJECTID) {
 				err = -EINVAL;
@@ -2992,6 +2982,7 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
 				err = PTR_ERR(parent);
 				goto out_drop_write;
 			}
+			old_dir = dir;
 			dir = d_inode(parent);
 
 			/*
@@ -3002,6 +2993,20 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
 			 */
 			destroy_parent = true;
 
+			/*
+			 * On idmapped mounts, deletion via subvolid is
+			 * restricted to subvolumes that are immediate
+			 * ancestors of the inode referenced by the file
+			 * descriptor in the ioctl. Otherwise the idmapping
+			 * could potentially be abused to delete subvolumes
+			 * anywhere in the filesystem the user wouldn't be able
+			 * to delete without an idmapped mount.
+			 */
+			if (old_dir != dir && mnt_userns != &init_user_ns) {
+				err = -EINVAL;
+				goto free_parent;
+			}
+
 			subvol_name_ptr = btrfs_get_subvol_name_from_objectid(
 						fs_info, vol_args2->subvolid);
 			if (IS_ERR(subvol_name_ptr)) {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 16/21] btrfs/ioctl: allow idmapped BTRFS_IOC_SET_RECEIVED_SUBVOL{_32} ioctl
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (14 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 15/21] btrfs/ioctl: relax restrictions for BTRFS_IOC_SNAP_DESTROY_V2 with subvolids Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 17/21] btrfs/ioctl: allow idmapped BTRFS_IOC_SUBVOL_SETFLAGS ioctl Christian Brauner
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

The BTRFS_IOC_SET_RECEIVED_SUBVOL{_32} ioctls are used to set information about
a received subvolume. Make it possible to set information about a received
subvolume on idmapped mounts. This is a fairly straightforward operation since
all the permission checking helpers are already capable of handling idmapped
mounts. So we just need to pass down the mount's userns.

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/ioctl.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 72045ae30d1c..d631a1cb621d 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4466,6 +4466,7 @@ static long btrfs_ioctl_quota_rescan_wait(struct btrfs_fs_info *fs_info,
 }
 
 static long _btrfs_ioctl_set_received_subvol(struct file *file,
+					    struct user_namespace *mnt_userns,
 					    struct btrfs_ioctl_received_subvol_args *sa)
 {
 	struct inode *inode = file_inode(file);
@@ -4477,7 +4478,7 @@ static long _btrfs_ioctl_set_received_subvol(struct file *file,
 	int ret = 0;
 	int received_uuid_changed;
 
-	if (!inode_owner_or_capable(&init_user_ns, inode))
+	if (!inode_owner_or_capable(mnt_userns, inode))
 		return -EPERM;
 
 	ret = mnt_want_write_file(file);
@@ -4582,7 +4583,7 @@ static long btrfs_ioctl_set_received_subvol_32(struct file *file,
 	args64->rtime.nsec = args32->rtime.nsec;
 	args64->flags = args32->flags;
 
-	ret = _btrfs_ioctl_set_received_subvol(file, args64);
+	ret = _btrfs_ioctl_set_received_subvol(file, file_mnt_user_ns(file), args64);
 	if (ret)
 		goto out;
 
@@ -4616,7 +4617,7 @@ static long btrfs_ioctl_set_received_subvol(struct file *file,
 	if (IS_ERR(sa))
 		return PTR_ERR(sa);
 
-	ret = _btrfs_ioctl_set_received_subvol(file, sa);
+	ret = _btrfs_ioctl_set_received_subvol(file, file_mnt_user_ns(file), sa);
 
 	if (ret)
 		goto out;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 17/21] btrfs/ioctl: allow idmapped BTRFS_IOC_SUBVOL_SETFLAGS ioctl
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (15 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 16/21] btrfs/ioctl: allow idmapped BTRFS_IOC_SET_RECEIVED_SUBVOL{_32} ioctl Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 18/21] btrfs/ioctl: allow idmapped BTRFS_IOC_INO_LOOKUP_USER ioctl Christian Brauner
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

Setting flags on subvolumes or snapshots is a core feature of btrfs. The
BTRFS_IOC_SUBVOL_SETFLAGS ioctl is especially important as it allows making
subvolumes and snapshots read-only or read-write. Allow setting flags on btrfs
subvolumes and snapshots on idmapped mounts. This is a fairly straightforward
operation since all the permission checking helpers are already capable of
handling idmapped mounts. So we just need to pass down the mount's userns.

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/ioctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index d631a1cb621d..73a477ead145 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1982,7 +1982,7 @@ static noinline int btrfs_ioctl_subvol_setflags(struct file *file,
 	u64 flags;
 	int ret = 0;
 
-	if (!inode_owner_or_capable(&init_user_ns, inode))
+	if (!inode_owner_or_capable(file_mnt_user_ns(file), inode))
 		return -EPERM;
 
 	ret = mnt_want_write_file(file);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 18/21] btrfs/ioctl: allow idmapped BTRFS_IOC_INO_LOOKUP_USER ioctl
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (16 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 17/21] btrfs/ioctl: allow idmapped BTRFS_IOC_SUBVOL_SETFLAGS ioctl Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 19/21] btrfs/acl: handle idmapped mounts Christian Brauner
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

The BTRFS_IOC_INO_LOOKUP_USER ioctl is an unprivileged version of the
BTRFS_IOC_INO_LOOKUP ioctl. The main difference between the two is that
BTRFS_IOC_INO_LOOKUP is a filesystem wide operation whereas
BTRFS_IOC_INO_LOOKUP_USER is scoped beneath the file descriptor passed with the
ioctl. Specifically, BTRFS_IOC_INO_LOOKUP_USER must adhere to the following
restrictions:
- The caller must be privileged over each inode of each path component for the
  path they are trying to lookup.
- The path for the subvolume the caller is trying to lookup must be reachable
  from the inode associated with the file descriptor passed with the ioctl.
The second condition makes it possible to scope the lookup of the path to the
mount identified by the file descriptor passed with the ioctl. This allows us
to enable this ioctl on idmapped mounts.
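
For reference, a userspace caller might issue the ioctl roughly as follows.
This is a minimal sketch: lookup_subvol_path() is a hypothetical helper, and
joining the returned path and name into one string is an assumption about how
a caller would want to consume the result.

```c
#include <assert.h>
#include <errno.h>
#include <linux/btrfs.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: ask for the path of subvolume 'treeid', whose
 * containing directory has inode number 'dirid', relative to the
 * directory referenced by fd. Returns 0 or a negative errno value.
 */
static int lookup_subvol_path(int fd, uint64_t treeid, uint64_t dirid,
			      char *buf, size_t len)
{
	struct btrfs_ioctl_ino_lookup_user_args args;

	assert(buf != NULL);

	memset(&args, 0, sizeof(args));
	args.treeid = treeid;
	args.dirid = dirid;

	if (ioctl(fd, BTRFS_IOC_INO_LOOKUP_USER, &args) < 0)
		return -errno;

	/* path leads from the fd's directory to dirid, name is the
	 * subvolume's own name. */
	snprintf(buf, len, "%s%s", args.path, args.name);
	return 0;
}
```

The fd only anchors the lookup. The kernel rejects the request if the
subvolume is not reachable from it or if the caller fails the permission
check on a path component.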

Specifically, this is possible because all child subvolumes of a parent
subvolume are reachable when the parent subvolume is mounted. So if the user
had access to open the parent subvolume or has been given the fd then they can
look up the path, provided they are privileged over each path component.

Note, the BTRFS_IOC_INO_LOOKUP_USER ioctl allows a user to learn the path and
name of a subvolume even though they would otherwise be restricted from doing
so via regular vfs-based lookup.
So think about a parent subvolume with multiple child subvolumes. Someone could
mount the parent subvolume and restrict access to the child subvolumes by
overmounting them with empty directories. At this point the user can't traverse
the child subvolumes and they can't open files in the child subvolumes.
However, they can still learn the path of child subvolumes as long as they have
access to the parent subvolume by using the BTRFS_IOC_INO_LOOKUP_USER ioctl.

The underlying assumption here is that it's ok that the lookup ioctls can't
really take mounts into account other than the original mount the fd belongs to
during lookup. Since this assumption is baked into the original
BTRFS_IOC_INO_LOOKUP_USER ioctl we can extend it to idmapped mounts.

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/ioctl.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 73a477ead145..c96037d15bf7 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2440,7 +2440,8 @@ static noinline int btrfs_search_path_in_tree(struct btrfs_fs_info *info,
 	return ret;
 }
 
-static int btrfs_search_path_in_tree_user(struct inode *inode,
+static int btrfs_search_path_in_tree_user(struct user_namespace *mnt_userns,
+				struct inode *inode,
 				struct btrfs_ioctl_ino_lookup_user_args *args)
 {
 	struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info;
@@ -2538,7 +2539,7 @@ static int btrfs_search_path_in_tree_user(struct inode *inode,
 				ret = PTR_ERR(temp_inode);
 				goto out_put;
 			}
-			ret = inode_permission(&init_user_ns, temp_inode,
+			ret = inode_permission(mnt_userns, temp_inode,
 					       MAY_READ | MAY_EXEC);
 			iput(temp_inode);
 			if (ret) {
@@ -2680,7 +2681,7 @@ static int btrfs_ioctl_ino_lookup_user(struct file *file, void __user *argp)
 		return -EACCES;
 	}
 
-	ret = btrfs_search_path_in_tree_user(inode, args);
+	ret = btrfs_search_path_in_tree_user(file_mnt_user_ns(file), inode, args);
 
 	if (ret == 0 && copy_to_user(argp, args, sizeof(*args)))
 		ret = -EFAULT;
-- 
2.30.2



* [PATCH v2 19/21] btrfs/acl: handle idmapped mounts
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (17 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 18/21] btrfs/ioctl: allow idmapped BTRFS_IOC_INO_LOOKUP_USER ioctl Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 20/21] btrfs/super: allow idmapped btrfs Christian Brauner
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

Make the btrfs acl code idmapped mount aware. The posix default and posix
access acls are the only acls other than some specific xattrs that take dac
permissions into account. On an idmapped mount they need to be translated
according to the mount's userns. The main change is done to __btrfs_set_acl()
which is responsible for translating posix acls to their final on-disk
representation. The btrfs_init_acl() helper does not need to take the idmapped
mount into account since it is called in the context of file creation
operations (mknod, create, mkdir, symlink, tmpfile) and is used for
btrfs_init_inode_security() to copy posix default and posix access permissions
from the parent directory. These acls need to be inherited unmodified from the
parent directory. This is identical to what we do for ext4 and xfs.
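
To make the distinction between the two paths more concrete, here is a toy
userspace model. It assumes a single contiguous mapping range, and all toy_*
names are hypothetical; it does not mirror struct posix_acl or any kernel
helper. An explicit set through an idmapped mount translates each entry id
from its mount-visible value to the on-disk value, while inheritance at file
creation copies the parent's on-disk entries unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_INVALID_ID UINT32_MAX

/* Hypothetical single-range idmapping of a mount. */
struct toy_idmap {
	uint32_t fs_first;	/* first id as stored on disk */
	uint32_t mnt_first;	/* first id as seen through the mount */
	uint32_t count;		/* number of ids covered by the range */
};

/* Hypothetical, heavily simplified ACL entry. */
struct toy_acl_entry {
	uint32_t id;
	uint16_t perm;
};

/* Map a mount-visible id back to the id that is stored on disk. */
static uint32_t toy_map_from_mount(const struct toy_idmap *map, uint32_t mnt_id)
{
	assert(map != NULL);
	if (mnt_id < map->mnt_first || mnt_id - map->mnt_first >= map->count)
		return TOY_INVALID_ID;
	return map->fs_first + (mnt_id - map->mnt_first);
}

/*
 * Explicit set: translate every entry before storing it. The input is
 * validated first so that a failure leaves the stored entries untouched.
 */
static int toy_set_acl(const struct toy_idmap *map,
		       const struct toy_acl_entry *in,
		       struct toy_acl_entry *disk, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (toy_map_from_mount(map, in[i].id) == TOY_INVALID_ID)
			return -1;

	for (i = 0; i < n; i++) {
		disk[i].id = toy_map_from_mount(map, in[i].id);
		disk[i].perm = in[i].perm;
	}
	return 0;
}

/* Inheritance: the parent's entries are already on-disk values. */
static void toy_inherit_acl(const struct toy_acl_entry *parent_disk,
			    struct toy_acl_entry *child_disk, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		child_disk[i] = parent_disk[i];
}
```

In this model the inherit path takes no mapping argument at all, which
loosely corresponds to btrfs_init_acl() passing &init_user_ns in the diff.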

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/acl.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
index d95eb5c8cb37..c9f9789e828f 100644
--- a/fs/btrfs/acl.c
+++ b/fs/btrfs/acl.c
@@ -53,7 +53,8 @@ struct posix_acl *btrfs_get_acl(struct inode *inode, int type)
 }
 
 static int __btrfs_set_acl(struct btrfs_trans_handle *trans,
-			 struct inode *inode, struct posix_acl *acl, int type)
+			   struct user_namespace *mnt_userns,
+			   struct inode *inode, struct posix_acl *acl, int type)
 {
 	int ret, size = 0;
 	const char *name;
@@ -114,12 +115,12 @@ int btrfs_set_acl(struct user_namespace *mnt_userns, struct inode *inode,
 	umode_t old_mode = inode->i_mode;
 
 	if (type == ACL_TYPE_ACCESS && acl) {
-		ret = posix_acl_update_mode(&init_user_ns, inode,
+		ret = posix_acl_update_mode(mnt_userns, inode,
 					    &inode->i_mode, &acl);
 		if (ret)
 			return ret;
 	}
-	ret = __btrfs_set_acl(NULL, inode, acl, type);
+	ret = __btrfs_set_acl(NULL, mnt_userns, inode, acl, type);
 	if (ret)
 		inode->i_mode = old_mode;
 	return ret;
@@ -140,14 +141,14 @@ int btrfs_init_acl(struct btrfs_trans_handle *trans,
 		return ret;
 
 	if (default_acl) {
-		ret = __btrfs_set_acl(trans, inode, default_acl,
+		ret = __btrfs_set_acl(trans, &init_user_ns, inode, default_acl,
 				      ACL_TYPE_DEFAULT);
 		posix_acl_release(default_acl);
 	}
 
 	if (acl) {
 		if (!ret)
-			ret = __btrfs_set_acl(trans, inode, acl,
+			ret = __btrfs_set_acl(trans, &init_user_ns, inode, acl,
 					      ACL_TYPE_ACCESS);
 		posix_acl_release(acl);
 	}
-- 
2.30.2



* [PATCH v2 20/21] btrfs/super: allow idmapped btrfs
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (18 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 19/21] btrfs/acl: handle idmapped mounts Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 11:10 ` [PATCH v2 21/21] btrfs/242: introduce btrfs specific idmapped mounts tests Christian Brauner
  2021-07-19 15:11 ` [PATCH v2 00/21] btrfs: support idmapped mounts Josef Bacik
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig

From: Christian Brauner <christian.brauner@ubuntu.com>

Now that we have converted btrfs internally to account for idmapped mounts,
allow the creation of idmapped mounts on btrfs by setting the FS_ALLOW_IDMAP
flag. We
only need to raise this flag on the btrfs_root_fs_type filesystem since
btrfs_mount_root() is ultimately responsible for allocating the superblock and
is called into from btrfs_mount() associated with btrfs_fs_type.

The conversion of the btrfs inode operations was straightforward. Of the
btrfs specific ioctls that perform checks based on inode permissions, only
those that are not filesystem wide operations, and hence can reasonably be
charged against a specific mount, have been allowed.
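
The effect of the flag can be pictured with the following userspace sketch.
It assumes that a request to idmap a mount is refused with -EINVAL when the
filesystem type does not advertise the flag; the TOY_* constants, struct
toy_fs_type and toy_can_idmap_mount() are hypothetical stand-ins and their
values are not meant to match the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag bits, chosen for illustration only. */
#define TOY_FS_REQUIRES_DEV	(1u << 0)
#define TOY_FS_BINARY_MOUNTDATA	(1u << 1)
#define TOY_FS_ALLOW_IDMAP	(1u << 5)

/* Hypothetical stand-in for a filesystem type descriptor. */
struct toy_fs_type {
	const char *name;
	unsigned int fs_flags;
};

/* Refuse to idmap a mount unless the filesystem type opted in. */
static int toy_can_idmap_mount(const struct toy_fs_type *type)
{
	assert(type != NULL);
	if (!(type->fs_flags & TOY_FS_ALLOW_IDMAP))
		return -EINVAL;
	return 0;
}
```

A descriptor carrying only the first two flags is rejected, and the same
descriptor with TOY_FS_ALLOW_IDMAP or'ed in is accepted, which is the whole
of the behavioral change this one-line patch is after.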

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
unchanged
---
 fs/btrfs/super.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d07b18b2b250..5ba21f6b443c 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2381,7 +2381,7 @@ static struct file_system_type btrfs_root_fs_type = {
 	.name		= "btrfs",
 	.mount		= btrfs_mount_root,
 	.kill_sb	= btrfs_kill_super,
-	.fs_flags	= FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA,
+	.fs_flags	= FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA | FS_ALLOW_IDMAP,
 };
 
 MODULE_ALIAS_FS("btrfs");
-- 
2.30.2



* [PATCH v2 21/21] btrfs/242: introduce btrfs specific idmapped mounts tests
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (19 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 20/21] btrfs/super: allow idmapped btrfs Christian Brauner
@ 2021-07-19 11:10 ` Christian Brauner
  2021-07-19 15:11 ` [PATCH v2 00/21] btrfs: support idmapped mounts Josef Bacik
  21 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-19 11:10 UTC (permalink / raw)
  To: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner, Christoph Hellwig, fstests

From: Christian Brauner <christian.brauner@ubuntu.com>

While core vfs functionality that btrfs implements is completely covered
by the generic test-suite, the btrfs specific ioctls are not.
This patch expands the test-suite to cover btrfs specific ioctls that
have required changes to work on idmapped mounts. We deliberately
don't use the libbtrfsutil library as we need to know exactly what
ioctls are issued and we need to be in control of all privileges at all
times. This test-suite currently tests:
- BTRFS_IOC_{SNAP,SUBVOL}_CREATE_V2
  - subvolume creation on idmapped mounts where the fsids do have a
    mapping in the superblock
  - snapshot creation on idmapped mounts where the fsids do have a
    mapping in the superblock
  - subvolume creation on idmapped mounts where the fsids do not have a
    mapping in the superblock
  - snapshot creation on idmapped mounts where the fsids do not have
    a mapping in the superblock
  - subvolume creation on idmapped mounts where the caller is
    located in a user namespace and the fsids do have a mapping
    in the superblock
  - snapshot creation on idmapped mounts where the caller is located
    in a user namespace and the fsids do have a mapping in the
    superblock
  - subvolume creation on idmapped mounts where the caller is
    located in a user namespace and the fsids do not have a mapping
    in the superblock
  - snapshot creation on idmapped mounts where the caller is located
    in a user namespace and the fsids do not have a mapping in the
    superblock

- BTRFS_IOC_SNAP_DESTROY_V2
  - subvolume deletion on idmapped mounts where the fsids do have a
    mapping in the superblock
  - snapshot deletion on idmapped mounts where the fsids do have a
    mapping in the superblock
  - subvolume deletion on idmapped mounts where the fsids do not have a
    mapping in the superblock
  - snapshot deletion on idmapped mounts where the fsids do not have
    a mapping in the superblock
  - subvolume deletion on idmapped mounts where the caller is
    located in a user namespace and the fsids do have a mapping
    in the superblock
  - snapshot deletion on idmapped mounts where the caller is located
    in a user namespace and the fsids do have a mapping in the
    superblock
  - subvolume deletion on idmapped mounts where the caller is
    located in a user namespace and the fsids do not have a mapping
    in the superblock
  - snapshot deletion on idmapped mounts where the caller is located
    in a user namespace and the fsids do not have a mapping in the
    superblock
  - unprivileged subvolume deletion on idmapped mounts where the fsids
    do have a mapping in the superblock and the filesystem is mounted
    with "user_subvol_rm_allowed"
  - unprivileged snapshot deletion on idmapped mounts where the fsids do
    have a mapping in the superblock and the filesystem is mounted with
    "user_subvol_rm_allowed"
  - subvolume deletion on idmapped mounts where the caller is
    located in a user namespace and the fsids do have a mapping
    in the superblock and the filesystem is mounted with
    "user_subvol_rm_allowed"
  - snapshot deletion on idmapped mounts where the caller is located
    in a user namespace and the fsids do have a mapping in the
    superblock and the filesystem is mounted with
    "user_subvol_rm_allowed"

- BTRFS_IOC_SUBVOL_SETFLAGS
  - subvolume flags on idmapped mounts where the fsids do have a mapping
    in the superblock
  - snapshot flags on idmapped mounts where the fsids do have a mapping
    in the superblock
  - subvolume flags on idmapped mounts where the fsids do not have a
    mapping in the superblock
  - snapshot flags on idmapped mounts where the fsids do not have a
    mapping in the superblock
  - subvolume flags on idmapped mounts where the caller is located in a
    user namespace and the fsids do have a mapping in the superblock
  - snapshot flags on idmapped mounts where the caller is located in a
    user namespace and the fsids do have a mapping in the superblock
  - subvolume flags on idmapped mounts where the caller is located in a
    user namespace and the fsids do not have a mapping in the superblock
  - snapshot flags on idmapped mounts where the caller is located in a
    user namespace and the fsids do not have a mapping in the superblock

- BTRFS_IOC_INO_LOOKUP_USER
  - subvolume lookup on idmapped mounts where the fsids do have a mapping
    in the superblock
  - subvolume lookup on idmapped mounts where the fsids do not have a
    mapping in the superblock
  - subvolume lookup on idmapped mounts where the caller is located in a
    user namespace and the fsids do have a mapping in the superblock
  - subvolume lookup on idmapped mounts where the caller is located in a
    user namespace and the fsids do not have a mapping in the superblock
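
The four base scenarios that recur under each ioctl above form a 2x2 matrix:
whether the fsids have a mapping in the superblock, and whether the caller is
located in a user namespace. A minimal sketch of enumerating that matrix
follows; struct toy_scenario and the toy_* helpers are hypothetical and are
not part of the test-suite itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical description of one base scenario. */
struct toy_scenario {
	bool fsids_mapped;	/* fsids have a mapping in the superblock */
	bool caller_in_userns;	/* caller is located in a user namespace */
};

/* Fill @out with up to @cap scenarios, in the order used in the list. */
static size_t toy_build_matrix(struct toy_scenario *out, size_t cap)
{
	size_t n = 0;
	int userns, mapped;

	assert(out != NULL);
	for (userns = 0; userns <= 1; userns++) {
		for (mapped = 1; mapped >= 0; mapped--) {
			if (n == cap)
				return n;
			out[n].fsids_mapped = mapped;
			out[n].caller_in_userns = userns;
			n++;
		}
	}
	return n;
}

/* Render a scenario in the wording used by the list above. */
static int toy_describe(const struct toy_scenario *s, const char *op,
			char *buf, size_t len)
{
	return snprintf(buf, len,
			"%s on idmapped mounts where %sthe fsids do%s have a mapping in the superblock",
			op,
			s->caller_in_userns ?
			"the caller is located in a user namespace and " : "",
			s->fsids_mapped ? "" : " not");
}
```

The additional "user_subvol_rm_allowed" cases listed for
BTRFS_IOC_SNAP_DESTROY_V2 sit outside this matrix.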

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Sterba <dsterba@suse.com>
Cc: fstests@vger.kernel.org
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 src/idmapped-mounts/idmapped-mounts.c | 4061 +++++++++++++++++++++++--
 tests/btrfs/242                       |   34 +
 tests/btrfs/242.out                   |    2 +
 3 files changed, 3903 insertions(+), 194 deletions(-)
 create mode 100755 tests/btrfs/242
 create mode 100644 tests/btrfs/242.out

diff --git a/src/idmapped-mounts/idmapped-mounts.c b/src/idmapped-mounts/idmapped-mounts.c
index f155e0b4..f7ac1c0a 100644
--- a/src/idmapped-mounts/idmapped-mounts.c
+++ b/src/idmapped-mounts/idmapped-mounts.c
@@ -11,6 +11,8 @@
 #include <getopt.h>
 #include <grp.h>
 #include <limits.h>
+#include <linux/btrfs.h>
+#include <linux/btrfs_tree.h>
 #include <linux/limits.h>
 #include <linux/types.h>
 #include <pthread.h>
@@ -91,12 +93,21 @@ const char *t_fstype;
 /* path of the test device */
 const char *t_device;
 
+/* path of the test scratch device */
+const char *t_device_scratch;
+
 /* mountpoint of the test device */
 const char *t_mountpoint;
 
+/* mountpoint of the test scratch device */
+const char *t_mountpoint_scratch;
+
 /* fd for @t_mountpoint */
 int t_mnt_fd;
 
+/* fd for @t_mountpoint_scratch */
+int t_mnt_scratch_fd;
+
 /* fd for @T_DIR1 */
 int t_dir1_fd;
 
@@ -9520,240 +9531,3902 @@ out:
 	return fret;
 }
 
-static void usage(void)
+static int btrfs_delete_subvolume(int parent_fd, const char *name)
 {
-	fprintf(stderr, "Description:\n");
-	fprintf(stderr, "    Run idmapped mount tests\n\n");
+	struct btrfs_ioctl_vol_args args = {};
+	size_t len;
+	int ret;
 
-	fprintf(stderr, "Arguments:\n");
-	fprintf(stderr, "--device                     Device used in the tests\n");
-	fprintf(stderr, "--fstype                     Filesystem type used in the tests\n");
-	fprintf(stderr, "--help                       Print help\n");
-	fprintf(stderr, "--mountpoint                 Mountpoint of device\n");
-	fprintf(stderr, "--supported                  Test whether idmapped mounts are supported on this filesystem\n");
-	fprintf(stderr, "--test-core                  Run core idmapped mount testsuite\n");
-	fprintf(stderr, "--test-fscaps-regression     Run fscap regression tests\n");
+	len = strlen(name);
+	if (len >= sizeof(args.name))
+		return -ENAMETOOLONG;
 
-	_exit(EXIT_SUCCESS);
+	memcpy(args.name, name, len);
+	args.name[len] = '\0';
+
+	ret = ioctl(parent_fd, BTRFS_IOC_SNAP_DESTROY, &args);
+	if (ret < 0)
+		return -1;
+
+	return 0;
 }
 
-static const struct option longopts[] = {
-	{"device",			required_argument,	0,	1},
-	{"fstype",			required_argument,	0,	2},
-	{"mountpoint",			required_argument,	0,	3},
-	{"supported",			no_argument,		0,	4},
-	{"help",			no_argument,		0,	5},
-	{"test-core",			no_argument,		0,	6},
-	{"test-fscaps-regression",	no_argument,		0,	7},
-	{"test-nested-userns",		no_argument,		0,	8},
-	{NULL,				0,			0,	0},
-};
+static int btrfs_delete_subvolume_id(int parent_fd, uint64_t subvolid)
+{
+	struct btrfs_ioctl_vol_args_v2 args = {};
+	int ret;
 
-struct t_idmapped_mounts {
-	int (*test)(void);
-	const char *description;
-} basic_suite[] = {
-	{ acls,								"posix acls on regular mounts",									},
-	{ create_in_userns,						"create operations in user namespace",								},
-	{ device_node_in_userns,					"device node in user namespace",								},
-	{ expected_uid_gid_idmapped_mounts,				"expected ownership on idmapped mounts",							},
-	{ fscaps,							"fscaps on regular mounts",									},
-	{ fscaps_idmapped_mounts,					"fscaps on idmapped mounts",									},
-	{ fscaps_idmapped_mounts_in_userns,				"fscaps on idmapped mounts in user namespace",							},
-	{ fscaps_idmapped_mounts_in_userns_separate_userns,		"fscaps on idmapped mounts in user namespace with different id mappings",			},
-	{ fsids_mapped,							"mapped fsids",											},
-	{ fsids_unmapped,						"unmapped fsids",										},
-	{ hardlink_crossing_mounts,					"cross mount hardlink",										},
-	{ hardlink_crossing_idmapped_mounts,				"cross idmapped mount hardlink",								},
-	{ hardlink_from_idmapped_mount,					"hardlinks from idmapped mounts",								},
-	{ hardlink_from_idmapped_mount_in_userns,			"hardlinks from idmapped mounts in user namespace",						},
-#ifdef HAVE_LIBURING_H
-	{ io_uring,							"io_uring",											},
-	{ io_uring_userns,						"io_uring in user namespace",									},
-	{ io_uring_idmapped,						"io_uring from idmapped mounts",								},
-	{ io_uring_idmapped_userns,					"io_uring from idmapped mounts in user namespace",						},
-	{ io_uring_idmapped_unmapped,					"io_uring from idmapped mounts with unmapped ids",						},
-	{ io_uring_idmapped_unmapped_userns,				"io_uring from idmapped mounts with unmapped ids in user namespace",				},
-#endif
-	{ protected_symlinks,						"following protected symlinks on regular mounts",						},
-	{ protected_symlinks_idmapped_mounts,				"following protected symlinks on idmapped mounts",						},
-	{ protected_symlinks_idmapped_mounts_in_userns,			"following protected symlinks on idmapped mounts in user namespace",				},
-	{ rename_crossing_mounts,					"cross mount rename",										},
-	{ rename_crossing_idmapped_mounts,				"cross idmapped mount rename",									},
-	{ rename_from_idmapped_mount,					"rename from idmapped mounts",									},
-	{ rename_from_idmapped_mount_in_userns,				"rename from idmapped mounts in user namespace",						},
-	{ setattr_truncate,						"setattr truncate",										},
-	{ setattr_truncate_idmapped,					"setattr truncate on idmapped mounts",								},
-	{ setattr_truncate_idmapped_in_userns,				"setattr truncate on idmapped mounts in user namespace",					},
-	{ setgid_create,						"create operations in directories with setgid bit set",						},
-	{ setgid_create_idmapped,					"create operations in directories with setgid bit set on idmapped mounts",			},
-	{ setgid_create_idmapped_in_userns,				"create operations in directories with setgid bit set on idmapped mounts in user namespace",	},
-	{ setid_binaries,						"setid binaries on regular mounts",								},
-	{ setid_binaries_idmapped_mounts,				"setid binaries on idmapped mounts",								},
-	{ setid_binaries_idmapped_mounts_in_userns,			"setid binaries on idmapped mounts in user namespace",						},
-	{ setid_binaries_idmapped_mounts_in_userns_separate_userns,	"setid binaries on idmapped mounts in user namespace with different id mappings",		},
-	{ sticky_bit_unlink,						"sticky bit unlink operations on regular mounts",						},
-	{ sticky_bit_unlink_idmapped_mounts,				"sticky bit unlink operations on idmapped mounts",						},
-	{ sticky_bit_unlink_idmapped_mounts_in_userns,			"sticky bit unlink operations on idmapped mounts in user namespace",				},
-	{ sticky_bit_rename,						"sticky bit rename operations on regular mounts",						},
-	{ sticky_bit_rename_idmapped_mounts,				"sticky bit rename operations on idmapped mounts",						},
-	{ sticky_bit_rename_idmapped_mounts_in_userns,			"sticky bit rename operations on idmapped mounts in user namespace",				},
-	{ symlink_regular_mounts,					"symlink from regular mounts",									},
-	{ symlink_idmapped_mounts,					"symlink from idmapped mounts",									},
-	{ symlink_idmapped_mounts_in_userns,				"symlink from idmapped mounts in user namespace",						},
-	{ threaded_idmapped_mount_interactions,				"threaded operations on idmapped mounts",							},
-};
+	args.flags = BTRFS_SUBVOL_SPEC_BY_ID;
+	args.subvolid = subvolid;
 
-struct t_idmapped_mounts fscaps_in_ancestor_userns[] = {
-	{ fscaps_idmapped_mounts_in_userns_valid_in_ancestor_userns,	"fscaps on idmapped mounts in user namespace writing fscap valid in ancestor userns",		},
-};
+	ret = ioctl(parent_fd, BTRFS_IOC_SNAP_DESTROY_V2, &args);
+	if (ret < 0)
+		return -1;
 
-struct t_idmapped_mounts t_nested_userns[] = {
-	{ nested_userns,						"test that nested user namespaces behave correctly when attached to idmapped mounts",		},
-};
+	return 0;
+}
 
-static bool run_test(struct t_idmapped_mounts suite[], size_t suite_size)
+static int btrfs_create_subvolume(int parent_fd, const char *name)
 {
-	int i;
+	struct btrfs_ioctl_vol_args_v2 args = {};
+	size_t len;
+	int ret;
 
-	for (i = 0; i < suite_size; i++) {
-		struct t_idmapped_mounts *t = &suite[i];
-		int ret;
-		pid_t pid;
+	len = strlen(name);
+	if (len >= sizeof(args.name))
+		return -ENAMETOOLONG;
 
-		test_setup();
+	memcpy(args.name, name, len);
+	args.name[len] = '\0';
 
-		pid = fork();
-		if (pid < 0)
-			return false;
+	ret = ioctl(parent_fd, BTRFS_IOC_SUBVOL_CREATE_V2, &args);
+	if (ret < 0)
+		return -1;
 
-		if (pid == 0) {
-			ret = t->test();
-			if (ret)
-				die("failure: %s", t->description);
+	return 0;
+}
 
-			exit(EXIT_SUCCESS);
-		}
+static int btrfs_create_snapshot(int fd, int parent_fd, const char *name,
+				 int flags)
+{
+	struct btrfs_ioctl_vol_args_v2 args = {
+		.fd = fd,
+	};
+	size_t len;
+	int ret;
 
-		ret = wait_for_pid(pid);
-		test_cleanup();
+	if (flags & ~BTRFS_SUBVOL_RDONLY)
+		return -EINVAL;
 
-		if (ret)
-			return false;
-	}
+	len = strlen(name);
+	if (len >= sizeof(args.name))
+		return -ENAMETOOLONG;
+	memcpy(args.name, name, len);
+	args.name[len] = '\0';
 
-	return true;
+	if (flags & BTRFS_SUBVOL_RDONLY)
+		args.flags |= BTRFS_SUBVOL_RDONLY;
+	ret = ioctl(parent_fd, BTRFS_IOC_SNAP_CREATE_V2, &args);
+	if (ret < 0)
+		return -1;
+
+	return 0;
 }
 
-int main(int argc, char *argv[])
+static int btrfs_get_subvolume_ro(int fd, bool *read_only_ret)
 {
-	int fret, ret;
-	int index = 0;
-	bool supported = false, test_core = false,
-	     test_fscaps_regression = false, test_nested_userns = false;
+	uint64_t flags;
+	int ret;
 
-	while ((ret = getopt_long(argc, argv, "", longopts, &index)) != -1) {
-		switch (ret) {
-		case 1:
-			t_device = optarg;
-			break;
-		case 2:
-			t_fstype = optarg;
-			break;
-		case 3:
-			t_mountpoint = optarg;
-			break;
-		case 4:
-			supported = true;
-			break;
-		case 6:
-			test_core = true;
-			break;
-		case 7:
-			test_fscaps_regression = true;
-			break;
-		case 8:
-			test_nested_userns = true;
-			break;
-		case 5:
-			/* fallthrough */
-		default:
-			usage();
-		}
-	}
+	ret = ioctl(fd, BTRFS_IOC_SUBVOL_GETFLAGS, &flags);
+	if (ret < 0)
+		return -1;
 
-	if (!t_device)
-		die_errno(EINVAL, "test device missing");
+	*read_only_ret = flags & BTRFS_SUBVOL_RDONLY;
+	return 0;
+}
 
-	if (!t_fstype)
-		die_errno(EINVAL, "test filesystem type missing");
+static int btrfs_set_subvolume_ro(int fd, bool read_only)
+{
+	uint64_t flags;
+	int ret;
 
-	if (!t_mountpoint)
-		die_errno(EINVAL, "mountpoint of test device missing");
+	ret = ioctl(fd, BTRFS_IOC_SUBVOL_GETFLAGS, &flags);
+	if (ret < 0)
+		return -1;
 
-	/* create separate mount namespace */
-	if (unshare(CLONE_NEWNS))
-		die("failure: create new mount namespace");
+	if (read_only)
+		flags |= BTRFS_SUBVOL_RDONLY;
+	else
+		flags &= ~BTRFS_SUBVOL_RDONLY;
 
-	/* turn off mount propagation */
-	if (sys_mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, 0))
-		die("failure: turn mount propagation off");
+	ret = ioctl(fd, BTRFS_IOC_SUBVOL_SETFLAGS, &flags);
+	if (ret < 0)
+		return -1;
 
-	t_mnt_fd = openat(-EBADF, t_mountpoint, O_CLOEXEC | O_DIRECTORY);
-	if (t_mnt_fd < 0)
-		die("failed to open %s", t_mountpoint);
+	return 0;
+}
 
-	/*
-	 * Caller just wants to know whether the filesystem we're on supports
-	 * idmapped mounts.
-	 */
-	if (supported) {
-		int open_tree_fd = -EBADF;
-		struct mount_attr attr = {
-			.attr_set	= MOUNT_ATTR_IDMAP,
-			.userns_fd	= -EBADF,
-		};
+static int btrfs_get_subvolume_id(int fd, uint64_t *id_ret)
+{
+	struct btrfs_ioctl_ino_lookup_args args = {
+	    .treeid = 0,
+	    .objectid = BTRFS_FIRST_FREE_OBJECTID,
+	};
+	int ret;
 
-		/* Changing mount properties on a detached mount. */
-		attr.userns_fd	= get_userns_fd(0, 1000, 1);
-		if (attr.userns_fd < 0)
-			exit(EXIT_FAILURE);
+	ret = ioctl(fd, BTRFS_IOC_INO_LOOKUP, &args);
+	if (ret < 0)
+		return -1;
 
-		open_tree_fd = sys_open_tree(t_mnt_fd, "",
-					     AT_EMPTY_PATH |
-					     AT_NO_AUTOMOUNT |
-					     AT_SYMLINK_NOFOLLOW |
-					     OPEN_TREE_CLOEXEC |
-					     OPEN_TREE_CLONE);
-		if (open_tree_fd < 0)
-			ret = -1;
-		else
-			ret = sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr));
+	*id_ret = args.treeid;
 
-		close(open_tree_fd);
-		close(attr.userns_fd);
+	return 0;
+}
 
-		if (ret)
-			exit(EXIT_FAILURE);
+/*
+ * The following helpers are adapted from the libbtrfsutil library. We can't use
+ * the library directly since we need full control over how the subvolume
+ * iteration happens. We need to be able to check whether unprivileged
+ * subvolume iteration is possible, i.e. whether BTRFS_IOC_INO_LOOKUP_USER is
+ * available and also ensure that it is actually used when looking up paths.
+ */
+struct btrfs_stack {
+	uint64_t tree_id;
+	struct btrfs_ioctl_get_subvol_rootref_args rootref_args;
+	size_t items_pos;
+	size_t path_len;
+};
 
-		exit(EXIT_SUCCESS);
+struct btrfs_iter {
+	int fd;
+	int cur_fd;
+
+	struct btrfs_stack *search_stack;
+	size_t stack_len;
+	size_t stack_capacity;
+
+	char *cur_path;
+	size_t cur_path_capacity;
+};
+
+static struct btrfs_stack *top_stack_entry(struct btrfs_iter *iter)
+{
+	return &iter->search_stack[iter->stack_len - 1];
+}
+
+static int pop_stack(struct btrfs_iter *iter)
+{
+	struct btrfs_stack *top, *parent;
+	int fd, parent_fd;
+	size_t i;
+
+	if (iter->stack_len == 1) {
+		iter->stack_len--;
+		return 0;
 	}
 
-	stash_overflowuid();
-	stash_overflowgid();
+	top = top_stack_entry(iter);
+	iter->stack_len--;
+	parent = top_stack_entry(iter);
 
-	fret = EXIT_FAILURE;
+	fd = iter->cur_fd;
+	for (i = parent->path_len; i < top->path_len; i++) {
+		if (i == 0 || iter->cur_path[i] == '/') {
+			parent_fd = openat(fd, "..", O_RDONLY);
+			if (fd != iter->cur_fd)
+				close(fd);
+			if (parent_fd == -1)
+				return -1;
+			fd = parent_fd;
+		}
+	}
+	if (iter->cur_fd != iter->fd)
+		close(iter->cur_fd);
+	iter->cur_fd = fd;
 
-	if (test_core && !run_test(basic_suite, ARRAY_SIZE(basic_suite)))
-		goto out;
+	return 0;
+}
 
-	if (test_fscaps_regression &&
-	    !run_test(fscaps_in_ancestor_userns,
-		      ARRAY_SIZE(fscaps_in_ancestor_userns)))
-		goto out;
+static int append_stack(struct btrfs_iter *iter, uint64_t tree_id, size_t path_len)
+{
+	struct btrfs_stack *entry;
 
-	if (test_nested_userns &&
-	    !run_test(t_nested_userns, ARRAY_SIZE(t_nested_userns)))
+	if (iter->stack_len >= iter->stack_capacity) {
+		size_t new_capacity = iter->stack_capacity * 2;
+		struct btrfs_stack *new_search_stack;
+
+		new_search_stack = reallocarray(iter->search_stack, new_capacity,
+						sizeof(*iter->search_stack));
+		if (!new_search_stack)
+			return -ENOMEM;
+
+		iter->stack_capacity = new_capacity;
+		iter->search_stack = new_search_stack;
+	}
+
+	entry = &iter->search_stack[iter->stack_len];
+
+	memset(entry, 0, sizeof(*entry));
+	entry->path_len = path_len;
+	entry->tree_id = tree_id;
+
+	if (iter->stack_len) {
+		struct btrfs_stack *top;
+		char *path;
+		int fd;
+
+		top = top_stack_entry(iter);
+		path = &iter->cur_path[top->path_len];
+		if (*path == '/')
+			path++;
+		fd = openat(iter->cur_fd, path, O_RDONLY);
+		if (fd == -1)
+			return -errno;
+
+		close(iter->cur_fd);
+		iter->cur_fd = fd;
+	}
+
+	iter->stack_len++;
+
+	return 0;
+}
+
+static int btrfs_iterator_start(int fd, uint64_t top, struct btrfs_iter **ret)
+{
+	struct btrfs_iter *iter;
+	int err;
+
+	iter = malloc(sizeof(*iter));
+	if (!iter)
+		return -ENOMEM;
+
+	iter->fd = fd;
+	iter->cur_fd = fd;
+
+	iter->stack_len = 0;
+	iter->stack_capacity = 4;
+	iter->search_stack = malloc(sizeof(*iter->search_stack) *
+				    iter->stack_capacity);
+	if (!iter->search_stack) {
+		err = -ENOMEM;
+		goto out_iter;
+	}
+
+	iter->cur_path_capacity = 256;
+	iter->cur_path = malloc(iter->cur_path_capacity);
+	if (!iter->cur_path) {
+		err = -ENOMEM;
+		goto out_search_stack;
+	}
+
+	err = append_stack(iter, top, 0);
+	if (err)
+		goto out_cur_path;
+
+	*ret = iter;
+
+	return 0;
+
+out_cur_path:
+	free(iter->cur_path);
+out_search_stack:
+	free(iter->search_stack);
+out_iter:
+	free(iter);
+	return err;
+}
+
+static void btrfs_iterator_end(struct btrfs_iter *iter)
+{
+	if (iter) {
+		free(iter->cur_path);
+		free(iter->search_stack);
+		if (iter->cur_fd != iter->fd)
+			close(iter->cur_fd);
+		close(iter->fd);
+		free(iter);
+	}
+}
+
+static int __append_path(struct btrfs_iter *iter, const char *name,
+			 size_t name_len, const char *dir, size_t dir_len,
+			 size_t *path_len_ret)
+{
+	struct btrfs_stack *top = top_stack_entry(iter);
+	size_t path_len;
+	char *p;
+
+	path_len = top->path_len;
+	/*
+	 * We need a joining slash if we have a current path and a subdirectory.
+	 */
+	if (top->path_len && dir_len)
+		path_len++;
+	path_len += dir_len;
+	/*
+	 * We need another joining slash if we have a current path and a name,
+	 * but not if we have a subdirectory, because the lookup ioctl includes
+	 * a trailing slash.
+	 */
+	if (top->path_len && !dir_len && name_len)
+		path_len++;
+	path_len += name_len;
+
+	/* We need one extra character for the NUL terminator. */
+	if (path_len + 1 > iter->cur_path_capacity) {
+		char *tmp = realloc(iter->cur_path, path_len + 1);
+
+		if (!tmp)
+			return -ENOMEM;
+		iter->cur_path = tmp;
+		iter->cur_path_capacity = path_len + 1;
+	}
+
+	p = iter->cur_path + top->path_len;
+	if (top->path_len && dir_len)
+		*p++ = '/';
+	memcpy(p, dir, dir_len);
+	p += dir_len;
+	if (top->path_len && !dir_len && name_len)
+		*p++ = '/';
+	memcpy(p, name, name_len);
+	p += name_len;
+	*p = '\0';
+
+	*path_len_ret = path_len;
+
+	return 0;
+}
+
+static int get_subvolume_path(struct btrfs_iter *iter, uint64_t treeid,
+			      uint64_t dirid, size_t *path_len_ret)
+{
+	struct btrfs_ioctl_ino_lookup_user_args args = {
+		.treeid = treeid,
+		.dirid = dirid,
+	};
+	int ret;
+
+	ret = ioctl(iter->cur_fd, BTRFS_IOC_INO_LOOKUP_USER, &args);
+	if (ret == -1)
+		return -1;
+
+	return __append_path(iter, args.name, strlen(args.name), args.path,
+			     strlen(args.path), path_len_ret);
+}
+
+static int btrfs_iterator_next(struct btrfs_iter *iter, char **path_ret,
+			       uint64_t *id_ret)
+{
+	struct btrfs_stack *top;
+	uint64_t treeid, dirid;
+	size_t path_len;
+	int ret, err;
+
+	for (;;) {
+		for (;;) {
+			if (iter->stack_len == 0)
+				return 1;
+
+			top = top_stack_entry(iter);
+			if (top->items_pos < top->rootref_args.num_items) {
+				break;
+			} else {
+				ret = ioctl(iter->cur_fd,
+					    BTRFS_IOC_GET_SUBVOL_ROOTREF,
+					    &top->rootref_args);
+				if (ret == -1 && errno != EOVERFLOW)
+					return -1;
+				top->items_pos = 0;
+
+				if (top->rootref_args.num_items == 0) {
+					err = pop_stack(iter);
+					if (err)
+						return err;
+				}
+			}
+		}
+
+		treeid = top->rootref_args.rootref[top->items_pos].treeid;
+		dirid = top->rootref_args.rootref[top->items_pos].dirid;
+		top->items_pos++;
+		err = get_subvolume_path(iter, treeid, dirid, &path_len);
+		if (err) {
+			/* Skip the subvolume if we can't access it. */
+			if (errno == EACCES)
+				continue;
+			return err;
+		}
+
+		err = append_stack(iter, treeid, path_len);
+		if (err) {
+			/*
+			 * Skip the subvolume if it does not exist (which can
+			 * happen if there is another filesystem mounted over a
+			 * parent directory) or we don't have permission to
+			 * access it.
+			 */
+			if (errno == ENOENT || errno == EACCES)
+				continue;
+			return err;
+		}
+
+		top = top_stack_entry(iter);
+		goto out;
+	}
+
+out:
+	if (path_ret) {
+		*path_ret = malloc(top->path_len + 1);
+		if (!*path_ret)
+			return -ENOMEM;
+		memcpy(*path_ret, iter->cur_path, top->path_len);
+		(*path_ret)[top->path_len] = '\0';
+	}
+	if (id_ret)
+		*id_ret = top->tree_id;
+	return 0;
+}
+
+#define BTRFS_SUBVOLUME1 "subvol1"
+#define BTRFS_SUBVOLUME1_SNAPSHOT1 "subvol1_snapshot1"
+#define BTRFS_SUBVOLUME1_SNAPSHOT1_RO "subvol1_snapshot1_ro"
+#define BTRFS_SUBVOLUME1_RENAME "subvol1_rename"
+#define BTRFS_SUBVOLUME2 "subvol2"
+
+static int btrfs_subvolumes_fsids_mapped(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		if (!switch_fsids(10000, 10000))
+			die("failure: switch fsids");
+
+		if (!caps_up())
+			die("failure: raise caps");
+
+		/*
+		 * The caller's fsids now have mappings in the idmapped mount so
+		 * any file creation must succeed.
+		 */
+
+		/* create subvolume */
+		if (btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_create_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 10000, 10000))
+			die("failure: check ownership");
+
+		/* remove subvolume */
+		if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_delete_subvolume");
+
+		/* create subvolume */
+		if (btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_create_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 10000, 10000))
+			die("failure: check ownership");
+
+		if (!caps_down())
+			die("failure: lower caps");
+
+		/*
+		 * The filesystem is not mounted with user_subvol_rm_allowed so
+		 * subvolume deletion must fail.
+		 */
+		if (!btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_delete_subvolume");
+		if (errno != EPERM)
+			die("failure: errno");
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 10000, 10000))
+		die("failure: check ownership");
+
+	/* remove subvolume */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_subvolumes_fsids_mapped_userns(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		if (!switch_userns(attr.userns_fd, 0, 0, false))
+			die("failure: switch_userns");
+
+		/*
+		 * The caller's fsids now have mappings in the idmapped mount so
+		 * any file creation must succeed.
+		 */
+
+		/* create subvolume */
+		if (btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_create_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 0, 0))
+			die("failure: check ownership");
+
+		/*
+		 * The filesystem is not mounted with user_subvol_rm_allowed so
+		 * subvolume deletion must fail.
+		 */
+		if (!btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_delete_subvolume");
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	/* remove subvolume */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_subvolumes_fsids_unmapped(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+
+	/* create subvolume for rename test */
+	if (btrfs_create_subvolume(t_dir1_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+
+	/* change ownership of all files to uid 0 */
+	if (fchownat(t_dir1_fd, BTRFS_SUBVOLUME1, 0, 0, 0)) {
+		log_stderr("failure: fchownat");
+		goto out;
+	}
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	if (!switch_fsids(0, 0)) {
+		log_stderr("failure: switch_fsids");
+		goto out;
+	}
+
+	/*
+	 * The caller's fsids don't have mappings in the idmapped mount so
+	 * any file creation must fail.
+	 */
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	/* create subvolume */
+	if (!btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME2)) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+	if (errno != EOVERFLOW) {
+		log_stderr("failure: errno");
+		goto out;
+	}
+
+	/* try to rename a subvolume */
+	if (!renameat2(open_tree_fd, BTRFS_SUBVOLUME1, open_tree_fd,
+		       BTRFS_SUBVOLUME1_RENAME, 0)) {
+		log_stderr("failure: renameat2");
+		goto out;
+	}
+	if (errno != EOVERFLOW) {
+		log_stderr("failure: errno");
+		goto out;
+	}
+
+	/* The caller is privileged over the inode so subvolume deletion must work. */
+
+	/* remove subvolume */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_subvolumes_fsids_unmapped_userns(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF, userns_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	/* create subvolume for rename test */
+	if (btrfs_create_subvolume(t_dir1_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+
+	/* change ownership of all files to uid 0 */
+	if (fchownat(t_dir1_fd, BTRFS_SUBVOLUME1, 0, 0, 0)) {
+		log_stderr("failure: fchownat");
+		goto out;
+	}
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	/*
+	 * Create the user namespace the caller switches into. Its idmapping
+	 * differs from the one attached to the mount.
+	 */
+	userns_fd = get_userns_fd(0, 30000, 10000);
+	if (userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		if (!switch_userns(userns_fd, 0, 0, false))
+			die("failure: switch_userns");
+
+		if (!expected_uid_gid(t_dir1_fd, BTRFS_SUBVOLUME1, 0,
+				      t_overflowuid, t_overflowgid))
+			die("failure: expected_uid_gid");
+
+		if (!expected_uid_gid(open_tree_fd, BTRFS_SUBVOLUME1, 0,
+				      t_overflowuid, t_overflowgid))
+			die("failure: expected_uid_gid");
+
+		/*
+		 * The caller's fsids don't have mappings in the idmapped mount so
+		 * any file creation must fail.
+		 */
+
+		/* create subvolume */
+		if (!btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME2))
+			die("failure: btrfs_create_subvolume");
+		if (errno != EOVERFLOW)
+			die("failure: errno");
+
+		/* try to rename a subvolume */
+		if (!renameat2(open_tree_fd, BTRFS_SUBVOLUME1, open_tree_fd,
+					BTRFS_SUBVOLUME1_RENAME, 0))
+			die("failure: renameat2");
+		if (errno != EOVERFLOW)
+			die("failure: errno");
+
+		/*
+		 * The caller is not privileged over the inode so subvolume
+		 * deletion must fail.
+		 */
+
+		/* remove subvolume */
+		if (!btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_delete_subvolume");
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	/* remove subvolume */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+	safe_close(userns_fd);
+
+	return fret;
+}
+
+static int btrfs_snapshots_fsids_mapped(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		int subvolume_fd = -EBADF;
+
+		if (!switch_fsids(10000, 10000))
+			die("failure: switch fsids");
+
+		if (!caps_up())
+			die("failure: raise caps");
+
+		/*
+		 * The caller's fsids now have mappings in the idmapped mount so
+		 * any file creation must succeed.
+		 */
+
+		/* create subvolume */
+		if (btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_create_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 10000, 10000))
+			die("failure: expected_uid_gid");
+
+		subvolume_fd = openat(tree_fd, BTRFS_SUBVOLUME1,
+				      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (subvolume_fd < 0)
+			die("failure: openat");
+
+		/* create read-write snapshot */
+		if (btrfs_create_snapshot(subvolume_fd, tree_fd,
+					  BTRFS_SUBVOLUME1_SNAPSHOT1, 0))
+			die("failure: btrfs_create_snapshot");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1, 0, 10000, 10000))
+			die("failure: expected_uid_gid");
+
+		/* create read-only snapshot */
+		if (btrfs_create_snapshot(subvolume_fd, tree_fd,
+					  BTRFS_SUBVOLUME1_SNAPSHOT1_RO,
+					  BTRFS_SUBVOL_RDONLY))
+			die("failure: btrfs_create_snapshot");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO, 0, 10000, 10000))
+			die("failure: expected_uid_gid");
+
+		safe_close(subvolume_fd);
+
+		/* remove subvolume */
+		if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_delete_subvolume");
+
+		/* remove read-write snapshot */
+		if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1))
+			die("failure: btrfs_delete_subvolume");
+
+		/* remove read-only snapshot */
+		if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO))
+			die("failure: btrfs_delete_subvolume");
+
+		/* create subvolume */
+		if (btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_create_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 10000, 10000))
+			die("failure: expected_uid_gid");
+
+		subvolume_fd = openat(tree_fd, BTRFS_SUBVOLUME1,
+				      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (subvolume_fd < 0)
+			die("failure: openat");
+
+		/* create read-write snapshot */
+		if (btrfs_create_snapshot(subvolume_fd, tree_fd,
+					  BTRFS_SUBVOLUME1_SNAPSHOT1, 0))
+			die("failure: btrfs_create_snapshot");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1, 0, 10000, 10000))
+			die("failure: expected_uid_gid");
+
+		/* create read-only snapshot */
+		if (btrfs_create_snapshot(subvolume_fd, tree_fd,
+					  BTRFS_SUBVOLUME1_SNAPSHOT1_RO,
+					  BTRFS_SUBVOL_RDONLY))
+			die("failure: btrfs_create_snapshot");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO, 0, 10000, 10000))
+			die("failure: expected_uid_gid");
+
+		safe_close(subvolume_fd);
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	/* remove subvolume */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	/* remove read-write snapshot */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	/* remove read-only snapshot */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_snapshots_fsids_mapped_userns(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		int subvolume_fd = -EBADF;
+
+		if (!switch_userns(attr.userns_fd, 0, 0, false))
+			die("failure: switch_userns");
+
+		/* create subvolume */
+		if (btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_create_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 0, 0))
+			die("failure: expected_uid_gid");
+
+		subvolume_fd = openat(tree_fd, BTRFS_SUBVOLUME1,
+				      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (subvolume_fd < 0)
+			die("failure: openat");
+
+		/* create read-write snapshot */
+		if (btrfs_create_snapshot(subvolume_fd, tree_fd,
+					  BTRFS_SUBVOLUME1_SNAPSHOT1, 0))
+			die("failure: btrfs_create_snapshot");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1, 0, 0, 0))
+			die("failure: expected_uid_gid");
+
+		/* create read-only snapshot */
+		if (btrfs_create_snapshot(subvolume_fd, tree_fd,
+					  BTRFS_SUBVOLUME1_SNAPSHOT1_RO,
+					  BTRFS_SUBVOL_RDONLY))
+			die("failure: btrfs_create_snapshot");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO, 0, 0, 0))
+			die("failure: expected_uid_gid");
+
+		safe_close(subvolume_fd);
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 10000, 10000))
+		die("failure: expected_uid_gid");
+
+	/* remove subvolume */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1, 0, 10000, 10000))
+		die("failure: expected_uid_gid");
+
+	/* remove read-write snapshot */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO, 0, 10000, 10000))
+		die("failure: expected_uid_gid");
+
+	/* remove read-only snapshot */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_snapshots_fsids_unmapped(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* create subvolume for rename test */
+	if (btrfs_create_subvolume(t_dir1_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+
+	/* change ownership of all files to uid 0 */
+	if (fchownat(t_dir1_fd, BTRFS_SUBVOLUME1, 0, 0, 0)) {
+		log_stderr("failure: fchownat");
+		goto out;
+	}
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr,
+			      sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		int subvolume_fd = -EBADF;
+
+		/* We are in the child here, so fail via die() like the other checks. */
+		if (!switch_fsids(0, 0))
+			die("failure: switch_fsids");
+
+		/*
+		 * The caller's fsids don't have mappings in the idmapped
+		 * mount so any file creation must fail.
+		 */
+
+		/*
+		 * The open_tree() syscall returns an O_PATH file descriptor
+		 * which we can't use with ioctl(). So let's reopen it as a
+		 * proper file descriptor.
+		 */
+		tree_fd = openat(open_tree_fd, ".",
+				 O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (tree_fd < 0)
+			die("failure: openat");
+
+		subvolume_fd = openat(tree_fd, BTRFS_SUBVOLUME1,
+				      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (subvolume_fd < 0)
+			die("failure: openat");
+
+		/* create subvolume */
+		if (!btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME2))
+			die("failure: btrfs_create_subvolume");
+		if (errno != EOVERFLOW)
+			die("failure: errno");
+
+		/* create read-write snapshot */
+		if (!btrfs_create_snapshot(subvolume_fd, tree_fd,
+					   BTRFS_SUBVOLUME1_SNAPSHOT1, 0))
+			die("failure: btrfs_create_snapshot");
+		if (errno != EOVERFLOW)
+			die("failure: errno");
+
+		/* create read-only snapshot */
+		if (!btrfs_create_snapshot(subvolume_fd, tree_fd,
+					   BTRFS_SUBVOLUME1_SNAPSHOT1_RO,
+					   BTRFS_SUBVOL_RDONLY))
+			die("failure: btrfs_create_snapshot");
+		if (errno != EOVERFLOW)
+			die("failure: errno");
+
+		/* try to rename a subvolume */
+		if (!renameat2(open_tree_fd, BTRFS_SUBVOLUME1, open_tree_fd,
+			       BTRFS_SUBVOLUME1_RENAME, 0))
+			die("failure: renameat2");
+		if (errno != EOVERFLOW)
+			die("failure: errno");
+
+		if (!caps_down())
+			die("failure: caps_down");
+
+		/* create read-write snapshot */
+		if (!btrfs_create_snapshot(subvolume_fd, tree_fd,
+					   BTRFS_SUBVOLUME1_SNAPSHOT1, 0))
+			die("failure: btrfs_create_snapshot");
+		if (errno != EPERM)
+			die("failure: errno");
+
+		/* create read-only snapshot */
+		if (!btrfs_create_snapshot(subvolume_fd, tree_fd,
+					   BTRFS_SUBVOLUME1_SNAPSHOT1_RO,
+					   BTRFS_SUBVOL_RDONLY))
+			die("failure: btrfs_create_snapshot");
+		if (errno != EPERM)
+			die("failure: errno");
+
+		/*
+		 * The caller is not privileged over the inode so subvolume
+		 * deletion must fail.
+		 */
+
+		/* remove subvolume */
+		if (!btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_delete_subvolume");
+		if (errno != EPERM)
+			die("failure: errno");
+
+		if (!caps_up())
+			die("failure: caps_up");
+
+		/*
+		 * The caller is privileged over the inode so subvolume
+		 * deletion must work.
+		 */
+
+		/* remove subvolume */
+		if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_delete_subvolume");
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_snapshots_fsids_unmapped_userns(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, subvolume_fd = -EBADF, tree_fd = -EBADF,
+	    userns_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* create subvolume for rename test */
+	if (btrfs_create_subvolume(t_dir1_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+
+	/* change ownership of all files to uid 0 */
+	if (fchownat(t_dir1_fd, BTRFS_SUBVOLUME1, 0, 0, 0)) {
+		log_stderr("failure: fchownat");
+		goto out;
+	}
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	/*
+	 * Create the user namespace the caller switches into. Its idmapping
+	 * differs from the one attached to the mount.
+	 */
+	userns_fd = get_userns_fd(0, 30000, 10000);
+	if (userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr,
+			      sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor
+	 * which we can't use with ioctl(). So let's reopen it as a
+	 * proper file descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".",
+			O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	subvolume_fd = openat(tree_fd, BTRFS_SUBVOLUME1,
+			O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (subvolume_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		if (!switch_userns(userns_fd, 0, 0, false))
+			die("failure: switch_userns");
+
+		if (!expected_uid_gid(t_dir1_fd, BTRFS_SUBVOLUME1, 0,
+				      t_overflowuid, t_overflowgid))
+			die("failure: expected_uid_gid");
+
+		if (!expected_uid_gid(open_tree_fd, BTRFS_SUBVOLUME1, 0,
+				      t_overflowuid, t_overflowgid))
+			die("failure: expected_uid_gid");
+
+		/*
+		 * The caller's fsids don't have mappings in the idmapped
+		 * mount so any file creation must fail.
+		 */
+
+		/* create subvolume */
+		if (!btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME2))
+			die("failure: btrfs_create_subvolume");
+		if (errno != EOVERFLOW)
+			die("failure: errno");
+
+		/* create read-write snapshot */
+		if (!btrfs_create_snapshot(subvolume_fd, tree_fd,
+					   BTRFS_SUBVOLUME1_SNAPSHOT1, 0))
+			die("failure: btrfs_create_snapshot");
+		if (errno != EPERM)
+			die("failure: errno");
+
+		/* create read-only snapshot */
+		if (!btrfs_create_snapshot(subvolume_fd, tree_fd,
+					   BTRFS_SUBVOLUME1_SNAPSHOT1_RO,
+					   BTRFS_SUBVOL_RDONLY))
+			die("failure: btrfs_create_snapshot");
+		if (errno != EPERM)
+			die("failure: errno");
+
+		/* try to rename a subvolume */
+		if (!renameat2(open_tree_fd, BTRFS_SUBVOLUME1, open_tree_fd,
+			       BTRFS_SUBVOLUME1_RENAME, 0))
+			die("failure: renameat2");
+		if (errno != EOVERFLOW)
+			die("failure: errno");
+
+		/*
+		 * The caller is not privileged over the inode so subvolume
+		 * deletion must fail.
+		 */
+
+		/* remove subvolume */
+		if (!btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_delete_subvolume");
+		if (errno != EPERM)
+			die("failure: errno");
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	/* remove subvolume */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+		die("failure: btrfs_delete_subvolume");
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(subvolume_fd);
+	safe_close(tree_fd);
+	safe_close(userns_fd);
+
+	return fret;
+}
+
+static int btrfs_subvolumes_fsids_mapped_user_subvol_rm_allowed(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_mnt_scratch_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		if (!switch_fsids(10000, 10000))
+			die("failure: switch fsids");
+
+		if (!caps_down())
+			die("failure: lower caps");
+
+		/*
+		 * The caller's fsids now have mappings in the idmapped mount so
+		 * any file creation must succeed.
+		 */
+
+		/* create subvolume */
+		if (btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_create_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 10000, 10000))
+			die("failure: check ownership");
+
+		/*
+		 * The scratch device is mounted with user_subvol_rm_allowed so
+		 * subvolume deletion must succeed.
+		 */
+		if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_delete_subvolume");
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_subvolumes_fsids_mapped_userns_user_subvol_rm_allowed(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_mnt_scratch_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		if (!switch_userns(attr.userns_fd, 0, 0, false))
+			die("failure: switch_userns");
+
+		/*
+		 * The caller's fsids now have mappings in the idmapped mount so
+		 * any file creation must succeed.
+		 */
+
+		/* create subvolume */
+		if (btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_create_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 0, 0))
+			die("failure: check ownership");
+
+		/*
+		 * The scratch device is mounted with user_subvol_rm_allowed so
+		 * subvolume deletion must succeed.
+		 */
+		if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_delete_subvolume");
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_snapshots_fsids_mapped_user_subvol_rm_allowed(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_mnt_scratch_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		int subvolume_fd = -EBADF;
+
+		if (!switch_fsids(10000, 10000))
+			die("failure: switch fsids");
+
+		if (!caps_down())
+			die("failure: lower caps");
+
+		/*
+		 * The caller's fsids now have mappings in the idmapped mount so
+		 * any file creation must succeed.
+		 */
+
+		/* create subvolume */
+		if (btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_create_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 10000, 10000))
+			die("failure: expected_uid_gid");
+
+		subvolume_fd = openat(tree_fd, BTRFS_SUBVOLUME1,
+				      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (subvolume_fd < 0)
+			die("failure: openat");
+
+		/* create read-write snapshot */
+		if (btrfs_create_snapshot(subvolume_fd, tree_fd,
+					  BTRFS_SUBVOLUME1_SNAPSHOT1, 0))
+			die("failure: btrfs_create_snapshot");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1, 0, 10000, 10000))
+			die("failure: expected_uid_gid");
+
+		/* create read-only snapshot */
+		if (btrfs_create_snapshot(subvolume_fd, tree_fd,
+					  BTRFS_SUBVOLUME1_SNAPSHOT1_RO,
+					  BTRFS_SUBVOL_RDONLY))
+			die("failure: btrfs_create_snapshot");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO, 0, 10000, 10000))
+			die("failure: expected_uid_gid");
+
+		safe_close(subvolume_fd);
+
+		/* remove subvolume */
+		if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_delete_subvolume");
+
+		/* remove read-write snapshot */
+		if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1))
+			die("failure: btrfs_delete_subvolume");
+
+		/* deleting the read-only snapshot must fail */
+		if (!btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO))
+			die("failure: btrfs_delete_subvolume");
+
+		subvolume_fd = openat(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO,
+				      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (subvolume_fd < 0)
+			die("failure: openat");
+
+		if (btrfs_set_subvolume_ro(subvolume_fd, false))
+			die("failure: btrfs_set_subvolume_ro");
+
+		safe_close(subvolume_fd);
+
+		/* remove the snapshot now that it is read-write */
+		if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO))
+			die("failure: btrfs_delete_subvolume");
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_snapshots_fsids_mapped_userns_user_subvol_rm_allowed(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_mnt_scratch_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		int subvolume_fd = -EBADF;
+
+		if (!switch_userns(attr.userns_fd, 0, 0, false))
+			die("failure: switch_userns");
+
+		/* create subvolume */
+		if (btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_create_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 0, 0))
+			die("failure: expected_uid_gid");
+
+		subvolume_fd = openat(tree_fd, BTRFS_SUBVOLUME1,
+				      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (subvolume_fd < 0)
+			die("failure: openat");
+
+		/* create read-write snapshot */
+		if (btrfs_create_snapshot(subvolume_fd, tree_fd,
+					  BTRFS_SUBVOLUME1_SNAPSHOT1, 0))
+			die("failure: btrfs_create_snapshot");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1, 0, 0, 0))
+			die("failure: expected_uid_gid");
+
+		/* create read-only snapshot */
+		if (btrfs_create_snapshot(subvolume_fd, tree_fd,
+					  BTRFS_SUBVOLUME1_SNAPSHOT1_RO,
+					  BTRFS_SUBVOL_RDONLY))
+			die("failure: btrfs_create_snapshot");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO, 0, 0, 0))
+			die("failure: expected_uid_gid");
+
+		/* remove subvolume */
+		if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_delete_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1, 0, 0, 0))
+			die("failure: expected_uid_gid");
+
+		/* remove read-write snapshot */
+		if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1))
+			die("failure: btrfs_delete_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO, 0, 0, 0))
+			die("failure: expected_uid_gid");
+
+		/* deleting the read-only snapshot must fail */
+		if (!btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO))
+			die("failure: btrfs_delete_subvolume");
+
+		subvolume_fd = openat(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO,
+				      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (subvolume_fd < 0)
+			die("failure: openat");
+
+		if (btrfs_set_subvolume_ro(subvolume_fd, false))
+			die("failure: btrfs_set_subvolume_ro");
+
+		safe_close(subvolume_fd);
+
+		/* remove the snapshot now that it is read-write */
+		if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1_RO))
+			die("failure: btrfs_delete_subvolume");
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_delete_by_spec_id(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, subvolume_fd = -EBADF, tree_fd = -EBADF;
+	uint64_t subvolume_id1 = -EINVAL, subvolume_id2 = -EINVAL;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	/* create subvolume */
+	if (btrfs_create_subvolume(t_mnt_scratch_fd, "A")) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+
+	/* create subvolume */
+	if (btrfs_create_subvolume(t_mnt_scratch_fd, "B")) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+
+	subvolume_fd = openat(t_mnt_scratch_fd, "B", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (subvolume_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	/* create subvolume */
+	if (btrfs_create_subvolume(subvolume_fd, "C")) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+
+	safe_close(subvolume_fd);
+
+	subvolume_fd = openat(t_mnt_scratch_fd, "A", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (subvolume_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	if (btrfs_get_subvolume_id(subvolume_fd, &subvolume_id1)) {
+		log_stderr("failure: btrfs_get_subvolume_id");
+		goto out;
+	}
+
+	subvolume_fd = openat(t_mnt_scratch_fd, "B/C", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (subvolume_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	if (btrfs_get_subvolume_id(subvolume_fd, &subvolume_id2)) {
+		log_stderr("failure: btrfs_get_subvolume_id");
+		goto out;
+	}
+
+	if (sys_mount(t_device_scratch, t_mountpoint, "btrfs", 0, "subvol=B/C")) {
+		log_stderr("failure: mount");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(-EBADF, t_mountpoint,
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		/*
+		 * The subvolume isn't exposed in the idmapped mount so
+		 * deletion via spec id must fail.
+		 */
+		if (!btrfs_delete_subvolume_id(tree_fd, subvolume_id1))
+			die("failure: btrfs_delete_subvolume_id");
+		if (errno != EINVAL)
+			die("failure: errno");
+
+		if (btrfs_delete_subvolume_id(t_mnt_scratch_fd, subvolume_id1))
+			die("failure: btrfs_delete_subvolume_id");
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+	sys_umount2(t_mountpoint, MNT_DETACH);
+	btrfs_delete_subvolume_id(t_mnt_scratch_fd, subvolume_id2);
+	btrfs_delete_subvolume(t_mnt_scratch_fd, "B");
+
+	return fret;
+}
+
+static int btrfs_subvolumes_setflags_fsids_mapped(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		int subvolume_fd = -EBADF;
+		bool read_only = false;
+
+		if (!switch_fsids(10000, 10000))
+			die("failure: switch fsids");
+
+		if (!caps_down())
+			die("failure: lower caps");
+
+		/* The caller's fsids now have mappings in the idmapped mount so
+		 * any file creation must succeed.
+		 */
+
+		/* create subvolume */
+		if (btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_create_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 10000, 10000))
+			die("failure: expected_uid_gid");
+
+		subvolume_fd = openat(tree_fd, BTRFS_SUBVOLUME1,
+				      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (subvolume_fd < 0)
+			die("failure: openat");
+
+		if (btrfs_get_subvolume_ro(subvolume_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (read_only)
+			die("failure: read_only");
+
+		if (btrfs_set_subvolume_ro(subvolume_fd, true))
+			die("failure: btrfs_set_subvolume_ro");
+
+		if (btrfs_get_subvolume_ro(subvolume_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (!read_only)
+			die("failure: not read_only");
+
+		if (btrfs_set_subvolume_ro(subvolume_fd, false))
+			die("failure: btrfs_set_subvolume_ro");
+
+		if (btrfs_get_subvolume_ro(subvolume_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (read_only)
+			die("failure: read_only");
+
+		safe_close(subvolume_fd);
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	/* remove subvolume */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_subvolumes_setflags_fsids_mapped_userns(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		int subvolume_fd = -EBADF;
+		bool read_only = false;
+
+		if (!switch_userns(attr.userns_fd, 0, 0, false))
+			die("failure: switch_userns");
+
+		/* The caller's fsids now have mappings in the idmapped mount so
+		 * any file creation must succeed.
+		 */
+
+		/* create subvolume */
+		if (btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_create_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 0, 0))
+			die("failure: expected_uid_gid");
+
+		subvolume_fd = openat(tree_fd, BTRFS_SUBVOLUME1,
+				      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (subvolume_fd < 0)
+			die("failure: openat");
+
+		if (btrfs_get_subvolume_ro(subvolume_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (read_only)
+			die("failure: read_only");
+
+		if (btrfs_set_subvolume_ro(subvolume_fd, true))
+			die("failure: btrfs_set_subvolume_ro");
+
+		if (btrfs_get_subvolume_ro(subvolume_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (!read_only)
+			die("failure: not read_only");
+
+		if (btrfs_set_subvolume_ro(subvolume_fd, false))
+			die("failure: btrfs_set_subvolume_ro");
+
+		if (btrfs_get_subvolume_ro(subvolume_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (read_only)
+			die("failure: read_only");
+
+		safe_close(subvolume_fd);
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	/* remove subvolume */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_subvolumes_setflags_fsids_unmapped(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	/* create subvolume */
+	if (btrfs_create_subvolume(t_dir1_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+
+	if (!expected_uid_gid(t_dir1_fd, BTRFS_SUBVOLUME1, 0, 0, 0)) {
+		log_stderr("failure: expected_uid_gid");
+		goto out;
+	}
+
+	if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 10000, 10000)) {
+		log_stderr("failure: expected_uid_gid");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		int subvolume_fd = -EBADF;
+		bool read_only = false;
+
+		subvolume_fd = openat(tree_fd, BTRFS_SUBVOLUME1,
+				      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (subvolume_fd < 0)
+			die("failure: openat");
+
+		if (!switch_fsids(0, 0))
+			die("failure: switch fsids");
+
+		if (!caps_down())
+			die("failure: lower caps");
+
+		/*
+		 * The caller's fsids don't have mappings in the idmapped mount
+		 * so setting the read-only flag must fail.
+		 */
+
+		if (btrfs_get_subvolume_ro(subvolume_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (read_only)
+			die("failure: read_only");
+
+		if (!btrfs_set_subvolume_ro(subvolume_fd, true))
+			die("failure: btrfs_set_subvolume_ro");
+		if (errno != EPERM)
+			die("failure: errno");
+
+		safe_close(subvolume_fd);
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	/* remove subvolume */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_subvolumes_setflags_fsids_unmapped_userns(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF, userns_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	/* A second userns whose ids have no mapping in the idmapped mount. */
+	userns_fd = get_userns_fd(0, 30000, 10000);
+	if (userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	/* create subvolume */
+	if (btrfs_create_subvolume(t_dir1_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+
+	if (!expected_uid_gid(t_dir1_fd, BTRFS_SUBVOLUME1, 0, 0, 0)) {
+		log_stderr("failure: expected_uid_gid");
+		goto out;
+	}
+
+	if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 10000, 10000)) {
+		log_stderr("failure: expected_uid_gid");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		int subvolume_fd = -EBADF;
+		bool read_only = false;
+
+		/*
+		 * The caller's fsids don't have mappings in the idmapped mount
+		 * so setting the read-only flag must fail.
+		 */
+
+		subvolume_fd = openat(tree_fd, BTRFS_SUBVOLUME1,
+				      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (subvolume_fd < 0)
+			die("failure: openat");
+
+		if (!switch_userns(userns_fd, 0, 0, false))
+			die("failure: switch_userns");
+
+		if (!expected_uid_gid(t_dir1_fd, BTRFS_SUBVOLUME1, 0,
+				      t_overflowuid, t_overflowgid))
+			die("failure: expected_uid_gid");
+
+		if (!expected_uid_gid(open_tree_fd, BTRFS_SUBVOLUME1, 0,
+				      t_overflowuid, t_overflowgid))
+			die("failure: expected_uid_gid");
+
+		if (btrfs_get_subvolume_ro(subvolume_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (read_only)
+			die("failure: read_only");
+
+		if (!btrfs_set_subvolume_ro(subvolume_fd, true))
+			die("failure: btrfs_set_subvolume_ro");
+		if (errno != EPERM)
+			die("failure: errno");
+
+		safe_close(subvolume_fd);
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	/* remove subvolume */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+	safe_close(userns_fd);
+
+	return fret;
+}
+
+static int btrfs_snapshots_setflags_fsids_mapped(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		int snapshot_fd = -EBADF, subvolume_fd = -EBADF;
+		bool read_only = false;
+
+		if (!switch_fsids(10000, 10000))
+			die("failure: switch fsids");
+
+		if (!caps_down())
+			die("failure: lower caps");
+
+		/*
+		 * The caller's fsids now have mappings in the idmapped mount
+		 * so any file creation must succeed.
+		 */
+
+		/* create subvolume */
+		if (btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_create_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 10000, 10000))
+			die("failure: expected_uid_gid");
+
+		subvolume_fd = openat(tree_fd, BTRFS_SUBVOLUME1,
+				      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (subvolume_fd < 0)
+			die("failure: openat");
+
+		/* create read-write snapshot */
+		if (btrfs_create_snapshot(subvolume_fd, tree_fd,
+					  BTRFS_SUBVOLUME1_SNAPSHOT1, 0))
+			die("failure: btrfs_create_snapshot");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1, 0, 10000, 10000))
+			die("failure: expected_uid_gid");
+
+		snapshot_fd = openat(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1,
+				     O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (snapshot_fd < 0)
+			die("failure: openat");
+
+		if (btrfs_get_subvolume_ro(snapshot_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (read_only)
+			die("failure: read_only");
+
+		if (btrfs_set_subvolume_ro(snapshot_fd, true))
+			die("failure: btrfs_set_subvolume_ro");
+
+		if (btrfs_get_subvolume_ro(snapshot_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (!read_only)
+			die("failure: not read_only");
+
+		if (btrfs_set_subvolume_ro(snapshot_fd, false))
+			die("failure: btrfs_set_subvolume_ro");
+
+		if (btrfs_get_subvolume_ro(snapshot_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (read_only)
+			die("failure: read_only");
+
+		safe_close(snapshot_fd);
+		safe_close(subvolume_fd);
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	/* remove subvolume */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	/* remove snapshot */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_snapshots_setflags_fsids_mapped_userns(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		int snapshot_fd = -EBADF, subvolume_fd = -EBADF;
+		bool read_only = false;
+
+		if (!switch_userns(attr.userns_fd, 0, 0, false))
+			die("failure: switch_userns");
+
+		/*
+		 * The caller's fsids now have mappings in the idmapped mount so
+		 * any file creation must succeed.
+		 */
+
+		/* create subvolume */
+		if (btrfs_create_subvolume(tree_fd, BTRFS_SUBVOLUME1))
+			die("failure: btrfs_create_subvolume");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 0, 0))
+			die("failure: expected_uid_gid");
+
+		subvolume_fd = openat(tree_fd, BTRFS_SUBVOLUME1,
+				      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (subvolume_fd < 0)
+			die("failure: openat");
+
+		/* create read-write snapshot */
+		if (btrfs_create_snapshot(subvolume_fd, tree_fd,
+					  BTRFS_SUBVOLUME1_SNAPSHOT1, 0))
+			die("failure: btrfs_create_snapshot");
+
+		if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1, 0, 0, 0))
+			die("failure: expected_uid_gid");
+
+		snapshot_fd = openat(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1,
+				     O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (snapshot_fd < 0)
+			die("failure: openat");
+
+		if (btrfs_get_subvolume_ro(snapshot_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (read_only)
+			die("failure: read_only");
+
+		if (btrfs_set_subvolume_ro(snapshot_fd, true))
+			die("failure: btrfs_set_subvolume_ro");
+
+		if (btrfs_get_subvolume_ro(snapshot_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (!read_only)
+			die("failure: not read_only");
+
+		if (btrfs_set_subvolume_ro(snapshot_fd, false))
+			die("failure: btrfs_set_subvolume_ro");
+
+		if (btrfs_get_subvolume_ro(snapshot_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (read_only)
+			die("failure: read_only");
+
+		safe_close(snapshot_fd);
+		safe_close(subvolume_fd);
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	/* remove subvolume */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	/* remove snapshot */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_snapshots_setflags_fsids_unmapped(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, subvolume_fd = -EBADF, tree_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	/* create subvolume */
+	if (btrfs_create_subvolume(t_dir1_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+
+	if (!expected_uid_gid(t_dir1_fd, BTRFS_SUBVOLUME1, 0, 0, 0)) {
+		log_stderr("failure: expected_uid_gid");
+		goto out;
+	}
+
+	if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 10000, 10000)) {
+		log_stderr("failure: expected_uid_gid");
+		goto out;
+	}
+
+	subvolume_fd = openat(t_dir1_fd, BTRFS_SUBVOLUME1,
+			      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (subvolume_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	/* create read-write snapshot */
+	if (btrfs_create_snapshot(subvolume_fd, t_dir1_fd,
+				  BTRFS_SUBVOLUME1_SNAPSHOT1, 0)) {
+		log_stderr("failure: btrfs_create_snapshot");
+		goto out;
+	}
+
+	if (!expected_uid_gid(t_dir1_fd, BTRFS_SUBVOLUME1_SNAPSHOT1, 0, 0, 0)) {
+		log_stderr("failure: expected_uid_gid");
+		goto out;
+	}
+
+	if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1, 0, 10000, 10000)) {
+		log_stderr("failure: expected_uid_gid");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		int snapshot_fd = -EBADF;
+		bool read_only = false;
+
+		snapshot_fd = openat(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1,
+				     O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (snapshot_fd < 0)
+			die("failure: openat");
+
+		if (!switch_fsids(0, 0))
+			die("failure: switch fsids");
+
+		if (!caps_down())
+			die("failure: lower caps");
+
+		/*
+		 * The caller's fsids don't have mappings in the idmapped mount
+		 * so setting the read-only flag must fail with EPERM.
+		 */
+
+		if (btrfs_get_subvolume_ro(snapshot_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (read_only)
+			die("failure: read_only");
+
+		if (!btrfs_set_subvolume_ro(snapshot_fd, true))
+			die("failure: btrfs_set_subvolume_ro");
+		if (errno != EPERM)
+			die("failure: errno");
+
+		safe_close(snapshot_fd);
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	/* delete subvolume */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	/* delete snapshot */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(subvolume_fd);
+	safe_close(tree_fd);
+
+	return fret;
+}
+
+static int btrfs_snapshots_setflags_fsids_unmapped_userns(void)
+{
+	int fret = -1;
+	int open_tree_fd = -EBADF, subvolume_fd = -EBADF, tree_fd = -EBADF,
+	    userns_fd = -EBADF;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+
+	if (!caps_supported())
+		return 0;
+
+	/* Changing mount properties on a detached mount. */
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	/* User namespace whose ids have no mapping in the idmapped mount. */
+	userns_fd = get_userns_fd(0, 30000, 10000);
+	if (userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(t_dir1_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	/* create subvolume */
+	if (btrfs_create_subvolume(t_dir1_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+
+	if (!expected_uid_gid(t_dir1_fd, BTRFS_SUBVOLUME1, 0, 0, 0)) {
+		log_stderr("failure: expected_uid_gid");
+		goto out;
+	}
+
+	if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1, 0, 10000, 10000)) {
+		log_stderr("failure: expected_uid_gid");
+		goto out;
+	}
+
+	subvolume_fd = openat(t_dir1_fd, BTRFS_SUBVOLUME1,
+			      O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (subvolume_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	/* create read-write snapshot */
+	if (btrfs_create_snapshot(subvolume_fd, t_dir1_fd,
+				  BTRFS_SUBVOLUME1_SNAPSHOT1, 0)) {
+		log_stderr("failure: btrfs_create_snapshot");
+		goto out;
+	}
+
+	if (!expected_uid_gid(t_dir1_fd, BTRFS_SUBVOLUME1_SNAPSHOT1, 0, 0, 0)) {
+		log_stderr("failure: expected_uid_gid");
+		goto out;
+	}
+
+	if (!expected_uid_gid(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1, 0, 10000, 10000)) {
+		log_stderr("failure: expected_uid_gid");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		int snapshot_fd = -EBADF;
+		bool read_only = false;
+
+		/*
+		 * Open the snapshot through the idmapped mount before
+		 * switching into the user namespace.
+		 */
+
+		snapshot_fd = openat(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1,
+				     O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+		if (snapshot_fd < 0)
+			die("failure: openat");
+
+
+		if (!switch_userns(userns_fd, 0, 0, false))
+			die("failure: switch_userns");
+
+		if (!expected_uid_gid(t_dir1_fd, BTRFS_SUBVOLUME1, 0,
+				      t_overflowuid, t_overflowgid))
+			die("failure: expected_uid_gid");
+
+		if (!expected_uid_gid(open_tree_fd, BTRFS_SUBVOLUME1, 0,
+				      t_overflowuid, t_overflowgid))
+			die("failure: expected_uid_gid");
+
+		/*
+		 * The caller's fsids don't have mappings in the idmapped mount
+		 * so setting the read-only flag must fail with EPERM.
+		 */
+
+		if (btrfs_get_subvolume_ro(snapshot_fd, &read_only))
+			die("failure: btrfs_get_subvolume_ro");
+
+		if (read_only)
+			die("failure: read_only");
+
+		if (!btrfs_set_subvolume_ro(snapshot_fd, true))
+			die("failure: btrfs_set_subvolume_ro");
+		if (errno != EPERM)
+			die("failure: errno");
+
+		safe_close(snapshot_fd);
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	/* delete subvolume */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	/* delete snapshot */
+	if (btrfs_delete_subvolume(tree_fd, BTRFS_SUBVOLUME1_SNAPSHOT1)) {
+		log_stderr("failure: btrfs_delete_subvolume");
+		goto out;
+	}
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(attr.userns_fd);
+	safe_close(open_tree_fd);
+	safe_close(subvolume_fd);
+	safe_close(tree_fd);
+	safe_close(userns_fd);
+
+	return fret;
+}
+
+#define BTRFS_SUBVOLUME_SUBVOL1 "subvol1"
+#define BTRFS_SUBVOLUME_SUBVOL2 "subvol2"
+#define BTRFS_SUBVOLUME_SUBVOL3 "subvol3"
+#define BTRFS_SUBVOLUME_SUBVOL4 "subvol4"
+
+#define BTRFS_SUBVOLUME_SUBVOL1_ID 0
+#define BTRFS_SUBVOLUME_SUBVOL2_ID 1
+#define BTRFS_SUBVOLUME_SUBVOL3_ID 2
+#define BTRFS_SUBVOLUME_SUBVOL4_ID 3
+
+#define BTRFS_SUBVOLUME_DIR1 "dir1"
+#define BTRFS_SUBVOLUME_DIR2 "dir2"
+
+#define BTRFS_SUBVOLUME_MNT "mnt_subvolume1"
+
+#define BTRFS_SUBVOLUME_SUBVOL1xSUBVOL3 "subvol1/subvol3"
+#define BTRFS_SUBVOLUME_SUBVOL1xDIR1xDIR2 "subvol1/dir1/dir2"
+#define BTRFS_SUBVOLUME_SUBVOL1xDIR1xDIR2xSUBVOL4 "subvol1/dir1/dir2/subvol4"
+
+/*
+ * We create the following mount layout to test lookup:
+ *
+ * |-/mnt/test                    /dev/loop0                   btrfs       rw,relatime,space_cache,subvolid=5,subvol=/
+ * | |-/mnt/test/mnt1             /dev/loop1[/subvol1]         btrfs       rw,relatime,space_cache,user_subvol_rm_allowed,subvolid=268,subvol=/subvol1
+ * '-/mnt/scratch                 /dev/loop1                   btrfs       rw,relatime,space_cache,user_subvol_rm_allowed,subvolid=5,subvol=/
+ */
+static int btrfs_subvolume_lookup_user(void)
+{
+	int fret = -1;
+	int dir1_fd = -EBADF, dir2_fd = -EBADF, mnt_fd = -EBADF,
+	    open_tree_fd = -EBADF, tree_fd = -EBADF, userns_fd = -EBADF;
+	int subvolume_fds[BTRFS_SUBVOLUME_SUBVOL4_ID + 1];
+	uint64_t subvolume_ids[BTRFS_SUBVOLUME_SUBVOL4_ID + 1];
+	uint64_t subvolid = -EINVAL;
+	struct mount_attr attr = {
+		.attr_set = MOUNT_ATTR_IDMAP,
+	};
+	pid_t pid;
+	struct btrfs_iter *iter;
+
+	if (!caps_supported())
+		return 0;
+
+	for (int i = 0; i < ARRAY_SIZE(subvolume_fds); i++)
+		subvolume_fds[i] = -EBADF;
+
+	for (int i = 0; i < ARRAY_SIZE(subvolume_ids); i++)
+		subvolume_ids[i] = -EINVAL;
+
+	if (btrfs_create_subvolume(t_mnt_scratch_fd, BTRFS_SUBVOLUME_SUBVOL1)) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+
+	if (btrfs_create_subvolume(t_mnt_scratch_fd, BTRFS_SUBVOLUME_SUBVOL2)) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+
+	subvolume_fds[BTRFS_SUBVOLUME_SUBVOL1_ID] = openat(t_mnt_scratch_fd,
+							   BTRFS_SUBVOLUME_SUBVOL1,
+							   O_CLOEXEC | O_DIRECTORY);
+	if (subvolume_fds[BTRFS_SUBVOLUME_SUBVOL1_ID] < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	/* create subvolume */
+	if (btrfs_create_subvolume(subvolume_fds[BTRFS_SUBVOLUME_SUBVOL1_ID], BTRFS_SUBVOLUME_SUBVOL3)) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+
+	if (mkdirat(subvolume_fds[BTRFS_SUBVOLUME_SUBVOL1_ID], BTRFS_SUBVOLUME_DIR1, 0777)) {
+		log_stderr("failure: mkdirat");
+		goto out;
+	}
+
+	dir1_fd = openat(subvolume_fds[BTRFS_SUBVOLUME_SUBVOL1_ID], BTRFS_SUBVOLUME_DIR1,
+			 O_CLOEXEC | O_DIRECTORY);
+	if (dir1_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	if (mkdirat(dir1_fd, BTRFS_SUBVOLUME_DIR2, 0777)) {
+		log_stderr("failure: mkdirat");
+		goto out;
+	}
+
+	dir2_fd = openat(dir1_fd, BTRFS_SUBVOLUME_DIR2, O_CLOEXEC | O_DIRECTORY);
+	if (dir2_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	if (btrfs_create_subvolume(dir2_fd, BTRFS_SUBVOLUME_SUBVOL4)) {
+		log_stderr("failure: btrfs_create_subvolume");
+		goto out;
+	}
+
+	if (mkdirat(t_mnt_fd, BTRFS_SUBVOLUME_MNT, 0777)) {
+		log_stderr("failure: mkdirat");
+		goto out;
+	}
+
+	snprintf(t_buf, sizeof(t_buf), "%s/%s", t_mountpoint, BTRFS_SUBVOLUME_MNT);
+	if (sys_mount(t_device_scratch, t_buf, "btrfs", 0,
+		      "subvol=" BTRFS_SUBVOLUME_SUBVOL1)) {
+		log_stderr("failure: mount");
+		goto out;
+	}
+
+	mnt_fd = openat(t_mnt_fd, BTRFS_SUBVOLUME_MNT, O_CLOEXEC | O_DIRECTORY);
+	if (mnt_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	if (chown_r(t_mnt_scratch_fd, ".", 1000, 1000)) {
+		log_stderr("failure: chown_r");
+		goto out;
+	}
+
+	subvolume_fds[BTRFS_SUBVOLUME_SUBVOL2_ID] = openat(t_mnt_scratch_fd,
+							   BTRFS_SUBVOLUME_SUBVOL2,
+							   O_CLOEXEC | O_DIRECTORY);
+	if (subvolume_fds[BTRFS_SUBVOLUME_SUBVOL2_ID] < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	if (btrfs_get_subvolume_id(subvolume_fds[BTRFS_SUBVOLUME_SUBVOL1_ID],
+				   &subvolume_ids[BTRFS_SUBVOLUME_SUBVOL1_ID])) {
+		log_stderr("failure: btrfs_get_subvolume_id");
+		goto out;
+	}
+
+	if (btrfs_get_subvolume_id(subvolume_fds[BTRFS_SUBVOLUME_SUBVOL2_ID],
+				   &subvolume_ids[BTRFS_SUBVOLUME_SUBVOL2_ID])) {
+		log_stderr("failure: btrfs_get_subvolume_id");
+		goto out;
+	}
+
+	subvolume_fds[BTRFS_SUBVOLUME_SUBVOL3_ID] = openat(t_mnt_scratch_fd,
+							   BTRFS_SUBVOLUME_SUBVOL1xSUBVOL3,
+							   O_CLOEXEC | O_DIRECTORY);
+	if (subvolume_fds[BTRFS_SUBVOLUME_SUBVOL3_ID] < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	if (btrfs_get_subvolume_id(subvolume_fds[BTRFS_SUBVOLUME_SUBVOL3_ID],
+				   &subvolume_ids[BTRFS_SUBVOLUME_SUBVOL3_ID])) {
+		log_stderr("failure: btrfs_get_subvolume_id");
+		goto out;
+	}
+
+	subvolume_fds[BTRFS_SUBVOLUME_SUBVOL4_ID] = openat(t_mnt_scratch_fd,
+							   BTRFS_SUBVOLUME_SUBVOL1xDIR1xDIR2xSUBVOL4,
+							   O_CLOEXEC | O_DIRECTORY);
+	if (subvolume_fds[BTRFS_SUBVOLUME_SUBVOL4_ID] < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	if (btrfs_get_subvolume_id(subvolume_fds[BTRFS_SUBVOLUME_SUBVOL4_ID],
+				   &subvolume_ids[BTRFS_SUBVOLUME_SUBVOL4_ID])) {
+		log_stderr("failure: btrfs_get_subvolume_id");
+		goto out;
+	}
+
+
+	if (fchmod(subvolume_fds[BTRFS_SUBVOLUME_SUBVOL3_ID], S_IRUSR | S_IWUSR | S_IXUSR)) {
+		log_stderr("failure: fchmod");
+		goto out;
+	}
+
+	if (fchmod(subvolume_fds[BTRFS_SUBVOLUME_SUBVOL4_ID], S_IRUSR | S_IWUSR | S_IXUSR)) {
+		log_stderr("failure: fchmod");
+		goto out;
+	}
+
+	attr.userns_fd	= get_userns_fd(0, 10000, 10000);
+	if (attr.userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	open_tree_fd = sys_open_tree(mnt_fd, "",
+				     AT_EMPTY_PATH |
+				     AT_NO_AUTOMOUNT |
+				     AT_SYMLINK_NOFOLLOW |
+				     OPEN_TREE_CLOEXEC |
+				     OPEN_TREE_CLONE);
+	if (open_tree_fd < 0) {
+		log_stderr("failure: sys_open_tree");
+		goto out;
+	}
+
+	if (sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr))) {
+		log_stderr("failure: sys_mount_setattr");
+		goto out;
+	}
+
+	/*
+	 * The open_tree() syscall returns an O_PATH file descriptor which we
+	 * can't use with ioctl(). So let's reopen it as a proper file
+	 * descriptor.
+	 */
+	tree_fd = openat(open_tree_fd, ".", O_RDONLY | O_CLOEXEC | O_DIRECTORY);
+	if (tree_fd < 0) {
+		log_stderr("failure: openat");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		bool subvolume3_found = false, subvolume4_found = false;
+
+		if (!switch_fsids(11000, 11000))
+			die("failure: switch fsids");
+
+		if (!caps_down())
+			die("failure: lower caps");
+
+		if (btrfs_iterator_start(tree_fd, 0, &iter))
+			die("failure: btrfs_iterator_start");
+
+		for (;;) {
+			char *subvol_path = NULL;
+			int ret;
+
+			ret = btrfs_iterator_next(iter, &subvol_path, &subvolid);
+			if (ret == 1)
+				break;
+			else if (ret)
+				die("failure: btrfs_iterator_next");
+
+			if (subvolid != subvolume_ids[BTRFS_SUBVOLUME_SUBVOL3_ID] &&
+			    subvolid != subvolume_ids[BTRFS_SUBVOLUME_SUBVOL4_ID])
+				die("failure: subvolume id %llu->%s",
+				    (long long unsigned)subvolid, subvol_path);
+
+			if (subvolid == subvolume_ids[BTRFS_SUBVOLUME_SUBVOL3_ID])
+				subvolume3_found = true;
+
+			if (subvolid == subvolume_ids[BTRFS_SUBVOLUME_SUBVOL4_ID])
+				subvolume4_found = true;
+
+			free(subvol_path);
+		}
+		btrfs_iterator_end(iter);
+
+		if (!subvolume3_found || !subvolume4_found)
+			die("failure: subvolume id");
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		bool subvolume3_found = false, subvolume4_found = false;
+
+		if (!switch_userns(attr.userns_fd, 0, 0, false))
+			die("failure: switch_userns");
+
+		if (btrfs_iterator_start(tree_fd, 0, &iter))
+			die("failure: btrfs_iterator_start");
+
+		for (;;) {
+			char *subvol_path = NULL;
+			int ret;
+
+			ret = btrfs_iterator_next(iter, &subvol_path, &subvolid);
+			if (ret == 1)
+				break;
+			else if (ret)
+				die("failure: btrfs_iterator_next");
+
+			if (subvolid != subvolume_ids[BTRFS_SUBVOLUME_SUBVOL3_ID] &&
+			    subvolid != subvolume_ids[BTRFS_SUBVOLUME_SUBVOL4_ID])
+				die("failure: subvolume id %llu->%s",
+				    (long long unsigned)subvolid, subvol_path);
+
+			if (subvolid == subvolume_ids[BTRFS_SUBVOLUME_SUBVOL3_ID])
+				subvolume3_found = true;
+
+			if (subvolid == subvolume_ids[BTRFS_SUBVOLUME_SUBVOL4_ID])
+				subvolume4_found = true;
+
+			free(subvol_path);
+		}
+		btrfs_iterator_end(iter);
+
+		if (!subvolume3_found || !subvolume4_found)
+			die("failure: subvolume id");
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		bool subvolume_found = false;
+
+		if (!switch_fsids(0, 0))
+			die("failure: switch fsids");
+
+		if (!caps_down())
+			die("failure: lower caps");
+
+		if (btrfs_iterator_start(tree_fd, 0, &iter))
+			die("failure: btrfs_iterator_start");
+
+		for (;;) {
+			char *subvol_path = NULL;
+			int ret;
+
+			ret = btrfs_iterator_next(iter, &subvol_path, &subvolid);
+			if (ret == 1)
+				break;
+			else if (ret)
+				die("failure: btrfs_iterator_next");
+
+			free(subvol_path);
+
+			subvolume_found = true;
+			break;
+		}
+		btrfs_iterator_end(iter);
+
+		if (subvolume_found)
+			die("failure: subvolume id");
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	userns_fd = get_userns_fd(0, 30000, 10000);
+	if (userns_fd < 0) {
+		log_stderr("failure: get_userns_fd");
+		goto out;
+	}
+
+	pid = fork();
+	if (pid < 0) {
+		log_stderr("failure: fork");
+		goto out;
+	}
+	if (pid == 0) {
+		bool subvolume_found = false;
+
+		if (!switch_userns(userns_fd, 0, 0, true))
+			die("failure: switch_userns");
+
+		if (btrfs_iterator_start(tree_fd, 0, &iter))
+			die("failure: btrfs_iterator_start");
+
+		for (;;) {
+			char *subvol_path = NULL;
+			int ret;
+
+			ret = btrfs_iterator_next(iter, &subvol_path, &subvolid);
+			if (ret == 1)
+				break;
+			else if (ret)
+				die("failure: btrfs_iterator_next");
+
+			free(subvol_path);
+
+			subvolume_found = true;
+			break;
+		}
+		btrfs_iterator_end(iter);
+
+		if (subvolume_found)
+			die("failure: subvolume id");
+
+		exit(EXIT_SUCCESS);
+	}
+	if (wait_for_pid(pid))
+		goto out;
+
+	fret = 0;
+	log_debug("Ran test");
+out:
+	safe_close(dir1_fd);
+	safe_close(dir2_fd);
+	safe_close(open_tree_fd);
+	safe_close(tree_fd);
+	safe_close(userns_fd);
+	for (int i = 0; i < ARRAY_SIZE(subvolume_fds); i++)
+		safe_close(subvolume_fds[i]);
+	snprintf(t_buf, sizeof(t_buf), "%s/%s", t_mountpoint, BTRFS_SUBVOLUME_MNT);
+	sys_umount2(t_buf, MNT_DETACH);
+	unlinkat(t_mnt_fd, BTRFS_SUBVOLUME_MNT, AT_REMOVEDIR);
+
+	return fret;
+}
+
+static void usage(void)
+{
+	fprintf(stderr, "Description:\n");
+	fprintf(stderr, "    Run idmapped mount tests\n\n");
+
+	fprintf(stderr, "Arguments:\n");
+	fprintf(stderr, "--device                     Device used in the tests\n");
+	fprintf(stderr, "--fstype                     Filesystem type used in the tests\n");
+	fprintf(stderr, "--help                       Print help\n");
+	fprintf(stderr, "--mountpoint                 Mountpoint of device\n");
+	fprintf(stderr, "--supported                  Test whether idmapped mounts are supported on this filesystem\n");
+	fprintf(stderr, "--test-core                  Run core idmapped mount testsuite\n");
+	fprintf(stderr, "--test-fscaps-regression     Run fscap regression tests\n");
+	fprintf(stderr, "--scratch-mountpoint         Mountpoint of scratch device used in the tests\n");
+	fprintf(stderr, "--scratch-device             Scratch device used in the tests\n");
+
+	_exit(EXIT_SUCCESS);
+}
+
+static const struct option longopts[] = {
+	{"device",			required_argument,	0,	 1},
+	{"fstype",			required_argument,	0,	 2},
+	{"mountpoint",			required_argument,	0,	 3},
+	{"supported",			no_argument,		0,	 4},
+	{"help",			no_argument,		0,	 5},
+	{"test-core",			no_argument,		0,	 6},
+	{"test-fscaps-regression",	no_argument,		0,	 7},
+	{"test-nested-userns",		no_argument,		0,	 8},
+	{"test-btrfs",			no_argument,		0,	 9},
+	{"scratch-mountpoint",		required_argument,	0,	10},
+	{"scratch-device",		required_argument,	0,	11},
+	{NULL,				0,			0,	 0},
+};
+
+struct t_idmapped_mounts {
+	int (*test)(void);
+	const char *description;
+} basic_suite[] = {
+	{ acls,								"posix acls on regular mounts",									},
+	{ create_in_userns,						"create operations in user namespace",								},
+	{ device_node_in_userns,					"device node in user namespace",								},
+	{ expected_uid_gid_idmapped_mounts,				"expected ownership on idmapped mounts",							},
+	{ fscaps,							"fscaps on regular mounts",									},
+	{ fscaps_idmapped_mounts,					"fscaps on idmapped mounts",									},
+	{ fscaps_idmapped_mounts_in_userns,				"fscaps on idmapped mounts in user namespace",							},
+	{ fscaps_idmapped_mounts_in_userns_separate_userns,		"fscaps on idmapped mounts in user namespace with different id mappings",			},
+	{ fsids_mapped,							"mapped fsids",											},
+	{ fsids_unmapped,						"unmapped fsids",										},
+	{ hardlink_crossing_mounts,					"cross mount hardlink",										},
+	{ hardlink_crossing_idmapped_mounts,				"cross idmapped mount hardlink",								},
+	{ hardlink_from_idmapped_mount,					"hardlinks from idmapped mounts",								},
+	{ hardlink_from_idmapped_mount_in_userns,			"hardlinks from idmapped mounts in user namespace",						},
+#ifdef HAVE_LIBURING_H
+	{ io_uring,							"io_uring",											},
+	{ io_uring_userns,						"io_uring in user namespace",									},
+	{ io_uring_idmapped,						"io_uring from idmapped mounts",								},
+	{ io_uring_idmapped_userns,					"io_uring from idmapped mounts in user namespace",						},
+	{ io_uring_idmapped_unmapped,					"io_uring from idmapped mounts with unmapped ids",						},
+	{ io_uring_idmapped_unmapped_userns,				"io_uring from idmapped mounts with unmapped ids in user namespace",				},
+#endif
+	{ protected_symlinks,						"following protected symlinks on regular mounts",						},
+	{ protected_symlinks_idmapped_mounts,				"following protected symlinks on idmapped mounts",						},
+	{ protected_symlinks_idmapped_mounts_in_userns,			"following protected symlinks on idmapped mounts in user namespace",				},
+	{ rename_crossing_mounts,					"cross mount rename",										},
+	{ rename_crossing_idmapped_mounts,				"cross idmapped mount rename",									},
+	{ rename_from_idmapped_mount,					"rename from idmapped mounts",									},
+	{ rename_from_idmapped_mount_in_userns,				"rename from idmapped mounts in user namespace",						},
+	{ setattr_truncate,						"setattr truncate",										},
+	{ setattr_truncate_idmapped,					"setattr truncate on idmapped mounts",								},
+	{ setattr_truncate_idmapped_in_userns,				"setattr truncate on idmapped mounts in user namespace",					},
+	{ setgid_create,						"create operations in directories with setgid bit set",						},
+	{ setgid_create_idmapped,					"create operations in directories with setgid bit set on idmapped mounts",			},
+	{ setgid_create_idmapped_in_userns,				"create operations in directories with setgid bit set on idmapped mounts in user namespace",	},
+	{ setid_binaries,						"setid binaries on regular mounts",								},
+	{ setid_binaries_idmapped_mounts,				"setid binaries on idmapped mounts",								},
+	{ setid_binaries_idmapped_mounts_in_userns,			"setid binaries on idmapped mounts in user namespace",						},
+	{ setid_binaries_idmapped_mounts_in_userns_separate_userns,	"setid binaries on idmapped mounts in user namespace with different id mappings",		},
+	{ sticky_bit_unlink,						"sticky bit unlink operations on regular mounts",						},
+	{ sticky_bit_unlink_idmapped_mounts,				"sticky bit unlink operations on idmapped mounts",						},
+	{ sticky_bit_unlink_idmapped_mounts_in_userns,			"sticky bit unlink operations on idmapped mounts in user namespace",				},
+	{ sticky_bit_rename,						"sticky bit rename operations on regular mounts",						},
+	{ sticky_bit_rename_idmapped_mounts,				"sticky bit rename operations on idmapped mounts",						},
+	{ sticky_bit_rename_idmapped_mounts_in_userns,			"sticky bit rename operations on idmapped mounts in user namespace",				},
+	{ symlink_regular_mounts,					"symlink from regular mounts",									},
+	{ symlink_idmapped_mounts,					"symlink from idmapped mounts",									},
+	{ symlink_idmapped_mounts_in_userns,				"symlink from idmapped mounts in user namespace",						},
+	{ threaded_idmapped_mount_interactions,				"threaded operations on idmapped mounts",							},
+};
+
+struct t_idmapped_mounts fscaps_in_ancestor_userns[] = {
+	{ fscaps_idmapped_mounts_in_userns_valid_in_ancestor_userns,	"fscaps on idmapped mounts in user namespace writing fscap valid in ancestor userns",		},
+};
+
+struct t_idmapped_mounts t_nested_userns[] = {
+	{ nested_userns,						"test that nested user namespaces behave correctly when attached to idmapped mounts",		},
+};
+
+struct t_idmapped_mounts t_btrfs[] = {
+	{ btrfs_subvolumes_fsids_mapped,				"test subvolumes with mapped fsids",								},
+	{ btrfs_subvolumes_fsids_mapped_userns,				"test subvolumes with mapped fsids inside user namespace",					},
+	{ btrfs_subvolumes_fsids_mapped_user_subvol_rm_allowed,		"test subvolume deletion with user_subvol_rm_allowed mount option",				},
+	{ btrfs_subvolumes_fsids_mapped_userns_user_subvol_rm_allowed,	"test subvolume deletion with user_subvol_rm_allowed mount option inside user namespace",	},
+	{ btrfs_subvolumes_fsids_unmapped,				"test subvolumes with unmapped fsids",								},
+	{ btrfs_subvolumes_fsids_unmapped_userns,			"test subvolumes with unmapped fsids inside user namespace",					},
+	{ btrfs_snapshots_fsids_mapped,					"test snapshots with mapped fsids",								},
+	{ btrfs_snapshots_fsids_mapped_userns,				"test snapshots with mapped fsids inside user namespace",					},
+	{ btrfs_snapshots_fsids_mapped_user_subvol_rm_allowed,		"test snapshots deletion with user_subvol_rm_allowed mount option",				},
+	{ btrfs_snapshots_fsids_mapped_userns_user_subvol_rm_allowed,	"test snapshots deletion with user_subvol_rm_allowed mount option inside user namespace",	},
+	{ btrfs_snapshots_fsids_unmapped,				"test snapshots with unmapped fsids",								},
+	{ btrfs_snapshots_fsids_unmapped_userns,			"test snapshots with unmapped fsids inside user namespace",					},
+	{ btrfs_delete_by_spec_id,					"test subvolume deletion by spec id",								},
+	{ btrfs_subvolumes_setflags_fsids_mapped,			"test subvolume flags with mapped fsids",							},
+	{ btrfs_subvolumes_setflags_fsids_mapped_userns,		"test subvolume flags with mapped fsids inside user namespace",					},
+	{ btrfs_subvolumes_setflags_fsids_unmapped,			"test subvolume flags with unmapped fsids",							},
+	{ btrfs_subvolumes_setflags_fsids_unmapped_userns,		"test subvolume flags with unmapped fsids inside user namespace",				},
+	{ btrfs_snapshots_setflags_fsids_mapped,			"test snapshots flags with mapped fsids",							},
+	{ btrfs_snapshots_setflags_fsids_mapped_userns,			"test snapshots flags with mapped fsids inside user namespace",					},
+	{ btrfs_snapshots_setflags_fsids_unmapped,			"test snapshots flags with unmapped fsids",							},
+	{ btrfs_snapshots_setflags_fsids_unmapped_userns,		"test snapshots flags with unmapped fsids inside user namespace",				},
+	{ btrfs_subvolume_lookup_user,					"test unprivileged subvolume lookup",								},
+};
+
+static bool run_test(struct t_idmapped_mounts suite[], size_t suite_size)
+{
+	int i;
+
+	for (i = 0; i < suite_size; i++) {
+		struct t_idmapped_mounts *t = &suite[i];
+		int ret;
+		pid_t pid;
+
+		test_setup();
+
+		pid = fork();
+		if (pid < 0)
+			return false;
+
+		if (pid == 0) {
+			ret = t->test();
+			if (ret)
+				die("failure: %s", t->description);
+
+			exit(EXIT_SUCCESS);
+		}
+
+		ret = wait_for_pid(pid);
+		test_cleanup();
+
+		if (ret)
+			return false;
+	}
+
+	return true;
+}
+
+int main(int argc, char *argv[])
+{
+	int fret, ret;
+	int index = 0;
+	bool supported = false, test_btrfs = false, test_core = false,
+	     test_fscaps_regression = false, test_nested_userns = false;
+
+	while ((ret = getopt_long(argc, argv, "", longopts, &index)) != -1) {
+		switch (ret) {
+		case 1:
+			t_device = optarg;
+			break;
+		case 2:
+			t_fstype = optarg;
+			break;
+		case 3:
+			t_mountpoint = optarg;
+			break;
+		case 4:
+			supported = true;
+			break;
+		case 6:
+			test_core = true;
+			break;
+		case 7:
+			test_fscaps_regression = true;
+			break;
+		case 8:
+			test_nested_userns = true;
+			break;
+		case 9:
+			test_btrfs = true;
+			break;
+		case 10:
+			t_mountpoint_scratch = optarg;
+			break;
+		case 11:
+			t_device_scratch = optarg;
+			break;
+		case 5:
+			/* fallthrough */
+		default:
+			usage();
+		}
+	}
+
+	if (!t_device)
+		die_errno(EINVAL, "test device missing");
+
+	if (!t_fstype)
+		die_errno(EINVAL, "test filesystem type missing");
+
+	if (!t_mountpoint)
+		die_errno(EINVAL, "mountpoint of test device missing");
+
+	/* create separate mount namespace */
+	if (unshare(CLONE_NEWNS))
+		die("failure: create new mount namespace");
+
+	/* turn off mount propagation */
+	if (sys_mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, 0))
+		die("failure: turn mount propagation off");
+
+	t_mnt_fd = openat(-EBADF, t_mountpoint, O_CLOEXEC | O_DIRECTORY);
+	if (t_mnt_fd < 0)
+		die("failed to open %s", t_mountpoint);
+
+	t_mnt_scratch_fd = openat(-EBADF, t_mountpoint_scratch, O_CLOEXEC | O_DIRECTORY);
+	if (t_mnt_scratch_fd < 0)
+		die("failed to open %s", t_mountpoint_scratch);
+
+	/*
+	 * Caller just wants to know whether the filesystem we're on supports
+	 * idmapped mounts.
+	 */
+	if (supported) {
+		int open_tree_fd = -EBADF;
+		struct mount_attr attr = {
+			.attr_set	= MOUNT_ATTR_IDMAP,
+			.userns_fd	= -EBADF,
+		};
+
+		/* Changing mount properties on a detached mount. */
+		attr.userns_fd	= get_userns_fd(0, 1000, 1);
+		if (attr.userns_fd < 0)
+			exit(EXIT_FAILURE);
+
+		open_tree_fd = sys_open_tree(t_mnt_fd, "",
+					     AT_EMPTY_PATH |
+					     AT_NO_AUTOMOUNT |
+					     AT_SYMLINK_NOFOLLOW |
+					     OPEN_TREE_CLOEXEC |
+					     OPEN_TREE_CLONE);
+		if (open_tree_fd < 0)
+			ret = -1;
+		else
+			ret = sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr));
+
+		close(open_tree_fd);
+		close(attr.userns_fd);
+
+		if (ret)
+			exit(EXIT_FAILURE);
+
+		exit(EXIT_SUCCESS);
+	}
+
+	stash_overflowuid();
+	stash_overflowgid();
+
+	fret = EXIT_FAILURE;
+
+	if (test_core && !run_test(basic_suite, ARRAY_SIZE(basic_suite)))
+		goto out;
+
+	if (test_fscaps_regression &&
+	    !run_test(fscaps_in_ancestor_userns,
+		      ARRAY_SIZE(fscaps_in_ancestor_userns)))
+		goto out;
+
+	if (test_nested_userns &&
+	    !run_test(t_nested_userns, ARRAY_SIZE(t_nested_userns)))
+		goto out;
+
+	if (test_btrfs && !run_test(t_btrfs, ARRAY_SIZE(t_btrfs)))
 		goto out;
 
 	fret = EXIT_SUCCESS;
diff --git a/tests/btrfs/242 b/tests/btrfs/242
new file mode 100755
index 00000000..bb833842
--- /dev/null
+++ b/tests/btrfs/242
@@ -0,0 +1,34 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2021 Christian Brauner.  All Rights Reserved.
+#
+# FS QA Test 242
+#
+# Test that idmapped mounts behave correctly with btrfs specific features such
+# as subvolume and snapshot creation and deletion.
+#
+. ./common/preamble
+_begin_fstest auto quick idmapped subvolume
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+
+_supported_fs btrfs
+_require_idmapped_mounts
+_require_test
+_require_scratch
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount "-o user_subvol_rm_allowed" >> $seqres.full
+
+echo "Silence is golden"
+
+$here/src/idmapped-mounts/idmapped-mounts --test-btrfs --device "$TEST_DEV" \
+	--mountpoint "$TEST_DIR" --scratch-device "$SCRATCH_DEV" \
+	--scratch-mountpoint "$SCRATCH_MNT" --fstype "$FSTYP"
+
+status=$?
+exit
diff --git a/tests/btrfs/242.out b/tests/btrfs/242.out
new file mode 100644
index 00000000..a46d7770
--- /dev/null
+++ b/tests/btrfs/242.out
@@ -0,0 +1,2 @@
+QA output created by 242
+Silence is golden

base-commit: 4b1e66c2544b61d55ac0e8d58601bbade31d9f59
-- 
2.30.2
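
For readers following the fsid arithmetic in the tests above, here is a
minimal sketch of how a single-extent mapping such as the one requested by
get_userns_fd(0, 10000, 10000) is assumed to translate ids. The struct and
both helper names are hypothetical and only model the behaviour the tests
assert; they are not the kernel's or the testsuite's own helpers, and real
mappings may consist of several extents.

```c
#include <stdbool.h>
#include <stdint.h>

/* One extent of an id mapping, like one line of /proc/<pid>/uid_map. */
struct id_extent {
	uint32_t ns_first;	/* first id as seen inside the namespace */
	uint32_t host_first;	/* first id it corresponds to outside */
	uint32_t count;		/* number of consecutive ids covered */
};

/*
 * Hypothetical helper: translate an on-disk owner id into the id that is
 * reported when the inode is looked at through the idmapped mount.
 * Returns false if the id falls outside the extent (unmapped).
 */
static bool disk_to_mount(const struct id_extent *m, uint32_t disk_id,
			  uint32_t *mount_id)
{
	if (disk_id < m->ns_first || disk_id - m->ns_first >= m->count)
		return false;

	*mount_id = m->host_first + (disk_id - m->ns_first);
	return true;
}

/*
 * Hypothetical helper: translate the caller's fsuid/fsgid into the id that
 * would be stored on disk for objects created through the idmapped mount.
 * Returns false if the caller's id falls outside the extent (unmapped).
 */
static bool caller_to_disk(const struct id_extent *m, uint32_t fsid,
			   uint32_t *disk_id)
{
	if (fsid < m->host_first || fsid - m->host_first >= m->count)
		return false;

	*disk_id = m->ns_first + (fsid - m->host_first);
	return true;
}
```

With the extent { 0, 10000, 10000 }, an inode owned by 0 on disk shows up
as 10000 through the mount, a caller with fsid 11000 corresponds to id 1000
on disk, and callers with fsid 0 or 30000 have no mapping at all, which is
the situation the *_fsids_unmapped tests exercise.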



* Re: [PATCH v2 00/21] btrfs: support idmapped mounts
  2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
                   ` (20 preceding siblings ...)
  2021-07-19 11:10 ` [PATCH v2 21/21] btrfs/242: introduce btrfs specific idmapped mounts tests Christian Brauner
@ 2021-07-19 15:11 ` Josef Bacik
  21 siblings, 0 replies; 25+ messages in thread
From: Josef Bacik @ 2021-07-19 15:11 UTC (permalink / raw)
  To: Christian Brauner, Christoph Hellwig, Chris Mason, David Sterba, Al Viro
  Cc: linux-btrfs, Christian Brauner

On 7/19/21 7:10 AM, Christian Brauner wrote:
> From: Christian Brauner <christian.brauner@ubuntu.com>
> 
> Hey everyone,
> 
> This series enables the creation of idmapped mounts on btrfs. On the list of
> filesystems btrfs was pretty high-up and requested quite often from userspace
> (cf. [1]). This series requires just a few changes to the vfs for specific
> lookup helpers that btrfs relies on to perform permission checking when looking
> up an inode. The changes are required to port some other filesystem as well.
> 
> The conversion of the necessary btrfs internals was fairly straightforward. No
> invasive changes were needed. I've decided to split up the patchset into very
> small individual patches. This hopefully makes the series more readable and
> fairly easy to review. The overall changeset is quite small.
> 
> All non-filesystem wide ioctls that perform permission checking based on inodes
> can be supported on idmapped mounts. There are really just a few restrictions.
> This should really only affect the deletion of subvolumes by subvolume id which
> can be used to delete any subvolume in the filesystem even though the caller
> might not even be able to see the subvolume under their mount. Other than that
> behavior on idmapped and non-idmapped mounts is identical for all enabled
> ioctls.
> 
> The changeset has an associated new testsuite specific to btrfs. The
> core vfs operations that btrfs implements are covered by the generic
> idmapped mount testsuite. For the ioctls a new testsuite was added. It
> is sent alongside this patchset for ease of review but will very likely
> be merged independent of it.
> 
> All patches are based on v5.14-rc2.
> 
> The series can be pulled from:
> https://git.kernel.org/brauner/h/fs.idmapped.btrfs
> https://github.com/brauner/linux/tree/fs.idmapped.btrfs
> 
> The xfstests can be pulled from:
> https://git.kernel.org/brauner/xfstests-dev/h/fs.idmapped.btrfs
> https://github.com/brauner/xfstests/tree/fs.idmapped.btrfs
> 
> Note, the new btrfs xfstests patch is on top of a branch of mine
> containing a few more preliminary patches. So if you want to run the
> tests, please simply pull the branch and build from there.
> 
> The series has been tested with xfstests including the newly added btrfs
> specific test. All tests pass.
> There were three unrelated failures that I observed: btrfs/219,
> btrfs/2020 and btrfs/235. All three also fail on earlier kernels
> without the patch series applied.
> 
> Thanks!
> Christian

Thanks for this work Christian, you can add

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

to the series.

Josef


* Re: [PATCH v2 14/21] btrfs/ioctl: allow idmapped BTRFS_IOC_SNAP_DESTROY{_V2} ioctl
  2021-07-19 11:10 ` [PATCH v2 14/21] btrfs/ioctl: allow idmapped BTRFS_IOC_SNAP_DESTROY{_V2} ioctl Christian Brauner
@ 2021-07-21 14:15   ` David Sterba
  2021-07-21 15:48     ` Christian Brauner
  0 siblings, 1 reply; 25+ messages in thread
From: David Sterba @ 2021-07-21 14:15 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba,
	Al Viro, linux-btrfs, Christian Brauner, Christoph Hellwig

On Mon, Jul 19, 2021 at 01:10:45PM +0200, Christian Brauner wrote:
> From: Christian Brauner <christian.brauner@ubuntu.com>
> 
> Destroying subvolumes and snapshots are important features of btrfs. Both
> operations are available to unprivileged users if the filesystem has been
> mounted with the "user_subvol_rm_allowed" mount option. Allow subvolume and
> snapshot deletion on idmapped mounts. This is a fairly straightforward
> operation since all the permission checking helpers are already capable of
> handling idmapped mounts. So we just need to pass down the mount's userns.
> 
> In addition to regular subvolume or snapshot deletion by specifying the name of
> the subvolume or snapshot the BTRFS_IOC_SNAP_DESTROY_V2 ioctl allows the
> deletion of subvolumes and snapshots via subvolume and snapshot ids when the
> BTRFS_SUBVOL_SPEC_BY_ID flag is raised.
> 
> This feature is blocked on idmapped mounts as this allows filesystem wide
> subvolume deletions and thus can escape the scope of what's exposed under the
> mount identified by the fd passed with the ioctl.
> 
> Here is an example where a btrfs subvolume is deleted through a subvolume mount
> that does not expose the subvolume to be deleted but it can still be deleted by
> using the subvolume id:
> 
>  /* Compile the following program as "delete_by_spec". */
> 
>  #define _GNU_SOURCE
>  #include <fcntl.h>
>  #include <inttypes.h>
>  #include <linux/btrfs.h>
>  #include <stdio.h>
>  #include <stdlib.h>
>  #include <sys/ioctl.h>
>  #include <sys/stat.h>
>  #include <sys/types.h>
>  #include <unistd.h>
> 
>  static int rm_subvolume_by_id(int fd, uint64_t subvolid)
>  {
>  	struct btrfs_ioctl_vol_args_v2 args = {};
>  	int ret;
> 
>  	args.flags = BTRFS_SUBVOL_SPEC_BY_ID;
>  	args.subvolid = subvolid;
> 
>  	ret = ioctl(fd, BTRFS_IOC_SNAP_DESTROY_V2, &args);
>  	if (ret < 0)
>  		return -1;
> 
>  	return 0;
>  }
> 
>  int main(int argc, char *argv[])
>  {
>  	int subvolid = 0;
> 
>  	if (argc < 3)
>  		exit(1);
> 
>  	fprintf(stderr, "Opening %s\n", argv[1]);
>  	int fd = open(argv[1], O_CLOEXEC | O_DIRECTORY);
>  	if (fd < 0)
>  		exit(2);
> 
>  	subvolid = atoi(argv[2]);
> 
>  	fprintf(stderr, "Deleting subvolume with subvolid %d\n", subvolid);
>  	int ret = rm_subvolume_by_id(fd, subvolid);
>  	if (ret < 0)
>  		exit(3);
> 
>  	exit(0);
>  }
> 
>  truncate -s 10G btrfs.img
>  mkfs.btrfs btrfs.img
>  export LOOPDEV=$(sudo losetup -f --show btrfs.img)
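>  sudo mount ${LOOPDEV} /mnt
>  sudo chown $(id -u):$(id -g) /mnt
>  btrfs subvolume create /mnt/A
>  # The parent subvolume B must exist before the nested subvolume B/C
>  btrfs subvolume create /mnt/B
>  btrfs subvolume create /mnt/B/C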
>  # Get subvolume id via:
>  sudo btrfs subvolume show /mnt/A
>  # Save subvolid
>  SUBVOLID=<nr>
>  sudo umount /mnt
>  sudo mount ${LOOPDEV} -o subvol=B/C,user_subvol_rm_allowed /mnt
>  ./delete_by_spec /mnt ${SUBVOLID}
> 
> With idmapped mounts this can potentially be used by users to delete
> subvolumes/snapshots they would otherwise not have access to as the idmapping
> would be applied to an inode that is not exposed in the mount of the subvolume.
> 
> The fact that this is a filesystem wide operation suggests it might be a good
> idea to expose this under a separate ioctl that clearly indicates this. In
> essence, the file descriptor passed with the ioctl is merely used to identify
> the filesystem on which to operate when BTRFS_SUBVOL_SPEC_BY_ID is used.
> 
> Cc: Chris Mason <clm@fb.com>
> Cc: Josef Bacik <josef@toxicpanda.com>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: David Sterba <dsterba@suse.com>
> Cc: linux-btrfs@vger.kernel.org
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> ---
> /* v2 */
> unchanged
> ---
>  fs/btrfs/ioctl.c | 27 +++++++++++++++++++++------
>  1 file changed, 21 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index be52891ba571..5416b0c0ee7a 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -830,7 +830,8 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
>   *     nfs_async_unlink().
>   */
>  
> -static int btrfs_may_delete(struct inode *dir, struct dentry *victim, int isdir)
> +static int btrfs_may_delete(struct user_namespace *mnt_userns,
> +			    struct inode *dir, struct dentry *victim, int isdir)
>  {
>  	int error;
>  
> @@ -840,12 +841,12 @@ static int btrfs_may_delete(struct inode *dir, struct dentry *victim, int isdir)
>  	BUG_ON(d_inode(victim->d_parent) != dir);
>  	audit_inode_child(dir, victim, AUDIT_TYPE_CHILD_DELETE);
>  
> -	error = inode_permission(&init_user_ns, dir, MAY_WRITE | MAY_EXEC);
> +	error = inode_permission(mnt_userns, dir, MAY_WRITE | MAY_EXEC);
>  	if (error)
>  		return error;
>  	if (IS_APPEND(dir))
>  		return -EPERM;
> -	if (check_sticky(&init_user_ns, dir, d_inode(victim)) ||
> +	if (check_sticky(mnt_userns, dir, d_inode(victim)) ||
>  	    IS_APPEND(d_inode(victim)) || IS_IMMUTABLE(d_inode(victim)) ||
>  	    IS_SWAPFILE(d_inode(victim)))
>  		return -EPERM;
> @@ -2915,6 +2916,7 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
>  	struct btrfs_root *dest = NULL;
>  	struct btrfs_ioctl_vol_args *vol_args = NULL;
>  	struct btrfs_ioctl_vol_args_v2 *vol_args2 = NULL;
> +	struct user_namespace *mnt_userns = file_mnt_user_ns(file);
>  	char *subvol_name, *subvol_name_ptr = NULL;
>  	int subvol_namelen;
>  	int err = 0;
> @@ -2942,6 +2944,18 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
>  			if (err)
>  				goto out;
>  		} else {
> +			/*
> +			 * Deleting by subvolume id can be used to delete
> +			 * subvolumes/snapshots anywhere in the filesystem.
> +			 * Ensure that users can't abuse idmapped mounts of
> +			 * btrfs subvolumes/snapshots to perform operations in
> +			 * the whole filesystem.
> +			 */
> +			if (mnt_userns != &init_user_ns) {
> +				err = -EINVAL;
> +				goto out;
> +			}

How does this work with CAP_SYS_ADMIN and the root user? This namespace
check is in the preparatory phase, in the actual deletion phase there's
capability(CAP_SYS_ADMIN). A different namespace won't reach that, which
means that it's not possible to delete the subvolume at all.

I read the changelog as it is meant for an unprivileged user, this makes
sense but I don't understand how it's supposed to behave with a root
user in the context of namespaces.

Also -EINVAL is IMHO not the right error code, it's for the cases where
the arguments are invalid, like wrong flags or subvolid. For namespaces
it could be something like EXDEV (but we also have that for a real cross
filesystem subvolume deletion attempt, so the options are limited).
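
The ordering described above can be modelled in a few lines of userspace
C. This is only a sketch, not the kernel code: struct destroy_request,
model_snap_destroy() and the idmapped_errno parameter are hypothetical
names introduced for illustration, and the later phase is reduced to the
CAP_SYS_ADMIN / user_subvol_rm_allowed decision.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state the ioctl looks at; not a kernel type. */
struct destroy_request {
	bool by_subvolid;            /* models BTRFS_SUBVOL_SPEC_BY_ID being set */
	bool mount_is_idmapped;      /* models mnt_userns != &init_user_ns */
	bool has_cap_sys_admin;      /* models capable(CAP_SYS_ADMIN) */
	bool user_subvol_rm_allowed; /* models the mount option of that name */
};

/*
 * Toy model of the check ordering: the idmapped-mount test sits in the
 * preparatory phase, so the capability test below is never reached for
 * deletion by subvolume id on an idmapped mount.
 */
static int model_snap_destroy(const struct destroy_request *req,
			      int idmapped_errno)
{
	assert(req != NULL);

	/* Preparatory phase: by-id deletion is refused on idmapped mounts. */
	if (req->by_subvolid && req->mount_is_idmapped)
		return -idmapped_errno;

	/* Deletion phase (simplified): non-admins need the mount option. */
	if (!req->has_cap_sys_admin && !req->user_subvol_rm_allowed)
		return -EPERM;

	return 0;
}
```

Under this model a caller with CAP_SYS_ADMIN on an idmapped mount gets
the idmapped error (EINVAL in the patch as posted, or EXDEV if that were
chosen instead) before the capability is ever consulted.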


* Re: [PATCH v2 14/21] btrfs/ioctl: allow idmapped BTRFS_IOC_SNAP_DESTROY{_V2} ioctl
  2021-07-21 14:15   ` David Sterba
@ 2021-07-21 15:48     ` Christian Brauner
  0 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2021-07-21 15:48 UTC (permalink / raw)
  To: David Sterba
  Cc: Christian Brauner, Christoph Hellwig, Chris Mason, Josef Bacik,
	David Sterba, Al Viro, linux-btrfs, Christoph Hellwig

On Wed, Jul 21, 2021 at 04:15:18PM +0200, David Sterba wrote:
> On Mon, Jul 19, 2021 at 01:10:45PM +0200, Christian Brauner wrote:
> > From: Christian Brauner <christian.brauner@ubuntu.com>
> > 
> > Destroying subvolumes and snapshots are important features of btrfs. Both
> > operations are available to unprivileged users if the filesystem has been
> > mounted with the "user_subvol_rm_allowed" mount option. Allow subvolume and
> > snapshot deletion on idmapped mounts. This is a fairly straightforward
> > operation since all the permission checking helpers are already capable of
> > handling idmapped mounts. So we just need to pass down the mount's userns.
> > 
> > In addition to regular subvolume or snapshot deletion by specifying the name of
> > the subvolume or snapshot the BTRFS_IOC_SNAP_DESTROY_V2 ioctl allows the
> > deletion of subvolumes and snapshots via subvolume and snapshot ids when the
> > BTRFS_SUBVOL_SPEC_BY_ID flag is raised.
> > 
> > This feature is blocked on idmapped mounts as this allows filesystem wide
> > subvolume deletions and thus can escape the scope of what's exposed under the
> > mount identified by the fd passed with the ioctl.
> > 
> > Here is an example where a btrfs subvolume is deleted through a subvolume mount
> > that does not expose the subvolume to be deleted but it can still be deleted by
> > using the subvolume id:
> > 
> >  /* Compile the following program as "delete_by_spec". */
> > 
> >  #define _GNU_SOURCE
> >  #include <fcntl.h>
> >  #include <inttypes.h>
> >  #include <linux/btrfs.h>
> >  #include <stdio.h>
> >  #include <stdlib.h>
> >  #include <sys/ioctl.h>
> >  #include <sys/stat.h>
> >  #include <sys/types.h>
> >  #include <unistd.h>
> > 
> >  static int rm_subvolume_by_id(int fd, uint64_t subvolid)
> >  {
> >  	struct btrfs_ioctl_vol_args_v2 args = {};
> >  	int ret;
> > 
> >  	args.flags = BTRFS_SUBVOL_SPEC_BY_ID;
> >  	args.subvolid = subvolid;
> > 
> >  	ret = ioctl(fd, BTRFS_IOC_SNAP_DESTROY_V2, &args);
> >  	if (ret < 0)
> >  		return -1;
> > 
> >  	return 0;
> >  }
> > 
> >  int main(int argc, char *argv[])
> >  {
> >  	int subvolid = 0;
> > 
> >  	if (argc < 3)
> >  		exit(1);
> > 
> >  	fprintf(stderr, "Opening %s\n", argv[1]);
> >  	int fd = open(argv[1], O_CLOEXEC | O_DIRECTORY);
> >  	if (fd < 0)
> >  		exit(2);
> > 
> >  	subvolid = atoi(argv[2]);
> > 
> >  	fprintf(stderr, "Deleting subvolume with subvolid %d\n", subvolid);
> >  	int ret = rm_subvolume_by_id(fd, subvolid);
> >  	if (ret < 0)
> >  		exit(3);
> > 
> >  	exit(0);
> >  }
> > 
> >  truncate -s 10G btrfs.img
> >  mkfs.btrfs btrfs.img
> >  export LOOPDEV=$(sudo losetup -f --show btrfs.img)
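> >  sudo mount ${LOOPDEV} /mnt
> >  sudo chown $(id -u):$(id -g) /mnt
> >  btrfs subvolume create /mnt/A
> >  # The parent subvolume B must exist before the nested subvolume B/C
> >  btrfs subvolume create /mnt/B
> >  btrfs subvolume create /mnt/B/C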
> >  # Get subvolume id via:
> >  sudo btrfs subvolume show /mnt/A
> >  # Save subvolid
> >  SUBVOLID=<nr>
> >  sudo umount /mnt
> >  sudo mount ${LOOPDEV} -o subvol=B/C,user_subvol_rm_allowed /mnt
> >  ./delete_by_spec /mnt ${SUBVOLID}
> > 
> > With idmapped mounts this can potentially be used by users to delete
> > subvolumes/snapshots they would otherwise not have access to as the idmapping
> > would be applied to an inode that is not exposed in the mount of the subvolume.
> > 
> > The fact that this is a filesystem wide operation suggests it might be a good
> > idea to expose this under a separate ioctl that clearly indicates this. In
> > essence, the file descriptor passed with the ioctl is merely used to identify
> > the filesystem on which to operate when BTRFS_SUBVOL_SPEC_BY_ID is used.
> > 
> > Cc: Chris Mason <clm@fb.com>
> > Cc: Josef Bacik <josef@toxicpanda.com>
> > Cc: Christoph Hellwig <hch@infradead.org>
> > Cc: David Sterba <dsterba@suse.com>
> > Cc: linux-btrfs@vger.kernel.org
> > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> > ---
> > /* v2 */
> > unchanged
> > ---
> >  fs/btrfs/ioctl.c | 27 +++++++++++++++++++++------
> >  1 file changed, 21 insertions(+), 6 deletions(-)
> > 
> > diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> > index be52891ba571..5416b0c0ee7a 100644
> > --- a/fs/btrfs/ioctl.c
> > +++ b/fs/btrfs/ioctl.c
> > @@ -830,7 +830,8 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
> >   *     nfs_async_unlink().
> >   */
> >  
> > -static int btrfs_may_delete(struct inode *dir, struct dentry *victim, int isdir)
> > +static int btrfs_may_delete(struct user_namespace *mnt_userns,
> > +			    struct inode *dir, struct dentry *victim, int isdir)
> >  {
> >  	int error;
> >  
> > @@ -840,12 +841,12 @@ static int btrfs_may_delete(struct inode *dir, struct dentry *victim, int isdir)
> >  	BUG_ON(d_inode(victim->d_parent) != dir);
> >  	audit_inode_child(dir, victim, AUDIT_TYPE_CHILD_DELETE);
> >  
> > -	error = inode_permission(&init_user_ns, dir, MAY_WRITE | MAY_EXEC);
> > +	error = inode_permission(mnt_userns, dir, MAY_WRITE | MAY_EXEC);
> >  	if (error)
> >  		return error;
> >  	if (IS_APPEND(dir))
> >  		return -EPERM;
> > -	if (check_sticky(&init_user_ns, dir, d_inode(victim)) ||
> > +	if (check_sticky(mnt_userns, dir, d_inode(victim)) ||
> >  	    IS_APPEND(d_inode(victim)) || IS_IMMUTABLE(d_inode(victim)) ||
> >  	    IS_SWAPFILE(d_inode(victim)))
> >  		return -EPERM;
> > @@ -2915,6 +2916,7 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
> >  	struct btrfs_root *dest = NULL;
> >  	struct btrfs_ioctl_vol_args *vol_args = NULL;
> >  	struct btrfs_ioctl_vol_args_v2 *vol_args2 = NULL;
> > +	struct user_namespace *mnt_userns = file_mnt_user_ns(file);
> >  	char *subvol_name, *subvol_name_ptr = NULL;
> >  	int subvol_namelen;
> >  	int err = 0;
> > @@ -2942,6 +2944,18 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
> >  			if (err)
> >  				goto out;
> >  		} else {
> > +			/*
> > +			 * Deleting by subvolume id can be used to delete
> > +			 * subvolumes/snapshots anywhere in the filesystem.
> > +			 * Ensure that users can't abuse idmapped mounts of
> > +			 * btrfs subvolumes/snapshots to perform operations in
> > +			 * the whole filesystem.
> > +			 */
> > +			if (mnt_userns != &init_user_ns) {
> > +				err = -EINVAL;
> > +				goto out;
> > +			}
> 
> How does this work with CAP_SYS_ADMIN and the root user? This namespace
> check is in the preparatory phase, in the actual deletion phase there's
> capability(CAP_SYS_ADMIN). A different namespace won't reach that, which
> means that it's not possible to delete the subvolume at all.
> 
> I read the changelog as it is meant for an unprivileged user, this makes
> sense but I don't understand how it's supposed to behave with a root
> user in the context of namespaces.

Hey David,

thanks for that question. Here's how I thought about this. Whether the
caller is a root/cap_sys_admin capable user or an unprivileged user, an
attempt to delete a subvolume is always subject to the permission checks
in btrfs_may_delete().

That function calls into inode_permission(), which may check whether the
inode has a mapping in the filesystem's idmapping. btrfs as a filesystem
isn't mountable with a non-initial idmapping, but if it were, even a
root or cap_sys_admin capable user would fail to delete the subvolume
whenever that hypothetical filesystem idmapping prevented it. The
idmapped mount case here is the same, except that the idmapping is
restricted to a mount.

Another reason why I thought that should be the case is that there are
users who want to create an idmapped mount without a mapping for the
root user in order to prevent root from writing to disk. That use case
makes it desirable to have arbitrary subvolume deletion fail even for a
root/cap_sys_admin capable user.

The alternative would be to say that a root or cap_sys_admin capable
user must always be able to delete a subvolume independent of any
idmapping. If that's the case then I would think a root or cap_sys_admin
capable user should also not be subject to the inode_permission() check
in btrfs_may_delete().

Lastly, a root/cap_sys_admin capable user could always create another
mount that allows them to delete arbitrary subvolumes.
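
As a rough illustration of the argument above, here is a userspace toy
model of an idmapping lookup feeding a permission decision. It is a
sketch under loud assumptions: struct model_idmap_extent, model_map_id()
and model_may_write() are hypothetical names, and the logic is far
simpler than what inode_permission() actually does. The only point it
captures is that an inode whose owner has no mapping is off limits even
to a privileged caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_INVALID_ID UINT32_MAX /* plays the role of "no mapping" */

/* One extent of a hypothetical idmapping; not a kernel type. */
struct model_idmap_extent {
	uint32_t fs_first;  /* first id as stored in the filesystem */
	uint32_t mnt_first; /* first id as seen through the mount */
	uint32_t count;     /* number of consecutive ids in this extent */
};

/* Translate a filesystem id through the mapping, or report "unmapped". */
static uint32_t model_map_id(const struct model_idmap_extent *map, size_t n,
			     uint32_t fs_id)
{
	assert(map != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (fs_id >= map[i].fs_first &&
		    fs_id - map[i].fs_first < map[i].count)
			return map[i].mnt_first + (fs_id - map[i].fs_first);
	}
	return MODEL_INVALID_ID;
}

/*
 * Simplified decision: an unmapped owner denies everyone, privileged or
 * not; otherwise privilege or ownership (after mapping) grants access.
 */
static bool model_may_write(const struct model_idmap_extent *map, size_t n,
			    uint32_t owner_fs_id, uint32_t caller_id,
			    bool caller_privileged)
{
	uint32_t mapped = model_map_id(map, n, owner_fs_id);

	if (mapped == MODEL_INVALID_ID)
		return false;
	return caller_privileged || mapped == caller_id;
}
```

With a mapping that only covers filesystem id 1000, an inode owned by
filesystem id 0 has no mapping, so model_may_write() refuses the
operation regardless of the caller_privileged flag.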

> 
> Also -EINVAL is IMHO not the right error code, it's for the cases where
> the arguments are invalid, like wrong flags or subvolid. For namespaces
> it could be something like EXDEV (but we also have that for a real cross
> filesystem subvolume deletion attempt, so the options are limited).

I was going with what xfs was doing but I'm happy with either EXDEV or
e.g. EOPNOTSUPP.

Thanks!
Christian


Thread overview: 25+ messages
2021-07-19 11:10 [PATCH v2 00/21] btrfs: support idmapped mounts Christian Brauner
2021-07-19 11:10 ` [PATCH v2 01/21] namei: add mapping aware lookup helper Christian Brauner
2021-07-19 11:10 ` [PATCH v2 02/21] btrfs/inode: handle idmaps in btrfs_new_inode() Christian Brauner
2021-07-19 11:10 ` [PATCH v2 03/21] btrfs/inode: allow idmapped rename iop Christian Brauner
2021-07-19 11:10 ` [PATCH v2 04/21] btrfs/inode: allow idmapped getattr iop Christian Brauner
2021-07-19 11:10 ` [PATCH v2 05/21] btrfs/inode: allow idmapped mknod iop Christian Brauner
2021-07-19 11:10 ` [PATCH v2 06/21] btrfs/inode: allow idmapped create iop Christian Brauner
2021-07-19 11:10 ` [PATCH v2 07/21] btrfs/inode: allow idmapped mkdir iop Christian Brauner
2021-07-19 11:10 ` [PATCH v2 08/21] btrfs/inode: allow idmapped symlink iop Christian Brauner
2021-07-19 11:10 ` [PATCH v2 09/21] btrfs/inode: allow idmapped tmpfile iop Christian Brauner
2021-07-19 11:10 ` [PATCH v2 10/21] btrfs/inode: allow idmapped setattr iop Christian Brauner
2021-07-19 11:10 ` [PATCH v2 11/21] btrfs/inode: allow idmapped permission iop Christian Brauner
2021-07-19 11:10 ` [PATCH v2 12/21] btrfs/ioctl: check whether fs{g,u}id are mapped during subvolume creation Christian Brauner
2021-07-19 11:10 ` [PATCH v2 13/21] btrfs/inode: allow idmapped BTRFS_IOC_{SNAP,SUBVOL}_CREATE{_V2} ioctl Christian Brauner
2021-07-19 11:10 ` [PATCH v2 14/21] btrfs/ioctl: allow idmapped BTRFS_IOC_SNAP_DESTROY{_V2} ioctl Christian Brauner
2021-07-21 14:15   ` David Sterba
2021-07-21 15:48     ` Christian Brauner
2021-07-19 11:10 ` [PATCH v2 15/21] btrfs/ioctl: relax restrictions for BTRFS_IOC_SNAP_DESTROY_V2 with subvolids Christian Brauner
2021-07-19 11:10 ` [PATCH v2 16/21] btrfs/ioctl: allow idmapped BTRFS_IOC_SET_RECEIVED_SUBVOL{_32} ioctl Christian Brauner
2021-07-19 11:10 ` [PATCH v2 17/21] btrfs/ioctl: allow idmapped BTRFS_IOC_SUBVOL_SETFLAGS ioctl Christian Brauner
2021-07-19 11:10 ` [PATCH v2 18/21] btrfs/ioctl: allow idmapped BTRFS_IOC_INO_LOOKUP_USER ioctl Christian Brauner
2021-07-19 11:10 ` [PATCH v2 19/21] btrfs/acl: handle idmapped mounts Christian Brauner
2021-07-19 11:10 ` [PATCH v2 20/21] btrfs/super: allow idmapped btrfs Christian Brauner
2021-07-19 11:10 ` [PATCH v2 21/21] btrfs/242: introduce btrfs specific idmapped mounts tests Christian Brauner
2021-07-19 15:11 ` [PATCH v2 00/21] btrfs: support idmapped mounts Josef Bacik
