linux-fsdevel.vger.kernel.org archive mirror
* [PATCH 0/6] fsinfo: Add mount topology query [ver #15]
@ 2019-06-28 15:46 David Howells
  2019-06-28 15:47 ` [PATCH 1/6] vfs: Allow fsinfo() to look up a mount object by ID " David Howells
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: David Howells @ 2019-06-28 15:46 UTC (permalink / raw)
  To: viro
  Cc: dhowells, raven, mszeredi, christian, linux-api, linux-fsdevel,
	linux-kernel


Here's a set of patches that builds upon the previously posted fsinfo()
interface to:

 (a) Make it possible to invoke a query based on a mount ID rather than a
     path.  This is done by setting AT_FSINFO_MOUNTID_PATH and pointing the
     pathname argument to the mount ID as a string.

     A pathname is not a unique handle into the mount topology tree.  It's
     possible for there to be multiple overlying mounts at any particular
     point in the tree, and only the topmost can be directly accessed (the
     bottom might be inferrable from the parent).

     Usage of the mount ID permits all mount objects to be queried.  It
     would be possible to restrict the query based on the method used to
     address the object, though I haven't done this for now.

 (b) Provide a change ID for each mount object that is incremented each
     time a change is applied to that mount object.

 (c) Allow the mount topology to be queried.  The mount topology
     information returned is sprinkled with change IDs to make it easier to
     check for changes during multiple queries.  A future notification
     mechanism will also help with this.

 (d) Provide a system-unique superblock identifier that can be used to
     check to see if a mount object references the same superblock as it
     used to or as another mount object without relying on device numbers -
     which might not be seen to change over an unmount-mount combo.

A sample is also provided that allows the mount topology tree at a point to
be listed.

The patches can be found here also:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git

on branch:

	fsinfo-mount

[!] Note that this depends on the fsinfo branch.


===================
SIGNIFICANT CHANGES
===================

 ver #15:

 (*) Split from the fsinfo-core branch.

 (*) Rename notify_counter to change_counter as there's no notification
     stuff here (that's in separate branches).


* [PATCH 1/6] vfs: Allow fsinfo() to look up a mount object by ID [ver #15]
  2019-06-28 15:46 [PATCH 0/6] fsinfo: Add mount topology query [ver #15] David Howells
@ 2019-06-28 15:47 ` David Howells
  2019-06-28 15:47 ` [PATCH 2/6] vfs: Introduce a non-repeating system-unique superblock " David Howells
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: David Howells @ 2019-06-28 15:47 UTC (permalink / raw)
  To: viro
  Cc: dhowells, raven, mszeredi, christian, linux-api, linux-fsdevel,
	linux-kernel

Allow the fsinfo() syscall to look up a mount object by ID rather than by
pathname.  This is necessary as there can be multiple mounts stacked up at
the same pathname and there's no way to look through them otherwise.

This is done by passing AT_FSINFO_MOUNTID_PATH to fsinfo() in the
parameters and then passing the mount ID as a string to fsinfo() in place
of the filename:

	struct fsinfo_params params = {
		.at_flags = AT_FSINFO_MOUNTID_PATH,
		.request = FSINFO_ATTR_IDS,
	};

	ret = fsinfo(AT_FDCWD, "21", &params, buffer, sizeof(buffer));

The caller is only permitted to query a mount object if the root directory
of that mount connects directly to the current chroot (if dfd == AT_FDCWD[*])
or to the directory specified by dfd (otherwise).  Note that this lookup by
mount ID is not available to the pathwalk of any other syscall.

[*] This needs to be something other than AT_FDCWD, perhaps AT_FDROOT.

[!] This probably needs an LSM hook.

[!] This might want to check the permissions on all the intervening dirs -
    but it would have to do that under RCU conditions.

[!] This might want to check a CAP_* flag.
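The mount-ID string handling described above can be modelled in userspace
roughly as follows.  This is only a sketch: format_mount_id() and
parse_mount_id() are hypothetical helper names, not part of the patch, and
strtoul() only approximates what kstrtoul() accepts in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: render a mount ID as the string that is passed in
 * place of the pathname when AT_FSINFO_MOUNTID_PATH is set.  Returns the
 * number of characters written, or -1 if the ID is negative or the buffer
 * is too small.
 */
static int format_mount_id(int mnt_id, char *buf, size_t size)
{
	int n;

	assert(buf != NULL);
	if (mnt_id < 0)
		return -1;
	n = snprintf(buf, size, "%d", mnt_id);
	return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/*
 * Hypothetical model of the kernel-side check: parse the string with
 * automatic base detection (base 0) and reject anything that is not a
 * plain number or that exceeds INT_MAX.
 */
static int parse_mount_id(const char *name, int *mnt_id)
{
	unsigned long val;
	char *end;

	assert(name != NULL && mnt_id != NULL);
	/* Assumption of this sketch: insist on a leading digit so that
	 * whitespace and signs, which strtoul() would tolerate, are refused. */
	if (name[0] < '0' || name[0] > '9')
		return -EINVAL;
	errno = 0;
	val = strtoul(name, &end, 0);
	if (errno || end == name || *end != '\0')
		return -EINVAL;
	if (val > INT_MAX)
		return -EINVAL;
	*mnt_id = (int)val;
	return 0;
}
```

A caller would format the ID into a small buffer (the patch copies at most
32 bytes of the string from userspace) and hand that buffer to fsinfo() as
the filename argument.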

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fsinfo.c                |   56 +++++++++++++++++++++
 fs/internal.h              |    2 +
 fs/namespace.c             |  117 +++++++++++++++++++++++++++++++++++++++++++-
 include/uapi/linux/fcntl.h |    1 
 4 files changed, 173 insertions(+), 3 deletions(-)

diff --git a/fs/fsinfo.c b/fs/fsinfo.c
index b3f605c39eb5..4eaa33b2188b 100644
--- a/fs/fsinfo.c
+++ b/fs/fsinfo.c
@@ -522,6 +522,57 @@ static int vfs_fsinfo_fscontext(int fd, struct fsinfo_kparams *params)
 	return ret;
 }
 
+/*
+ * Look up the root of a mount object.  This allows access to mount objects
+ * (and their attached superblocks) that can't be retrieved by path because
+ * they're entirely covered.
+ *
+ * We only permit access to a mount that has a direct path between either the
+ * dentry pointed to by dfd or to our chroot (if dfd is AT_FDCWD).
+ */
+static int vfs_fsinfo_mount(int dfd, const char __user *filename,
+			    struct fsinfo_kparams *params)
+{
+	struct path path;
+	struct fd f = {};
+	char *name;
+	unsigned long mnt_id;
+	int ret;
+
+	if ((params->at_flags & ~AT_FSINFO_MOUNTID_PATH) ||
+	    !filename)
+		return -EINVAL;
+
+	name = strndup_user(filename, 32);
+	if (IS_ERR(name))
+		return PTR_ERR(name);
+	ret = kstrtoul(name, 0, &mnt_id);
+	if (ret == 0 && mnt_id > INT_MAX)
+		ret = -EINVAL;
+	if (ret < 0)
+		goto out_name;
+
+	if (dfd != AT_FDCWD) {
+		ret = -EBADF;
+		f = fdget_raw(dfd);
+		if (!f.file)
+			goto out_name;
+	}
+
+	ret = lookup_mount_object(f.file ? &f.file->f_path : NULL,
+				  mnt_id, &path);
+	if (ret < 0)
+		goto out_fd;
+
+	ret = vfs_fsinfo(&path, params);
+	path_put(&path);
+out_fd:
+	fdput(f);
+out_name:
+	kfree(name);
+	return ret;
+}
+
 /*
  * Return buffer information by requestable attribute.
  *
@@ -638,6 +689,9 @@ SYSCALL_DEFINE5(fsinfo,
 
 		if ((kparams.at_flags & AT_FSINFO_FROM_FSOPEN) && pathname)
 			return -EINVAL;
+		if ((kparams.at_flags & (AT_FSINFO_FROM_FSOPEN | AT_FSINFO_MOUNTID_PATH)) ==
+		    (AT_FSINFO_FROM_FSOPEN | AT_FSINFO_MOUNTID_PATH))
+			return -EINVAL;
 	} else {
 		kparams.request = FSINFO_ATTR_STATFS;
 	}
@@ -696,6 +750,8 @@ SYSCALL_DEFINE5(fsinfo,
 
 	if (kparams.at_flags & AT_FSINFO_FROM_FSOPEN)
 		ret = vfs_fsinfo_fscontext(dfd, &kparams);
+	else if (kparams.at_flags & AT_FSINFO_MOUNTID_PATH)
+		ret = vfs_fsinfo_mount(dfd, pathname, &kparams);
 	else if (pathname)
 		ret = vfs_fsinfo_path(dfd, pathname, &kparams);
 	else
diff --git a/fs/internal.h b/fs/internal.h
index 0010889f2e85..d5283a55b25d 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -96,6 +96,8 @@ extern int __mnt_want_write_file(struct file *);
 extern void __mnt_drop_write_file(struct file *);
 
 extern void dissolve_on_fput(struct vfsmount *);
+extern int lookup_mount_object(struct path *, int, struct path *);
+
 /*
  * fs_struct.c
  */
diff --git a/fs/namespace.c b/fs/namespace.c
index ffb13f0562b0..d96bc1dfab03 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -62,7 +62,7 @@ static int __init set_mphash_entries(char *str)
 __setup("mphash_entries=", set_mphash_entries);
 
 static u64 event;
-static DEFINE_IDA(mnt_id_ida);
+static DEFINE_IDR(mnt_id_ida);
 static DEFINE_IDA(mnt_group_ida);
 
 static struct hlist_head *mount_hashtable __read_mostly;
@@ -101,17 +101,27 @@ static inline struct hlist_head *mp_hash(struct dentry *dentry)
 
 static int mnt_alloc_id(struct mount *mnt)
 {
-	int res = ida_alloc(&mnt_id_ida, GFP_KERNEL);
+	int res;
 
+	/* Allocate an ID, but don't set the pointer back to the mount until
+	 * later, as once we do that, we have to follow RCU protocols to get
+	 * rid of the mount struct.
+	 */
+	res = idr_alloc(&mnt_id_ida, NULL, 0, INT_MAX, GFP_KERNEL);
 	if (res < 0)
 		return res;
 	mnt->mnt_id = res;
 	return 0;
 }
 
+static void mnt_publish_id(struct mount *mnt)
+{
+	idr_replace(&mnt_id_ida, mnt, mnt->mnt_id);
+}
+
 static void mnt_free_id(struct mount *mnt)
 {
-	ida_free(&mnt_id_ida, mnt->mnt_id);
+	idr_remove(&mnt_id_ida, mnt->mnt_id);
 }
 
 /*
@@ -974,6 +984,7 @@ struct vfsmount *vfs_create_mount(struct fs_context *fc)
 	lock_mount_hash();
 	list_add_tail(&mnt->mnt_instance, &mnt->mnt.mnt_sb->s_mounts);
 	unlock_mount_hash();
+	mnt_publish_id(mnt);
 	return &mnt->mnt;
 }
 EXPORT_SYMBOL(vfs_create_mount);
@@ -1067,6 +1078,7 @@ static struct mount *clone_mnt(struct mount *old, struct dentry *root,
 	lock_mount_hash();
 	list_add_tail(&mnt->mnt_instance, &sb->s_mounts);
 	unlock_mount_hash();
+	mnt_publish_id(mnt);
 
 	if ((flag & CL_SLAVE) ||
 	    ((flag & CL_SHARED_TO_SLAVE) && IS_MNT_SHARED(old))) {
@@ -3986,3 +3998,102 @@ const struct proc_ns_operations mntns_operations = {
 	.install	= mntns_install,
 	.owner		= mntns_owner,
 };
+
+/*
+ * See if one path point connects directly to another by ancestral relationship
+ * across mountpoints.  Must call with the RCU read lock held.
+ */
+static bool are_paths_connected(struct path *ancestor, struct path *to_check)
+{
+	struct mount *mnt, *parent;
+	struct path cursor;
+	unsigned seq;
+	bool connected;
+
+	seq = 0;
+restart:
+	cursor = *to_check;
+
+	read_seqbegin_or_lock(&rename_lock, &seq);
+	while (cursor.mnt != ancestor->mnt) {
+		mnt = real_mount(cursor.mnt);
+		parent = READ_ONCE(mnt->mnt_parent);
+		if (mnt == parent)
+			goto failed;
+		cursor.dentry = READ_ONCE(mnt->mnt_mountpoint);
+		cursor.mnt = &parent->mnt;
+	}
+
+	while (cursor.dentry != ancestor->dentry) {
+		if (cursor.dentry == cursor.mnt->mnt_root ||
+		    IS_ROOT(cursor.dentry))
+			goto failed;
+		cursor.dentry = READ_ONCE(cursor.dentry->d_parent);
+	}
+
+	connected = true;
+out:
+	done_seqretry(&rename_lock, seq);
+	return connected;
+
+failed:
+	if (need_seqretry(&rename_lock, seq)) {
+		seq = 1;
+		goto restart;
+	}
+	connected = false;
+	goto out;
+}
+
+/**
+ * lookup_mount_object - Look up a vfsmount object by ID
+ * @root: The mount root must connect backwards to this point (or chroot if NULL).
+ * @mnt_id: The ID of the mountpoint.
+ * @_mntpt: Where to return the resulting mountpoint path.
+ *
+ * Look up the root of the mount with the corresponding ID.  This is only
+ * permitted if that mount connects directly to the specified root/chroot.
+ */
+int lookup_mount_object(struct path *root, int mnt_id, struct path *_mntpt)
+{
+	struct mount *mnt;
+	struct path stop, mntpt = {};
+	int ret = -EPERM;
+
+	if (!root)
+		get_fs_root(current->fs, &stop);
+	else
+		stop = *root;
+
+	rcu_read_lock();
+	lock_mount_hash();
+	mnt = idr_find(&mnt_id_ida, mnt_id);
+	if (!mnt)
+		goto out_unlock_mh;
+	if (mnt->mnt.mnt_flags & (MNT_SYNC_UMOUNT | MNT_UMOUNT | MNT_DOOMED))
+		goto out_unlock_mh;
+	if (mnt_get_count(mnt) == 0)
+		goto out_unlock_mh;
+	mnt_add_count(mnt, 1);
+	mntpt.mnt = &mnt->mnt;
+	mntpt.dentry = dget(mnt->mnt.mnt_root);
+	unlock_mount_hash();
+
+	if (are_paths_connected(&stop, &mntpt)) {
+		*_mntpt = mntpt;
+		mntpt.mnt = NULL;
+		mntpt.dentry = NULL;
+		ret = 0;
+	}
+
+out_unlock:
+	rcu_read_unlock();
+	if (!root)
+		path_put(&stop);
+	path_put(&mntpt);
+	return ret;
+
+out_unlock_mh:
+	unlock_mount_hash();
+	goto out_unlock;
+}
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index 6a2402a8fa30..5fda91cfca8a 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -92,6 +92,7 @@
 #define AT_STATX_DONT_SYNC	0x4000	/* - Don't sync attributes with the server */
 
 #define AT_FSINFO_FROM_FSOPEN	0x2000	/* Examine the fs_context attached to dfd by fsopen() */
+#define AT_FSINFO_MOUNTID_PATH	0x4000	/* The path is a mount object ID, not an actual path */
 
 #define AT_RECURSIVE		0x8000	/* Apply to the entire subtree */
 



* [PATCH 2/6] vfs: Introduce a non-repeating system-unique superblock ID [ver #15]
  2019-06-28 15:46 [PATCH 0/6] fsinfo: Add mount topology query [ver #15] David Howells
  2019-06-28 15:47 ` [PATCH 1/6] vfs: Allow fsinfo() to look up a mount object by ID " David Howells
@ 2019-06-28 15:47 ` David Howells
  2019-06-28 15:47 ` [PATCH 3/6] vfs: Add mount change counter " David Howells
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: David Howells @ 2019-06-28 15:47 UTC (permalink / raw)
  To: viro
  Cc: dhowells, raven, mszeredi, christian, linux-api, linux-fsdevel,
	linux-kernel

Introduce an (effectively) non-repeating system-unique superblock ID that
can be used to determine that two objects are in the same superblock without
risking reuse of the ID in the meantime (as is possible with device IDs).

The ID is time-based to make it harder to use it as a covert communications
channel.

Also make it so that this ID can be fetched by the fsinfo() system call.
The ID is added so that superblock notification messages will also be able to
be tagged with it.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/fsinfo.c               |    1 +
 fs/super.c                |   24 ++++++++++++++++++++++++
 include/linux/fs.h        |    3 +++
 samples/vfs/test-fsinfo.c |    1 +
 4 files changed, 29 insertions(+)

diff --git a/fs/fsinfo.c b/fs/fsinfo.c
index 4eaa33b2188b..aee7fedace19 100644
--- a/fs/fsinfo.c
+++ b/fs/fsinfo.c
@@ -87,6 +87,7 @@ static int fsinfo_generic_ids(struct path *path, struct fsinfo_ids *p)
 	p->f_fstype	= sb->s_magic;
 	p->f_dev_major	= MAJOR(sb->s_dev);
 	p->f_dev_minor	= MINOR(sb->s_dev);
+	p->f_sb_id	= sb->s_unique_id;
 
 	memcpy(&p->f_fsid, &buf.f_fsid, sizeof(p->f_fsid));
 	strlcpy(p->f_fs_name, path->dentry->d_sb->s_type->name,
diff --git a/fs/super.c b/fs/super.c
index 2739f57515f8..c04f9481a708 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -43,6 +43,8 @@ static int thaw_super_locked(struct super_block *sb);
 
 static LIST_HEAD(super_blocks);
 static DEFINE_SPINLOCK(sb_lock);
+static u64 sb_last_identifier;
+static u64 sb_identifier_offset;
 
 static char *sb_writers_name[SB_FREEZE_LEVELS] = {
 	"sb_writers",
@@ -187,6 +189,27 @@ static void destroy_unused_super(struct super_block *s)
 	destroy_super_work(&s->destroy_work);
 }
 
+/*
+ * Generate a unique identifier for a superblock.
+ */
+static void generate_super_id(struct super_block *s)
+{
+	u64 id = ktime_to_ns(ktime_get());
+
+	spin_lock(&sb_lock);
+
+	id += sb_identifier_offset;
+	if (id <= sb_last_identifier) {
+		id = sb_last_identifier + 1;
+		sb_identifier_offset = sb_last_identifier - id;
+	}
+
+	sb_last_identifier = id;
+	spin_unlock(&sb_lock);
+
+	s->s_unique_id = id;
+}
+
 /**
  *	alloc_super	-	create new superblock
  *	@type:	filesystem type superblock should belong to
@@ -270,6 +293,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
 		goto fail;
 	if (list_lru_init_memcg(&s->s_inode_lru, &s->s_shrink))
 		goto fail;
+	generate_super_id(s);
 	return s;
 
 fail:
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 50f58eac3e1f..61098cded376 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1527,6 +1527,9 @@ struct super_block {
 
 	spinlock_t		s_inode_wblist_lock;
 	struct list_head	s_inodes_wb;	/* writeback inodes */
+
+	/* Superblock event notifications */
+	u64			s_unique_id;
 } __randomize_layout;
 
 /* Helper functions so that in most cases filesystems will
diff --git a/samples/vfs/test-fsinfo.c b/samples/vfs/test-fsinfo.c
index 6389ae781cbb..27c4bb93c219 100644
--- a/samples/vfs/test-fsinfo.c
+++ b/samples/vfs/test-fsinfo.c
@@ -185,6 +185,7 @@ static void dump_attr_IDS(union reply *r, int size)
 	printf("\tdev   : %02x:%02x\n", f->f_dev_major, f->f_dev_minor);
 	printf("\tfs    : type=%x name=%s\n", f->f_fstype, f->f_fs_name);
 	printf("\tfsid  : %llx\n", (unsigned long long)f->f_fsid);
+	printf("\tsbid  : %llx\n", (unsigned long long)f->f_sb_id);
 }
 
 static void dump_attr_LIMITS(union reply *r, int size)



* [PATCH 3/6] vfs: Add mount change counter [ver #15]
  2019-06-28 15:46 [PATCH 0/6] fsinfo: Add mount topology query [ver #15] David Howells
  2019-06-28 15:47 ` [PATCH 1/6] vfs: Allow fsinfo() to look up a mount object by ID " David Howells
  2019-06-28 15:47 ` [PATCH 2/6] vfs: Introduce a non-repeating system-unique superblock " David Howells
@ 2019-06-28 15:47 ` David Howells
  2019-06-28 15:47 ` [PATCH 4/6] vfs: Allow mount information to be queried by fsinfo() " David Howells
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: David Howells @ 2019-06-28 15:47 UTC (permalink / raw)
  To: viro
  Cc: dhowells, raven, mszeredi, christian, linux-api, linux-fsdevel,
	linux-kernel

Add a change counter on each mount object so that the user can easily check
to see if a mount has changed its attributes or its children.

Future patches will:

 (1) Provide this value through fsinfo() attributes.

 (2) Implement a watch_mount() system call to provide a notification
     interface for userspace monitoring.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/mount.h     |   22 ++++++++++++++++++++++
 fs/namespace.c |   11 +++++++++++
 2 files changed, 33 insertions(+)

diff --git a/fs/mount.h b/fs/mount.h
index 6250de544760..65cb51f47c8c 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -70,6 +70,7 @@ struct mount {
 	struct hlist_head mnt_pins;
 	struct fs_pin mnt_umount;
 	struct dentry *mnt_ex_mountpoint;
+	atomic_t mnt_change_counter;	/* Number of changes applied */
 } __randomize_layout;
 
 #define MNT_NS_INTERNAL ERR_PTR(-EINVAL) /* distinct from any mnt_namespace */
@@ -151,3 +152,24 @@ static inline bool is_anon_ns(struct mnt_namespace *ns)
 {
 	return ns->seq == 0;
 }
+
+/*
+ * Type of mount topology change notification.
+ */
+enum mount_notification_subtype {
+	NOTIFY_MOUNT_NEW_MOUNT	= 0, /* New mount added */
+	NOTIFY_MOUNT_UNMOUNT	= 1, /* Mount removed manually */
+	NOTIFY_MOUNT_EXPIRY	= 2, /* Automount expired */
+	NOTIFY_MOUNT_READONLY	= 3, /* Mount R/O state changed */
+	NOTIFY_MOUNT_SETATTR	= 4, /* Mount attributes changed */
+	NOTIFY_MOUNT_MOVE_FROM	= 5, /* Mount moved from here */
+	NOTIFY_MOUNT_MOVE_TO	= 6, /* Mount moved to here (compare op_id) */
+};
+
+static inline void notify_mount(struct mount *changed,
+				struct mount *aux,
+				enum mount_notification_subtype subtype,
+				u32 info_flags)
+{
+	atomic_inc(&changed->mnt_change_counter);
+}
diff --git a/fs/namespace.c b/fs/namespace.c
index d96bc1dfab03..c306e9362604 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -513,6 +513,8 @@ static int mnt_make_readonly(struct mount *mnt)
 	smp_wmb();
 	mnt->mnt.mnt_flags &= ~MNT_WRITE_HOLD;
 	unlock_mount_hash();
+	if (ret == 0)
+		notify_mount(mnt, NULL, NOTIFY_MOUNT_READONLY, 0x10000);
 	return ret;
 }
 
@@ -521,6 +523,7 @@ static int __mnt_unmake_readonly(struct mount *mnt)
 	lock_mount_hash();
 	mnt->mnt.mnt_flags &= ~MNT_READONLY;
 	unlock_mount_hash();
+	notify_mount(mnt, NULL, NOTIFY_MOUNT_READONLY, 0);
 	return 0;
 }
 
@@ -833,6 +836,7 @@ static void umount_mnt(struct mount *mnt)
 {
 	/* old mountpoint will be dropped when we can do that */
 	mnt->mnt_ex_mountpoint = mnt->mnt_mountpoint;
+	notify_mount(mnt->mnt_parent, mnt, NOTIFY_MOUNT_UNMOUNT, 0);
 	unhash_mnt(mnt);
 }
 
@@ -1472,6 +1476,7 @@ static void umount_tree(struct mount *mnt, enum umount_tree_flags how)
 		p = list_first_entry(&tmp_list, struct mount, mnt_list);
 		list_del_init(&p->mnt_expire);
 		list_del_init(&p->mnt_list);
+
 		ns = p->mnt_ns;
 		if (ns) {
 			ns->mounts--;
@@ -2095,7 +2100,10 @@ static int attach_recursive_mnt(struct mount *source_mnt,
 		lock_mount_hash();
 	}
 	if (parent_path) {
+		notify_mount(source_mnt->mnt_parent, source_mnt,
+			     NOTIFY_MOUNT_MOVE_FROM, 0);
 		detach_mnt(source_mnt, parent_path);
+		notify_mount(dest_mnt, source_mnt, NOTIFY_MOUNT_MOVE_TO, 0);
 		attach_mnt(source_mnt, dest_mnt, dest_mp);
 		touch_mnt_namespace(source_mnt->mnt_ns);
 	} else {
@@ -2104,6 +2112,7 @@ static int attach_recursive_mnt(struct mount *source_mnt,
 			list_del_init(&source_mnt->mnt_ns->list);
 		}
 		mnt_set_mountpoint(dest_mnt, dest_mp, source_mnt);
+		notify_mount(dest_mnt, source_mnt, NOTIFY_MOUNT_NEW_MOUNT, 0);
 		commit_tree(source_mnt);
 	}
 
@@ -2480,6 +2489,7 @@ static void set_mount_attributes(struct mount *mnt, unsigned int mnt_flags)
 	mnt->mnt.mnt_flags = mnt_flags;
 	touch_mnt_namespace(mnt->mnt_ns);
 	unlock_mount_hash();
+	notify_mount(mnt, NULL, NOTIFY_MOUNT_SETATTR, 0);
 }
 
 /*
@@ -2878,6 +2888,7 @@ void mark_mounts_for_expiry(struct list_head *mounts)
 		if (!xchg(&mnt->mnt_expiry_mark, 1) ||
 			propagate_mount_busy(mnt, 1))
 			continue;
+		notify_mount(mnt, NULL, NOTIFY_MOUNT_EXPIRY, 0);
 		list_move(&mnt->mnt_expire, &graveyard);
 	}
 	while (!list_empty(&graveyard)) {



* [PATCH 4/6] vfs: Allow mount information to be queried by fsinfo() [ver #15]
  2019-06-28 15:46 [PATCH 0/6] fsinfo: Add mount topology query [ver #15] David Howells
                   ` (2 preceding siblings ...)
  2019-06-28 15:47 ` [PATCH 3/6] vfs: Add mount change counter " David Howells
@ 2019-06-28 15:47 ` David Howells
  2019-07-03  1:09   ` Ian Kent
  2019-06-28 15:47 ` [PATCH 5/6] vfs: fsinfo sample: Mount listing program " David Howells
  2019-06-28 15:47 ` [PATCH 6/6] fsinfo: Add documentation for mount and sb watches " David Howells
  5 siblings, 1 reply; 10+ messages in thread
From: David Howells @ 2019-06-28 15:47 UTC (permalink / raw)
  To: viro
  Cc: dhowells, raven, mszeredi, christian, linux-api, linux-fsdevel,
	linux-kernel

Allow mount information, including information about the topology tree, to
be queried with the fsinfo() system call.  Usage of AT_FSINFO_MOUNTID_PATH
allows overlapping mounts to be queried.

To this end, four fsinfo() attributes are provided:

 (1) FSINFO_ATTR_MOUNT_INFO.

     This is a structure providing information about a mount, including:

	- Mounted superblock ID.
	- Mount ID (as AT_FSINFO_MOUNTID_PATH).
	- Parent mount ID.
	- Mount attributes (eg. R/O, NOEXEC).
	- A change counter.

     Note that the parent mount ID is overridden to the ID of the queried
     mount if the parent lies outside of the chroot or dfd tree.

 (2) FSINFO_ATTR_MOUNT_DEVNAME.

     This is a string providing the device name associated with the mount.

     Note that the device name may be a path that lies outside of the root.

 (3) FSINFO_ATTR_MOUNT_CHILDREN.

     This produces an array of structures, one for each child and capped
     with one for the argument mount (checked after listing all the
     children).  Each element contains the mount ID and the change counter
     of the respective mount object.

 (4) FSINFO_ATTR_MOUNT_SUBMOUNT.

     This is a 1D array of strings, indexed with struct fsinfo_params::Nth.
     Each string is the relative pathname of the corresponding child
     returned by FSINFO_ATTR_MOUNT_CHILDREN.

     Note that paths in the mount at the base of the tree (whether that be
     dfd or chroot) are relative to the base of the tree, not the root
     directory of that mount.
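
To illustrate the layout of the FSINFO_ATTR_MOUNT_CHILDREN reply described
in (3), here is a userspace sketch that splits a reply buffer into the child
records and the trailing record for the queried mount.  struct mount_child
is a local mirror of struct fsinfo_mount_child from this patch, and
split_children() is a hypothetical helper, not something the patch provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct fsinfo_mount_child as defined in this patch. */
struct mount_child {
	uint32_t mnt_id;		/* Mount identifier */
	uint32_t change_counter;	/* Number of changes applied to mount */
};

/*
 * Split a reply of nbytes bytes into the child records and the final
 * record, which describes the queried mount itself.  Returns the number of
 * children and points *parent at the final record, or returns -1 if the
 * reply is not a whole, non-zero number of records.
 */
static long split_children(const struct mount_child *recs, size_t nbytes,
			   const struct mount_child **parent)
{
	size_t n;

	assert(recs != NULL && parent != NULL);
	if (nbytes < sizeof(*recs) || nbytes % sizeof(*recs) != 0)
		return -1;
	n = nbytes / sizeof(*recs);
	*parent = &recs[n - 1];
	return (long)(n - 1);
}
```

The index of each child record is then the value to place in
fsinfo_params::Nth when asking FSINFO_ATTR_MOUNT_SUBMOUNT for that child's
relative path.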

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/d_path.c                 |    2 
 fs/fsinfo.c                 |    8 ++
 fs/internal.h               |    9 ++
 fs/namespace.c              |  177 +++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/fsinfo.h |   28 +++++++
 samples/vfs/test-fsinfo.c   |   47 +++++++++++
 6 files changed, 267 insertions(+), 4 deletions(-)

diff --git a/fs/d_path.c b/fs/d_path.c
index e8fce6b1174f..71b3e8cd79b8 100644
--- a/fs/d_path.c
+++ b/fs/d_path.c
@@ -227,7 +227,7 @@ static int prepend_unreachable(char **buffer, int *buflen)
 	return prepend(buffer, buflen, "(unreachable)", 13);
 }
 
-static void get_fs_root_rcu(struct fs_struct *fs, struct path *root)
+void get_fs_root_rcu(struct fs_struct *fs, struct path *root)
 {
 	unsigned seq;
 
diff --git a/fs/fsinfo.c b/fs/fsinfo.c
index aee7fedace19..758d1cbf8eba 100644
--- a/fs/fsinfo.c
+++ b/fs/fsinfo.c
@@ -353,6 +353,10 @@ int generic_fsinfo(struct path *path, struct fsinfo_kparams *params)
 	case _genf(PARAM_SPECIFICATION,	param_specification);
 	case _genf(PARAM_ENUM,		param_enum);
 	case _genp(PARAMETERS,		parameters);
+	case _genp(MOUNT_INFO,		mount_info);
+	case _genp(MOUNT_DEVNAME,	mount_devname);
+	case _genp(MOUNT_CHILDREN,	mount_children);
+	case _genp(MOUNT_SUBMOUNT,	mount_submount);
 	default:
 		return -EOPNOTSUPP;
 	}
@@ -637,6 +641,10 @@ static const struct fsinfo_attr_info fsinfo_buffer_info[FSINFO_ATTR__NR] = {
 	FSINFO_STRING_N		(SERVER_NAME),
 	FSINFO_STRUCT_NM	(SERVER_ADDRESS,	server_address),
 	FSINFO_STRING		(AFS_CELL_NAME),
+	FSINFO_STRUCT		(MOUNT_INFO,		mount_info),
+	FSINFO_STRING		(MOUNT_DEVNAME),
+	FSINFO_STRUCT_ARRAY	(MOUNT_CHILDREN,	mount_child),
+	FSINFO_STRING_N		(MOUNT_SUBMOUNT),
 };
 
 /**
diff --git a/fs/internal.h b/fs/internal.h
index d5283a55b25d..d75bdd97cdd9 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -52,6 +52,11 @@ void __generic_write_end(struct inode *inode, loff_t pos, unsigned copied,
  */
 extern void __init chrdev_init(void);
 
+/*
+ * d_path.c
+ */
+extern void get_fs_root_rcu(struct fs_struct *fs, struct path *root);
+
 /*
  * fs_context.c
  */
@@ -97,6 +102,10 @@ extern void __mnt_drop_write_file(struct file *);
 
 extern void dissolve_on_fput(struct vfsmount *);
 extern int lookup_mount_object(struct path *, int, struct path *);
+extern int fsinfo_generic_mount_info(struct path *, struct fsinfo_kparams *);
+extern int fsinfo_generic_mount_devname(struct path *, struct fsinfo_kparams *);
+extern int fsinfo_generic_mount_children(struct path *, struct fsinfo_kparams *);
+extern int fsinfo_generic_mount_submount(struct path *, struct fsinfo_kparams *);
 
 /*
  * fs_struct.c
diff --git a/fs/namespace.c b/fs/namespace.c
index c306e9362604..925602b8c329 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -29,6 +29,7 @@
 #include <linux/sched/task.h>
 #include <uapi/linux/mount.h>
 #include <linux/fs_context.h>
+#include <linux/fsinfo.h>
 
 #include "pnode.h"
 #include "internal.h"
@@ -4108,3 +4109,179 @@ int lookup_mount_object(struct path *root, int mnt_id, struct path *_mntpt)
 	unlock_mount_hash();
 	goto out_unlock;
 }
+
+#ifdef CONFIG_FSINFO
+int fsinfo_generic_mount_info(struct path *path, struct fsinfo_kparams *params)
+{
+	struct fsinfo_mount_info *p = params->buffer;
+	struct super_block *sb;
+	struct mount *m;
+	struct path root;
+	unsigned int flags;
+
+	if (!path->mnt)
+		return -ENODATA;
+
+	m = real_mount(path->mnt);
+	sb = m->mnt.mnt_sb;
+
+	p->f_sb_id		= sb->s_unique_id;
+	p->mnt_id		= m->mnt_id;
+	p->parent_id		= m->mnt_parent->mnt_id;
+	p->change_counter	= atomic_read(&m->mnt_change_counter);
+
+	get_fs_root(current->fs, &root);
+	if (path->mnt == root.mnt) {
+		p->parent_id = p->mnt_id;
+	} else {
+		rcu_read_lock();
+		if (!are_paths_connected(&root, path))
+			p->parent_id = p->mnt_id;
+		rcu_read_unlock();
+	}
+	if (IS_MNT_SHARED(m))
+		p->group_id = m->mnt_group_id;
+	if (IS_MNT_SLAVE(m)) {
+		int master = m->mnt_master->mnt_group_id;
+		int dom = get_dominating_id(m, &root);
+		p->master_id = master;
+		if (dom && dom != master)
+			p->from_id = dom;
+	}
+	path_put(&root);
+
+	flags = READ_ONCE(m->mnt.mnt_flags);
+	if (flags & MNT_READONLY)
+		p->attr |= MOUNT_ATTR_RDONLY;
+	if (flags & MNT_NOSUID)
+		p->attr |= MOUNT_ATTR_NOSUID;
+	if (flags & MNT_NODEV)
+		p->attr |= MOUNT_ATTR_NODEV;
+	if (flags & MNT_NOEXEC)
+		p->attr |= MOUNT_ATTR_NOEXEC;
+	if (flags & MNT_NODIRATIME)
+		p->attr |= MOUNT_ATTR_NODIRATIME;
+
+	if (flags & MNT_NOATIME)
+		p->attr |= MOUNT_ATTR_NOATIME;
+	else if (flags & MNT_RELATIME)
+		p->attr |= MOUNT_ATTR_RELATIME;
+	else
+		p->attr |= MOUNT_ATTR_STRICTATIME;
+	return sizeof(*p);
+}
+
+int fsinfo_generic_mount_devname(struct path *path, struct fsinfo_kparams *params)
+{
+	struct mount *m;
+	size_t len;
+
+	if (!path->mnt)
+		return -ENODATA;
+
+	m = real_mount(path->mnt);
+	len = strlen(m->mnt_devname);
+	memcpy(params->buffer, m->mnt_devname, len);
+	return len;
+}
+
+/*
+ * Store a mount record into the fsinfo buffer.
+ */
+static void store_mount_fsinfo(struct fsinfo_kparams *params,
+			       struct fsinfo_mount_child *child)
+{
+	unsigned int usage = params->usage;
+	unsigned int total = sizeof(*child);
+
+	if (params->usage >= INT_MAX)
+		return;
+	params->usage = usage + total;
+	if (params->buffer && params->usage <= params->buf_size)
+		memcpy(params->buffer + usage, child, total);
+}
+
+/*
+ * Return information about the submounts relative to path.
+ */
+int fsinfo_generic_mount_children(struct path *path, struct fsinfo_kparams *params)
+{
+	struct fsinfo_mount_child record;
+	struct mount *m, *child;
+
+	if (!path->mnt)
+		return -ENODATA;
+
+	m = real_mount(path->mnt);
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(child, &m->mnt_mounts, mnt_child) {
+		if (child->mnt_parent != m)
+			continue;
+		record.mnt_id = child->mnt_id;
+		record.change_counter = atomic_read(&child->mnt_change_counter);
+		store_mount_fsinfo(params, &record);
+	}
+	rcu_read_unlock();
+
+	/* End the list with a copy of the parameter mount's details so that
+	 * userspace can quickly check for changes.
+	 */
+	record.mnt_id = m->mnt_id;
+	record.change_counter = atomic_read(&m->mnt_change_counter);
+	store_mount_fsinfo(params, &record);
+	return params->usage;
+}
+
+/*
+ * Return the path of the Nth submount relative to path.  This is derived from
+ * d_path(), but the root determination is more complicated.
+ */
+int fsinfo_generic_mount_submount(struct path *path, struct fsinfo_kparams *params)
+{
+	struct mountpoint *mp;
+	struct mount *m, *child;
+	struct path mountpoint, root;
+	unsigned int n = params->Nth;
+	size_t len;
+	void *p;
+
+	if (!path->mnt)
+		return -ENODATA;
+
+	rcu_read_lock();
+
+	m = real_mount(path->mnt);
+	list_for_each_entry_rcu(child, &m->mnt_mounts, mnt_child) {
+		mp = READ_ONCE(child->mnt_mp);
+		if (child->mnt_parent != m || !mp)
+			continue;
+		if (n-- == 0)
+			goto found;
+	}
+	rcu_read_unlock();
+	return -ENODATA;
+
+found:
+	mountpoint.mnt = path->mnt;
+	mountpoint.dentry = READ_ONCE(mp->m_dentry);
+
+	get_fs_root_rcu(current->fs, &root);
+	if (root.mnt != path->mnt) {
+		root.mnt = path->mnt;
+		root.dentry = path->mnt->mnt_root;
+	}
+
+	p = __d_path(&mountpoint, &root, params->buffer, params->buf_size);
+	rcu_read_unlock();
+
+	if (IS_ERR(p))
+		return PTR_ERR(p);
+	if (!p)
+		return -EPERM;
+
+	len = (params->buffer + params->buf_size) - p;
+	memmove(params->buffer, p, len);
+	return len;
+}
+#endif /* CONFIG_FSINFO */
diff --git a/include/uapi/linux/fsinfo.h b/include/uapi/linux/fsinfo.h
index 58a50207256f..401ad9625c11 100644
--- a/include/uapi/linux/fsinfo.h
+++ b/include/uapi/linux/fsinfo.h
@@ -35,6 +35,10 @@ enum fsinfo_attribute {
 	FSINFO_ATTR_SERVER_NAME		= 17,	/* Name of the Nth server (string) */
 	FSINFO_ATTR_SERVER_ADDRESS	= 18,	/* Mth address of the Nth server */
 	FSINFO_ATTR_AFS_CELL_NAME	= 19,	/* AFS cell name (string) */
+	FSINFO_ATTR_MOUNT_INFO		= 20,	/* Mount object information */
+	FSINFO_ATTR_MOUNT_DEVNAME	= 21,	/* Mount object device name (string) */
+	FSINFO_ATTR_MOUNT_CHILDREN	= 22,	/* Submount list (array) */
+	FSINFO_ATTR_MOUNT_SUBMOUNT	= 23,	/* Relative path of Nth submount (string) */
 	FSINFO_ATTR__NR
 };
 
@@ -288,4 +292,28 @@ struct fsinfo_server_address {
 	struct __kernel_sockaddr_storage address;
 };
 
+/*
+ * Information struct for fsinfo(FSINFO_ATTR_MOUNT_INFO).
+ */
+struct fsinfo_mount_info {
+	__u64		f_sb_id;	/* Superblock ID */
+	__u32		mnt_id;		/* Mount identifier (use with AT_FSINFO_MOUNTID_PATH) */
+	__u32		parent_id;	/* Parent mount identifier */
+	__u32		group_id;	/* Mount group ID */
+	__u32		master_id;	/* Slave master group ID */
+	__u32		from_id;	/* Slave propagated from ID */
+	__u32		attr;		/* MOUNT_ATTR_* flags */
+	__u32		change_counter;	/* Number of changes applied. */
+	__u32		__reserved[1];
+};
+
+/*
+ * Information struct element for fsinfo(FSINFO_ATTR_MOUNT_CHILDREN).
+ * - An extra element is placed on the end representing the parent mount.
+ */
+struct fsinfo_mount_child {
+	__u32		mnt_id;		/* Mount identifier (use with AT_FSINFO_MOUNTID_PATH) */
+	__u32		change_counter;	/* Number of changes applied to mount. */
+};
+
 #endif /* _UAPI_LINUX_FSINFO_H */
diff --git a/samples/vfs/test-fsinfo.c b/samples/vfs/test-fsinfo.c
index 27c4bb93c219..28c9f3cd2c8c 100644
--- a/samples/vfs/test-fsinfo.c
+++ b/samples/vfs/test-fsinfo.c
@@ -21,10 +21,10 @@
 #include <errno.h>
 #include <time.h>
 #include <math.h>
-#include <fcntl.h>
 #include <sys/syscall.h>
 #include <linux/fsinfo.h>
 #include <linux/socket.h>
+#include <linux/fcntl.h>
 #include <sys/stat.h>
 #include <arpa/inet.h>
 
@@ -86,6 +86,10 @@ static const struct fsinfo_attr_info fsinfo_buffer_info[FSINFO_ATTR__NR] = {
 	FSINFO_STRING_N		(SERVER_NAME,		server_name),
 	FSINFO_STRUCT_NM	(SERVER_ADDRESS,	server_address),
 	FSINFO_STRING		(AFS_CELL_NAME,		-),
+	FSINFO_STRUCT		(MOUNT_INFO,		mount_info),
+	FSINFO_STRING		(MOUNT_DEVNAME,		mount_devname),
+	FSINFO_STRUCT_ARRAY	(MOUNT_CHILDREN,	mount_child),
+	FSINFO_STRING_N		(MOUNT_SUBMOUNT,	mount_submount),
 };
 
 #define FSINFO_NAME(X,Y) [FSINFO_ATTR_##X] = #Y
@@ -110,6 +114,10 @@ static const char *fsinfo_attr_names[FSINFO_ATTR__NR] = {
 	FSINFO_NAME		(SERVER_NAME,		server_name),
 	FSINFO_NAME		(SERVER_ADDRESS,	server_address),
 	FSINFO_NAME		(AFS_CELL_NAME,		afs_cell_name),
+	FSINFO_NAME		(MOUNT_INFO,		mount_info),
+	FSINFO_NAME		(MOUNT_DEVNAME,		mount_devname),
+	FSINFO_NAME		(MOUNT_CHILDREN,	mount_children),
+	FSINFO_NAME		(MOUNT_SUBMOUNT,	mount_submount),
 };
 
 union reply {
@@ -123,6 +131,8 @@ union reply {
 	struct fsinfo_timestamp_info timestamps;
 	struct fsinfo_volume_uuid uuid;
 	struct fsinfo_server_address srv_addr;
+	struct fsinfo_mount_info mount_info;
+	struct fsinfo_mount_child mount_children[1];
 };
 
 static void dump_hex(unsigned int *data, int from, int to)
@@ -351,6 +361,29 @@ static void dump_attr_SERVER_ADDRESS(union reply *r, int size)
 	printf("family=%u\n", f->address.ss_family);
 }
 
+static void dump_attr_MOUNT_INFO(union reply *r, int size)
+{
+	struct fsinfo_mount_info *f = &r->mount_info;
+
+	printf("\n");
+	printf("\tsb_id   : %llx\n", (unsigned long long)f->f_sb_id);
+	printf("\tmnt_id  : %x\n", f->mnt_id);
+	printf("\tparent  : %x\n", f->parent_id);
+	printf("\tgroup   : %x\n", f->group_id);
+	printf("\tattr    : %x\n", f->attr);
+	printf("\tchanges : %x\n", f->change_counter);
+}
+
+static void dump_attr_MOUNT_CHILDREN(union reply *r, int size)
+{
+	struct fsinfo_mount_child *f = r->mount_children;
+	int i = 0;
+
+	printf("\n");
+	for (; size >= sizeof(*f); size -= sizeof(*f), f++)
+		printf("\t[%u] %8x %8x\n", i++, f->mnt_id, f->change_counter);
+}
+
 /*
  *
  */
@@ -367,6 +400,8 @@ static const dumper_t fsinfo_attr_dumper[FSINFO_ATTR__NR] = {
 	FSINFO_DUMPER(TIMESTAMP_INFO),
 	FSINFO_DUMPER(VOLUME_UUID),
 	FSINFO_DUMPER(SERVER_ADDRESS),
+	FSINFO_DUMPER(MOUNT_INFO),
+	FSINFO_DUMPER(MOUNT_CHILDREN),
 };
 
 static void dump_fsinfo(enum fsinfo_attribute attr,
@@ -569,16 +604,21 @@ int main(int argc, char **argv)
 	unsigned int attr;
 	int raw = 0, opt, Nth, Mth;
 
-	while ((opt = getopt(argc, argv, "adlr"))) {
+	while ((opt = getopt(argc, argv, "Madlr"))) {
 		switch (opt) {
+		case 'M':
+			params.at_flags = AT_FSINFO_MOUNTID_PATH;
+			continue;
 		case 'a':
 			params.at_flags |= AT_NO_AUTOMOUNT;
+			params.at_flags &= ~AT_FSINFO_MOUNTID_PATH;
 			continue;
 		case 'd':
 			debug = true;
 			continue;
 		case 'l':
 			params.at_flags &= ~AT_SYMLINK_NOFOLLOW;
+			params.at_flags &= ~AT_FSINFO_MOUNTID_PATH;
 			continue;
 		case 'r':
 			raw = 1;
@@ -591,7 +631,8 @@ int main(int argc, char **argv)
 	argv += optind;
 
 	if (argc != 1) {
-		printf("Format: test-fsinfo [-alr] <file>\n");
+		printf("Format: test-fsinfo [-adlr] <file>\n");
+		printf("Format: test-fsinfo [-dr] -M <mnt_id>\n");
 		exit(2);
 	}
 



* [PATCH 5/6] vfs: fsinfo sample: Mount listing program [ver #15]
  2019-06-28 15:46 [PATCH 0/6] fsinfo: Add mount topology query [ver #15] David Howells
                   ` (3 preceding siblings ...)
  2019-06-28 15:47 ` [PATCH 4/6] vfs: Allow mount information to be queried by fsinfo() " David Howells
@ 2019-06-28 15:47 ` David Howells
  2019-06-28 15:47 ` [PATCH 6/6] fsinfo: Add documentation for mount and sb watches " David Howells
  5 siblings, 0 replies; 10+ messages in thread
From: David Howells @ 2019-06-28 15:47 UTC (permalink / raw)
  To: viro
  Cc: dhowells, raven, mszeredi, christian, linux-api, linux-fsdevel,
	linux-kernel

Implement a program to demonstrate mount listing using the new fsinfo()
syscall, for example:

# ./test-mntinfo -M 21
MOUNT                                 MOUNT ID   CHANGE#    TYPE & DEVICE
------------------------------------- ---------- ---------- ---------------
21                                            21          8 sysfs 0:15
 \_ kernel/security                           24          0 securityfs 0:8
 \_ fs/cgroup                                 28         16 tmpfs 0:19
 |   \_ unified                               29          0 cgroup2 0:1a
 |   \_ systemd                               30          0 cgroup 0:1b
 |   \_ freezer                               34          0 cgroup 0:1f
 |   \_ cpu,cpuacct                           35          0 cgroup 0:20
 |   \_ devices                               36          0 cgroup 0:21
 |   \_ memory                                37          0 cgroup 0:22
 |   \_ cpuset                                38          0 cgroup 0:23
 |   \_ net_cls,net_prio                      39          0 cgroup 0:24
 |   \_ hugetlb                               40          0 cgroup 0:25
 |   \_ rdma                                  41          0 cgroup 0:26
 |   \_ blkio                                 42          0 cgroup 0:27
 |   \_ perf_event                            43          0 cgroup 0:28
 \_ fs/pstore                                 31          0 pstore 0:1c
 \_ firmware/efi/efivars                      32          0 efivarfs 0:1d
 \_ fs/bpf                                    33          0 bpf 0:1e
 \_ kernel/config                             92          0 configfs 0:10
 \_ fs/selinux                                44          0 selinuxfs 0:12
 \_ kernel/debug                              48          0 debugfs 0:7

Signed-off-by: David Howells <dhowells@redhat.com>
---

 samples/vfs/Makefile       |    3 +
 samples/vfs/test-mntinfo.c |  241 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 244 insertions(+)
 create mode 100644 samples/vfs/test-mntinfo.c

diff --git a/samples/vfs/Makefile b/samples/vfs/Makefile
index 3c542d3b9479..d377b1f7de79 100644
--- a/samples/vfs/Makefile
+++ b/samples/vfs/Makefile
@@ -3,6 +3,7 @@ hostprogs-y := \
 	test-fsinfo \
 	test-fs-query \
 	test-fsmount \
+	test-mntinfo \
 	test-statx
 
 # Tell kbuild to always build the programs
@@ -10,6 +11,8 @@ always := $(hostprogs-y)
 
 HOSTCFLAGS_test-fsinfo.o += -I$(objtree)/usr/include
 HOSTLDLIBS_test-fsinfo += -lm
+HOSTCFLAGS_test-mntinfo.o += -I$(objtree)/usr/include
+HOSTLDLIBS_test-mntinfo += -lm
 
 HOSTCFLAGS_test-fs-query.o += -I$(objtree)/usr/include
 HOSTCFLAGS_test-fsmount.o += -I$(objtree)/usr/include
diff --git a/samples/vfs/test-mntinfo.c b/samples/vfs/test-mntinfo.c
new file mode 100644
index 000000000000..4e1b8f221841
--- /dev/null
+++ b/samples/vfs/test-mntinfo.c
@@ -0,0 +1,241 @@
+/* Test the fsinfo() system call
+ *
+ * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#define _GNU_SOURCE
+#define _ATFILE_SOURCE
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <string.h>
+#include <unistd.h>
+#include <ctype.h>
+#include <errno.h>
+#include <time.h>
+#include <math.h>
+#include <sys/syscall.h>
+#include <linux/fsinfo.h>
+#include <linux/socket.h>
+#include <linux/fcntl.h>
+#include <sys/stat.h>
+#include <arpa/inet.h>
+
+#ifndef __NR_fsinfo
+#define __NR_fsinfo -1
+#endif
+
+static __attribute__((unused))
+ssize_t fsinfo(int dfd, const char *filename, struct fsinfo_params *params,
+	       void *buffer, size_t buf_size)
+{
+	return syscall(__NR_fsinfo, dfd, filename, params, buffer, buf_size);
+}
+
+static char tree_buf[4096];
+static char bar_buf[4096];
+
+/*
+ * Get an fsinfo attribute in a statically allocated buffer.
+ */
+static void get_attr(unsigned int mnt_id, enum fsinfo_attribute attr,
+		     void *buf, size_t buf_size)
+{
+	struct fsinfo_params params = {
+		.at_flags	= AT_FSINFO_MOUNTID_PATH,
+		.request	= attr,
+	};
+	char file[32];
+	long ret;
+
+	sprintf(file, "%u", mnt_id);
+
+	memset(buf, 0xbd, buf_size);
+
+	ret = fsinfo(AT_FDCWD, file, &params, buf, buf_size);
+	if (ret == -1) {
+		fprintf(stderr, "mount-%s: %m\n", file);
+		exit(1);
+	}
+}
+
+/*
+ * Get an fsinfo attribute in a dynamically allocated buffer.
+ */
+static void *get_attr_alloc(unsigned int mnt_id, enum fsinfo_attribute attr,
+			    unsigned int Nth, size_t *_size)
+{
+	struct fsinfo_params params = {
+		.at_flags	= AT_FSINFO_MOUNTID_PATH,
+		.request	= attr,
+		.Nth		= Nth,
+	};
+	size_t buf_size = 4096;
+	char file[32];
+	void *r;
+	long ret;
+
+	sprintf(file, "%u", mnt_id);
+
+	for (;;) {
+		r = malloc(buf_size);
+		if (!r) {
+			perror("malloc");
+			exit(1);
+		}
+		memset(r, 0xbd, buf_size);
+
+		ret = fsinfo(AT_FDCWD, file, &params, r, buf_size);
+		if (ret == -1) {
+			fprintf(stderr, "mount-%s: %m\n", file);
+			exit(1);
+		}
+
+		if (ret <= buf_size) {
+			*_size = ret;
+			break;
+		}
+		buf_size = (ret + 4096 - 1) & ~(4096 - 1);
+	}
+
+	return r;
+}
+
+/*
+ * Display a mount and then recurse through its children.
+ */
+static void display_mount(unsigned int mnt_id, unsigned int depth, char *path)
+{
+	struct fsinfo_mount_child *children;
+	struct fsinfo_mount_info info;
+	struct fsinfo_ids ids;
+	unsigned int d;
+	size_t ch_size, p_size;
+	int i, n, s;
+
+	get_attr(mnt_id, FSINFO_ATTR_MOUNT_INFO, &info, sizeof(info));
+	get_attr(mnt_id, FSINFO_ATTR_IDS, &ids, sizeof(ids));
+	if (depth > 0)
+		printf("%s", tree_buf);
+
+	s = strlen(path);
+	printf("%s", !s ? "\"\"" : path);
+	if (!s)
+		s += 2;
+	s += depth;
+	if (s < 38)
+		s = 38 - s;
+	else
+		s = 1;
+	printf("%*.*s", s, s, "");
+
+	printf("%10u %10u %s %x:%x",
+	       info.mnt_id, info.change_counter,
+	       ids.f_fs_name, ids.f_dev_major, ids.f_dev_minor);
+	putchar('\n');
+
+	children = get_attr_alloc(mnt_id, FSINFO_ATTR_MOUNT_CHILDREN, 0, &ch_size);
+	n = ch_size / sizeof(children[0]) - 1;
+
+	bar_buf[depth + 1] = '|';
+	if (depth > 0) {
+		tree_buf[depth - 4 + 1] = bar_buf[depth - 4 + 1];
+		tree_buf[depth - 4 + 2] = ' ';
+	}
+
+	tree_buf[depth + 0] = ' ';
+	tree_buf[depth + 1] = '\\';
+	tree_buf[depth + 2] = '_';
+	tree_buf[depth + 3] = ' ';
+	tree_buf[depth + 4] = 0;
+	d = depth + 4;
+
+	for (i = 0; i < n; i++) {
+		if (i == n - 1)
+			bar_buf[depth + 1] = ' ';
+		path = get_attr_alloc(mnt_id, FSINFO_ATTR_MOUNT_SUBMOUNT, i, &p_size);
+		display_mount(children[i].mnt_id, d, path + 1);
+		free(path);
+	}
+
+	free(children);
+	if (depth > 0) {
+		tree_buf[depth - 4 + 1] = '\\';
+		tree_buf[depth - 4 + 2] = '_';
+	}
+	tree_buf[depth] = 0;
+}
+
+/*
+ * Find the ID of whatever is at the nominated path.
+ */
+static unsigned int lookup_mnt_by_path(const char *path)
+{
+	struct fsinfo_mount_info mnt;
+	struct fsinfo_params params = {
+		.request = FSINFO_ATTR_MOUNT_INFO,
+	};
+
+	if (fsinfo(AT_FDCWD, path, &params, &mnt, sizeof(mnt)) == -1) {
+		perror(path);
+		exit(1);
+	}
+
+	return mnt.mnt_id;
+}
+
+/*
+ *
+ */
+int main(int argc, char **argv)
+{
+	unsigned int mnt_id;
+	char *path;
+	bool use_mnt_id = false;
+	int opt;
+
+	while ((opt = getopt(argc, argv, "M"))) {
+		switch (opt) {
+		case 'M':
+			use_mnt_id = true;
+			continue;
+		}
+		break;
+	}
+
+	argc -= optind;
+	argv += optind;
+
+	switch (argc) {
+	case 0:
+		mnt_id = lookup_mnt_by_path("/");
+		path = "ROOT";
+		break;
+	case 1:
+		path = argv[0];
+		if (use_mnt_id) {
+			mnt_id = strtoul(argv[0], NULL, 0);
+			break;
+		}
+
+		mnt_id = lookup_mnt_by_path(argv[0]);
+		break;
+	default:
+		printf("Format: test-mntinfo\n");
+		printf("Format: test-mntinfo <path>\n");
+		printf("Format: test-mntinfo -M <mnt_id>\n");
+		exit(2);
+	}
+
+	printf("MOUNT                                 MOUNT ID   CHANGE#    TYPE & DEVICE\n");
+	printf("------------------------------------- ---------- ---------- ---------------\n");
+	display_mount(mnt_id, 0, path);
+	return 0;
+}
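
The retry loop in get_attr_alloc() above appears to rely on fsinfo()
returning the full size of the data even when the supplied buffer is too
small.  A minimal sketch of that grow-and-retry pattern follows, with the
undersized buffer released before each retry.  fake_query(), query_alloc()
and PAYLOAD_SIZE are hypothetical names used only for illustration;
fake_query() stands in for the syscall so that the sketch builds without the
patched kernel headers.

```c
#include <assert.h>
#include <stdlib.h>

#define PAYLOAD_SIZE 5000	/* Hypothetical reply size, larger than one 4KiB buffer */

/*
 * Hypothetical stand-in for fsinfo(): fills as much of the buffer as fits
 * and always returns the full reply size, which is the behaviour the retry
 * loop is assumed to depend on.
 */
static long fake_query(void *buf, size_t buf_size)
{
	size_t i, n = buf_size < PAYLOAD_SIZE ? buf_size : PAYLOAD_SIZE;
	unsigned char *p = buf;

	for (i = 0; i < n; i++)
		p[i] = (unsigned char)(i & 0xff);
	return PAYLOAD_SIZE;
}

/*
 * Query into a dynamically allocated buffer, growing it until the whole
 * reply fits.  Returns NULL on allocation or query failure.
 */
void *query_alloc(size_t *_size)
{
	size_t buf_size = 4096;
	void *r;
	long ret;

	assert(_size != NULL);

	for (;;) {
		r = malloc(buf_size);
		if (!r)
			return NULL;

		ret = fake_query(r, buf_size);
		if (ret < 0) {
			free(r);
			return NULL;
		}
		if ((size_t)ret <= buf_size) {
			*_size = (size_t)ret;
			return r;
		}

		/* Too small: free this buffer, then round the size up to 4KiB. */
		free(r);
		buf_size = ((size_t)ret + 4096 - 1) & ~(size_t)(4096 - 1);
	}
}
```

The caller owns the returned buffer and is expected to free() it, as
display_mount() does for the child list and the submount path.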



* [PATCH 6/6] fsinfo: Add documentation for mount and sb watches [ver #15]
  2019-06-28 15:46 [PATCH 0/6] fsinfo: Add mount topology query [ver #15] David Howells
                   ` (4 preceding siblings ...)
  2019-06-28 15:47 ` [PATCH 5/6] vfs: fsinfo sample: Mount listing program " David Howells
@ 2019-06-28 15:47 ` David Howells
  5 siblings, 0 replies; 10+ messages in thread
From: David Howells @ 2019-06-28 15:47 UTC (permalink / raw)
  To: viro
  Cc: dhowells, raven, mszeredi, christian, linux-api, linux-fsdevel,
	linux-kernel

Update the fsinfo documentation to mention mount and sb watches.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 Documentation/filesystems/fsinfo.rst |   38 +++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/Documentation/filesystems/fsinfo.rst b/Documentation/filesystems/fsinfo.rst
index 86c187a46396..ef79582b991d 100644
--- a/Documentation/filesystems/fsinfo.rst
+++ b/Documentation/filesystems/fsinfo.rst
@@ -7,7 +7,8 @@ security information beyond what stat(), statx() and statfs() can query.  It
 does not require a file to be opened as does ioctl().
 
 fsinfo() may be called on a path, an open file descriptor, a filesystem-context
-file descriptor as allocated by fsopen() or fspick().
+file descriptor as allocated by fsopen() or fspick() or a mount ID (allowing
+for mounts concealed by overmounts to be accessed).
 
 The fsinfo() system call needs to be configured on by enabling:
 
@@ -235,6 +236,10 @@ To summarise the attributes that are defined::
   FSINFO_ATTR_SERVER_NAME		N × string
   FSINFO_ATTR_SERVER_ADDRESS		N × M × struct
   FSINFO_ATTR_AFS_CELL_NAME		string
+  FSINFO_ATTR_MOUNT_INFO		struct
+  FSINFO_ATTR_MOUNT_DEVNAME		string
+  FSINFO_ATTR_MOUNT_CHILDREN		array
+  FSINFO_ATTR_MOUNT_SUBMOUNT		N × string
 
 
 Attribute Catalogue
@@ -386,6 +391,37 @@ before any superblock is attached:
     before noting any other parameters.
 
 
+Then there are attributes that convey information about the mount topology:
+
+ *  ``FSINFO_ATTR_MOUNT_INFO``
+
+    This struct-type attribute conveys information about a mount topology node
+    rather than a superblock.  This includes the ID of the superblock mounted
+    there and the ID of the mount node, its parent, group, master and
+    propagation source.  It also contains the attribute flags for the mount and
+    a change counter so that it can be quickly determined if that node changed.
+
+ *  ``FSINFO_ATTR_MOUNT_DEVNAME``
+
+    This string-type attribute returns the "device name" that was supplied when
+    the mount object was created.
+
+ *  ``FSINFO_ATTR_MOUNT_CHILDREN``
+
+    This is an array-type attribute that conveys a set of structs, each of
+    which indicates the mount ID of a child and the change counter for that
+    child.  The kernel also tags an extra element on the end that indicates the
+    ID and change counter of the queried object.  This allows a conflicting
+    change to be quickly detected by comparing the before and after counters.
+
+ *  ``FSINFO_ATTR_MOUNT_SUBMOUNT``
+
+    This is a string-type attribute that conveys the pathname of the Nth
+    mountpoint under the target mount, relative to the mount root or the
+    chroot, whichever is closer.  These correspond on a 1:1 basis with the
+    elements in the FSINFO_ATTR_MOUNT_CHILDREN list.
+
+
 Then there are filesystem-specific attributes.
 
  *  ``FSINFO_ATTR_SERVER_NAME``
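
The FSINFO_ATTR_MOUNT_CHILDREN text above says that the change counters
allow a conflicting change to be detected by comparing before and after
values.  One way a caller might do that is sketched below, assuming it
fetches the list twice (once before and once after its other queries) and
compares the two snapshots.  struct mount_child is a local mirror of the
proposed struct fsinfo_mount_child, and snapshot_changed() is a hypothetical
helper, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Local mirror of the proposed struct fsinfo_mount_child, so that the sketch
 * builds without the patched uapi header.
 */
struct mount_child {
	uint32_t mnt_id;		/* Mount identifier */
	uint32_t change_counter;	/* Number of changes applied to mount */
};

/*
 * Compare two snapshots of a children list.  Each array is expected to
 * include the trailing element that describes the queried mount itself.
 * Returns true if the two snapshots differ in length, IDs or counters.
 */
bool snapshot_changed(const struct mount_child *before, size_t nr_before,
		      const struct mount_child *after, size_t nr_after)
{
	size_t i;

	assert(nr_before > 0 && nr_after > 0);

	if (nr_before != nr_after)
		return true;

	for (i = 0; i < nr_before; i++) {
		if (before[i].mnt_id != after[i].mnt_id ||
		    before[i].change_counter != after[i].change_counter)
			return true;
	}
	return false;
}
```

If the snapshots differ, the caller would presumably discard what it read in
between and repeat the queries.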



* Re: [PATCH 4/6] vfs: Allow mount information to be queried by fsinfo() [ver #15]
  2019-06-28 15:47 ` [PATCH 4/6] vfs: Allow mount information to be queried by fsinfo() " David Howells
@ 2019-07-03  1:09   ` Ian Kent
  2019-07-03  1:24     ` Ian Kent
  0 siblings, 1 reply; 10+ messages in thread
From: Ian Kent @ 2019-07-03  1:09 UTC (permalink / raw)
  To: christian, David Howells, viro
  Cc: mszeredi, linux-api, linux-fsdevel, linux-kernel

Hi Christian,

About the propagation attributes you mentioned ...

On Fri, 2019-06-28 at 16:47 +0100, David Howells wrote:

snip ...

> +
> +#ifdef CONFIG_FSINFO
> +int fsinfo_generic_mount_info(struct path *path, struct fsinfo_kparams
> *params)
> +{
> +	struct fsinfo_mount_info *p = params->buffer;
> +	struct super_block *sb;
> +	struct mount *m;
> +	struct path root;
> +	unsigned int flags;
> +
> +	if (!path->mnt)
> +		return -ENODATA;
> +
> +	m = real_mount(path->mnt);
> +	sb = m->mnt.mnt_sb;
> +
> +	p->f_sb_id		= sb->s_unique_id;
> +	p->mnt_id		= m->mnt_id;
> +	p->parent_id		= m->mnt_parent->mnt_id;
> +	p->change_counter	= atomic_read(&m->mnt_change_counter);
> +
> +	get_fs_root(current->fs, &root);
> +	if (path->mnt == root.mnt) {
> +		p->parent_id = p->mnt_id;
> +	} else {
> +		rcu_read_lock();
> +		if (!are_paths_connected(&root, path))
> +			p->parent_id = p->mnt_id;
> +		rcu_read_unlock();
> +	}
> +	if (IS_MNT_SHARED(m))
> +		p->group_id = m->mnt_group_id;
> +	if (IS_MNT_SLAVE(m)) {
> +		int master = m->mnt_master->mnt_group_id;
> +		int dom = get_dominating_id(m, &root);
> +		p->master_id = master;
> +		if (dom && dom != master)
> +			p->from_id = dom;

This provides information about mount propagation (well mostly).

My understanding of this was that:
"If a mount is propagation private (or slave) the group_id will
be zero; otherwise it's propagation shared and its group id will
be non-zero.

If a mount is propagation slave and propagation peers exist then
the mount field mnt_master will be non-NULL. Then mnt_master
(slave's master) can be used to set master_id. If the group id
of the propagation source is not that of the master then set
the from_id group as well."

This parallels the way in which these values are reported in
the proc pseudo file system.

Perhaps adding flags as well as setting the fields would be
useful too, since interpreting the meaning of the structure
fields isn't obvious, ;)

David, Al, thoughts?

Ian
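
The interpretation described above can be sketched in userspace as follows.
This assumes that group_id and master_id are left at zero when the kernel
code quoted above does not set them, and it follows that code in treating
"shared" and "slave" as independent, so a mount may be both.  struct
propagation_ids mirrors only the relevant fields of the proposed struct
fsinfo_mount_info; the enum and classify_propagation() are hypothetical
names.  Unbindable mounts cannot be told apart from private ones here
because no field reports that state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Relevant fields of the proposed struct fsinfo_mount_info (local mirror). */
struct propagation_ids {
	uint32_t group_id;	/* Non-zero if the mount is shared */
	uint32_t master_id;	/* Non-zero if the mount is a slave */
	uint32_t from_id;	/* Non-zero if propagated from another group */
};

enum propagation {
	PROP_PRIVATE,		/* Neither shared nor slave */
	PROP_SHARED,
	PROP_SLAVE,
	PROP_SHARED_SLAVE,	/* Shared peer group that is also a slave */
};

/* Derive the propagation type from the reported IDs. */
enum propagation classify_propagation(const struct propagation_ids *p)
{
	assert(p != NULL);

	if (p->group_id && p->master_id)
		return PROP_SHARED_SLAVE;
	if (p->group_id)
		return PROP_SHARED;
	if (p->master_id)
		return PROP_SLAVE;
	return PROP_PRIVATE;
}
```

Explicit flags in the structure, as suggested above, would make this kind of
inference unnecessary.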



* Re: [PATCH 4/6] vfs: Allow mount information to be queried by fsinfo() [ver #15]
  2019-07-03  1:09   ` Ian Kent
@ 2019-07-03  1:24     ` Ian Kent
  2019-07-03  1:42       ` Ian Kent
  0 siblings, 1 reply; 10+ messages in thread
From: Ian Kent @ 2019-07-03  1:24 UTC (permalink / raw)
  To: christian, David Howells, viro
  Cc: mszeredi, linux-api, linux-fsdevel, linux-kernel

On Wed, 2019-07-03 at 09:09 +0800, Ian Kent wrote:
> Hi Christian,
> 
> About the propagation attributes you mentioned ...

Umm ... how did you work out if a mount is unbindable from proc
mountinfo?

I didn't notice anything that could be used for that when I was
looking at this.

> 
> On Fri, 2019-06-28 at 16:47 +0100, David Howells wrote:
> 
> snip ...
> 
> > +
> > +#ifdef CONFIG_FSINFO
> > +int fsinfo_generic_mount_info(struct path *path, struct fsinfo_kparams
> > *params)
> > +{
> > +	struct fsinfo_mount_info *p = params->buffer;
> > +	struct super_block *sb;
> > +	struct mount *m;
> > +	struct path root;
> > +	unsigned int flags;
> > +
> > +	if (!path->mnt)
> > +		return -ENODATA;
> > +
> > +	m = real_mount(path->mnt);
> > +	sb = m->mnt.mnt_sb;
> > +
> > +	p->f_sb_id		= sb->s_unique_id;
> > +	p->mnt_id		= m->mnt_id;
> > +	p->parent_id		= m->mnt_parent->mnt_id;
> > +	p->change_counter	= atomic_read(&m->mnt_change_counter);
> > +
> > +	get_fs_root(current->fs, &root);
> > +	if (path->mnt == root.mnt) {
> > +		p->parent_id = p->mnt_id;
> > +	} else {
> > +		rcu_read_lock();
> > +		if (!are_paths_connected(&root, path))
> > +			p->parent_id = p->mnt_id;
> > +		rcu_read_unlock();
> > +	}
> > +	if (IS_MNT_SHARED(m))
> > +		p->group_id = m->mnt_group_id;
> > +	if (IS_MNT_SLAVE(m)) {
> > +		int master = m->mnt_master->mnt_group_id;
> > +		int dom = get_dominating_id(m, &root);
> > +		p->master_id = master;
> > +		if (dom && dom != master)
> > +			p->from_id = dom;
> 
> This provides information about mount propagation (well mostly).
> 
> My understanding of this was that:
> "If a mount is propagation private (or slave) the group_id will
> be zero; otherwise it's propagation shared and its group id will
> be non-zero.
> 
> If a mount is propagation slave and propagation peers exist then
> the mount field mnt_master will be non-NULL. Then mnt_master
> (slave's master) can be used to set master_id. If the group id
> of the propagation source is not that of the master then set
> the from_id group as well."
> 
> This parallels the way in which these values are reported in
> the proc pseudo file system.
> 
> Perhaps adding flags as well as setting the fields would be
> useful too, since interpreting the meaning of the structure
> fields isn't obvious, ;)
> 
> David, Al, thoughts?
> 
> Ian



* Re: [PATCH 4/6] vfs: Allow mount information to be queried by fsinfo() [ver #15]
  2019-07-03  1:24     ` Ian Kent
@ 2019-07-03  1:42       ` Ian Kent
  0 siblings, 0 replies; 10+ messages in thread
From: Ian Kent @ 2019-07-03  1:42 UTC (permalink / raw)
  To: christian, David Howells, viro
  Cc: mszeredi, linux-api, linux-fsdevel, linux-kernel

On Wed, 2019-07-03 at 09:24 +0800, Ian Kent wrote:
> On Wed, 2019-07-03 at 09:09 +0800, Ian Kent wrote:
> > Hi Christian,
> > 
> > About the propagation attributes you mentioned ...
> 
> Umm ... how did you work out if a mount is unbindable from proc
> mountinfo?
> 
> I didn't notice anything that could be used for that when I was
> looking at this.

Oh wait, fs/proc_namespace.c:show_mountinfo() has:
        if (IS_MNT_UNBINDABLE(r))
                seq_puts(m, " unbindable");

I missed that, probably because I didn't have any unbindable mounts
at the time I was looking at it, oops!

That's missing and probably should be added too.

> 
> > On Fri, 2019-06-28 at 16:47 +0100, David Howells wrote:
> > 
> > snip ...
> > 
> > > +
> > > +#ifdef CONFIG_FSINFO
> > > +int fsinfo_generic_mount_info(struct path *path, struct fsinfo_kparams
> > > *params)
> > > +{
> > > +	struct fsinfo_mount_info *p = params->buffer;
> > > +	struct super_block *sb;
> > > +	struct mount *m;
> > > +	struct path root;
> > > +	unsigned int flags;
> > > +
> > > +	if (!path->mnt)
> > > +		return -ENODATA;
> > > +
> > > +	m = real_mount(path->mnt);
> > > +	sb = m->mnt.mnt_sb;
> > > +
> > > +	p->f_sb_id		= sb->s_unique_id;
> > > +	p->mnt_id		= m->mnt_id;
> > > +	p->parent_id		= m->mnt_parent->mnt_id;
> > > +	p->change_counter	= atomic_read(&m->mnt_change_counter);
> > > +
> > > +	get_fs_root(current->fs, &root);
> > > +	if (path->mnt == root.mnt) {
> > > +		p->parent_id = p->mnt_id;
> > > +	} else {
> > > +		rcu_read_lock();
> > > +		if (!are_paths_connected(&root, path))
> > > +			p->parent_id = p->mnt_id;
> > > +		rcu_read_unlock();
> > > +	}
> > > +	if (IS_MNT_SHARED(m))
> > > +		p->group_id = m->mnt_group_id;
> > > +	if (IS_MNT_SLAVE(m)) {
> > > +		int master = m->mnt_master->mnt_group_id;
> > > +		int dom = get_dominating_id(m, &root);
> > > +		p->master_id = master;
> > > +		if (dom && dom != master)
> > > +			p->from_id = dom;
> > 
> > This provides information about mount propagation (well mostly).
> > 
> > My understanding of this was that:
> > "If a mount is propagation private (or slave) the group_id will
> > be zero; otherwise it's propagation shared and its group id will
> > be non-zero.
> > 
> > If a mount is propagation slave and propagation peers exist then
> > the mount field mnt_master will be non-NULL. Then mnt_master
> > (slave's master) can be used to set master_id. If the group id
> > of the propagation source is not that of the master then set
> > the from_id group as well."
> > 
> > This parallels the way in which these values are reported in
> > the proc pseudo file system.
> > 
> > Perhaps adding flags as well as setting the fields would be
> > useful too, since interpreting the meaning of the structure
> > fields isn't obvious, ;)
> > 
> > David, Al, thoughts?
> > 
> > Ian


