All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/6] Generic per-mount io stats
@ 2022-02-28 11:39 Amir Goldstein
  2022-02-28 11:39 ` [PATCH v2 1/6] fs: add iostats counters to struct mount Amir Goldstein
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Amir Goldstein @ 2022-02-28 11:39 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, containers, linux-fsdevel

Miklos,

Resending those patches with only minor change even though I did get
and feedback on v1 [1].

My use case specifically is for fuse, but I think these mount stats
can be useful for container use cases, either with overlayfs or even
with bind mounts, in order to help sysadmins bisect the source of io
from containers POV.

This revision opts-in for mountstats for all fuse/overlayfs mounts,
but we could also make it always opt-in by mount options for any fs.

Thoughts?

Thanks,
Amir.

Changes since v1:
- Opt-in for per-mount io stats for overlayfs and fuse

[1] https://lore.kernel.org/linux-fsdevel/20210107214401.249416-1-amir73il@gmail.com/

Amir Goldstein (6):
  fs: add iostats counters to struct mount
  fs: tidy up fs_flags definitions
  fs: collect per-mount io stats
  fs: report per-mount io stats
  ovl: opt-in for per-mount io stats
  fuse: opt-in for per-mount io stats

 fs/Kconfig           |  9 ++++++
 fs/fuse/inode.c      |  2 +-
 fs/mount.h           | 59 +++++++++++++++++++++++++++++++++++
 fs/namespace.c       | 19 ++++++++++++
 fs/overlayfs/super.c |  2 +-
 fs/proc_namespace.c  | 13 ++++++++
 fs/read_write.c      | 73 +++++++++++++++++++++++++++++---------------
 include/linux/fs.h   | 15 ++++-----
 8 files changed, 159 insertions(+), 33 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/6] fs: add iostats counters to struct mount
  2022-02-28 11:39 [PATCH v2 0/6] Generic per-mount io stats Amir Goldstein
@ 2022-02-28 11:39 ` Amir Goldstein
  2022-02-28 11:39 ` [PATCH v2 2/6] fs: tidy up fs_flags definitions Amir Goldstein
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Amir Goldstein @ 2022-02-28 11:39 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, containers, linux-fsdevel

With config MOUNT_IO_STATS, add an array of counters to struct mnt_pcp
that will be used to collect I/O statistics.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/Kconfig     |  9 +++++++++
 fs/mount.h     | 32 ++++++++++++++++++++++++++++++++
 fs/namespace.c | 17 +++++++++++++++++
 3 files changed, 58 insertions(+)

diff --git a/fs/Kconfig b/fs/Kconfig
index 6c7dc1387beb..60baa861b3e4 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -15,6 +15,15 @@ config VALIDATE_FS_PARSER
 	  Enable this to perform validation of the parameter description for a
 	  filesystem when it is registered.
 
+config FS_MOUNT_STATS
+	bool "Enable per-mount I/O statistics"
+	depends on SMP
+	help
+	  Enable this to allow collecting per-mount I/O statistics and display
+	  them in /proc/<pid>/mountstats.
+
+	  Say N if unsure.
+
 config FS_IOMAP
 	bool
 
diff --git a/fs/mount.h b/fs/mount.h
index 0b6e08cf8afb..b22169b4d24c 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -24,9 +24,25 @@ struct mnt_namespace {
 	unsigned int		pending_mounts;
 } __randomize_layout;
 
+/* Similar to task_io_accounting members */
+enum {
+	MNTIOS_CHARS_RD,	/* bytes read via syscalls */
+	MNTIOS_CHARS_WR,	/* bytes written via syscalls */
+	MNTIOS_SYSCALLS_RD,	/* # of read syscalls */
+	MNTIOS_SYSCALLS_WR,	/* # of write syscalls */
+	_MNTIOS_COUNTERS_NUM
+};
+
+struct mnt_iostats {
+	s64 counter[_MNTIOS_COUNTERS_NUM];
+};
+
 struct mnt_pcp {
 	int mnt_count;
 	int mnt_writers;
+#ifdef CONFIG_FS_MOUNT_STATS
+	struct mnt_iostats iostats;
+#endif
 };
 
 struct mountpoint {
@@ -148,3 +164,19 @@ static inline bool is_anon_ns(struct mnt_namespace *ns)
 }
 
 extern void mnt_cursor_del(struct mnt_namespace *ns, struct mount *cursor);
+
+static inline void mnt_iostats_counter_inc(struct mount *mnt, int id)
+{
+#ifdef CONFIG_FS_MOUNT_STATS
+	this_cpu_inc(mnt->mnt_pcp->iostats.counter[id]);
+#endif
+}
+
+static inline void mnt_iostats_counter_add(struct mount *mnt, int id, s64 n)
+{
+#ifdef CONFIG_FS_MOUNT_STATS
+	this_cpu_add(mnt->mnt_pcp->iostats.counter[id], n);
+#endif
+}
+
+extern s64 mnt_iostats_counter_read(struct mount *mnt, int id);
diff --git a/fs/namespace.c b/fs/namespace.c
index de6fae84f1a1..3fb8f11a42a1 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -306,6 +306,23 @@ static unsigned int mnt_get_writers(struct mount *mnt)
 #endif
 }
 
+s64 mnt_iostats_counter_read(struct mount *mnt, int id)
+{
+	s64 count = 0;
+#ifdef CONFIG_FS_MOUNT_STATS
+	/*
+	 * MOUNT_STATS depends on SMP.
+	 * Should be trivial to implement for !SMP if anyone cares...
+	 */
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		count += per_cpu_ptr(mnt->mnt_pcp, cpu)->iostats.counter[id];
+	}
+#endif
+	return count;
+}
+
 static int mnt_is_readonly(struct vfsmount *mnt)
 {
 	if (mnt->mnt_sb->s_readonly_remount)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 2/6] fs: tidy up fs_flags definitions
  2022-02-28 11:39 [PATCH v2 0/6] Generic per-mount io stats Amir Goldstein
  2022-02-28 11:39 ` [PATCH v2 1/6] fs: add iostats counters to struct mount Amir Goldstein
@ 2022-02-28 11:39 ` Amir Goldstein
  2022-02-28 11:39 ` [PATCH v2 3/6] fs: collect per-mount io stats Amir Goldstein
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Amir Goldstein @ 2022-02-28 11:39 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, containers, linux-fsdevel

Use bit shift for flag constants and abbreviate comments.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 include/linux/fs.h | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index e2d892b201b0..f220db331dba 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2428,13 +2428,13 @@ int sync_inode_metadata(struct inode *inode, int wait);
 struct file_system_type {
 	const char *name;
 	int fs_flags;
-#define FS_REQUIRES_DEV		1 
-#define FS_BINARY_MOUNTDATA	2
-#define FS_HAS_SUBTYPE		4
-#define FS_USERNS_MOUNT		8	/* Can be mounted by userns root */
-#define FS_DISALLOW_NOTIFY_PERM	16	/* Disable fanotify permission events */
-#define FS_ALLOW_IDMAP         32      /* FS has been updated to handle vfs idmappings. */
-#define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
+#define FS_REQUIRES_DEV		(1<<0)
+#define FS_BINARY_MOUNTDATA	(1<<1)
+#define FS_HAS_SUBTYPE		(1<<2)
+#define FS_USERNS_MOUNT		(1<<3)	/* Can be mounted by userns root */
+#define FS_DISALLOW_NOTIFY_PERM	(1<<4)	/* Disable fanotify permission events */
+#define FS_ALLOW_IDMAP		(1<<5)	/* FS can handle vfs idmappings */
+#define FS_RENAME_DOES_D_MOVE	(1<<15)	/* FS will handle d_move() internally */
 	int (*init_fs_context)(struct fs_context *);
 	const struct fs_parameter_spec *parameters;
 	struct dentry *(*mount) (struct file_system_type *, int,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 3/6] fs: collect per-mount io stats
  2022-02-28 11:39 [PATCH v2 0/6] Generic per-mount io stats Amir Goldstein
  2022-02-28 11:39 ` [PATCH v2 1/6] fs: add iostats counters to struct mount Amir Goldstein
  2022-02-28 11:39 ` [PATCH v2 2/6] fs: tidy up fs_flags definitions Amir Goldstein
@ 2022-02-28 11:39 ` Amir Goldstein
  2022-02-28 11:39 ` [PATCH v2 4/6] fs: report " Amir Goldstein
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Amir Goldstein @ 2022-02-28 11:39 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, containers, linux-fsdevel

Replace task io account helpers with wrappers that may also collect
per-mount stats.

Filesystems that want these per-mount io stats collected need to
opt-in with the FS_MOUNT_STATS flag.

We may consider opting-in per-mount using a mount option in the
future.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/mount.h         | 26 +++++++++++++++++
 fs/read_write.c    | 73 +++++++++++++++++++++++++++++++---------------
 include/linux/fs.h |  1 +
 3 files changed, 76 insertions(+), 24 deletions(-)

diff --git a/fs/mount.h b/fs/mount.h
index b22169b4d24c..f98bf4cd5b1a 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -165,6 +165,16 @@ static inline bool is_anon_ns(struct mnt_namespace *ns)
 
 extern void mnt_cursor_del(struct mnt_namespace *ns, struct mount *cursor);
 
+static inline bool mnt_has_stats(struct vfsmount *mnt)
+{
+#ifdef CONFIG_FS_MOUNT_STATS
+	/* Should this also be configurable per mount? */
+	return (mnt->mnt_sb->s_type->fs_flags & FS_MOUNT_STATS);
+#else
+	return false;
+#endif
+}
+
 static inline void mnt_iostats_counter_inc(struct mount *mnt, int id)
 {
 #ifdef CONFIG_FS_MOUNT_STATS
@@ -179,4 +189,20 @@ static inline void mnt_iostats_counter_add(struct mount *mnt, int id, s64 n)
 #endif
 }
 
+static inline void file_iostats_counter_inc(struct file *file, int id)
+{
+#ifdef CONFIG_FS_MOUNT_STATS
+	if (file && mnt_has_stats(file->f_path.mnt))
+		mnt_iostats_counter_inc(real_mount(file->f_path.mnt), id);
+#endif
+}
+
+static inline void file_iostats_counter_add(struct file *file, int id, s64 n)
+{
+#ifdef CONFIG_FS_MOUNT_STATS
+	if (file && mnt_has_stats(file->f_path.mnt))
+		mnt_iostats_counter_add(real_mount(file->f_path.mnt), id, n);
+#endif
+}
+
 extern s64 mnt_iostats_counter_read(struct mount *mnt, int id);
diff --git a/fs/read_write.c b/fs/read_write.c
index 0074afa7ecb3..386a907a19a8 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -21,6 +21,7 @@
 #include <linux/mount.h>
 #include <linux/fs.h>
 #include "internal.h"
+#include "mount.h"
 
 #include <linux/uaccess.h>
 #include <asm/unistd.h>
@@ -34,6 +35,30 @@ const struct file_operations generic_ro_fops = {
 
 EXPORT_SYMBOL(generic_ro_fops);
 
+static void file_add_rchar(struct file *file, struct task_struct *tsk, ssize_t amt)
+{
+	file_iostats_counter_add(file, MNTIOS_CHARS_RD, amt);
+	add_rchar(tsk, amt);
+}
+
+static void file_add_wchar(struct file *file, struct task_struct *tsk, ssize_t amt)
+{
+	file_iostats_counter_add(file, MNTIOS_CHARS_WR, amt);
+	add_wchar(tsk, amt);
+}
+
+static void file_inc_syscr(struct file *file, struct task_struct *tsk)
+{
+	file_iostats_counter_inc(file, MNTIOS_SYSCALLS_RD);
+	inc_syscr(current);
+}
+
+static void file_inc_syscw(struct file *file, struct task_struct *tsk)
+{
+	file_iostats_counter_inc(file, MNTIOS_SYSCALLS_WR);
+	inc_syscw(current);
+}
+
 static inline bool unsigned_offsets(struct file *file)
 {
 	return file->f_mode & FMODE_UNSIGNED_OFFSET;
@@ -441,9 +466,9 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 		if (pos)
 			*pos = kiocb.ki_pos;
 		fsnotify_access(file);
-		add_rchar(current, ret);
+		file_add_rchar(file, current, ret);
 	}
-	inc_syscr(current);
+	file_inc_syscr(file, current);
 	return ret;
 }
 
@@ -483,9 +508,9 @@ ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
 		ret = -EINVAL;
 	if (ret > 0) {
 		fsnotify_access(file);
-		add_rchar(current, ret);
+		file_add_rchar(file, current, ret);
 	}
-	inc_syscr(current);
+	file_inc_syscr(file, current);
 	return ret;
 }
 
@@ -537,9 +562,9 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 		if (pos)
 			*pos = kiocb.ki_pos;
 		fsnotify_modify(file);
-		add_wchar(current, ret);
+		file_add_wchar(file, current, ret);
 	}
-	inc_syscw(current);
+	file_inc_syscw(file, current);
 	return ret;
 }
 /*
@@ -592,9 +617,9 @@ ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_
 		ret = -EINVAL;
 	if (ret > 0) {
 		fsnotify_modify(file);
-		add_wchar(current, ret);
+		file_add_wchar(file, current, ret);
 	}
-	inc_syscw(current);
+	file_inc_syscw(file, current);
 	file_end_write(file);
 	return ret;
 }
@@ -947,8 +972,8 @@ static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
 	}
 
 	if (ret > 0)
-		add_rchar(current, ret);
-	inc_syscr(current);
+		file_add_rchar(f.file, current, ret);
+	file_inc_syscr(f.file, current);
 	return ret;
 }
 
@@ -971,8 +996,8 @@ static ssize_t do_writev(unsigned long fd, const struct iovec __user *vec,
 	}
 
 	if (ret > 0)
-		add_wchar(current, ret);
-	inc_syscw(current);
+		file_add_wchar(f.file, current, ret);
+	file_inc_syscw(f.file, current);
 	return ret;
 }
 
@@ -1000,8 +1025,8 @@ static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec,
 	}
 
 	if (ret > 0)
-		add_rchar(current, ret);
-	inc_syscr(current);
+		file_add_rchar(f.file, current, ret);
+	file_inc_syscr(f.file, current);
 	return ret;
 }
 
@@ -1023,8 +1048,8 @@ static ssize_t do_pwritev(unsigned long fd, const struct iovec __user *vec,
 	}
 
 	if (ret > 0)
-		add_wchar(current, ret);
-	inc_syscw(current);
+		file_add_wchar(f.file, current, ret);
+	file_inc_syscw(f.file, current);
 	return ret;
 }
 
@@ -1250,8 +1275,8 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
 	}
 
 	if (retval > 0) {
-		add_rchar(current, retval);
-		add_wchar(current, retval);
+		file_add_rchar(in.file, current, retval);
+		file_add_wchar(out.file, current, retval);
 		fsnotify_access(in.file);
 		fsnotify_modify(out.file);
 		out.file->f_pos = out_pos;
@@ -1261,8 +1286,8 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
 			in.file->f_pos = pos;
 	}
 
-	inc_syscr(current);
-	inc_syscw(current);
+	file_inc_syscr(in.file, current);
+	file_inc_syscw(out.file, current);
 	if (pos > max)
 		retval = -EOVERFLOW;
 
@@ -1511,13 +1536,13 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
 done:
 	if (ret > 0) {
 		fsnotify_access(file_in);
-		add_rchar(current, ret);
+		file_add_rchar(file_in, current, ret);
 		fsnotify_modify(file_out);
-		add_wchar(current, ret);
+		file_add_wchar(file_out, current, ret);
 	}
 
-	inc_syscr(current);
-	inc_syscw(current);
+	file_inc_syscr(file_in, current);
+	file_inc_syscw(file_out, current);
 
 	file_end_write(file_out);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f220db331dba..60ee8d8ef020 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2434,6 +2434,7 @@ struct file_system_type {
 #define FS_USERNS_MOUNT		(1<<3)	/* Can be mounted by userns root */
 #define FS_DISALLOW_NOTIFY_PERM	(1<<4)	/* Disable fanotify permission events */
 #define FS_ALLOW_IDMAP		(1<<5)	/* FS can handle vfs idmappings */
+#define FS_MOUNT_STATS		(1<<6)	/* FS has generic proc/pid/mountstats */
 #define FS_RENAME_DOES_D_MOVE	(1<<15)	/* FS will handle d_move() internally */
 	int (*init_fs_context)(struct fs_context *);
 	const struct fs_parameter_spec *parameters;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 4/6] fs: report per-mount io stats
  2022-02-28 11:39 [PATCH v2 0/6] Generic per-mount io stats Amir Goldstein
                   ` (2 preceding siblings ...)
  2022-02-28 11:39 ` [PATCH v2 3/6] fs: collect per-mount io stats Amir Goldstein
@ 2022-02-28 11:39 ` Amir Goldstein
  2022-02-28 15:06   ` Miklos Szeredi
  2022-02-28 21:11   ` Dave Chinner
  2022-02-28 11:39 ` [PATCH v2 5/6] ovl: opt-in for " Amir Goldstein
  2022-02-28 11:39 ` [PATCH v2 6/6] fuse: " Amir Goldstein
  5 siblings, 2 replies; 15+ messages in thread
From: Amir Goldstein @ 2022-02-28 11:39 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, containers, linux-fsdevel

Show optional collected per-mount io stats in /proc/<pid>/mountstats
for filesystems that do not implement their own show_stats() method
and opted-in to generic per-mount stats with FS_MOUNT_STATS flag.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/mount.h          |  1 +
 fs/namespace.c      |  2 ++
 fs/proc_namespace.c | 13 +++++++++++++
 3 files changed, 16 insertions(+)

diff --git a/fs/mount.h b/fs/mount.h
index f98bf4cd5b1a..2ab6308af78b 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -91,6 +91,7 @@ struct mount {
 	int mnt_id;			/* mount identifier */
 	int mnt_group_id;		/* peer group identifier */
 	int mnt_expiry_mark;		/* true if marked for expiry */
+	time64_t mnt_time;		/* time of mount */
 	struct hlist_head mnt_pins;
 	struct hlist_head mnt_stuck_children;
 } __randomize_layout;
diff --git a/fs/namespace.c b/fs/namespace.c
index 3fb8f11a42a1..546f07ed44c5 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -220,6 +220,8 @@ static struct mount *alloc_vfsmnt(const char *name)
 		mnt->mnt_count = 1;
 		mnt->mnt_writers = 0;
 #endif
+		/* For proc/<pid>/mountstats */
+		mnt->mnt_time = ktime_get_seconds();
 
 		INIT_HLIST_NODE(&mnt->mnt_hash);
 		INIT_LIST_HEAD(&mnt->mnt_child);
diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c
index 49650e54d2f8..d744fb8543f5 100644
--- a/fs/proc_namespace.c
+++ b/fs/proc_namespace.c
@@ -232,6 +232,19 @@ static int show_vfsstat(struct seq_file *m, struct vfsmount *mnt)
 	if (sb->s_op->show_stats) {
 		seq_putc(m, ' ');
 		err = sb->s_op->show_stats(m, mnt_path.dentry);
+	} else if (mnt_has_stats(mnt)) {
+		/* Similar to /proc/<pid>/io */
+		seq_printf(m, "\n"
+			   "\ttimes: %lld %lld\n"
+			   "\trchar: %lld\n"
+			   "\twchar: %lld\n"
+			   "\tsyscr: %lld\n"
+			   "\tsyscw: %lld\n",
+			   r->mnt_time, ktime_get_seconds(),
+			   mnt_iostats_counter_read(r, MNTIOS_CHARS_RD),
+			   mnt_iostats_counter_read(r, MNTIOS_CHARS_WR),
+			   mnt_iostats_counter_read(r, MNTIOS_SYSCALLS_RD),
+			   mnt_iostats_counter_read(r, MNTIOS_SYSCALLS_WR));
 	}
 
 	seq_putc(m, '\n');
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 5/6] ovl: opt-in for per-mount io stats
  2022-02-28 11:39 [PATCH v2 0/6] Generic per-mount io stats Amir Goldstein
                   ` (3 preceding siblings ...)
  2022-02-28 11:39 ` [PATCH v2 4/6] fs: report " Amir Goldstein
@ 2022-02-28 11:39 ` Amir Goldstein
  2022-02-28 11:39 ` [PATCH v2 6/6] fuse: " Amir Goldstein
  5 siblings, 0 replies; 15+ messages in thread
From: Amir Goldstein @ 2022-02-28 11:39 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, containers, linux-fsdevel

Traditionally, system administrators have used the iostat utility
to track the amount of io performed to a local disk filesystem.

Similar functionality is provided for NFS mounts via the nfsstat
utility that reads the NFS client's stats from /proc/pid/mountstats.

There is currently no good way for a system administrator or a
monitoring application inside a container to track the amount of io
performed via overlayfs.

Opt-in for generic io stats via /proc/pid/mountstats to provide
that functionality.

This feature depends on CONFIG_MOUNT_IO_STATS.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/super.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 7bb0a47cb615..802e4ed567cc 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -2165,7 +2165,7 @@ static struct dentry *ovl_mount(struct file_system_type *fs_type, int flags,
 static struct file_system_type ovl_fs_type = {
 	.owner		= THIS_MODULE,
 	.name		= "overlay",
-	.fs_flags	= FS_USERNS_MOUNT,
+	.fs_flags	= FS_USERNS_MOUNT | FS_MOUNT_STATS,
 	.mount		= ovl_mount,
 	.kill_sb	= kill_anon_super,
 };
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 6/6] fuse: opt-in for per-mount io stats
  2022-02-28 11:39 [PATCH v2 0/6] Generic per-mount io stats Amir Goldstein
                   ` (4 preceding siblings ...)
  2022-02-28 11:39 ` [PATCH v2 5/6] ovl: opt-in for " Amir Goldstein
@ 2022-02-28 11:39 ` Amir Goldstein
  5 siblings, 0 replies; 15+ messages in thread
From: Amir Goldstein @ 2022-02-28 11:39 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-unionfs, containers, linux-fsdevel

Traditionally, system administrators have used the iostat utility
to track the amount of io performed to a local disk filesystem.

Similar functionality is provided for NFS mounts via the nfsstat
utility that reads the NFS client's stats from /proc/pid/mountstats.

There is currently no good way for a system administrator or a
monitoring application to track the amount of io performed via fuse
filesystems.

Opt-in for generic io stats via /proc/pid/mountstats to provide
that functionality.

It is possible to collect io stats on the server side inside libfuse,
but those io stats will not cover cached writes and reads.  Therefore,
implementing the server side io stats would be complementary to these
client side io stats.  Also, this feature provides the io stats for
existing fuse filesystem/lib release binaries.

This feature depends on CONFIG_MOUNT_IO_STATS.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/fuse/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 9ee36aa73251..5c58583a12fc 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1806,7 +1806,7 @@ static void fuse_kill_sb_anon(struct super_block *sb)
 static struct file_system_type fuse_fs_type = {
 	.owner		= THIS_MODULE,
 	.name		= "fuse",
-	.fs_flags	= FS_HAS_SUBTYPE | FS_USERNS_MOUNT,
+	.fs_flags	= FS_HAS_SUBTYPE | FS_USERNS_MOUNT | FS_MOUNT_STATS,
 	.init_fs_context = fuse_init_fs_context,
 	.parameters	= fuse_fs_parameters,
 	.kill_sb	= fuse_kill_sb_anon,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/6] fs: report per-mount io stats
  2022-02-28 11:39 ` [PATCH v2 4/6] fs: report " Amir Goldstein
@ 2022-02-28 15:06   ` Miklos Szeredi
  2022-02-28 16:18     ` Amir Goldstein
  2022-02-28 21:11   ` Dave Chinner
  1 sibling, 1 reply; 15+ messages in thread
From: Miklos Szeredi @ 2022-02-28 15:06 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Al Viro, overlayfs, containers, linux-fsdevel

On Mon, 28 Feb 2022 at 12:39, Amir Goldstein <amir73il@gmail.com> wrote:
>
> Show optional collected per-mount io stats in /proc/<pid>/mountstats
> for filesystems that do not implement their own show_stats() method
> and opted-in to generic per-mount stats with FS_MOUNT_STATS flag.

This would allow some filesystems to report per-mount I/O stats, while
leaving CIFS and NFS reporting a different set of per-sb stats.  This
doesn't sound very clean.

There was an effort to create saner and more efficient interfaces for
per-mount info.  IMO this should be part of that effort instead of
overloading the old interface.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/6] fs: report per-mount io stats
  2022-02-28 15:06   ` Miklos Szeredi
@ 2022-02-28 16:18     ` Amir Goldstein
  2022-02-28 16:31       ` Miklos Szeredi
  0 siblings, 1 reply; 15+ messages in thread
From: Amir Goldstein @ 2022-02-28 16:18 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, overlayfs, containers, linux-fsdevel

On Mon, Feb 28, 2022 at 5:06 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Mon, 28 Feb 2022 at 12:39, Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > Show optional collected per-mount io stats in /proc/<pid>/mountstats
> > for filesystems that do not implement their own show_stats() method
> > and opted-in to generic per-mount stats with FS_MOUNT_STATS flag.
>
> This would allow some filesystems to report per-mount I/O stats, while
> leaving CIFS and NFS reporting a different set of per-sb stats.  This
> doesn't sound very clean.
>
> There was an effort to create saner and more efficient interfaces for
> per-mount info.  IMO this should be part of that effort instead of
> overloading the old interface.
>

That's fair, but actually, I have no much need for per-mount I/O stats
in overlayfs/fuse use cases, so I could amend the patches to collect and
show per-sb I/O stats.

Then, the generic show_stats() will not be "overloading the old interface".
Instead, it will be creating a common implementation to share among different
filesystems and using an existing vfs interface as it was intended.

Would you be willing to accept adding per-sb I/O stats to overlayfs
and/or fuse via /proc/<pid>/mountstats?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/6] fs: report per-mount io stats
  2022-02-28 16:18     ` Amir Goldstein
@ 2022-02-28 16:31       ` Miklos Szeredi
  2022-02-28 17:06         ` Amir Goldstein
  0 siblings, 1 reply; 15+ messages in thread
From: Miklos Szeredi @ 2022-02-28 16:31 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Al Viro, overlayfs, containers, linux-fsdevel

On Mon, 28 Feb 2022 at 17:19, Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Mon, Feb 28, 2022 at 5:06 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Mon, 28 Feb 2022 at 12:39, Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > Show optional collected per-mount io stats in /proc/<pid>/mountstats
> > > for filesystems that do not implement their own show_stats() method
> > > and opted-in to generic per-mount stats with FS_MOUNT_STATS flag.
> >
> > This would allow some filesystems to report per-mount I/O stats, while
> > leaving CIFS and NFS reporting a different set of per-sb stats.  This
> > doesn't sound very clean.
> >
> > There was an effort to create saner and more efficient interfaces for
> > per-mount info.  IMO this should be part of that effort instead of
> > overloading the old interface.
> >
>
> That's fair, but actually, I have no much need for per-mount I/O stats
> in overlayfs/fuse use cases, so I could amend the patches to collect and
> show per-sb I/O stats.
>
> Then, the generic show_stats() will not be "overloading the old interface".
> Instead, it will be creating a common implementation to share among different
> filesystems and using an existing vfs interface as it was intended.
>
> Would you be willing to accept adding per-sb I/O stats to overlayfs
> and/or fuse via /proc/<pid>/mountstats?

Yes, that would certainly be more sane.   But I'm also wondering if
there could be some commonality with the stats provided by NFS...

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/6] fs: report per-mount io stats
  2022-02-28 16:31       ` Miklos Szeredi
@ 2022-02-28 17:06         ` Amir Goldstein
  0 siblings, 0 replies; 15+ messages in thread
From: Amir Goldstein @ 2022-02-28 17:06 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, overlayfs, containers, linux-fsdevel

On Mon, Feb 28, 2022 at 6:31 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Mon, 28 Feb 2022 at 17:19, Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Mon, Feb 28, 2022 at 5:06 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > >
> > > On Mon, 28 Feb 2022 at 12:39, Amir Goldstein <amir73il@gmail.com> wrote:
> > > >
> > > > Show optional collected per-mount io stats in /proc/<pid>/mountstats
> > > > for filesystems that do not implement their own show_stats() method
> > > > and opted-in to generic per-mount stats with FS_MOUNT_STATS flag.
> > >
> > > This would allow some filesystems to report per-mount I/O stats, while
> > > leaving CIFS and NFS reporting a different set of per-sb stats.  This
> > > doesn't sound very clean.
> > >
> > > There was an effort to create saner and more efficient interfaces for
> > > per-mount info.  IMO this should be part of that effort instead of
> > > overloading the old interface.
> > >
> >
> > That's fair, but actually, I have no much need for per-mount I/O stats
> > in overlayfs/fuse use cases, so I could amend the patches to collect and
> > show per-sb I/O stats.
> >
> > Then, the generic show_stats() will not be "overloading the old interface".
> > Instead, it will be creating a common implementation to share among different
> > filesystems and using an existing vfs interface as it was intended.
> >
> > Would you be willing to accept adding per-sb I/O stats to overlayfs
> > and/or fuse via /proc/<pid>/mountstats?
>
> Yes, that would certainly be more sane.   But I'm also wondering if
> there could be some commonality with the stats provided by NFS...
>

hmm, maybe, but look:

device genesis:/shares/media mounted on /mnt/nfs with fstype nfs4 statvers=1.1
opts: rw,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,acregmin=3,acregmax=60,acdirmin=30,acdirmax=60,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.122.1,local_lock=none
age: 38
impl_id: name='',domain='',date='0,0'
caps: caps=0x3ffbffff,wtmult=512,dtsize=32768,bsize=0,namlen=255
nfsv4: bm0=0xfdffbfff,bm1=0xf9be3e,bm2=0x68800,acl=0x3,sessions,pnfs=not
configured,lease_time=90,lease_expired=0
sec: flavor=1,pseudoflavor=1
events: 0 0 0 0 0 2 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
bytes: 0 0 0 0 0 0 0 0
RPC iostats version: 1.1  p/v: 100003/4 (nfs)
xprt: tcp 0 0 2 0 38 27 27 0 27 0 2 0 0
per-op statistics
       NULL: 1 1 0 44 24 0 0 0 0
       READ: 0 0 0 0 0 0 0 0 0
       WRITE: 0 0 0 0 0 0 0 0 0
       COMMIT: 0 0 0 0 0 0 0 0 0
       OPEN: 0 0 0 0 0 0 0 0 0
       OPEN_CONFIRM: 0 0 0 0 0 0 0 0 0
       OPEN_NOATTR: 0 0 0 0 0 0 0 0 0
       OPEN_DOWNGRADE: 0 0 0 0 0 0 0 0 0
       CLOSE: 0 0 0 0 0 0 0 0 0
       ...

Those stats are pretty specific to NFS.
Most of it is RPC protocol stats, not including cached /I/O stats,
so in fact the generic collected io stats could be added to nfs_show_stats()
and they will not break the <tag>:<value> format.

I used the same output format as /proc/<pid>/io for a reason.
The iotop utility parses /proc/<pid>/io to display io per task and
also displays total io stats for block devices.

I am envisioning an extended version of iotop (-m ?) that also
displays total iostats per mount/sb, when available.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/6] fs: report per-mount io stats
  2022-02-28 11:39 ` [PATCH v2 4/6] fs: report " Amir Goldstein
  2022-02-28 15:06   ` Miklos Szeredi
@ 2022-02-28 21:11   ` Dave Chinner
  2022-02-28 21:57     ` Amir Goldstein
  1 sibling, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2022-02-28 21:11 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Miklos Szeredi, Al Viro, linux-unionfs, containers, linux-fsdevel

On Mon, Feb 28, 2022 at 01:39:08PM +0200, Amir Goldstein wrote:
> Show optional collected per-mount io stats in /proc/<pid>/mountstats
> for filesystems that do not implement their own show_stats() method
> and opted-in to generic per-mount stats with FS_MOUNT_STATS flag.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/mount.h          |  1 +
>  fs/namespace.c      |  2 ++
>  fs/proc_namespace.c | 13 +++++++++++++
>  3 files changed, 16 insertions(+)
> 
> diff --git a/fs/mount.h b/fs/mount.h
> index f98bf4cd5b1a..2ab6308af78b 100644
> --- a/fs/mount.h
> +++ b/fs/mount.h
> @@ -91,6 +91,7 @@ struct mount {
>  	int mnt_id;			/* mount identifier */
>  	int mnt_group_id;		/* peer group identifier */
>  	int mnt_expiry_mark;		/* true if marked for expiry */
> +	time64_t mnt_time;		/* time of mount */
>  	struct hlist_head mnt_pins;
>  	struct hlist_head mnt_stuck_children;
>  } __randomize_layout;
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 3fb8f11a42a1..546f07ed44c5 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -220,6 +220,8 @@ static struct mount *alloc_vfsmnt(const char *name)
>  		mnt->mnt_count = 1;
>  		mnt->mnt_writers = 0;
>  #endif
> +		/* For proc/<pid>/mountstats */
> +		mnt->mnt_time = ktime_get_seconds();
>  
>  		INIT_HLIST_NODE(&mnt->mnt_hash);
>  		INIT_LIST_HEAD(&mnt->mnt_child);
> diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c
> index 49650e54d2f8..d744fb8543f5 100644
> --- a/fs/proc_namespace.c
> +++ b/fs/proc_namespace.c
> @@ -232,6 +232,19 @@ static int show_vfsstat(struct seq_file *m, struct vfsmount *mnt)
>  	if (sb->s_op->show_stats) {
>  		seq_putc(m, ' ');
>  		err = sb->s_op->show_stats(m, mnt_path.dentry);
> +	} else if (mnt_has_stats(mnt)) {
> +		/* Similar to /proc/<pid>/io */
> +		seq_printf(m, "\n"
> +			   "\ttimes: %lld %lld\n"
> +			   "\trchar: %lld\n"
> +			   "\twchar: %lld\n"
> +			   "\tsyscr: %lld\n"
> +			   "\tsyscw: %lld\n",
> +			   r->mnt_time, ktime_get_seconds(),
> +			   mnt_iostats_counter_read(r, MNTIOS_CHARS_RD),
> +			   mnt_iostats_counter_read(r, MNTIOS_CHARS_WR),
> +			   mnt_iostats_counter_read(r, MNTIOS_SYSCALLS_RD),
> +			   mnt_iostats_counter_read(r, MNTIOS_SYSCALLS_WR));

This doesn't scale as {cpus, mounts, counters, read frequency}
matrix explodes.  Please iterate the per-mount per cpu counters
once, adding up all counters in one pass to an array on stack, then
print them all from the array.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/6] fs: report per-mount io stats
  2022-02-28 21:11   ` Dave Chinner
@ 2022-02-28 21:57     ` Amir Goldstein
  2022-03-01  9:46       ` Dave Chinner
  0 siblings, 1 reply; 15+ messages in thread
From: Amir Goldstein @ 2022-02-28 21:57 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Miklos Szeredi, Al Viro, overlayfs, containers, linux-fsdevel

On Mon, Feb 28, 2022 at 11:11 PM Dave Chinner <david@fromorbit.com> wrote:
>
> On Mon, Feb 28, 2022 at 01:39:08PM +0200, Amir Goldstein wrote:
> > Show optional collected per-mount io stats in /proc/<pid>/mountstats
> > for filesystems that do not implement their own show_stats() method
> > and opted-in to generic per-mount stats with FS_MOUNT_STATS flag.
> >
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> >  fs/mount.h          |  1 +
> >  fs/namespace.c      |  2 ++
> >  fs/proc_namespace.c | 13 +++++++++++++
> >  3 files changed, 16 insertions(+)
> >
> > diff --git a/fs/mount.h b/fs/mount.h
> > index f98bf4cd5b1a..2ab6308af78b 100644
> > --- a/fs/mount.h
> > +++ b/fs/mount.h
> > @@ -91,6 +91,7 @@ struct mount {
> >       int mnt_id;                     /* mount identifier */
> >       int mnt_group_id;               /* peer group identifier */
> >       int mnt_expiry_mark;            /* true if marked for expiry */
> > +     time64_t mnt_time;              /* time of mount */
> >       struct hlist_head mnt_pins;
> >       struct hlist_head mnt_stuck_children;
> >  } __randomize_layout;
> > diff --git a/fs/namespace.c b/fs/namespace.c
> > index 3fb8f11a42a1..546f07ed44c5 100644
> > --- a/fs/namespace.c
> > +++ b/fs/namespace.c
> > @@ -220,6 +220,8 @@ static struct mount *alloc_vfsmnt(const char *name)
> >               mnt->mnt_count = 1;
> >               mnt->mnt_writers = 0;
> >  #endif
> > +             /* For proc/<pid>/mountstats */
> > +             mnt->mnt_time = ktime_get_seconds();
> >
> >               INIT_HLIST_NODE(&mnt->mnt_hash);
> >               INIT_LIST_HEAD(&mnt->mnt_child);
> > diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c
> > index 49650e54d2f8..d744fb8543f5 100644
> > --- a/fs/proc_namespace.c
> > +++ b/fs/proc_namespace.c
> > @@ -232,6 +232,19 @@ static int show_vfsstat(struct seq_file *m, struct vfsmount *mnt)
> >       if (sb->s_op->show_stats) {
> >               seq_putc(m, ' ');
> >               err = sb->s_op->show_stats(m, mnt_path.dentry);
> > +     } else if (mnt_has_stats(mnt)) {
> > +             /* Similar to /proc/<pid>/io */
> > +             seq_printf(m, "\n"
> > +                        "\ttimes: %lld %lld\n"
> > +                        "\trchar: %lld\n"
> > +                        "\twchar: %lld\n"
> > +                        "\tsyscr: %lld\n"
> > +                        "\tsyscw: %lld\n",
> > +                        r->mnt_time, ktime_get_seconds(),
> > +                        mnt_iostats_counter_read(r, MNTIOS_CHARS_RD),
> > +                        mnt_iostats_counter_read(r, MNTIOS_CHARS_WR),
> > +                        mnt_iostats_counter_read(r, MNTIOS_SYSCALLS_RD),
> > +                        mnt_iostats_counter_read(r, MNTIOS_SYSCALLS_WR));
>
> This doesn't scale as {cpus, mounts, counters, read frequency}
> matrix explodes.  Please iterate the per-mount per cpu counters
> once, adding up all counters in one pass to an array on stack, then
> print them all from the array.

I am planning to move to per-sb iostats and was thinking of using
an array of 4 struct percpu_counter. That will make this sort of iteration
more challenging.

Do you really think the read frequency of /proc/self/mountstats
warrants such performance optimization?

It's not like the case of the mighty struct xfsstats.
It is only going to fold 4 per cpu iterations into 1.
This doesn't look like a game changer to me.
Am I missing something?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/6] fs: report per-mount io stats
  2022-02-28 21:57     ` Amir Goldstein
@ 2022-03-01  9:46       ` Dave Chinner
  2022-03-01 10:56         ` Amir Goldstein
  0 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2022-03-01  9:46 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Miklos Szeredi, Al Viro, overlayfs, containers, linux-fsdevel

On Mon, Feb 28, 2022 at 11:57:19PM +0200, Amir Goldstein wrote:
> On Mon, Feb 28, 2022 at 11:11 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Mon, Feb 28, 2022 at 01:39:08PM +0200, Amir Goldstein wrote:
> > > Show optional collected per-mount io stats in /proc/<pid>/mountstats
> > > for filesystems that do not implement their own show_stats() method
> > > and opted-in to generic per-mount stats with FS_MOUNT_STATS flag.
> > >
> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > ---
> > >  fs/mount.h          |  1 +
> > >  fs/namespace.c      |  2 ++
> > >  fs/proc_namespace.c | 13 +++++++++++++
> > >  3 files changed, 16 insertions(+)
> > >
> > > diff --git a/fs/mount.h b/fs/mount.h
> > > index f98bf4cd5b1a..2ab6308af78b 100644
> > > --- a/fs/mount.h
> > > +++ b/fs/mount.h
> > > @@ -91,6 +91,7 @@ struct mount {
> > >       int mnt_id;                     /* mount identifier */
> > >       int mnt_group_id;               /* peer group identifier */
> > >       int mnt_expiry_mark;            /* true if marked for expiry */
> > > +     time64_t mnt_time;              /* time of mount */
> > >       struct hlist_head mnt_pins;
> > >       struct hlist_head mnt_stuck_children;
> > >  } __randomize_layout;
> > > diff --git a/fs/namespace.c b/fs/namespace.c
> > > index 3fb8f11a42a1..546f07ed44c5 100644
> > > --- a/fs/namespace.c
> > > +++ b/fs/namespace.c
> > > @@ -220,6 +220,8 @@ static struct mount *alloc_vfsmnt(const char *name)
> > >               mnt->mnt_count = 1;
> > >               mnt->mnt_writers = 0;
> > >  #endif
> > > +             /* For proc/<pid>/mountstats */
> > > +             mnt->mnt_time = ktime_get_seconds();
> > >
> > >               INIT_HLIST_NODE(&mnt->mnt_hash);
> > >               INIT_LIST_HEAD(&mnt->mnt_child);
> > > diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c
> > > index 49650e54d2f8..d744fb8543f5 100644
> > > --- a/fs/proc_namespace.c
> > > +++ b/fs/proc_namespace.c
> > > @@ -232,6 +232,19 @@ static int show_vfsstat(struct seq_file *m, struct vfsmount *mnt)
> > >       if (sb->s_op->show_stats) {
> > >               seq_putc(m, ' ');
> > >               err = sb->s_op->show_stats(m, mnt_path.dentry);
> > > +     } else if (mnt_has_stats(mnt)) {
> > > +             /* Similar to /proc/<pid>/io */
> > > +             seq_printf(m, "\n"
> > > +                        "\ttimes: %lld %lld\n"
> > > +                        "\trchar: %lld\n"
> > > +                        "\twchar: %lld\n"
> > > +                        "\tsyscr: %lld\n"
> > > +                        "\tsyscw: %lld\n",
> > > +                        r->mnt_time, ktime_get_seconds(),
> > > +                        mnt_iostats_counter_read(r, MNTIOS_CHARS_RD),
> > > +                        mnt_iostats_counter_read(r, MNTIOS_CHARS_WR),
> > > +                        mnt_iostats_counter_read(r, MNTIOS_SYSCALLS_RD),
> > > +                        mnt_iostats_counter_read(r, MNTIOS_SYSCALLS_WR));
> >
> > This doesn't scale as {cpus, mounts, counters, read frequency}
> > matrix explodes.  Please iterate the per-mount per cpu counters
> > once, adding up all counters in one pass to an array on stack, then
> > print them all from the array.
> 
> I am planning to move to per-sb iostats and was thinking of using
> an array of 4 struct percpu_counter. That will make this sort of iteration
> more challenging.

No, it would get rid of it entirely. percpu_counter_read_positive()
does not require any summing at all - that's a much better solution
than a hand rolled set of percpu counters. Please do this.

> Do you really think the read frequency of /proc/self/mountstats
> warrants such performance optimization?

We get bug reports every so often about the overhead of frequently
summing per-cpu stats on large systems. Nothing ratelimits or
restricts access to /proc/self/mountstats, so when you have a
thousand CPUs and a million monkeys...

Rule of thumb: don't do computationally expensive things to generate
data for globally accessible sysfs files.

> It's not like the case of the mighty struct xfsstats.
> It is only going to fold 4 per cpu iterations into 1.
> This doesn't look like a game changer to me.
> Am I missing something?

I'm just pointing out something we've had problems with in
the past and are trying to help you avoid making the same mistakes.
That's what reviewers are supposed to do, yes?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/6] fs: report per-mount io stats
  2022-03-01  9:46       ` Dave Chinner
@ 2022-03-01 10:56         ` Amir Goldstein
  0 siblings, 0 replies; 15+ messages in thread
From: Amir Goldstein @ 2022-03-01 10:56 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Miklos Szeredi, Al Viro, overlayfs, containers, linux-fsdevel

On Tue, Mar 1, 2022 at 11:46 AM Dave Chinner <david@fromorbit.com> wrote:
>
> On Mon, Feb 28, 2022 at 11:57:19PM +0200, Amir Goldstein wrote:
> > On Mon, Feb 28, 2022 at 11:11 PM Dave Chinner <david@fromorbit.com> wrote:
> > >
> > > On Mon, Feb 28, 2022 at 01:39:08PM +0200, Amir Goldstein wrote:
> > > > Show optional collected per-mount io stats in /proc/<pid>/mountstats
> > > > for filesystems that do not implement their own show_stats() method
> > > > and opted-in to generic per-mount stats with FS_MOUNT_STATS flag.
> > > >
> > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > > ---
> > > >  fs/mount.h          |  1 +
> > > >  fs/namespace.c      |  2 ++
> > > >  fs/proc_namespace.c | 13 +++++++++++++
> > > >  3 files changed, 16 insertions(+)
> > > >
> > > > diff --git a/fs/mount.h b/fs/mount.h
> > > > index f98bf4cd5b1a..2ab6308af78b 100644
> > > > --- a/fs/mount.h
> > > > +++ b/fs/mount.h
> > > > @@ -91,6 +91,7 @@ struct mount {
> > > >       int mnt_id;                     /* mount identifier */
> > > >       int mnt_group_id;               /* peer group identifier */
> > > >       int mnt_expiry_mark;            /* true if marked for expiry */
> > > > +     time64_t mnt_time;              /* time of mount */
> > > >       struct hlist_head mnt_pins;
> > > >       struct hlist_head mnt_stuck_children;
> > > >  } __randomize_layout;
> > > > diff --git a/fs/namespace.c b/fs/namespace.c
> > > > index 3fb8f11a42a1..546f07ed44c5 100644
> > > > --- a/fs/namespace.c
> > > > +++ b/fs/namespace.c
> > > > @@ -220,6 +220,8 @@ static struct mount *alloc_vfsmnt(const char *name)
> > > >               mnt->mnt_count = 1;
> > > >               mnt->mnt_writers = 0;
> > > >  #endif
> > > > +             /* For proc/<pid>/mountstats */
> > > > +             mnt->mnt_time = ktime_get_seconds();
> > > >
> > > >               INIT_HLIST_NODE(&mnt->mnt_hash);
> > > >               INIT_LIST_HEAD(&mnt->mnt_child);
> > > > diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c
> > > > index 49650e54d2f8..d744fb8543f5 100644
> > > > --- a/fs/proc_namespace.c
> > > > +++ b/fs/proc_namespace.c
> > > > @@ -232,6 +232,19 @@ static int show_vfsstat(struct seq_file *m, struct vfsmount *mnt)
> > > >       if (sb->s_op->show_stats) {
> > > >               seq_putc(m, ' ');
> > > >               err = sb->s_op->show_stats(m, mnt_path.dentry);
> > > > +     } else if (mnt_has_stats(mnt)) {
> > > > +             /* Similar to /proc/<pid>/io */
> > > > +             seq_printf(m, "\n"
> > > > +                        "\ttimes: %lld %lld\n"
> > > > +                        "\trchar: %lld\n"
> > > > +                        "\twchar: %lld\n"
> > > > +                        "\tsyscr: %lld\n"
> > > > +                        "\tsyscw: %lld\n",
> > > > +                        r->mnt_time, ktime_get_seconds(),
> > > > +                        mnt_iostats_counter_read(r, MNTIOS_CHARS_RD),
> > > > +                        mnt_iostats_counter_read(r, MNTIOS_CHARS_WR),
> > > > +                        mnt_iostats_counter_read(r, MNTIOS_SYSCALLS_RD),
> > > > +                        mnt_iostats_counter_read(r, MNTIOS_SYSCALLS_WR));
> > >
> > > This doesn't scale as {cpus, mounts, counters, read frequency}
> > > matrix explodes.  Please iterate the per-mount per cpu counters
> > > once, adding up all counters in one pass to an array on stack, then
> > > print them all from the array.
> >
> > I am planning to move to per-sb iostats and was thinking of using
> > an array of 4 struct percpu_counter. That will make this sort of iteration
> > more challenging.
>
> No, it would get rid of it entirely. percpu_counter_read_positive()
> does not require any summing at all - that's a much better solution
> than a hand rolled set of percpu counters. Please do this.
>
> > Do you really think the read frequency of /proc/self/mountstats
> > warrants such performance optimization?
>
> We get bug reports every so often about the overhead of frequently
> summing per-cpu stats on large systems. Nothing ratelimits or
> restricts access to /proc/self/mountstats, so when you have a
> thousand CPUs and a million monkeys...
>
> Rule of thumb: don't do computationally expensive things to generate
> data for globally accessible sysfs files.
>
> > It's not like the case of the mighty struct xfsstats.
> > It is only going to fold 4 per cpu iterations into 1.
> > This doesn't look like a game changer to me.
> > Am I missing something?
>
> I'm just pointing out something we've had problems with in
> the past and are trying to help you avoid making the same mistakes.
> That's what reviewers are supposed to do, yes?

Yes, thank you :)

Amir.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-03-01 10:57 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-28 11:39 [PATCH v2 0/6] Generic per-mount io stats Amir Goldstein
2022-02-28 11:39 ` [PATCH v2 1/6] fs: add iostats counters to struct mount Amir Goldstein
2022-02-28 11:39 ` [PATCH v2 2/6] fs: tidy up fs_flags definitions Amir Goldstein
2022-02-28 11:39 ` [PATCH v2 3/6] fs: collect per-mount io stats Amir Goldstein
2022-02-28 11:39 ` [PATCH v2 4/6] fs: report " Amir Goldstein
2022-02-28 15:06   ` Miklos Szeredi
2022-02-28 16:18     ` Amir Goldstein
2022-02-28 16:31       ` Miklos Szeredi
2022-02-28 17:06         ` Amir Goldstein
2022-02-28 21:11   ` Dave Chinner
2022-02-28 21:57     ` Amir Goldstein
2022-03-01  9:46       ` Dave Chinner
2022-03-01 10:56         ` Amir Goldstein
2022-02-28 11:39 ` [PATCH v2 5/6] ovl: opt-in for " Amir Goldstein
2022-02-28 11:39 ` [PATCH v2 6/6] fuse: " Amir Goldstein

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.