* [RFC PATCH 0/3] Generic per-mount io stats
@ 2021-01-07 21:43 Amir Goldstein
  2021-01-07 21:43 ` [RFC PATCH 1/3] fs: add iostats counters to struct mount Amir Goldstein
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Amir Goldstein @ 2021-01-07 21:43 UTC
  To: Miklos Szeredi, Al Viro; +Cc: linux-fsdevel

Miklos,

I was trying to address the lack of iostat report for non-blockdev
filesystems such as overlayfs and fuse.

NFS has already addressed this with its own custom stats collection,
which is displayed in /proc/<pid>/mountstats.

When looking at the options, I found that a generic solution is quite
simple and could serve all filesystems that opt in to use it.

This short patch set results in the following mountstats example report:

device overlay mounted on /mnt with fstype overlay
	times: 125 153
	rchar: 12
	wchar: 0
	syscr: 2
	syscw: 0

The choice to collect and report io stats by mount and not by sb is
quite arbitrary; it was quite easy to implement and is a natural fit
for the existing mountstats proc file.

I used the arbitrary flag FS_USERNS_MOUNT as an example of a way for a
filesystem to opt in to mount io stats, but it could be either an FS_,
SB_ or MNT_ flag.  I do not anticipate a shortage of opinions on this
matter.

As for performance, the io accounting hooks are the existing hooks for
task io accounting.  Mount io stats add a dereference of mnt_pcp for
the filesystems that opt in and one per-cpu var update.  The dereference
of mnt_sb->s_type->fs_flags is temporary as we will probably want to
use an MNT_ flag, whether kernel internal or user controlled.
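
To illustrate the cost model described above, here is a minimal userspace
sketch of the per-cpu counter pattern: the hot path touches only the slot
of one CPU, and the sum over all slots is computed only when the report is
read.  This is not the kernel code; NR_CPUS_SKETCH, struct mount_sketch and
the iostats_* helpers are hypothetical names, and the CPU index is passed
in explicitly instead of using the kernel's this_cpu_*() operations.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4	/* hypothetical fixed CPU count for this sketch */

enum { IOS_CHARS_RD, IOS_CHARS_WR, IOS_SYSCALLS_RD, IOS_SYSCALLS_WR, IOS_NUM };

/* One slot per CPU, so a writer only ever touches its own slot. */
struct pcp_iostats {
	int64_t counter[IOS_NUM];
};

struct mount_sketch {
	struct pcp_iostats pcp[NR_CPUS_SKETCH];
};

/* Hot path: a single add on the slot of the CPU doing the I/O. */
static void iostats_add(struct mount_sketch *m, int cpu, int id, int64_t n)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	assert(id >= 0 && id < IOS_NUM);
	m->pcp[cpu].counter[id] += n;
}

/* Slow path: sum every CPU's slot; only needed when the report is read. */
static int64_t iostats_read(const struct mount_sketch *m, int id)
{
	int64_t sum = 0;

	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += m->pcp[cpu].counter[id];
	return sum;
}
```

The design choice is to keep writers cheap and let the rare reader pay for
walking all the slots.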

What does everyone think about this?

Al,

did I break any subtle rules of the vfs?

Thanks,
Amir.

Amir Goldstein (3):
  fs: add iostats counters to struct mount
  fs: collect per-mount io stats
  fs: report per-mount io stats

 fs/Kconfig          |  9 +++++
 fs/mount.h          | 54 ++++++++++++++++++++++++++++
 fs/namespace.c      | 19 ++++++++++
 fs/proc_namespace.c | 13 +++++++
 fs/read_write.c     | 87 ++++++++++++++++++++++++++++++++-------------
 5 files changed, 158 insertions(+), 24 deletions(-)

-- 
2.25.1



* [RFC PATCH 1/3] fs: add iostats counters to struct mount
  2021-01-07 21:43 [RFC PATCH 0/3] Generic per-mount io stats Amir Goldstein
@ 2021-01-07 21:43 ` Amir Goldstein
  2021-01-07 21:44 ` [RFC PATCH 2/3] fs: collect per-mount io stats Amir Goldstein
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Amir Goldstein @ 2021-01-07 21:43 UTC
  To: Miklos Szeredi, Al Viro; +Cc: linux-fsdevel

With config FS_MOUNT_STATS, add an array of counters to struct mnt_pcp
that will be used to collect I/O statistics.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---

Please note that the dependency on SMP is just for the RFC.

Thanks,
Amir.

 fs/Kconfig     |  9 +++++++++
 fs/mount.h     | 32 ++++++++++++++++++++++++++++++++
 fs/namespace.c | 17 +++++++++++++++++
 3 files changed, 58 insertions(+)

diff --git a/fs/Kconfig b/fs/Kconfig
index aa4c12282301..7473bdf4bbfb 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -15,6 +15,15 @@ config VALIDATE_FS_PARSER
 	  Enable this to perform validation of the parameter description for a
 	  filesystem when it is registered.
 
+config FS_MOUNT_STATS
+	bool "Enable per-mount I/O statistics"
+	depends on SMP
+	help
+	  Enable this to allow collecting per-mount I/O statistics and display
+	  them in /proc/<pid>/mountstats.
+
+	  Say N if unsure.
+
 if BLOCK
 
 config FS_IOMAP
diff --git a/fs/mount.h b/fs/mount.h
index ce6c376e0bc2..2bf0df64ded5 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -24,9 +24,25 @@ struct mnt_namespace {
 	unsigned int		pending_mounts;
 } __randomize_layout;
 
+/* Similar to task_io_accounting members */
+enum {
+	MNTIOS_CHARS_RD,	/* bytes read via syscalls */
+	MNTIOS_CHARS_WR,	/* bytes written via syscalls */
+	MNTIOS_SYSCALLS_RD,	/* # of read syscalls */
+	MNTIOS_SYSCALLS_WR,	/* # of write syscalls */
+	_MNTIOS_COUNTERS_NUM
+};
+
+struct mnt_iostats {
+	s64 counter[_MNTIOS_COUNTERS_NUM];
+};
+
 struct mnt_pcp {
 	int mnt_count;
 	int mnt_writers;
+#ifdef CONFIG_FS_MOUNT_STATS
+	struct mnt_iostats iostats;
+#endif
 };
 
 struct mountpoint {
@@ -158,3 +174,19 @@ static inline bool is_anon_ns(struct mnt_namespace *ns)
 }
 
 extern void mnt_cursor_del(struct mnt_namespace *ns, struct mount *cursor);
+
+static inline void mnt_iostats_counter_inc(struct mount *mnt, int id)
+{
+#ifdef CONFIG_FS_MOUNT_STATS
+	this_cpu_inc(mnt->mnt_pcp->iostats.counter[id]);
+#endif
+}
+
+static inline void mnt_iostats_counter_add(struct mount *mnt, int id, s64 n)
+{
+#ifdef CONFIG_FS_MOUNT_STATS
+	this_cpu_add(mnt->mnt_pcp->iostats.counter[id], n);
+#endif
+}
+
+extern s64 mnt_iostats_counter_read(struct mount *mnt, int id);
diff --git a/fs/namespace.c b/fs/namespace.c
index d2db7dfe232b..04b35dfcc71f 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -283,6 +283,23 @@ static unsigned int mnt_get_writers(struct mount *mnt)
 #endif
 }
 
+s64 mnt_iostats_counter_read(struct mount *mnt, int id)
+{
+	s64 count = 0;
+#ifdef CONFIG_FS_MOUNT_STATS
+	/*
+	 * FS_MOUNT_STATS depends on SMP.
+	 * Should be trivial to implement for !SMP if anyone cares...
+	 */
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		count += per_cpu_ptr(mnt->mnt_pcp, cpu)->iostats.counter[id];
+	}
+#endif
+	return count;
+}
+
 static int mnt_is_readonly(struct vfsmount *mnt)
 {
 	if (mnt->mnt_sb->s_readonly_remount)
-- 
2.25.1



* [RFC PATCH 2/3] fs: collect per-mount io stats
  2021-01-07 21:43 [RFC PATCH 0/3] Generic per-mount io stats Amir Goldstein
  2021-01-07 21:43 ` [RFC PATCH 1/3] fs: add iostats counters to struct mount Amir Goldstein
@ 2021-01-07 21:44 ` Amir Goldstein
  2021-01-07 21:44 ` [RFC PATCH 3/3] fs: report " Amir Goldstein
  2021-01-08 16:41 ` [RFC PATCH 0/3] Generic " Amir Goldstein
  3 siblings, 0 replies; 5+ messages in thread
From: Amir Goldstein @ 2021-01-07 21:44 UTC
  To: Miklos Szeredi, Al Viro; +Cc: linux-fsdevel

Replace task io accounting helpers with wrappers that may also collect
per-mount stats.

Currently, just for example, stats are collected for mounts of
filesystems with flag FS_USERNS_MOUNT.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---

I used the arbitrary flag FS_USERNS_MOUNT as an example of a way for a
filesystem to opt in to mount io stats, but it could be either an FS_,
SB_ or MNT_ flag.  I do not anticipate a shortage of opinions on this
matter.
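
As a rough illustration of two of the opt-in options, the userspace sketch
below contrasts a flag on the filesystem type (reached through the
superblock, as this RFC does) with a flag stored on the mount itself.  The
sk_* types and the SK_FS_IOSTATS / SK_MNT_IOSTATS bits are hypothetical and
exist only for this example; no such kernel flags are proposed here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; the values are made up for this sketch. */
#define SK_FS_IOSTATS	(1u << 0)	/* set once per filesystem type */
#define SK_MNT_IOSTATS	(1u << 1)	/* set per mount */

struct sk_fs_type { unsigned int fs_flags; };
struct sk_sb      { const struct sk_fs_type *s_type; };
struct sk_mnt     { const struct sk_sb *mnt_sb; unsigned int mnt_flags; };

/* FS_-style opt-in: two extra pointer dereferences to reach the flag. */
static bool has_stats_fs(const struct sk_mnt *m)
{
	return (m->mnt_sb->s_type->fs_flags & SK_FS_IOSTATS) != 0;
}

/* MNT_-style opt-in: the flag lives in the mount that is already at hand. */
static bool has_stats_mnt(const struct sk_mnt *m)
{
	return (m->mnt_flags & SK_MNT_IOSTATS) != 0;
}
```

With the FS_-style check every mount of the filesystem type opts in at
once; the MNT_-style check allows the decision to differ per mount.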

As for performance, the io accounting hooks are the existing hooks for
task io accounting.  Mount io stats add a dereference of mnt_pcp for
the filesystems that opt in and one per-cpu var update.  The dereference
of mnt_sb->s_type->fs_flags is temporary as we will probably want to
use an MNT_ flag, whether kernel internal or user controlled.

Thanks,
Amir.

 fs/mount.h      | 21 ++++++++++++
 fs/read_write.c | 87 +++++++++++++++++++++++++++++++++++--------------
 2 files changed, 84 insertions(+), 24 deletions(-)

diff --git a/fs/mount.h b/fs/mount.h
index 2bf0df64ded5..81db83c36140 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -175,6 +175,27 @@ static inline bool is_anon_ns(struct mnt_namespace *ns)
 
 extern void mnt_cursor_del(struct mnt_namespace *ns, struct mount *cursor);
 
+static inline bool mnt_has_stats(struct vfsmount *mnt)
+{
+#ifdef CONFIG_FS_MOUNT_STATS
+	/* Just for example. Should this be an FS_, SB_ or MNT_ flag? */
+	return (mnt->mnt_sb->s_type->fs_flags & FS_USERNS_MOUNT);
+#else
+	return false;
+#endif
+}
+
+static inline struct mount *file_mnt_has_stats(struct file *file)
+{
+#ifdef CONFIG_FS_MOUNT_STATS
+	struct vfsmount *mnt = file->f_path.mnt;
+
+	if (mnt_has_stats(mnt))
+		return real_mount(mnt);
+#endif
+	return NULL;
+}
+
 static inline void mnt_iostats_counter_inc(struct mount *mnt, int id)
 {
 #ifdef CONFIG_FS_MOUNT_STATS
diff --git a/fs/read_write.c b/fs/read_write.c
index 75f764b43418..7e3e1ebfefb4 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -21,6 +21,7 @@
 #include <linux/mount.h>
 #include <linux/fs.h>
 #include "internal.h"
+#include "mount.h"
 
 #include <linux/uaccess.h>
 #include <asm/unistd.h>
@@ -34,6 +35,44 @@ const struct file_operations generic_ro_fops = {
 
 EXPORT_SYMBOL(generic_ro_fops);
 
+static void file_add_rchar(struct file *file, struct task_struct *tsk,
+			   ssize_t amt)
+{
+	struct mount *m = file_mnt_has_stats(file);
+
+	if (m)
+		mnt_iostats_counter_add(m, MNTIOS_CHARS_RD, amt);
+	add_rchar(tsk, amt);
+}
+
+static void file_add_wchar(struct file *file, struct task_struct *tsk,
+			   ssize_t amt)
+{
+	struct mount *m = file_mnt_has_stats(file);
+
+	if (m)
+		mnt_iostats_counter_add(m, MNTIOS_CHARS_WR, amt);
+	add_wchar(tsk, amt);
+}
+
+static void file_inc_syscr(struct file *file, struct task_struct *tsk)
+{
+	struct mount *m = file_mnt_has_stats(file);
+
+	if (m)
+		mnt_iostats_counter_inc(m, MNTIOS_SYSCALLS_RD);
+	inc_syscr(tsk);
+}
+
+static void file_inc_syscw(struct file *file, struct task_struct *tsk)
+{
+	struct mount *m = file_mnt_has_stats(file);
+
+	if (m)
+		mnt_iostats_counter_inc(m, MNTIOS_SYSCALLS_WR);
+	inc_syscw(tsk);
+}
+
 static inline bool unsigned_offsets(struct file *file)
 {
 	return file->f_mode & FMODE_UNSIGNED_OFFSET;
@@ -456,9 +495,9 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 		if (pos)
 			*pos = kiocb.ki_pos;
 		fsnotify_access(file);
-		add_rchar(current, ret);
+		file_add_rchar(file, current, ret);
 	}
-	inc_syscr(current);
+	file_inc_syscr(file, current);
 	return ret;
 }
 
@@ -498,9 +537,9 @@ ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
 		ret = -EINVAL;
 	if (ret > 0) {
 		fsnotify_access(file);
-		add_rchar(current, ret);
+		file_add_rchar(file, current, ret);
 	}
-	inc_syscr(current);
+	file_inc_syscr(file, current);
 	return ret;
 }
 
@@ -552,9 +591,9 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 		if (pos)
 			*pos = kiocb.ki_pos;
 		fsnotify_modify(file);
-		add_wchar(current, ret);
+		file_add_wchar(file, current, ret);
 	}
-	inc_syscw(current);
+	file_inc_syscw(file, current);
 	return ret;
 }
 /*
@@ -607,9 +646,9 @@ ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_
 		ret = -EINVAL;
 	if (ret > 0) {
 		fsnotify_modify(file);
-		add_wchar(current, ret);
+		file_add_wchar(file, current, ret);
 	}
-	inc_syscw(current);
+	file_inc_syscw(file, current);
 	file_end_write(file);
 	return ret;
 }
@@ -962,8 +1001,8 @@ static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
 	}
 
 	if (ret > 0)
-		add_rchar(current, ret);
-	inc_syscr(current);
+		file_add_rchar(f.file, current, ret);
+	file_inc_syscr(f.file, current);
 	return ret;
 }
 
@@ -986,8 +1025,8 @@ static ssize_t do_writev(unsigned long fd, const struct iovec __user *vec,
 	}
 
 	if (ret > 0)
-		add_wchar(current, ret);
-	inc_syscw(current);
+		file_add_wchar(f.file, current, ret);
+	file_inc_syscw(f.file, current);
 	return ret;
 }
 
@@ -1015,8 +1054,8 @@ static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec,
 	}
 
 	if (ret > 0)
-		add_rchar(current, ret);
-	inc_syscr(current);
+		file_add_rchar(f.file, current, ret);
+	file_inc_syscr(f.file, current);
 	return ret;
 }
 
@@ -1038,8 +1077,8 @@ static ssize_t do_pwritev(unsigned long fd, const struct iovec __user *vec,
 	}
 
 	if (ret > 0)
-		add_wchar(current, ret);
-	inc_syscw(current);
+		file_add_wchar(f.file, current, ret);
+	file_inc_syscw(f.file, current);
 	return ret;
 }
 
@@ -1258,8 +1297,8 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
 	file_end_write(out.file);
 
 	if (retval > 0) {
-		add_rchar(current, retval);
-		add_wchar(current, retval);
+		file_add_rchar(in.file, current, retval);
+		file_add_wchar(out.file, current, retval);
 		fsnotify_access(in.file);
 		fsnotify_modify(out.file);
 		out.file->f_pos = out_pos;
@@ -1269,8 +1308,8 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
 			in.file->f_pos = pos;
 	}
 
-	inc_syscr(current);
-	inc_syscw(current);
+	file_inc_syscr(in.file, current);
+	file_inc_syscw(out.file, current);
 	if (pos > max)
 		retval = -EOVERFLOW;
 
@@ -1519,13 +1558,13 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
 done:
 	if (ret > 0) {
 		fsnotify_access(file_in);
-		add_rchar(current, ret);
+		file_add_rchar(file_in, current, ret);
 		fsnotify_modify(file_out);
-		add_wchar(current, ret);
+		file_add_wchar(file_out, current, ret);
 	}
 
-	inc_syscr(current);
-	inc_syscw(current);
+	file_inc_syscr(file_in, current);
+	file_inc_syscw(file_out, current);
 
 	file_end_write(file_out);
 
-- 
2.25.1



* [RFC PATCH 3/3] fs: report per-mount io stats
  2021-01-07 21:43 [RFC PATCH 0/3] Generic per-mount io stats Amir Goldstein
  2021-01-07 21:43 ` [RFC PATCH 1/3] fs: add iostats counters to struct mount Amir Goldstein
  2021-01-07 21:44 ` [RFC PATCH 2/3] fs: collect per-mount io stats Amir Goldstein
@ 2021-01-07 21:44 ` Amir Goldstein
  2021-01-08 16:41 ` [RFC PATCH 0/3] Generic " Amir Goldstein
  3 siblings, 0 replies; 5+ messages in thread
From: Amir Goldstein @ 2021-01-07 21:44 UTC
  To: Miklos Szeredi, Al Viro; +Cc: linux-fsdevel

Show optional collected per-mount io stats in /proc/<pid>/mountstats
for filesystems that do not implement their own show_stats() method.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---

See the following snippet from an example mountstats report:

device overlay mounted on /mnt with fstype overlay
	times: 125 153
	rchar: 12
	wchar: 0
	syscr: 2
	syscw: 0
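
For anyone who wants to consume this output, the sketch below parses the
"key: value" lines of the report in userspace C.  It assumes the RFC format
shown above stays as is; struct mnt_report and parse_line() are
hypothetical names.  The two values on the "times:" line are read as the
mount time and the current time, both from ktime_get_seconds(), as printed
by the seq_printf() call in this patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct mnt_report {
	long long mounted_at, now;	/* the two values on the "times:" line */
	long long rchar, wchar, syscr, syscw;
};

/* Parse one "key: value" line; returns 1 if the line was recognised. */
static int parse_line(const char *line, struct mnt_report *r)
{
	char key[16];
	long long a = 0, b = 0;
	/* The leading space in the format skips the tab before the key. */
	int n = sscanf(line, " %15[a-z]: %lld %lld", key, &a, &b);

	if (n < 2)
		return 0;
	if (!strcmp(key, "times") && n == 3) {
		r->mounted_at = a;
		r->now = b;
	} else if (!strcmp(key, "rchar")) {
		r->rchar = a;
	} else if (!strcmp(key, "wchar")) {
		r->wchar = a;
	} else if (!strcmp(key, "syscr")) {
		r->syscr = a;
	} else if (!strcmp(key, "syscw")) {
		r->syscw = a;
	} else {
		return 0;
	}
	return 1;
}
```

For the example report, now - mounted_at gives 153 - 125 = 28 seconds since
the mount was created.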

Thanks,
Amir.

 fs/mount.h          |  1 +
 fs/namespace.c      |  2 ++
 fs/proc_namespace.c | 13 +++++++++++++
 3 files changed, 16 insertions(+)

diff --git a/fs/mount.h b/fs/mount.h
index 81db83c36140..1f262892a6ed 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -91,6 +91,7 @@ struct mount {
 	int mnt_id;			/* mount identifier */
 	int mnt_group_id;		/* peer group identifier */
 	int mnt_expiry_mark;		/* true if marked for expiry */
+	time64_t mnt_time;		/* time of mount */
 	struct hlist_head mnt_pins;
 	struct hlist_head mnt_stuck_children;
 } __randomize_layout;
diff --git a/fs/namespace.c b/fs/namespace.c
index 04b35dfcc71f..3a91234e5fd0 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -198,6 +198,8 @@ static struct mount *alloc_vfsmnt(const char *name)
 		mnt->mnt_count = 1;
 		mnt->mnt_writers = 0;
 #endif
+		/* For /proc/<pid>/mountstats */
+		mnt->mnt_time = ktime_get_seconds();
 
 		INIT_HLIST_NODE(&mnt->mnt_hash);
 		INIT_LIST_HEAD(&mnt->mnt_child);
diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c
index eafb75755fa3..34aea7f3f550 100644
--- a/fs/proc_namespace.c
+++ b/fs/proc_namespace.c
@@ -229,6 +229,19 @@ static int show_vfsstat(struct seq_file *m, struct vfsmount *mnt)
 	if (sb->s_op->show_stats) {
 		seq_putc(m, ' ');
 		err = sb->s_op->show_stats(m, mnt_path.dentry);
+	} else if (mnt_has_stats(mnt)) {
+		/* Similar to /proc/<pid>/io */
+		seq_printf(m, "\n"
+			   "\ttimes: %lld %lld\n"
+			   "\trchar: %lld\n"
+			   "\twchar: %lld\n"
+			   "\tsyscr: %lld\n"
+			   "\tsyscw: %lld\n",
+			   r->mnt_time, ktime_get_seconds(),
+			   mnt_iostats_counter_read(r, MNTIOS_CHARS_RD),
+			   mnt_iostats_counter_read(r, MNTIOS_CHARS_WR),
+			   mnt_iostats_counter_read(r, MNTIOS_SYSCALLS_RD),
+			   mnt_iostats_counter_read(r, MNTIOS_SYSCALLS_WR));
 	}
 
 	seq_putc(m, '\n');
-- 
2.25.1



* Re: [RFC PATCH 0/3] Generic per-mount io stats
  2021-01-07 21:43 [RFC PATCH 0/3] Generic per-mount io stats Amir Goldstein
                   ` (2 preceding siblings ...)
  2021-01-07 21:44 ` [RFC PATCH 3/3] fs: report " Amir Goldstein
@ 2021-01-08 16:41 ` Amir Goldstein
  3 siblings, 0 replies; 5+ messages in thread
From: Amir Goldstein @ 2021-01-08 16:41 UTC
  To: Miklos Szeredi, Al Viro; +Cc: linux-fsdevel

On Thu, Jan 7, 2021 at 11:44 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> Miklos,
>
> I was trying to address the lack of iostat report for non-blockdev
> filesystems such as overlayfs and fuse.
>
> NFS has already addressed this with its own custom stats collection,
> which is displayed in /proc/<pid>/mountstats.
>
> When looking at the options, I found that a generic solution is quite
> simple and could serve all filesystems that opt-in to use it.
>
> This short patch set results in the following mountstats example report:
>
> device overlay mounted on /mnt with fstype overlay
>         times: 125 153
>         rchar: 12
>         wchar: 0
>         syscr: 2
>         syscw: 0
>
> The choice to collect and report io stats by mount and not by sb is
> quite arbitrary; it was quite easy to implement and is a natural fit
> for the existing mountstats proc file.
>
> I used the arbitrary flag FS_USERNS_MOUNT as an example of a way for a
> filesystem to opt in to mount io stats, but it could be either an FS_,
> SB_ or MNT_ flag.  I do not anticipate a shortage of opinions on this
> matter.
>
> As for performance, the io accounting hooks are the existing hooks for
> task io accounting.  Mount io stats add a dereference of mnt_pcp for
> the filesystems that opt in and one per-cpu var update.  The dereference
> of mnt_sb->s_type->fs_flags is temporary as we will probably want to
> use an MNT_ flag, whether kernel internal or user controlled.
>
> What does everyone think about this?
>
> Al,
>
> did I break any subtle rules of the vfs?
>

That is, besides dereferencing a NULL file pointer (f.file) when getting
EBADF in readv/writev/preadv/pwritev...
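
One possible way to avoid that, sketched here in userspace C and not taken
from the posted patches, is to let the lookup helper tolerate a NULL file so
that the accounting wrappers can be called unconditionally.  The sk_* names
are hypothetical stand-ins for the kernel structures and helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sk_mount { long long syscr; bool has_stats; };
struct sk_file  { struct sk_mount *mnt; };

/* Return the mount to charge, or NULL if there is no file or no opt-in. */
static struct sk_mount *sk_file_mnt_has_stats(const struct sk_file *file)
{
	if (!file)	/* e.g. the fd lookup failed with EBADF */
		return NULL;
	return file->mnt->has_stats ? file->mnt : NULL;
}

static void sk_file_inc_syscr(const struct sk_file *file, long long *task_syscr)
{
	struct sk_mount *m = sk_file_mnt_has_stats(file);

	if (m)
		m->syscr++;	/* per-mount accounting, only when opted in */
	(*task_syscr)++;	/* task accounting happens regardless */
}
```

The alternative would be to call the wrappers only inside the branch where
the file is known to be valid, at the cost of changing when the per-task
counters are updated.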

Thanks,
Amir.
