linux-mm.kvack.org archive mirror
* [PATCH RESEND 0/3] shmem: Allow userspace monitoring of tmpfs for lack of space.
@ 2022-04-04 13:41 Gabriel Krisman Bertazi
  2022-04-04 13:41 ` [PATCH RESEND 1/3] shmem: Keep track of out-of-memory and out-of-space errors Gabriel Krisman Bertazi
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Gabriel Krisman Bertazi @ 2022-04-04 13:41 UTC (permalink / raw)
  To: Hugh Dickins, Andrew Morton, Amir Goldstein
  Cc: Gabriel Krisman Bertazi, kernel, Khazhismel Kumykov, Linux MM,
	linux-fsdevel

The only difference from v1 is addressing Amir's comment about
generating the directory in sysfs using the minor number.

* Original cover letter

When provisioning containerized applications, multiple very small tmpfs
instances are used, for which one cannot always predict the proper file
system size ahead of time.  We want to be able to reliably monitor filesystems
for ENOSPC errors, without depending on the application being executed
reporting the ENOSPC after a failure.  It is also not enough to watch
statfs since that information might be ephemeral (say the application
recovers by deleting data, the issue can get lost).  For this use case,
it is also interesting to differentiate IO errors caused by lack of
virtual memory from lack of FS space.

This series exposes two counters in sysfs that log the two conditions
that are interesting to observe for container provisioning.  They are
recorded per tmpfs superblock, and can be polled by a monitoring
application.
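
As an illustration of the polling described above, a monitoring
application might read and compare the counters roughly as follows.
This is a minimal sketch, not part of the series: the helper names
(parse_counter, read_counter, counter_changed) are hypothetical, and it
assumes the files appear under /sys/fs/tmpfs/<minor>/ and contain a
single decimal value followed by a newline.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the text of a counter file ("<decimal>\n"). Returns 0 on success. */
static int parse_counter(const char *text, unsigned long *out)
{
	char *end;
	unsigned long v;

	/* Reject anything that does not start with a digit (sign, space, junk). */
	if (text[0] < '0' || text[0] > '9')
		return -1;
	errno = 0;
	v = strtoul(text, &end, 10);
	if (errno || (*end != '\n' && *end != '\0'))
		return -1;
	*out = v;
	return 0;
}

/* Read one counter file; the path is supplied by the caller. */
static int read_counter(const char *path, unsigned long *out)
{
	char buf[32];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_counter(buf, out);
}

/*
 * Compare a counter against the value seen on the previous poll.
 * Returns 1 if it changed, 0 if unchanged, -1 on error.
 * *last is updated whenever the value changed.
 */
static int counter_changed(const char *path, unsigned long *last)
{
	unsigned long now;

	if (read_counter(path, &now))
		return -1;
	if (now == *last)
		return 0;
	*last = now;
	return 1;
}
```

Because the counters only ever increase while the filesystem is
mounted, a change between two polls means at least one failure happened
in between, even if statfs already looks healthy again.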

I proposed a more general approach [1] using fsnotify, but considering
the specificity of this use case, people seemed to agree that a simpler
solution in sysfs is more than enough.

[1] https://lore.kernel.org/linux-mm/20211116220742.584975-3-krisman@collabora.com/T/#mee338d25b0e1e07cbe0861f9a5ca8cc439b3edb8

To: Hugh Dickins <hughd@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Khazhismel Kumykov <khazhy@google.com>
Cc: Linux MM <linux-mm@kvack.org>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>

Gabriel Krisman Bertazi (3):
  shmem: Keep track of out-of-memory and out-of-space errors
  shmem: Introduce /sys/fs/tmpfs support
  shmem: Expose space and accounting error count

 Documentation/ABI/testing/sysfs-fs-tmpfs |  13 +++
 include/linux/shmem_fs.h                 |   7 ++
 mm/shmem.c                               | 102 ++++++++++++++++++++++-
 3 files changed, 120 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-fs-tmpfs

-- 
2.35.1




* [PATCH RESEND 1/3] shmem: Keep track of out-of-memory and out-of-space errors
  2022-04-04 13:41 [PATCH RESEND 0/3] shmem: Allow userspace monitoring of tmpfs for lack of space Gabriel Krisman Bertazi
@ 2022-04-04 13:41 ` Gabriel Krisman Bertazi
  2022-04-04 13:41 ` [PATCH RESEND 2/3] shmem: Introduce /sys/fs/tmpfs support Gabriel Krisman Bertazi
  2022-04-04 13:41 ` [PATCH RESEND 3/3] shmem: Expose space and accounting error count Gabriel Krisman Bertazi
  2 siblings, 0 replies; 6+ messages in thread
From: Gabriel Krisman Bertazi @ 2022-04-04 13:41 UTC (permalink / raw)
  To: Hugh Dickins, Andrew Morton, Amir Goldstein
  Cc: Gabriel Krisman Bertazi, kernel, Khazhismel Kumykov, Linux MM,
	linux-fsdevel

Keep a per-sb counter of failed shmem allocations for ENOMEM/ENOSPC to
be reported on sysfs.  The sysfs support is added separately in a later
patch.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
---
 include/linux/shmem_fs.h | 3 +++
 mm/shmem.c               | 5 ++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index e65b80ed09e7..1a7cd9ea9107 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -44,6 +44,9 @@ struct shmem_sb_info {
 	spinlock_t shrinklist_lock;   /* Protects shrinklist */
 	struct list_head shrinklist;  /* List of shinkable inodes */
 	unsigned long shrinklist_len; /* Length of shrinklist */
+
+	unsigned long acct_errors;
+	unsigned long space_errors;
 };
 
 static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
diff --git a/mm/shmem.c b/mm/shmem.c
index a09b29ec2b45..c350fa0a0fff 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -212,8 +212,10 @@ static inline bool shmem_inode_acct_block(struct inode *inode, long pages)
 	struct shmem_inode_info *info = SHMEM_I(inode);
 	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
 
-	if (shmem_acct_block(info->flags, pages))
+	if (shmem_acct_block(info->flags, pages)) {
+		sbinfo->acct_errors += 1;
 		return false;
+	}
 
 	if (sbinfo->max_blocks) {
 		if (percpu_counter_compare(&sbinfo->used_blocks,
@@ -225,6 +227,7 @@ static inline bool shmem_inode_acct_block(struct inode *inode, long pages)
 	return true;
 
 unacct:
+	sbinfo->space_errors += 1;
 	shmem_unacct_blocks(info->flags, pages);
 	return false;
 }
-- 
2.35.1




* [PATCH RESEND 2/3] shmem: Introduce /sys/fs/tmpfs support
  2022-04-04 13:41 [PATCH RESEND 0/3] shmem: Allow userspace monitoring of tmpfs for lack of space Gabriel Krisman Bertazi
  2022-04-04 13:41 ` [PATCH RESEND 1/3] shmem: Keep track of out-of-memory and out-of-space errors Gabriel Krisman Bertazi
@ 2022-04-04 13:41 ` Gabriel Krisman Bertazi
  2022-04-04 14:02   ` Al Viro
  2022-04-04 13:41 ` [PATCH RESEND 3/3] shmem: Expose space and accounting error count Gabriel Krisman Bertazi
  2 siblings, 1 reply; 6+ messages in thread
From: Gabriel Krisman Bertazi @ 2022-04-04 13:41 UTC (permalink / raw)
  To: Hugh Dickins, Andrew Morton, Amir Goldstein
  Cc: Gabriel Krisman Bertazi, kernel, Khazhismel Kumykov, Linux MM,
	linux-fsdevel

In order to expose tmpfs statistics on sysfs, add the boilerplate code
to create the /sys/fs/tmpfs structure.  As suggested on a previous
review, this uses the minor as the volume directory in /sys/fs/.

This takes care of not exposing SB_NOUSER mounts.  I don't think we have
a usecase for showing them and, since they don't appear elsewhere, they
might be confusing to users.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>

---
Changes since v1:
  - Use minor instead of fsid for directory in sysfs. (Amir)
---
 include/linux/shmem_fs.h |  4 +++
 mm/shmem.c               | 72 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 1a7cd9ea9107..c27ecf0e1b3b 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -47,6 +47,10 @@ struct shmem_sb_info {
 
 	unsigned long acct_errors;
 	unsigned long space_errors;
+
+	/* sysfs */
+	struct kobject s_kobj;		/* /sys/fs/tmpfs/<minor> */
+	struct completion s_kobj_unregister;
 };
 
 static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
diff --git a/mm/shmem.c b/mm/shmem.c
index c350fa0a0fff..665d417ba8a8 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -250,6 +250,7 @@ static const struct inode_operations shmem_dir_inode_operations;
 static const struct inode_operations shmem_special_inode_operations;
 static const struct vm_operations_struct shmem_vm_ops;
 static struct file_system_type shmem_fs_type;
+static struct kobject *shmem_root;
 
 bool vma_is_shmem(struct vm_area_struct *vma)
 {
@@ -3584,6 +3585,56 @@ static int shmem_show_options(struct seq_file *seq, struct dentry *root)
 
 #endif /* CONFIG_TMPFS */
 
+#if defined(CONFIG_TMPFS) && defined(CONFIG_SYSFS)
+#define TMPFS_SB_ATTR_RO(name)	\
+	static struct kobj_attribute tmpfs_sb_attr_##name = __ATTR_RO(name)
+
+static struct attribute *tmpfs_attrs[] = {
+	NULL
+};
+ATTRIBUTE_GROUPS(tmpfs);
+
+static void tmpfs_sb_release(struct kobject *kobj)
+{
+	struct shmem_sb_info *sbinfo =
+		container_of(kobj, struct shmem_sb_info, s_kobj);
+
+	complete(&sbinfo->s_kobj_unregister);
+}
+
+static struct kobj_type tmpfs_sb_ktype = {
+	.default_groups = tmpfs_groups,
+	.sysfs_ops	= &kobj_sysfs_ops,
+	.release	= tmpfs_sb_release,
+};
+
+static void shmem_unregister_sysfs(struct super_block *sb)
+{
+	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
+
+	kobject_del(&sbinfo->s_kobj);
+	kobject_put(&sbinfo->s_kobj);
+	wait_for_completion(&sbinfo->s_kobj_unregister);
+}
+
+static int shmem_register_sysfs(struct super_block *sb)
+{
+	int err;
+	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
+
+	init_completion(&sbinfo->s_kobj_unregister);
+	err = kobject_init_and_add(&sbinfo->s_kobj, &tmpfs_sb_ktype,
+				   shmem_root, "%d", MINOR(sb->s_dev));
+	if (err) {
+		kobject_put(&sbinfo->s_kobj);
+		wait_for_completion(&sbinfo->s_kobj_unregister);
+		return err;
+	}
+
+	return 0;
+}
+#endif /* CONFIG_TMPFS && CONFIG_SYSFS */
+
 static void shmem_put_super(struct super_block *sb)
 {
 	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
@@ -3591,6 +3642,12 @@ static void shmem_put_super(struct super_block *sb)
 	free_percpu(sbinfo->ino_batch);
 	percpu_counter_destroy(&sbinfo->used_blocks);
 	mpol_put(sbinfo->mpol);
+
+#if IS_ENABLED(CONFIG_TMPFS) && IS_ENABLED(CONFIG_SYSFS)
+	if (!(sb->s_flags & SB_NOUSER))
+		shmem_unregister_sysfs(sb);
+#endif
+
 	kfree(sbinfo);
 	sb->s_fs_info = NULL;
 }
@@ -3673,6 +3730,13 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc)
 	sb->s_root = d_make_root(inode);
 	if (!sb->s_root)
 		goto failed;
+
+#if IS_ENABLED(CONFIG_TMPFS) && IS_ENABLED(CONFIG_SYSFS)
+	if (!(sb->s_flags & SB_NOUSER))
+		if (shmem_register_sysfs(sb))
+			goto failed;
+#endif
+
 	return 0;
 
 failed:
@@ -3889,11 +3953,15 @@ int __init shmem_init(void)
 		goto out2;
 	}
 
+	shmem_root = kobject_create_and_add("tmpfs", fs_kobj);
+	if (!shmem_root)
+		goto out1;
+
 	shm_mnt = kern_mount(&shmem_fs_type);
 	if (IS_ERR(shm_mnt)) {
 		error = PTR_ERR(shm_mnt);
 		pr_err("Could not kern_mount tmpfs\n");
-		goto out1;
+		goto put_kobj;
 	}
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -3904,6 +3972,8 @@ int __init shmem_init(void)
 #endif
 	return 0;
 
+put_kobj:
+	kobject_put(shmem_root);
 out1:
 	unregister_filesystem(&shmem_fs_type);
 out2:
-- 
2.35.1
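
Since the directory is named after MINOR(sb->s_dev), userspace can
locate it from a mount point by taking the minor number of st_dev.
The following is a minimal sketch under that assumption; the helper
name tmpfs_sysfs_dir is hypothetical, and it does not verify that the
path actually lives on tmpfs.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Build the sysfs directory name for the filesystem holding 'mountpoint'.
 * Returns 0 on success, -1 if stat() fails or the buffer is too small.
 * Assumption: the caller already knows 'mountpoint' is a tmpfs mount.
 */
static int tmpfs_sysfs_dir(const char *mountpoint, char *buf, size_t len)
{
	struct stat st;
	int n;

	if (stat(mountpoint, &st) != 0)
		return -1;

	/* minor() decodes the same minor number the kernel side formats. */
	n = snprintf(buf, len, "/sys/fs/tmpfs/%u", minor(st.st_dev));
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

A stricter version could first call statfs() and compare f_type with
TMPFS_MAGIC before building the path.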




* [PATCH RESEND 3/3] shmem: Expose space and accounting error count
  2022-04-04 13:41 [PATCH RESEND 0/3] shmem: Allow userspace monitoring of tmpfs for lack of space Gabriel Krisman Bertazi
  2022-04-04 13:41 ` [PATCH RESEND 1/3] shmem: Keep track of out-of-memory and out-of-space errors Gabriel Krisman Bertazi
  2022-04-04 13:41 ` [PATCH RESEND 2/3] shmem: Introduce /sys/fs/tmpfs support Gabriel Krisman Bertazi
@ 2022-04-04 13:41 ` Gabriel Krisman Bertazi
  2 siblings, 0 replies; 6+ messages in thread
From: Gabriel Krisman Bertazi @ 2022-04-04 13:41 UTC (permalink / raw)
  To: Hugh Dickins, Andrew Morton, Amir Goldstein
  Cc: Gabriel Krisman Bertazi, kernel, Khazhismel Kumykov, Linux MM,
	linux-fsdevel

Exposing these shmem counters through sysfs is particularly useful for
container provisioning, to allow administrators to differentiate between
insufficiently provisioned fs size vs. running out of memory.

Suggested-by: Amir Goldstein <amir73il@gmail.com>
Suggested-by: Khazhy Kumykov <khazhy@google.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
---
 Documentation/ABI/testing/sysfs-fs-tmpfs | 13 ++++++++++++
 mm/shmem.c                               | 25 ++++++++++++++++++++++++
 2 files changed, 38 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-fs-tmpfs

diff --git a/Documentation/ABI/testing/sysfs-fs-tmpfs b/Documentation/ABI/testing/sysfs-fs-tmpfs
new file mode 100644
index 000000000000..d32b90949710
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-fs-tmpfs
@@ -0,0 +1,13 @@
+What:		/sys/fs/tmpfs/<disk>/acct_errors
+Date:		March 2022
+Contact:	"Gabriel Krisman Bertazi" <krisman@collabora.com>
+Description:
+		Track the number of IO errors caused by lack of memory to
+		perform the allocation of a tmpfs block.
+
+What:		/sys/fs/tmpfs/<disk>/space_errors
+Date:		March 2022
+Contact:	"Gabriel Krisman Bertazi" <krisman@collabora.com>
+Description:
+		Track the number of IO errors caused by lack of space
+		in the filesystem to perform the allocation of a tmpfs block.
diff --git a/mm/shmem.c b/mm/shmem.c
index 665d417ba8a8..50d22449d99e 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -214,6 +214,7 @@ static inline bool shmem_inode_acct_block(struct inode *inode, long pages)
 
 	if (shmem_acct_block(info->flags, pages)) {
 		sbinfo->acct_errors += 1;
+		sysfs_notify(&sbinfo->s_kobj, NULL, "acct_errors");
 		return false;
 	}
 
@@ -228,6 +229,7 @@ static inline bool shmem_inode_acct_block(struct inode *inode, long pages)
 
 unacct:
 	sbinfo->space_errors += 1;
+	sysfs_notify(&sbinfo->s_kobj, NULL, "space_errors");
 	shmem_unacct_blocks(info->flags, pages);
 	return false;
 }
@@ -3586,10 +3588,33 @@ static int shmem_show_options(struct seq_file *seq, struct dentry *root)
 #endif /* CONFIG_TMPFS */
 
 #if defined(CONFIG_TMPFS) && defined(CONFIG_SYSFS)
+static ssize_t acct_errors_show(struct kobject *kobj,
+				struct kobj_attribute *attr, char *page)
+{
+	struct shmem_sb_info *sbinfo =
+		container_of(kobj, struct shmem_sb_info, s_kobj);
+
+	return sysfs_emit(page, "%lu\n", sbinfo->acct_errors);
+}
+
+static ssize_t space_errors_show(struct kobject *kobj,
+				struct kobj_attribute *attr, char *page)
+{
+	struct shmem_sb_info *sbinfo =
+		container_of(kobj, struct shmem_sb_info, s_kobj);
+
+	return sysfs_emit(page, "%lu\n", sbinfo->space_errors);
+}
+
 #define TMPFS_SB_ATTR_RO(name)	\
 	static struct kobj_attribute tmpfs_sb_attr_##name = __ATTR_RO(name)
 
+TMPFS_SB_ATTR_RO(acct_errors);
+TMPFS_SB_ATTR_RO(space_errors);
+
 static struct attribute *tmpfs_attrs[] = {
+	&tmpfs_sb_attr_acct_errors.attr,
+	&tmpfs_sb_attr_space_errors.attr,
 	NULL
 };
 ATTRIBUTE_GROUPS(tmpfs);
-- 
2.35.1




* Re: [PATCH RESEND 2/3] shmem: Introduce /sys/fs/tmpfs support
  2022-04-04 13:41 ` [PATCH RESEND 2/3] shmem: Introduce /sys/fs/tmpfs support Gabriel Krisman Bertazi
@ 2022-04-04 14:02   ` Al Viro
  2022-04-04 19:02     ` Gabriel Krisman Bertazi
  0 siblings, 1 reply; 6+ messages in thread
From: Al Viro @ 2022-04-04 14:02 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi
  Cc: Hugh Dickins, Andrew Morton, Amir Goldstein, kernel,
	Khazhismel Kumykov, Linux MM, linux-fsdevel

On Mon, Apr 04, 2022 at 09:41:36AM -0400, Gabriel Krisman Bertazi wrote:
> In order to expose tmpfs statistics on sysfs, add the boilerplate code
> to create the /sys/fs/tmpfs structure.  As suggested on a previous
> review, this uses the minor as the volume directory in /sys/fs/.
> 
> This takes care of not exposing SB_NOUSER mounts.  I don't think we have
> a usecase for showing them and, since they don't appear elsewhere, they
> might be confusing to users.
> 
> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>

> +static void shmem_unregister_sysfs(struct super_block *sb)
> +{
> +	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
> +
> +	kobject_del(&sbinfo->s_kobj);
> +	kobject_put(&sbinfo->s_kobj);
> +	wait_for_completion(&sbinfo->s_kobj_unregister);
> +}

If you embed kobject into something, you basically commit to
having the lifetime rules maintained by that kobject...
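
One way to read this rule, in line with the kernel's kobject
documentation, is that the structure embedding the kobject should be
freed from the release callback once the last reference is dropped,
rather than by its owner at a time of its own choosing.  The following
is a single-threaded userspace model of that idea only; struct ref,
struct sb_info_model and all function names are hypothetical stand-ins,
not kernel APIs.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a kobject: a refcount plus a release callback. */
struct ref {
	int count;
	void (*release)(struct ref *r);
};

/* Toy stand-in for the structure that embeds the refcounted object. */
struct sb_info_model {
	unsigned long acct_errors;
	struct ref ref;
};

static int released; /* counts release calls, for demonstration only */

/* The container is freed here, and only here. */
static void sb_info_release(struct ref *r)
{
	struct sb_info_model *sbi = (struct sb_info_model *)
		((char *)r - offsetof(struct sb_info_model, ref));

	free(sbi);
	released++;
}

static void ref_get(struct ref *r)
{
	r->count++;
}

static void ref_put(struct ref *r)
{
	/* Dropping the last reference triggers the release callback. */
	if (--r->count == 0)
		r->release(r);
}

static struct sb_info_model *sb_info_alloc(void)
{
	struct sb_info_model *sbi = calloc(1, sizeof(*sbi));

	if (!sbi)
		return NULL;
	sbi->ref.count = 1; /* the creator holds the first reference */
	sbi->ref.release = sb_info_release;
	return sbi;
}
```

In this model the owner never calls free() directly; whoever drops the
final reference ends up freeing the container.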



* Re: [PATCH RESEND 2/3] shmem: Introduce /sys/fs/tmpfs support
  2022-04-04 14:02   ` Al Viro
@ 2022-04-04 19:02     ` Gabriel Krisman Bertazi
  0 siblings, 0 replies; 6+ messages in thread
From: Gabriel Krisman Bertazi @ 2022-04-04 19:02 UTC (permalink / raw)
  To: Al Viro
  Cc: Hugh Dickins, Andrew Morton, Amir Goldstein, kernel,
	Khazhismel Kumykov, Linux MM, linux-fsdevel

Al Viro <viro@zeniv.linux.org.uk> writes:

> On Mon, Apr 04, 2022 at 09:41:36AM -0400, Gabriel Krisman Bertazi wrote:
>> In order to expose tmpfs statistics on sysfs, add the boilerplate code
>> to create the /sys/fs/tmpfs structure.  As suggested on a previous
>> review, this uses the minor as the volume directory in /sys/fs/.
>> 
>> This takes care of not exposing SB_NOUSER mounts.  I don't think we have
>> a usecase for showing them and, since they don't appear elsewhere, they
>> might be confusing to users.
>> 
>> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
>
>> +static void shmem_unregister_sysfs(struct super_block *sb)
>> +{
>> +	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
>> +
>> +	kobject_del(&sbinfo->s_kobj);
>> +	kobject_put(&sbinfo->s_kobj);
>> +	wait_for_completion(&sbinfo->s_kobj_unregister);
>> +}
>
> If you embed kobject into something, you basically commit to
> having the lifetime rules maintained by that kobject...

Hi Viro,

The way I'm doing it seems to be a pattern used by at least Ext4, f2fs
and Btrfs. Is there a problem with embedding it in the superblock,
holding a reference and then waiting for completion when umounting the
fs?

-- 
Gabriel Krisman Bertazi


