linux-fsdevel.vger.kernel.org archive mirror
* [PATCH RESEND V4 0/6] shmem: Add user and group quota support for tmpfs
@ 2023-07-13 13:48 cem
  2023-07-13 13:48 ` [PATCH 1/6] shmem: make shmem_inode_acct_block() return error cem
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: cem @ 2023-07-13 13:48 UTC
  To: linux-fsdevel; +Cc: jack, akpm, viro, linux-mm, djwong, hughd, brauner, mcgrof

From: Carlos Maiolino <cem@kernel.org>

Hello folks.

This is a resend of the quota support series for tmpfs, rebased on today's
tip of Linus's tree. The patches conflicted with Luis Chamberlain's series
adding the 'noswap' mount option to tmpfs. There are no code changes since the
previous version, other than moving the implementation of the quota options
after 'noswap'.

Honza, given that the conflicts were basically contextual, I thought it
was OK to keep your RwB on the patches. Could you please confirm? My
apologies if I should have removed the RwB tags.

As before, details are within each patch.

The original cover-letter follows...

People have asked for quota support in tmpfs many times in the past,
mostly to keep one malicious or misbehaving user or program from consuming
all of the system memory. This has been partially solved with the size
mount option, but some problems remain.

One problem is that /dev/shm is still generally unprotected by this;
another is the administrative overhead of managing multiple tmpfs
mounts and the lack of more fine-grained control.

Quota support can solve all of these problems in a fairly standard way that
people already know from regular file systems. It gives us more
fine-grained control over how much memory users and groups can consume.
It can also limit the number of inodes and, with the special quota
mount options introduced by a later patch in this series, set global limits,
allowing us to replace the size mount option with quota entirely.

Currently the standard userspace quota tools (quota, xfs_quota) only use
the quotactl() system call, which expects a block device. I patched quota [1]
and xfs_quota [2] to use quotactl_fd() when the tools are run on a
mount point directory, so that they work nicely with tmpfs.
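
For readers unfamiliar with quotactl_fd(), here is a minimal userspace sketch
of how a tool could query a user quota through a directory file descriptor
instead of a block device path. It is not taken from the patched tools; the
helper name get_user_quota() is hypothetical, and the fallback syscall number
443 is an assumption that holds on most architectures with Linux 5.14 or later.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/quota.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Older libc headers may lack the number; 443 is used on most architectures. */
#ifndef SYS_quotactl_fd
#define SYS_quotactl_fd 443
#endif

/*
 * Hypothetical helper: fetch the user quota record for `uid` on the
 * filesystem that contains the directory `path`.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int get_user_quota(const char *path, uid_t uid, struct dqblk *out)
{
	int fd = open(path, O_RDONLY | O_DIRECTORY);
	long rc;
	int saved;

	if (fd < 0)
		return -1;

	/* quotactl_fd(fd, cmd, id, addr): no block device path is needed. */
	rc = syscall(SYS_quotactl_fd, (long)fd,
		     (long)QCMD(Q_GETQUOTA, USRQUOTA), (long)uid, out);

	saved = errno;	/* keep the syscall's errno across close() */
	close(fd);
	errno = saved;

	return rc < 0 ? -1 : 0;
}
```

A caller would pass the tmpfs mount point, for example
get_user_quota("/dev/shm", getuid(), &dq), and should expect a failure on
kernels or mounts without quota support.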

The implementation was tested with a patched version of xfstests [3].

[1] https://github.com/lczerner/quota/tree/quotactl_fd_support
[2] https://github.com/lczerner/xfsprogs/tree/quotactl_fd_support
[3] https://github.com/lczerner/xfstests/tree/tmpfs_quota_support


Jan Kara (1):
  quota: Check presence of quota operation structures instead of
    ->quota_read and ->quota_write callbacks

Lukas Czerner (5):
  shmem: make shmem_inode_acct_block() return error
  shmem: make shmem_get_inode() return ERR_PTR instead of NULL
  shmem: prepare shmem quota infrastructure
  shmem: quota support
  Add default quota limit mount options

 Documentation/filesystems/tmpfs.rst |  31 ++
 fs/Kconfig                          |  12 +
 fs/quota/dquot.c                    |   2 +-
 include/linux/shmem_fs.h            |  28 ++
 include/uapi/linux/quota.h          |   1 +
 mm/Makefile                         |   2 +-
 mm/shmem.c                          | 465 +++++++++++++++++++++-------
 mm/shmem_quota.c                    | 350 +++++++++++++++++++++
 8 files changed, 783 insertions(+), 108 deletions(-)
 create mode 100644 mm/shmem_quota.c

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
-- 
2.30.2


* [PATCH 1/6] shmem: make shmem_inode_acct_block() return error
  2023-07-13 13:48 [PATCH RESEND V4 0/6] shmem: Add user and group quota support for tmpfs cem
@ 2023-07-13 13:48 ` cem
  2023-07-13 13:48 ` [PATCH 2/6] shmem: make shmem_get_inode() return ERR_PTR instead of NULL cem
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: cem @ 2023-07-13 13:48 UTC
  To: linux-fsdevel; +Cc: jack, akpm, viro, linux-mm, djwong, hughd, brauner, mcgrof

From: Lukas Czerner <lczerner@redhat.com>

Make shmem_inode_acct_block() return a proper error code instead of a bool.
This will be useful later when we introduce quota support.

There should be no functional change.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 mm/shmem.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 2f2e0e618072..51d17655a6e1 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -199,13 +199,14 @@ static inline void shmem_unacct_blocks(unsigned long flags, long pages)
 		vm_unacct_memory(pages * VM_ACCT(PAGE_SIZE));
 }
 
-static inline bool shmem_inode_acct_block(struct inode *inode, long pages)
+static inline int shmem_inode_acct_block(struct inode *inode, long pages)
 {
 	struct shmem_inode_info *info = SHMEM_I(inode);
 	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
+	int err = -ENOSPC;
 
 	if (shmem_acct_block(info->flags, pages))
-		return false;
+		return err;
 
 	if (sbinfo->max_blocks) {
 		if (percpu_counter_compare(&sbinfo->used_blocks,
@@ -214,11 +215,11 @@ static inline bool shmem_inode_acct_block(struct inode *inode, long pages)
 		percpu_counter_add(&sbinfo->used_blocks, pages);
 	}
 
-	return true;
+	return 0;
 
 unacct:
 	shmem_unacct_blocks(info->flags, pages);
-	return false;
+	return err;
 }
 
 static inline void shmem_inode_unacct_blocks(struct inode *inode, long pages)
@@ -370,7 +371,7 @@ bool shmem_charge(struct inode *inode, long pages)
 	struct shmem_inode_info *info = SHMEM_I(inode);
 	unsigned long flags;
 
-	if (!shmem_inode_acct_block(inode, pages))
+	if (shmem_inode_acct_block(inode, pages))
 		return false;
 
 	/* nrpages adjustment first, then shmem_recalc_inode() when balanced */
@@ -1588,13 +1589,14 @@ static struct folio *shmem_alloc_and_acct_folio(gfp_t gfp, struct inode *inode,
 	struct shmem_inode_info *info = SHMEM_I(inode);
 	struct folio *folio;
 	int nr;
-	int err = -ENOSPC;
+	int err;
 
 	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
 		huge = false;
 	nr = huge ? HPAGE_PMD_NR : 1;
 
-	if (!shmem_inode_acct_block(inode, nr))
+	err = shmem_inode_acct_block(inode, nr);
+	if (err)
 		goto failed;
 
 	if (huge)
@@ -2445,7 +2447,7 @@ int shmem_mfill_atomic_pte(pmd_t *dst_pmd,
 	int ret;
 	pgoff_t max_off;
 
-	if (!shmem_inode_acct_block(inode, 1)) {
+	if (shmem_inode_acct_block(inode, 1)) {
 		/*
 		 * We may have got a page, returned -ENOENT triggering a retry,
 		 * and now we find ourselves with -ENOMEM. Release the page, to
-- 
2.39.2
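
The convention change in the patch above can be illustrated with a small
userspace model. This is only a sketch, not kernel code: struct sb_model,
model_acct_block() and model_charge() are hypothetical stand-ins, and a plain
counter replaces the percpu counter used in mm/shmem.c.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the superblock block accounting state. */
struct sb_model {
	long max_blocks;	/* 0 means "no limit" */
	long used_blocks;
};

/*
 * Returns 0 on success or a negative errno (-ENOSPC here), mirroring the
 * patched shmem_inode_acct_block() instead of the old true/false result.
 */
static int model_acct_block(struct sb_model *sb, long pages)
{
	if (sb->max_blocks && sb->used_blocks > sb->max_blocks - pages)
		return -ENOSPC;

	sb->used_blocks += pages;
	return 0;
}

/*
 * A bool-returning caller in the style of shmem_charge(): after the patch,
 * a nonzero return from the accounting helper means failure.
 */
static bool model_charge(struct sb_model *sb, long pages)
{
	if (model_acct_block(sb, pages))
		return false;
	return true;
}
```

The point of the int return is that a caller can propagate whatever error
the helper produced, rather than hard-coding -ENOSPC at every call site.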



* [PATCH 2/6] shmem: make shmem_get_inode() return ERR_PTR instead of NULL
  2023-07-13 13:48 [PATCH RESEND V4 0/6] shmem: Add user and group quota support for tmpfs cem
  2023-07-13 13:48 ` [PATCH 1/6] shmem: make shmem_inode_acct_block() return error cem
@ 2023-07-13 13:48 ` cem
  2023-07-13 13:48 ` [PATCH 3/6] quota: Check presence of quota operation structures instead of ->quota_read and ->quota_write callbacks cem
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: cem @ 2023-07-13 13:48 UTC
  To: linux-fsdevel; +Cc: jack, akpm, viro, linux-mm, djwong, hughd, brauner, mcgrof

From: Carlos Maiolino <cem@kernel.org>

Make shmem_get_inode() return ERR_PTR instead of NULL on error. This will be
useful later when we introduce quota support.

There should be no functional change.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>

This patch conflicted with: 2c6efe9cf2d7
---
 mm/shmem.c | 211 ++++++++++++++++++++++++++++++-----------------------
 1 file changed, 119 insertions(+), 92 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 51d17655a6e1..2a7b8060b6f4 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2365,67 +2365,74 @@ static struct inode *shmem_get_inode(struct mnt_idmap *idmap, struct super_block
 	struct shmem_inode_info *info;
 	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
 	ino_t ino;
+	int err;
+
+	err = shmem_reserve_inode(sb, &ino);
+	if (err)
+		return ERR_PTR(err);
 
-	if (shmem_reserve_inode(sb, &ino))
-		return NULL;
 
 	inode = new_inode(sb);
-	if (inode) {
-		inode->i_ino = ino;
-		inode_init_owner(idmap, inode, dir, mode);
-		inode->i_blocks = 0;
-		inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
-		inode->i_generation = get_random_u32();
-		info = SHMEM_I(inode);
-		memset(info, 0, (char *)inode - (char *)info);
-		spin_lock_init(&info->lock);
-		atomic_set(&info->stop_eviction, 0);
-		info->seals = F_SEAL_SEAL;
-		info->flags = flags & VM_NORESERVE;
-		info->i_crtime = inode->i_mtime;
-		info->fsflags = (dir == NULL) ? 0 :
-			SHMEM_I(dir)->fsflags & SHMEM_FL_INHERITED;
-		if (info->fsflags)
-			shmem_set_inode_flags(inode, info->fsflags);
-		INIT_LIST_HEAD(&info->shrinklist);
-		INIT_LIST_HEAD(&info->swaplist);
-		if (sbinfo->noswap)
-			mapping_set_unevictable(inode->i_mapping);
-		simple_xattrs_init(&info->xattrs);
-		cache_no_acl(inode);
-		mapping_set_large_folios(inode->i_mapping);
-
-		switch (mode & S_IFMT) {
-		default:
-			inode->i_op = &shmem_special_inode_operations;
-			init_special_inode(inode, mode, dev);
-			break;
-		case S_IFREG:
-			inode->i_mapping->a_ops = &shmem_aops;
-			inode->i_op = &shmem_inode_operations;
-			inode->i_fop = &shmem_file_operations;
-			mpol_shared_policy_init(&info->policy,
-						 shmem_get_sbmpol(sbinfo));
-			break;
-		case S_IFDIR:
-			inc_nlink(inode);
-			/* Some things misbehave if size == 0 on a directory */
-			inode->i_size = 2 * BOGO_DIRENT_SIZE;
-			inode->i_op = &shmem_dir_inode_operations;
-			inode->i_fop = &simple_dir_operations;
-			break;
-		case S_IFLNK:
-			/*
-			 * Must not load anything in the rbtree,
-			 * mpol_free_shared_policy will not be called.
-			 */
-			mpol_shared_policy_init(&info->policy, NULL);
-			break;
-		}
 
-		lockdep_annotate_inode_mutex_key(inode);
-	} else
+	if (!inode) {
 		shmem_free_inode(sb);
+		return ERR_PTR(-ENOSPC);
+	}
+
+	inode->i_ino = ino;
+	inode_init_owner(idmap, inode, dir, mode);
+	inode->i_blocks = 0;
+	inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
+	inode->i_generation = get_random_u32();
+	info = SHMEM_I(inode);
+	memset(info, 0, (char *)inode - (char *)info);
+	spin_lock_init(&info->lock);
+	atomic_set(&info->stop_eviction, 0);
+	info->seals = F_SEAL_SEAL;
+	info->flags = flags & VM_NORESERVE;
+	info->i_crtime = inode->i_mtime;
+	info->fsflags = (dir == NULL) ? 0 :
+		SHMEM_I(dir)->fsflags & SHMEM_FL_INHERITED;
+	if (info->fsflags)
+		shmem_set_inode_flags(inode, info->fsflags);
+	INIT_LIST_HEAD(&info->shrinklist);
+	INIT_LIST_HEAD(&info->swaplist);
+	INIT_LIST_HEAD(&info->swaplist);
+	if (sbinfo->noswap)
+		mapping_set_unevictable(inode->i_mapping);
+	simple_xattrs_init(&info->xattrs);
+	cache_no_acl(inode);
+	mapping_set_large_folios(inode->i_mapping);
+
+	switch (mode & S_IFMT) {
+	default:
+		inode->i_op = &shmem_special_inode_operations;
+		init_special_inode(inode, mode, dev);
+		break;
+	case S_IFREG:
+		inode->i_mapping->a_ops = &shmem_aops;
+		inode->i_op = &shmem_inode_operations;
+		inode->i_fop = &shmem_file_operations;
+		mpol_shared_policy_init(&info->policy,
+					 shmem_get_sbmpol(sbinfo));
+		break;
+	case S_IFDIR:
+		inc_nlink(inode);
+		/* Some things misbehave if size == 0 on a directory */
+		inode->i_size = 2 * BOGO_DIRENT_SIZE;
+		inode->i_op = &shmem_dir_inode_operations;
+		inode->i_fop = &simple_dir_operations;
+		break;
+	case S_IFLNK:
+		/*
+		 * Must not load anything in the rbtree,
+		 * mpol_free_shared_policy will not be called.
+		 */
+		mpol_shared_policy_init(&info->policy, NULL);
+		break;
+	}
+
+	lockdep_annotate_inode_mutex_key(inode);
 	return inode;
 }
 
@@ -3071,27 +3078,30 @@ shmem_mknod(struct mnt_idmap *idmap, struct inode *dir,
 	    struct dentry *dentry, umode_t mode, dev_t dev)
 {
 	struct inode *inode;
-	int error = -ENOSPC;
+	int error;
 
 	inode = shmem_get_inode(idmap, dir->i_sb, dir, mode, dev, VM_NORESERVE);
-	if (inode) {
-		error = simple_acl_create(dir, inode);
-		if (error)
-			goto out_iput;
-		error = security_inode_init_security(inode, dir,
-						     &dentry->d_name,
-						     shmem_initxattrs, NULL);
-		if (error && error != -EOPNOTSUPP)
-			goto out_iput;
 
-		error = 0;
-		dir->i_size += BOGO_DIRENT_SIZE;
-		dir->i_ctime = dir->i_mtime = current_time(dir);
-		inode_inc_iversion(dir);
-		d_instantiate(dentry, inode);
-		dget(dentry); /* Extra count - pin the dentry in core */
-	}
+	if (IS_ERR(inode))
+		return PTR_ERR(inode);
+
+	error = simple_acl_create(dir, inode);
+	if (error)
+		goto out_iput;
+	error = security_inode_init_security(inode, dir,
+					     &dentry->d_name,
+					     shmem_initxattrs, NULL);
+	if (error && error != -EOPNOTSUPP)
+		goto out_iput;
+
+	error = 0;
+	dir->i_size += BOGO_DIRENT_SIZE;
+	dir->i_ctime = dir->i_mtime = current_time(dir);
+	inode_inc_iversion(dir);
+	d_instantiate(dentry, inode);
+	dget(dentry); /* Extra count - pin the dentry in core */
 	return error;
+
 out_iput:
 	iput(inode);
 	return error;
@@ -3102,20 +3112,26 @@ shmem_tmpfile(struct mnt_idmap *idmap, struct inode *dir,
 	      struct file *file, umode_t mode)
 {
 	struct inode *inode;
-	int error = -ENOSPC;
+	int error;
 
 	inode = shmem_get_inode(idmap, dir->i_sb, dir, mode, 0, VM_NORESERVE);
-	if (inode) {
-		error = security_inode_init_security(inode, dir,
-						     NULL,
-						     shmem_initxattrs, NULL);
-		if (error && error != -EOPNOTSUPP)
-			goto out_iput;
-		error = simple_acl_create(dir, inode);
-		if (error)
-			goto out_iput;
-		d_tmpfile(file, inode);
+
+	if (IS_ERR(inode)) {
+		error = PTR_ERR(inode);
+		goto err_out;
 	}
+
+	error = security_inode_init_security(inode, dir,
+					     NULL,
+					     shmem_initxattrs, NULL);
+	if (error && error != -EOPNOTSUPP)
+		goto out_iput;
+	error = simple_acl_create(dir, inode);
+	if (error)
+		goto out_iput;
+	d_tmpfile(file, inode);
+
+err_out:
 	return finish_open_simple(file, error);
 out_iput:
 	iput(inode);
@@ -3290,8 +3306,9 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir,
 
 	inode = shmem_get_inode(idmap, dir->i_sb, dir, S_IFLNK | 0777, 0,
 				VM_NORESERVE);
-	if (!inode)
-		return -ENOSPC;
+
+	if (IS_ERR(inode))
+		return PTR_ERR(inode);
 
 	error = security_inode_init_security(inode, dir, &dentry->d_name,
 					     shmem_initxattrs, NULL);
@@ -3929,12 +3946,13 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc)
 	struct shmem_options *ctx = fc->fs_private;
 	struct inode *inode;
 	struct shmem_sb_info *sbinfo;
+	int error = -ENOMEM;
 
 	/* Round up to L1_CACHE_BYTES to resist false sharing */
 	sbinfo = kzalloc(max((int)sizeof(struct shmem_sb_info),
 				L1_CACHE_BYTES), GFP_KERNEL);
 	if (!sbinfo)
-		return -ENOMEM;
+		return error;
 
 	sb->s_fs_info = sbinfo;
 
@@ -3997,8 +4015,10 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc)
 
 	inode = shmem_get_inode(&nop_mnt_idmap, sb, NULL, S_IFDIR | sbinfo->mode, 0,
 				VM_NORESERVE);
-	if (!inode)
+	if (IS_ERR(inode)) {
+		error = PTR_ERR(inode);
 		goto failed;
+	}
 	inode->i_uid = sbinfo->uid;
 	inode->i_gid = sbinfo->gid;
 	sb->s_root = d_make_root(inode);
@@ -4008,7 +4028,7 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc)
 
 failed:
 	shmem_put_super(sb);
-	return -ENOMEM;
+	return error;
 }
 
 static int shmem_get_tree(struct fs_context *fc)
@@ -4377,10 +4397,16 @@ EXPORT_SYMBOL_GPL(shmem_truncate_range);
 #define shmem_vm_ops				generic_file_vm_ops
 #define shmem_anon_vm_ops			generic_file_vm_ops
 #define shmem_file_operations			ramfs_file_operations
-#define shmem_get_inode(idmap, sb, dir, mode, dev, flags) ramfs_get_inode(sb, dir, mode, dev)
 #define shmem_acct_size(flags, size)		0
 #define shmem_unacct_size(flags, size)		do {} while (0)
 
+static inline struct inode *shmem_get_inode(struct mnt_idmap *idmap, struct super_block *sb, struct inode *dir,
+					    umode_t mode, dev_t dev, unsigned long flags)
+{
+	struct inode *inode = ramfs_get_inode(sb, dir, mode, dev);
+	return inode ? inode : ERR_PTR(-ENOSPC);
+}
+
 #endif /* CONFIG_SHMEM */
 
 /* common code */
@@ -4405,9 +4431,10 @@ static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *name, l
 
 	inode = shmem_get_inode(&nop_mnt_idmap, mnt->mnt_sb, NULL,
 				S_IFREG | S_IRWXUGO, 0, flags);
-	if (unlikely(!inode)) {
+
+	if (IS_ERR(inode)) {
 		shmem_unacct_size(flags, size);
-		return ERR_PTR(-ENOSPC);
+		return ERR_CAST(inode);
 	}
 	inode->i_flags |= i_flags;
 	inode->i_size = size;
-- 
2.39.2
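
For readers less familiar with the ERR_PTR idiom used in the patch above,
here is a userspace sketch of how an error code travels inside a pointer. The
model_* names are hypothetical re-implementations for illustration, not the
kernel's definitions from include/linux/err.h, and the use of -EDQUOT in the
usage note is an assumed example of an error a quota-aware caller might see.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed to match the kernel's convention: the top 4095 addresses encode errors. */
#define MODEL_MAX_ERRNO 4095

static inline void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline int model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

static inline long model_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

struct model_inode {
	unsigned long ino;
};

/* Returns a valid inode or an encoded error, never NULL. */
static struct model_inode *model_get_inode(int reserve_err)
{
	struct model_inode *inode;

	if (reserve_err)
		return model_err_ptr(reserve_err);

	inode = calloc(1, sizeof(*inode));
	if (!inode)
		return model_err_ptr(-ENOSPC);
	return inode;
}

/* Caller in the style of the patched shmem_mknod(): propagate the real error. */
static int model_mknod(int reserve_err)
{
	struct model_inode *inode = model_get_inode(reserve_err);

	if (model_is_err(inode))
		return (int)model_ptr_err(inode);

	free(inode);
	return 0;
}
```

With a NULL return the caller could only guess at the cause; with an encoded
pointer, model_mknod(-EDQUOT) returns -EDQUOT unchanged.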



* [PATCH 3/6] quota: Check presence of quota operation structures instead of ->quota_read and ->quota_write callbacks
  2023-07-13 13:48 [PATCH RESEND V4 0/6] shmem: Add user and group quota support for tmpfs cem
  2023-07-13 13:48 ` [PATCH 1/6] shmem: make shmem_inode_acct_block() return error cem
  2023-07-13 13:48 ` [PATCH 2/6] shmem: make shmem_get_inode() return ERR_PTR instead of NULL cem
@ 2023-07-13 13:48 ` cem
  2023-07-13 13:48 ` [PATCH 4/6] shmem: prepare shmem quota infrastructure cem
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: cem @ 2023-07-13 13:48 UTC
  To: linux-fsdevel; +Cc: jack, akpm, viro, linux-mm, djwong, hughd, brauner, mcgrof

From: Jan Kara <jack@suse.cz>

Currently we check whether the superblock has ->quota_read and ->quota_write
operations to decide whether the filesystem supports quotas. However,
shmfs for example will not read or write dquots, so check whether the
quota operations are set in the superblock instead.

Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
---
 fs/quota/dquot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index e3e4f4047657..4d826c369da2 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -2367,7 +2367,7 @@ int dquot_load_quota_sb(struct super_block *sb, int type, int format_id,
 
 	if (!fmt)
 		return -ESRCH;
-	if (!sb->s_op->quota_write || !sb->s_op->quota_read ||
+	if (!sb->dq_op || !sb->s_qcop ||
 	    (type == PRJQUOTA && sb->dq_op->get_projid == NULL)) {
 		error = -EINVAL;
 		goto out_fmt;
-- 
2.39.2
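
The changed condition above can be read as the following userspace model. It
is a sketch only: the model_* structures are hypothetical, heavily reduced
stand-ins for struct super_block, struct dquot_operations and
struct quotactl_ops, kept just large enough to show the check.

```c
#include <errno.h>
#include <stddef.h>

/* Reduced stand-ins; the real structures carry many more callbacks. */
struct model_dq_ops {
	int (*get_projid)(void);
};

struct model_qc_ops {
	int unused;
};

struct model_sb {
	const struct model_dq_ops *dq_op;
	const struct model_qc_ops *s_qcop;
};

enum { MODEL_USRQUOTA, MODEL_GRPQUOTA, MODEL_PRJQUOTA };

/*
 * A filesystem counts as quota-capable when both operation tables are
 * present; read/write callbacks for an on-disk quota file are not required.
 * Project quota additionally needs ->get_projid.
 */
static int model_can_load_quota(const struct model_sb *sb, int type)
{
	if (!sb->dq_op || !sb->s_qcop ||
	    (type == MODEL_PRJQUOTA && sb->dq_op->get_projid == NULL))
		return -EINVAL;
	return 0;
}
```

This is what lets an in-memory filesystem pass the check without providing
quota file I/O callbacks.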



* [PATCH 4/6] shmem: prepare shmem quota infrastructure
  2023-07-13 13:48 [PATCH RESEND V4 0/6] shmem: Add user and group quota support for tmpfs cem
                   ` (2 preceding siblings ...)
  2023-07-13 13:48 ` [PATCH 3/6] quota: Check presence of quota operation structures instead of ->quota_read and ->quota_write callbacks cem
@ 2023-07-13 13:48 ` cem
  2023-07-13 13:48 ` [PATCH 5/6] shmem: quota support cem
  2023-07-13 13:48 ` [PATCH 6/6] Add default quota limit mount options cem
  5 siblings, 0 replies; 12+ messages in thread
From: cem @ 2023-07-13 13:48 UTC
  To: linux-fsdevel; +Cc: jack, akpm, viro, linux-mm, djwong, hughd, brauner, mcgrof

From: Carlos Maiolino <cem@kernel.org>

Add a new shmem quota format and its quota_format_ops, together with
dquot_operations.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/Kconfig                 |  12 ++
 include/linux/shmem_fs.h   |  12 ++
 include/uapi/linux/quota.h |   1 +
 mm/Makefile                |   2 +-
 mm/shmem_quota.c           | 318 +++++++++++++++++++++++++++++++++++++
 5 files changed, 344 insertions(+), 1 deletion(-)
 create mode 100644 mm/shmem_quota.c

diff --git a/fs/Kconfig b/fs/Kconfig
index 18d034ec7953..8218a71933f9 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -233,6 +233,18 @@ config TMPFS_INODE64
 
 	  If unsure, say N.
 
+config TMPFS_QUOTA
+	bool "Tmpfs quota support"
+	depends on TMPFS
+	select QUOTA
+	help
+	  Quota support allows to set per user and group limits for tmpfs
+	  usage.  Say Y to enable quota support. Once enabled you can control
+	  user and group quota enforcement with quota, usrquota and grpquota
+	  mount options.
+
+	  If unsure, say N.
+
 config ARCH_SUPPORTS_HUGETLBFS
 	def_bool n
 
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 9029abd29b1c..7abfaf70b58a 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -13,6 +13,10 @@
 
 /* inode in-kernel data */
 
+#ifdef CONFIG_TMPFS_QUOTA
+#define SHMEM_MAXQUOTAS 2
+#endif
+
 struct shmem_inode_info {
 	spinlock_t		lock;
 	unsigned int		seals;		/* shmem seals */
@@ -172,4 +176,12 @@ extern int shmem_mfill_atomic_pte(pmd_t *dst_pmd,
 #endif /* CONFIG_SHMEM */
 #endif /* CONFIG_USERFAULTFD */
 
+/*
+ * Used space is stored as unsigned 64-bit value in bytes but
+ * quota core supports only signed 64-bit values so use that
+ * as a limit
+ */
+#define SHMEM_QUOTA_MAX_SPC_LIMIT 0x7fffffffffffffffLL /* 2^63-1 */
+#define SHMEM_QUOTA_MAX_INO_LIMIT 0x7fffffffffffffffLL
+
 #endif
diff --git a/include/uapi/linux/quota.h b/include/uapi/linux/quota.h
index f17c9636a859..52090105b828 100644
--- a/include/uapi/linux/quota.h
+++ b/include/uapi/linux/quota.h
@@ -77,6 +77,7 @@
 #define	QFMT_VFS_V0 2
 #define QFMT_OCFS2 3
 #define	QFMT_VFS_V1 4
+#define	QFMT_SHMEM 5
 
 /* Size of block in which space limits are passed through the quota
  * interface */
diff --git a/mm/Makefile b/mm/Makefile
index 678530a07326..d4ee20988dd1 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -51,7 +51,7 @@ obj-y			:= filemap.o mempool.o oom_kill.o fadvise.o \
 			   readahead.o swap.o truncate.o vmscan.o shmem.o \
 			   util.o mmzone.o vmstat.o backing-dev.o \
 			   mm_init.o percpu.o slab_common.o \
-			   compaction.o show_mem.o\
+			   compaction.o show_mem.o shmem_quota.o\
 			   interval_tree.o list_lru.o workingset.o \
 			   debug.o gup.o mmap_lock.o $(mmu-y)
 
diff --git a/mm/shmem_quota.c b/mm/shmem_quota.c
new file mode 100644
index 000000000000..c0b531e2ef68
--- /dev/null
+++ b/mm/shmem_quota.c
@@ -0,0 +1,318 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * In memory quota format relies on quota infrastructure to store dquot
+ * information for us. While conventional quota formats for file systems
+ * with persistent storage can load quota information into dquot from the
+ * storage on-demand and hence quota dquot shrinker can free any dquot
+ * that is not currently being used, it must be avoided here. Otherwise we
+ * can lose valuable information, user provided limits, because there is
+ * no persistent storage to load the information from afterwards.
+ *
+ * One information that in-memory quota format needs to keep track of is
+ * a sorted list of ids for each quota type. This is done by utilizing
+ * an rb tree which root is stored in mem_dqinfo->dqi_priv for each quota
+ * type.
+ *
+ * This format can be used to support quota on file system without persistent
+ * storage such as tmpfs.
+ *
+ * Author:	Lukas Czerner <lczerner@redhat.com>
+ *		Carlos Maiolino <cmaiolino@redhat.com>
+ *
+ * Copyright (C) 2023 Red Hat, Inc.
+ */
+#include <linux/errno.h>
+#include <linux/fs.h>
+#include <linux/mount.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/rbtree.h>
+#include <linux/shmem_fs.h>
+
+#include <linux/quotaops.h>
+#include <linux/quota.h>
+
+#ifdef CONFIG_TMPFS_QUOTA
+
+/*
+ * The following constants define the amount of time given a user
+ * before the soft limits are treated as hard limits (usually resulting
+ * in an allocation failure). The timer is started when the user crosses
+ * their soft limit, it is reset when they go below their soft limit.
+ */
+#define SHMEM_MAX_IQ_TIME 604800	/* (7*24*60*60) 1 week */
+#define SHMEM_MAX_DQ_TIME 604800	/* (7*24*60*60) 1 week */
+
+struct quota_id {
+	struct rb_node	node;
+	qid_t		id;
+	qsize_t		bhardlimit;
+	qsize_t		bsoftlimit;
+	qsize_t		ihardlimit;
+	qsize_t		isoftlimit;
+};
+
+static int shmem_check_quota_file(struct super_block *sb, int type)
+{
+	/* There is no real quota file, nothing to do */
+	return 1;
+}
+
+/*
+ * There is no real quota file. Just allocate rb_root for quota ids and
+ * set limits
+ */
+static int shmem_read_file_info(struct super_block *sb, int type)
+{
+	struct quota_info *dqopt = sb_dqopt(sb);
+	struct mem_dqinfo *info = &dqopt->info[type];
+
+	info->dqi_priv = kzalloc(sizeof(struct rb_root), GFP_NOFS);
+	if (!info->dqi_priv)
+		return -ENOMEM;
+
+	info->dqi_max_spc_limit = SHMEM_QUOTA_MAX_SPC_LIMIT;
+	info->dqi_max_ino_limit = SHMEM_QUOTA_MAX_INO_LIMIT;
+
+	info->dqi_bgrace = SHMEM_MAX_DQ_TIME;
+	info->dqi_igrace = SHMEM_MAX_IQ_TIME;
+	info->dqi_flags = 0;
+
+	return 0;
+}
+
+static int shmem_write_file_info(struct super_block *sb, int type)
+{
+	/* There is no real quota file, nothing to do */
+	return 0;
+}
+
+/*
+ * Free all the quota_id entries in the rb tree and rb_root.
+ */
+static int shmem_free_file_info(struct super_block *sb, int type)
+{
+	struct mem_dqinfo *info = &sb_dqopt(sb)->info[type];
+	struct rb_root *root = info->dqi_priv;
+	struct quota_id *entry;
+	struct rb_node *node;
+
+	info->dqi_priv = NULL;
+	node = rb_first(root);
+	while (node) {
+		entry = rb_entry(node, struct quota_id, node);
+		node = rb_next(&entry->node);
+
+		rb_erase(&entry->node, root);
+		kfree(entry);
+	}
+
+	kfree(root);
+	return 0;
+}
+
+static int shmem_get_next_id(struct super_block *sb, struct kqid *qid)
+{
+	struct mem_dqinfo *info = sb_dqinfo(sb, qid->type);
+	struct rb_node *node = ((struct rb_root *)info->dqi_priv)->rb_node;
+	qid_t id = from_kqid(&init_user_ns, *qid);
+	struct quota_info *dqopt = sb_dqopt(sb);
+	struct quota_id *entry = NULL;
+	int ret = 0;
+
+	if (!sb_has_quota_active(sb, qid->type))
+		return -ESRCH;
+
+	down_read(&dqopt->dqio_sem);
+	while (node) {
+		entry = rb_entry(node, struct quota_id, node);
+
+		if (id < entry->id)
+			node = node->rb_left;
+		else if (id > entry->id)
+			node = node->rb_right;
+		else
+			goto got_next_id;
+	}
+
+	if (!entry) {
+		ret = -ENOENT;
+		goto out_unlock;
+	}
+
+	if (id > entry->id) {
+		node = rb_next(&entry->node);
+		if (!node) {
+			ret = -ENOENT;
+			goto out_unlock;
+		}
+		entry = rb_entry(node, struct quota_id, node);
+	}
+
+got_next_id:
+	*qid = make_kqid(&init_user_ns, qid->type, entry->id);
+out_unlock:
+	up_read(&dqopt->dqio_sem);
+	return ret;
+}
+
+/*
+ * Load dquot with limits from existing entry, or create the new entry if
+ * it does not exist.
+ */
+static int shmem_acquire_dquot(struct dquot *dquot)
+{
+	struct mem_dqinfo *info = sb_dqinfo(dquot->dq_sb, dquot->dq_id.type);
+	struct rb_node **n = &((struct rb_root *)info->dqi_priv)->rb_node;
+	struct rb_node *parent = NULL, *new_node = NULL;
+	struct quota_id *new_entry, *entry;
+	qid_t id = from_kqid(&init_user_ns, dquot->dq_id);
+	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
+	int ret = 0;
+
+	mutex_lock(&dquot->dq_lock);
+
+	down_write(&dqopt->dqio_sem);
+	while (*n) {
+		parent = *n;
+		entry = rb_entry(parent, struct quota_id, node);
+
+		if (id < entry->id)
+			n = &(*n)->rb_left;
+		else if (id > entry->id)
+			n = &(*n)->rb_right;
+		else
+			goto found;
+	}
+
+	/* We don't have entry for this id yet, create it */
+	new_entry = kzalloc(sizeof(struct quota_id), GFP_NOFS);
+	if (!new_entry) {
+		ret = -ENOMEM;
+		goto out_unlock;
+	}
+
+	new_entry->id = id;
+	new_node = &new_entry->node;
+	rb_link_node(new_node, parent, n);
+	rb_insert_color(new_node, (struct rb_root *)info->dqi_priv);
+	entry = new_entry;
+
+found:
+	/* Load the stored limits from the tree */
+	spin_lock(&dquot->dq_dqb_lock);
+	dquot->dq_dqb.dqb_bhardlimit = entry->bhardlimit;
+	dquot->dq_dqb.dqb_bsoftlimit = entry->bsoftlimit;
+	dquot->dq_dqb.dqb_ihardlimit = entry->ihardlimit;
+	dquot->dq_dqb.dqb_isoftlimit = entry->isoftlimit;
+
+	if (!dquot->dq_dqb.dqb_bhardlimit &&
+	    !dquot->dq_dqb.dqb_bsoftlimit &&
+	    !dquot->dq_dqb.dqb_ihardlimit &&
+	    !dquot->dq_dqb.dqb_isoftlimit)
+		set_bit(DQ_FAKE_B, &dquot->dq_flags);
+	spin_unlock(&dquot->dq_dqb_lock);
+
+	/* Make sure flags update is visible after dquot has been filled */
+	smp_mb__before_atomic();
+	set_bit(DQ_ACTIVE_B, &dquot->dq_flags);
+out_unlock:
+	up_write(&dqopt->dqio_sem);
+	mutex_unlock(&dquot->dq_lock);
+	return ret;
+}
+
+/*
+ * Store limits from dquot in the tree unless it's fake. If it is fake
+ * remove the id from the tree since there is no useful information in
+ * there.
+ */
+static int shmem_release_dquot(struct dquot *dquot)
+{
+	struct mem_dqinfo *info = sb_dqinfo(dquot->dq_sb, dquot->dq_id.type);
+	struct rb_node *node = ((struct rb_root *)info->dqi_priv)->rb_node;
+	qid_t id = from_kqid(&init_user_ns, dquot->dq_id);
+	struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
+	struct quota_id *entry = NULL;
+
+	mutex_lock(&dquot->dq_lock);
+	/* Check whether we are not racing with some other dqget() */
+	if (dquot_is_busy(dquot))
+		goto out_dqlock;
+
+	down_write(&dqopt->dqio_sem);
+	while (node) {
+		entry = rb_entry(node, struct quota_id, node);
+
+		if (id < entry->id)
+			node = node->rb_left;
+		else if (id > entry->id)
+			node = node->rb_right;
+		else
+			goto found;
+	}
+
+	/* We should always find the entry in the rb tree */
+	WARN_ONCE(1, "quota id %u from dquot %p, not in rb tree!\n", id, dquot);
+	up_write(&dqopt->dqio_sem);
+	mutex_unlock(&dquot->dq_lock);
+	return -ENOENT;
+
+found:
+	if (test_bit(DQ_FAKE_B, &dquot->dq_flags)) {
+		/* Remove entry from the tree */
+		rb_erase(&entry->node, info->dqi_priv);
+		kfree(entry);
+	} else {
+		/* Store the limits in the tree */
+		spin_lock(&dquot->dq_dqb_lock);
+		entry->bhardlimit = dquot->dq_dqb.dqb_bhardlimit;
+		entry->bsoftlimit = dquot->dq_dqb.dqb_bsoftlimit;
+		entry->ihardlimit = dquot->dq_dqb.dqb_ihardlimit;
+		entry->isoftlimit = dquot->dq_dqb.dqb_isoftlimit;
+		spin_unlock(&dquot->dq_dqb_lock);
+	}
+
+	clear_bit(DQ_ACTIVE_B, &dquot->dq_flags);
+	up_write(&dqopt->dqio_sem);
+
+out_dqlock:
+	mutex_unlock(&dquot->dq_lock);
+	return 0;
+}
+
+int shmem_mark_dquot_dirty(struct dquot *dquot)
+{
+	return 0;
+}
+
+int shmem_dquot_write_info(struct super_block *sb, int type)
+{
+	return 0;
+}
+
+static const struct quota_format_ops shmem_format_ops = {
+	.check_quota_file	= shmem_check_quota_file,
+	.read_file_info		= shmem_read_file_info,
+	.write_file_info	= shmem_write_file_info,
+	.free_file_info		= shmem_free_file_info,
+};
+
+struct quota_format_type shmem_quota_format = {
+	.qf_fmt_id = QFMT_SHMEM,
+	.qf_ops = &shmem_format_ops,
+	.qf_owner = THIS_MODULE
+};
+
+const struct dquot_operations shmem_quota_operations = {
+	.acquire_dquot		= shmem_acquire_dquot,
+	.release_dquot		= shmem_release_dquot,
+	.alloc_dquot		= dquot_alloc,
+	.destroy_dquot		= dquot_destroy,
+	.write_info		= shmem_dquot_write_info,
+	.mark_dirty		= shmem_mark_dquot_dirty,
+	.get_next_id		= shmem_get_next_id,
+};
+#endif /* CONFIG_TMPFS_QUOTA */
-- 
2.39.2
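
The lookup performed by shmem_get_next_id() in the patch above returns the
smallest stored id that is greater than or equal to the requested one. The
sketch below shows the same semantics in userspace, with one deliberate
substitution: a sorted array and binary search stand in for the rb tree, so
this is not the patch's data structure. model_qid_t and model_get_next_id()
are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>

typedef unsigned int model_qid_t;

/*
 * `ids` must be sorted in ascending order. On success, stores the smallest
 * id >= *id back into *id and returns 0. Returns -ENOENT when no such id
 * exists, including when the set is empty.
 */
static int model_get_next_id(const model_qid_t *ids, size_t n, model_qid_t *id)
{
	size_t lo = 0, hi = n;

	/* Lower-bound binary search: find the first element not below *id. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ids[mid] < *id)
			lo = mid + 1;
		else
			hi = mid;
	}

	if (lo == n)
		return -ENOENT;

	*id = ids[lo];
	return 0;
}
```

Quota tools can use this kind of call to iterate over all ids with stored
limits: ask for id 0, then for the returned id plus one, until -ENOENT.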



* [PATCH 5/6] shmem: quota support
  2023-07-13 13:48 [PATCH RESEND V4 0/6] shmem: Add user and group quota support for tmpfs cem
                   ` (3 preceding siblings ...)
  2023-07-13 13:48 ` [PATCH 4/6] shmem: prepare shmem quota infrastructure cem
@ 2023-07-13 13:48 ` cem
  2023-07-14  9:54   ` Christian Brauner
  2023-07-13 13:48 ` [PATCH 6/6] Add default quota limit mount options cem
  5 siblings, 1 reply; 12+ messages in thread
From: cem @ 2023-07-13 13:48 UTC
  To: linux-fsdevel; +Cc: jack, akpm, viro, linux-mm, djwong, hughd, brauner, mcgrof

From: Carlos Maiolino <cem@kernel.org>

Now that the basic infrastructure is in place, enable quota support for tmpfs.

This offers user and group quotas to tmpfs (project quotas will be added
later). Also, as with other filesystems, tmpfs quota is not yet supported
within user namespaces, so idmappings are not translated.
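
For illustration only (not part of the patch): a minimal userspace sketch of
how the three mount options accumulate into the quota type mask that the
enable loop later walks. The helper names and MODEL_MAXQUOTAS are
hypothetical stand-ins; only the USRQUOTA/GRPQUOTA values and the
QTYPE_MASK_* definitions follow the kernel's.

```c
#include <assert.h>
#include <string.h>

/* Userspace stand-ins for the kernel's quota type constants. */
enum { USRQUOTA = 0, GRPQUOTA = 1, MODEL_MAXQUOTAS = 2 /* assumed */ };
#define QTYPE_MASK_USR (1 << USRQUOTA)
#define QTYPE_MASK_GRP (1 << GRPQUOTA)

/* Hypothetical helper: fold one mount option name into the type mask. */
static unsigned short model_parse_quota_opt(unsigned short types,
					    const char *opt)
{
	assert(opt != NULL);
	if (strcmp(opt, "quota") == 0)
		types |= QTYPE_MASK_USR | QTYPE_MASK_GRP;
	else if (strcmp(opt, "usrquota") == 0)
		types |= QTYPE_MASK_USR;
	else if (strcmp(opt, "grpquota") == 0)
		types |= QTYPE_MASK_GRP;
	return types;
}

/* Hypothetical helper: count the types an enable loop would turn on. */
static int model_count_enabled(unsigned short types)
{
	int type, n = 0;

	for (type = 0; type < MODEL_MAXQUOTAS; type++)
		if (types & (1 << type))
			n++;
	return n;
}
```

With this model, "quota" alone sets both bits, while "usrquota" followed by
"grpquota" ends up with the same mask, which matches how the options are
OR-ed into ctx->quota_types in shmem_parse_one().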

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 Documentation/filesystems/tmpfs.rst |  15 +++
 include/linux/shmem_fs.h            |   8 ++
 mm/shmem.c                          | 180 ++++++++++++++++++++++++++--
 3 files changed, 195 insertions(+), 8 deletions(-)

diff --git a/Documentation/filesystems/tmpfs.rst b/Documentation/filesystems/tmpfs.rst
index f18f46be5c0c..0c7d8bd052f1 100644
--- a/Documentation/filesystems/tmpfs.rst
+++ b/Documentation/filesystems/tmpfs.rst
@@ -130,6 +130,21 @@ for emergency or testing purposes. The values you can set for shmem_enabled are:
     option, for testing
 ==  ============================================================
 
+tmpfs also supports quota with the following mount options
+
+========  =============================================================
+quota     User and group quota accounting and enforcement is enabled on
+          the mount. Tmpfs is using hidden system quota files that are
+          initialized on mount.
+usrquota  User quota accounting and enforcement is enabled on the
+          mount.
+grpquota  Group quota accounting and enforcement is enabled on the
+          mount.
+========  =============================================================
+
+Note that tmpfs quotas do not support user namespaces so no uid/gid
+translation is done if quotas are enabled inside user namespaces.
+
 tmpfs has a mount option to set the NUMA memory allocation policy for
 all files in that instance (if CONFIG_NUMA is enabled) - which can be
 adjusted on the fly via 'mount -o remount ...'
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 7abfaf70b58a..1a568a0f542f 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -31,6 +31,9 @@ struct shmem_inode_info {
 	atomic_t		stop_eviction;	/* hold when working on inode */
 	struct timespec64	i_crtime;	/* file creation time */
 	unsigned int		fsflags;	/* flags for FS_IOC_[SG]ETFLAGS */
+#ifdef CONFIG_TMPFS_QUOTA
+	struct dquot		*i_dquot[MAXQUOTAS];
+#endif
 	struct inode		vfs_inode;
 };
 
@@ -184,4 +187,9 @@ extern int shmem_mfill_atomic_pte(pmd_t *dst_pmd,
 #define SHMEM_QUOTA_MAX_SPC_LIMIT 0x7fffffffffffffffLL /* 2^63-1 */
 #define SHMEM_QUOTA_MAX_INO_LIMIT 0x7fffffffffffffffLL
 
+#ifdef CONFIG_TMPFS_QUOTA
+extern const struct dquot_operations shmem_quota_operations;
+extern struct quota_format_type shmem_quota_format;
+#endif /* CONFIG_TMPFS_QUOTA */
+
 #endif
diff --git a/mm/shmem.c b/mm/shmem.c
index 2a7b8060b6f4..5022238dd68d 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -78,6 +78,7 @@ static struct vfsmount *shm_mnt;
 #include <uapi/linux/memfd.h>
 #include <linux/rmap.h>
 #include <linux/uuid.h>
+#include <linux/quotaops.h>
 
 #include <linux/uaccess.h>
 
@@ -116,11 +117,13 @@ struct shmem_options {
 	int huge;
 	int seen;
 	bool noswap;
+	unsigned short quota_types;
 #define SHMEM_SEEN_BLOCKS 1
 #define SHMEM_SEEN_INODES 2
 #define SHMEM_SEEN_HUGE 4
 #define SHMEM_SEEN_INUMS 8
 #define SHMEM_SEEN_NOSWAP 16
+#define SHMEM_SEEN_QUOTA 32
 };
 
 #ifdef CONFIG_TMPFS
@@ -212,7 +215,16 @@ static inline int shmem_inode_acct_block(struct inode *inode, long pages)
 		if (percpu_counter_compare(&sbinfo->used_blocks,
 					   sbinfo->max_blocks - pages) > 0)
 			goto unacct;
+
+		err = dquot_alloc_block_nodirty(inode, pages);
+		if (err)
+			goto unacct;
+
 		percpu_counter_add(&sbinfo->used_blocks, pages);
+	} else {
+		err = dquot_alloc_block_nodirty(inode, pages);
+		if (err)
+			goto unacct;
 	}
 
 	return 0;
@@ -227,6 +239,8 @@ static inline void shmem_inode_unacct_blocks(struct inode *inode, long pages)
 	struct shmem_inode_info *info = SHMEM_I(inode);
 	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
 
+	dquot_free_block_nodirty(inode, pages);
+
 	if (sbinfo->max_blocks)
 		percpu_counter_sub(&sbinfo->used_blocks, pages);
 	shmem_unacct_blocks(info->flags, pages);
@@ -255,6 +269,47 @@ bool vma_is_shmem(struct vm_area_struct *vma)
 static LIST_HEAD(shmem_swaplist);
 static DEFINE_MUTEX(shmem_swaplist_mutex);
 
+#ifdef CONFIG_TMPFS_QUOTA
+
+static int shmem_enable_quotas(struct super_block *sb,
+			       unsigned short quota_types)
+{
+	int type, err = 0;
+
+	sb_dqopt(sb)->flags |= DQUOT_QUOTA_SYS_FILE | DQUOT_NOLIST_DIRTY;
+	for (type = 0; type < SHMEM_MAXQUOTAS; type++) {
+		if (!(quota_types & (1 << type)))
+			continue;
+		err = dquot_load_quota_sb(sb, type, QFMT_SHMEM,
+					  DQUOT_USAGE_ENABLED |
+					  DQUOT_LIMITS_ENABLED);
+		if (err)
+			goto out_err;
+	}
+	return 0;
+
+out_err:
+	pr_warn("tmpfs: failed to enable quota tracking (type=%d, err=%d)\n",
+		type, err);
+	for (type--; type >= 0; type--)
+		dquot_quota_off(sb, type);
+	return err;
+}
+
+static void shmem_disable_quotas(struct super_block *sb)
+{
+	int type;
+
+	for (type = 0; type < SHMEM_MAXQUOTAS; type++)
+		dquot_quota_off(sb, type);
+}
+
+static struct dquot **shmem_get_dquots(struct inode *inode)
+{
+	return SHMEM_I(inode)->i_dquot;
+}
+#endif /* CONFIG_TMPFS_QUOTA */
+
 /*
  * shmem_reserve_inode() performs bookkeeping to reserve a shmem inode, and
  * produces a novel ino for the newly allocated inode.
@@ -361,7 +416,6 @@ static void shmem_recalc_inode(struct inode *inode)
 	freed = info->alloced - info->swapped - inode->i_mapping->nrpages;
 	if (freed > 0) {
 		info->alloced -= freed;
-		inode->i_blocks -= freed * BLOCKS_PER_PAGE;
 		shmem_inode_unacct_blocks(inode, freed);
 	}
 }
@@ -379,7 +433,6 @@ bool shmem_charge(struct inode *inode, long pages)
 
 	spin_lock_irqsave(&info->lock, flags);
 	info->alloced += pages;
-	inode->i_blocks += pages * BLOCKS_PER_PAGE;
 	shmem_recalc_inode(inode);
 	spin_unlock_irqrestore(&info->lock, flags);
 
@@ -395,7 +448,6 @@ void shmem_uncharge(struct inode *inode, long pages)
 
 	spin_lock_irqsave(&info->lock, flags);
 	info->alloced -= pages;
-	inode->i_blocks -= pages * BLOCKS_PER_PAGE;
 	shmem_recalc_inode(inode);
 	spin_unlock_irqrestore(&info->lock, flags);
 
@@ -1141,6 +1193,21 @@ static int shmem_setattr(struct mnt_idmap *idmap,
 		}
 	}
 
+	if (is_quota_modification(idmap, inode, attr)) {
+		error = dquot_initialize(inode);
+		if (error)
+			return error;
+	}
+
+	/* Transfer quota accounting */
+	if (i_uid_needs_update(idmap, attr, inode) ||
+	    i_gid_needs_update(idmap, attr, inode)) {
+		error = dquot_transfer(idmap, inode, attr);
+
+		if (error)
+			return error;
+	}
+
 	setattr_copy(idmap, inode, attr);
 	if (attr->ia_valid & ATTR_MODE)
 		error = posix_acl_chmod(idmap, dentry, inode->i_mode);
@@ -1187,6 +1254,10 @@ static void shmem_evict_inode(struct inode *inode)
 	WARN_ON(inode->i_blocks);
 	shmem_free_inode(inode->i_sb);
 	clear_inode(inode);
+#ifdef CONFIG_TMPFS_QUOTA
+	dquot_free_inode(inode);
+	dquot_drop(inode);
+#endif
 }
 
 static int shmem_find_swap_entries(struct address_space *mapping,
@@ -1986,7 +2057,6 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
 
 	spin_lock_irq(&info->lock);
 	info->alloced += folio_nr_pages(folio);
-	inode->i_blocks += (blkcnt_t)BLOCKS_PER_PAGE << folio_order(folio);
 	shmem_recalc_inode(inode);
 	spin_unlock_irq(&info->lock);
 	alloced = true;
@@ -2357,9 +2427,10 @@ static void shmem_set_inode_flags(struct inode *inode, unsigned int fsflags)
 #define shmem_initxattrs NULL
 #endif
 
-static struct inode *shmem_get_inode(struct mnt_idmap *idmap, struct super_block *sb,
-				     struct inode *dir, umode_t mode, dev_t dev,
-				     unsigned long flags)
+static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
+					     struct super_block *sb,
+					     struct inode *dir, umode_t mode,
+					     dev_t dev, unsigned long flags)
 {
 	struct inode *inode;
 	struct shmem_inode_info *info;
@@ -2436,6 +2507,43 @@ static struct inode *shmem_get_inode(struct mnt_idmap *idmap, struct super_block
 	return inode;
 }
 
+#ifdef CONFIG_TMPFS_QUOTA
+static struct inode *shmem_get_inode(struct mnt_idmap *idmap,
+				     struct super_block *sb, struct inode *dir,
+				     umode_t mode, dev_t dev, unsigned long flags)
+{
+	int err;
+	struct inode *inode;
+
+	inode = __shmem_get_inode(idmap, sb, dir, mode, dev, flags);
+	if (IS_ERR(inode))
+		return inode;
+
+	err = dquot_initialize(inode);
+	if (err)
+		goto errout;
+
+	err = dquot_alloc_inode(inode);
+	if (err) {
+		dquot_drop(inode);
+		goto errout;
+	}
+	return inode;
+
+errout:
+	inode->i_flags |= S_NOQUOTA;
+	iput(inode);
+	return ERR_PTR(err);
+}
+#else
+static inline struct inode *shmem_get_inode(struct mnt_idmap *idmap,
+				     struct super_block *sb, struct inode *dir,
+				     umode_t mode, dev_t dev, unsigned long flags)
+{
+	return __shmem_get_inode(idmap, sb, dir, mode, dev, flags);
+}
+#endif /* CONFIG_TMPFS_QUOTA */
+
 #ifdef CONFIG_USERFAULTFD
 int shmem_mfill_atomic_pte(pmd_t *dst_pmd,
 			   struct vm_area_struct *dst_vma,
@@ -2538,7 +2646,6 @@ int shmem_mfill_atomic_pte(pmd_t *dst_pmd,
 
 	spin_lock_irq(&info->lock);
 	info->alloced++;
-	inode->i_blocks += BLOCKS_PER_PAGE;
 	shmem_recalc_inode(inode);
 	spin_unlock_irq(&info->lock);
 
@@ -3516,6 +3623,7 @@ static ssize_t shmem_listxattr(struct dentry *dentry, char *buffer, size_t size)
 
 static const struct inode_operations shmem_short_symlink_operations = {
 	.getattr	= shmem_getattr,
+	.setattr	= shmem_setattr,
 	.get_link	= simple_get_link,
 #ifdef CONFIG_TMPFS_XATTR
 	.listxattr	= shmem_listxattr,
@@ -3524,6 +3632,7 @@ static const struct inode_operations shmem_short_symlink_operations = {
 
 static const struct inode_operations shmem_symlink_inode_operations = {
 	.getattr	= shmem_getattr,
+	.setattr	= shmem_setattr,
 	.get_link	= shmem_get_link,
 #ifdef CONFIG_TMPFS_XATTR
 	.listxattr	= shmem_listxattr,
@@ -3623,6 +3732,9 @@ enum shmem_param {
 	Opt_inode32,
 	Opt_inode64,
 	Opt_noswap,
+	Opt_quota,
+	Opt_usrquota,
+	Opt_grpquota,
 };
 
 static const struct constant_table shmem_param_enums_huge[] = {
@@ -3645,6 +3757,11 @@ const struct fs_parameter_spec shmem_fs_parameters[] = {
 	fsparam_flag  ("inode32",	Opt_inode32),
 	fsparam_flag  ("inode64",	Opt_inode64),
 	fsparam_flag  ("noswap",	Opt_noswap),
+#ifdef CONFIG_TMPFS_QUOTA
+	fsparam_flag  ("quota",		Opt_quota),
+	fsparam_flag  ("usrquota",	Opt_usrquota),
+	fsparam_flag  ("grpquota",	Opt_grpquota),
+#endif
 	{}
 };
 
@@ -3736,6 +3853,18 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
 		ctx->noswap = true;
 		ctx->seen |= SHMEM_SEEN_NOSWAP;
 		break;
+	case Opt_quota:
+		ctx->seen |= SHMEM_SEEN_QUOTA;
+		ctx->quota_types |= (QTYPE_MASK_USR | QTYPE_MASK_GRP);
+		break;
+	case Opt_usrquota:
+		ctx->seen |= SHMEM_SEEN_QUOTA;
+		ctx->quota_types |= QTYPE_MASK_USR;
+		break;
+	case Opt_grpquota:
+		ctx->seen |= SHMEM_SEEN_QUOTA;
+		ctx->quota_types |= QTYPE_MASK_GRP;
+		break;
 	}
 	return 0;
 
@@ -3843,6 +3972,12 @@ static int shmem_reconfigure(struct fs_context *fc)
 		goto out;
 	}
 
+	if (ctx->seen & SHMEM_SEEN_QUOTA &&
+	    !sb_any_quota_loaded(fc->root->d_sb)) {
+		err = "Cannot enable quota on remount";
+		goto out;
+	}
+
 	if (ctx->seen & SHMEM_SEEN_HUGE)
 		sbinfo->huge = ctx->huge;
 	if (ctx->seen & SHMEM_SEEN_INUMS)
@@ -3934,6 +4069,9 @@ static void shmem_put_super(struct super_block *sb)
 {
 	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
 
+#ifdef CONFIG_TMPFS_QUOTA
+	shmem_disable_quotas(sb);
+#endif
 	free_percpu(sbinfo->ino_batch);
 	percpu_counter_destroy(&sbinfo->used_blocks);
 	mpol_put(sbinfo->mpol);
@@ -4013,6 +4151,17 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc)
 #endif
 	uuid_gen(&sb->s_uuid);
 
+#ifdef CONFIG_TMPFS_QUOTA
+	if (ctx->seen & SHMEM_SEEN_QUOTA) {
+		sb->dq_op = &shmem_quota_operations;
+		sb->s_qcop = &dquot_quotactl_sysfile_ops;
+		sb->s_quota_types = QTYPE_MASK_USR | QTYPE_MASK_GRP;
+
+		if (shmem_enable_quotas(sb, ctx->quota_types))
+			goto failed;
+	}
+#endif /* CONFIG_TMPFS_QUOTA */
+
 	inode = shmem_get_inode(&nop_mnt_idmap, sb, NULL, S_IFDIR | sbinfo->mode, 0,
 				VM_NORESERVE);
 	if (IS_ERR(inode)) {
@@ -4188,6 +4337,9 @@ static const struct super_operations shmem_ops = {
 #ifdef CONFIG_TMPFS
 	.statfs		= shmem_statfs,
 	.show_options	= shmem_show_options,
+#endif
+#ifdef CONFIG_TMPFS_QUOTA
+	.get_dquots	= shmem_get_dquots,
 #endif
 	.evict_inode	= shmem_evict_inode,
 	.drop_inode	= generic_delete_inode,
@@ -4254,6 +4406,14 @@ void __init shmem_init(void)
 
 	shmem_init_inodecache();
 
+#ifdef CONFIG_TMPFS_QUOTA
+	error = register_quota_format(&shmem_quota_format);
+	if (error < 0) {
+		pr_err("Could not register quota format\n");
+		goto out3;
+	}
+#endif
+
 	error = register_filesystem(&shmem_fs_type);
 	if (error) {
 		pr_err("Could not register tmpfs\n");
@@ -4278,6 +4438,10 @@ void __init shmem_init(void)
 out1:
 	unregister_filesystem(&shmem_fs_type);
 out2:
+#ifdef CONFIG_TMPFS_QUOTA
+	unregister_quota_format(&shmem_quota_format);
+out3:
+#endif
 	shmem_destroy_inodecache();
 	shm_mnt = ERR_PTR(error);
 }
-- 
2.39.2


* [PATCH 6/6] Add default quota limit mount options
  2023-07-13 13:48 [PATCH RESEND V4 0/6] shmem: Add user and group quota support for tmpfs cem
                   ` (4 preceding siblings ...)
  2023-07-13 13:48 ` [PATCH 5/6] shmem: quota support cem
@ 2023-07-13 13:48 ` cem
  5 siblings, 0 replies; 12+ messages in thread
From: cem @ 2023-07-13 13:48 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: jack, akpm, viro, linux-mm, djwong, hughd, brauner, mcgrof

From: Lukas Czerner <lczerner@redhat.com>

Allow the system administrator to set default global quota limits at tmpfs
mount time.
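
For illustration only (not part of the patch): a rough userspace model of the
checks applied to each *_hardlimit value. The patch itself uses the kernel's
memparse(); the sketch below substitutes strtoull() plus a hand-rolled k/m/g
suffix step, so model_parse_limit() and MODEL_MAX_LIMIT are hypothetical
names and details such as sign or whitespace handling may differ from the
kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* 2^63-1, the same value as SHMEM_QUOTA_MAX_SPC_LIMIT in the patch. */
#define MODEL_MAX_LIMIT 0x7fffffffffffffffULL

/*
 * Hypothetical model of the limit parsing: returns 0 and stores the value
 * on success, -1 on a malformed, zero, or too-large value.
 */
static int model_parse_limit(const char *s, uint64_t *out)
{
	char *rest;
	unsigned long long v;
	unsigned int shift = 0;

	assert(s != NULL && out != NULL);
	v = strtoull(s, &rest, 0);

	/* Optional binary suffix: k = 2^10, m = 2^20, g = 2^30. */
	switch (*rest) {
	case 'k': case 'K': shift = 10; break;
	case 'm': case 'M': shift = 20; break;
	case 'g': case 'G': shift = 30; break;
	default: break;
	}
	if (shift) {
		v <<= shift;
		rest++;
	}

	if (*rest != '\0' || v == 0)
		return -1;	/* mirrors "if (*rest || !size) goto bad_value" */
	if (v > MODEL_MAX_LIMIT)
		return -1;	/* mirrors the "hardlimit too large" rejection */

	*out = v;
	return 0;
}
```

In this model "10m" parses to 10485760, while "0" and "5x" are rejected, in
the same way the patch sends zero or trailing garbage to bad_value.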

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
---
 Documentation/filesystems/tmpfs.rst | 34 +++++++++++-----
 include/linux/shmem_fs.h            |  8 ++++
 mm/shmem.c                          | 61 +++++++++++++++++++++++++++++
 mm/shmem_quota.c                    | 34 +++++++++++++++-
 4 files changed, 127 insertions(+), 10 deletions(-)

diff --git a/Documentation/filesystems/tmpfs.rst b/Documentation/filesystems/tmpfs.rst
index 0c7d8bd052f1..f843dbbeb589 100644
--- a/Documentation/filesystems/tmpfs.rst
+++ b/Documentation/filesystems/tmpfs.rst
@@ -132,15 +132,31 @@ for emergency or testing purposes. The values you can set for shmem_enabled are:
 
 tmpfs also supports quota with the following mount options
 
-========  =============================================================
-quota     User and group quota accounting and enforcement is enabled on
-          the mount. Tmpfs is using hidden system quota files that are
-          initialized on mount.
-usrquota  User quota accounting and enforcement is enabled on the
-          mount.
-grpquota  Group quota accounting and enforcement is enabled on the
-          mount.
-========  =============================================================
+======================== =================================================
+quota                    User and group quota accounting and enforcement
+                         is enabled on the mount. Tmpfs is using hidden
+                         system quota files that are initialized on mount.
+usrquota                 User quota accounting and enforcement is enabled
+                         on the mount.
+grpquota                 Group quota accounting and enforcement is enabled
+                         on the mount.
+usrquota_block_hardlimit Set global user quota block hard limit.
+usrquota_inode_hardlimit Set global user quota inode hard limit.
+grpquota_block_hardlimit Set global group quota block hard limit.
+grpquota_inode_hardlimit Set global group quota inode hard limit.
+======================== =================================================
+
+None of the quota related mount options can be set or changed on remount.
+
+Quota limit parameters accept a suffix k, m or g for kilo, mega and giga
+and can't be changed on remount. Default global quota limits take effect
+for any and all user/group/project except root the first time the quota
+entry for that user/group/project id is accessed - typically the first
+time an inode owned by a particular id is created after the mount. In
+other words, instead of being initialized to zero, the limits are
+initialized with the particular values provided with these mount
+options. The limits can be changed for any user/group id at any time as
+they normally can be.
 
 Note that tmpfs quotas do not support user namespaces so no uid/gid
 translation is done if quotas are enabled inside user namespaces.
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 1a568a0f542f..c0058f3bba70 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -42,6 +42,13 @@ struct shmem_inode_info {
 	(FS_IMMUTABLE_FL | FS_APPEND_FL | FS_NODUMP_FL | FS_NOATIME_FL)
 #define SHMEM_FL_INHERITED		(FS_NODUMP_FL | FS_NOATIME_FL)
 
+struct shmem_quota_limits {
+	qsize_t usrquota_bhardlimit; /* Default user quota block hard limit */
+	qsize_t usrquota_ihardlimit; /* Default user quota inode hard limit */
+	qsize_t grpquota_bhardlimit; /* Default group quota block hard limit */
+	qsize_t grpquota_ihardlimit; /* Default group quota inode hard limit */
+};
+
 struct shmem_sb_info {
 	unsigned long max_blocks;   /* How many blocks are allowed */
 	struct percpu_counter used_blocks;  /* How many are allocated */
@@ -60,6 +67,7 @@ struct shmem_sb_info {
 	spinlock_t shrinklist_lock;   /* Protects shrinklist */
 	struct list_head shrinklist;  /* List of shinkable inodes */
 	unsigned long shrinklist_len; /* Length of shrinklist */
+	struct shmem_quota_limits qlimits; /* Default quota limits */
 };
 
 static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
diff --git a/mm/shmem.c b/mm/shmem.c
index 5022238dd68d..083ce6b478e7 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -118,6 +118,7 @@ struct shmem_options {
 	int seen;
 	bool noswap;
 	unsigned short quota_types;
+	struct shmem_quota_limits qlimits;
 #define SHMEM_SEEN_BLOCKS 1
 #define SHMEM_SEEN_INODES 2
 #define SHMEM_SEEN_HUGE 4
@@ -3735,6 +3736,10 @@ enum shmem_param {
 	Opt_quota,
 	Opt_usrquota,
 	Opt_grpquota,
+	Opt_usrquota_block_hardlimit,
+	Opt_usrquota_inode_hardlimit,
+	Opt_grpquota_block_hardlimit,
+	Opt_grpquota_inode_hardlimit,
 };
 
 static const struct constant_table shmem_param_enums_huge[] = {
@@ -3761,6 +3766,10 @@ const struct fs_parameter_spec shmem_fs_parameters[] = {
 	fsparam_flag  ("quota",		Opt_quota),
 	fsparam_flag  ("usrquota",	Opt_usrquota),
 	fsparam_flag  ("grpquota",	Opt_grpquota),
+	fsparam_string("usrquota_block_hardlimit", Opt_usrquota_block_hardlimit),
+	fsparam_string("usrquota_inode_hardlimit", Opt_usrquota_inode_hardlimit),
+	fsparam_string("grpquota_block_hardlimit", Opt_grpquota_block_hardlimit),
+	fsparam_string("grpquota_inode_hardlimit", Opt_grpquota_inode_hardlimit),
 #endif
 	{}
 };
@@ -3865,6 +3874,42 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
 		ctx->seen |= SHMEM_SEEN_QUOTA;
 		ctx->quota_types |= QTYPE_MASK_GRP;
 		break;
+	case Opt_usrquota_block_hardlimit:
+		size = memparse(param->string, &rest);
+		if (*rest || !size)
+			goto bad_value;
+		if (size > SHMEM_QUOTA_MAX_SPC_LIMIT)
+			return invalfc(fc,
+				       "User quota block hardlimit too large.");
+		ctx->qlimits.usrquota_bhardlimit = size;
+		break;
+	case Opt_grpquota_block_hardlimit:
+		size = memparse(param->string, &rest);
+		if (*rest || !size)
+			goto bad_value;
+		if (size > SHMEM_QUOTA_MAX_SPC_LIMIT)
+			return invalfc(fc,
+				       "Group quota block hardlimit too large.");
+		ctx->qlimits.grpquota_bhardlimit = size;
+		break;
+	case Opt_usrquota_inode_hardlimit:
+		size = memparse(param->string, &rest);
+		if (*rest || !size)
+			goto bad_value;
+		if (size > SHMEM_QUOTA_MAX_INO_LIMIT)
+			return invalfc(fc,
+				       "User quota inode hardlimit too large.");
+		ctx->qlimits.usrquota_ihardlimit = size;
+		break;
+	case Opt_grpquota_inode_hardlimit:
+		size = memparse(param->string, &rest);
+		if (*rest || !size)
+			goto bad_value;
+		if (size > SHMEM_QUOTA_MAX_INO_LIMIT)
+			return invalfc(fc,
+				       "Group quota inode hardlimit too large.");
+		ctx->qlimits.grpquota_ihardlimit = size;
+		break;
 	}
 	return 0;
 
@@ -3978,6 +4023,18 @@ static int shmem_reconfigure(struct fs_context *fc)
 		goto out;
 	}
 
+#ifdef CONFIG_TMPFS_QUOTA
+#define CHANGED_LIMIT(name)						\
+	(ctx->qlimits.name## hardlimit &&				\
+	(ctx->qlimits.name## hardlimit != sbinfo->qlimits.name## hardlimit))
+
+	if (CHANGED_LIMIT(usrquota_b) || CHANGED_LIMIT(usrquota_i) ||
+	    CHANGED_LIMIT(grpquota_b) || CHANGED_LIMIT(grpquota_i)) {
+		err = "Cannot change global quota limit on remount";
+		goto out;
+	}
+#endif /* CONFIG_TMPFS_QUOTA */
+
 	if (ctx->seen & SHMEM_SEEN_HUGE)
 		sbinfo->huge = ctx->huge;
 	if (ctx->seen & SHMEM_SEEN_INUMS)
@@ -4157,6 +4214,10 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc)
 		sb->s_qcop = &dquot_quotactl_sysfile_ops;
 		sb->s_quota_types = QTYPE_MASK_USR | QTYPE_MASK_GRP;
 
+		/* Copy the default limits from ctx into sbinfo */
+		memcpy(&sbinfo->qlimits, &ctx->qlimits,
+		       sizeof(struct shmem_quota_limits));
+
 		if (shmem_enable_quotas(sb, ctx->quota_types))
 			goto failed;
 	}
diff --git a/mm/shmem_quota.c b/mm/shmem_quota.c
index c0b531e2ef68..e349c0901bce 100644
--- a/mm/shmem_quota.c
+++ b/mm/shmem_quota.c
@@ -166,6 +166,7 @@ static int shmem_acquire_dquot(struct dquot *dquot)
 {
 	struct mem_dqinfo *info = sb_dqinfo(dquot->dq_sb, dquot->dq_id.type);
 	struct rb_node **n = &((struct rb_root *)info->dqi_priv)->rb_node;
+	struct shmem_sb_info *sbinfo = dquot->dq_sb->s_fs_info;
 	struct rb_node *parent = NULL, *new_node = NULL;
 	struct quota_id *new_entry, *entry;
 	qid_t id = from_kqid(&init_user_ns, dquot->dq_id);
@@ -195,6 +196,14 @@ static int shmem_acquire_dquot(struct dquot *dquot)
 	}
 
 	new_entry->id = id;
+	if (dquot->dq_id.type == USRQUOTA) {
+		new_entry->bhardlimit = sbinfo->qlimits.usrquota_bhardlimit;
+		new_entry->ihardlimit = sbinfo->qlimits.usrquota_ihardlimit;
+	} else if (dquot->dq_id.type == GRPQUOTA) {
+		new_entry->bhardlimit = sbinfo->qlimits.grpquota_bhardlimit;
+		new_entry->ihardlimit = sbinfo->qlimits.grpquota_ihardlimit;
+	}
+
 	new_node = &new_entry->node;
 	rb_link_node(new_node, parent, n);
 	rb_insert_color(new_node, (struct rb_root *)info->dqi_priv);
@@ -224,6 +233,29 @@ static int shmem_acquire_dquot(struct dquot *dquot)
 	return ret;
 }
 
+static bool shmem_is_empty_dquot(struct dquot *dquot)
+{
+	struct shmem_sb_info *sbinfo = dquot->dq_sb->s_fs_info;
+	qsize_t bhardlimit;
+	qsize_t ihardlimit;
+
+	if (dquot->dq_id.type == USRQUOTA) {
+		bhardlimit = sbinfo->qlimits.usrquota_bhardlimit;
+		ihardlimit = sbinfo->qlimits.usrquota_ihardlimit;
+	} else if (dquot->dq_id.type == GRPQUOTA) {
+		bhardlimit = sbinfo->qlimits.grpquota_bhardlimit;
+		ihardlimit = sbinfo->qlimits.grpquota_ihardlimit;
+	}
+
+	if (test_bit(DQ_FAKE_B, &dquot->dq_flags) ||
+		(dquot->dq_dqb.dqb_curspace == 0 &&
+		 dquot->dq_dqb.dqb_curinodes == 0 &&
+		 dquot->dq_dqb.dqb_bhardlimit == bhardlimit &&
+		 dquot->dq_dqb.dqb_ihardlimit == ihardlimit))
+		return true;
+
+	return false;
+}
 /*
  * Store limits from dquot in the tree unless it's fake. If it is fake
  * remove the id from the tree since there is no useful information in
@@ -261,7 +293,7 @@ static int shmem_release_dquot(struct dquot *dquot)
 	return -ENOENT;
 
 found:
-	if (test_bit(DQ_FAKE_B, &dquot->dq_flags)) {
+	if (shmem_is_empty_dquot(dquot)) {
 		/* Remove entry from the tree */
 		rb_erase(&entry->node, info->dqi_priv);
 		kfree(entry);
-- 
2.39.2


* Re: [PATCH 5/6] shmem: quota support
  2023-07-13 13:48 ` [PATCH 5/6] shmem: quota support cem
@ 2023-07-14  9:54   ` Christian Brauner
  2023-07-14 10:40     ` Carlos Maiolino
  2023-07-14 12:26     ` Carlos Maiolino
  0 siblings, 2 replies; 12+ messages in thread
From: Christian Brauner @ 2023-07-14  9:54 UTC (permalink / raw)
  To: cem; +Cc: linux-fsdevel, jack, akpm, viro, linux-mm, djwong, hughd, mcgrof

On Thu, Jul 13, 2023 at 03:48:47PM +0200, cem@kernel.org wrote:
> From: Carlos Maiolino <cem@kernel.org>
> 
> Now that the basic infrastructure is in place, enable quota support for tmpfs.
> 
> This offers user and group quotas to tmpfs (project quotas will be added
> later). Also, as with other filesystems, tmpfs quota is not yet supported
> within user namespaces, so idmappings are not translated.
> 
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> Reviewed-by: Jan Kara <jack@suse.cz>
> ---
>  Documentation/filesystems/tmpfs.rst |  15 +++
>  include/linux/shmem_fs.h            |   8 ++
>  mm/shmem.c                          | 180 ++++++++++++++++++++++++++--
>  3 files changed, 195 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/filesystems/tmpfs.rst b/Documentation/filesystems/tmpfs.rst
> index f18f46be5c0c..0c7d8bd052f1 100644
> --- a/Documentation/filesystems/tmpfs.rst
> +++ b/Documentation/filesystems/tmpfs.rst
> @@ -130,6 +130,21 @@ for emergency or testing purposes. The values you can set for shmem_enabled are:
>      option, for testing
>  ==  ============================================================
>  
> +tmpfs also supports quota with the following mount options
> +
> +========  =============================================================
> +quota     User and group quota accounting and enforcement is enabled on
> +          the mount. Tmpfs is using hidden system quota files that are
> +          initialized on mount.
> +usrquota  User quota accounting and enforcement is enabled on the
> +          mount.
> +grpquota  Group quota accounting and enforcement is enabled on the
> +          mount.
> +========  =============================================================
> +
> +Note that tmpfs quotas do not support user namespaces so no uid/gid
> +translation is done if quotas are enabled inside user namespaces.
> +
>  tmpfs has a mount option to set the NUMA memory allocation policy for
>  all files in that instance (if CONFIG_NUMA is enabled) - which can be
>  adjusted on the fly via 'mount -o remount ...'
> diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
> index 7abfaf70b58a..1a568a0f542f 100644
> --- a/include/linux/shmem_fs.h
> +++ b/include/linux/shmem_fs.h
> @@ -31,6 +31,9 @@ struct shmem_inode_info {
>  	atomic_t		stop_eviction;	/* hold when working on inode */
>  	struct timespec64	i_crtime;	/* file creation time */
>  	unsigned int		fsflags;	/* flags for FS_IOC_[SG]ETFLAGS */
> +#ifdef CONFIG_TMPFS_QUOTA
> +	struct dquot		*i_dquot[MAXQUOTAS];
> +#endif
>  	struct inode		vfs_inode;
>  };
>  
> @@ -184,4 +187,9 @@ extern int shmem_mfill_atomic_pte(pmd_t *dst_pmd,
>  #define SHMEM_QUOTA_MAX_SPC_LIMIT 0x7fffffffffffffffLL /* 2^63-1 */
>  #define SHMEM_QUOTA_MAX_INO_LIMIT 0x7fffffffffffffffLL
>  
> +#ifdef CONFIG_TMPFS_QUOTA
> +extern const struct dquot_operations shmem_quota_operations;
> +extern struct quota_format_type shmem_quota_format;
> +#endif /* CONFIG_TMPFS_QUOTA */
> +
>  #endif
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 2a7b8060b6f4..5022238dd68d 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -78,6 +78,7 @@ static struct vfsmount *shm_mnt;
>  #include <uapi/linux/memfd.h>
>  #include <linux/rmap.h>
>  #include <linux/uuid.h>
> +#include <linux/quotaops.h>
>  
>  #include <linux/uaccess.h>
>  
> @@ -116,11 +117,13 @@ struct shmem_options {
>  	int huge;
>  	int seen;
>  	bool noswap;
> +	unsigned short quota_types;
>  #define SHMEM_SEEN_BLOCKS 1
>  #define SHMEM_SEEN_INODES 2
>  #define SHMEM_SEEN_HUGE 4
>  #define SHMEM_SEEN_INUMS 8
>  #define SHMEM_SEEN_NOSWAP 16
> +#define SHMEM_SEEN_QUOTA 32
>  };
>  
>  #ifdef CONFIG_TMPFS
> @@ -212,7 +215,16 @@ static inline int shmem_inode_acct_block(struct inode *inode, long pages)
>  		if (percpu_counter_compare(&sbinfo->used_blocks,
>  					   sbinfo->max_blocks - pages) > 0)
>  			goto unacct;
> +
> +		err = dquot_alloc_block_nodirty(inode, pages);
> +		if (err)
> +			goto unacct;
> +
>  		percpu_counter_add(&sbinfo->used_blocks, pages);
> +	} else {
> +		err = dquot_alloc_block_nodirty(inode, pages);
> +		if (err)
> +			goto unacct;
>  	}
>  
>  	return 0;
> @@ -227,6 +239,8 @@ static inline void shmem_inode_unacct_blocks(struct inode *inode, long pages)
>  	struct shmem_inode_info *info = SHMEM_I(inode);
>  	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
>  
> +	dquot_free_block_nodirty(inode, pages);
> +
>  	if (sbinfo->max_blocks)
>  		percpu_counter_sub(&sbinfo->used_blocks, pages);
>  	shmem_unacct_blocks(info->flags, pages);
> @@ -255,6 +269,47 @@ bool vma_is_shmem(struct vm_area_struct *vma)
>  static LIST_HEAD(shmem_swaplist);
>  static DEFINE_MUTEX(shmem_swaplist_mutex);
>  
> +#ifdef CONFIG_TMPFS_QUOTA
> +
> +static int shmem_enable_quotas(struct super_block *sb,
> +			       unsigned short quota_types)
> +{
> +	int type, err = 0;
> +
> +	sb_dqopt(sb)->flags |= DQUOT_QUOTA_SYS_FILE | DQUOT_NOLIST_DIRTY;
> +	for (type = 0; type < SHMEM_MAXQUOTAS; type++) {
> +		if (!(quota_types & (1 << type)))
> +			continue;
> +		err = dquot_load_quota_sb(sb, type, QFMT_SHMEM,
> +					  DQUOT_USAGE_ENABLED |
> +					  DQUOT_LIMITS_ENABLED);
> +		if (err)
> +			goto out_err;
> +	}
> +	return 0;
> +
> +out_err:
> +	pr_warn("tmpfs: failed to enable quota tracking (type=%d, err=%d)\n",
> +		type, err);
> +	for (type--; type >= 0; type--)
> +		dquot_quota_off(sb, type);
> +	return err;
> +}
> +
> +static void shmem_disable_quotas(struct super_block *sb)
> +{
> +	int type;
> +
> +	for (type = 0; type < SHMEM_MAXQUOTAS; type++)
> +		dquot_quota_off(sb, type);
> +}
> +
> +static struct dquot **shmem_get_dquots(struct inode *inode)
> +{
> +	return SHMEM_I(inode)->i_dquot;
> +}
> +#endif /* CONFIG_TMPFS_QUOTA */
> +
>  /*
>   * shmem_reserve_inode() performs bookkeeping to reserve a shmem inode, and
>   * produces a novel ino for the newly allocated inode.
> @@ -361,7 +416,6 @@ static void shmem_recalc_inode(struct inode *inode)
>  	freed = info->alloced - info->swapped - inode->i_mapping->nrpages;
>  	if (freed > 0) {
>  		info->alloced -= freed;
> -		inode->i_blocks -= freed * BLOCKS_PER_PAGE;
>  		shmem_inode_unacct_blocks(inode, freed);
>  	}
>  }
> @@ -379,7 +433,6 @@ bool shmem_charge(struct inode *inode, long pages)
>  
>  	spin_lock_irqsave(&info->lock, flags);
>  	info->alloced += pages;
> -	inode->i_blocks += pages * BLOCKS_PER_PAGE;
>  	shmem_recalc_inode(inode);
>  	spin_unlock_irqrestore(&info->lock, flags);
>  
> @@ -395,7 +448,6 @@ void shmem_uncharge(struct inode *inode, long pages)
>  
>  	spin_lock_irqsave(&info->lock, flags);
>  	info->alloced -= pages;
> -	inode->i_blocks -= pages * BLOCKS_PER_PAGE;
>  	shmem_recalc_inode(inode);
>  	spin_unlock_irqrestore(&info->lock, flags);
>  
> @@ -1141,6 +1193,21 @@ static int shmem_setattr(struct mnt_idmap *idmap,
>  		}
>  	}
>  
> +	if (is_quota_modification(idmap, inode, attr)) {
> +		error = dquot_initialize(inode);
> +		if (error)
> +			return error;
> +	}
> +
> +	/* Transfer quota accounting */
> +	if (i_uid_needs_update(idmap, attr, inode) ||
> +	    i_gid_needs_update(idmap, attr, inode)) {
> +		error = dquot_transfer(idmap, inode, attr);
> +
> +		if (error)
> +			return error;
> +	}
> +
>  	setattr_copy(idmap, inode, attr);
>  	if (attr->ia_valid & ATTR_MODE)
>  		error = posix_acl_chmod(idmap, dentry, inode->i_mode);
> @@ -1187,6 +1254,10 @@ static void shmem_evict_inode(struct inode *inode)
>  	WARN_ON(inode->i_blocks);
>  	shmem_free_inode(inode->i_sb);
>  	clear_inode(inode);
> +#ifdef CONFIG_TMPFS_QUOTA
> +	dquot_free_inode(inode);
> +	dquot_drop(inode);
> +#endif
>  }
>  
>  static int shmem_find_swap_entries(struct address_space *mapping,
> @@ -1986,7 +2057,6 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
>  
>  	spin_lock_irq(&info->lock);
>  	info->alloced += folio_nr_pages(folio);
> -	inode->i_blocks += (blkcnt_t)BLOCKS_PER_PAGE << folio_order(folio);
>  	shmem_recalc_inode(inode);
>  	spin_unlock_irq(&info->lock);
>  	alloced = true;
> @@ -2357,9 +2427,10 @@ static void shmem_set_inode_flags(struct inode *inode, unsigned int fsflags)
>  #define shmem_initxattrs NULL
>  #endif
>  
> -static struct inode *shmem_get_inode(struct mnt_idmap *idmap, struct super_block *sb,
> -				     struct inode *dir, umode_t mode, dev_t dev,
> -				     unsigned long flags)
> +static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
> +					     struct super_block *sb,
> +					     struct inode *dir, umode_t mode,
> +					     dev_t dev, unsigned long flags)
>  {
>  	struct inode *inode;
>  	struct shmem_inode_info *info;
> @@ -2436,6 +2507,43 @@ static struct inode *shmem_get_inode(struct mnt_idmap *idmap, struct super_block
>  	return inode;
>  }
>  
> +#ifdef CONFIG_TMPFS_QUOTA
> +static struct inode *shmem_get_inode(struct mnt_idmap *idmap,
> +				     struct super_block *sb, struct inode *dir,
> +				     umode_t mode, dev_t dev, unsigned long flags)
> +{
> +	int err;
> +	struct inode *inode;
> +
> +	inode = __shmem_get_inode(idmap, sb, dir, mode, dev, flags);
> +	if (IS_ERR(inode))
> +		return inode;
> +
> +	err = dquot_initialize(inode);
> +	if (err)
> +		goto errout;
> +
> +	err = dquot_alloc_inode(inode);
> +	if (err) {
> +		dquot_drop(inode);
> +		goto errout;
> +	}
> +	return inode;
> +
> +errout:
> +	inode->i_flags |= S_NOQUOTA;
> +	iput(inode);
> +	return ERR_PTR(err);
> +}
> +#else
> +static inline struct inode *shmem_get_inode(struct mnt_idmap *idmap,
> +				     struct super_block *sb, struct inode *dir,
> +				     umode_t mode, dev_t dev, unsigned long flags)
> +{
> +	return __shmem_get_inode(idmap, sb, dir, mode, dev, flags);
> +}
> +#endif /* CONFIG_TMPFS_QUOTA */
> +
>  #ifdef CONFIG_USERFAULTFD
>  int shmem_mfill_atomic_pte(pmd_t *dst_pmd,
>  			   struct vm_area_struct *dst_vma,
> @@ -2538,7 +2646,6 @@ int shmem_mfill_atomic_pte(pmd_t *dst_pmd,
>  
>  	spin_lock_irq(&info->lock);
>  	info->alloced++;
> -	inode->i_blocks += BLOCKS_PER_PAGE;
>  	shmem_recalc_inode(inode);
>  	spin_unlock_irq(&info->lock);
>  
> @@ -3516,6 +3623,7 @@ static ssize_t shmem_listxattr(struct dentry *dentry, char *buffer, size_t size)
>  
>  static const struct inode_operations shmem_short_symlink_operations = {
>  	.getattr	= shmem_getattr,
> +	.setattr	= shmem_setattr,
>  	.get_link	= simple_get_link,
>  #ifdef CONFIG_TMPFS_XATTR
>  	.listxattr	= shmem_listxattr,
> @@ -3524,6 +3632,7 @@ static const struct inode_operations shmem_short_symlink_operations = {
>  
>  static const struct inode_operations shmem_symlink_inode_operations = {
>  	.getattr	= shmem_getattr,
> +	.setattr	= shmem_setattr,
>  	.get_link	= shmem_get_link,
>  #ifdef CONFIG_TMPFS_XATTR
>  	.listxattr	= shmem_listxattr,
> @@ -3623,6 +3732,9 @@ enum shmem_param {
>  	Opt_inode32,
>  	Opt_inode64,
>  	Opt_noswap,
> +	Opt_quota,
> +	Opt_usrquota,
> +	Opt_grpquota,
>  };
>  
>  static const struct constant_table shmem_param_enums_huge[] = {
> @@ -3645,6 +3757,11 @@ const struct fs_parameter_spec shmem_fs_parameters[] = {
>  	fsparam_flag  ("inode32",	Opt_inode32),
>  	fsparam_flag  ("inode64",	Opt_inode64),
>  	fsparam_flag  ("noswap",	Opt_noswap),
> +#ifdef CONFIG_TMPFS_QUOTA
> +	fsparam_flag  ("quota",		Opt_quota),
> +	fsparam_flag  ("usrquota",	Opt_usrquota),
> +	fsparam_flag  ("grpquota",	Opt_grpquota),
> +#endif
>  	{}
>  };
>  
> @@ -3736,6 +3853,18 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
>  		ctx->noswap = true;
>  		ctx->seen |= SHMEM_SEEN_NOSWAP;
>  		break;
> +	case Opt_quota:
> +		ctx->seen |= SHMEM_SEEN_QUOTA;
> +		ctx->quota_types |= (QTYPE_MASK_USR | QTYPE_MASK_GRP);
> +		break;
> +	case Opt_usrquota:
> +		ctx->seen |= SHMEM_SEEN_QUOTA;
> +		ctx->quota_types |= QTYPE_MASK_USR;
> +		break;
> +	case Opt_grpquota:
> +		ctx->seen |= SHMEM_SEEN_QUOTA;
> +		ctx->quota_types |= QTYPE_MASK_GRP;
> +		break;
>  	}
>  	return 0;

I mentioned this in an earlier review; following the sequence:

if (ctx->seen & SHMEM_SEEN_QUOTA)
-> shmem_enable_quotas()
   -> dquot_load_quota_sb()

to then figure out that in dquot_load_quota_sb() we fail if
sb->s_user_ns != &init_user_ns is too subtle for a filesystem that's
mountable by unprivileged users. Every few months someone will end up
stumbling upon this code and wonder where it's blocked. There isn't even
a comment in the code.

Aside from that it's also really unfriendly to users because they may go
through setting up a tmpfs instance in the following way:

        fd_fs = fsopen("tmpfs");

User now enables quota:

        fsconfig(fd_fs, ..., "quota", ...) = 0

and goes on to set a bunch of other options:

        fsconfig(fd_fs, ..., "inode64", ...) = 0
        fsconfig(fd_fs, ..., "nr_inodes", ...) = 0
        fsconfig(fd_fs, ..., "nr_blocks", ...) = 0
        fsconfig(fd_fs, ..., "huge", ...) = 0
        fsconfig(fd_fs, ..., "mode", ...) = 0
        fsconfig(fd_fs, ..., "gid", ...) = 0

everything seems dandy and they create the superblock:

        fsconfig(fd_fs, FSCONFIG_CMD_CREATE, ...) = -EINVAL

which fails.

The user has not just performed 9 useless system calls, they also have
zero clue which mount option caused the failure.

What this code really really should do is fail at:

        fsconfig(fd_fs, ..., "quota", ...) = -EINVAL

and log an error that the user can retrieve from the fs context. IOW,

diff --git a/mm/shmem.c b/mm/shmem.c
index 083ce6b478e7..baca8bf44569 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3863,14 +3863,20 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
                ctx->seen |= SHMEM_SEEN_NOSWAP;
                break;
        case Opt_quota:
+               if (fc->user_ns != &init_user_ns)
+                       return invalfc(fc, "Quotas in unprivileged tmpfs mounts unsupported");
                ctx->seen |= SHMEM_SEEN_QUOTA;
                ctx->quota_types |= (QTYPE_MASK_USR | QTYPE_MASK_GRP);
                break;
        case Opt_usrquota:
+               if (fc->user_ns != &init_user_ns)
+                       return invalfc(fc, "Quotas in unprivileged tmpfs mounts unsupported");
                ctx->seen |= SHMEM_SEEN_QUOTA;
                ctx->quota_types |= QTYPE_MASK_USR;
                break;
        case Opt_grpquota:
+               if (fc->user_ns != &init_user_ns)
+                       return invalfc(fc, "Quotas in unprivileged tmpfs mounts unsupported");
                ctx->seen |= SHMEM_SEEN_QUOTA;
                ctx->quota_types |= QTYPE_MASK_GRP;
                break;

This is exactly what we already do for the "noswap" option btw.
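
For readers who have not driven the new mount API from userspace, here is
a minimal sketch of how a caller could pick up such a message. It assumes
that messages logged with invalfc() are queued on the fs context and that
read() on the context fd returns one message per call, prefixed with a
severity letter such as "e ". The helper names (sys_fsopen, sys_fsconfig,
fc_read_msg, set_flag_verbose) are made up for illustration and are not
part of any library.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/mount.h>	/* FSCONFIG_SET_FLAG */

/* Thin wrappers: older C libraries do not export these syscalls. */
static inline int sys_fsopen(const char *fs, unsigned int flags)
{
	return (int)syscall(SYS_fsopen, fs, flags);
}

static inline int sys_fsconfig(int fd, unsigned int cmd, const char *key,
			       const void *val, int aux)
{
	return (int)syscall(SYS_fsconfig, fd, cmd, key, val, aux);
}

/*
 * Read one queued message from fd into buf, NUL-terminate it and strip
 * the trailing newline. Returns the message length, or -1 if nothing
 * could be read.
 */
static inline ssize_t fc_read_msg(int fd, char *buf, size_t len)
{
	ssize_t n;

	if (len == 0)
		return -1;
	n = read(fd, buf, len - 1);
	if (n < 0)
		return -1;
	while (n > 0 && buf[n - 1] == '\n')
		n--;
	buf[n] = '\0';
	return n;
}

/* Set a flag option; on failure print the kernel's explanation, if any. */
static inline int set_flag_verbose(int fd_fs, const char *key)
{
	char msg[256];
	int saved;

	if (sys_fsconfig(fd_fs, FSCONFIG_SET_FLAG, key, NULL, 0) == 0)
		return 0;

	saved = errno;
	if (fc_read_msg(fd_fs, msg, sizeof(msg)) > 0)
		fprintf(stderr, "%s: %s\n", key, msg);
	else
		fprintf(stderr, "%s: %s\n", key, strerror(saved));
	return -saved;
}
```

A caller would obtain fd_fs from sys_fsopen("tmpfs", 0) and then call
set_flag_verbose(fd_fs, "quota"). With the check above in place, the
failure and its reason surface at that call instead of at
FSCONFIG_CMD_CREATE.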

Could you fold these changes into the patch and resend, please?
I synced with Andrew earlier and I'll be taking this series.

---

And btw, the *_SEEN_* logic for mount options is broken - but that's not
specific to your patch. Imagine:

        fd_fs = fsopen("tmpfs");
        fsconfig(fd_fs, ..., "nr_inodes", 0, "1000") = 0

Now ctx->inodes == 1000 and ctx->seen |= SHMEM_SEEN_INODES.

Now the user does:

        fsconfig(fd_fs, ..., "nr_inodes", 0, "-1234") = -EINVAL

This fails, but:

        ctx->inodes = memparse(param->string, &rest);
        if (*rest)
                goto bad_value;

will set ctx->inodes to whatever memparse returns but leaves
SHMEM_SEEN_INODES raised in ctx->seen. Now superblock creation may
succeed with a garbage inode limit. This should affect other mount
options as well.
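
The sequence can be reproduced in a small userspace model. This is only a
sketch: struct opts, SEEN_INODES and model_memparse() are stand-ins
invented here for struct shmem_options, SHMEM_SEEN_INODES and memparse(),
and parse_inodes_commit() is one possible way to avoid the problem, not a
fix proposed in this thread.

```c
#include <errno.h>

#define SEEN_INODES 0x1u		/* stand-in for SHMEM_SEEN_INODES */

struct opts {				/* stand-in for struct shmem_options */
	unsigned long long inodes;
	unsigned int seen;
};

/* Simplified memparse(): decimal digits plus an optional K/M/G suffix. */
static inline unsigned long long model_memparse(const char *s,
						const char **rest)
{
	unsigned long long v = 0;

	while (*s >= '0' && *s <= '9')
		v = v * 10 + (unsigned long long)(*s++ - '0');

	switch (*s) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		s++;
		break;
	default:
		break;
	}
	*rest = s;
	return v;
}

/* Pattern described above: the context is written before validation. */
static inline int parse_inodes_in_place(struct opts *ctx, const char *s)
{
	const char *rest;

	ctx->inodes = model_memparse(s, &rest);
	if (*rest)
		return -EINVAL;		/* ctx->inodes is already clobbered */
	ctx->seen |= SEEN_INODES;
	return 0;
}

/* Possible alternative: parse into a local, commit only on success. */
static inline int parse_inodes_commit(struct opts *ctx, const char *s)
{
	const char *rest;
	unsigned long long v = model_memparse(s, &rest);

	if (*rest)
		return -EINVAL;		/* ctx is left untouched */
	ctx->inodes = v;
	ctx->seen |= SEEN_INODES;
	return 0;
}
```

In this model, feeding "1000" and then "-1234" to parse_inodes_in_place()
leaves the limit overwritten while the seen bit stays set, whereas
parse_inodes_commit() keeps the earlier value of 1000.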


* Re: [PATCH 5/6] shmem: quota support
  2023-07-14  9:54   ` Christian Brauner
@ 2023-07-14 10:40     ` Carlos Maiolino
  2023-07-14 12:26     ` Carlos Maiolino
  1 sibling, 0 replies; 12+ messages in thread
From: Carlos Maiolino @ 2023-07-14 10:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-fsdevel, jack, akpm, viro, linux-mm, djwong, hughd, mcgrof

Hi Christian.

> > @@ -3736,6 +3853,18 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
> >  		ctx->noswap = true;
> >  		ctx->seen |= SHMEM_SEEN_NOSWAP;
> >  		break;
> > +	case Opt_quota:
> > +		ctx->seen |= SHMEM_SEEN_QUOTA;
> > +		ctx->quota_types |= (QTYPE_MASK_USR | QTYPE_MASK_GRP);
> > +		break;
> > +	case Opt_usrquota:
> > +		ctx->seen |= SHMEM_SEEN_QUOTA;
> > +		ctx->quota_types |= QTYPE_MASK_USR;
> > +		break;
> > +	case Opt_grpquota:
> > +		ctx->seen |= SHMEM_SEEN_QUOTA;
> > +		ctx->quota_types |= QTYPE_MASK_GRP;
> > +		break;
> >  	}
> >  	return 0;
> 
> I mentioned this in an earlier review; following the sequence:

Ok, my apologies, I must have lost it in the noise.

> 
> if (ctx->seen & SHMEM_SEEN_QUOTA)
> -> shmem_enable_quotas()
>    -> dquot_load_quota_sb()
> 
> to then figure out that in dquot_load_quota_sb() we fail if
> sb->s_user_ns != &init_user_ns is too subtle for a filesystem that's
> mountable by unprivileged users. Every few months someone will end up
> stumbling upon this code and wonder where it's blocked. There isn't even
> a comment in the code.
> 
> Aside from that it's also really unfriendly to users because they may go
> through setting up a tmpfs instance in the following way:
> 
>         fd_fs = fsopen("tmpfs");
> 
> User now enables quota:
> 
>         fsconfig(fd_fs, ..., "quota", ...) = 0
> 
> and goes on to set a bunch of other options:
> 
>         fsconfig(fd_fs, ..., "inode64", ...) = 0
>         fsconfig(fd_fs, ..., "nr_inodes", ...) = 0
>         fsconfig(fd_fs, ..., "nr_blocks", ...) = 0
>         fsconfig(fd_fs, ..., "huge", ...) = 0
>         fsconfig(fd_fs, ..., "mode", ...) = 0
>         fsconfig(fd_fs, ..., "gid", ...) = 0
> 
> everything seems dandy and they create the superblock:
> 
>         fsconfig(fd_fs, FSCONFIG_CMD_CREATE, ...) = -EINVAL
> 
> which fails.
> 
> The user has not just performed 9 useless system calls, they also have
> zero clue which mount option caused the failure.
> 
> What this code really really should do is fail at:
> 
>         fsconfig(fd_fs, ..., "quota", ...) = -EINVAL
> 
> and log an error that the user can retrieve from the fs context. IOW,
> 
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 083ce6b478e7..baca8bf44569 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3863,14 +3863,20 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
>                 ctx->seen |= SHMEM_SEEN_NOSWAP;
>                 break;
>         case Opt_quota:
> +               if (fc->user_ns != &init_user_ns)
> +                       return invalfc(fc, "Quotas in unprivileged tmpfs mounts unsupported");
>                 ctx->seen |= SHMEM_SEEN_QUOTA;
>                 ctx->quota_types |= (QTYPE_MASK_USR | QTYPE_MASK_GRP);
>                 break;
>         case Opt_usrquota:
> +               if (fc->user_ns != &init_user_ns)
> +                       return invalfc(fc, "Quotas in unprivileged tmpfs mounts unsupported");
>                 ctx->seen |= SHMEM_SEEN_QUOTA;
>                 ctx->quota_types |= QTYPE_MASK_USR;
>                 break;
>         case Opt_grpquota:
> +               if (fc->user_ns != &init_user_ns)
> +                       return invalfc(fc, "Quotas in unprivileged tmpfs mounts unsupported");
>                 ctx->seen |= SHMEM_SEEN_QUOTA;
>                 ctx->quota_types |= QTYPE_MASK_GRP;
>                 break;
> 
> This is exactly what we already do for the "noswap" option btw.
> 
> Could you fold these changes into the patch and resend, please?
> I synced with Andrew earlier and I'll be taking this series.

Thanks! Will do. I'll update the patch, build, test and send it again
in a few minutes.

> 
> ---
> 
> And btw, the *_SEEN_* logic for mount options is broken - but that's not
> specific to your patch. Imagine:
> 
>         fd_fs = fsopen("tmpfs");
>         fsconfig(fd_fs, ..., "nr_inodes", 0, "1000") = 0
> 
> Now ctx->inodes == 1000 and ctx->seen |= SHMEM_SEEN_INODES.
> 
> Now the user does:
> 
>         fsconfig(fd_fs, ..., "nr_inodes", 0, "-1234") = -EINVAL
> 
> This fails, but:
> 
>         ctx->inodes = memparse(param->string, &rest);
>         if (*rest)
>                 goto bad_value;
> 
> will set ctx->inodes to whatever memparse returns but leaves
> SHMEM_SEEN_INODES raised in ctx->seen. Now superblock creation may
> succeed with a garbage inode limit. This should affect other mount
> options as well.

Interesting. Thanks for the heads up. I'll look into it in more detail when I
start working on namespace support for quotas (as we discussed previously).

Cheers.

-- 
Carlos


* Re: [PATCH 5/6] shmem: quota support
  2023-07-14  9:54   ` Christian Brauner
  2023-07-14 10:40     ` Carlos Maiolino
@ 2023-07-14 12:26     ` Carlos Maiolino
  2023-07-14 13:48       ` Christian Brauner
  1 sibling, 1 reply; 12+ messages in thread
From: Carlos Maiolino @ 2023-07-14 12:26 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-fsdevel, jack, akpm, viro, linux-mm, djwong, hughd, mcgrof

> >
> > @@ -3736,6 +3853,18 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
> >  		ctx->noswap = true;
> >  		ctx->seen |= SHMEM_SEEN_NOSWAP;
> >  		break;
> > +	case Opt_quota:
> > +		ctx->seen |= SHMEM_SEEN_QUOTA;
> > +		ctx->quota_types |= (QTYPE_MASK_USR | QTYPE_MASK_GRP);
> > +		break;
> > +	case Opt_usrquota:
> > +		ctx->seen |= SHMEM_SEEN_QUOTA;
> > +		ctx->quota_types |= QTYPE_MASK_USR;
> > +		break;
> > +	case Opt_grpquota:
> > +		ctx->seen |= SHMEM_SEEN_QUOTA;
> > +		ctx->quota_types |= QTYPE_MASK_GRP;
> > +		break;
> >  	}
> >  	return 0;
> 
> I mentioned this in an earlier review; following the sequence:
> 
> if (ctx->seen & SHMEM_SEEN_QUOTA)
> -> shmem_enable_quotas()
>    -> dquot_load_quota_sb()
> 
> to then figure out that in dquot_load_quota_sb() we fail if
> sb->s_user_ns != &init_user_ns is too subtle for a filesystem that's
> mountable by unprivileged users. Every few months someone will end up
> stumbling upon this code and wonder where it's blocked. There isn't even
> a comment in the code.

I was just going to rebase these updated changes on top of linux-next, and I
realized the patches are already there. Wouldn't it be better if I sent a
follow-up patch on top of linux-next applying these changes, with a Fixes: tag?

-- 
Carlos

> 
> Aside from that it's also really unfriendly to users because they may go
> through setting up a tmpfs instance in the following way:
> 
>         fd_fs = fsopen("tmpfs");
> 
> User now enables quota:
> 
>         fsconfig(fd_fs, ..., "quota", ...) = 0
> 
> and goes on to set a bunch of other options:
> 
>         fsconfig(fd_fs, ..., "inode64", ...) = 0
>         fsconfig(fd_fs, ..., "nr_inodes", ...) = 0
>         fsconfig(fd_fs, ..., "nr_blocks", ...) = 0
>         fsconfig(fd_fs, ..., "huge", ...) = 0
>         fsconfig(fd_fs, ..., "mode", ...) = 0
>         fsconfig(fd_fs, ..., "gid", ...) = 0
> 
> everything seems dandy and they create the superblock:
> 
>         fsconfig(fd_fs, FSCONFIG_CMD_CREATE, ...) = -EINVAL
> 
> which fails.
> 
> The user has not just performed 9 useless system calls, they also have
> zero clue which mount option caused the failure.
> 
> What this code really really should do is fail at:
> 
>         fsconfig(fd_fs, ..., "quota", ...) = -EINVAL
> 
> and log an error that the user can retrieve from the fs context. IOW,
> 
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 083ce6b478e7..baca8bf44569 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3863,14 +3863,20 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
>                 ctx->seen |= SHMEM_SEEN_NOSWAP;
>                 break;
>         case Opt_quota:
> +               if (fc->user_ns != &init_user_ns)
> +                       return invalfc(fc, "Quotas in unprivileged tmpfs mounts unsupported");
>                 ctx->seen |= SHMEM_SEEN_QUOTA;
>                 ctx->quota_types |= (QTYPE_MASK_USR | QTYPE_MASK_GRP);
>                 break;
>         case Opt_usrquota:
> +               if (fc->user_ns != &init_user_ns)
> +                       return invalfc(fc, "Quotas in unprivileged tmpfs mounts unsupported");
>                 ctx->seen |= SHMEM_SEEN_QUOTA;
>                 ctx->quota_types |= QTYPE_MASK_USR;
>                 break;
>         case Opt_grpquota:
> +               if (fc->user_ns != &init_user_ns)
> +                       return invalfc(fc, "Quotas in unprivileged tmpfs mounts unsupported");
>                 ctx->seen |= SHMEM_SEEN_QUOTA;
>                 ctx->quota_types |= QTYPE_MASK_GRP;
>                 break;
> 
> This is exactly what we already do for the "noswap" option btw.
> 
> Could you fold these changes into the patch and resend, please?
> I synced with Andrew earlier and I'll be taking this series.
> 
> ---
> 
> And btw, the *_SEEN_* logic for mount options is broken - but that's not
> specific to your patch. Imagine:
> 
>         fd_fs = fsopen("tmpfs");
>         fsconfig(fd_fs, ..., "nr_inodes", 0, "1000") = 0
> 
> Now ctx->inodes == 1000 and ctx->seen |= SHMEM_SEEN_INODES.
> 
> Now the user does:
> 
>         fsconfig(fd_fs, ..., "nr_inodes", 0, "-1234") = -EINVAL
> 
> This fails, but:
> 
>         ctx->inodes = memparse(param->string, &rest);
>         if (*rest)
>                 goto bad_value;
> 
> will set ctx->inodes to whatever memparse returns but leaves
> SHMEM_SEEN_INODES raised in ctx->seen. Now superblock creation may
> succeed with a garbage inode limit. This should affect other mount
> options as well.


* Re: [PATCH 5/6] shmem: quota support
  2023-07-14 12:26     ` Carlos Maiolino
@ 2023-07-14 13:48       ` Christian Brauner
  2023-07-14 14:47         ` Carlos Maiolino
  0 siblings, 1 reply; 12+ messages in thread
From: Christian Brauner @ 2023-07-14 13:48 UTC (permalink / raw)
  To: Carlos Maiolino
  Cc: linux-fsdevel, jack, akpm, viro, linux-mm, djwong, hughd, mcgrof

On Fri, Jul 14, 2023 at 02:26:44PM +0200, Carlos Maiolino wrote:
> > >
> > > @@ -3736,6 +3853,18 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
> > >  		ctx->noswap = true;
> > >  		ctx->seen |= SHMEM_SEEN_NOSWAP;
> > >  		break;
> > > +	case Opt_quota:
> > > +		ctx->seen |= SHMEM_SEEN_QUOTA;
> > > +		ctx->quota_types |= (QTYPE_MASK_USR | QTYPE_MASK_GRP);
> > > +		break;
> > > +	case Opt_usrquota:
> > > +		ctx->seen |= SHMEM_SEEN_QUOTA;
> > > +		ctx->quota_types |= QTYPE_MASK_USR;
> > > +		break;
> > > +	case Opt_grpquota:
> > > +		ctx->seen |= SHMEM_SEEN_QUOTA;
> > > +		ctx->quota_types |= QTYPE_MASK_GRP;
> > > +		break;
> > >  	}
> > >  	return 0;
> > 
> > I mentioned this in an earlier review; following the sequence:
> > 
> > if (ctx->seen & SHMEM_SEEN_QUOTA)
> > -> shmem_enable_quotas()
> >    -> dquot_load_quota_sb()
> > 
> > to then figure out that in dquot_load_quota_sb() we fail if
> > sb->s_user_ns != &init_user_ns is too subtle for a filesystem that's
> > mountable by unprivileged users. Every few months someone will end up
> > stumbling upon this code and wonder where it's blocked. There isn't even
> > a comment in the code.
> 
> I was just going to rebase these updated changes on top of linux-next, and I
> realized the patches are already there. Wouldn't it be better if I sent a
> follow-up patch on top of linux-next applying these changes, with a Fixes: tag?

I would just resend and fold the fix into the patch. There's no good
reason to make this a separate patch imho.


* Re: [PATCH 5/6] shmem: quota support
  2023-07-14 13:48       ` Christian Brauner
@ 2023-07-14 14:47         ` Carlos Maiolino
  0 siblings, 0 replies; 12+ messages in thread
From: Carlos Maiolino @ 2023-07-14 14:47 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-fsdevel, jack, akpm, viro, linux-mm, djwong, hughd, mcgrof

On Fri, Jul 14, 2023 at 03:48:12PM +0200, Christian Brauner wrote:
> On Fri, Jul 14, 2023 at 02:26:44PM +0200, Carlos Maiolino wrote:
> > > >
> > > > @@ -3736,6 +3853,18 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
> > > >  		ctx->noswap = true;
> > > >  		ctx->seen |= SHMEM_SEEN_NOSWAP;
> > > >  		break;
> > > > +	case Opt_quota:
> > > > +		ctx->seen |= SHMEM_SEEN_QUOTA;
> > > > +		ctx->quota_types |= (QTYPE_MASK_USR | QTYPE_MASK_GRP);
> > > > +		break;
> > > > +	case Opt_usrquota:
> > > > +		ctx->seen |= SHMEM_SEEN_QUOTA;
> > > > +		ctx->quota_types |= QTYPE_MASK_USR;
> > > > +		break;
> > > > +	case Opt_grpquota:
> > > > +		ctx->seen |= SHMEM_SEEN_QUOTA;
> > > > +		ctx->quota_types |= QTYPE_MASK_GRP;
> > > > +		break;
> > > >  	}
> > > >  	return 0;
> > >
> > > I mentioned this in an earlier review; following the sequence:
> > >
> > > if (ctx->seen & SHMEM_SEEN_QUOTA)
> > > -> shmem_enable_quotas()
> > >    -> dquot_load_quota_sb()
> > >
> > > to then figure out that in dquot_load_quota_sb() we fail if
> > > sb->s_user_ns != &init_user_ns is too subtle for a filesystem that's
> > > mountable by unprivileged users. Every few months someone will end up
> > > stumbling upon this code and wonder where it's blocked. There isn't even
> > > a comment in the code.
> >
> > I was just going to rebase these updated changes on top of linux-next, and I
> > realized the patches are already there. Wouldn't it be better if I sent a
> > follow-up patch on top of linux-next applying these changes, with a Fixes: tag?
> 
> I would just resend and fold the fix into the patch. There's no good
> reason to make this a separate patch imho.

Sounds good. Thanks.


end of thread, other threads:[~2023-07-14 14:47 UTC | newest]

Thread overview: 12+ messages
2023-07-13 13:48 [PATCH RESEND V4 0/6] shmem: Add user and group quota support for tmpfs cem
2023-07-13 13:48 ` [PATCH 1/6] shmem: make shmem_inode_acct_block() return error cem
2023-07-13 13:48 ` [PATCH 2/6] shmem: make shmem_get_inode() return ERR_PTR instead of NULL cem
2023-07-13 13:48 ` [PATCH 3/6] quota: Check presence of quota operation structures instead of ->quota_read and ->quota_write callbacks cem
2023-07-13 13:48 ` [PATCH 4/6] shmem: prepare shmem quota infrastructure cem
2023-07-13 13:48 ` [PATCH 5/6] shmem: quota support cem
2023-07-14  9:54   ` Christian Brauner
2023-07-14 10:40     ` Carlos Maiolino
2023-07-14 12:26     ` Carlos Maiolino
2023-07-14 13:48       ` Christian Brauner
2023-07-14 14:47         ` Carlos Maiolino
2023-07-13 13:48 ` [PATCH 6/6] Add default quota limit mount options cem
