* [RFC PATCH v3 0/3] ceph: kernel client cephfs quota support
@ 2017-12-20 15:18 Luis Henriques
  2017-12-20 15:18 ` [RFC PATCH v3 1/3] ceph: quota: add initial infrastructure to support cephfs quotas Luis Henriques
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Luis Henriques @ 2017-12-20 15:18 UTC (permalink / raw)
  To: ceph-devel; +Cc: Yan, Zheng, Jeff Layton, Jan Fajerski, Luis Henriques

A cephfs-specific quota implementation has been available in the
user-space fuse client for a while.  It allows an administrator to
restrict the number of bytes and/or the number of files in a filesystem
subtree.  Quotas are, however, enforced at the client level only, which
means that all clients accessing the filesystem are required to
cooperate.

This obviously assumes that all clients are trusted entities and will
respect the quotas, preventing users from exceeding the quota limits.
Since the kernel client doesn't support quotas, it has not been possible
to use it in a cluster where quotas are a requirement.

This patchset is an RFC that adds kernel client support for cephfs
quotas as they are currently implemented in the ceph fuse client.  Note
however that this patchset is not yet feature-complete, as it only
implements the max_files quota (max_bytes is still missing).

** Changes since v2 **

Rework after review from Yan, Zheng:

- Dropped patch 0001 ("ceph: add seqlock for snaprealm hierarchy change
  detection") and use mdsc->snap_rwsem for walking the snaprealm
  hierarchy instead of adding a seqlock.  This means that patches 0003
  and 0004 needed to be reworked.

- Added a NULL check in ceph_handle_quota() after the inode lookup with
  ceph_find_inode().

** Changes since v1 **

Instead of trying to do a reverse path walk to find the "quota realm"
for a given directory, this patchset is now using snaprealms.  Thus, for
testing it, a modified MDS is required:

  https://github.com/ukernel/ceph/tree/wip-cephfs-quota-realm

This modified MDS creates a snaprealm when a quota is set in a
directory.  This means that a client needs only to walk up the snaprealm
hierarchy to find a directory that has quotas instead of doing the full
reverse path walking.

Note however that this requires an extra patch that adds a seqlock (1st
patch in series) to detect changes in the snaprealm hierarchy.
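
The snaprealm walk described above can be sketched in user-space C.  This
is only an illustration, not the code from the patches: 'struct realm',
realm_has_quota() and find_quota_realm() are hypothetical stand-ins, and
all locking and reference counting is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a snaprealm; not struct ceph_snap_realm. */
struct realm {
	uint64_t ino;          /* inode number of the directory owning the realm */
	uint64_t max_bytes;    /* 0 means "no byte quota" */
	uint64_t max_files;    /* 0 means "no file quota" */
	struct realm *parent;  /* NULL at the filesystem root */
};

static bool realm_has_quota(const struct realm *r)
{
	return r->max_bytes || r->max_files;
}

/*
 * Walk up from 'start' and return the nearest realm that has a quota set,
 * or the root realm if no ancestor has one.
 */
static const struct realm *find_quota_realm(const struct realm *start)
{
	const struct realm *r = start;

	assert(start != NULL);
	while (!realm_has_quota(r) && r->parent)
		r = r->parent;
	return r;
}
```

Because the modified MDS creates a realm wherever a quota is set, the
number of steps is bounded by the number of realms above the directory
rather than by the depth of its path.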

Luis Henriques (3):
  ceph: quota: add initial infrastructure to support cephfs quotas
  ceph: quotas: support for ceph.quota.max_files
  ceph: quota: don't allow cross-quota renames

 fs/ceph/Makefile                   |   2 +-
 fs/ceph/dir.c                      |  16 ++++
 fs/ceph/file.c                     |   4 +-
 fs/ceph/inode.c                    |   6 ++
 fs/ceph/mds_client.c               |  23 +++++
 fs/ceph/mds_client.h               |   2 +
 fs/ceph/quota.c                    | 190 +++++++++++++++++++++++++++++++++++++
 fs/ceph/super.h                    |  10 ++
 fs/ceph/xattr.c                    |  44 +++++++++
 include/linux/ceph/ceph_features.h |   3 +-
 include/linux/ceph/ceph_fs.h       |  17 ++++
 11 files changed, 314 insertions(+), 3 deletions(-)
 create mode 100644 fs/ceph/quota.c



* [RFC PATCH v3 1/3] ceph: quota: add initial infrastructure to support cephfs quotas
  2017-12-20 15:18 [RFC PATCH v3 0/3] ceph: kernel client cephfs quota support Luis Henriques
@ 2017-12-20 15:18 ` Luis Henriques
  2017-12-21  7:58   ` Yan, Zheng
  2017-12-20 15:18 ` [RFC PATCH v3 2/3] ceph: quotas: support for ceph.quota.max_files Luis Henriques
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Luis Henriques @ 2017-12-20 15:18 UTC (permalink / raw)
  To: ceph-devel; +Cc: Yan, Zheng, Jeff Layton, Jan Fajerski, Luis Henriques

This patch adds the infrastructure required to support cephfs quotas as
they are currently implemented in the ceph fuse client.  Cephfs quotas can
be set on any directory, and can restrict the number of bytes or the number
of files stored beneath that point in the directory hierarchy.

Quotas are set using the extended attributes 'ceph.quota.max_files' and
'ceph.quota.max_bytes', and can be removed by setting these attributes to
'0'.
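
For illustration, a user-space program could set the attribute with
setxattr(2), as sketched below.  set_max_files_quota() is a hypothetical
helper, not part of this patch, and passing the limit as a decimal string
is an assumption based on how the attribute is normally set from the
command line.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/xattr.h>

/*
 * Set (or, with limit == 0, remove) the max_files quota on 'dir'.
 * Returns 0 on success or -1 with errno set, as setxattr(2) does.
 */
static int set_max_files_quota(const char *dir, uint64_t limit)
{
	char value[32];
	int len;

	assert(dir != NULL);
	len = snprintf(value, sizeof(value), "%" PRIu64, limit);
	/* Assumption: the value is a decimal string, passed without the NUL. */
	return setxattr(dir, "ceph.quota.max_files", value, (size_t)len, 0);
}
```

Calling the helper with a limit of 0 corresponds to removing the quota,
as described above.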

Link: http://tracker.ceph.com/issues/22372
Signed-off-by: Luis Henriques <lhenriques@suse.com>
---
 fs/ceph/Makefile                   |  2 +-
 fs/ceph/inode.c                    |  6 ++++
 fs/ceph/mds_client.c               | 23 ++++++++++++++
 fs/ceph/mds_client.h               |  2 ++
 fs/ceph/quota.c                    | 63 ++++++++++++++++++++++++++++++++++++++
 fs/ceph/super.h                    |  8 +++++
 fs/ceph/xattr.c                    | 44 ++++++++++++++++++++++++++
 include/linux/ceph/ceph_features.h |  3 +-
 include/linux/ceph/ceph_fs.h       | 17 ++++++++++
 9 files changed, 166 insertions(+), 2 deletions(-)
 create mode 100644 fs/ceph/quota.c

diff --git a/fs/ceph/Makefile b/fs/ceph/Makefile
index 174f5709e508..a699e320393f 100644
--- a/fs/ceph/Makefile
+++ b/fs/ceph/Makefile
@@ -6,7 +6,7 @@
 obj-$(CONFIG_CEPH_FS) += ceph.o
 
 ceph-y := super.o inode.o dir.o file.o locks.o addr.o ioctl.o \
-	export.o caps.o snap.o xattr.o \
+	export.o caps.o snap.o xattr.o quota.o \
 	mds_client.o mdsmap.o strings.o ceph_frag.o \
 	debugfs.o
 
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index ab81652198c4..8a0ba96e105d 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -441,6 +441,9 @@ struct inode *ceph_alloc_inode(struct super_block *sb)
 	atomic64_set(&ci->i_complete_seq[1], 0);
 	ci->i_symlink = NULL;
 
+	ci->i_max_bytes = 0;
+	ci->i_max_files = 0;
+
 	memset(&ci->i_dir_layout, 0, sizeof(ci->i_dir_layout));
 	RCU_INIT_POINTER(ci->i_layout.pool_ns, NULL);
 
@@ -790,6 +793,9 @@ static int fill_inode(struct inode *inode, struct page *locked_page,
 	inode->i_rdev = le32_to_cpu(info->rdev);
 	inode->i_blkbits = fls(le32_to_cpu(info->layout.fl_stripe_unit)) - 1;
 
+	ci->i_max_bytes = iinfo->max_bytes;
+	ci->i_max_files = iinfo->max_files;
+
 	if ((new_version || (new_issued & CEPH_CAP_AUTH_SHARED)) &&
 	    (issued & CEPH_CAP_AUTH_EXCL) == 0) {
 		inode->i_mode = le32_to_cpu(info->mode);
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 1b468250e947..2290056d13fc 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -100,6 +100,26 @@ static int parse_reply_info_in(void **p, void *end,
 	} else
 		info->inline_version = CEPH_INLINE_NONE;
 
+	if (features & CEPH_FEATURE_MDS_QUOTA) {
+		u8 struct_v, struct_compat;
+		u32 struct_len;
+
+		/*
+		 * both struct_v and struct_compat are expected to be >= 1
+		 */
+		ceph_decode_8_safe(p, end, struct_v, bad);
+		ceph_decode_8_safe(p, end, struct_compat, bad);
+		if (!struct_v || !struct_compat)
+			goto bad;
+		ceph_decode_32_safe(p, end, struct_len, bad);
+		ceph_decode_need(p, end, struct_len, bad);
+		ceph_decode_64_safe(p, end, info->max_bytes, bad);
+		ceph_decode_64_safe(p, end, info->max_files, bad);
+	} else {
+		info->max_bytes = 0;
+		info->max_files = 0;
+	}
+
 	info->pool_ns_len = 0;
 	info->pool_ns_data = NULL;
 	if (features & CEPH_FEATURE_FS_FILE_LAYOUT_V2) {
@@ -4064,6 +4084,9 @@ static void dispatch(struct ceph_connection *con, struct ceph_msg *msg)
 	case CEPH_MSG_CLIENT_LEASE:
 		handle_lease(mdsc, s, msg);
 		break;
+	case CEPH_MSG_CLIENT_QUOTA:
+		ceph_handle_quota(mdsc, s, msg);
+		break;
 
 	default:
 		pr_err("received unknown message type %d %s\n", type,
diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h
index 837ac4b087a0..7af576733948 100644
--- a/fs/ceph/mds_client.h
+++ b/fs/ceph/mds_client.h
@@ -49,6 +49,8 @@ struct ceph_mds_reply_info_in {
 	char *inline_data;
 	u32 pool_ns_len;
 	char *pool_ns_data;
+	u64 max_bytes;
+	u64 max_files;
 };
 
 struct ceph_mds_reply_dir_entry {
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
new file mode 100644
index 000000000000..7bde6e85b609
--- /dev/null
+++ b/fs/ceph/quota.c
@@ -0,0 +1,63 @@
+/*
+ * quota.c - CephFS quota
+ *
+ * Copyright (C) 2017 SUSE
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "super.h"
+#include "mds_client.h"
+
+void ceph_handle_quota(struct ceph_mds_client *mdsc,
+		       struct ceph_mds_session *session,
+		       struct ceph_msg *msg)
+{
+	struct super_block *sb = mdsc->fsc->sb;
+	struct ceph_mds_quota *h = msg->front.iov_base;
+	struct ceph_vino vino;
+	struct inode *inode;
+	struct ceph_inode_info *ci;
+
+	if (msg->front.iov_len != sizeof(*h)) {
+		pr_err("ceph_handle_quota corrupt message mds%d len %d\n",
+		       session->s_mds, (int)msg->front.iov_len);
+		ceph_msg_dump(msg);
+		return;
+	}
+
+	/* lookup inode */
+	vino.ino = le64_to_cpu(h->ino);
+	vino.snap = CEPH_NOSNAP;
+	inode = ceph_find_inode(sb, vino);
+	if (!inode) {
+		pr_warn("Failed to find inode %llu\n", vino.ino);
+		return;
+	}
+	ci = ceph_inode(inode);
+
+	mutex_lock(&session->s_mutex);
+	session->s_seq++;
+	mutex_unlock(&session->s_mutex);
+
+	spin_lock(&ci->i_ceph_lock);
+	ci->i_rbytes = le64_to_cpu(h->rbytes);
+	ci->i_rfiles = le64_to_cpu(h->rfiles);
+	ci->i_rsubdirs = le64_to_cpu(h->rsubdirs);
+	ci->i_max_bytes = le64_to_cpu(h->max_bytes);
+	ci->i_max_files = le64_to_cpu(h->max_files);
+	spin_unlock(&ci->i_ceph_lock);
+
+	iput(inode);
+}
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 2beeec07fa76..f998b7f076cf 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -309,6 +309,9 @@ struct ceph_inode_info {
 	u64 i_rbytes, i_rfiles, i_rsubdirs;
 	u64 i_files, i_subdirs;
 
+	/* quotas */
+	u64 i_max_bytes, i_max_files;
+
 	struct rb_root i_fragtree;
 	int i_fragtree_nsplits;
 	struct mutex i_fragtree_mutex;
@@ -1019,4 +1022,9 @@ extern int ceph_locks_to_pagelist(struct ceph_filelock *flocks,
 extern int ceph_fs_debugfs_init(struct ceph_fs_client *client);
 extern void ceph_fs_debugfs_cleanup(struct ceph_fs_client *client);
 
+/* quota.c */
+extern void ceph_handle_quota(struct ceph_mds_client *mdsc,
+			      struct ceph_mds_session *session,
+			      struct ceph_msg *msg);
+
 #endif /* _FS_CEPH_SUPER_H */
diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
index e1c4e0b12b4c..cfc3028be0fa 100644
--- a/fs/ceph/xattr.c
+++ b/fs/ceph/xattr.c
@@ -224,6 +224,31 @@ static size_t ceph_vxattrcb_dir_rctime(struct ceph_inode_info *ci, char *val,
 			(long)ci->i_rctime.tv_nsec);
 }
 
+/* quotas */
+
+static bool ceph_vxattrcb_quota_exists(struct ceph_inode_info *ci)
+{
+	return (ci->i_max_files || ci->i_max_bytes);
+}
+
+static size_t ceph_vxattrcb_quota(struct ceph_inode_info *ci, char *val,
+				  size_t size)
+{
+	return snprintf(val, size, "max_bytes=%llu max_files=%llu",
+			ci->i_max_bytes, ci->i_max_files);
+}
+
+static size_t ceph_vxattrcb_quota_max_bytes(struct ceph_inode_info *ci,
+					    char *val, size_t size)
+{
+	return snprintf(val, size, "%llu", ci->i_max_bytes);
+}
+
+static size_t ceph_vxattrcb_quota_max_files(struct ceph_inode_info *ci,
+					    char *val, size_t size)
+{
+	return snprintf(val, size, "%llu", ci->i_max_files);
+}
 
 #define CEPH_XATTR_NAME(_type, _name)	XATTR_CEPH_PREFIX #_type "." #_name
 #define CEPH_XATTR_NAME2(_type, _name, _name2)	\
@@ -247,6 +272,15 @@ static size_t ceph_vxattrcb_dir_rctime(struct ceph_inode_info *ci, char *val,
 		.hidden = true,			\
 		.exists_cb = ceph_vxattrcb_layout_exists,	\
 	}
+#define XATTR_QUOTA_FIELD(_type, _name)					\
+	{								\
+		.name = CEPH_XATTR_NAME(_type, _name),			\
+		.name_size = sizeof (CEPH_XATTR_NAME(_type, _name)),	\
+		.getxattr_cb = ceph_vxattrcb_ ## _type ## _ ## _name,	\
+		.readonly = false,					\
+		.hidden = true,						\
+		.exists_cb = ceph_vxattrcb_quota_exists,		\
+	}
 
 static struct ceph_vxattr ceph_dir_vxattrs[] = {
 	{
@@ -270,6 +304,16 @@ static struct ceph_vxattr ceph_dir_vxattrs[] = {
 	XATTR_NAME_CEPH(dir, rsubdirs),
 	XATTR_NAME_CEPH(dir, rbytes),
 	XATTR_NAME_CEPH(dir, rctime),
+	{
+		.name = "ceph.quota",
+		.name_size = sizeof("ceph.quota"),
+		.getxattr_cb = ceph_vxattrcb_quota,
+		.readonly = false,
+		.hidden = true,
+		.exists_cb = ceph_vxattrcb_quota_exists,
+	},
+	XATTR_QUOTA_FIELD(quota, max_bytes),
+	XATTR_QUOTA_FIELD(quota, max_files),
 	{ .name = NULL, 0 }	/* Required table terminator */
 };
 static size_t ceph_dir_vxattrs_name_size;	/* total size of all names */
diff --git a/include/linux/ceph/ceph_features.h b/include/linux/ceph/ceph_features.h
index 59042d5ac520..6acd46c36271 100644
--- a/include/linux/ceph/ceph_features.h
+++ b/include/linux/ceph/ceph_features.h
@@ -209,7 +209,8 @@ DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, LUMINOUS) // client-facin
 	 CEPH_FEATURE_SERVER_JEWEL |		\
 	 CEPH_FEATURE_MON_STATEFUL_SUB |	\
 	 CEPH_FEATURE_CRUSH_TUNABLES5 |		\
-	 CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING)
+	 CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING |	\
+	 CEPH_FEATURE_MDS_QUOTA)
 
 #define CEPH_FEATURES_REQUIRED_DEFAULT   \
 	(CEPH_FEATURE_NOSRCADDR |	 \
diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h
index 88dd51381aaf..98bdcc0eda3f 100644
--- a/include/linux/ceph/ceph_fs.h
+++ b/include/linux/ceph/ceph_fs.h
@@ -134,6 +134,7 @@ struct ceph_dir_layout {
 #define CEPH_MSG_CLIENT_LEASE           0x311
 #define CEPH_MSG_CLIENT_SNAP            0x312
 #define CEPH_MSG_CLIENT_CAPRELEASE      0x313
+#define CEPH_MSG_CLIENT_QUOTA		0x314
 
 /* pool ops */
 #define CEPH_MSG_POOLOP_REPLY           48
@@ -807,4 +808,20 @@ struct ceph_mds_snap_realm {
 } __attribute__ ((packed));
 /* followed by my snap list, then prior parent snap list */
 
+/*
+ * quotas
+ */
+struct ceph_mds_quota {
+	__le64 ino;		/* ino */
+	struct ceph_timespec rctime;
+	__le64 rbytes;		/* dir stats */
+	__le64 rfiles;
+	__le64 rsubdirs;
+	__u8 struct_v;		/* compat */
+	__u8 struct_compat;
+	__le32 struct_len;
+	__le64 max_bytes;	/* quota max. bytes */
+	__le64 max_files;	/* quota max. files */
+} __attribute__ ((packed));
+
 #endif


* [RFC PATCH v3 2/3] ceph: quotas: support for ceph.quota.max_files
  2017-12-20 15:18 [RFC PATCH v3 0/3] ceph: kernel client cephfs quota support Luis Henriques
  2017-12-20 15:18 ` [RFC PATCH v3 1/3] ceph: quota: add initial infrastructure to support cephfs quotas Luis Henriques
@ 2017-12-20 15:18 ` Luis Henriques
  2017-12-21  8:11   ` Yan, Zheng
  2017-12-20 15:18 ` [RFC PATCH v3 3/3] ceph: quota: don't allow cross-quota renames Luis Henriques
  2017-12-21  8:21 ` [RFC PATCH v3 0/3] ceph: kernel client cephfs quota support Yan, Zheng
  3 siblings, 1 reply; 9+ messages in thread
From: Luis Henriques @ 2017-12-20 15:18 UTC (permalink / raw)
  To: ceph-devel; +Cc: Yan, Zheng, Jeff Layton, Jan Fajerski, Luis Henriques

This patch adds support for the max_files quota.  It hooks into all the
ceph functions that add new filesystem objects that need to be checked
against the quota limits.  When these limits are hit, -EDQUOT is returned.

Note that we're not checking quotas on ceph_link().  ceph_link() doesn't
really create a new inode, and since the MDS doesn't update the directory
statistics when a new hard link is created (only when a symlink is), hard
links are not accounted as new files.
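
The test applied at each of these hooks can be sketched in user-space C
as follows.  'struct quota_dir' and max_files_exceeded() are hypothetical
names used only for this illustration; the comparison mirrors the one in
check_quota_exceeded(), where files and subdirectories both count towards
the limit, and locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-realm view of the directory statistics and limit. */
struct quota_dir {
	uint64_t max_files;        /* 0 means "no limit" */
	uint64_t rfiles;           /* recursive file count */
	uint64_t rsubdirs;         /* recursive subdirectory count */
	struct quota_dir *parent;  /* next realm up; NULL at the root */
};

/*
 * Return true if creating one more entry under 'dir' must be refused,
 * i.e. some realm on the way up already holds max_files entries or more.
 */
static bool max_files_exceeded(const struct quota_dir *dir)
{
	assert(dir != NULL);
	for (; dir; dir = dir->parent) {
		if (dir->max_files &&
		    dir->rfiles + dir->rsubdirs >= dir->max_files)
			return true;
	}
	return false;
}
```

A caller would return -EDQUOT when this predicate is true, before sending
any request to the MDS.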

Link: http://tracker.ceph.com/issues/22372
Signed-off-by: Luis Henriques <lhenriques@suse.com>
---
 fs/ceph/dir.c   | 11 +++++++++
 fs/ceph/file.c  |  4 +++-
 fs/ceph/quota.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/ceph/super.h |  1 +
 4 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 8a5266699b67..66550d92b1ac 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -818,6 +818,9 @@ static int ceph_mknod(struct inode *dir, struct dentry *dentry,
 	if (ceph_snap(dir) != CEPH_NOSNAP)
 		return -EROFS;
 
+	if (ceph_quota_is_max_files_exceeded(dir))
+		return -EDQUOT;
+
 	err = ceph_pre_init_acls(dir, &mode, &acls);
 	if (err < 0)
 		return err;
@@ -871,6 +874,9 @@ static int ceph_symlink(struct inode *dir, struct dentry *dentry,
 	if (ceph_snap(dir) != CEPH_NOSNAP)
 		return -EROFS;
 
+	if (ceph_quota_is_max_files_exceeded(dir))
+		return -EDQUOT;
+
 	dout("symlink in dir %p dentry %p to '%s'\n", dir, dentry, dest);
 	req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_SYMLINK, USE_AUTH_MDS);
 	if (IS_ERR(req)) {
@@ -920,6 +926,11 @@ static int ceph_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 		goto out;
 	}
 
+	if (ceph_quota_is_max_files_exceeded(dir)) {
+		err = -EDQUOT;
+		goto out;
+	}
+
 	mode |= S_IFDIR;
 	err = ceph_pre_init_acls(dir, &mode, &acls);
 	if (err < 0)
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 5c17125f45c7..5a77a66e3d6b 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -371,7 +371,7 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
 	struct ceph_mds_request *req;
 	struct dentry *dn;
 	struct ceph_acls_info acls = {};
-       int mask;
+	int mask;
 	int err;
 
 	dout("atomic_open %p dentry %p '%pd' %s flags %d mode 0%o\n",
@@ -382,6 +382,8 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
 		return -ENAMETOOLONG;
 
 	if (flags & O_CREAT) {
+		if (ceph_quota_is_max_files_exceeded(dir))
+			return -EDQUOT;
 		err = ceph_pre_init_acls(dir, &mode, &acls);
 		if (err < 0)
 			return err;
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
index 7bde6e85b609..06b3268f8f7f 100644
--- a/fs/ceph/quota.c
+++ b/fs/ceph/quota.c
@@ -61,3 +61,72 @@ void ceph_handle_quota(struct ceph_mds_client *mdsc,
 
 	iput(inode);
 }
+
+enum quota_check_op {
+	QUOTA_CHECK_MAX_FILES_OP /* check quota max_files limit */
+};
+
+/*
+ * check_quota_exceeded() will walk up the snaprealm hierarchy and, for each
+ * realm, it will execute quota check operation defined by the 'op' parameter.
+ * The snaprealm walk is interrupted if the quota check detects that the quota
+ * is exceeded or if the root inode is reached.
+ */
+static bool check_quota_exceeded(struct inode *inode, enum quota_check_op op,
+				 loff_t size)
+{
+	struct ceph_mds_client *mdsc = ceph_inode_to_client(inode)->mdsc;
+	struct ceph_inode_info *ci;
+	struct ceph_snap_realm *realm, *next;
+	struct ceph_vino vino;
+	struct inode *ino;
+	u64 max = 0, rvalue = 0;
+	bool quota_exceeded = false, is_root = false;
+
+	WARN_ON(!S_ISDIR(inode->i_mode));
+
+	down_read(&mdsc->snap_rwsem);
+	realm = ceph_inode(inode)->i_snap_realm;
+	ceph_get_snap_realm(mdsc, realm);
+	while (realm) {
+		vino.ino = realm->ino;
+		vino.snap = CEPH_NOSNAP;
+		ino = ceph_find_inode(inode->i_sb, vino);
+		if (!ino) {
+			pr_warn("Failed to find inode for %llu\n", vino.ino);
+			break;
+		}
+		ci = ceph_inode(ino);
+		switch(op) {
+		case QUOTA_CHECK_MAX_FILES_OP:
+			spin_lock(&ci->i_ceph_lock);
+			max = ci->i_max_files;
+			rvalue = ci->i_rfiles + ci->i_rsubdirs;
+			is_root = (ci->i_vino.ino == CEPH_INO_ROOT);
+			spin_unlock(&ci->i_ceph_lock);
+			quota_exceeded = (max && (rvalue >= max));
+			break;
+		default:
+			/* Shouldn't happen */
+			pr_warn("Invalid quota check op (%d)\n", op);
+			is_root = true; /* Just break the loop */
+		}
+		iput(ino);
+
+		if (quota_exceeded || is_root)
+			break;
+		next = realm->parent;
+		ceph_get_snap_realm(mdsc, next);
+		ceph_put_snap_realm(mdsc, realm);
+		realm = next;
+	}
+	ceph_put_snap_realm(mdsc, realm);
+	up_read(&mdsc->snap_rwsem);
+
+	return quota_exceeded;
+}
+
+bool ceph_quota_is_max_files_exceeded(struct inode *inode)
+{
+	return check_quota_exceeded(inode, QUOTA_CHECK_MAX_FILES_OP, 0);
+}
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index f998b7f076cf..20197e29a7f0 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1026,5 +1026,6 @@ extern void ceph_fs_debugfs_cleanup(struct ceph_fs_client *client);
 extern void ceph_handle_quota(struct ceph_mds_client *mdsc,
 			      struct ceph_mds_session *session,
 			      struct ceph_msg *msg);
+extern bool ceph_quota_is_max_files_exceeded(struct inode *inode);
 
 #endif /* _FS_CEPH_SUPER_H */


* [RFC PATCH v3 3/3] ceph: quota: don't allow cross-quota renames
  2017-12-20 15:18 [RFC PATCH v3 0/3] ceph: kernel client cephfs quota support Luis Henriques
  2017-12-20 15:18 ` [RFC PATCH v3 1/3] ceph: quota: add initial infrastructure to support cephfs quotas Luis Henriques
  2017-12-20 15:18 ` [RFC PATCH v3 2/3] ceph: quotas: support for ceph.quota.max_files Luis Henriques
@ 2017-12-20 15:18 ` Luis Henriques
  2017-12-21  8:10   ` Yan, Zheng
  2017-12-21  8:21 ` [RFC PATCH v3 0/3] ceph: kernel client cephfs quota support Yan, Zheng
  3 siblings, 1 reply; 9+ messages in thread
From: Luis Henriques @ 2017-12-20 15:18 UTC (permalink / raw)
  To: ceph-devel; +Cc: Yan, Zheng, Jeff Layton, Jan Fajerski, Luis Henriques

This patch changes ceph_rename() so that -EXDEV is returned if an attempt
is made to move a file between two directory trees that have different
quotas set up.
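
The decision can be sketched in user-space C as follows.  'struct qrealm',
quota_root() and check_rename() are hypothetical names for this
illustration only; the real code works on snaprealms under
mdsc->snap_rwsem.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a directory's snaprealm. */
struct qrealm {
	uint64_t max_bytes, max_files;  /* 0 means "not set" */
	struct qrealm *parent;          /* NULL at the root */
};

/* Nearest realm with a quota set, or the root if there is none. */
static const struct qrealm *quota_root(const struct qrealm *r)
{
	while (!(r->max_bytes || r->max_files) && r->parent)
		r = r->parent;
	return r;
}

/* Return 0 if the rename may proceed, -EXDEV if it crosses quota realms. */
static int check_rename(const struct qrealm *old_dir,
			const struct qrealm *new_dir)
{
	assert(old_dir != NULL && new_dir != NULL);
	if (old_dir == new_dir)
		return 0;
	return quota_root(old_dir) == quota_root(new_dir) ? 0 : -EXDEV;
}
```

-EXDEV is the error rename(2) returns for a cross-filesystem move, so
tools such as mv(1) would be expected to fall back to copying the data
and removing the source.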

Link: http://tracker.ceph.com/issues/22372
Signed-off-by: Luis Henriques <lhenriques@suse.com>
---
 fs/ceph/dir.c   |  5 +++++
 fs/ceph/quota.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/ceph/super.h |  1 +
 3 files changed, 64 insertions(+)

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 66550d92b1ac..f6ac16caa1e9 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -1090,6 +1090,11 @@ static int ceph_rename(struct inode *old_dir, struct dentry *old_dentry,
 		else
 			return -EROFS;
 	}
+	/* don't allow cross-quota renames */
+	if ((old_dir != new_dir) &&
+	    (!ceph_quota_is_same_realm(old_dir, new_dir)))
+		return -EXDEV;
+
 	dout("rename dir %p dentry %p to dir %p dentry %p\n",
 	     old_dir, old_dentry, new_dir, new_dentry);
 	req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
index 06b3268f8f7f..14e372deb633 100644
--- a/fs/ceph/quota.c
+++ b/fs/ceph/quota.c
@@ -20,6 +20,11 @@
 #include "super.h"
 #include "mds_client.h"
 
+static inline bool ceph_has_quota(struct ceph_inode_info *ci)
+{
+	return (ci && (ci->i_max_files || ci->i_max_bytes));
+}
+
 void ceph_handle_quota(struct ceph_mds_client *mdsc,
 		       struct ceph_mds_session *session,
 		       struct ceph_msg *msg)
@@ -62,6 +67,59 @@ void ceph_handle_quota(struct ceph_mds_client *mdsc,
 	iput(inode);
 }
 
+/*
+ * This function walks through the snaprealm for an inode and returns the
+ * ceph_inode_info for the first snaprealm that has quotas set (either max_files
+ * or max_bytes).  If the root is reached, return the root ceph_inode_info
+ * instead.
+ */
+static struct ceph_inode_info *get_quota_realm(struct ceph_mds_client *mdsc,
+					       struct inode *inode)
+{
+	struct ceph_inode_info *ci = NULL;
+	struct ceph_snap_realm *realm, *next;
+	struct ceph_vino vino;
+	struct inode *ino;
+
+	realm = ceph_inode(inode)->i_snap_realm;
+	ceph_get_snap_realm(mdsc, realm);
+	while (realm) {
+		vino.ino = realm->ino;
+		vino.snap = CEPH_NOSNAP;
+		ino = ceph_find_inode(inode->i_sb, vino);
+		if (!ino) {
+			pr_warn("Failed to find inode for %llu\n", vino.ino);
+			break;
+		}
+		ci = ceph_inode(ino);
+		if (ceph_has_quota(ci) || (ci->i_vino.ino == CEPH_INO_ROOT)) {
+			iput(ino);
+			break;
+		}
+		iput(ino);
+		next = realm->parent;
+		ceph_get_snap_realm(mdsc, next);
+		ceph_put_snap_realm(mdsc, realm);
+		realm = next;
+	}
+	ceph_put_snap_realm(mdsc, realm);
+
+	return ci;
+}
+
+bool ceph_quota_is_same_realm(struct inode *old, struct inode *new)
+{
+	struct ceph_mds_client *mdsc = ceph_inode_to_client(old)->mdsc;
+	struct ceph_inode_info *ci_old, *ci_new;
+
+	down_read(&mdsc->snap_rwsem);
+	ci_old = get_quota_realm(mdsc, old);
+	ci_new = get_quota_realm(mdsc, new);
+	up_read(&mdsc->snap_rwsem);
+
+	return (ci_old == ci_new);
+}
+
 enum quota_check_op {
 	QUOTA_CHECK_MAX_FILES_OP /* check quota max_files limit */
 };
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 20197e29a7f0..a66e73338386 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1027,5 +1027,6 @@ extern void ceph_handle_quota(struct ceph_mds_client *mdsc,
 			      struct ceph_mds_session *session,
 			      struct ceph_msg *msg);
 extern bool ceph_quota_is_max_files_exceeded(struct inode *inode);
+extern bool ceph_quota_is_same_realm(struct inode *old, struct inode *new);
 
 #endif /* _FS_CEPH_SUPER_H */


* Re: [RFC PATCH v3 1/3] ceph: quota: add initial infrastructure to support cephfs quotas
  2017-12-20 15:18 ` [RFC PATCH v3 1/3] ceph: quota: add initial infrastructure to support cephfs quotas Luis Henriques
@ 2017-12-21  7:58   ` Yan, Zheng
  0 siblings, 0 replies; 9+ messages in thread
From: Yan, Zheng @ 2017-12-21  7:58 UTC (permalink / raw)
  To: Luis Henriques; +Cc: ceph-devel, Yan, Zheng, Jeff Layton, Jan Fajerski

On Wed, Dec 20, 2017 at 11:18 PM, Luis Henriques <lhenriques@suse.com> wrote:
> This patch adds the infrastructure required to support cephfs quotas as it
> is currently implemented in the ceph fuse client.  Cephfs quotas can be
> set on any directory, and can restrict the number of bytes or the number
> of files stored beneath that point in the directory hierarchy.
>
> Quotas are set using the extended attributes 'ceph.quota.max_files' and
> 'ceph.quota.max_bytes', and can be removed by setting these attributes to
> '0'.
>
> Link: http://tracker.ceph.com/issues/22372
> Signed-off-by: Luis Henriques <lhenriques@suse.com>
> ---
>  fs/ceph/Makefile                   |  2 +-
>  fs/ceph/inode.c                    |  6 ++++
>  fs/ceph/mds_client.c               | 23 ++++++++++++++
>  fs/ceph/mds_client.h               |  2 ++
>  fs/ceph/quota.c                    | 63 ++++++++++++++++++++++++++++++++++++++
>  fs/ceph/super.h                    |  8 +++++
>  fs/ceph/xattr.c                    | 44 ++++++++++++++++++++++++++
>  include/linux/ceph/ceph_features.h |  3 +-
>  include/linux/ceph/ceph_fs.h       | 17 ++++++++++
>  9 files changed, 166 insertions(+), 2 deletions(-)
>  create mode 100644 fs/ceph/quota.c
>
> diff --git a/fs/ceph/Makefile b/fs/ceph/Makefile
> index 174f5709e508..a699e320393f 100644
> --- a/fs/ceph/Makefile
> +++ b/fs/ceph/Makefile
> @@ -6,7 +6,7 @@
>  obj-$(CONFIG_CEPH_FS) += ceph.o
>
>  ceph-y := super.o inode.o dir.o file.o locks.o addr.o ioctl.o \
> -       export.o caps.o snap.o xattr.o \
> +       export.o caps.o snap.o xattr.o quota.o \
>         mds_client.o mdsmap.o strings.o ceph_frag.o \
>         debugfs.o
>
> diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
> index ab81652198c4..8a0ba96e105d 100644
> --- a/fs/ceph/inode.c
> +++ b/fs/ceph/inode.c
> @@ -441,6 +441,9 @@ struct inode *ceph_alloc_inode(struct super_block *sb)
>         atomic64_set(&ci->i_complete_seq[1], 0);
>         ci->i_symlink = NULL;
>
> +       ci->i_max_bytes = 0;
> +       ci->i_max_files = 0;
> +
>         memset(&ci->i_dir_layout, 0, sizeof(ci->i_dir_layout));
>         RCU_INIT_POINTER(ci->i_layout.pool_ns, NULL);
>
> @@ -790,6 +793,9 @@ static int fill_inode(struct inode *inode, struct page *locked_page,
>         inode->i_rdev = le32_to_cpu(info->rdev);
>         inode->i_blkbits = fls(le32_to_cpu(info->layout.fl_stripe_unit)) - 1;
>
> +       ci->i_max_bytes = iinfo->max_bytes;
> +       ci->i_max_files = iinfo->max_files;
> +
>         if ((new_version || (new_issued & CEPH_CAP_AUTH_SHARED)) &&
>             (issued & CEPH_CAP_AUTH_EXCL) == 0) {
>                 inode->i_mode = le32_to_cpu(info->mode);
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 1b468250e947..2290056d13fc 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -100,6 +100,26 @@ static int parse_reply_info_in(void **p, void *end,
>         } else
>                 info->inline_version = CEPH_INLINE_NONE;
>
> +       if (features & CEPH_FEATURE_MDS_QUOTA) {
> +               u8 struct_v, struct_compat;
> +               u32 struct_len;
> +
> +               /*
> +                * both struct_v and struct_compat are expected to be >= 1
> +                */
> +               ceph_decode_8_safe(p, end, struct_v, bad);
> +               ceph_decode_8_safe(p, end, struct_compat, bad);
> +               if (!struct_v || !struct_compat)
> +                       goto bad;
> +               ceph_decode_32_safe(p, end, struct_len, bad);
> +               ceph_decode_need(p, end, struct_len, bad);
> +               ceph_decode_64_safe(p, end, info->max_bytes, bad);
> +               ceph_decode_64_safe(p, end, info->max_files, bad);
> +       } else {
> +               info->max_bytes = 0;
> +               info->max_files = 0;
> +       }
> +
>         info->pool_ns_len = 0;
>         info->pool_ns_data = NULL;
>         if (features & CEPH_FEATURE_FS_FILE_LAYOUT_V2) {
> @@ -4064,6 +4084,9 @@ static void dispatch(struct ceph_connection *con, struct ceph_msg *msg)
>         case CEPH_MSG_CLIENT_LEASE:
>                 handle_lease(mdsc, s, msg);
>                 break;
> +       case CEPH_MSG_CLIENT_QUOTA:
> +               ceph_handle_quota(mdsc, s, msg);
> +               break;
>
>         default:
>                 pr_err("received unknown message type %d %s\n", type,
> diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h
> index 837ac4b087a0..7af576733948 100644
> --- a/fs/ceph/mds_client.h
> +++ b/fs/ceph/mds_client.h
> @@ -49,6 +49,8 @@ struct ceph_mds_reply_info_in {
>         char *inline_data;
>         u32 pool_ns_len;
>         char *pool_ns_data;
> +       u64 max_bytes;
> +       u64 max_files;
>  };
>
>  struct ceph_mds_reply_dir_entry {
> diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
> new file mode 100644
> index 000000000000..7bde6e85b609
> --- /dev/null
> +++ b/fs/ceph/quota.c
> @@ -0,0 +1,63 @@
> +/*
> + * quota.c - CephFS quota
> + *
> + * Copyright (C) 2017 SUSE
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "super.h"
> +#include "mds_client.h"
> +
> +void ceph_handle_quota(struct ceph_mds_client *mdsc,
> +                      struct ceph_mds_session *session,
> +                      struct ceph_msg *msg)
> +{
> +       struct super_block *sb = mdsc->fsc->sb;
> +       struct ceph_mds_quota *h = msg->front.iov_base;
> +       struct ceph_vino vino;
> +       struct inode *inode;
> +       struct ceph_inode_info *ci;
> +
> +       if (msg->front.iov_len != sizeof(*h)) {
> +               pr_err("ceph_handle_quota corrupt message mds%d len %d\n",
> +                      session->s_mds, (int)msg->front.iov_len);
> +               ceph_msg_dump(msg);
> +               return;
> +       }
> +
> +       /* lookup inode */
> +       vino.ino = le64_to_cpu(h->ino);
> +       vino.snap = CEPH_NOSNAP;
> +       inode = ceph_find_inode(sb, vino);
> +       if (!inode) {
> +               pr_warn("Failed to find inode %llu\n", vino.ino);
> +               return;
> +       }
> +       ci = ceph_inode(inode);
> +
> +       mutex_lock(&session->s_mutex);
> +       session->s_seq++;
> +       mutex_unlock(&session->s_mutex);

This s_seq update should be executed whether or not the inode is in the
cache, i.e. it should not be skipped by the early return above.
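
To illustrate the ordering being asked for, here is a minimal userspace sketch. It is not the kernel code: the structures, the toy cache and the names (struct session, struct quota_msg, find_inode, handle_quota) are hypothetical stand-ins, and locking is left out. The only point is that the sequence number is bumped before the cache lookup can return early.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct session { uint64_t s_seq; };
struct inode_info { uint64_t ino; uint64_t max_bytes, max_files; };
struct quota_msg { uint64_t ino; uint64_t max_bytes, max_files; };

/* Toy inode cache: a fixed array searched linearly. */
static struct inode_info cache[] = { { .ino = 0x100 } };

static struct inode_info *find_inode(uint64_t ino)
{
	for (size_t i = 0; i < sizeof(cache) / sizeof(cache[0]); i++)
		if (cache[i].ino == ino)
			return &cache[i];
	return NULL;
}

/* Returns true if the quota fields were applied to a cached inode. */
static bool handle_quota(struct session *s, const struct quota_msg *m)
{
	struct inode_info *ci;

	assert(s && m);

	/* Account for the message first, cached inode or not. */
	s->s_seq++;

	ci = find_inode(m->ino);
	if (!ci)
		return false;	/* not cached: nothing to update */

	ci->max_bytes = m->max_bytes;
	ci->max_files = m->max_files;
	return true;
}
```

In this model a message for an uncached inode still advances s_seq, which is the behaviour the comment asks for.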

> +
> +       spin_lock(&ci->i_ceph_lock);
> +       ci->i_rbytes = le64_to_cpu(h->rbytes);
> +       ci->i_rfiles = le64_to_cpu(h->rfiles);
> +       ci->i_rsubdirs = le64_to_cpu(h->rsubdirs);
> +       ci->i_max_bytes = le64_to_cpu(h->max_bytes);
> +       ci->i_max_files = le64_to_cpu(h->max_files);
> +       spin_unlock(&ci->i_ceph_lock);
> +
> +       iput(inode);
> +}
> diff --git a/fs/ceph/super.h b/fs/ceph/super.h
> index 2beeec07fa76..f998b7f076cf 100644
> --- a/fs/ceph/super.h
> +++ b/fs/ceph/super.h
> @@ -309,6 +309,9 @@ struct ceph_inode_info {
>         u64 i_rbytes, i_rfiles, i_rsubdirs;
>         u64 i_files, i_subdirs;
>
> +       /* quotas */
> +       u64 i_max_bytes, i_max_files;
> +
>         struct rb_root i_fragtree;
>         int i_fragtree_nsplits;
>         struct mutex i_fragtree_mutex;
> @@ -1019,4 +1022,9 @@ extern int ceph_locks_to_pagelist(struct ceph_filelock *flocks,
>  extern int ceph_fs_debugfs_init(struct ceph_fs_client *client);
>  extern void ceph_fs_debugfs_cleanup(struct ceph_fs_client *client);
>
> +/* quota.c */
> +extern void ceph_handle_quota(struct ceph_mds_client *mdsc,
> +                             struct ceph_mds_session *session,
> +                             struct ceph_msg *msg);
> +
>  #endif /* _FS_CEPH_SUPER_H */
> diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
> index e1c4e0b12b4c..cfc3028be0fa 100644
> --- a/fs/ceph/xattr.c
> +++ b/fs/ceph/xattr.c
> @@ -224,6 +224,31 @@ static size_t ceph_vxattrcb_dir_rctime(struct ceph_inode_info *ci, char *val,
>                         (long)ci->i_rctime.tv_nsec);
>  }
>
> +/* quotas */
> +
> +static bool ceph_vxattrcb_quota_exists(struct ceph_inode_info *ci)
> +{
> +       return (ci->i_max_files || ci->i_max_bytes);
> +}
> +
> +static size_t ceph_vxattrcb_quota(struct ceph_inode_info *ci, char *val,
> +                                 size_t size)
> +{
> +       return snprintf(val, size, "max_bytes=%llu max_files=%llu",
> +                       ci->i_max_bytes, ci->i_max_files);
> +}
> +
> +static size_t ceph_vxattrcb_quota_max_bytes(struct ceph_inode_info *ci,
> +                                           char *val, size_t size)
> +{
> +       return snprintf(val, size, "%llu", ci->i_max_bytes);
> +}
> +
> +static size_t ceph_vxattrcb_quota_max_files(struct ceph_inode_info *ci,
> +                                           char *val, size_t size)
> +{
> +       return snprintf(val, size, "%llu", ci->i_max_files);
> +}
>
>  #define CEPH_XATTR_NAME(_type, _name)  XATTR_CEPH_PREFIX #_type "." #_name
>  #define CEPH_XATTR_NAME2(_type, _name, _name2) \
> @@ -247,6 +272,15 @@ static size_t ceph_vxattrcb_dir_rctime(struct ceph_inode_info *ci, char *val,
>                 .hidden = true,                 \
>                 .exists_cb = ceph_vxattrcb_layout_exists,       \
>         }
> +#define XATTR_QUOTA_FIELD(_type, _name)                                        \
> +       {                                                               \
> +               .name = CEPH_XATTR_NAME(_type, _name),                  \
> +               .name_size = sizeof (CEPH_XATTR_NAME(_type, _name)),    \
> +               .getxattr_cb = ceph_vxattrcb_ ## _type ## _ ## _name,   \
> +               .readonly = false,                                      \
> +               .hidden = true,                                         \
> +               .exists_cb = ceph_vxattrcb_quota_exists,                \
> +       }
>
>  static struct ceph_vxattr ceph_dir_vxattrs[] = {
>         {
> @@ -270,6 +304,16 @@ static struct ceph_vxattr ceph_dir_vxattrs[] = {
>         XATTR_NAME_CEPH(dir, rsubdirs),
>         XATTR_NAME_CEPH(dir, rbytes),
>         XATTR_NAME_CEPH(dir, rctime),
> +       {
> +               .name = "ceph.quota",
> +               .name_size = sizeof("ceph.quota"),
> +               .getxattr_cb = ceph_vxattrcb_quota,
> +               .readonly = false,
> +               .hidden = true,
> +               .exists_cb = ceph_vxattrcb_quota_exists,
> +       },
> +       XATTR_QUOTA_FIELD(quota, max_bytes),
> +       XATTR_QUOTA_FIELD(quota, max_files),
>         { .name = NULL, 0 }     /* Required table terminator */
>  };
>  static size_t ceph_dir_vxattrs_name_size;      /* total size of all names */
> diff --git a/include/linux/ceph/ceph_features.h b/include/linux/ceph/ceph_features.h
> index 59042d5ac520..6acd46c36271 100644
> --- a/include/linux/ceph/ceph_features.h
> +++ b/include/linux/ceph/ceph_features.h
> @@ -209,7 +209,8 @@ DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, LUMINOUS) // client-facin
>          CEPH_FEATURE_SERVER_JEWEL |            \
>          CEPH_FEATURE_MON_STATEFUL_SUB |        \
>          CEPH_FEATURE_CRUSH_TUNABLES5 |         \
> -        CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING)
> +        CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING | \
> +        CEPH_FEATURE_MDS_QUOTA)
>
>  #define CEPH_FEATURES_REQUIRED_DEFAULT   \
>         (CEPH_FEATURE_NOSRCADDR |        \
> diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h
> index 88dd51381aaf..98bdcc0eda3f 100644
> --- a/include/linux/ceph/ceph_fs.h
> +++ b/include/linux/ceph/ceph_fs.h
> @@ -134,6 +134,7 @@ struct ceph_dir_layout {
>  #define CEPH_MSG_CLIENT_LEASE           0x311
>  #define CEPH_MSG_CLIENT_SNAP            0x312
>  #define CEPH_MSG_CLIENT_CAPRELEASE      0x313
> +#define CEPH_MSG_CLIENT_QUOTA          0x314
>
>  /* pool ops */
>  #define CEPH_MSG_POOLOP_REPLY           48
> @@ -807,4 +808,20 @@ struct ceph_mds_snap_realm {
>  } __attribute__ ((packed));
>  /* followed by my snap list, then prior parent snap list */
>
> +/*
> + * quotas
> + */
> +struct ceph_mds_quota {
> +       __le64 ino;             /* ino */
> +       struct ceph_timespec rctime;
> +       __le64 rbytes;          /* dir stats */
> +       __le64 rfiles;
> +       __le64 rsubdirs;
> +       __u8 struct_v;          /* compat */
> +       __u8 struct_compat;
> +       __le32 struct_len;
> +       __le64 max_bytes;       /* quota max. bytes */
> +       __le64 max_files;       /* quota max. files */
> +} __attribute__ ((packed));
> +
>  #endif
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [RFC PATCH v3 3/3] ceph: quota: don't allow cross-quota renames
  2017-12-20 15:18 ` [RFC PATCH v3 3/3] ceph: quota: don't allow cross-quota renames Luis Henriques
@ 2017-12-21  8:10   ` Yan, Zheng
  0 siblings, 0 replies; 9+ messages in thread
From: Yan, Zheng @ 2017-12-21  8:10 UTC (permalink / raw)
  To: Luis Henriques; +Cc: ceph-devel, Yan, Zheng, Jeff Layton, Jan Fajerski

On Wed, Dec 20, 2017 at 11:18 PM, Luis Henriques <lhenriques@suse.com> wrote:
> This patch changes ceph_rename so that -EXDEV is returned if an attempt is
> made to move a file between two directory trees that have different quotas
> set up.
>
> Link: http://tracker.ceph.com/issues/22372
> Signed-off-by: Luis Henriques <lhenriques@suse.com>
> ---
>  fs/ceph/dir.c   |  5 +++++
>  fs/ceph/quota.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/ceph/super.h |  1 +
>  3 files changed, 64 insertions(+)
>
> diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> index 66550d92b1ac..f6ac16caa1e9 100644
> --- a/fs/ceph/dir.c
> +++ b/fs/ceph/dir.c
> @@ -1090,6 +1090,11 @@ static int ceph_rename(struct inode *old_dir, struct dentry *old_dentry,
>                 else
>                         return -EROFS;
>         }
> +       /* don't allow cross-quota renames */
> +       if ((old_dir != new_dir) &&
> +           (!ceph_quota_is_same_realm(old_dir, new_dir)))
> +               return -EXDEV;
> +
>         dout("rename dir %p dentry %p to dir %p dentry %p\n",
>              old_dir, old_dentry, new_dir, new_dentry);
>         req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS);
> diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
> index 06b3268f8f7f..14e372deb633 100644
> --- a/fs/ceph/quota.c
> +++ b/fs/ceph/quota.c
> @@ -20,6 +20,11 @@
>  #include "super.h"
>  #include "mds_client.h"
>
> +static inline bool ceph_has_quota(struct ceph_inode_info *ci)
> +{
> +       return (ci && (ci->i_max_files || ci->i_max_bytes));
> +}
> +
>  void ceph_handle_quota(struct ceph_mds_client *mdsc,
>                        struct ceph_mds_session *session,
>                        struct ceph_msg *msg)
> @@ -62,6 +67,59 @@ void ceph_handle_quota(struct ceph_mds_client *mdsc,
>         iput(inode);
>  }
>
> +/*
> + * This function walks through the snaprealm for an inode and returns the
> + * ceph_inode_info for the first snaprealm that has quotas set (either max_files
> + * or max_bytes).  If the root is reached, return the root ceph_inode_info
> + * instead.
> + */
> +static struct ceph_inode_info *get_quota_realm(struct ceph_mds_client *mdsc,
> +                                              struct inode *inode)
> +{
> +       struct ceph_inode_info *ci = NULL;
> +       struct ceph_snap_realm *realm, *next;
> +       struct ceph_vino vino;
> +       struct inode *ino;

In ceph code, 'ino' is usually an abbreviation for inode number.  It's
better to use the name 'in' here.

> +
> +       realm = ceph_inode(inode)->i_snap_realm;
> +       ceph_get_snap_realm(mdsc, realm);
> +       while (realm) {
> +               vino.ino = realm->ino;
> +               vino.snap = CEPH_NOSNAP;
> +               ino = ceph_find_inode(inode->i_sb, vino);
> +               if (!ino) {
> +                       pr_warn("Failed to find inode for %llu\n", vino.ino);
> +                       break;
> +               }
> +               ci = ceph_inode(ino);
> +               if (ceph_has_quota(ci) || (ci->i_vino.ino == CEPH_INO_ROOT)) {
> +                       iput(ino);
> +                       break;
> +               }
> +               iput(ino);
> +               next = realm->parent;
> +               ceph_get_snap_realm(mdsc, next);
> +               ceph_put_snap_realm(mdsc, realm);
> +               realm = next;
> +       }
> +       ceph_put_snap_realm(mdsc, realm);
> +
> +       return ci;

I think it's better to return the realm and let the caller call
ceph_put_snap_realm().
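
A rough userspace sketch of that shape, under stated assumptions: struct realm, get_realm(), put_realm() and is_same_quota_realm() are hypothetical models rather than the kernel API, the quota fields live directly on the realm instead of being looked up through the inode, and the snap_rwsem locking is omitted. The helper returns the realm with a reference held, and the caller drops it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a snaprealm; not the kernel structure. */
struct realm {
	struct realm *parent;	/* NULL at the root */
	int nref;		/* reference count */
	uint64_t max_bytes, max_files;
};

static void get_realm(struct realm *r)
{
	if (r)
		r->nref++;
}

static void put_realm(struct realm *r)
{
	if (r) {
		assert(r->nref > 0);
		r->nref--;
	}
}

static bool has_quota(const struct realm *r)
{
	return r->max_files || r->max_bytes;
}

/*
 * Walk up from 'start' and return the first realm with a quota set, or the
 * root realm if none has one.  The returned realm carries a reference that
 * the caller must drop with put_realm().
 */
static struct realm *get_quota_realm(struct realm *start)
{
	struct realm *realm = start, *next;

	get_realm(realm);
	while (realm) {
		if (has_quota(realm) || !realm->parent)
			return realm;
		next = realm->parent;
		get_realm(next);
		put_realm(realm);
		realm = next;
	}
	return NULL;
}

static bool is_same_quota_realm(struct realm *a, struct realm *b)
{
	struct realm *ra = get_quota_realm(a);
	struct realm *rb = get_quota_realm(b);
	bool same = (ra == rb);

	/* The caller, not the helper, drops the references. */
	put_realm(ra);
	put_realm(rb);
	return same;
}
```

With this split the comparison is made while both references are still held, so neither realm can go away between the lookup and the check.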


> +}
> +
> +bool ceph_quota_is_same_realm(struct inode *old, struct inode *new)
> +{
> +       struct ceph_mds_client *mdsc = ceph_inode_to_client(old)->mdsc;
> +       struct ceph_inode_info *ci_old, *ci_new;
> +
> +       down_read(&mdsc->snap_rwsem);
> +       ci_old = get_quota_realm(mdsc, old);
> +       ci_new = get_quota_realm(mdsc, new);
> +       up_read(&mdsc->snap_rwsem);
> +
> +       return (ci_old == ci_new);
> +}
> +
>  enum quota_check_op {
>         QUOTA_CHECK_MAX_FILES_OP /* check quota max_files limit */
>  };
> diff --git a/fs/ceph/super.h b/fs/ceph/super.h
> index 20197e29a7f0..a66e73338386 100644
> --- a/fs/ceph/super.h
> +++ b/fs/ceph/super.h
> @@ -1027,5 +1027,6 @@ extern void ceph_handle_quota(struct ceph_mds_client *mdsc,
>                               struct ceph_mds_session *session,
>                               struct ceph_msg *msg);
>  extern bool ceph_quota_is_max_files_exceeded(struct inode *inode);
> +extern bool ceph_quota_is_same_realm(struct inode *old, struct inode *new);
>
>  #endif /* _FS_CEPH_SUPER_H */


* Re: [RFC PATCH v3 2/3] ceph: quotas: support for ceph.quota.max_files
  2017-12-20 15:18 ` [RFC PATCH v3 2/3] ceph: quotas: support for ceph.quota.max_files Luis Henriques
@ 2017-12-21  8:11   ` Yan, Zheng
  0 siblings, 0 replies; 9+ messages in thread
From: Yan, Zheng @ 2017-12-21  8:11 UTC (permalink / raw)
  To: Luis Henriques; +Cc: ceph-devel, Yan, Zheng, Jeff Layton, Jan Fajerski

On Wed, Dec 20, 2017 at 11:18 PM, Luis Henriques <lhenriques@suse.com> wrote:
> This patch adds support for the max_files quota.  It hooks into all the
> ceph functions that add new filesystem objects that need to be checked
> against the quota limits.  When these limits are hit, -EDQUOT is returned.
>
> Note that we're not checking quotas on ceph_link().  ceph_link() doesn't
> really create a new inode, and since the MDS doesn't update the directory
> statistics when a new hard link is created (only for symlinks), hard links
> are not accounted as new files.
>
> Link: http://tracker.ceph.com/issues/22372
> Signed-off-by: Luis Henriques <lhenriques@suse.com>
> ---
>  fs/ceph/dir.c   | 11 +++++++++
>  fs/ceph/file.c  |  4 +++-
>  fs/ceph/quota.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/ceph/super.h |  1 +
>  4 files changed, 84 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> index 8a5266699b67..66550d92b1ac 100644
> --- a/fs/ceph/dir.c
> +++ b/fs/ceph/dir.c
> @@ -818,6 +818,9 @@ static int ceph_mknod(struct inode *dir, struct dentry *dentry,
>         if (ceph_snap(dir) != CEPH_NOSNAP)
>                 return -EROFS;
>
> +       if (ceph_quota_is_max_files_exceeded(dir))
> +               return -EDQUOT;
> +
>         err = ceph_pre_init_acls(dir, &mode, &acls);
>         if (err < 0)
>                 return err;
> @@ -871,6 +874,9 @@ static int ceph_symlink(struct inode *dir, struct dentry *dentry,
>         if (ceph_snap(dir) != CEPH_NOSNAP)
>                 return -EROFS;
>
> +       if (ceph_quota_is_max_files_exceeded(dir))
> +               return -EDQUOT;
> +
>         dout("symlink in dir %p dentry %p to '%s'\n", dir, dentry, dest);
>         req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_SYMLINK, USE_AUTH_MDS);
>         if (IS_ERR(req)) {
> @@ -920,6 +926,11 @@ static int ceph_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
>                 goto out;
>         }
>
> +       if (ceph_quota_is_max_files_exceeded(dir)) {
> +               err = -EDQUOT;
> +               goto out;
> +       }
> +
>         mode |= S_IFDIR;
>         err = ceph_pre_init_acls(dir, &mode, &acls);
>         if (err < 0)
> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> index 5c17125f45c7..5a77a66e3d6b 100644
> --- a/fs/ceph/file.c
> +++ b/fs/ceph/file.c
> @@ -371,7 +371,7 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
>         struct ceph_mds_request *req;
>         struct dentry *dn;
>         struct ceph_acls_info acls = {};
> -       int mask;
> +       int mask;
>         int err;
>
>         dout("atomic_open %p dentry %p '%pd' %s flags %d mode 0%o\n",
> @@ -382,6 +382,8 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
>                 return -ENAMETOOLONG;
>
>         if (flags & O_CREAT) {
> +               if (ceph_quota_is_max_files_exceeded(dir))
> +                       return -EDQUOT;
>                 err = ceph_pre_init_acls(dir, &mode, &acls);
>                 if (err < 0)
>                         return err;
> diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
> index 7bde6e85b609..06b3268f8f7f 100644
> --- a/fs/ceph/quota.c
> +++ b/fs/ceph/quota.c
> @@ -61,3 +61,72 @@ void ceph_handle_quota(struct ceph_mds_client *mdsc,
>
>         iput(inode);
>  }
> +
> +enum quota_check_op {
> +       QUOTA_CHECK_MAX_FILES_OP /* check quota max_files limit */
> +};
> +
> +/*
> + * check_quota_exceeded() will walk up the snaprealm hierarchy and, for each
> + * realm, it will execute quota check operation defined by the 'op' parameter.
> + * The snaprealm walk is interrupted if the quota check detects that the quota
> + * is exceeded or if the root inode is reached.
> + */
> +static bool check_quota_exceeded(struct inode *inode, enum quota_check_op op,
> +                                loff_t size)
> +{
> +       struct ceph_mds_client *mdsc = ceph_inode_to_client(inode)->mdsc;
> +       struct ceph_inode_info *ci;
> +       struct ceph_snap_realm *realm, *next;
> +       struct ceph_vino vino;
> +       struct inode *ino;

It's better to use the name 'in'.
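
For reference, the max_files test that the loop below performs can be modelled in userspace roughly as follows. This is only a sketch: struct qdir and max_files_exceeded() are hypothetical names, each node stands for a realm's directory, and the snaprealm refcounting and locking from the patch are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one node per quota realm directory. */
struct qdir {
	struct qdir *parent;		/* NULL at the filesystem root */
	uint64_t rfiles, rsubdirs;	/* recursive file and subdir counts */
	uint64_t max_files;		/* 0 means no limit is set */
};

/*
 * Walk from 'in' up to the root.  A realm is over quota when a limit is
 * set and the recursive count of files plus subdirectories has reached it.
 */
static bool max_files_exceeded(const struct qdir *in)
{
	assert(in != NULL);

	for (; in; in = in->parent) {
		uint64_t used = in->rfiles + in->rsubdirs;

		if (in->max_files && used >= in->max_files)
			return true;
	}
	return false;
}
```

Note that the comparison is >=, so a directory already holding exactly max_files entries refuses the next create.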

> +       u64 max = 0, rvalue = 0;
> +       bool quota_exceeded = false, is_root = false;
> +
> +       WARN_ON(!S_ISDIR(inode->i_mode));
> +
> +       down_read(&mdsc->snap_rwsem);
> +       realm = ceph_inode(inode)->i_snap_realm;
> +       ceph_get_snap_realm(mdsc, realm);
> +       while (realm) {
> +               vino.ino = realm->ino;
> +               vino.snap = CEPH_NOSNAP;
> +               ino = ceph_find_inode(inode->i_sb, vino);
> +               if (!ino) {
> +                       pr_warn("Failed to find inode for %llu\n", vino.ino);
> +                       break;
> +               }
> +               ci = ceph_inode(ino);
> +               switch(op) {
> +               case QUOTA_CHECK_MAX_FILES_OP:
> +                       spin_lock(&ci->i_ceph_lock);
> +                       max = ci->i_max_files;
> +                       rvalue = ci->i_rfiles + ci->i_rsubdirs;
> +                       is_root = (ci->i_vino.ino == CEPH_INO_ROOT);
> +                       spin_unlock(&ci->i_ceph_lock);
> +                       quota_exceeded = (max && (rvalue >= max));
> +                       break;
> +               default:
> +                       /* Shouldn't happen */
> +                       pr_warn("Invalid quota check op (%d)\n", op);
> +                       is_root = true; /* Just break the loop */
> +               }
> +               iput(ino);
> +
> +               if (quota_exceeded || is_root)
> +                       break;
> +               next = realm->parent;
> +               ceph_get_snap_realm(mdsc, next);
> +               ceph_put_snap_realm(mdsc, realm);
> +               realm = next;
> +       }
> +       ceph_put_snap_realm(mdsc, realm);
> +       up_read(&mdsc->snap_rwsem);
> +
> +       return quota_exceeded;
> +}
> +
> +bool ceph_quota_is_max_files_exceeded(struct inode *inode)
> +{
> +       return check_quota_exceeded(inode, QUOTA_CHECK_MAX_FILES_OP, 0);
> +}
> diff --git a/fs/ceph/super.h b/fs/ceph/super.h
> index f998b7f076cf..20197e29a7f0 100644
> --- a/fs/ceph/super.h
> +++ b/fs/ceph/super.h
> @@ -1026,5 +1026,6 @@ extern void ceph_fs_debugfs_cleanup(struct ceph_fs_client *client);
>  extern void ceph_handle_quota(struct ceph_mds_client *mdsc,
>                               struct ceph_mds_session *session,
>                               struct ceph_msg *msg);
> +extern bool ceph_quota_is_max_files_exceeded(struct inode *inode);
>
>  #endif /* _FS_CEPH_SUPER_H */


* Re: [RFC PATCH v3 0/3] ceph: kernel client cephfs quota support
  2017-12-20 15:18 [RFC PATCH v3 0/3] ceph: kernel client cephfs quota support Luis Henriques
                   ` (2 preceding siblings ...)
  2017-12-20 15:18 ` [RFC PATCH v3 3/3] ceph: quota: don't allow cross-quota renames Luis Henriques
@ 2017-12-21  8:21 ` Yan, Zheng
  2017-12-21  9:32   ` Luis Henriques
  3 siblings, 1 reply; 9+ messages in thread
From: Yan, Zheng @ 2017-12-21  8:21 UTC (permalink / raw)
  To: Luis Henriques; +Cc: ceph-devel, Yan, Zheng, Jeff Layton, Jan Fajerski

On Wed, Dec 20, 2017 at 11:18 PM, Luis Henriques <lhenriques@suse.com> wrote:
> A cephfs-specific quota implementation has been available in the
> user-space fuse client for a while.  This quota implementation allows an
> administrator to restrict the number of bytes and/or the number of files
> in a filesystem subtree.  This quota implementation, however, is
> supported at the client-level only, which means that cooperation is
> required between different clients accessing the system.
>
> This obviously assumes that all clients are trusted entities and will
> respect the quotas, preventing users from exceeding the quota limits.
> Since the kernel client doesn't support quotas, it has not been possible
> to use it in a cluster where quotas are a requirement.
>
> This patchset is an RFC that adds kernel client support for cephfs
> quotas as it is currently implemented in the ceph fuse client.  Note
> however that this patchset is not yet feature complete, as it only
> implements the max_files quota (max_bytes is still missing).
>
> ** Changes since v2 **
>
> Rework after review from Yan, Zheng:
>
> - Dropped patch 0001 ("ceph: add seqlock for snaprealm hierarchy change
>   detection") and use mdsc->snap_rwsem for walking the snaprealm
>   hierarchy instead of adding a seqlock.  This means that patches 0003
>   and 0004 needed to be reworked.
>
> - Added a NULL check in ceph_handle_quota() after the inode lookup with
>   ceph_find_inode().
>
> ** Changes since v1 **
>
> Instead of trying to do a reverse path walk to find the "quota realm"
> for a given directory, this patchset is now using snaprealms.  Thus, for
> testing it, a modified MDS is required:
>
>   https://github.com/ukernel/ceph/tree/wip-cephfs-quota-realm
>
> This modified MDS creates a snaprealm when a quota is set in a
> directory.  This means that a client needs only to walk up the snaprealm
> hierarchy to find a directory that has quotas instead of doing the full
> reverse path walking.
>
> Note however that this requires an extra patch that adds a seqlock (1st
> patch in series) to detect changes in the snaprealm hierarchy.
>
> Luis Henriques (3):
>   ceph: quota: add initial infrastructure to support cephfs quotas
>   ceph: quotas: support for ceph.quota.max_files
>   ceph: quota: don't allow cross-quota renames
>
>  fs/ceph/Makefile                   |   2 +-
>  fs/ceph/dir.c                      |  16 ++++
>  fs/ceph/file.c                     |   4 +-
>  fs/ceph/inode.c                    |   6 ++
>  fs/ceph/mds_client.c               |  23 +++++
>  fs/ceph/mds_client.h               |   2 +
>  fs/ceph/quota.c                    | 190 +++++++++++++++++++++++++++++++++++++
>  fs/ceph/super.h                    |  10 ++
>  fs/ceph/xattr.c                    |  44 +++++++++
>  include/linux/ceph/ceph_features.h |   3 +-
>  include/linux/ceph/ceph_fs.h       |  17 ++++
>  11 files changed, 314 insertions(+), 3 deletions(-)
>  create mode 100644 fs/ceph/quota.c
>

A few minor comments; otherwise this series looks good.

Regards
Yan, Zheng



* Re: [RFC PATCH v3 0/3] ceph: kernel client cephfs quota support
  2017-12-21  8:21 ` [RFC PATCH v3 0/3] ceph: kernel client cephfs quota support Yan, Zheng
@ 2017-12-21  9:32   ` Luis Henriques
  0 siblings, 0 replies; 9+ messages in thread
From: Luis Henriques @ 2017-12-21  9:32 UTC (permalink / raw)
  To: Yan, Zheng; +Cc: ceph-devel, Jeff Layton, Jan Fajerski

"Yan, Zheng" <ukernel@gmail.com> writes:

> On Wed, Dec 20, 2017 at 11:18 PM, Luis Henriques <lhenriques@suse.com> wrote:
>> A cephfs-specific quota implementation has been available in the
>> user-space fuse client for a while.  This quota implementation allows an
>> administrator to restrict the number of bytes and/or the number of files
>> in a filesystem subtree.  This quota implementation, however, is
>> supported at the client-level only, which means that cooperation is
>> required between different clients accessing the system.
>>
>> This obviously assumes that all clients are trusted entities and will
>> respect the quotas, preventing users from exceeding the quota limits.
>> Since the kernel client doesn't support quotas, it has not been possible
>> to use it in a cluster where quotas are a requirement.
>>
>> This patchset is an RFC that adds kernel client support for cephfs
>> quotas as it is currently implemented in the ceph fuse client.  Note
>> however that this patchset is not yet feature complete, as it only
>> implements the max_files quota (max_bytes is still missing).
>>
>> ** Changes since v2 **
>>
>> Rework after review from Yan, Zheng:
>>
>> - Dropped patch 0001 ("ceph: add seqlock for snaprealm hierarchy change
>>   detection") and use mdsc->snap_rwsem for walking the snaprealm
>>   hierarchy instead of adding a seqlock.  This means that patches 0003
>>   and 0004 needed to be reworked.
>>
>> - Added a NULL check in ceph_handle_quota() after the inode lookup with
>>   ceph_find_inode().
>>
>> ** Changes since v1 **
>>
>> Instead of trying to do a reverse path walk to find the "quota realm"
>> for a given directory, this patchset is now using snaprealms.  Thus, for
>> testing it, a modified MDS is required:
>>
>>   https://github.com/ukernel/ceph/tree/wip-cephfs-quota-realm
>>
>> This modified MDS creates a snaprealm when a quota is set in a
>> directory.  This means that a client needs only to walk up the snaprealm
>> hierarchy to find a directory that has quotas instead of doing the full
>> reverse path walking.
>>
>> Note however that this requires an extra patch that adds a seqlock (1st
>> patch in series) to detect changes in the snaprealm hierarchy.
>>
>> Luis Henriques (3):
>>   ceph: quota: add initial infrastructure to support cephfs quotas
>>   ceph: quotas: support for ceph.quota.max_files
>>   ceph: quota: don't allow cross-quota renames
>>
>>  fs/ceph/Makefile                   |   2 +-
>>  fs/ceph/dir.c                      |  16 ++++
>>  fs/ceph/file.c                     |   4 +-
>>  fs/ceph/inode.c                    |   6 ++
>>  fs/ceph/mds_client.c               |  23 +++++
>>  fs/ceph/mds_client.h               |   2 +
>>  fs/ceph/quota.c                    | 190 +++++++++++++++++++++++++++++++++++++
>>  fs/ceph/super.h                    |  10 ++
>>  fs/ceph/xattr.c                    |  44 +++++++++
>>  include/linux/ceph/ceph_features.h |   3 +-
>>  include/linux/ceph/ceph_fs.h       |  17 ++++
>>  11 files changed, 314 insertions(+), 3 deletions(-)
>>  create mode 100644 fs/ceph/quota.c
>>
>
> A few minor comments; otherwise this series looks good.

Thanks a lot for your review.  I'll incorporate all those changes in the
next version (which will hopefully already include the max_bytes
implementation).

Cheers,
-- 
Luis


>
> Regards
> Yan, Zheng
>


end of thread, other threads:[~2017-12-21  9:33 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-20 15:18 [RFC PATCH v3 0/3] ceph: kernel client cephfs quota support Luis Henriques
2017-12-20 15:18 ` [RFC PATCH v3 1/3] ceph: quota: add initial infrastructure to support cephfs quotas Luis Henriques
2017-12-21  7:58   ` Yan, Zheng
2017-12-20 15:18 ` [RFC PATCH v3 2/3] ceph: quotas: support for ceph.quota.max_files Luis Henriques
2017-12-21  8:11   ` Yan, Zheng
2017-12-20 15:18 ` [RFC PATCH v3 3/3] ceph: quota: don't allow cross-quota renames Luis Henriques
2017-12-21  8:10   ` Yan, Zheng
2017-12-21  8:21 ` [RFC PATCH v3 0/3] ceph: kernel client cephfs quota support Yan, Zheng
2017-12-21  9:32   ` Luis Henriques
