All of lore.kernel.org
 help / color / mirror / Atom feed
* [v12 0/5] ext4: add project quota support
@ 2015-04-09 15:14 Li Xi
  2015-04-09 15:14 ` [v12 1/5] vfs: adds general codes to enforces project quota limits Li Xi
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Li Xi @ 2015-04-09 15:14 UTC (permalink / raw)
  To: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, tytso-3s7WtUTddSA,
	adilger-m1MBpc4rdrD3fQ9qLvQP4Q, jack-AlSwsSmVLrQ,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn,
	hch-wEGCiKHe2LqWVfeAwA7xHQ, dmonakhov-GEFAQzZX7r8dnm+yROfE0A

The following patches propose an implementation of project quota
support for ext4. A project is an aggregate of unrelated inodes
which might scatter in different directories. Inodes that belong
to the same project possess an identical identification i.e.
'project ID', just like every inode has its user/group
identification. The following patches add project quota as
supplement to the former uer/group quota types.

The semantics of ext4 project quota is consistent with XFS. Each
directory can have EXT4_INODE_PROJINHERIT flag set. When the
EXT4_INODE_PROJINHERIT flag of a parent directory is not set, a
newly created inode under that directory will have a default project
ID (i.e. 0). And its EXT4_INODE_PROJINHERIT flag is not set either.
When this flag is set on a directory, following rules will be kept:

1) The newly created inode under that directory will inherit both
the EXT4_INODE_PROJINHERIT flag and the project ID from its parent
directory.

2) Hard-linking a inode with different project ID into that directory
will fail with errno EXDEV.

3) Renaming a inode with different project ID into that directory
will fail with errno EXDEV. However, 'mv' command will detect this
failure and copy the renamed inode to a new inode in the directory.
Thus, this new inode will inherit both the project ID and
EXT4_INODE_PROJINHERIT flag.

4) If the project quota of that ID is being enforced, statfs() on
that directory will take the quotas as another upper limits along
with the capacity of the file system, i.e. the total block/inode
number will be the minimum of the quota limits and file system
capacity.

Changelog:
* v12 <- v11:
 - Relax the permission check when setting project ID.
* v11 <- v10:
 - Remove project quota mount option;
 - Fix permission check when setting project ID
* v10 <- v9:
 - Remove non-journaled project quota interface;
 - Only allow admin to read project quota info;
 - Cleanup FS_IOC_FSSETXATTR/FS_IOC_FSGETXATTR interface.
* v9 <- v8:
 - Remove non-journaled project quota;
 - Rebase to newest dev branch of ext4 repository (3.19.0-rc3).
* v8 <- v7:
 - Rebase to newest dev branch of ext4 repository (3.18.0_rc3).
* v7 <- v6:
 - Map ext4 inode flags to xflags of struct fsxattr;
 - Add patch to cleanup ext4 inode flag definitions.
* v6 <- v5:
 - Add project ID check for cross rename;
 - Remove patch of EXT4_IOC_GETPROJECT/EXT4_IOC_SETPROJECT ioctl
* v5 <- v4:
 - Check project feature when set/get project ID;
 - Do not check project feature for project quota;
 - Add support of FS_IOC_FSSETXATTR/FS_IOC_FSGETXATTR.
* v4 <- v3:
 - Do not check project feature when set/get project ID;
 - Use EXT4_MAXQUOTAS instead of MAXQUOTAS in ext4 patches;
 - Remove unnecessary change of fs/quota/dquot.c;
 - Remove CONFIG_QUOTA_PROJECT.
* v3 <- v2:
 - Add EXT4_INODE_PROJINHERIT semantics.
* v2 <- v1:
 - Add ioctl interface for setting/getting project;
 - Add EXT4_FEATURE_RO_COMPAT_PROJECT;
 - Add get_projid() method in struct dquot_operations;
 - Add error check of ext4_inode_projid_set/get().

v11: http://www.spinics.net/lists/linux-ext4/msg47450.html
v10: http://www.spinics.net/lists/linux-ext4/msg47413.html
v9: http://www.spinics.net/lists/linux-ext4/msg47326.html
v8: http://www.spinics.net/lists/linux-ext4/msg46545.html
v7: http://www.spinics.net/lists/linux-fsdevel/msg80404.html
v6: http://www.spinics.net/lists/linux-fsdevel/msg80022.html
v5: http://www.spinics.net/lists/linux-api/msg04840.html
v4: http://lwn.net/Articles/612972/
v3: http://www.spinics.net/lists/linux-ext4/msg45184.html
v2: http://www.spinics.net/lists/linux-ext4/msg44695.html
v1: http://article.gmane.org/gmane.comp.file-systems.ext4/45153

Any comments or feedbacks are appreciated.

Regards,
                                         - Li Xi

Li Xi (5):
  vfs: adds general codes to enforces project quota limits
  ext4: adds project ID support
  ext4: adds project quota support
  ext4: adds FS_IOC_FSSETXATTR/FS_IOC_FSGETXATTR interface support
  ext4: cleanup inode flag definitions

 fs/ext4/ext4.h             |   86 +++++++----
 fs/ext4/ialloc.c           |    5 +
 fs/ext4/inode.c            |   29 ++++
 fs/ext4/ioctl.c            |  361 +++++++++++++++++++++++++++++++++-----------
 fs/ext4/namei.c            |   18 +++
 fs/ext4/super.c            |   57 +++++++-
 fs/quota/dquot.c           |   35 ++++-
 fs/quota/quota.c           |    5 +-
 fs/quota/quotaio_v2.h      |    6 +-
 fs/xfs/xfs_fs.h            |   47 ++----
 include/linux/quota.h      |    2 +
 include/uapi/linux/fs.h    |   33 ++++
 include/uapi/linux/quota.h |    6 +-
 13 files changed, 527 insertions(+), 163 deletions(-)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [v12 1/5] vfs: adds general codes to enforces project quota limits
  2015-04-09 15:14 [v12 0/5] ext4: add project quota support Li Xi
@ 2015-04-09 15:14 ` Li Xi
       [not found]   ` <1428592477-8212-2-git-send-email-lixi-LfVdkaOWEx8@public.gmane.org>
  2015-04-09 15:14 ` [v12 2/5] ext4: adds project ID support Li Xi
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Li Xi @ 2015-04-09 15:14 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4, linux-api, tytso, adilger, jack, viro,
	hch, dmonakhov

This patch adds support for a new quota type PRJQUOTA for project quota
enforcement. Also a new method get_projid() is added into dquot_operations
structure.

Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/quota/dquot.c           |   35 ++++++++++++++++++++++++++++++-----
 fs/quota/quota.c           |    5 ++++-
 fs/quota/quotaio_v2.h      |    6 ++++--
 include/linux/quota.h      |    2 ++
 include/uapi/linux/quota.h |    6 ++++--
 5 files changed, 44 insertions(+), 10 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 8f0acef..a02bb68 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -1159,8 +1159,8 @@ static int need_print_warning(struct dquot_warn *warn)
 			return uid_eq(current_fsuid(), warn->w_dq_id.uid);
 		case GRPQUOTA:
 			return in_group_p(warn->w_dq_id.gid);
-		case PRJQUOTA:	/* Never taken... Just make gcc happy */
-			return 0;
+		case PRJQUOTA:
+			return 1;
 	}
 	return 0;
 }
@@ -1399,6 +1399,9 @@ static void __dquot_initialize(struct inode *inode, int type)
 	/* First get references to structures we might need. */
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
 		struct kqid qid;
+		kprojid_t projid;
+		int rc;
+
 		got[cnt] = NULL;
 		if (type != -1 && cnt != type)
 			continue;
@@ -1409,6 +1412,10 @@ static void __dquot_initialize(struct inode *inode, int type)
 		 */
 		if (i_dquot(inode)[cnt])
 			continue;
+
+		if (!sb_has_quota_active(sb, cnt))
+			continue;
+
 		init_needed = 1;
 
 		switch (cnt) {
@@ -1418,6 +1425,12 @@ static void __dquot_initialize(struct inode *inode, int type)
 		case GRPQUOTA:
 			qid = make_kqid_gid(inode->i_gid);
 			break;
+		case PRJQUOTA:
+			rc = inode->i_sb->dq_op->get_projid(inode, &projid);
+			if (rc)
+				continue;
+			qid = make_kqid_projid(projid);
+			break;
 		}
 		got[cnt] = dqget(sb, qid);
 	}
@@ -2161,7 +2174,8 @@ static int vfs_load_quota_inode(struct inode *inode, int type, int format_id,
 		error = -EROFS;
 		goto out_fmt;
 	}
-	if (!sb->s_op->quota_write || !sb->s_op->quota_read) {
+	if (!sb->s_op->quota_write || !sb->s_op->quota_read ||
+	    (type == PRJQUOTA && sb->dq_op->get_projid == NULL)) {
 		error = -EINVAL;
 		goto out_fmt;
 	}
@@ -2402,8 +2416,19 @@ static void do_get_dqblk(struct dquot *dquot, struct fs_disk_quota *di)
 
 	memset(di, 0, sizeof(*di));
 	di->d_version = FS_DQUOT_VERSION;
-	di->d_flags = dquot->dq_id.type == USRQUOTA ?
-			FS_USER_QUOTA : FS_GROUP_QUOTA;
+	switch (dquot->dq_id.type) {
+	case USRQUOTA:
+		di->d_flags = FS_USER_QUOTA;
+		break;
+	case GRPQUOTA:
+		di->d_flags = FS_GROUP_QUOTA;
+		break;
+	case PRJQUOTA:
+		di->d_flags = FS_PROJ_QUOTA;
+		break;
+	default:
+		BUG();
+	}
 	di->d_id = from_kqid_munged(current_user_ns(), dquot->dq_id);
 
 	spin_lock(&dq_data_lock);
diff --git a/fs/quota/quota.c b/fs/quota/quota.c
index 2aa4151..33b30b1 100644
--- a/fs/quota/quota.c
+++ b/fs/quota/quota.c
@@ -30,7 +30,10 @@ static int check_quotactl_permission(struct super_block *sb, int type, int cmd,
 	case Q_XGETQSTATV:
 	case Q_XQUOTASYNC:
 		break;
-	/* allow to query information for dquots we "own" */
+	/*
+	 * allow to query information for dquots we "own"
+	 * always allow querying project quota
+	 */
 	case Q_GETQUOTA:
 	case Q_XGETQUOTA:
 		if ((type == USRQUOTA && uid_eq(current_euid(), make_kuid(current_user_ns(), id))) ||
diff --git a/fs/quota/quotaio_v2.h b/fs/quota/quotaio_v2.h
index f1966b4..4e95430 100644
--- a/fs/quota/quotaio_v2.h
+++ b/fs/quota/quotaio_v2.h
@@ -13,12 +13,14 @@
  */
 #define V2_INITQMAGICS {\
 	0xd9c01f11,	/* USRQUOTA */\
-	0xd9c01927	/* GRPQUOTA */\
+	0xd9c01927,	/* GRPQUOTA */\
+	0xd9c03f14,	/* PRJQUOTA */\
 }
 
 #define V2_INITQVERSIONS {\
 	1,		/* USRQUOTA */\
-	1		/* GRPQUOTA */\
+	1,		/* GRPQUOTA */\
+	1,		/* PRJQUOTA */\
 }
 
 /* First generic header */
diff --git a/include/linux/quota.h b/include/linux/quota.h
index 50978b7..ba51f7e 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -50,6 +50,7 @@
 
 #undef USRQUOTA
 #undef GRPQUOTA
+#undef PRJQUOTA
 enum quota_type {
 	USRQUOTA = 0,		/* element used for user quotas */
 	GRPQUOTA = 1,		/* element used for group quotas */
@@ -317,6 +318,7 @@ struct dquot_operations {
 	/* get reserved quota for delayed alloc, value returned is managed by
 	 * quota code only */
 	qsize_t *(*get_reserved_space) (struct inode *);
+	int (*get_projid) (struct inode *, kprojid_t *);/* Get project ID */
 };
 
 struct path;
diff --git a/include/uapi/linux/quota.h b/include/uapi/linux/quota.h
index 3b6cfbe..b2d9486 100644
--- a/include/uapi/linux/quota.h
+++ b/include/uapi/linux/quota.h
@@ -36,11 +36,12 @@
 #include <linux/errno.h>
 #include <linux/types.h>
 
-#define __DQUOT_VERSION__	"dquot_6.5.2"
+#define __DQUOT_VERSION__	"dquot_6.6.0"
 
-#define MAXQUOTAS 2
+#define MAXQUOTAS 3
 #define USRQUOTA  0		/* element used for user quotas */
 #define GRPQUOTA  1		/* element used for group quotas */
+#define PRJQUOTA  2		/* element used for project quotas */
 
 /*
  * Definitions for the default names of the quotas files.
@@ -48,6 +49,7 @@
 #define INITQFNAMES { \
 	"user",    /* USRQUOTA */ \
 	"group",   /* GRPQUOTA */ \
+	"project", /* PRJQUOTA */ \
 	"undefined", \
 };
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [v12 2/5] ext4: adds project ID support
  2015-04-09 15:14 [v12 0/5] ext4: add project quota support Li Xi
  2015-04-09 15:14 ` [v12 1/5] vfs: adds general codes to enforces project quota limits Li Xi
@ 2015-04-09 15:14 ` Li Xi
       [not found]   ` <1428592477-8212-3-git-send-email-lixi-LfVdkaOWEx8@public.gmane.org>
  2015-04-09 15:14 ` [v12 3/5] ext4: adds project quota support Li Xi
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Li Xi @ 2015-04-09 15:14 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4, linux-api, tytso, adilger, jack, viro,
	hch, dmonakhov

This patch adds a new internal field of ext4 inode to save project
identifier. Also a new flag EXT4_INODE_PROJINHERIT is added for
inheriting project ID from parent directory.

Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/ext4.h          |   21 +++++++++++++++++----
 fs/ext4/ialloc.c        |    5 +++++
 fs/ext4/inode.c         |   29 +++++++++++++++++++++++++++++
 fs/ext4/namei.c         |   18 ++++++++++++++++++
 fs/ext4/super.c         |    1 +
 include/uapi/linux/fs.h |    1 +
 6 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 7fec2ef..7acb2da 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -378,16 +378,18 @@ struct flex_groups {
 #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
 #define EXT4_EOFBLOCKS_FL		0x00400000 /* Blocks allocated beyond EOF */
 #define EXT4_INLINE_DATA_FL		0x10000000 /* Inode has inline data. */
+#define EXT4_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
 #define EXT4_RESERVED_FL		0x80000000 /* reserved for ext4 lib */
 
-#define EXT4_FL_USER_VISIBLE		0x004BDFFF /* User visible flags */
-#define EXT4_FL_USER_MODIFIABLE		0x004380FF /* User modifiable flags */
+#define EXT4_FL_USER_VISIBLE		0x204BDFFF /* User visible flags */
+#define EXT4_FL_USER_MODIFIABLE		0x204380FF /* User modifiable flags */
 
 /* Flags that should be inherited by new inodes from their parent. */
 #define EXT4_FL_INHERITED (EXT4_SECRM_FL | EXT4_UNRM_FL | EXT4_COMPR_FL |\
 			   EXT4_SYNC_FL | EXT4_NODUMP_FL | EXT4_NOATIME_FL |\
 			   EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
-			   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL)
+			   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL |\
+			   EXT4_PROJINHERIT_FL)
 
 /* Flags that are appropriate for regular files (all but dir-specific ones). */
 #define EXT4_REG_FLMASK (~(EXT4_DIRSYNC_FL | EXT4_TOPDIR_FL))
@@ -435,6 +437,7 @@ enum {
 	EXT4_INODE_EA_INODE	= 21,	/* Inode used for large EA */
 	EXT4_INODE_EOFBLOCKS	= 22,	/* Blocks allocated beyond EOF */
 	EXT4_INODE_INLINE_DATA	= 28,	/* Data in inode. */
+	EXT4_INODE_PROJINHERIT	= 29,	/* Create with parents projid */
 	EXT4_INODE_RESERVED	= 31,	/* reserved for ext4 lib */
 };
 
@@ -684,6 +687,7 @@ struct ext4_inode {
 	__le32  i_crtime;       /* File Creation time */
 	__le32  i_crtime_extra; /* extra FileCreationtime (nsec << 2 | epoch) */
 	__le32  i_version_hi;	/* high 32 bits for 64-bit version */
+	__le32  i_projid;	/* Project ID */
 };
 
 struct move_extent {
@@ -939,6 +943,7 @@ struct ext4_inode_info {
 
 	/* Precomputed uuid+inum+igen checksum for seeding inode checksums */
 	__u32 i_csum_seed;
+	kprojid_t i_projid;
 };
 
 /*
@@ -1531,6 +1536,7 @@ static inline void ext4_clear_state_flags(struct ext4_inode_info *ei)
  */
 #define EXT4_FEATURE_RO_COMPAT_METADATA_CSUM	0x0400
 #define EXT4_FEATURE_RO_COMPAT_READONLY		0x1000
+#define EXT4_FEATURE_RO_COMPAT_PROJECT		0x2000
 
 #define EXT4_FEATURE_INCOMPAT_COMPRESSION	0x0001
 #define EXT4_FEATURE_INCOMPAT_FILETYPE		0x0002
@@ -1581,7 +1587,8 @@ static inline void ext4_clear_state_flags(struct ext4_inode_info *ei)
 					 EXT4_FEATURE_RO_COMPAT_HUGE_FILE |\
 					 EXT4_FEATURE_RO_COMPAT_BIGALLOC |\
 					 EXT4_FEATURE_RO_COMPAT_METADATA_CSUM|\
-					 EXT4_FEATURE_RO_COMPAT_QUOTA)
+					 EXT4_FEATURE_RO_COMPAT_QUOTA |\
+					 EXT4_FEATURE_RO_COMPAT_PROJECT)
 
 /*
  * Default values for user and/or group using reserved blocks
@@ -1589,6 +1596,11 @@ static inline void ext4_clear_state_flags(struct ext4_inode_info *ei)
 #define	EXT4_DEF_RESUID		0
 #define	EXT4_DEF_RESGID		0
 
+/*
+ * Default project ID
+ */
+#define	EXT4_DEF_PROJID		0
+
 #define EXT4_DEF_INODE_READAHEAD_BLKS	32
 
 /*
@@ -2141,6 +2153,7 @@ extern int ext4_zero_partial_blocks(handle_t *handle, struct inode *inode,
 			     loff_t lstart, loff_t lend);
 extern int ext4_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf);
 extern qsize_t *ext4_get_reserved_space(struct inode *inode);
+extern int ext4_get_projid(struct inode *inode, kprojid_t *projid);
 extern void ext4_da_update_reserve_space(struct inode *inode,
 					int used, int quota_claim);
 
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index ac644c3..10ca9dd 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -756,6 +756,11 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir,
 		inode->i_gid = dir->i_gid;
 	} else
 		inode_init_owner(inode, dir, mode);
+	if (EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_PROJECT) &&
+	    ext4_test_inode_flag(dir, EXT4_INODE_PROJINHERIT))
+		ei->i_projid = EXT4_I(dir)->i_projid;
+	else
+		ei->i_projid = make_kprojid(&init_user_ns, EXT4_DEF_PROJID);
 	dquot_initialize(inode);
 
 	if (!goal)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 4df6d01..6e4833f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3870,6 +3870,14 @@ static inline void ext4_iget_extra_inode(struct inode *inode,
 		EXT4_I(inode)->i_inline_off = 0;
 }
 
+int ext4_get_projid(struct inode *inode, kprojid_t *projid)
+{
+	if (!EXT4_HAS_RO_COMPAT_FEATURE(inode->i_sb, EXT4_FEATURE_RO_COMPAT_PROJECT))
+		return -EOPNOTSUPP;
+	*projid = EXT4_I(inode)->i_projid;
+	return 0;
+}
+
 struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
 {
 	struct ext4_iloc iloc;
@@ -3881,6 +3889,7 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
 	int block;
 	uid_t i_uid;
 	gid_t i_gid;
+	projid_t i_projid;
 
 	inode = iget_locked(sb, ino);
 	if (!inode)
@@ -3930,12 +3939,18 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
 	inode->i_mode = le16_to_cpu(raw_inode->i_mode);
 	i_uid = (uid_t)le16_to_cpu(raw_inode->i_uid_low);
 	i_gid = (gid_t)le16_to_cpu(raw_inode->i_gid_low);
+	if (EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_PROJECT))
+		i_projid = (projid_t)le32_to_cpu(raw_inode->i_projid);
+	else
+		i_projid = EXT4_DEF_PROJID;
+
 	if (!(test_opt(inode->i_sb, NO_UID32))) {
 		i_uid |= le16_to_cpu(raw_inode->i_uid_high) << 16;
 		i_gid |= le16_to_cpu(raw_inode->i_gid_high) << 16;
 	}
 	i_uid_write(inode, i_uid);
 	i_gid_write(inode, i_gid);
+	ei->i_projid = make_kprojid(&init_user_ns, i_projid);;
 	set_nlink(inode, le16_to_cpu(raw_inode->i_links_count));
 
 	ext4_clear_state_flags(ei);	/* Only relevant on 32-bit archs */
@@ -4165,6 +4180,7 @@ static int ext4_do_update_inode(handle_t *handle,
 	int need_datasync = 0, set_large_file = 0;
 	uid_t i_uid;
 	gid_t i_gid;
+	projid_t i_projid;
 
 	spin_lock(&ei->i_raw_lock);
 
@@ -4177,6 +4193,7 @@ static int ext4_do_update_inode(handle_t *handle,
 	raw_inode->i_mode = cpu_to_le16(inode->i_mode);
 	i_uid = i_uid_read(inode);
 	i_gid = i_gid_read(inode);
+	i_projid = from_kprojid(&init_user_ns, ei->i_projid);
 	if (!(test_opt(inode->i_sb, NO_UID32))) {
 		raw_inode->i_uid_low = cpu_to_le16(low_16_bits(i_uid));
 		raw_inode->i_gid_low = cpu_to_le16(low_16_bits(i_gid));
@@ -4256,6 +4273,18 @@ static int ext4_do_update_inode(handle_t *handle,
 		}
 	}
 
+	BUG_ON(!EXT4_HAS_RO_COMPAT_FEATURE(inode->i_sb,
+			EXT4_FEATURE_RO_COMPAT_PROJECT) &&
+	       i_projid != EXT4_DEF_PROJID);
+	if (i_projid != EXT4_DEF_PROJID &&
+	    (EXT4_INODE_SIZE(inode->i_sb) <= EXT4_GOOD_OLD_INODE_SIZE ||
+	     (!EXT4_FITS_IN_INODE(raw_inode, ei, i_projid)))) {
+		spin_unlock(&ei->i_raw_lock);
+		err = -EFBIG;
+		goto out_brelse;
+	}
+	raw_inode->i_projid = cpu_to_le32(i_projid);
+
 	ext4_inode_csum_set(inode, raw_inode, ei);
 
 	spin_unlock(&ei->i_raw_lock);
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 2291923..63a9623 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2938,6 +2938,11 @@ static int ext4_link(struct dentry *old_dentry,
 	if (inode->i_nlink >= EXT4_LINK_MAX)
 		return -EMLINK;
 
+	if ((ext4_test_inode_flag(dir, EXT4_INODE_PROJINHERIT)) &&
+	    (!projid_eq(EXT4_I(dir)->i_projid,
+			EXT4_I(old_dentry->d_inode)->i_projid)))
+		return -EXDEV;
+
 	dquot_initialize(dir);
 
 retry:
@@ -3217,6 +3222,11 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
 	int credits;
 	u8 old_file_type;
 
+	if ((ext4_test_inode_flag(new_dir, EXT4_INODE_PROJINHERIT)) &&
+	    (!projid_eq(EXT4_I(new_dir)->i_projid,
+			EXT4_I(old_dentry->d_inode)->i_projid)))
+		return -EXDEV;
+
 	dquot_initialize(old.dir);
 	dquot_initialize(new.dir);
 
@@ -3395,6 +3405,14 @@ static int ext4_cross_rename(struct inode *old_dir, struct dentry *old_dentry,
 	u8 new_file_type;
 	int retval;
 
+	if ((ext4_test_inode_flag(new_dir, EXT4_INODE_PROJINHERIT) &&
+	     !projid_eq(EXT4_I(new_dir)->i_projid,
+			EXT4_I(old_dentry->d_inode)->i_projid)) ||
+	    (ext4_test_inode_flag(old_dir, EXT4_INODE_PROJINHERIT) &&
+	     !projid_eq(EXT4_I(old_dir)->i_projid,
+			EXT4_I(new_dentry->d_inode)->i_projid)))
+		return -EXDEV;
+
 	dquot_initialize(old.dir);
 	dquot_initialize(new.dir);
 
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index bff3427..04c6cc3 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1073,6 +1073,7 @@ static const struct dquot_operations ext4_quota_operations = {
 	.write_info	= ext4_write_info,
 	.alloc_dquot	= dquot_alloc,
 	.destroy_dquot	= dquot_destroy,
+	.get_projid	= ext4_get_projid,
 };
 
 static const struct quotactl_ops ext4_qctl_operations = {
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 3735fa0..fcbf647 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -195,6 +195,7 @@ struct inodes_stat_t {
 #define FS_EXTENT_FL			0x00080000 /* Extents */
 #define FS_DIRECTIO_FL			0x00100000 /* Use direct i/o */
 #define FS_NOCOW_FL			0x00800000 /* Do not cow file */
+#define FS_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
 #define FS_RESERVED_FL			0x80000000 /* reserved for ext2 lib */
 
 #define FS_FL_USER_VISIBLE		0x0003DFFF /* User visible flags */
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [v12 3/5] ext4: adds project quota support
  2015-04-09 15:14 [v12 0/5] ext4: add project quota support Li Xi
  2015-04-09 15:14 ` [v12 1/5] vfs: adds general codes to enforces project quota limits Li Xi
  2015-04-09 15:14 ` [v12 2/5] ext4: adds project ID support Li Xi
@ 2015-04-09 15:14 ` Li Xi
  2015-04-09 15:14 ` [v12 4/5] ext4: adds FS_IOC_FSSETXATTR/FS_IOC_FSGETXATTR interface support Li Xi
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Li Xi @ 2015-04-09 15:14 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4, linux-api, tytso, adilger, jack, viro,
	hch, dmonakhov

This patch adds mount options for enabling/disabling project quota
accounting and enforcement. A new specific inode is also used for
project quota accounting.

Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/ext4.h  |    6 +++-
 fs/ext4/super.c |   56 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 55 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 7acb2da..8ddc723 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1169,7 +1169,8 @@ struct ext4_super_block {
 	__le32	s_overhead_clusters;	/* overhead blocks/clusters in fs */
 	__le32	s_backup_bgs[2];	/* groups with sparse_super2 SBs */
 	__u8	s_encrypt_algos[4];	/* Encryption algorithms in use  */
-	__le32	s_reserved[105];	/* Padding to the end of the block */
+	__le32  s_prj_quota_inum;       /* inode for tracking project quota */
+	__le32	s_reserved[104];	/* Padding to the end of the block */
 	__le32	s_checksum;		/* crc32c(superblock) */
 };
 
@@ -1184,7 +1185,7 @@ struct ext4_super_block {
 #define EXT4_MF_FS_ABORTED	0x0002	/* Fatal error detected */
 
 /* Number of quota types we support */
-#define EXT4_MAXQUOTAS 2
+#define EXT4_MAXQUOTAS 3
 
 /*
  * fourth extended-fs super-block data in memory
@@ -1376,6 +1377,7 @@ static inline int ext4_valid_inum(struct super_block *sb, unsigned long ino)
 		ino == EXT4_BOOT_LOADER_INO ||
 		ino == EXT4_JOURNAL_INO ||
 		ino == EXT4_RESIZE_INO ||
+		ino == le32_to_cpu(EXT4_SB(sb)->s_es->s_prj_quota_inum) ||
 		(ino >= EXT4_FIRST_INO(sb) &&
 		 ino <= le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count));
 }
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 04c6cc3..476e46f 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1036,8 +1036,8 @@ static int bdev_try_to_free_page(struct super_block *sb, struct page *page,
 }
 
 #ifdef CONFIG_QUOTA
-#define QTYPE2NAME(t) ((t) == USRQUOTA ? "user" : "group")
-#define QTYPE2MOPT(on, t) ((t) == USRQUOTA?((on)##USRJQUOTA):((on)##GRPJQUOTA))
+static char *quotatypes[] = INITQFNAMES;
+#define QTYPE2NAME(t) (quotatypes[t])
 
 static int ext4_write_dquot(struct dquot *dquot);
 static int ext4_acquire_dquot(struct dquot *dquot);
@@ -3944,7 +3944,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 		sb->s_qcop = &ext4_qctl_sysfile_operations;
 	else
 		sb->s_qcop = &ext4_qctl_operations;
-	sb->s_quota_types = QTYPE_MASK_USR | QTYPE_MASK_GRP;
+	sb->s_quota_types = QTYPE_MASK_USR | QTYPE_MASK_GRP | QTYPE_MASK_PRJ;
 #endif
 	memcpy(sb->s_uuid, es->s_uuid, sizeof(es->s_uuid));
 
@@ -5040,6 +5040,46 @@ restore_opts:
 	return err;
 }
 
+static int ext4_statfs_project(struct super_block *sb,
+			       kprojid_t projid, struct kstatfs *buf)
+{
+	struct kqid qid;
+	struct dquot *dquot;
+	u64 limit;
+	u64 curblock;
+
+	qid = make_kqid_projid(projid);
+	dquot = dqget(sb, qid);
+	if (!dquot)
+		return -ESRCH;
+	spin_lock(&dq_data_lock);
+
+	limit = dquot->dq_dqb.dqb_bsoftlimit ?
+		dquot->dq_dqb.dqb_bsoftlimit :
+		dquot->dq_dqb.dqb_bhardlimit;
+	if (limit && buf->f_blocks * buf->f_bsize > limit) {
+		curblock = dquot->dq_dqb.dqb_curspace / buf->f_bsize;
+		buf->f_blocks = limit / buf->f_bsize;
+		buf->f_bfree = buf->f_bavail =
+			(buf->f_blocks > curblock) ?
+			 (buf->f_blocks - curblock) : 0;
+	}
+
+	limit = dquot->dq_dqb.dqb_isoftlimit ?
+		dquot->dq_dqb.dqb_isoftlimit :
+		dquot->dq_dqb.dqb_ihardlimit;
+	if (limit && buf->f_files > limit) {
+		buf->f_files = limit;
+		buf->f_ffree =
+			(buf->f_files > dquot->dq_dqb.dqb_curinodes) ?
+			 (buf->f_files - dquot->dq_dqb.dqb_curinodes) : 0;
+	}
+
+	spin_unlock(&dq_data_lock);
+	dqput(dquot);
+	return 0;
+}
+
 static int ext4_statfs(struct dentry *dentry, struct kstatfs *buf)
 {
 	struct super_block *sb = dentry->d_sb;
@@ -5048,6 +5088,7 @@ static int ext4_statfs(struct dentry *dentry, struct kstatfs *buf)
 	ext4_fsblk_t overhead = 0, resv_blocks;
 	u64 fsid;
 	s64 bfree;
+	struct inode *inode = dentry->d_inode;
 	resv_blocks = EXT4_C2B(sbi, atomic64_read(&sbi->s_resv_clusters));
 
 	if (!test_opt(sb, MINIX_DF))
@@ -5072,6 +5113,9 @@ static int ext4_statfs(struct dentry *dentry, struct kstatfs *buf)
 	buf->f_fsid.val[0] = fsid & 0xFFFFFFFFUL;
 	buf->f_fsid.val[1] = (fsid >> 32) & 0xFFFFFFFFUL;
 
+	if (ext4_test_inode_flag(inode, EXT4_INODE_PROJINHERIT) &&
+	    sb_has_quota_limits_enabled(sb, PRJQUOTA))
+		ext4_statfs_project(sb, EXT4_I(inode)->i_projid, buf);
 	return 0;
 }
 
@@ -5236,7 +5280,8 @@ static int ext4_quota_enable(struct super_block *sb, int type, int format_id,
 	struct inode *qf_inode;
 	unsigned long qf_inums[EXT4_MAXQUOTAS] = {
 		le32_to_cpu(EXT4_SB(sb)->s_es->s_usr_quota_inum),
-		le32_to_cpu(EXT4_SB(sb)->s_es->s_grp_quota_inum)
+		le32_to_cpu(EXT4_SB(sb)->s_es->s_grp_quota_inum),
+		le32_to_cpu(EXT4_SB(sb)->s_es->s_prj_quota_inum)
 	};
 
 	BUG_ON(!EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_QUOTA));
@@ -5264,7 +5309,8 @@ static int ext4_enable_quotas(struct super_block *sb)
 	int type, err = 0;
 	unsigned long qf_inums[EXT4_MAXQUOTAS] = {
 		le32_to_cpu(EXT4_SB(sb)->s_es->s_usr_quota_inum),
-		le32_to_cpu(EXT4_SB(sb)->s_es->s_grp_quota_inum)
+		le32_to_cpu(EXT4_SB(sb)->s_es->s_grp_quota_inum),
+		le32_to_cpu(EXT4_SB(sb)->s_es->s_prj_quota_inum)
 	};
 
 	sb_dqopt(sb)->flags |= DQUOT_QUOTA_SYS_FILE;
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [v12 4/5] ext4: adds FS_IOC_FSSETXATTR/FS_IOC_FSGETXATTR interface support
  2015-04-09 15:14 [v12 0/5] ext4: add project quota support Li Xi
                   ` (2 preceding siblings ...)
  2015-04-09 15:14 ` [v12 3/5] ext4: adds project quota support Li Xi
@ 2015-04-09 15:14 ` Li Xi
  2015-04-09 15:14 ` [v12 5/5] ext4: cleanup inode flag definitions Li Xi
       [not found] ` <1428592477-8212-1-git-send-email-lixi-LfVdkaOWEx8@public.gmane.org>
  5 siblings, 0 replies; 14+ messages in thread
From: Li Xi @ 2015-04-09 15:14 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4, linux-api, tytso, adilger, jack, viro,
	hch, dmonakhov

This patch adds FS_IOC_FSSETXATTR/FS_IOC_FSGETXATTR ioctl interface
support for ext4. The interface is kept consistent with
XFS_IOC_FSGETXATTR/XFS_IOC_FSGETXATTR.

Signed-off-by: Li Xi <lixi@ddn.com>
---
 fs/ext4/ext4.h          |    9 ++
 fs/ext4/ioctl.c         |  361 +++++++++++++++++++++++++++++++++++-----------
 fs/xfs/xfs_fs.h         |   47 +++----
 include/uapi/linux/fs.h |   32 ++++
 4 files changed, 332 insertions(+), 117 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 8ddc723..377fec0 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -384,6 +384,13 @@ struct flex_groups {
 #define EXT4_FL_USER_VISIBLE		0x204BDFFF /* User visible flags */
 #define EXT4_FL_USER_MODIFIABLE		0x204380FF /* User modifiable flags */
 
+#define EXT4_FL_XFLAG_VISIBLE		(EXT4_SYNC_FL | \
+					 EXT4_IMMUTABLE_FL | \
+					 EXT4_APPEND_FL | \
+					 EXT4_NODUMP_FL | \
+					 EXT4_NOATIME_FL | \
+					 EXT4_PROJINHERIT_FL)
+
 /* Flags that should be inherited by new inodes from their parent. */
 #define EXT4_FL_INHERITED (EXT4_SECRM_FL | EXT4_UNRM_FL | EXT4_COMPR_FL |\
 			   EXT4_SYNC_FL | EXT4_NODUMP_FL | EXT4_NOATIME_FL |\
@@ -606,6 +613,8 @@ enum {
 #define EXT4_IOC_RESIZE_FS		_IOW('f', 16, __u64)
 #define EXT4_IOC_SWAP_BOOT		_IO('f', 17)
 #define EXT4_IOC_PRECACHE_EXTENTS	_IO('f', 18)
+#define EXT4_IOC_FSGETXATTR		FS_IOC_FSGETXATTR
+#define EXT4_IOC_FSSETXATTR		FS_IOC_FSSETXATTR
 
 #if defined(__KERNEL__) && defined(CONFIG_COMPAT)
 /*
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index f58a0d1..b1e40ca 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -14,6 +14,8 @@
 #include <linux/compat.h>
 #include <linux/mount.h>
 #include <linux/file.h>
+#include <linux/quotaops.h>
+#include <linux/quota.h>
 #include <asm/uaccess.h>
 #include "ext4_jbd2.h"
 #include "ext4.h"
@@ -196,6 +198,222 @@ journal_err_out:
 	return err;
 }
 
+static int ext4_ioctl_setflags(struct inode *inode,
+			       unsigned int flags)
+{
+	struct ext4_inode_info *ei = EXT4_I(inode);
+	handle_t *handle = NULL;
+	int err = EPERM, migrate = 0;
+	struct ext4_iloc iloc;
+	unsigned int oldflags, mask, i;
+	unsigned int jflag;
+
+	/* Is it quota file? Do not allow user to mess with it */
+	if (IS_NOQUOTA(inode))
+		goto flags_out;
+
+	oldflags = ei->i_flags;
+
+	/* The JOURNAL_DATA flag is modifiable only by root */
+	jflag = flags & EXT4_JOURNAL_DATA_FL;
+
+	/*
+	 * The IMMUTABLE and APPEND_ONLY flags can only be changed by
+	 * the relevant capability.
+	 *
+	 * This test looks nicer. Thanks to Pauline Middelink
+	 */
+	if ((flags ^ oldflags) & (EXT4_APPEND_FL | EXT4_IMMUTABLE_FL)) {
+		if (!capable(CAP_LINUX_IMMUTABLE))
+			goto flags_out;
+	}
+
+	/*
+	 * The JOURNAL_DATA flag can only be changed by
+	 * the relevant capability.
+	 */
+	if ((jflag ^ oldflags) & (EXT4_JOURNAL_DATA_FL)) {
+		if (!capable(CAP_SYS_RESOURCE))
+			goto flags_out;
+	}
+	if ((flags ^ oldflags) & EXT4_EXTENTS_FL)
+		migrate = 1;
+
+	if (flags & EXT4_EOFBLOCKS_FL) {
+		/* we don't support adding EOFBLOCKS flag */
+		if (!(oldflags & EXT4_EOFBLOCKS_FL)) {
+			err = -EOPNOTSUPP;
+			goto flags_out;
+		}
+	} else if (oldflags & EXT4_EOFBLOCKS_FL)
+		ext4_truncate(inode);
+
+	handle = ext4_journal_start(inode, EXT4_HT_INODE, 1);
+	if (IS_ERR(handle)) {
+		err = PTR_ERR(handle);
+		goto flags_out;
+	}
+	if (IS_SYNC(inode))
+		ext4_handle_sync(handle);
+	err = ext4_reserve_inode_write(handle, inode, &iloc);
+	if (err)
+		goto flags_err;
+
+	for (i = 0, mask = 1; i < 32; i++, mask <<= 1) {
+		if (!(mask & EXT4_FL_USER_MODIFIABLE))
+			continue;
+		if (mask & flags)
+			ext4_set_inode_flag(inode, i);
+		else
+			ext4_clear_inode_flag(inode, i);
+	}
+
+	ext4_set_inode_flags(inode);
+	inode->i_ctime = ext4_current_time(inode);
+
+	err = ext4_mark_iloc_dirty(handle, inode, &iloc);
+flags_err:
+	ext4_journal_stop(handle);
+	if (err)
+		goto flags_out;
+
+	if ((jflag ^ oldflags) & (EXT4_JOURNAL_DATA_FL))
+		err = ext4_change_inode_journal_flag(inode, jflag);
+	if (err)
+		goto flags_out;
+	if (migrate) {
+		if (flags & EXT4_EXTENTS_FL)
+			err = ext4_ext_migrate(inode);
+		else
+			err = ext4_ind_migrate(inode);
+	}
+
+flags_out:
+	return err;
+}
+
+static int ext4_ioctl_setproject(struct file *filp, __u32 projid)
+{
+	struct inode *inode = file_inode(filp);
+	struct super_block *sb = inode->i_sb;
+	struct ext4_inode_info *ei = EXT4_I(inode);
+	int err;
+	handle_t *handle;
+	kprojid_t kprojid;
+	struct ext4_iloc iloc;
+	struct ext4_inode *raw_inode;
+
+	struct dquot *transfer_to[EXT4_MAXQUOTAS] = { };
+
+	if (!EXT4_HAS_RO_COMPAT_FEATURE(sb,
+			EXT4_FEATURE_RO_COMPAT_PROJECT)) {
+		BUG_ON(__kprojid_val(EXT4_I(inode)->i_projid)
+		       != EXT4_DEF_PROJID);
+		if (projid != EXT4_DEF_PROJID)
+			return -EOPNOTSUPP;
+		else
+			return 0;
+	}
+
+	kprojid = make_kprojid(&init_user_ns, (projid_t)projid);
+
+	if (projid_eq(kprojid, EXT4_I(inode)->i_projid))
+		return 0;
+
+	err = mnt_want_write_file(filp);
+	if (err)
+		return err;
+
+	err = -EPERM;
+	mutex_lock(&inode->i_mutex);
+	/* Is it quota file? Do not allow user to mess with it */
+	if (IS_NOQUOTA(inode))
+		goto project_out;
+
+	dquot_initialize(inode);
+
+	handle = ext4_journal_start(inode, EXT4_HT_QUOTA,
+		EXT4_QUOTA_INIT_BLOCKS(sb) +
+		EXT4_QUOTA_DEL_BLOCKS(sb) + 3);
+	if (IS_ERR(handle)) {
+		err = PTR_ERR(handle);
+		goto project_out;
+	}
+
+	err = ext4_reserve_inode_write(handle, inode, &iloc);
+	if (err)
+		goto project_stop;
+
+	raw_inode = ext4_raw_inode(&iloc);
+	if ((EXT4_INODE_SIZE(sb) <=
+	     EXT4_GOOD_OLD_INODE_SIZE) ||
+	    (!EXT4_FITS_IN_INODE(raw_inode, ei, i_projid))) {
+	    	err = -EFBIG;
+	    	goto project_stop;
+	}
+
+	transfer_to[PRJQUOTA] = dqget(sb, make_kqid_projid(kprojid));
+	if (!transfer_to[PRJQUOTA])
+		goto project_set;
+
+	err = __dquot_transfer(inode, transfer_to);
+	dqput(transfer_to[PRJQUOTA]);
+	if (err)
+		goto project_stop;
+
+project_set:
+	EXT4_I(inode)->i_projid = kprojid;
+	inode->i_ctime = ext4_current_time(inode);
+	err = ext4_mark_iloc_dirty(handle, inode, &iloc);
+project_stop:
+	ext4_journal_stop(handle);
+project_out:
+	mutex_unlock(&inode->i_mutex);
+	mnt_drop_write_file(filp);
+	return err;
+}
+
+/* Transfer internal flags to xflags */
+static inline __u32 ext4_iflags_to_xflags(unsigned long iflags)
+{
+	__u32 xflags = 0;
+
+	if (iflags & EXT4_SYNC_FL)
+		xflags |= FS_XFLAG_SYNC;
+	if (iflags & EXT4_IMMUTABLE_FL)
+		xflags |= FS_XFLAG_IMMUTABLE;
+	if (iflags & EXT4_APPEND_FL)
+		xflags |= FS_XFLAG_APPEND;
+	if (iflags & EXT4_NODUMP_FL)
+		xflags |= FS_XFLAG_NODUMP;
+	if (iflags & EXT4_NOATIME_FL)
+		xflags |= FS_XFLAG_NOATIME;
+	if (iflags & EXT4_PROJINHERIT_FL)
+		xflags |= FS_XFLAG_PROJINHERIT;
+	return xflags;
+}
+
+/* Transfer xflags flags to internal */
+static inline unsigned long ext4_xflags_to_iflags(__u32 xflags)
+{
+	unsigned long iflags = 0;
+
+	if (xflags & FS_XFLAG_SYNC)
+		iflags |= EXT4_SYNC_FL;
+	if (xflags & FS_XFLAG_IMMUTABLE)
+		iflags |= EXT4_IMMUTABLE_FL;
+	if (xflags & FS_XFLAG_APPEND)
+		iflags |= EXT4_APPEND_FL;
+	if (xflags & FS_XFLAG_NODUMP)
+		iflags |= EXT4_NODUMP_FL;
+	if (xflags & FS_XFLAG_NOATIME)
+		iflags |= EXT4_NOATIME_FL;
+	if (xflags & FS_XFLAG_PROJINHERIT)
+		iflags |= EXT4_PROJINHERIT_FL;
+
+	return iflags;
+}
+
 long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 {
 	struct inode *inode = file_inode(filp);
@@ -211,11 +429,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 		flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
 		return put_user(flags, (int __user *) arg);
 	case EXT4_IOC_SETFLAGS: {
-		handle_t *handle = NULL;
-		int err, migrate = 0;
-		struct ext4_iloc iloc;
-		unsigned int oldflags, mask, i;
-		unsigned int jflag;
+		int err;
 
 		if (!inode_owner_or_capable(inode))
 			return -EACCES;
@@ -229,89 +443,8 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 
 		flags = ext4_mask_flags(inode->i_mode, flags);
 
-		err = -EPERM;
 		mutex_lock(&inode->i_mutex);
-		/* Is it quota file? Do not allow user to mess with it */
-		if (IS_NOQUOTA(inode))
-			goto flags_out;
-
-		oldflags = ei->i_flags;
-
-		/* The JOURNAL_DATA flag is modifiable only by root */
-		jflag = flags & EXT4_JOURNAL_DATA_FL;
-
-		/*
-		 * The IMMUTABLE and APPEND_ONLY flags can only be changed by
-		 * the relevant capability.
-		 *
-		 * This test looks nicer. Thanks to Pauline Middelink
-		 */
-		if ((flags ^ oldflags) & (EXT4_APPEND_FL | EXT4_IMMUTABLE_FL)) {
-			if (!capable(CAP_LINUX_IMMUTABLE))
-				goto flags_out;
-		}
-
-		/*
-		 * The JOURNAL_DATA flag can only be changed by
-		 * the relevant capability.
-		 */
-		if ((jflag ^ oldflags) & (EXT4_JOURNAL_DATA_FL)) {
-			if (!capable(CAP_SYS_RESOURCE))
-				goto flags_out;
-		}
-		if ((flags ^ oldflags) & EXT4_EXTENTS_FL)
-			migrate = 1;
-
-		if (flags & EXT4_EOFBLOCKS_FL) {
-			/* we don't support adding EOFBLOCKS flag */
-			if (!(oldflags & EXT4_EOFBLOCKS_FL)) {
-				err = -EOPNOTSUPP;
-				goto flags_out;
-			}
-		} else if (oldflags & EXT4_EOFBLOCKS_FL)
-			ext4_truncate(inode);
-
-		handle = ext4_journal_start(inode, EXT4_HT_INODE, 1);
-		if (IS_ERR(handle)) {
-			err = PTR_ERR(handle);
-			goto flags_out;
-		}
-		if (IS_SYNC(inode))
-			ext4_handle_sync(handle);
-		err = ext4_reserve_inode_write(handle, inode, &iloc);
-		if (err)
-			goto flags_err;
-
-		for (i = 0, mask = 1; i < 32; i++, mask <<= 1) {
-			if (!(mask & EXT4_FL_USER_MODIFIABLE))
-				continue;
-			if (mask & flags)
-				ext4_set_inode_flag(inode, i);
-			else
-				ext4_clear_inode_flag(inode, i);
-		}
-
-		ext4_set_inode_flags(inode);
-		inode->i_ctime = ext4_current_time(inode);
-
-		err = ext4_mark_iloc_dirty(handle, inode, &iloc);
-flags_err:
-		ext4_journal_stop(handle);
-		if (err)
-			goto flags_out;
-
-		if ((jflag ^ oldflags) & (EXT4_JOURNAL_DATA_FL))
-			err = ext4_change_inode_journal_flag(inode, jflag);
-		if (err)
-			goto flags_out;
-		if (migrate) {
-			if (flags & EXT4_EXTENTS_FL)
-				err = ext4_ext_migrate(inode);
-			else
-				err = ext4_ind_migrate(inode);
-		}
-
-flags_out:
+		err = ext4_ioctl_setflags(inode, flags);
 		mutex_unlock(&inode->i_mutex);
 		mnt_drop_write_file(filp);
 		return err;
@@ -615,7 +748,61 @@ resizefs_out:
 	}
 	case EXT4_IOC_PRECACHE_EXTENTS:
 		return ext4_ext_precache(inode);
+	case EXT4_IOC_FSGETXATTR:
+	{
+		struct fsxattr fa;
+
+		memset(&fa, 0, sizeof(struct fsxattr));
 
+		ext4_get_inode_flags(ei);
+		fa.fsx_xflags = ext4_iflags_to_xflags(ei->i_flags & EXT4_FL_USER_VISIBLE);
+
+		if (EXT4_HAS_RO_COMPAT_FEATURE(inode->i_sb,
+				EXT4_FEATURE_RO_COMPAT_PROJECT)) {
+			fa.fsx_projid = (__u32)from_kprojid(&init_user_ns,
+				EXT4_I(inode)->i_projid);
+		}
+
+		if (copy_to_user((struct fsxattr __user *)arg,
+				 &fa, sizeof(fa)))
+			return -EFAULT;
+		return 0;
+	}
+	case EXT4_IOC_FSSETXATTR:
+	{
+		struct fsxattr fa;
+		int err;
+
+		if (copy_from_user(&fa, (struct fsxattr __user *)arg,
+				   sizeof(fa)))
+			return -EFAULT;
+
+		/* Make sure caller has proper permission */
+		if (!inode_owner_or_capable(inode))
+			return -EACCES;
+
+		err = mnt_want_write_file(filp);
+		if (err)
+			return err;
+
+		flags = ext4_xflags_to_iflags(fa.fsx_xflags);
+		flags = ext4_mask_flags(inode->i_mode, flags);
+
+		mutex_lock(&inode->i_mutex);
+		flags = (ei->i_flags & ~EXT4_FL_XFLAG_VISIBLE) |
+			 (flags & EXT4_FL_XFLAG_VISIBLE);
+		err = ext4_ioctl_setflags(inode, flags);
+		mutex_unlock(&inode->i_mutex);
+		mnt_drop_write_file(filp);
+		if (err)
+			return err;
+
+		err = ext4_ioctl_setproject(filp, fa.fsx_projid);
+		if (err)
+			return err;
+
+		return 0;
+	}
 	default:
 		return -ENOTTY;
 	}
diff --git a/fs/xfs/xfs_fs.h b/fs/xfs/xfs_fs.h
index 18dc721..64c7ae6 100644
--- a/fs/xfs/xfs_fs.h
+++ b/fs/xfs/xfs_fs.h
@@ -36,38 +36,25 @@ struct dioattr {
 #endif
 
 /*
- * Structure for XFS_IOC_FSGETXATTR[A] and XFS_IOC_FSSETXATTR.
- */
-#ifndef HAVE_FSXATTR
-struct fsxattr {
-	__u32		fsx_xflags;	/* xflags field value (get/set) */
-	__u32		fsx_extsize;	/* extsize field value (get/set)*/
-	__u32		fsx_nextents;	/* nextents field value (get)	*/
-	__u32		fsx_projid;	/* project identifier (get/set) */
-	unsigned char	fsx_pad[12];
-};
-#endif
-
-/*
  * Flags for the bs_xflags/fsx_xflags field
  * There should be a one-to-one correspondence between these flags and the
  * XFS_DIFLAG_s.
  */
-#define XFS_XFLAG_REALTIME	0x00000001	/* data in realtime volume */
-#define XFS_XFLAG_PREALLOC	0x00000002	/* preallocated file extents */
-#define XFS_XFLAG_IMMUTABLE	0x00000008	/* file cannot be modified */
-#define XFS_XFLAG_APPEND	0x00000010	/* all writes append */
-#define XFS_XFLAG_SYNC		0x00000020	/* all writes synchronous */
-#define XFS_XFLAG_NOATIME	0x00000040	/* do not update access time */
-#define XFS_XFLAG_NODUMP	0x00000080	/* do not include in backups */
-#define XFS_XFLAG_RTINHERIT	0x00000100	/* create with rt bit set */
-#define XFS_XFLAG_PROJINHERIT	0x00000200	/* create with parents projid */
-#define XFS_XFLAG_NOSYMLINKS	0x00000400	/* disallow symlink creation */
-#define XFS_XFLAG_EXTSIZE	0x00000800	/* extent size allocator hint */
-#define XFS_XFLAG_EXTSZINHERIT	0x00001000	/* inherit inode extent size */
-#define XFS_XFLAG_NODEFRAG	0x00002000  	/* do not defragment */
-#define XFS_XFLAG_FILESTREAM	0x00004000	/* use filestream allocator */
-#define XFS_XFLAG_HASATTR	0x80000000	/* no DIFLAG for this	*/
+#define XFS_XFLAG_REALTIME	FS_XFLAG_REALTIME	/* data in realtime volume */
+#define XFS_XFLAG_PREALLOC	FS_XFLAG_PREALLOC	/* preallocated file extents */
+#define XFS_XFLAG_IMMUTABLE	FS_XFLAG_IMMUTABLE	/* file cannot be modified */
+#define XFS_XFLAG_APPEND	FS_XFLAG_APPEND		/* all writes append */
+#define XFS_XFLAG_SYNC		FS_XFLAG_SYNC		/* all writes synchronous */
+#define XFS_XFLAG_NOATIME	FS_XFLAG_NOATIME	/* do not update access time */
+#define XFS_XFLAG_NODUMP	FS_XFLAG_NODUMP		/* do not include in backups */
+#define XFS_XFLAG_RTINHERIT	FS_XFLAG_RTINHERIT	/* create with rt bit set */
+#define XFS_XFLAG_PROJINHERIT	FS_XFLAG_PROJINHERIT	/* create with parents projid */
+#define XFS_XFLAG_NOSYMLINKS	FS_XFLAG_NOSYMLINKS	/* disallow symlink creation */
+#define XFS_XFLAG_EXTSIZE	FS_XFLAG_EXTSIZE	/* extent size allocator hint */
+#define XFS_XFLAG_EXTSZINHERIT	FS_XFLAG_EXTSZINHERIT	/* inherit inode extent size */
+#define XFS_XFLAG_NODEFRAG	FS_XFLAG_NODEFRAG  	/* do not defragment */
+#define XFS_XFLAG_FILESTREAM	FS_XFLAG_FILESTREAM	/* use filestream allocator */
+#define XFS_XFLAG_HASATTR	FS_XFLAG_HASATTR	/* no DIFLAG for this	*/
 
 /*
  * Structure for XFS_IOC_GETBMAP.
@@ -503,8 +490,8 @@ typedef struct xfs_swapext
 #define XFS_IOC_ALLOCSP		_IOW ('X', 10, struct xfs_flock64)
 #define XFS_IOC_FREESP		_IOW ('X', 11, struct xfs_flock64)
 #define XFS_IOC_DIOINFO		_IOR ('X', 30, struct dioattr)
-#define XFS_IOC_FSGETXATTR	_IOR ('X', 31, struct fsxattr)
-#define XFS_IOC_FSSETXATTR	_IOW ('X', 32, struct fsxattr)
+#define XFS_IOC_FSGETXATTR	FS_IOC_FSGETXATTR
+#define XFS_IOC_FSSETXATTR	FS_IOC_FSSETXATTR
 #define XFS_IOC_ALLOCSP64	_IOW ('X', 36, struct xfs_flock64)
 #define XFS_IOC_FREESP64	_IOW ('X', 37, struct xfs_flock64)
 #define XFS_IOC_GETBMAP		_IOWR('X', 38, struct getbmap)
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index fcbf647..69dda62 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -58,6 +58,36 @@ struct inodes_stat_t {
 	long dummy[5];		/* padding for sysctl ABI compatibility */
 };
 
+/*
+ * Structure for FS_IOC_FSGETXATTR and FS_IOC_FSSETXATTR.
+ */
+struct fsxattr {
+	__u32		fsx_xflags;	/* xflags field value (get/set) */
+	__u32		fsx_extsize;	/* extsize field value (get/set)*/
+	__u32		fsx_nextents;	/* nextents field value (get)	*/
+	__u32		fsx_projid;	/* project identifier (get/set) */
+	unsigned char	fsx_pad[12];
+};
+
+/*
+ * Flags for the fsx_xflags field
+ */
+#define FS_XFLAG_REALTIME	0x00000001	/* data in realtime volume */
+#define FS_XFLAG_PREALLOC	0x00000002	/* preallocated file extents */
+#define FS_XFLAG_IMMUTABLE	0x00000008	/* file cannot be modified */
+#define FS_XFLAG_APPEND		0x00000010	/* all writes append */
+#define FS_XFLAG_SYNC		0x00000020	/* all writes synchronous */
+#define FS_XFLAG_NOATIME	0x00000040	/* do not update access time */
+#define FS_XFLAG_NODUMP		0x00000080	/* do not include in backups */
+#define FS_XFLAG_RTINHERIT	0x00000100	/* create with rt bit set */
+#define FS_XFLAG_PROJINHERIT	0x00000200	/* create with parents projid */
+#define FS_XFLAG_NOSYMLINKS	0x00000400	/* disallow symlink creation */
+#define FS_XFLAG_EXTSIZE	0x00000800	/* extent size allocator hint */
+#define FS_XFLAG_EXTSZINHERIT	0x00001000	/* inherit inode extent size */
+#define FS_XFLAG_NODEFRAG	0x00002000  	/* do not defragment */
+#define FS_XFLAG_FILESTREAM	0x00004000	/* use filestream allocator */
+#define FS_XFLAG_HASATTR	0x80000000	/* no DIFLAG for this */
+
 
 #define NR_FILE  8192	/* this can well be larger on a larger system */
 
@@ -163,6 +193,8 @@ struct inodes_stat_t {
 #define	FS_IOC_GETVERSION		_IOR('v', 1, long)
 #define	FS_IOC_SETVERSION		_IOW('v', 2, long)
 #define FS_IOC_FIEMAP			_IOWR('f', 11, struct fiemap)
+#define FS_IOC_FSGETXATTR		_IOR('X', 31, struct fsxattr)
+#define FS_IOC_FSSETXATTR		_IOW('X', 32, struct fsxattr)
 #define FS_IOC32_GETFLAGS		_IOR('f', 1, int)
 #define FS_IOC32_SETFLAGS		_IOW('f', 2, int)
 #define FS_IOC32_GETVERSION		_IOR('v', 1, int)
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [v12 5/5] ext4: cleanup inode flag definitions
  2015-04-09 15:14 [v12 0/5] ext4: add project quota support Li Xi
                   ` (3 preceding siblings ...)
  2015-04-09 15:14 ` [v12 4/5] ext4: adds FS_IOC_FSSETXATTR/FS_IOC_FSGETXATTR interface support Li Xi
@ 2015-04-09 15:14 ` Li Xi
       [not found] ` <1428592477-8212-1-git-send-email-lixi-LfVdkaOWEx8@public.gmane.org>
  5 siblings, 0 replies; 14+ messages in thread
From: Li Xi @ 2015-04-09 15:14 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4, linux-api, tytso, adilger, jack, viro,
	hch, dmonakhov

The inode flags defined in uapi/linux/fs.h were migrated from
ext4.h. This patch changes the inode flag definitions in ext4.h
to VFS definitions to make the gaps between them clearer.

Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
---
 fs/ext4/ext4.h |   50 +++++++++++++++++++++++++-------------------------
 1 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 377fec0..05d0e8d 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -352,34 +352,34 @@ struct flex_groups {
 /*
  * Inode flags
  */
-#define	EXT4_SECRM_FL			0x00000001 /* Secure deletion */
-#define	EXT4_UNRM_FL			0x00000002 /* Undelete */
-#define	EXT4_COMPR_FL			0x00000004 /* Compress file */
-#define EXT4_SYNC_FL			0x00000008 /* Synchronous updates */
-#define EXT4_IMMUTABLE_FL		0x00000010 /* Immutable file */
-#define EXT4_APPEND_FL			0x00000020 /* writes to file may only append */
-#define EXT4_NODUMP_FL			0x00000040 /* do not dump file */
-#define EXT4_NOATIME_FL			0x00000080 /* do not update atime */
+#define	EXT4_SECRM_FL			FS_SECRM_FL        /* Secure deletion */
+#define	EXT4_UNRM_FL			FS_UNRM_FL         /* Undelete */
+#define	EXT4_COMPR_FL			FS_COMPR_FL        /* Compress file */
+#define EXT4_SYNC_FL			FS_SYNC_FL         /* Synchronous updates */
+#define EXT4_IMMUTABLE_FL		FS_IMMUTABLE_FL    /* Immutable file */
+#define EXT4_APPEND_FL			FS_APPEND_FL       /* writes to file may only append */
+#define EXT4_NODUMP_FL			FS_NODUMP_FL       /* do not dump file */
+#define EXT4_NOATIME_FL			FS_NOATIME_FL      /* do not update atime */
 /* Reserved for compression usage... */
-#define EXT4_DIRTY_FL			0x00000100
-#define EXT4_COMPRBLK_FL		0x00000200 /* One or more compressed clusters */
-#define EXT4_NOCOMPR_FL			0x00000400 /* Don't compress */
+#define EXT4_DIRTY_FL			FS_DIRTY_FL
+#define EXT4_COMPRBLK_FL		FS_COMPRBLK_FL     /* One or more compressed clusters */
+#define EXT4_NOCOMPR_FL			FS_NOCOMP_FL       /* Don't compress */
 	/* nb: was previously EXT2_ECOMPR_FL */
-#define EXT4_ENCRYPT_FL			0x00000800 /* encrypted file */
+#define EXT4_ENCRYPT_FL			0x00000800         /* encrypted file */
 /* End compression flags --- maybe not all used */
-#define EXT4_INDEX_FL			0x00001000 /* hash-indexed directory */
-#define EXT4_IMAGIC_FL			0x00002000 /* AFS directory */
-#define EXT4_JOURNAL_DATA_FL		0x00004000 /* file data should be journaled */
-#define EXT4_NOTAIL_FL			0x00008000 /* file tail should not be merged */
-#define EXT4_DIRSYNC_FL			0x00010000 /* dirsync behaviour (directories only) */
-#define EXT4_TOPDIR_FL			0x00020000 /* Top of directory hierarchies*/
-#define EXT4_HUGE_FILE_FL               0x00040000 /* Set to each huge file */
-#define EXT4_EXTENTS_FL			0x00080000 /* Inode uses extents */
-#define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
-#define EXT4_EOFBLOCKS_FL		0x00400000 /* Blocks allocated beyond EOF */
-#define EXT4_INLINE_DATA_FL		0x10000000 /* Inode has inline data. */
-#define EXT4_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
-#define EXT4_RESERVED_FL		0x80000000 /* reserved for ext4 lib */
+#define EXT4_INDEX_FL			FS_INDEX_FL        /* hash-indexed directory */
+#define EXT4_IMAGIC_FL			FS_IMAGIC_FL       /* AFS directory */
+#define EXT4_JOURNAL_DATA_FL		FS_JOURNAL_DATA_FL /* file data should be journaled */
+#define EXT4_NOTAIL_FL			FS_NOTAIL_FL       /* file tail should not be merged */
+#define EXT4_DIRSYNC_FL			FS_DIRSYNC_FL      /* dirsync behaviour (directories only) */
+#define EXT4_TOPDIR_FL			FS_TOPDIR_FL       /* Top of directory hierarchies*/
+#define EXT4_HUGE_FILE_FL               0x00040000         /* Set to each huge file */
+#define EXT4_EXTENTS_FL			FS_EXTENT_FL       /* Inode uses extents */
+#define EXT4_EA_INODE_FL	        0x00200000         /* Inode used for large EA */
+#define EXT4_EOFBLOCKS_FL		0x00400000         /* Blocks allocated beyond EOF */
+#define EXT4_INLINE_DATA_FL		0x10000000         /* Inode has inline data. */
+#define EXT4_PROJINHERIT_FL		FS_PROJINHERIT_FL  /* Create with parents projid */
+#define EXT4_RESERVED_FL		FS_RESERVED_FL     /* reserved for ext4 lib */
 
 #define EXT4_FL_USER_VISIBLE		0x204BDFFF /* User visible flags */
 #define EXT4_FL_USER_MODIFIABLE		0x204380FF /* User modifiable flags */
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [v12 2/5] ext4: adds project ID support
       [not found]   ` <1428592477-8212-3-git-send-email-lixi-LfVdkaOWEx8@public.gmane.org>
@ 2015-04-10 23:37     ` Andreas Dilger
       [not found]       ` <C5339F6B-2DAE-4984-86CF-B2990CEC3550-m1MBpc4rdrD3fQ9qLvQP4Q@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Andreas Dilger @ 2015-04-10 23:37 UTC (permalink / raw)
  To: Li Xi
  Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, tytso-3s7WtUTddSA,
	jack-AlSwsSmVLrQ, viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn,
	hch-wEGCiKHe2LqWVfeAwA7xHQ, dmonakhov-GEFAQzZX7r8dnm+yROfE0A

On Apr 9, 2015, at 9:14 AM, Li Xi <pkuelelixi-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> 
> This patch adds a new internal field of ext4 inode to save project
> identifier. Also a new flag EXT4_INODE_PROJINHERIT is added for
> inheriting project ID from parent directory.
> 
> Signed-off-by: Li Xi <lixi-LfVdkaOWEx8@public.gmane.org>
> Reviewed-by: Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>
> ---
> fs/ext4/ext4.h          |   21 +++++++++++++++++----
> fs/ext4/ialloc.c        |    5 +++++
> fs/ext4/inode.c         |   29 +++++++++++++++++++++++++++++
> fs/ext4/namei.c         |   18 ++++++++++++++++++
> fs/ext4/super.c         |    1 +
> include/uapi/linux/fs.h |    1 +
> 6 files changed, 71 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 7fec2ef..7acb2da 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -378,16 +378,18 @@ struct flex_groups {
> #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
> #define EXT4_EOFBLOCKS_FL		0x00400000 /* Blocks allocated beyond EOF */
> #define EXT4_INLINE_DATA_FL		0x10000000 /* Inode has inline data. */
> +#define EXT4_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
> #define EXT4_RESERVED_FL		0x80000000 /* reserved for ext4 lib */
> 
> -#define EXT4_FL_USER_VISIBLE		0x004BDFFF /* User visible flags */
> -#define EXT4_FL_USER_MODIFIABLE		0x004380FF /* User modifiable flags */
> +#define EXT4_FL_USER_VISIBLE		0x204BDFFF /* User visible flags */
> +#define EXT4_FL_USER_MODIFIABLE		0x204380FF /* User modifiable flags */

I just noticed that EXT4_INLINE_DATA_FL isn't in EXT4_FL_USER_VISIBLE,
but it probably should be.  If this patch is refreshed it wouldn't hurt
to add that flag also.

> /* Flags that should be inherited by new inodes from their parent. */
> #define EXT4_FL_INHERITED (EXT4_SECRM_FL | EXT4_UNRM_FL | EXT4_COMPR_FL |\
> 			   EXT4_SYNC_FL | EXT4_NODUMP_FL | EXT4_NOATIME_FL |\
> 			   EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
> -			   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL)
> +			   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL |\
> +			   EXT4_PROJINHERIT_FL)
> 
> /* Flags that are appropriate for regular files (all but dir-specific ones). */
> #define EXT4_REG_FLMASK (~(EXT4_DIRSYNC_FL | EXT4_TOPDIR_FL))
> @@ -435,6 +437,7 @@ enum {
> 	EXT4_INODE_EA_INODE	= 21,	/* Inode used for large EA */
> 	EXT4_INODE_EOFBLOCKS	= 22,	/* Blocks allocated beyond EOF */
> 	EXT4_INODE_INLINE_DATA	= 28,	/* Data in inode. */
> +	EXT4_INODE_PROJINHERIT	= 29,	/* Create with parents projid */
> 	EXT4_INODE_RESERVED	= 31,	/* reserved for ext4 lib */
> };
> 
> @@ -684,6 +687,7 @@ struct ext4_inode {
> 	__le32  i_crtime;       /* File Creation time */
> 	__le32  i_crtime_extra; /* extra FileCreationtime (nsec << 2 | epoch) */
> 	__le32  i_version_hi;	/* high 32 bits for 64-bit version */
> +	__le32  i_projid;	/* Project ID */
> };
> 
> struct move_extent {
> @@ -939,6 +943,7 @@ struct ext4_inode_info {
> 
> 	/* Precomputed uuid+inum+igen checksum for seeding inode checksums */
> 	__u32 i_csum_seed;
> +	kprojid_t i_projid;
> };
> 
> /*
> @@ -1531,6 +1536,7 @@ static inline void ext4_clear_state_flags(struct ext4_inode_info *ei)
>  */
> #define EXT4_FEATURE_RO_COMPAT_METADATA_CSUM	0x0400
> #define EXT4_FEATURE_RO_COMPAT_READONLY		0x1000
> +#define EXT4_FEATURE_RO_COMPAT_PROJECT		0x2000
> 
> #define EXT4_FEATURE_INCOMPAT_COMPRESSION	0x0001
> #define EXT4_FEATURE_INCOMPAT_FILETYPE		0x0002
> @@ -1581,7 +1587,8 @@ static inline void ext4_clear_state_flags(struct ext4_inode_info *ei)
> 					 EXT4_FEATURE_RO_COMPAT_HUGE_FILE |\
> 					 EXT4_FEATURE_RO_COMPAT_BIGALLOC |\
> 					 EXT4_FEATURE_RO_COMPAT_METADATA_CSUM|\
> -					 EXT4_FEATURE_RO_COMPAT_QUOTA)
> +					 EXT4_FEATURE_RO_COMPAT_QUOTA |\
> +					 EXT4_FEATURE_RO_COMPAT_PROJECT)
> 
> /*
>  * Default values for user and/or group using reserved blocks
> @@ -1589,6 +1596,11 @@ static inline void ext4_clear_state_flags(struct ext4_inode_info *ei)
> #define	EXT4_DEF_RESUID		0
> #define	EXT4_DEF_RESGID		0
> 
> +/*
> + * Default project ID
> + */
> +#define	EXT4_DEF_PROJID		0
> +
> #define EXT4_DEF_INODE_READAHEAD_BLKS	32
> 
> /*
> @@ -2141,6 +2153,7 @@ extern int ext4_zero_partial_blocks(handle_t *handle, struct inode *inode,
> 			     loff_t lstart, loff_t lend);
> extern int ext4_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf);
> extern qsize_t *ext4_get_reserved_space(struct inode *inode);
> +extern int ext4_get_projid(struct inode *inode, kprojid_t *projid);
> extern void ext4_da_update_reserve_space(struct inode *inode,
> 					int used, int quota_claim);
> 
> diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
> index ac644c3..10ca9dd 100644
> --- a/fs/ext4/ialloc.c
> +++ b/fs/ext4/ialloc.c
> @@ -756,6 +756,11 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir,
> 		inode->i_gid = dir->i_gid;
> 	} else
> 		inode_init_owner(inode, dir, mode);
> +	if (EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_PROJECT) &&
> +	    ext4_test_inode_flag(dir, EXT4_INODE_PROJINHERIT))
> +		ei->i_projid = EXT4_I(dir)->i_projid;
> +	else
> +		ei->i_projid = make_kprojid(&init_user_ns, EXT4_DEF_PROJID);
> 	dquot_initialize(inode);
> 
> 	if (!goal)
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 4df6d01..6e4833f 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3870,6 +3870,14 @@ static inline void ext4_iget_extra_inode(struct inode *inode,
> 		EXT4_I(inode)->i_inline_off = 0;
> }
> 
> +int ext4_get_projid(struct inode *inode, kprojid_t *projid)
> +{
> +	if (!EXT4_HAS_RO_COMPAT_FEATURE(inode->i_sb, EXT4_FEATURE_RO_COMPAT_PROJECT))
> +		return -EOPNOTSUPP;
> +	*projid = EXT4_I(inode)->i_projid;
> +	return 0;
> +}
> +
> struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
> {
> 	struct ext4_iloc iloc;
> @@ -3881,6 +3889,7 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
> 	int block;
> 	uid_t i_uid;
> 	gid_t i_gid;
> +	projid_t i_projid;
> 
> 	inode = iget_locked(sb, ino);
> 	if (!inode)
> @@ -3930,12 +3939,18 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
> 	inode->i_mode = le16_to_cpu(raw_inode->i_mode);
> 	i_uid = (uid_t)le16_to_cpu(raw_inode->i_uid_low);
> 	i_gid = (gid_t)le16_to_cpu(raw_inode->i_gid_low);
> +	if (EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_PROJECT))
> +		i_projid = (projid_t)le32_to_cpu(raw_inode->i_projid);
> +	else
> +		i_projid = EXT4_DEF_PROJID;
> +
> 	if (!(test_opt(inode->i_sb, NO_UID32))) {
> 		i_uid |= le16_to_cpu(raw_inode->i_uid_high) << 16;
> 		i_gid |= le16_to_cpu(raw_inode->i_gid_high) << 16;
> 	}
> 	i_uid_write(inode, i_uid);
> 	i_gid_write(inode, i_gid);
> +	ei->i_projid = make_kprojid(&init_user_ns, i_projid);;
> 	set_nlink(inode, le16_to_cpu(raw_inode->i_links_count));
> 
> 	ext4_clear_state_flags(ei);	/* Only relevant on 32-bit archs */
> @@ -4165,6 +4180,7 @@ static int ext4_do_update_inode(handle_t *handle,
> 	int need_datasync = 0, set_large_file = 0;
> 	uid_t i_uid;
> 	gid_t i_gid;
> +	projid_t i_projid;
> 
> 	spin_lock(&ei->i_raw_lock);
> 
> @@ -4177,6 +4193,7 @@ static int ext4_do_update_inode(handle_t *handle,
> 	raw_inode->i_mode = cpu_to_le16(inode->i_mode);
> 	i_uid = i_uid_read(inode);
> 	i_gid = i_gid_read(inode);
> +	i_projid = from_kprojid(&init_user_ns, ei->i_projid);
> 	if (!(test_opt(inode->i_sb, NO_UID32))) {
> 		raw_inode->i_uid_low = cpu_to_le16(low_16_bits(i_uid));
> 		raw_inode->i_gid_low = cpu_to_le16(low_16_bits(i_gid));
> @@ -4256,6 +4273,18 @@ static int ext4_do_update_inode(handle_t *handle,
> 		}
> 	}
> 
> +	BUG_ON(!EXT4_HAS_RO_COMPAT_FEATURE(inode->i_sb,
> +			EXT4_FEATURE_RO_COMPAT_PROJECT) &&
> +	       i_projid != EXT4_DEF_PROJID);
> +	if (i_projid != EXT4_DEF_PROJID &&
> +	    (EXT4_INODE_SIZE(inode->i_sb) <= EXT4_GOOD_OLD_INODE_SIZE ||
> +	     (!EXT4_FITS_IN_INODE(raw_inode, ei, i_projid)))) {
> +		spin_unlock(&ei->i_raw_lock);
> +		err = -EFBIG;

I mentioned the following back on v8 of the patch, but I'll write it again:

I don't think -EFBIG "File too large" is a good error here.  Better would
be -EOPNOTSUPP for EXT4_INODE_SIZE() <= EXT4_GOOD_OLD_INODE_SIZE, since
this will never work for this filesystem.  For the !EXT4_FITS_IN_INODE()
case -EOVERFLOW would be better?

Also, returning the error from ext4_mark_iloc_dirty->ext4_do_update_inode()
is a bit late in the game, since the inode has already been modified at
this point.  That callpath typically only returned an error if the disk
was bad or the journal aborted (again normally because the disk was bad),
so at that point the dirty in-memory data was lost anyway.

It would be better to check this in the caller before the inode is changed
so that an error can be returned without (essentially) corrupting the
in-memory state.  Since the projid should only be set for new inodes (which
will always have enough space, assuming RO_COMPAT_PROJECT cannot be set on
filesystems with 128-byte inodes), or in case of rename into a project
directory, it shouldn't be too big a change.

> +		goto out_brelse;
> +	}
> +	raw_inode->i_projid = cpu_to_le32(i_projid);
> +
> 	ext4_inode_csum_set(inode, raw_inode, ei);
> 
> 	spin_unlock(&ei->i_raw_lock);
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 2291923..63a9623 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -2938,6 +2938,11 @@ static int ext4_link(struct dentry *old_dentry,
> 	if (inode->i_nlink >= EXT4_LINK_MAX)
> 		return -EMLINK;
> 
> +	if ((ext4_test_inode_flag(dir, EXT4_INODE_PROJINHERIT)) &&
> +	    (!projid_eq(EXT4_I(dir)->i_projid,
> +			EXT4_I(old_dentry->d_inode)->i_projid)))
> +		return -EXDEV;
> +
> 	dquot_initialize(dir);
> 
> retry:
> @@ -3217,6 +3222,11 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
> 	int credits;
> 	u8 old_file_type;
> 
> +	if ((ext4_test_inode_flag(new_dir, EXT4_INODE_PROJINHERIT)) &&
> +	    (!projid_eq(EXT4_I(new_dir)->i_projid,
> +			EXT4_I(old_dentry->d_inode)->i_projid)))
> +		return -EXDEV;
> +
> 	dquot_initialize(old.dir);
> 	dquot_initialize(new.dir);
> 
> @@ -3395,6 +3405,14 @@ static int ext4_cross_rename(struct inode *old_dir, struct dentry *old_dentry,
> 	u8 new_file_type;
> 	int retval;
> 
> +	if ((ext4_test_inode_flag(new_dir, EXT4_INODE_PROJINHERIT) &&
> +	     !projid_eq(EXT4_I(new_dir)->i_projid,
> +			EXT4_I(old_dentry->d_inode)->i_projid)) ||
> +	    (ext4_test_inode_flag(old_dir, EXT4_INODE_PROJINHERIT) &&
> +	     !projid_eq(EXT4_I(old_dir)->i_projid,
> +			EXT4_I(new_dentry->d_inode)->i_projid)))
> +		return -EXDEV;
> +
> 	dquot_initialize(old.dir);
> 	dquot_initialize(new.dir);
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index bff3427..04c6cc3 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1073,6 +1073,7 @@ static const struct dquot_operations ext4_quota_operations = {
> 	.write_info	= ext4_write_info,
> 	.alloc_dquot	= dquot_alloc,
> 	.destroy_dquot	= dquot_destroy,
> +	.get_projid	= ext4_get_projid,
> };
> 
> static const struct quotactl_ops ext4_qctl_operations = {
> diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
> index 3735fa0..fcbf647 100644
> --- a/include/uapi/linux/fs.h
> +++ b/include/uapi/linux/fs.h
> @@ -195,6 +195,7 @@ struct inodes_stat_t {
> #define FS_EXTENT_FL			0x00080000 /* Extents */
> #define FS_DIRECTIO_FL			0x00100000 /* Use direct i/o */
> #define FS_NOCOW_FL			0x00800000 /* Do not cow file */
> +#define FS_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
> #define FS_RESERVED_FL			0x80000000 /* reserved for ext2 lib */
> 
> #define FS_FL_USER_VISIBLE		0x0003DFFF /* User visible flags */
> -- 
> 1.7.1
> 


Cheers, Andreas

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [v12 1/5] vfs: adds general codes to enforces project quota limits
       [not found]   ` <1428592477-8212-2-git-send-email-lixi-LfVdkaOWEx8@public.gmane.org>
@ 2015-04-12 13:49     ` Alban Crequy
  0 siblings, 0 replies; 14+ messages in thread
From: Alban Crequy @ 2015-04-12 13:49 UTC (permalink / raw)
  To: Li Xi
  Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA, Linux API, tytso-3s7WtUTddSA,
	adilger-m1MBpc4rdrD3fQ9qLvQP4Q, jack-AlSwsSmVLrQ,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn,
	hch-wEGCiKHe2LqWVfeAwA7xHQ, dmonakhov-GEFAQzZX7r8dnm+yROfE0A

On 9 April 2015 at 17:14, Li Xi <pkuelelixi-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> This patch adds support for a new quota type PRJQUOTA for project quota
> enforcement. Also a new method get_projid() is added into dquot_operations
> structure.
>(...)
> diff --git a/fs/quota/quota.c b/fs/quota/quota.c
> index 2aa4151..33b30b1 100644
> --- a/fs/quota/quota.c
> +++ b/fs/quota/quota.c
> @@ -30,7 +30,10 @@ static int check_quotactl_permission(struct super_block *sb, int type, int cmd,
>         case Q_XGETQSTATV:
>         case Q_XQUOTASYNC:
>                 break;
> -       /* allow to query information for dquots we "own" */
> +       /*
> +        * allow to query information for dquots we "own"
> +        * always allow querying project quota
> +        */

I would add a precision in the comment:

always allow querying project quota when the project id is mapped in
the current user namespace.

id is the id in the current user namespace. So quotas for unmapped
users, unmapped groups or unmapped projects cannot be queried.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [v12 0/5] ext4: add project quota support
       [not found] ` <1428592477-8212-1-git-send-email-lixi-LfVdkaOWEx8@public.gmane.org>
@ 2015-04-12 15:36   ` Alban Crequy
  2015-04-12 15:36   ` Alban Crequy
  1 sibling, 0 replies; 14+ messages in thread
From: Alban Crequy @ 2015-04-12 15:36 UTC (permalink / raw)
  To: Li Xi
  Cc: adilger-m1MBpc4rdrD3fQ9qLvQP4Q, jack-AlSwsSmVLrQ, Linux API,
	Linux Containers, hch-wEGCiKHe2LqWVfeAwA7xHQ,
	dmonakhov-GEFAQzZX7r8dnm+yROfE0A,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, tytso-3s7WtUTddSA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA

On 9 April 2015 at 17:14, Li Xi <pkuelelixi-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> The following patches propose an implementation of project quota
> support for ext4. A project is an aggregate of unrelated inodes
> which might scatter in different directories. Inodes that belong
> to the same project possess an identical identification i.e.
> 'project ID', just like every inode has its user/group
> identification. The following patches add project quota as
> supplement to the former uer/group quota types.
> (...)

Thanks for this work, I would like to use this for containers. I am
adding containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org in Cc.

To make sure I understand correctly, I will describe the configuration
I have in mind and hopefully someone can tell me if it makes sense.

Containers created by rkt (https://github.com/coreos/rkt) use an
overlay filesystem as root and the lowerdir/upperdir directories are
based on an ext4 filesystem outside of the container's reach. The
lowerdir is the base image, and several container instances can
potentially use the same lowerdir. Each container has its upperdir
containing their changes.

With your patch set, I could assign a different projid to the upperdir
of each container with a specific quota. Then it will limit how much
the container will be able to write. I don't know if the overlay's
workdir would need to have projid too.

When a quota warning is sent on netlink, it is received only in the
initial user namespace and the processes in a different user namespace
will not be able to receive the netlink warnings. The user will only
receive a warning through the control terminal.

Since rkt does not use user namespaces yet, a rkt container could
unfortunately receive quota warnings through netlink concerning the
host or other containers. Or is it restricted to init_net?

quotactl() can be used in a rkt container if the proccesses in the
container can guess somehow which block device is used by the
filesystem hosting the overlay's upperdir and if they can mknod it
somewhere. Usually, containers don't restrict mknod but just restrict
read-write access through the device cgroup. The read-write access is
irrelevant for quotactl(): quotactl() just check that the device node
exists and that it is not on a nodev mount. The nodev check does not
restrict containers here because they usually have a /dev mounted as
tmpfs without the nodev option.

Containers that don't use user namespaces (so no projid mapping) would
be able to query quotas for projid assigned to other containers
(unfortunately). They would be able to change the quota of other
containers if they are privileged enough to be given CAP_SYS_RESOURCE.

Containers using user namespaces would not be able to change any quota
config because they don't have CAP_SYS_RESOURCE in the init user
namespace. If they are configured with a proper projid mapping, they
would only be able to query the projid they are assigned (they could
guess which projid to query by looking at /proc/self/projid_map).

Do you know if someone is working on the documentation? It would be
nice if filesystems/quota.txt could say who can receive the quota
warnings on netlink (which namespace) and if it could give some
information about projid. But maybe this belong to the proc(5) and
user_namespaces(7) manpages as well.

Is there any suggestions how to allocate projid in userspace?
Something like /etc/subprojid similar to /etc/subuid?

Thanks!
Alban

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [v12 0/5] ext4: add project quota support
       [not found] ` <1428592477-8212-1-git-send-email-lixi-LfVdkaOWEx8@public.gmane.org>
  2015-04-12 15:36   ` [v12 0/5] ext4: add project quota support Alban Crequy
@ 2015-04-12 15:36   ` Alban Crequy
       [not found]     ` <CAMXgnP6RF4HPDyugvnMKn3rDnuG7j1cz-xFtEMWx4Va1rEHVEQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 14+ messages in thread
From: Alban Crequy @ 2015-04-12 15:36 UTC (permalink / raw)
  To: Li Xi
  Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA, Linux API, tytso-3s7WtUTddSA,
	adilger-m1MBpc4rdrD3fQ9qLvQP4Q, jack-AlSwsSmVLrQ,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn,
	hch-wEGCiKHe2LqWVfeAwA7xHQ, dmonakhov-GEFAQzZX7r8dnm+yROfE0A,
	Linux Containers

On 9 April 2015 at 17:14, Li Xi <pkuelelixi-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> The following patches propose an implementation of project quota
> support for ext4. A project is an aggregate of unrelated inodes
> which might scatter in different directories. Inodes that belong
> to the same project possess an identical identification i.e.
> 'project ID', just like every inode has its user/group
> identification. The following patches add project quota as
> supplement to the former uer/group quota types.
> (...)

Thanks for this work, I would like to use this for containers. I am
adding containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org in Cc.

To make sure I understand correctly, I will describe the configuration
I have in mind and hopefully someone can tell me if it makes sense.

Containers created by rkt (https://github.com/coreos/rkt) use an
overlay filesystem as root and the lowerdir/upperdir directories are
based on an ext4 filesystem outside of the container's reach. The
lowerdir is the base image, and several container instances can
potentially use the same lowerdir. Each container has its upperdir
containing their changes.

With your patch set, I could assign a different projid to the upperdir
of each container with a specific quota. Then it will limit how much
the container will be able to write. I don't know if the overlay's
workdir would need to have projid too.

When a quota warning is sent on netlink, it is received only in the
initial user namespace and the processes in a different user namespace
will not be able to receive the netlink warnings. The user will only
receive a warning through the control terminal.

Since rkt does not use user namespaces yet, a rkt container could
unfortunately receive quota warnings through netlink concerning the
host or other containers. Or is it restricted to init_net?

quotactl() can be used in a rkt container if the proccesses in the
container can guess somehow which block device is used by the
filesystem hosting the overlay's upperdir and if they can mknod it
somewhere. Usually, containers don't restrict mknod but just restrict
read-write access through the device cgroup. The read-write access is
irrelevant for quotactl(): quotactl() just check that the device node
exists and that it is not on a nodev mount. The nodev check does not
restrict containers here because they usually have a /dev mounted as
tmpfs without the nodev option.

Containers that don't use user namespaces (so no projid mapping) would
be able to query quotas for projid assigned to other containers
(unfortunately). They would be able to change the quota of other
containers if they are privileged enough to be given CAP_SYS_RESOURCE.

Containers using user namespaces would not be able to change any quota
config because they don't have CAP_SYS_RESOURCE in the init user
namespace. If they are configured with a proper projid mapping, they
would only be able to query the projid they are assigned (they could
guess which projid to query by looking at /proc/self/projid_map).

Do you know if someone is working on the documentation? It would be
nice if filesystems/quota.txt could say who can receive the quota
warnings on netlink (which namespace) and if it could give some
information about projid. But maybe this belong to the proc(5) and
user_namespaces(7) manpages as well.

Is there any suggestions how to allocate projid in userspace?
Something like /etc/subprojid similar to /etc/subuid?

Thanks!
Alban

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [v12 2/5] ext4: adds project ID support
       [not found]       ` <C5339F6B-2DAE-4984-86CF-B2990CEC3550-m1MBpc4rdrD3fQ9qLvQP4Q@public.gmane.org>
@ 2015-04-13 16:33         ` Li Xi
  0 siblings, 0 replies; 14+ messages in thread
From: Li Xi @ 2015-04-13 16:33 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Ext4 Developers List,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Theodore Ts'o, Jan Kara,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn,
	hch-wEGCiKHe2LqWVfeAwA7xHQ, Dmitry Monakhov

Sorry Andreas, I will update the patch soon.

On Fri, Apr 10, 2015 at 5:37 PM, Andreas Dilger <adilger-m1MBpc4rdrD3fQ9qLvQP4Q@public.gmane.org> wrote:
> On Apr 9, 2015, at 9:14 AM, Li Xi <pkuelelixi-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>
>> This patch adds a new internal field of ext4 inode to save project
>> identifier. Also a new flag EXT4_INODE_PROJINHERIT is added for
>> inheriting project ID from parent directory.
>>
>> Signed-off-by: Li Xi <lixi-LfVdkaOWEx8@public.gmane.org>
>> Reviewed-by: Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>
>> ---
>> fs/ext4/ext4.h          |   21 +++++++++++++++++----
>> fs/ext4/ialloc.c        |    5 +++++
>> fs/ext4/inode.c         |   29 +++++++++++++++++++++++++++++
>> fs/ext4/namei.c         |   18 ++++++++++++++++++
>> fs/ext4/super.c         |    1 +
>> include/uapi/linux/fs.h |    1 +
>> 6 files changed, 71 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
>> index 7fec2ef..7acb2da 100644
>> --- a/fs/ext4/ext4.h
>> +++ b/fs/ext4/ext4.h
>> @@ -378,16 +378,18 @@ struct flex_groups {
>> #define EXT4_EA_INODE_FL              0x00200000 /* Inode used for large EA */
>> #define EXT4_EOFBLOCKS_FL             0x00400000 /* Blocks allocated beyond EOF */
>> #define EXT4_INLINE_DATA_FL           0x10000000 /* Inode has inline data. */
>> +#define EXT4_PROJINHERIT_FL          0x20000000 /* Create with parents projid */
>> #define EXT4_RESERVED_FL              0x80000000 /* reserved for ext4 lib */
>>
>> -#define EXT4_FL_USER_VISIBLE         0x004BDFFF /* User visible flags */
>> -#define EXT4_FL_USER_MODIFIABLE              0x004380FF /* User modifiable flags */
>> +#define EXT4_FL_USER_VISIBLE         0x204BDFFF /* User visible flags */
>> +#define EXT4_FL_USER_MODIFIABLE              0x204380FF /* User modifiable flags */
>
> I just noticed that EXT4_INLINE_DATA_FL isn't in EXT4_FL_USER_VISIBLE,
> but it probably should be.  If this patch is refreshed it wouldn't hurt
> to add that flag also.
>
>> /* Flags that should be inherited by new inodes from their parent. */
>> #define EXT4_FL_INHERITED (EXT4_SECRM_FL | EXT4_UNRM_FL | EXT4_COMPR_FL |\
>>                          EXT4_SYNC_FL | EXT4_NODUMP_FL | EXT4_NOATIME_FL |\
>>                          EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
>> -                        EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL)
>> +                        EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL |\
>> +                        EXT4_PROJINHERIT_FL)
>>
>> /* Flags that are appropriate for regular files (all but dir-specific ones). */
>> #define EXT4_REG_FLMASK (~(EXT4_DIRSYNC_FL | EXT4_TOPDIR_FL))
>> @@ -435,6 +437,7 @@ enum {
>>       EXT4_INODE_EA_INODE     = 21,   /* Inode used for large EA */
>>       EXT4_INODE_EOFBLOCKS    = 22,   /* Blocks allocated beyond EOF */
>>       EXT4_INODE_INLINE_DATA  = 28,   /* Data in inode. */
>> +     EXT4_INODE_PROJINHERIT  = 29,   /* Create with parents projid */
>>       EXT4_INODE_RESERVED     = 31,   /* reserved for ext4 lib */
>> };
>>
>> @@ -684,6 +687,7 @@ struct ext4_inode {
>>       __le32  i_crtime;       /* File Creation time */
>>       __le32  i_crtime_extra; /* extra FileCreationtime (nsec << 2 | epoch) */
>>       __le32  i_version_hi;   /* high 32 bits for 64-bit version */
>> +     __le32  i_projid;       /* Project ID */
>> };
>>
>> struct move_extent {
>> @@ -939,6 +943,7 @@ struct ext4_inode_info {
>>
>>       /* Precomputed uuid+inum+igen checksum for seeding inode checksums */
>>       __u32 i_csum_seed;
>> +     kprojid_t i_projid;
>> };
>>
>> /*
>> @@ -1531,6 +1536,7 @@ static inline void ext4_clear_state_flags(struct ext4_inode_info *ei)
>>  */
>> #define EXT4_FEATURE_RO_COMPAT_METADATA_CSUM  0x0400
>> #define EXT4_FEATURE_RO_COMPAT_READONLY               0x1000
>> +#define EXT4_FEATURE_RO_COMPAT_PROJECT               0x2000
>>
>> #define EXT4_FEATURE_INCOMPAT_COMPRESSION     0x0001
>> #define EXT4_FEATURE_INCOMPAT_FILETYPE                0x0002
>> @@ -1581,7 +1587,8 @@ static inline void ext4_clear_state_flags(struct ext4_inode_info *ei)
>>                                        EXT4_FEATURE_RO_COMPAT_HUGE_FILE |\
>>                                        EXT4_FEATURE_RO_COMPAT_BIGALLOC |\
>>                                        EXT4_FEATURE_RO_COMPAT_METADATA_CSUM|\
>> -                                      EXT4_FEATURE_RO_COMPAT_QUOTA)
>> +                                      EXT4_FEATURE_RO_COMPAT_QUOTA |\
>> +                                      EXT4_FEATURE_RO_COMPAT_PROJECT)
>>
>> /*
>>  * Default values for user and/or group using reserved blocks
>> @@ -1589,6 +1596,11 @@ static inline void ext4_clear_state_flags(struct ext4_inode_info *ei)
>> #define       EXT4_DEF_RESUID         0
>> #define       EXT4_DEF_RESGID         0
>>
>> +/*
>> + * Default project ID
>> + */
>> +#define      EXT4_DEF_PROJID         0
>> +
>> #define EXT4_DEF_INODE_READAHEAD_BLKS 32
>>
>> /*
>> @@ -2141,6 +2153,7 @@ extern int ext4_zero_partial_blocks(handle_t *handle, struct inode *inode,
>>                            loff_t lstart, loff_t lend);
>> extern int ext4_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf);
>> extern qsize_t *ext4_get_reserved_space(struct inode *inode);
>> +extern int ext4_get_projid(struct inode *inode, kprojid_t *projid);
>> extern void ext4_da_update_reserve_space(struct inode *inode,
>>                                       int used, int quota_claim);
>>
>> diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
>> index ac644c3..10ca9dd 100644
>> --- a/fs/ext4/ialloc.c
>> +++ b/fs/ext4/ialloc.c
>> @@ -756,6 +756,11 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir,
>>               inode->i_gid = dir->i_gid;
>>       } else
>>               inode_init_owner(inode, dir, mode);
>> +     if (EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_PROJECT) &&
>> +         ext4_test_inode_flag(dir, EXT4_INODE_PROJINHERIT))
>> +             ei->i_projid = EXT4_I(dir)->i_projid;
>> +     else
>> +             ei->i_projid = make_kprojid(&init_user_ns, EXT4_DEF_PROJID);
>>       dquot_initialize(inode);
>>
>>       if (!goal)
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index 4df6d01..6e4833f 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -3870,6 +3870,14 @@ static inline void ext4_iget_extra_inode(struct inode *inode,
>>               EXT4_I(inode)->i_inline_off = 0;
>> }
>>
>> +int ext4_get_projid(struct inode *inode, kprojid_t *projid)
>> +{
>> +     if (!EXT4_HAS_RO_COMPAT_FEATURE(inode->i_sb, EXT4_FEATURE_RO_COMPAT_PROJECT))
>> +             return -EOPNOTSUPP;
>> +     *projid = EXT4_I(inode)->i_projid;
>> +     return 0;
>> +}
>> +
>> struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
>> {
>>       struct ext4_iloc iloc;
>> @@ -3881,6 +3889,7 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
>>       int block;
>>       uid_t i_uid;
>>       gid_t i_gid;
>> +     projid_t i_projid;
>>
>>       inode = iget_locked(sb, ino);
>>       if (!inode)
>> @@ -3930,12 +3939,18 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
>>       inode->i_mode = le16_to_cpu(raw_inode->i_mode);
>>       i_uid = (uid_t)le16_to_cpu(raw_inode->i_uid_low);
>>       i_gid = (gid_t)le16_to_cpu(raw_inode->i_gid_low);
>> +     if (EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_PROJECT))
>> +             i_projid = (projid_t)le32_to_cpu(raw_inode->i_projid);
>> +     else
>> +             i_projid = EXT4_DEF_PROJID;
>> +
>>       if (!(test_opt(inode->i_sb, NO_UID32))) {
>>               i_uid |= le16_to_cpu(raw_inode->i_uid_high) << 16;
>>               i_gid |= le16_to_cpu(raw_inode->i_gid_high) << 16;
>>       }
>>       i_uid_write(inode, i_uid);
>>       i_gid_write(inode, i_gid);
>> +     ei->i_projid = make_kprojid(&init_user_ns, i_projid);;
>>       set_nlink(inode, le16_to_cpu(raw_inode->i_links_count));
>>
>>       ext4_clear_state_flags(ei);     /* Only relevant on 32-bit archs */
>> @@ -4165,6 +4180,7 @@ static int ext4_do_update_inode(handle_t *handle,
>>       int need_datasync = 0, set_large_file = 0;
>>       uid_t i_uid;
>>       gid_t i_gid;
>> +     projid_t i_projid;
>>
>>       spin_lock(&ei->i_raw_lock);
>>
>> @@ -4177,6 +4193,7 @@ static int ext4_do_update_inode(handle_t *handle,
>>       raw_inode->i_mode = cpu_to_le16(inode->i_mode);
>>       i_uid = i_uid_read(inode);
>>       i_gid = i_gid_read(inode);
>> +     i_projid = from_kprojid(&init_user_ns, ei->i_projid);
>>       if (!(test_opt(inode->i_sb, NO_UID32))) {
>>               raw_inode->i_uid_low = cpu_to_le16(low_16_bits(i_uid));
>>               raw_inode->i_gid_low = cpu_to_le16(low_16_bits(i_gid));
>> @@ -4256,6 +4273,18 @@ static int ext4_do_update_inode(handle_t *handle,
>>               }
>>       }
>>
>> +     BUG_ON(!EXT4_HAS_RO_COMPAT_FEATURE(inode->i_sb,
>> +                     EXT4_FEATURE_RO_COMPAT_PROJECT) &&
>> +            i_projid != EXT4_DEF_PROJID);
>> +     if (i_projid != EXT4_DEF_PROJID &&
>> +         (EXT4_INODE_SIZE(inode->i_sb) <= EXT4_GOOD_OLD_INODE_SIZE ||
>> +          (!EXT4_FITS_IN_INODE(raw_inode, ei, i_projid)))) {
>> +             spin_unlock(&ei->i_raw_lock);
>> +             err = -EFBIG;
>
> I mentioned the following back on v8 of the patch, but I'll write it again:
>
> I don't think -EFBIG "File too large" is a good error here.  Better would
> be -EOPNOTSUPP for EXT4_INODE_SIZE() <= EXT4_GOOD_OLD_INODE_SIZE, since
> this will never work for this filesystem.  For the !EXT4_FITS_IN_INODE()
> case -EOVERFLOW would be better?
>
> Also, returning the error from ext4_mark_iloc_dirty->ext4_do_update_inode()
> is a bit late in the game, since the inode has already been modified at
> this point.  That callpath typically only returned an error if the disk
> was bad or the journal aborted (again normally because the disk was bad),
> so at that point the dirty in-memory data was lost anyway.
>
> It would be better to check this in the caller before the inode is changed
> so that an error can be returned without (essentially) corrupting the
> in-memory state.  Since the projid should only be set for new inodes (which
> will always have enough space, assuming RO_COMPAT_PROJECT cannot be set on
> filesystems with 128-byte inodes), or in case of rename into a project
> directory, it shouldn't be too big a change.
>
>> +             goto out_brelse;
>> +     }
>> +     raw_inode->i_projid = cpu_to_le32(i_projid);
>> +
>>       ext4_inode_csum_set(inode, raw_inode, ei);
>>
>>       spin_unlock(&ei->i_raw_lock);
>> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
>> index 2291923..63a9623 100644
>> --- a/fs/ext4/namei.c
>> +++ b/fs/ext4/namei.c
>> @@ -2938,6 +2938,11 @@ static int ext4_link(struct dentry *old_dentry,
>>       if (inode->i_nlink >= EXT4_LINK_MAX)
>>               return -EMLINK;
>>
>> +     if ((ext4_test_inode_flag(dir, EXT4_INODE_PROJINHERIT)) &&
>> +         (!projid_eq(EXT4_I(dir)->i_projid,
>> +                     EXT4_I(old_dentry->d_inode)->i_projid)))
>> +             return -EXDEV;
>> +
>>       dquot_initialize(dir);
>>
>> retry:
>> @@ -3217,6 +3222,11 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
>>       int credits;
>>       u8 old_file_type;
>>
>> +     if ((ext4_test_inode_flag(new_dir, EXT4_INODE_PROJINHERIT)) &&
>> +         (!projid_eq(EXT4_I(new_dir)->i_projid,
>> +                     EXT4_I(old_dentry->d_inode)->i_projid)))
>> +             return -EXDEV;
>> +
>>       dquot_initialize(old.dir);
>>       dquot_initialize(new.dir);
>>
>> @@ -3395,6 +3405,14 @@ static int ext4_cross_rename(struct inode *old_dir, struct dentry *old_dentry,
>>       u8 new_file_type;
>>       int retval;
>>
>> +     if ((ext4_test_inode_flag(new_dir, EXT4_INODE_PROJINHERIT) &&
>> +          !projid_eq(EXT4_I(new_dir)->i_projid,
>> +                     EXT4_I(old_dentry->d_inode)->i_projid)) ||
>> +         (ext4_test_inode_flag(old_dir, EXT4_INODE_PROJINHERIT) &&
>> +          !projid_eq(EXT4_I(old_dir)->i_projid,
>> +                     EXT4_I(new_dentry->d_inode)->i_projid)))
>> +             return -EXDEV;
>> +
>>       dquot_initialize(old.dir);
>>       dquot_initialize(new.dir);
>>
>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>> index bff3427..04c6cc3 100644
>> --- a/fs/ext4/super.c
>> +++ b/fs/ext4/super.c
>> @@ -1073,6 +1073,7 @@ static const struct dquot_operations ext4_quota_operations = {
>>       .write_info     = ext4_write_info,
>>       .alloc_dquot    = dquot_alloc,
>>       .destroy_dquot  = dquot_destroy,
>> +     .get_projid     = ext4_get_projid,
>> };
>>
>> static const struct quotactl_ops ext4_qctl_operations = {
>> diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
>> index 3735fa0..fcbf647 100644
>> --- a/include/uapi/linux/fs.h
>> +++ b/include/uapi/linux/fs.h
>> @@ -195,6 +195,7 @@ struct inodes_stat_t {
>> #define FS_EXTENT_FL                  0x00080000 /* Extents */
>> #define FS_DIRECTIO_FL                        0x00100000 /* Use direct i/o */
>> #define FS_NOCOW_FL                   0x00800000 /* Do not cow file */
>> +#define FS_PROJINHERIT_FL            0x20000000 /* Create with parents projid */
>> #define FS_RESERVED_FL                        0x80000000 /* reserved for ext2 lib */
>>
>> #define FS_FL_USER_VISIBLE            0x0003DFFF /* User visible flags */
>> --
>> 1.7.1
>>
>
>
> Cheers, Andreas
>
>
>
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [v12 0/5] ext4: add project quota support
       [not found]     ` <CAMXgnP6RF4HPDyugvnMKn3rDnuG7j1cz-xFtEMWx4Va1rEHVEQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-04-14  8:21       ` Jan Kara
       [not found]         ` <20150414082115.GB23327-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2015-04-14  8:21 UTC (permalink / raw)
  To: Alban Crequy
  Cc: adilger-m1MBpc4rdrD3fQ9qLvQP4Q, tytso-3s7WtUTddSA, Linux API,
	Linux Containers, hch-wEGCiKHe2LqWVfeAwA7xHQ,
	dmonakhov-GEFAQzZX7r8dnm+yROfE0A,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn, Li Xi,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, jack-AlSwsSmVLrQ,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA

On Sun 12-04-15 17:36:53, Alban Crequy wrote:
> On 9 April 2015 at 17:14, Li Xi <pkuelelixi-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > The following patches propose an implementation of project quota
> > support for ext4. A project is an aggregate of unrelated inodes
> > which might scatter in different directories. Inodes that belong
> > to the same project possess an identical identification i.e.
> > 'project ID', just like every inode has its user/group
> > identification. The following patches add project quota as
> > supplement to the former uer/group quota types.
> > (...)
> 
> Thanks for this work, I would like to use this for containers. I am
> adding containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org in Cc.
> 
> To make sure I understand correctly, I will describe the configuration
> I have in mind and hopefully someone can tell me if it makes sense.
> 
> Containers created by rkt (https://github.com/coreos/rkt) use an
> overlay filesystem as root and the lowerdir/upperdir directories are
> based on an ext4 filesystem outside of the container's reach. The
> lowerdir is the base image, and several container instances can
> potentially use the same lowerdir. Each container has its upperdir
> containing their changes.
> 
> With your patch set, I could assign a different projid to the upperdir
> of each container with a specific quota. Then it will limit how much
> the container will be able to write. I don't know if the overlay's
> workdir would need to have projid too.
  I don't think overlay's workdir needs project id. Limits will be simply
checked when storing data into upperdir by overlayfs. Overlayfs will get
EDQUOT which it will report back into the user.

> When a quota warning is sent on netlink, it is received only in the
> initial user namespace and the processes in a different user namespace
> will not be able to receive the netlink warnings. The user will only
> receive a warning through the control terminal.
  So I don't know much about namespaces but I don't see how quota netlink
messages would be connected with *user* namespaces. But you are right that
quota netlink messages will contain ID of the violator mapped into init
user namespace so it won't make sense to processes in other user namespaces
even if they were able to receive it.

> Since rkt does not use user namespaces yet, a rkt container could
> unfortunately receive quota warnings through netlink concerning the
> host or other containers. Or is it restricted to init_net?
  Quota netlink messages are sent only in init_net namespace (since quota
netlink protocol wasn't made namespace aware). So this shouldn't be an
issue.
 
> quotactl() can be used in a rkt container if the proccesses in the
> container can guess somehow which block device is used by the
> filesystem hosting the overlay's upperdir and if they can mknod it
> somewhere. Usually, containers don't restrict mknod but just restrict
> read-write access through the device cgroup. The read-write access is
> irrelevant for quotactl(): quotactl() just check that the device node
> exists and that it is not on a nodev mount. The nodev check does not
> restrict containers here because they usually have a /dev mounted as
> tmpfs without the nodev option.
  Correct. This raises a somewhat unrelated question: Does this mean that a
container is able to mount arbitrary block device? Because also there we
just pass a device path to the kernel...

> Containers that don't use user namespaces (so no projid mapping) would
> be able to query quotas for projid assigned to other containers
> (unfortunately). They would be able to change the quota of other
> containers if they are privileged enough to be given CAP_SYS_RESOURCE.
  Yes.

> Containers using user namespaces would not be able to change any quota
> config because they don't have CAP_SYS_RESOURCE in the init user
> namespace. If they are configured with a proper projid mapping, they
> would only be able to query the projid they are assigned (they could
> guess which projid to query by looking at /proc/self/projid_map).
  Yes.

> Do you know if someone is working on the documentation? It would be
> nice if filesystems/quota.txt could say who can receive the quota
> warnings on netlink (which namespace) and if it could give some
  I have added that.

> information about projid. But maybe this belong to the proc(5) and
> user_namespaces(7) manpages as well.
  Project ID in VFS quotas is fairly new thing. Once ext4 gains support for
it, I can add some documentation.

> Is there any suggestions how to allocate projid in userspace?
> Something like /etc/subprojid similar to /etc/subuid?
  I guess you need some coordination between namespaces? I only know that
traditionally xfsprogs use /etc/projid for name->project id translation 
and /etc/projects contain roots of directory trees for which you wish to
maintain directory quota together with project ids for each of the trees.

								Honza
-- 
Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [v12 0/5] ext4: add project quota support
       [not found]         ` <20150414082115.GB23327-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
@ 2015-04-14 10:07           ` Alban Crequy
  2015-04-14 10:07           ` Alban Crequy
  1 sibling, 0 replies; 14+ messages in thread
From: Alban Crequy @ 2015-04-14 10:07 UTC (permalink / raw)
  To: Jan Kara
  Cc: adilger-m1MBpc4rdrD3fQ9qLvQP4Q, tytso-3s7WtUTddSA, Linux API,
	Linux Containers, Alban Crequy, hch-wEGCiKHe2LqWVfeAwA7xHQ,
	dmonakhov-GEFAQzZX7r8dnm+yROfE0A,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn, Li Xi,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA

On Tue, Apr 14, 2015 at 10:21 AM, Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org> wrote:
> On Sun 12-04-15 17:36:53, Alban Crequy wrote:
>> On 9 April 2015 at 17:14, Li Xi <pkuelelixi-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> > The following patches propose an implementation of project quota
>> > support for ext4. A project is an aggregate of unrelated inodes
>> > which might scatter in different directories. Inodes that belong
>> > to the same project possess an identical identification i.e.
>> > 'project ID', just like every inode has its user/group
>> > identification. The following patches add project quota as
>> > supplement to the former uer/group quota types.
>> > (...)
>>
>> Thanks for this work, I would like to use this for containers. I am
>> adding containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org in Cc.
>>
>> To make sure I understand correctly, I will describe the configuration
>> I have in mind and hopefully someone can tell me if it makes sense.
>>
>> Containers created by rkt (https://github.com/coreos/rkt) use an
>> overlay filesystem as root and the lowerdir/upperdir directories are
>> based on an ext4 filesystem outside of the container's reach. The
>> lowerdir is the base image, and several container instances can
>> potentially use the same lowerdir. Each container has its upperdir
>> containing their changes.
>>
>> With your patch set, I could assign a different projid to the upperdir
>> of each container with a specific quota. Then it will limit how much
>> the container will be able to write. I don't know if the overlay's
>> workdir would need to have projid too.
>   I don't think overlay's workdir needs project id. Limits will be simply
> checked when storing data into upperdir by overlayfs. Overlayfs will get
> EDQUOT which it will report back into the user.

Noted, thanks.

>> When a quota warning is sent on netlink, it is received only in the
>> initial user namespace and the processes in a different user namespace
>> will not be able to receive the netlink warnings. The user will only
>> receive a warning through the control terminal.
>   So I don't know much about namespaces but I don't see how quota netlink
> messages would be connected with *user* namespaces. But you are right that
> quota netlink messages will contain ID of the violator mapped into init
> user namespace so it won't make sense to processes in other user namespaces
> even if they were able to receive it.
>
>> Since rkt does not use user namespaces yet, a rkt container could
>> unfortunately receive quota warnings through netlink concerning the
>> host or other containers. Or is it restricted to init_net?
>   Quota netlink messages are sent only in init_net namespace (since quota
> netlink protocol wasn't made namespace aware). So this shouldn't be an
> issue.

You're right, I misread it, it references the init network namespace
and not the user namespace:

fs/quota/netlink.c:quota_send_warning() uses genlmsg_multicast() which
specifically references init_net:

         return genlmsg_multicast_netns(family, &init_net, skb,
                                        portid, group, flags);

>> quotactl() can be used in a rkt container if the proccesses in the
>> container can guess somehow which block device is used by the
>> filesystem hosting the overlay's upperdir and if they can mknod it
>> somewhere. Usually, containers don't restrict mknod but just restrict
>> read-write access through the device cgroup. The read-write access is
>> irrelevant for quotactl(): quotactl() just check that the device node
>> exists and that it is not on a nodev mount. The nodev check does not
>> restrict containers here because they usually have a /dev mounted as
>> tmpfs without the nodev option.
>   Correct. This raises a somewhat unrelated question: Does this mean that a
> container is able to mount arbitrary block device? Because also there we
> just pass a device path to the kernel...

The process would still need CAP_SYS_ADMIN and there are additional
checks when the user namespace is not the initial user namespace:

fs/namespace.c do_new_mount()
        if (user_ns != &init_user_ns) {
                if (!(type->fs_flags & FS_USERNS_MOUNT)) {
                        put_filesystem(type);
                        return -EPERM;
                }...

For example, FS_USERNS_MOUNT is set on devpts_fs_type but not on
ext4_fs_type. So it's not possible to mount ext4 in a different user
namespace. Containers that don't use user namespaces can avoid giving
CAP_SYS_ADMIN or restrict mount with some AppArmor rules.

>> Containers that don't use user namespaces (so no projid mapping) would
>> be able to query quotas for projid assigned to other containers
>> (unfortunately). They would be able to change the quota of other
>> containers if they are privileged enough to be given CAP_SYS_RESOURCE.
>   Yes.
>
>> Containers using user namespaces would not be able to change any quota
>> config because they don't have CAP_SYS_RESOURCE in the init user
>> namespace. If they are configured with a proper projid mapping, they
>> would only be able to query the projid they are assigned (they could
>> guess which projid to query by looking at /proc/self/projid_map).
>   Yes.
>
>> Do you know if someone is working on the documentation? It would be
>> nice if filesystems/quota.txt could say who can receive the quota
>> warnings on netlink (which namespace) and if it could give some
>   I have added that.
>
>> information about projid. But maybe this belong to the proc(5) and
>> user_namespaces(7) manpages as well.
>   Project ID in VFS quotas is fairly new thing. Once ext4 gains support for
> it, I can add some documentation.
>
>> Is there any suggestions how to allocate projid in userspace?
>> Something like /etc/subprojid similar to /etc/subuid?
>   I guess you need some coordination between namespaces?

Yes, I was thinking if Docker uses projid for some containers, rkt
uses other projid for other containers and the sysadmin also define
some projid manually.

> I only know that
> traditionally xfsprogs use /etc/projid for name->project id translation
> and /etc/projects contain roots of directory trees for which you wish to
> maintain directory quota together with project ids for each of the trees.

Thanks for the pointer.

Alban

>
>                                                                 Honza
> --
> Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>
> SUSE Labs, CR
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [v12 0/5] ext4: add project quota support
       [not found]         ` <20150414082115.GB23327-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
  2015-04-14 10:07           ` Alban Crequy
@ 2015-04-14 10:07           ` Alban Crequy
  1 sibling, 0 replies; 14+ messages in thread
From: Alban Crequy @ 2015-04-14 10:07 UTC (permalink / raw)
  To: Jan Kara
  Cc: Alban Crequy, adilger-m1MBpc4rdrD3fQ9qLvQP4Q, tytso-3s7WtUTddSA,
	Linux API, Linux Containers, hch-wEGCiKHe2LqWVfeAwA7xHQ,
	dmonakhov-GEFAQzZX7r8dnm+yROfE0A,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn, Li Xi,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA

On Tue, Apr 14, 2015 at 10:21 AM, Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org> wrote:
> On Sun 12-04-15 17:36:53, Alban Crequy wrote:
>> On 9 April 2015 at 17:14, Li Xi <pkuelelixi-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> > The following patches propose an implementation of project quota
>> > support for ext4. A project is an aggregate of unrelated inodes
>> > which might scatter in different directories. Inodes that belong
>> > to the same project possess an identical identification i.e.
>> > 'project ID', just like every inode has its user/group
>> > identification. The following patches add project quota as
>> > supplement to the former uer/group quota types.
>> > (...)
>>
>> Thanks for this work, I would like to use this for containers. I am
>> adding containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org in Cc.
>>
>> To make sure I understand correctly, I will describe the configuration
>> I have in mind and hopefully someone can tell me if it makes sense.
>>
>> Containers created by rkt (https://github.com/coreos/rkt) use an
>> overlay filesystem as root and the lowerdir/upperdir directories are
>> based on an ext4 filesystem outside of the container's reach. The
>> lowerdir is the base image, and several container instances can
>> potentially use the same lowerdir. Each container has its upperdir
>> containing their changes.
>>
>> With your patch set, I could assign a different projid to the upperdir
>> of each container with a specific quota. Then it will limit how much
>> the container will be able to write. I don't know if the overlay's
>> workdir would need to have projid too.
>   I don't think overlay's workdir needs project id. Limits will be simply
> checked when storing data into upperdir by overlayfs. Overlayfs will get
> EDQUOT which it will report back into the user.

Noted, thanks.

>> When a quota warning is sent on netlink, it is received only in the
>> initial user namespace and the processes in a different user namespace
>> will not be able to receive the netlink warnings. The user will only
>> receive a warning through the control terminal.
>   So I don't know much about namespaces but I don't see how quota netlink
> messages would be connected with *user* namespaces. But you are right that
> quota netlink messages will contain ID of the violator mapped into init
> user namespace so it won't make sense to processes in other user namespaces
> even if they were able to receive it.
>
>> Since rkt does not use user namespaces yet, a rkt container could
>> unfortunately receive quota warnings through netlink concerning the
>> host or other containers. Or is it restricted to init_net?
>   Quota netlink messages are sent only in init_net namespace (since quota
> netlink protocol wasn't made namespace aware). So this shouldn't be an
> issue.

You're right, I misread it, it references the init network namespace
and not the user namespace:

fs/quota/netlink.c:quota_send_warning() uses genlmsg_multicast() which
specifically references init_net:

         return genlmsg_multicast_netns(family, &init_net, skb,
                                        portid, group, flags);

>> quotactl() can be used in a rkt container if the proccesses in the
>> container can guess somehow which block device is used by the
>> filesystem hosting the overlay's upperdir and if they can mknod it
>> somewhere. Usually, containers don't restrict mknod but just restrict
>> read-write access through the device cgroup. The read-write access is
>> irrelevant for quotactl(): quotactl() just check that the device node
>> exists and that it is not on a nodev mount. The nodev check does not
>> restrict containers here because they usually have a /dev mounted as
>> tmpfs without the nodev option.
>   Correct. This raises a somewhat unrelated question: Does this mean that a
> container is able to mount arbitrary block device? Because also there we
> just pass a device path to the kernel...

The process would still need CAP_SYS_ADMIN and there are additional
checks when the user namespace is not the initial user namespace:

fs/namespace.c do_new_mount()
        if (user_ns != &init_user_ns) {
                if (!(type->fs_flags & FS_USERNS_MOUNT)) {
                        put_filesystem(type);
                        return -EPERM;
                }...

For example, FS_USERNS_MOUNT is set on devpts_fs_type but not on
ext4_fs_type. So it's not possible to mount ext4 in a different user
namespace. Containers that don't use user namespaces can avoid giving
CAP_SYS_ADMIN or restrict mount with some AppArmor rules.

>> Containers that don't use user namespaces (so no projid mapping) would
>> be able to query quotas for projid assigned to other containers
>> (unfortunately). They would be able to change the quota of other
>> containers if they are privileged enough to be given CAP_SYS_RESOURCE.
>   Yes.
>
>> Containers using user namespaces would not be able to change any quota
>> config because they don't have CAP_SYS_RESOURCE in the init user
>> namespace. If they are configured with a proper projid mapping, they
>> would only be able to query the projid they are assigned (they could
>> guess which projid to query by looking at /proc/self/projid_map).
>   Yes.
>
>> Do you know if someone is working on the documentation? It would be
>> nice if filesystems/quota.txt could say who can receive the quota
>> warnings on netlink (which namespace) and if it could give some
>   I have added that.
>
>> information about projid. But maybe this belong to the proc(5) and
>> user_namespaces(7) manpages as well.
>   Project ID in VFS quotas is fairly new thing. Once ext4 gains support for
> it, I can add some documentation.
>
>> Is there any suggestions how to allocate projid in userspace?
>> Something like /etc/subprojid similar to /etc/subuid?
>   I guess you need some coordination between namespaces?

Yes, I was thinking if Docker uses projid for some containers, rkt
uses other projid for other containers and the sysadmin also define
some projid manually.

> I only know that
> traditionally xfsprogs use /etc/projid for name->project id translation
> and /etc/projects contain roots of directory trees for which you wish to
> maintain directory quota together with project ids for each of the trees.

Thanks for the pointer.

Alban

>
>                                                                 Honza
> --
> Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>
> SUSE Labs, CR
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-04-14 10:07 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-09 15:14 [v12 0/5] ext4: add project quota support Li Xi
2015-04-09 15:14 ` [v12 1/5] vfs: adds general codes to enforces project quota limits Li Xi
     [not found]   ` <1428592477-8212-2-git-send-email-lixi-LfVdkaOWEx8@public.gmane.org>
2015-04-12 13:49     ` Alban Crequy
2015-04-09 15:14 ` [v12 2/5] ext4: adds project ID support Li Xi
     [not found]   ` <1428592477-8212-3-git-send-email-lixi-LfVdkaOWEx8@public.gmane.org>
2015-04-10 23:37     ` Andreas Dilger
     [not found]       ` <C5339F6B-2DAE-4984-86CF-B2990CEC3550-m1MBpc4rdrD3fQ9qLvQP4Q@public.gmane.org>
2015-04-13 16:33         ` Li Xi
2015-04-09 15:14 ` [v12 3/5] ext4: adds project quota support Li Xi
2015-04-09 15:14 ` [v12 4/5] ext4: adds FS_IOC_FSSETXATTR/FS_IOC_FSGETXATTR interface support Li Xi
2015-04-09 15:14 ` [v12 5/5] ext4: cleanup inode flag definitions Li Xi
     [not found] ` <1428592477-8212-1-git-send-email-lixi-LfVdkaOWEx8@public.gmane.org>
2015-04-12 15:36   ` [v12 0/5] ext4: add project quota support Alban Crequy
2015-04-12 15:36   ` Alban Crequy
     [not found]     ` <CAMXgnP6RF4HPDyugvnMKn3rDnuG7j1cz-xFtEMWx4Va1rEHVEQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-04-14  8:21       ` Jan Kara
     [not found]         ` <20150414082115.GB23327-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2015-04-14 10:07           ` Alban Crequy
2015-04-14 10:07           ` Alban Crequy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.