[PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability

* [PATCH -V7 00/26]  New ACL format for better NFSv4 acl interoperability
@ 2011-10-18 15:32 Aneesh Kumar K.V
  2011-10-18 15:32   ` Aneesh Kumar K.V
                   ` (27 more replies)
  0 siblings, 28 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

Hi,

The following set of patches implements the VFS and ext4 changes needed for
a new ACL model for Linux. Rich ACLs are an implementation of NFSv4 ACLs,
extended by file masks to fit into the standard POSIX file permission model.
They are designed to work seamlessly locally as well as across the NFSv4 and
CIFS/SMB2 network file system protocols.

A user-space utility for displaying and changing richacls is available at [4]
(a number of examples can be found at http://acl.bestbits.at/richacl/examples.html).

[4] git://github.com/kvaneesh/richacl-tools.git master

To test richacl on ext4, use tune2fs -O richacl to enable the richacl feature
and mount the file system using the -o acl mount option.

More details regarding richacl can be found at
http://acl.bestbits.at/richacl/

Changes from v6:
a) Update patches based on review comments.
b) Add Acked-by: tags.
c) Rebase to 3.1-rc10.

A git repository with all the patches can be found at
git://github.com/kvaneesh/linux.git richacl

IMHO the patches are ready to be merged upstream. How do we push these changes
to Linus' tree? Andrew, Viro, any comments on how we can get this merged upstream?

-aneesh

^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH -V7 01/26] vfs: Indicate that the permission functions take all the MAY_* flags
@ 2011-10-18 15:32   ` Aneesh Kumar K.V
  0 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/namei.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 0b3138d..2a4574f 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -257,7 +257,7 @@ other_perms:
 /**
  * generic_permission -  check for access rights on a Posix-like filesystem
  * @inode:	inode to check access rights for
- * @mask:	right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ * @mask:	right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC, ...)
  *
  * Used to check for read/write/execute permissions on a file.
  * We use "fsuid" for this, letting us set arbitrary permissions
@@ -331,7 +331,7 @@ static inline int do_inode_permission(struct inode *inode, int mask)
 /**
  * inode_permission  -  check for access rights to a given inode
  * @inode:	inode to check permission on
- * @mask:	right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ * @mask:	right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC, ...)
  *
  * Used to check for read/write/execute permissions on an inode.
  * We use "fsuid" for this, letting us set arbitrary permissions
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 02/26] vfs: Add hex format for MAY_* flag values
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
  2011-10-18 15:32   ` Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32 ` [PATCH -V7 03/26] vfs: Pass all mask flags down to iop->check_acl Aneesh Kumar K.V
                   ` (25 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

We are going to add more flags, and having them in hex format
makes that simpler.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 include/linux/fs.h |   17 +++++++++--------
 1 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 277f497..c1884e9 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -58,14 +58,15 @@ struct inodes_stat_t {
 
 #define NR_FILE  8192	/* this can well be larger on a larger system */
 
-#define MAY_EXEC 1
-#define MAY_WRITE 2
-#define MAY_READ 4
-#define MAY_APPEND 8
-#define MAY_ACCESS 16
-#define MAY_OPEN 32
-#define MAY_CHDIR 64
-#define MAY_NOT_BLOCK 128	/* called from RCU mode, don't block */
+#define MAY_EXEC		0x00000001
+#define MAY_WRITE		0x00000002
+#define MAY_READ		0x00000004
+#define MAY_APPEND		0x00000008
+#define MAY_ACCESS		0x00000010
+#define MAY_OPEN		0x00000020
+#define MAY_CHDIR		0x00000040
+/* called from RCU mode, don't block */
+#define MAY_NOT_BLOCK		0x00000080
 
 /*
  * flags in file.f_mode.  Note that FMODE_READ and FMODE_WRITE must correspond
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 03/26] vfs: Pass all mask flags down to iop->check_acl
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
  2011-10-18 15:32   ` Aneesh Kumar K.V
  2011-10-18 15:32 ` [PATCH -V7 02/26] vfs: Add hex format for MAY_* flag values Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32 ` [PATCH -V7 04/26] vfs: Add a comment to inode_permission() Aneesh Kumar K.V
                   ` (24 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

Some file permission models differentiate between writing to a file
(MAY_WRITE) and appending to it (MAY_WRITE | MAY_APPEND).  Pass all the
mask flags down to iop->check_acl so that filesystems can distinguish
between writing and appending.

All users of iop->check_acl pass the mask value back into
posix_acl_permission(); strip off the additional mask flags there.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/namei.c     |    2 --
 fs/posix_acl.c |    2 ++
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 2a4574f..276cd30 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -227,8 +227,6 @@ static int acl_permission_check(struct inode *inode, int mask)
 {
 	unsigned int mode = inode->i_mode;
 
-	mask &= MAY_READ | MAY_WRITE | MAY_EXEC | MAY_NOT_BLOCK;
-
 	if (current_user_ns() != inode_userns(inode))
 		goto other_perms;
 
diff --git a/fs/posix_acl.c b/fs/posix_acl.c
index 10027b4..cea4623 100644
--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -218,6 +218,8 @@ posix_acl_permission(struct inode *inode, const struct posix_acl *acl, int want)
 	const struct posix_acl_entry *pa, *pe, *mask_obj;
 	int found = 0;
 
+	want &= MAY_READ | MAY_WRITE | MAY_EXEC | MAY_NOT_BLOCK;
+
 	FOREACH_ACL_ENTRY(pa, acl, pe) {
                 switch(pa->e_tag) {
                         case ACL_USER_OBJ:
-- 
1.7.5.4
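
The effect of moving the mask stripping can be modelled in user space. The
following is only a sketch, not kernel code: the MAY_* values are copied from
patch 02/26, while sketch_posix_want(), sketch_is_append() and
sketch_check_append() are hypothetical names used purely for illustration.

```c
#include <assert.h>

/* Values as defined in include/linux/fs.h by patch 02/26. */
#define MAY_EXEC      0x00000001
#define MAY_WRITE     0x00000002
#define MAY_READ      0x00000004
#define MAY_APPEND    0x00000008
#define MAY_NOT_BLOCK 0x00000080

/* Hypothetical: the part of the mask posix_acl_permission() keeps. */
int sketch_posix_want(int want)
{
	return want & (MAY_READ | MAY_WRITE | MAY_EXEC | MAY_NOT_BLOCK);
}

/* Hypothetical: how a richer model could tell append from plain write. */
int sketch_is_append(int mask)
{
	return (mask & (MAY_WRITE | MAY_APPEND)) == (MAY_WRITE | MAY_APPEND);
}

void sketch_check_append(void)
{
	int mask = MAY_WRITE | MAY_APPEND;

	/* A model that cares about append still sees the flag ... */
	assert(sketch_is_append(mask));
	assert(!sketch_is_append(MAY_WRITE));
	/* ... while the POSIX ACL check only ever sees MAY_WRITE. */
	assert(sketch_posix_want(mask) == MAY_WRITE);
}
```

The point of the model is that the extra flags now survive until the
filesystem's check_acl hook, and only the POSIX ACL code discards them.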


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 04/26] vfs: Add a comment to inode_permission()
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (2 preceding siblings ...)
  2011-10-18 15:32 ` [PATCH -V7 03/26] vfs: Pass all mask flags down to iop->check_acl Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32 ` [PATCH -V7 05/26] vfs: Add generic IS_ACL() test for acl support Aneesh Kumar K.V
                   ` (23 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/namei.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 276cd30..9061157 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -221,7 +221,7 @@ static int check_acl(struct inode *inode, int mask)
 }
 
 /*
- * This does basic POSIX ACL permission checking
+ * This does the basic permission checking
  */
 static int acl_permission_check(struct inode *inode, int mask)
 {
@@ -271,7 +271,7 @@ int generic_permission(struct inode *inode, int mask)
 	int ret;
 
 	/*
-	 * Do the basic POSIX ACL permission checks.
+	 * Do the basic permission checks.
 	 */
 	ret = acl_permission_check(inode, mask);
 	if (ret != -EACCES)
@@ -335,6 +335,8 @@ static inline int do_inode_permission(struct inode *inode, int mask)
  * We use "fsuid" for this, letting us set arbitrary permissions
  * for filesystem access without changing the "normal" uids which
  * are used for other things.
+ *
+ * When checking for MAY_APPEND, MAY_WRITE must also be set in @mask.
  */
 int inode_permission(struct inode *inode, int mask)
 {
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 05/26] vfs: Add generic IS_ACL() test for acl support
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (3 preceding siblings ...)
  2011-10-18 15:32 ` [PATCH -V7 04/26] vfs: Add a comment to inode_permission() Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32 ` [PATCH -V7 06/26] vfs: Add IS_RICHACL() test for richacl support Aneesh Kumar K.V
                   ` (22 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

When IS_POSIXACL() is true, the vfs does not apply the umask.  Other acl
models will need the same exception, so introduce a separate IS_ACL()
test.

The IS_POSIXACL() test is still needed so that nfsd can determine when
the underlying file system supports POSIX ACLs (as opposed to some other
kind).

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/namei.c         |    6 +++---
 include/linux/fs.h |    8 +++++++-
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 9061157..cf8b2f0 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2166,7 +2166,7 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 	/* Negative dentry, just create the file */
 	if (!dentry->d_inode) {
 		int mode = op->mode;
-		if (!IS_POSIXACL(dir->d_inode))
+		if (!IS_ACL(dir->d_inode))
 			mode &= ~current_umask();
 		/*
 		 * This write is needed to ensure that a
@@ -2484,7 +2484,7 @@ SYSCALL_DEFINE4(mknodat, int, dfd, const char __user *, filename, int, mode,
 	if (IS_ERR(dentry))
 		return PTR_ERR(dentry);
 
-	if (!IS_POSIXACL(path.dentry->d_inode))
+	if (!IS_ACL(path.dentry->d_inode))
 		mode &= ~current_umask();
 	error = may_mknod(mode);
 	if (error)
@@ -2553,7 +2553,7 @@ SYSCALL_DEFINE3(mkdirat, int, dfd, const char __user *, pathname, int, mode)
 	if (IS_ERR(dentry))
 		return PTR_ERR(dentry);
 
-	if (!IS_POSIXACL(path.dentry->d_inode))
+	if (!IS_ACL(path.dentry->d_inode))
 		mode &= ~current_umask();
 	error = mnt_want_write(path.mnt);
 	if (error)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c1884e9..1994b84 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -201,7 +201,7 @@ struct inodes_stat_t {
 #define MS_VERBOSE	32768	/* War is peace. Verbosity is silence.
 				   MS_VERBOSE is deprecated. */
 #define MS_SILENT	32768
-#define MS_POSIXACL	(1<<16)	/* VFS does not apply the umask */
+#define MS_POSIXACL	(1<<16) /* Supports POSIX ACLs */
 #define MS_UNBINDABLE	(1<<17)	/* change to unbindable */
 #define MS_PRIVATE	(1<<18)	/* change to private */
 #define MS_SLAVE	(1<<19)	/* change to slave */
@@ -279,6 +279,12 @@ struct inodes_stat_t {
 #define IS_AUTOMOUNT(inode)	((inode)->i_flags & S_AUTOMOUNT)
 #define IS_NOSEC(inode)		((inode)->i_flags & S_NOSEC)
 
+/*
+ * IS_ACL() tells the VFS to not apply the umask
+ * and use check_acl for acl permission checks when defined.
+ */
+#define IS_ACL(inode)		__IS_FLG(inode, MS_POSIXACL)
+
 /* the read-only stuff doesn't really belong here, but any other place is
    probably as bad and I don't want to create yet another include file. */
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 06/26] vfs: Add IS_RICHACL() test for richacl support
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (4 preceding siblings ...)
  2011-10-18 15:32 ` [PATCH -V7 05/26] vfs: Add generic IS_ACL() test for acl support Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32 ` [PATCH -V7 07/26] vfs: Optimize out IS_RICHACL() if CONFIG_FS_RICHACL is not defined Aneesh Kumar K.V
                   ` (21 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

Introduce a new MS_RICHACL super-block flag and a new IS_RICHACL() test
which file systems like nfs can use.  IS_ACL() is true if IS_POSIXACL()
or IS_RICHACL() is true.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 include/linux/fs.h |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1994b84..7b4bfe6 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -210,6 +210,7 @@ struct inodes_stat_t {
 #define MS_KERNMOUNT	(1<<22) /* this is a kern_mount call */
 #define MS_I_VERSION	(1<<23) /* Update inode I_version field */
 #define MS_STRICTATIME	(1<<24) /* Always perform atime updates */
+#define MS_RICHACL	(1<<25) /* Supports richacls */
 #define MS_NOSEC	(1<<28)
 #define MS_BORN		(1<<29)
 #define MS_ACTIVE	(1<<30)
@@ -270,6 +271,7 @@ struct inodes_stat_t {
 #define IS_APPEND(inode)	((inode)->i_flags & S_APPEND)
 #define IS_IMMUTABLE(inode)	((inode)->i_flags & S_IMMUTABLE)
 #define IS_POSIXACL(inode)	__IS_FLG(inode, MS_POSIXACL)
+#define IS_RICHACL(inode)	__IS_FLG(inode, MS_RICHACL)
 
 #define IS_DEADDIR(inode)	((inode)->i_flags & S_DEAD)
 #define IS_NOCMTIME(inode)	((inode)->i_flags & S_NOCMTIME)
@@ -283,7 +285,7 @@ struct inodes_stat_t {
  * IS_ACL() tells the VFS to not apply the umask
  * and use check_acl for acl permission checks when defined.
  */
-#define IS_ACL(inode)		__IS_FLG(inode, MS_POSIXACL)
+#define IS_ACL(inode)		__IS_FLG(inode, MS_POSIXACL | MS_RICHACL)
 
 /* the read-only stuff doesn't really belong here, but any other place is
    probably as bad and I don't want to create yet another include file. */
-- 
1.7.5.4
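
The interaction between IS_ACL() and the umask can be illustrated with a small
user-space model. This is a sketch under assumptions: struct sketch_sb and the
sketch_* functions are hypothetical stand-ins, and only the two MS_* values are
taken from the patches above.

```c
#include <assert.h>

#define MS_POSIXACL (1 << 16) /* from include/linux/fs.h */
#define MS_RICHACL  (1 << 25) /* added by this patch */

/* Hypothetical stand-in for the super-block flags behind an inode. */
struct sketch_sb {
	unsigned long s_flags;
};

/* Models IS_ACL(): true if either ACL flavour is enabled. */
int sketch_is_acl(const struct sketch_sb *sb)
{
	return (sb->s_flags & (MS_POSIXACL | MS_RICHACL)) != 0;
}

/* Models the "if (!IS_ACL(...)) mode &= ~current_umask();" pattern. */
int sketch_create_mode(const struct sketch_sb *sb, int mode, int umask_bits)
{
	if (!sketch_is_acl(sb))
		mode &= ~umask_bits;
	return mode;
}

void sketch_check_umask(void)
{
	struct sketch_sb plain = { 0 };
	struct sketch_sb rich = { MS_RICHACL };

	/* Without ACL support the VFS applies the umask itself. */
	assert(sketch_create_mode(&plain, 0666, 022) == 0644);
	/* With richacls the mode is passed through unchanged. */
	assert(sketch_create_mode(&rich, 0666, 022) == 0666);
}
```

In the model, a file system that sets either flag receives the unmasked mode
and is expected to handle the umask itself.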


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 07/26] vfs: Optimize out IS_RICHACL() if CONFIG_FS_RICHACL is not defined
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (5 preceding siblings ...)
  2011-10-18 15:32 ` [PATCH -V7 06/26] vfs: Add IS_RICHACL() test for richacl support Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32 ` [PATCH -V7 08/26] vfs: Add new file and directory create permission flags Aneesh Kumar K.V
                   ` (20 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

If CONFIG_FS_RICHACL is not defined, define IS_RICHACL() as 0 so that
the compiler can optimize out the richacl checks.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/Kconfig         |    3 +++
 include/linux/fs.h |    5 +++++
 2 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 9fe0b34..7939190 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -46,6 +46,9 @@ endif # BLOCK
 config FS_POSIX_ACL
 	def_bool n
 
+config FS_RICHACL
+	def_bool n
+
 config EXPORTFS
 	tristate
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7b4bfe6..f3ebf86 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -271,7 +271,12 @@ struct inodes_stat_t {
 #define IS_APPEND(inode)	((inode)->i_flags & S_APPEND)
 #define IS_IMMUTABLE(inode)	((inode)->i_flags & S_IMMUTABLE)
 #define IS_POSIXACL(inode)	__IS_FLG(inode, MS_POSIXACL)
+
+#ifdef CONFIG_FS_RICHACL
 #define IS_RICHACL(inode)	__IS_FLG(inode, MS_RICHACL)
+#else
+#define IS_RICHACL(inode)	0
+#endif
 
 #define IS_DEADDIR(inode)	((inode)->i_flags & S_DEAD)
 #define IS_NOCMTIME(inode)	((inode)->i_flags & S_NOCMTIME)
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 08/26] vfs: Add new file and directory create permission flags
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (6 preceding siblings ...)
  2011-10-18 15:32 ` [PATCH -V7 07/26] vfs: Optimize out IS_RICHACL() if CONFIG_FS_RICHACL is not defined Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-19 16:42   ` J. Bruce Fields
  2011-10-18 15:32 ` [PATCH -V7 09/26] vfs: Add delete child and delete self " Aneesh Kumar K.V
                   ` (19 subsequent siblings)
  27 siblings, 1 reply; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

Some permission models distinguish between the permission to create a
non-directory and a directory.  Pass this information down to
inode_permission() as mask flags.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/namei.c         |   26 +++++++++++++++-----------
 include/linux/fs.h |    2 ++
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index cf8b2f0..f6184b8 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -336,7 +336,8 @@ static inline int do_inode_permission(struct inode *inode, int mask)
  * for filesystem access without changing the "normal" uids which
  * are used for other things.
  *
- * When checking for MAY_APPEND, MAY_WRITE must also be set in @mask.
+ * When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
+ * MAY_WRITE must also be set in @mask.
  */
 int inode_permission(struct inode *inode, int mask)
 {
@@ -1914,13 +1915,15 @@ static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
  *  3. We should have write and exec permissions on dir
  *  4. We can't do it if dir is immutable (done in permission())
  */
-static inline int may_create(struct inode *dir, struct dentry *child)
+static inline int may_create(struct inode *dir, struct dentry *child, int isdir)
 {
+	int mask = isdir ? MAY_CREATE_DIR : MAY_CREATE_FILE;
+
 	if (child->d_inode)
 		return -EEXIST;
 	if (IS_DEADDIR(dir))
 		return -ENOENT;
-	return inode_permission(dir, MAY_WRITE | MAY_EXEC);
+	return inode_permission(dir, MAY_WRITE | MAY_EXEC | mask);
 }
 
 /*
@@ -1968,7 +1971,7 @@ void unlock_rename(struct dentry *p1, struct dentry *p2)
 int vfs_create(struct inode *dir, struct dentry *dentry, int mode,
 		struct nameidata *nd)
 {
-	int error = may_create(dir, dentry);
+	int error = may_create(dir, dentry, 0);
 
 	if (error)
 		return error;
@@ -2427,7 +2430,7 @@ EXPORT_SYMBOL(user_path_create);
 
 int vfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
 {
-	int error = may_create(dir, dentry);
+	int error = may_create(dir, dentry, 0);
 
 	if (error)
 		return error;
@@ -2524,7 +2527,7 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, int, mode, unsigned, dev)
 
 int vfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
 {
-	int error = may_create(dir, dentry);
+	int error = may_create(dir, dentry, 1);
 
 	if (error)
 		return error;
@@ -2806,7 +2809,7 @@ SYSCALL_DEFINE1(unlink, const char __user *, pathname)
 
 int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname)
 {
-	int error = may_create(dir, dentry);
+	int error = may_create(dir, dentry, 0);
 
 	if (error)
 		return error;
@@ -2872,7 +2875,10 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
 	if (!inode)
 		return -ENOENT;
 
-	error = may_create(dir, new_dentry);
+	if (S_ISDIR(inode->i_mode))
+		return -EPERM;
+
+	error = may_create(dir, new_dentry, 0);
 	if (error)
 		return error;
 
@@ -2886,8 +2892,6 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
 		return -EPERM;
 	if (!dir->i_op->link)
 		return -EPERM;
-	if (S_ISDIR(inode->i_mode))
-		return -EPERM;
 
 	error = security_inode_link(old_dentry, dir, new_dentry);
 	if (error)
@@ -3097,7 +3101,7 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 		return error;
 
 	if (!new_dentry->d_inode)
-		error = may_create(new_dir, new_dentry);
+		error = may_create(new_dir, new_dentry, is_dir);
 	else
 		error = may_delete(new_dir, new_dentry, is_dir);
 	if (error)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f3ebf86..60361c6 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -67,6 +67,8 @@ struct inodes_stat_t {
 #define MAY_CHDIR		0x00000040
 /* called from RCU mode, don't block */
 #define MAY_NOT_BLOCK		0x00000080
+#define MAY_CREATE_FILE		0x00000100
+#define MAY_CREATE_DIR		0x00000200
 
 /*
  * flags in file.f_mode.  Note that FMODE_READ and FMODE_WRITE must correspond
-- 
1.7.5.4
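
The mask that may_create() now hands to inode_permission() can be reproduced
in a few lines of user-space C. This is only a sketch: the flag values come
from the patches above, and sketch_create_mask() and sketch_check_create() are
hypothetical names.

```c
#include <assert.h>

/* Values as defined in include/linux/fs.h after this patch. */
#define MAY_EXEC        0x00000001
#define MAY_WRITE       0x00000002
#define MAY_CREATE_FILE 0x00000100
#define MAY_CREATE_DIR  0x00000200

/* Hypothetical: the mask may_create() builds for a new directory entry. */
int sketch_create_mask(int isdir)
{
	int mask = isdir ? MAY_CREATE_DIR : MAY_CREATE_FILE;

	return MAY_WRITE | MAY_EXEC | mask;
}

void sketch_check_create(void)
{
	/* vfs_mkdir() passes isdir = 1. */
	assert(sketch_create_mask(1) == 0x203);
	/* vfs_create(), vfs_mknod(), vfs_symlink(), vfs_link() pass 0. */
	assert(sketch_create_mask(0) == 0x103);
	/* MAY_WRITE is always set alongside the create flags. */
	assert(sketch_create_mask(1) & MAY_WRITE);
}
```

Because MAY_WRITE and MAY_EXEC are still part of the mask, permission models
that ignore the new flags behave as before.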


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 09/26] vfs: Add delete child and delete self permission flags
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (7 preceding siblings ...)
  2011-10-18 15:32 ` [PATCH -V7 08/26] vfs: Add new file and directory create permission flags Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-19 22:09   ` J. Bruce Fields
  2011-10-18 15:32   ` Aneesh Kumar K.V
                   ` (18 subsequent siblings)
  27 siblings, 1 reply; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

Normally, deleting a file requires write access to the parent directory.
Some permission models use a different permission on the parent
directory to indicate delete access.  In addition, a process can have
per-file delete access even without delete access on the parent
directory.

Introduce two new inode_permission() mask flags and use them in
may_delete().

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/namei.c         |   42 ++++++++++++++++++++++++++++--------------
 include/linux/fs.h |    2 ++
 2 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index f6184b8..7bf42e8 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -337,7 +337,7 @@ static inline int do_inode_permission(struct inode *inode, int mask)
  * are used for other things.
  *
  * When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
- * MAY_WRITE must also be set in @mask.
+ * MAY_DELETE_CHILD, MAY_DELETE_SELF, MAY_WRITE must also be set in @mask.
  */
 int inode_permission(struct inode *inode, int mask)
 {
@@ -1853,7 +1853,7 @@ static inline int check_sticky(struct inode *dir, struct inode *inode)
 		return 0;
 
 other_userns:
-	return !ns_capable(inode_userns(inode), CAP_FOWNER);
+	return 1;
 }
 
 /*
@@ -1875,30 +1875,44 @@ other_userns:
  * 10. We don't allow removal of NFS sillyrenamed files; it's handled by
  *     nfs_async_unlink().
  */
-static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
+static int may_delete(struct inode *dir, struct dentry *victim,
+		      int isdir, int replace)
 {
-	int error;
+	struct inode *inode = victim->d_inode;
+	int mask, replace_mask = 0, error, is_sticky;
+
 
-	if (!victim->d_inode)
+	if (!inode)
 		return -ENOENT;
 
 	BUG_ON(victim->d_parent->d_inode != dir);
 	audit_inode_child(victim, dir);
 
-	error = inode_permission(dir, MAY_WRITE | MAY_EXEC);
+	mask = MAY_WRITE | MAY_EXEC | MAY_DELETE_CHILD;
+	if (replace)
+		replace_mask = S_ISDIR(inode->i_mode) ?
+				MAY_CREATE_DIR : MAY_CREATE_FILE;
+	is_sticky = check_sticky(dir, inode);
+	error = inode_permission(dir, mask | replace_mask);
+	if ((error || is_sticky) && IS_RICHACL(inode) &&
+	    (inode_permission(dir, MAY_EXEC | replace_mask) == 0) &&
+	    (inode_permission(inode, MAY_DELETE_SELF) == 0))
+		error = 0;
+	else if (!error && is_sticky &&
+		 !ns_capable(inode_userns(inode), CAP_FOWNER))
+		error = -EPERM;
 	if (error)
 		return error;
 	if (IS_APPEND(dir))
 		return -EPERM;
-	if (check_sticky(dir, victim->d_inode)||IS_APPEND(victim->d_inode)||
-	    IS_IMMUTABLE(victim->d_inode) || IS_SWAPFILE(victim->d_inode))
+	if (IS_APPEND(inode) || IS_IMMUTABLE(inode) || IS_SWAPFILE(inode))
 		return -EPERM;
 	if (isdir) {
-		if (!S_ISDIR(victim->d_inode->i_mode))
+		if (!S_ISDIR(inode->i_mode))
 			return -ENOTDIR;
 		if (IS_ROOT(victim))
 			return -EBUSY;
-	} else if (S_ISDIR(victim->d_inode->i_mode))
+	} else if (S_ISDIR(inode->i_mode))
 		return -EISDIR;
 	if (IS_DEADDIR(dir))
 		return -ENOENT;
@@ -2605,7 +2619,7 @@ void dentry_unhash(struct dentry *dentry)
 
 int vfs_rmdir(struct inode *dir, struct dentry *dentry)
 {
-	int error = may_delete(dir, dentry, 1);
+	int error = may_delete(dir, dentry, 1, 0);
 
 	if (error)
 		return error;
@@ -2700,7 +2714,7 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname)
 
 int vfs_unlink(struct inode *dir, struct dentry *dentry)
 {
-	int error = may_delete(dir, dentry, 0);
+	int error = may_delete(dir, dentry, 0, 0);
 
 	if (error)
 		return error;
@@ -3096,14 +3110,14 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	if (old_dentry->d_inode == new_dentry->d_inode)
  		return 0;
  
-	error = may_delete(old_dir, old_dentry, is_dir);
+	error = may_delete(old_dir, old_dentry, is_dir, 0);
 	if (error)
 		return error;
 
 	if (!new_dentry->d_inode)
 		error = may_create(new_dir, new_dentry, is_dir);
 	else
-		error = may_delete(new_dir, new_dentry, is_dir);
+		error = may_delete(new_dir, new_dentry, is_dir, 1);
 	if (error)
 		return error;
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 60361c6..ccece40 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -69,6 +69,8 @@ struct inodes_stat_t {
 #define MAY_NOT_BLOCK		0x00000080
 #define MAY_CREATE_FILE		0x00000100
 #define MAY_CREATE_DIR		0x00000200
+#define MAY_DELETE_CHILD	0x00000400
+#define MAY_DELETE_SELF		0x00000800
 
 /*
  * flags in file.f_mode.  Note that FMODE_READ and FMODE_WRITE must correspond
-- 
1.7.5.4
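
The decision logic added to may_delete() can be hard to follow in diff form.
The sketch below restates it as a pure function over the results of the
individual checks. It is a user-space model under assumptions, not the kernel
implementation: struct sketch_delete_inputs and the sketch_* functions are
hypothetical, and each field stands for the result of the call named in its
comment.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical: results of the checks may_delete() performs. */
struct sketch_delete_inputs {
	int dir_error;      /* inode_permission(dir, mask | replace_mask) */
	int is_sticky;      /* nonzero if check_sticky() restricts the delete */
	int is_richacl;     /* IS_RICHACL(inode) */
	int dir_exec_error; /* inode_permission(dir, MAY_EXEC | replace_mask) */
	int self_error;     /* inode_permission(inode, MAY_DELETE_SELF) */
	int has_fowner;     /* ns_capable(..., CAP_FOWNER) */
};

int sketch_delete_decision(const struct sketch_delete_inputs *in)
{
	int error = in->dir_error;

	if ((error || in->is_sticky) && in->is_richacl &&
	    in->dir_exec_error == 0 && in->self_error == 0)
		error = 0;       /* per-file delete access wins */
	else if (!error && in->is_sticky && !in->has_fowner)
		error = -EPERM;  /* classic sticky-bit restriction */
	return error;
}

void sketch_check_delete(void)
{
	/* No delete access on the directory, but DELETE_SELF on the file. */
	struct sketch_delete_inputs self_ok = {
		.dir_error = -EACCES, .is_richacl = 1,
	};
	/* Sticky directory, no richacl, caller lacks CAP_FOWNER. */
	struct sketch_delete_inputs sticky = { .is_sticky = 1 };
	/* Directory check failed and there is no richacl fallback. */
	struct sketch_delete_inputs denied = { .dir_error = -EACCES };

	assert(sketch_delete_decision(&self_ok) == 0);
	assert(sketch_delete_decision(&sticky) == -EPERM);
	assert(sketch_delete_decision(&denied) == -EACCES);
}
```

The model covers only the permission decision; the IS_APPEND(),
IS_IMMUTABLE(), IS_SWAPFILE() and directory-type checks that follow in
may_delete() are left out.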


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 10/26] vfs: Make the inode passed to inode_change_ok non-const
@ 2011-10-18 15:32   ` Aneesh Kumar K.V
  0 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

We will need to call iop->permission and iop->get_acl from
inode_change_ok() for additional permission checks, and both take a
non-const inode.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/attr.c          |    2 +-
 include/linux/fs.h |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/attr.c b/fs/attr.c
index 538e279..f15e9e3 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -26,7 +26,7 @@
  * Should be called as the first thing in ->setattr implementations,
  * possibly after taking additional locks.
  */
-int inode_change_ok(const struct inode *inode, struct iattr *attr)
+int inode_change_ok(struct inode *inode, struct iattr *attr)
 {
 	unsigned int ia_valid = attr->ia_valid;
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ccece40..724a4f4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2566,7 +2566,7 @@ extern int buffer_migrate_page(struct address_space *,
 #define buffer_migrate_page NULL
 #endif
 
-extern int inode_change_ok(const struct inode *, struct iattr *);
+extern int inode_change_ok(struct inode *, struct iattr *);
 extern int inode_newsize_ok(const struct inode *, loff_t offset);
 extern void setattr_copy(struct inode *inode, const struct iattr *attr);
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 11/26] vfs: Add permission flags for setting file attributes
@ 2011-10-18 15:32   ` Aneesh Kumar K.V
  0 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

Some permission models can allow processes to take ownership of a file,
change the file permissions, and set the file timestamps.  Introduce new
permission mask flags and check for those permissions in
inode_change_ok().

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/attr.c          |   70 +++++++++++++++++++++++++++++++++++++++++++--------
 fs/namei.c         |    2 +-
 include/linux/fs.h |    4 +++
 3 files changed, 64 insertions(+), 12 deletions(-)

diff --git a/fs/attr.c b/fs/attr.c
index f15e9e3..00578b9 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -14,6 +14,55 @@
 #include <linux/fcntl.h>
 #include <linux/security.h>
 
+static int richacl_change_ok(struct inode *inode, int mask)
+{
+	if (!IS_RICHACL(inode))
+		return -EPERM;
+
+	if (inode->i_op->permission)
+		return inode->i_op->permission(inode, mask);
+
+	return check_acl(inode, mask);
+}
+
+static bool inode_uid_change_ok(struct inode *inode, uid_t ia_uid)
+{
+	if (current_fsuid() == inode->i_uid && ia_uid == inode->i_uid)
+		return true;
+	if (current_fsuid() == ia_uid &&
+	    richacl_change_ok(inode, MAY_TAKE_OWNERSHIP) == 0)
+		return true;
+	if (capable(CAP_CHOWN))
+		return true;
+	return false;
+}
+
+static bool inode_gid_change_ok(struct inode *inode, gid_t ia_gid)
+{
+	int in_group = in_group_p(ia_gid);
+	if (current_fsuid() == inode->i_uid &&
+	    (in_group || ia_gid == inode->i_gid))
+		return true;
+	if (in_group && richacl_change_ok(inode, MAY_TAKE_OWNERSHIP) == 0)
+		return true;
+	if (capable(CAP_CHOWN))
+		return true;
+	return false;
+}
+
+static bool inode_owner_permitted_or_capable(struct inode *inode, int mask)
+{
+	struct user_namespace *ns = inode_userns(inode);
+
+	if (current_user_ns() == ns && current_fsuid() == inode->i_uid)
+		return true;
+	if (richacl_change_ok(inode, mask) == 0)
+		return true;
+	if (ns_capable(ns, CAP_FOWNER))
+		return true;
+	return false;
+}
+
 /**
  * inode_change_ok - check if attribute changes to an inode are allowed
  * @inode:	inode to check
@@ -45,21 +94,20 @@ int inode_change_ok(struct inode *inode, struct iattr *attr)
 		return 0;
 
 	/* Make sure a caller can chown. */
-	if ((ia_valid & ATTR_UID) &&
-	    (current_fsuid() != inode->i_uid ||
-	     attr->ia_uid != inode->i_uid) && !capable(CAP_CHOWN))
-		return -EPERM;
+	if (ia_valid & ATTR_UID) {
+		if (!inode_uid_change_ok(inode, attr->ia_uid))
+			return -EPERM;
+	}
 
 	/* Make sure caller can chgrp. */
-	if ((ia_valid & ATTR_GID) &&
-	    (current_fsuid() != inode->i_uid ||
-	    (!in_group_p(attr->ia_gid) && attr->ia_gid != inode->i_gid)) &&
-	    !capable(CAP_CHOWN))
-		return -EPERM;
+	if (ia_valid & ATTR_GID) {
+		if (!inode_gid_change_ok(inode, attr->ia_gid))
+			return -EPERM;
+	}
 
 	/* Make sure a caller can chmod. */
 	if (ia_valid & ATTR_MODE) {
-		if (!inode_owner_or_capable(inode))
+		if (!inode_owner_permitted_or_capable(inode, MAY_CHMOD))
 			return -EPERM;
 		/* Also check the setgid bit! */
 		if (!in_group_p((ia_valid & ATTR_GID) ? attr->ia_gid :
@@ -69,7 +117,7 @@ int inode_change_ok(struct inode *inode, struct iattr *attr)
 
 	/* Check for setting the inode time. */
 	if (ia_valid & (ATTR_MTIME_SET | ATTR_ATIME_SET | ATTR_TIMES_SET)) {
-		if (!inode_owner_or_capable(inode))
+		if (!inode_owner_permitted_or_capable(inode, MAY_SET_TIMES))
 			return -EPERM;
 	}
 
diff --git a/fs/namei.c b/fs/namei.c
index 7bf42e8..eb8f918 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -174,7 +174,7 @@ void putname(const char *name)
 EXPORT_SYMBOL(putname);
 #endif
 
-static int check_acl(struct inode *inode, int mask)
+int check_acl(struct inode *inode, int mask)
 {
 #ifdef CONFIG_FS_POSIX_ACL
 	struct posix_acl *acl;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 724a4f4..ac1d8e5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -71,6 +71,9 @@ struct inodes_stat_t {
 #define MAY_CREATE_DIR		0x00000200
 #define MAY_DELETE_CHILD	0x00000400
 #define MAY_DELETE_SELF		0x00000800
+#define MAY_TAKE_OWNERSHIP	0x00001000
+#define MAY_CHMOD		0x00002000
+#define MAY_SET_TIMES		0x00004000
 
 /*
  * flags in file.f_mode.  Note that FMODE_READ and FMODE_WRITE must correspond
@@ -2232,6 +2235,7 @@ extern sector_t bmap(struct inode *, sector_t);
 extern int notify_change(struct dentry *, struct iattr *);
 extern int inode_permission(struct inode *, int);
 extern int generic_permission(struct inode *, int);
+extern int check_acl(struct inode *, int);
 
 static inline bool execute_ok(struct inode *inode)
 {
-- 
1.7.5.4
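
For readers following the logic outside the kernel tree, here is a minimal
userspace sketch of the decision order that inode_uid_change_ok() above
implements.  The struct names, field names, and uid_change_ok() are
hypothetical stand-ins and not kernel API: the boolean fields model the
results of richacl_change_ok(inode, MAY_TAKE_OWNERSHIP) == 0 and
capable(CAP_CHOWN).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of this is kernel API. */
struct caller {
	unsigned int fsuid;
	bool has_cap_chown;		/* models capable(CAP_CHOWN) */
};

struct file_state {
	unsigned int uid;
	/* models richacl_change_ok(inode, MAY_TAKE_OWNERSHIP) == 0 */
	bool acl_grants_take_ownership;
};

static bool uid_change_ok(const struct caller *c,
			  const struct file_state *f,
			  unsigned int new_uid)
{
	/* The owner "changing" the uid to its current value is a no-op. */
	if (c->fsuid == f->uid && new_uid == f->uid)
		return true;
	/* A process may take ownership for itself if the acl allows it. */
	if (c->fsuid == new_uid && f->acl_grants_take_ownership)
		return true;
	/* Otherwise only the capability helps. */
	return c->has_cap_chown;
}
```

Note that the acl can only let a process make itself the owner; giving a
file away to a third uid still needs CAP_CHOWN in this model.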


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 12/26] vfs: Make acl_permission_check() work for richacls
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (10 preceding siblings ...)
  2011-10-18 15:32   ` Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32 ` [PATCH -V7 13/26] richacl: In-memory representation and helper functions Aneesh Kumar K.V
                   ` (15 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/namei.c |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index eb8f918..0c28f95 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -230,6 +230,20 @@ static int acl_permission_check(struct inode *inode, int mask)
 	if (current_user_ns() != inode_userns(inode))
 		goto other_perms;
 
+	if (IS_RICHACL(inode)) {
+		int error = check_acl(inode, mask);
+		if (error != -EAGAIN)
+			return error;
+		if (mask & (MAY_DELETE_SELF | MAY_TAKE_OWNERSHIP |
+			    MAY_CHMOD | MAY_SET_TIMES)) {
+			/*
+			 * The file permission bit cannot grant these
+			 * permissions.
+			 */
+			return -EACCES;
+		}
+	}
+
 	if (likely(current_fsuid() == inode->i_uid))
 		mode >>= 6;
 	else {
-- 
1.7.5.4
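
Since this patch carries no description, a small userspace sketch of the
control flow may help.  It assumes that check_acl() returns -EAGAIN when
the acl gives no answer and that the unchanged tail of
acl_permission_check() compares the request against the rwx bits of the
caller's file class.  richacl_permission_sketch() and its parameters are
hypothetical names used only for illustration.

```c
#include <errno.h>

#define MAY_EXEC		0x0001
#define MAY_WRITE		0x0002
#define MAY_READ		0x0004
#define MAY_DELETE_SELF		0x0800
#define MAY_TAKE_OWNERSHIP	0x1000
#define MAY_CHMOD		0x2000
#define MAY_SET_TIMES		0x4000

/*
 * acl_result models the return value of check_acl(): 0, -EACCES, or
 * -EAGAIN when the decision falls through to the mode bits.
 * class_bits models the rwx bits selected for the caller's file class.
 */
static int richacl_permission_sketch(int acl_result, int class_bits,
				     int mask)
{
	/* A definite answer from the acl is final. */
	if (acl_result != -EAGAIN)
		return acl_result;

	/* The file permission bits cannot grant these permissions. */
	if (mask & (MAY_DELETE_SELF | MAY_TAKE_OWNERSHIP |
		    MAY_CHMOD | MAY_SET_TIMES))
		return -EACCES;

	/* Fall back to the classic rwx comparison. */
	if ((mask & ~class_bits & (MAY_READ | MAY_WRITE | MAY_EXEC)) == 0)
		return 0;
	return -EACCES;
}
```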


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 13/26] richacl: In-memory representation and helper functions
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (11 preceding siblings ...)
  2011-10-18 15:32 ` [PATCH -V7 12/26] vfs: Make acl_permission_check() work for richacls Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32 ` [PATCH -V7 14/26] richacl: Permission mapping functions Aneesh Kumar K.V
                   ` (14 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

A richacl consists of an NFSv4 acl and an owner, group, and other mask.
These three masks correspond to the owner, group, and other file
permission bits, but they contain NFSv4 permissions instead of POSIX
permissions.

Each entry in the NFSv4 acl applies to the file owner (OWNER@), the
owning group (GROUP@), literally everyone (EVERYONE@), or to a specific
uid or gid.

As in the standard POSIX file permission model, each process belongs to
the owner, group, or other file class.  A richacl grants a requested
access only if the NFSv4 acl in the richacl grants the access (according
to the NFSv4 permission check algorithm), and the file mask that applies
to the process includes the requested permissions.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/Makefile             |    2 +
 fs/richacl_base.c       |  109 +++++++++++++++++++++
 include/linux/richacl.h |  245 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 356 insertions(+), 0 deletions(-)
 create mode 100644 fs/richacl_base.c
 create mode 100644 include/linux/richacl.h

diff --git a/fs/Makefile b/fs/Makefile
index afc1096..7612168 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -48,6 +48,8 @@ obj-$(CONFIG_NFS_COMMON)	+= nfs_common/
 obj-$(CONFIG_GENERIC_ACL)	+= generic_acl.o
 
 obj-$(CONFIG_FHANDLE)		+= fhandle.o
+obj-$(CONFIG_FS_RICHACL)	+= richacl.o
+richacl-y			:= richacl_base.o
 
 obj-y				+= quota/
 
diff --git a/fs/richacl_base.c b/fs/richacl_base.c
new file mode 100644
index 0000000..3536626
--- /dev/null
+++ b/fs/richacl_base.c
@@ -0,0 +1,109 @@
+/*
+ * Copyright (C) 2006, 2010  Novell, Inc.
+ * Written by Andreas Gruenbacher <agruen@kernel.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2, or (at your option) any
+ * later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+#include <linux/sched.h>
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/richacl.h>
+
+MODULE_LICENSE("GPL");
+
+/*
+ * Special e_who identifiers:  ACEs which have ACE4_SPECIAL_WHO set in
+ * ace->e_flags use these constants in ace->u.e_who.
+ *
+ * For efficiency, we compare pointers instead of comparing strings.
+ */
+const char richace_owner_who[]	  = "OWNER@";
+EXPORT_SYMBOL_GPL(richace_owner_who);
+const char richace_group_who[]	  = "GROUP@";
+EXPORT_SYMBOL_GPL(richace_group_who);
+const char richace_everyone_who[] = "EVERYONE@";
+EXPORT_SYMBOL_GPL(richace_everyone_who);
+
+/**
+ * richacl_alloc  -  allocate a richacl
+ * @count:	number of entries
+ */
+struct richacl *
+richacl_alloc(int count)
+{
+	size_t size = sizeof(struct richacl) + count * sizeof(struct richace);
+	struct richacl *acl = kzalloc(size, GFP_KERNEL);
+
+	if (acl) {
+		atomic_set(&acl->a_refcount, 1);
+		acl->a_count = count;
+	}
+	return acl;
+}
+EXPORT_SYMBOL_GPL(richacl_alloc);
+
+/**
+ * richacl_clone  -  create a copy of a richacl
+ */
+static struct richacl *
+richacl_clone(const struct richacl *acl)
+{
+	int count = acl->a_count;
+	size_t size = sizeof(struct richacl) + count * sizeof(struct richace);
+	struct richacl *dup = kmalloc(size, GFP_KERNEL);
+
+	if (dup) {
+		memcpy(dup, acl, size);
+		atomic_set(&dup->a_refcount, 1);
+	}
+	return dup;
+}
+
+/**
+ * richace_is_same_identifier  -  are both identifiers the same?
+ */
+int
+richace_is_same_identifier(const struct richace *a, const struct richace *b)
+{
+#define WHO_FLAGS (ACE4_SPECIAL_WHO | ACE4_IDENTIFIER_GROUP)
+	if ((a->e_flags & WHO_FLAGS) != (b->e_flags & WHO_FLAGS))
+		return 0;
+	if (a->e_flags & ACE4_SPECIAL_WHO)
+		return a->u.e_who == b->u.e_who;
+	else
+		return a->u.e_id == b->u.e_id;
+#undef WHO_FLAGS
+}
+
+/**
+ * richace_set_who  -  set a special who value
+ * @ace:	acl entry
+ * @who:	who value to use
+ */
+int
+richace_set_who(struct richace *ace, const char *who)
+{
+	if (!strcmp(who, richace_owner_who))
+		who = richace_owner_who;
+	else if (!strcmp(who, richace_group_who))
+		who = richace_group_who;
+	else if (!strcmp(who, richace_everyone_who))
+		who = richace_everyone_who;
+	else
+		return -EINVAL;
+
+	ace->u.e_who = who;
+	ace->e_flags |= ACE4_SPECIAL_WHO;
+	ace->e_flags &= ~ACE4_IDENTIFIER_GROUP;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(richace_set_who);
diff --git a/include/linux/richacl.h b/include/linux/richacl.h
new file mode 100644
index 0000000..745cfc1
--- /dev/null
+++ b/include/linux/richacl.h
@@ -0,0 +1,245 @@
+/*
+ * Copyright (C) 2006, 2010  Novell, Inc.
+ * Written by Andreas Gruenbacher <agruen@kernel.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2, or (at your option) any
+ * later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+#ifndef __RICHACL_H
+#define __RICHACL_H
+#include <linux/slab.h>
+
+struct richace {
+	unsigned short	e_type;
+	unsigned short	e_flags;
+	unsigned int	e_mask;
+	union {
+		unsigned int	e_id;
+		const char	*e_who;
+	} u;
+};
+
+struct richacl {
+	atomic_t	a_refcount;
+	unsigned int	a_owner_mask;
+	unsigned int	a_group_mask;
+	unsigned int	a_other_mask;
+	unsigned short	a_count;
+	unsigned short	a_flags;
+	struct richace	a_entries[0];
+};
+
+#define richacl_for_each_entry(_ace, _acl) \
+	for (_ace = _acl->a_entries; \
+	     _ace != _acl->a_entries + _acl->a_count; \
+	     _ace++)
+
+#define richacl_for_each_entry_reverse(_ace, _acl) \
+	for (_ace = _acl->a_entries + _acl->a_count - 1; \
+	     _ace != _acl->a_entries - 1; \
+	     _ace--)
+
+/* Flag values defined by rich-acl */
+#define ACL4_MASKED			0x80
+
+#define ACL4_VALID_FLAGS (			\
+		ACL4_MASKED)
+
+/* e_type values */
+#define ACE4_ACCESS_ALLOWED_ACE_TYPE	0x0000
+#define ACE4_ACCESS_DENIED_ACE_TYPE	0x0001
+/*#define ACE4_SYSTEM_AUDIT_ACE_TYPE	0x0002*/
+/*#define ACE4_SYSTEM_ALARM_ACE_TYPE	0x0003*/
+
+/* e_flags bitflags */
+#define ACE4_FILE_INHERIT_ACE		0x0001
+#define ACE4_DIRECTORY_INHERIT_ACE	0x0002
+#define ACE4_NO_PROPAGATE_INHERIT_ACE	0x0004
+#define ACE4_INHERIT_ONLY_ACE		0x0008
+/*#define ACE4_SUCCESSFUL_ACCESS_ACE_FLAG	0x0010*/
+/*#define ACE4_FAILED_ACCESS_ACE_FLAG	0x0020*/
+#define ACE4_IDENTIFIER_GROUP		0x0040
+/* in-memory representation only */
+#define ACE4_SPECIAL_WHO		0x4000
+
+#define ACE4_VALID_FLAGS (			\
+	ACE4_FILE_INHERIT_ACE |			\
+	ACE4_DIRECTORY_INHERIT_ACE |		\
+	ACE4_NO_PROPAGATE_INHERIT_ACE |		\
+	ACE4_INHERIT_ONLY_ACE |			\
+	ACE4_IDENTIFIER_GROUP)
+
+/* e_mask bitflags */
+#define ACE4_READ_DATA			0x00000001
+#define ACE4_LIST_DIRECTORY		0x00000001
+#define ACE4_WRITE_DATA			0x00000002
+#define ACE4_ADD_FILE			0x00000002
+#define ACE4_APPEND_DATA		0x00000004
+#define ACE4_ADD_SUBDIRECTORY		0x00000004
+#define ACE4_READ_NAMED_ATTRS		0x00000008
+#define ACE4_WRITE_NAMED_ATTRS		0x00000010
+#define ACE4_EXECUTE			0x00000020
+#define ACE4_DELETE_CHILD		0x00000040
+#define ACE4_READ_ATTRIBUTES		0x00000080
+#define ACE4_WRITE_ATTRIBUTES		0x00000100
+#define ACE4_WRITE_RETENTION		0x00000200
+#define ACE4_WRITE_RETENTION_HOLD	0x00000400
+#define ACE4_DELETE			0x00010000
+#define ACE4_READ_ACL			0x00020000
+#define ACE4_WRITE_ACL			0x00040000
+#define ACE4_WRITE_OWNER		0x00080000
+#define ACE4_SYNCHRONIZE		0x00100000
+
+/* Valid ACE4_* flags for directories and non-directories */
+#define ACE4_VALID_MASK (				\
+	ACE4_READ_DATA | ACE4_LIST_DIRECTORY |		\
+	ACE4_WRITE_DATA | ACE4_ADD_FILE |		\
+	ACE4_APPEND_DATA | ACE4_ADD_SUBDIRECTORY |	\
+	ACE4_READ_NAMED_ATTRS |				\
+	ACE4_WRITE_NAMED_ATTRS |			\
+	ACE4_EXECUTE |					\
+	ACE4_DELETE_CHILD |				\
+	ACE4_READ_ATTRIBUTES |				\
+	ACE4_WRITE_ATTRIBUTES |				\
+	ACE4_WRITE_RETENTION |				\
+	ACE4_WRITE_RETENTION_HOLD |			\
+	ACE4_DELETE |					\
+	ACE4_READ_ACL |					\
+	ACE4_WRITE_ACL |				\
+	ACE4_WRITE_OWNER |				\
+	ACE4_SYNCHRONIZE)
+
+/**
+ * richacl_get  -  grab another reference to a richacl handle
+ */
+static inline struct richacl *
+richacl_get(struct richacl *acl)
+{
+	if (acl)
+		atomic_inc(&acl->a_refcount);
+	return acl;
+}
+
+/**
+ * richacl_put  -  free a richacl handle
+ */
+static inline void
+richacl_put(struct richacl *acl)
+{
+	if (acl && atomic_dec_and_test(&acl->a_refcount))
+		kfree(acl);
+}
+
+/*
+ * Special e_who identifiers: we use these pointer values in comparisons
+ * instead of doing a strcmp.
+ */
+extern const char richace_owner_who[];
+extern const char richace_group_who[];
+extern const char richace_everyone_who[];
+
+/**
+ * richace_is_owner  -  check if @ace is an OWNER@ entry
+ */
+static inline int
+richace_is_owner(const struct richace *ace)
+{
+	return (ace->e_flags & ACE4_SPECIAL_WHO) &&
+	       ace->u.e_who == richace_owner_who;
+}
+
+/**
+ * richace_is_group  -  check if @ace is a GROUP@ entry
+ */
+static inline int
+richace_is_group(const struct richace *ace)
+{
+	return (ace->e_flags & ACE4_SPECIAL_WHO) &&
+	       ace->u.e_who == richace_group_who;
+}
+
+/**
+ * richace_is_everyone  -  check if @ace is an EVERYONE@ entry
+ */
+static inline int
+richace_is_everyone(const struct richace *ace)
+{
+	return (ace->e_flags & ACE4_SPECIAL_WHO) &&
+	       ace->u.e_who == richace_everyone_who;
+}
+
+/**
+ * richace_is_unix_id  -  check if @ace applies to a specific uid or gid
+ */
+static inline int
+richace_is_unix_id(const struct richace *ace)
+{
+	return !(ace->e_flags & ACE4_SPECIAL_WHO);
+}
+
+/**
+ * richace_is_inherit_only  -  check if @ace is for inheritance only
+ *
+ * ACEs with the %ACE4_INHERIT_ONLY_ACE flag set have no effect during
+ * permission checking.
+ */
+static inline int
+richace_is_inherit_only(const struct richace *ace)
+{
+	return ace->e_flags & ACE4_INHERIT_ONLY_ACE;
+}
+
+/**
+ * richace_is_inheritable  -  check if @ace is inheritable
+ */
+static inline int
+richace_is_inheritable(const struct richace *ace)
+{
+	return ace->e_flags & (ACE4_FILE_INHERIT_ACE |
+			       ACE4_DIRECTORY_INHERIT_ACE);
+}
+
+/**
+ * richace_clear_inheritance_flags  - clear all inheritance flags in @ace
+ */
+static inline void
+richace_clear_inheritance_flags(struct richace *ace)
+{
+	ace->e_flags &= ~(ACE4_FILE_INHERIT_ACE |
+			  ACE4_DIRECTORY_INHERIT_ACE |
+			  ACE4_NO_PROPAGATE_INHERIT_ACE |
+			  ACE4_INHERIT_ONLY_ACE);
+}
+
+/**
+ * richace_is_allow  -  check if @ace is an %ALLOW type entry
+ */
+static inline int
+richace_is_allow(const struct richace *ace)
+{
+	return ace->e_type == ACE4_ACCESS_ALLOWED_ACE_TYPE;
+}
+
+/**
+ * richace_is_deny  -  check if @ace is a %DENY type entry
+ */
+static inline int
+richace_is_deny(const struct richace *ace)
+{
+	return ace->e_type == ACE4_ACCESS_DENIED_ACE_TYPE;
+}
+
+extern struct richacl *richacl_alloc(int);
+extern int richace_is_same_identifier(const struct richace *,
+				      const struct richace *);
+extern int richace_set_who(struct richace *, const char *);
+
+#endif /* __RICHACL_H */
-- 
1.7.5.4
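
The pointer-comparison trick used for the special identifiers can be tried
in userspace with the sketch below.  It mirrors richace_set_who() and
richace_is_owner() from the patch, but struct ace, ace_set_who(),
ace_is_owner(), and the SPECIAL_WHO name are simplified, hypothetical
stand-ins for the kernel definitions.

```c
#include <errno.h>
#include <string.h>

#define SPECIAL_WHO	0x4000	/* same value as ACE4_SPECIAL_WHO above */

static const char owner_who[]	 = "OWNER@";
static const char group_who[]	 = "GROUP@";
static const char everyone_who[] = "EVERYONE@";

/* Simplified stand-in for struct richace. */
struct ace {
	unsigned short flags;
	union {
		unsigned int id;
		const char *who;
	} u;
};

/* Canonicalize @who to one of the three constant arrays above. */
static int ace_set_who(struct ace *ace, const char *who)
{
	if (!strcmp(who, owner_who))
		who = owner_who;
	else if (!strcmp(who, group_who))
		who = group_who;
	else if (!strcmp(who, everyone_who))
		who = everyone_who;
	else
		return -EINVAL;

	ace->u.who = who;
	ace->flags |= SPECIAL_WHO;
	return 0;
}

/* After canonicalization, a pointer comparison is sufficient. */
static int ace_is_owner(const struct ace *ace)
{
	return (ace->flags & SPECIAL_WHO) && ace->u.who == owner_who;
}
```

The string comparison happens once, when the entry is set; every later
check is a flag test plus a pointer comparison.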


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 14/26] richacl: Permission mapping functions
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (12 preceding siblings ...)
  2011-10-18 15:32 ` [PATCH -V7 13/26] richacl: In-memory representation and helper functions Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32 ` [PATCH -V7 15/26] richacl: Compute maximum file masks from an acl Aneesh Kumar K.V
                   ` (13 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

We need to map from POSIX permissions to NFSv4 permissions when a
chmod() is done, from NFSv4 permissions to POSIX permissions when an acl
is set (which implicitly sets the file permission bits), and from the
MAY_READ/MAY_WRITE/MAY_EXEC/MAY_APPEND flags to NFSv4 permissions when
doing an access check in a richacl.
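
As a rough illustration of the first two mappings, the userspace sketch
below converts between one rwx triplet and an NFSv4-style mask.  The
ACE4_* values are those from the richacl.h header in this series; the
function names, the POSIX_MODE_* macro names, and the literal 4/2/1 mode
bits are assumptions made for the example, and the "always allowed"
permissions that richacl_mode_to_mask() adds are left out.

```c
#define ACE4_READ_DATA		0x00000001
#define ACE4_WRITE_DATA		0x00000002
#define ACE4_APPEND_DATA	0x00000004
#define ACE4_EXECUTE		0x00000020
#define ACE4_DELETE_CHILD	0x00000040

/* ADD_FILE and ADD_SUBDIRECTORY share bits with WRITE_DATA/APPEND_DATA. */
#define POSIX_MODE_READ		ACE4_READ_DATA
#define POSIX_MODE_WRITE	(ACE4_WRITE_DATA | ACE4_APPEND_DATA | \
				 ACE4_DELETE_CHILD)
#define POSIX_MODE_EXEC		ACE4_EXECUTE

/* Any bit of a class is enough to set the matching rwx bit. */
static int mask_to_mode(unsigned int mask)
{
	int mode = 0;

	if (mask & POSIX_MODE_READ)
		mode |= 4;
	if (mask & POSIX_MODE_WRITE)
		mode |= 2;
	if (mask & POSIX_MODE_EXEC)
		mode |= 1;
	return mode;
}

/* An rwx bit expands to the full set of NFSv4 bits of its class. */
static unsigned int mode_to_mask(int mode)
{
	unsigned int mask = 0;

	if (mode & 4)
		mask |= POSIX_MODE_READ;
	if (mode & 2)
		mask |= POSIX_MODE_WRITE;
	if (mode & 1)
		mask |= POSIX_MODE_EXEC;
	return mask;
}
```

The mapping is lossy in one direction: a mask holding only
ACE4_APPEND_DATA still yields the write bit, while converting that write
bit back yields the whole write class.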

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/richacl_base.c       |  117 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/richacl.h |   46 ++++++++++++++++++
 2 files changed, 163 insertions(+), 0 deletions(-)

diff --git a/fs/richacl_base.c b/fs/richacl_base.c
index 3536626..b6460a9 100644
--- a/fs/richacl_base.c
+++ b/fs/richacl_base.c
@@ -69,6 +69,123 @@ richacl_clone(const struct richacl *acl)
 }
 
 /**
+ * richacl_mask_to_mode  -  compute the file permission bits which correspond to @mask
+ * @mask:	%ACE4_* permission mask
+ *
+ * See richacl_masks_to_mode().
+ */
+static int
+richacl_mask_to_mode(unsigned int mask)
+{
+	int mode = 0;
+
+	if (mask & ACE4_POSIX_MODE_READ)
+		mode |= MAY_READ;
+	if (mask & ACE4_POSIX_MODE_WRITE)
+		mode |= MAY_WRITE;
+	if (mask & ACE4_POSIX_MODE_EXEC)
+		mode |= MAY_EXEC;
+
+	return mode;
+}
+
+/**
+ * richacl_masks_to_mode  -  compute the file permission bits from the file masks
+ *
+ * When setting a richacl, we set the file permission bits to indicate maximum
+ * permissions: for example, we set the Write permission when a mask contains
+ * ACE4_APPEND_DATA even if it does not also contain ACE4_WRITE_DATA.
+ *
+ * Permissions which are not in ACE4_POSIX_MODE_READ, ACE4_POSIX_MODE_WRITE, or
+ * ACE4_POSIX_MODE_EXEC cannot be represented in the file permission bits.
+ * Such permissions can still be effective, but not for new files or after a
+ * chmod(), and only if they were set explicitly, for example, by setting a
+ * richacl.
+ */
+int
+richacl_masks_to_mode(const struct richacl *acl)
+{
+	return richacl_mask_to_mode(acl->a_owner_mask) << 6 |
+	       richacl_mask_to_mode(acl->a_group_mask) << 3 |
+	       richacl_mask_to_mode(acl->a_other_mask);
+}
+EXPORT_SYMBOL_GPL(richacl_masks_to_mode);
+
+/**
+ * richacl_mode_to_mask  - compute a file mask from the lowest three mode bits
+ *
+ * When the file permission bits of a file are set with chmod(), this specifies
+ * the maximum permissions that processes will get.  All permissions beyond
+ * that will be removed from the file masks, and become ineffective.
+ *
+ * We also add in the permissions which are always allowed no matter what the
+ * acl says.
+ */
+unsigned int
+richacl_mode_to_mask(mode_t mode)
+{
+	unsigned int mask = ACE4_POSIX_ALWAYS_ALLOWED;
+
+	if (mode & MAY_READ)
+		mask |= ACE4_POSIX_MODE_READ;
+	if (mode & MAY_WRITE)
+		mask |= ACE4_POSIX_MODE_WRITE;
+	if (mode & MAY_EXEC)
+		mask |= ACE4_POSIX_MODE_EXEC;
+
+	return mask;
+}
+
+/**
+ * richacl_want_to_mask  - convert the iop->permission want argument to a mask
+ * @want:	@want argument of the permission inode operation
+ *
+ * When checking for append, @want is (MAY_WRITE | MAY_APPEND).
+ *
+ * Richacls use the iop->may_create and iop->may_delete hooks which are
+ * used for checking if creating and deleting files is allowed.  These hooks do
+ * not use richacl_want_to_mask(), so we do not have to deal with mapping
+ * MAY_WRITE to ACE4_ADD_FILE, ACE4_ADD_SUBDIRECTORY, and ACE4_DELETE_CHILD
+ * here.
+ */
+unsigned int
+richacl_want_to_mask(unsigned int want)
+{
+	unsigned int mask = 0;
+
+	if (want & MAY_READ)
+		mask |= ACE4_READ_DATA;
+	if (want & MAY_DELETE_SELF)
+		mask |= ACE4_DELETE;
+	if (want & MAY_TAKE_OWNERSHIP)
+		mask |= ACE4_WRITE_OWNER;
+	if (want & MAY_CHMOD)
+		mask |= ACE4_WRITE_ACL;
+	if (want & MAY_SET_TIMES)
+		mask |= ACE4_WRITE_ATTRIBUTES;
+	if (want & MAY_EXEC)
+		mask |= ACE4_EXECUTE;
+	/*
+	 * Differentiate MAY_WRITE from these requests.
+	 */
+	if (want & (MAY_APPEND |
+		    MAY_CREATE_FILE | MAY_CREATE_DIR |
+		    MAY_DELETE_CHILD)) {
+		if (want & MAY_APPEND)
+			mask |= ACE4_APPEND_DATA;
+		if (want & MAY_CREATE_FILE)
+			mask |= ACE4_ADD_FILE;
+		if (want & MAY_CREATE_DIR)
+			mask |= ACE4_ADD_SUBDIRECTORY;
+		if (want & MAY_DELETE_CHILD)
+			mask |= ACE4_DELETE_CHILD;
+	} else if (want & MAY_WRITE)
+		mask |= ACE4_WRITE_DATA;
+	return mask;
+}
+EXPORT_SYMBOL_GPL(richacl_want_to_mask);
+
+/**
  * richace_is_same_identifier  -  are both identifiers the same?
  */
 int
diff --git a/include/linux/richacl.h b/include/linux/richacl.h
index 745cfc1..61f1b8a 100644
--- a/include/linux/richacl.h
+++ b/include/linux/richacl.h
@@ -117,6 +117,49 @@ struct richacl {
 	ACE4_WRITE_OWNER |				\
 	ACE4_SYNCHRONIZE)
 
+/*
+ * The POSIX permissions are supersets of the following NFSv4 permissions:
+ *
+ *  - MAY_READ maps to READ_DATA or LIST_DIRECTORY, depending on the type
+ *    of the file system object.
+ *
+ *  - MAY_WRITE maps to WRITE_DATA or ACE4_APPEND_DATA for files, and to
+ *    ADD_FILE, ACE4_ADD_SUBDIRECTORY, or ACE4_DELETE_CHILD for directories.
+ *
+ *  - MAY_EXEC maps to ACE4_EXECUTE.
+ *
+ *  (Some of these NFSv4 permissions have the same bit values.)
+ */
+#define ACE4_POSIX_MODE_READ (			\
+		ACE4_READ_DATA |		\
+		ACE4_LIST_DIRECTORY)
+#define ACE4_POSIX_MODE_WRITE (			\
+		ACE4_WRITE_DATA |		\
+		ACE4_ADD_FILE |			\
+		ACE4_APPEND_DATA |		\
+		ACE4_ADD_SUBDIRECTORY |		\
+		ACE4_DELETE_CHILD)
+#define ACE4_POSIX_MODE_EXEC ACE4_EXECUTE
+#define ACE4_POSIX_MODE_ALL (			\
+		ACE4_POSIX_MODE_READ |		\
+		ACE4_POSIX_MODE_WRITE |		\
+		ACE4_POSIX_MODE_EXEC)
+/*
+ * These permissions are always allowed
+ * no matter what the acl says.
+ */
+#define ACE4_POSIX_ALWAYS_ALLOWED (	\
+		ACE4_SYNCHRONIZE |	\
+		ACE4_READ_ATTRIBUTES |	\
+		ACE4_READ_ACL)
+/*
+ * The owner is implicitly granted
+ * these permissions under POSIX.
+ */
+#define ACE4_POSIX_OWNER_ALLOWED (		\
+		ACE4_WRITE_ATTRIBUTES |		\
+		ACE4_WRITE_OWNER |		\
+		ACE4_WRITE_ACL)
 /**
  * richacl_get  -  grab another reference to a richacl handle
  */
@@ -241,5 +284,8 @@ extern struct richacl *richacl_alloc(int);
 extern int richace_is_same_identifier(const struct richace *,
 				      const struct richace *);
 extern int richace_set_who(struct richace *, const char *);
+extern int richacl_masks_to_mode(const struct richacl *);
+extern unsigned int richacl_mode_to_mask(mode_t);
+extern unsigned int richacl_want_to_mask(unsigned int);
 
 #endif /* __RICHACL_H */
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread
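
Editorial note, not part of the patch: the mode/mask mapping in 14/26 can be
tried out in user space with a minimal sketch like the one below.  The bit
values are local copies of the NFSv4 (RFC 3530) ACE4_* values and are assumed
to match include/linux/richacl.h.  All sketch_* and SKETCH_* names are
hypothetical, and sketch_mask_to_mode() is an assumption about what the
kernel's richacl_mask_to_mode() does, based on the comment about
ACE4_APPEND_DATA above.

```c
/* Local copies of the NFSv4 bit values (assumed to match richacl.h). */
#define ACE4_READ_DATA		0x00000001u	/* same bit as ACE4_LIST_DIRECTORY */
#define ACE4_WRITE_DATA		0x00000002u	/* same bit as ACE4_ADD_FILE */
#define ACE4_APPEND_DATA	0x00000004u	/* same bit as ACE4_ADD_SUBDIRECTORY */
#define ACE4_EXECUTE		0x00000020u
#define ACE4_DELETE_CHILD	0x00000040u
#define ACE4_READ_ATTRIBUTES	0x00000080u
#define ACE4_READ_ACL		0x00020000u
#define ACE4_SYNCHRONIZE	0x00100000u

#define SKETCH_MODE_READ	ACE4_READ_DATA
#define SKETCH_MODE_WRITE	(ACE4_WRITE_DATA | ACE4_APPEND_DATA | \
				 ACE4_DELETE_CHILD)
#define SKETCH_MODE_EXEC	ACE4_EXECUTE
#define SKETCH_ALWAYS_ALLOWED	(ACE4_SYNCHRONIZE | ACE4_READ_ATTRIBUTES | \
				 ACE4_READ_ACL)

/* Map one rwx triplet (r = 4, w = 2, x = 1) to a file mask. */
static unsigned int sketch_mode_to_mask(unsigned int rwx)
{
	unsigned int mask = SKETCH_ALWAYS_ALLOWED;

	if (rwx & 4)
		mask |= SKETCH_MODE_READ;
	if (rwx & 2)
		mask |= SKETCH_MODE_WRITE;
	if (rwx & 1)
		mask |= SKETCH_MODE_EXEC;
	return mask;
}

/* Map a file mask back to an rwx triplet; any write-class bit yields w. */
static unsigned int sketch_mask_to_mode(unsigned int mask)
{
	unsigned int rwx = 0;

	if (mask & SKETCH_MODE_READ)
		rwx |= 4;
	if (mask & SKETCH_MODE_WRITE)
		rwx |= 2;
	if (mask & SKETCH_MODE_EXEC)
		rwx |= 1;
	return rwx;
}

/* Combine the owner, group, and other masks into nine permission bits. */
static unsigned int sketch_masks_to_mode(unsigned int owner,
					 unsigned int group,
					 unsigned int other)
{
	return sketch_mask_to_mode(owner) << 6 |
	       sketch_mask_to_mode(group) << 3 |
	       sketch_mask_to_mode(other);
}
```

The mapping is lossy in one direction only: a mask holding just
ACE4_APPEND_DATA still shows up as "w" in the mode, while bits such as
ACE4_READ_ACL have no representation in the mode at all.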

* [PATCH -V7 15/26] richacl: Compute maximum file masks from an acl
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (13 preceding siblings ...)
  2011-10-18 15:32 ` [PATCH -V7 14/26] richacl: Permission mapping functions Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32   ` Aneesh Kumar K.V
                   ` (12 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

Compute upper bound owner, group, and other file masks with as few
permissions as possible without denying any permissions that the NFSv4
acl in a richacl grants.

This algorithm is used when a file inherits an acl at create time and
when an acl is set via a mechanism that does not specify file modes
(such as via nfsd).  When user-space sets an acl, the file masks are
passed in as part of the xattr.

When setting a richacl, the file masks determine what the file
permission bits will be set to; see richacl_masks_to_mode().

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/richacl_base.c       |  128 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/richacl.h |    1 +
 2 files changed, 129 insertions(+), 0 deletions(-)
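
Editorial note, not part of the patch: a minimal user-space sketch of the
upper-bound computation, assuming a simplified ace model (no inherit-only
entries, named entries identified by a single id).  The struct and function
names are hypothetical; only the control flow follows the patch below.

```c
#include <stddef.h>

enum sketch_who { WHO_OWNER, WHO_GROUP, WHO_EVERYONE, WHO_NAMED };

struct sketch_ace {
	enum sketch_who who;
	unsigned int id;	/* only meaningful for WHO_NAMED */
	int deny;		/* 0 = allow entry, 1 = deny entry */
	unsigned int mask;
};

struct sketch_masks { unsigned int owner, group, other; };

static int same_who(const struct sketch_ace *a, const struct sketch_ace *b)
{
	return a->who == b->who && (a->who != WHO_NAMED || a->id == b->id);
}

/* Permissions allowed to one identifier, taking everyone@ into account. */
static unsigned int allowed_to_who(const struct sketch_ace *acl, size_t n,
				   const struct sketch_ace *who)
{
	unsigned int allowed = 0;
	size_t i;

	for (i = n; i-- > 0; ) {
		if (!same_who(&acl[i], who) && acl[i].who != WHO_EVERYONE)
			continue;
		if (acl[i].deny)
			allowed &= ~acl[i].mask;
		else
			allowed |= acl[i].mask;
	}
	return allowed;
}

/* Maximum permissions any member of the group class can get. */
static unsigned int group_class_allowed(const struct sketch_ace *acl, size_t n)
{
	unsigned int everyone = 0, group_class = 0;
	int had_group_ace = 0;
	size_t i;

	for (i = n; i-- > 0; ) {
		if (acl[i].who == WHO_OWNER)
			continue;
		if (acl[i].who == WHO_EVERYONE) {
			if (acl[i].deny)
				everyone &= ~acl[i].mask;
			else
				everyone |= acl[i].mask;
		} else {
			group_class |= allowed_to_who(acl, n, &acl[i]);
			if (acl[i].who == WHO_GROUP)
				had_group_ace = 1;
		}
	}
	if (!had_group_ace)
		group_class |= everyone;
	return group_class;
}

/* Walk the acl back to front; restart once if a group class deny shows up. */
static struct sketch_masks compute_max_masks(const struct sketch_ace *acl,
					     size_t n)
{
	unsigned int gmask = ~0u;
	struct sketch_masks m;
	size_t i;

restart:
	m.owner = m.group = m.other = 0;
	for (i = n; i-- > 0; ) {
		unsigned int mask = acl[i].mask;

		if (acl[i].who == WHO_OWNER) {
			if (acl[i].deny)
				m.owner &= ~mask;
			else
				m.owner |= mask;
		} else if (acl[i].who == WHO_EVERYONE) {
			if (acl[i].deny) {
				m.owner &= ~mask;
				m.group &= ~mask;
				m.other &= ~mask;
			} else {
				m.owner |= mask;
				m.group |= mask & gmask;
				m.other |= mask;
			}
		} else if (!acl[i].deny) {
			m.owner |= mask & gmask;
			m.group |= mask & gmask;
		} else if (gmask == ~0u) {
			gmask = group_class_allowed(acl, n);
			if (gmask != ~0u)
				goto restart;
		}
	}
	return m;
}
```

With r = 0x1 and w = 0x2, the acl "group@:w::deny, everyone@:rw::allow" from
the comment in the patch yields owner = rw, group = r, other = rw in this
sketch: the restart with the computed gmask keeps "w" out of the group mask.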

diff --git a/fs/richacl_base.c b/fs/richacl_base.c
index b6460a9..a0197ed 100644
--- a/fs/richacl_base.c
+++ b/fs/richacl_base.c
@@ -224,3 +224,131 @@ richace_set_who(struct richace *ace, const char *who)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(richace_set_who);
+
+/**
+ * richacl_allowed_to_who  -  mask flags allowed to a specific who value
+ *
+ * Computes the mask values allowed to a specific who value, taking
+ * EVERYONE@ entries into account.
+ */
+static unsigned int richacl_allowed_to_who(struct richacl *acl,
+					   struct richace *who)
+{
+	struct richace *ace;
+	unsigned int allowed = 0;
+
+	richacl_for_each_entry_reverse(ace, acl) {
+		if (richace_is_inherit_only(ace))
+			continue;
+		if (richace_is_same_identifier(ace, who) ||
+		    richace_is_everyone(ace)) {
+			if (richace_is_allow(ace))
+				allowed |= ace->e_mask;
+			else if (richace_is_deny(ace))
+				allowed &= ~ace->e_mask;
+		}
+	}
+	return allowed;
+}
+
+/**
+ * richacl_group_class_allowed  -  maximum permissions the group class is allowed
+ *
+ * See richacl_compute_max_masks().
+ */
+static unsigned int richacl_group_class_allowed(struct richacl *acl)
+{
+	struct richace *ace;
+	unsigned int everyone_allowed = 0, group_class_allowed = 0;
+	int had_group_ace = 0;
+
+	richacl_for_each_entry_reverse(ace, acl) {
+		if (richace_is_inherit_only(ace) ||
+		    richace_is_owner(ace))
+			continue;
+
+		if (richace_is_everyone(ace)) {
+			if (richace_is_allow(ace))
+				everyone_allowed |= ace->e_mask;
+			else if (richace_is_deny(ace))
+				everyone_allowed &= ~ace->e_mask;
+		} else {
+			group_class_allowed |=
+				richacl_allowed_to_who(acl, ace);
+
+			if (richace_is_group(ace))
+				had_group_ace = 1;
+		}
+	}
+	if (!had_group_ace)
+		group_class_allowed |= everyone_allowed;
+	return group_class_allowed;
+}
+
+/**
+ * richacl_compute_max_masks  -  compute upper bound masks
+ *
+ * Computes upper bound owner, group, and other masks so that none of
+ * the mask flags allowed by the acl are disabled (for any choice of the
+ * file owner or group membership).
+ */
+void richacl_compute_max_masks(struct richacl *acl)
+{
+	unsigned int gmask = ~0;
+	struct richace *ace;
+
+	/*
+	 * @gmask contains all permissions which the group class is ever
+	 * allowed.  We use it to avoid adding permissions to the group mask
+	 * from everyone@ allow aces which the group class is always denied
+	 * through other aces.  For example, the following acl would otherwise
+	 * result in a group mask of rw:
+	 *
+	 *	group@:w::deny
+	 *	everyone@:rw::allow
+	 *
+	 * Avoid computing @gmask for acls which do not include any group class
+	 * deny aces: in such acls, the group class is never denied any
+	 * permissions from everyone@ allow aces.
+	 */
+
+restart:
+	acl->a_owner_mask = 0;
+	acl->a_group_mask = 0;
+	acl->a_other_mask = 0;
+
+	richacl_for_each_entry_reverse(ace, acl) {
+		if (richace_is_inherit_only(ace))
+			continue;
+
+		if (richace_is_owner(ace)) {
+			if (richace_is_allow(ace))
+				acl->a_owner_mask |= ace->e_mask;
+			else if (richace_is_deny(ace))
+				acl->a_owner_mask &= ~ace->e_mask;
+		} else if (richace_is_everyone(ace)) {
+			if (richace_is_allow(ace)) {
+				acl->a_owner_mask |= ace->e_mask;
+				acl->a_group_mask |= ace->e_mask & gmask;
+				acl->a_other_mask |= ace->e_mask;
+			} else if (richace_is_deny(ace)) {
+				acl->a_owner_mask &= ~ace->e_mask;
+				acl->a_group_mask &= ~ace->e_mask;
+				acl->a_other_mask &= ~ace->e_mask;
+			}
+		} else {
+			if (richace_is_allow(ace)) {
+				acl->a_owner_mask |= ace->e_mask & gmask;
+				acl->a_group_mask |= ace->e_mask & gmask;
+			} else if (richace_is_deny(ace) && gmask == ~0) {
+				gmask = richacl_group_class_allowed(acl);
+				if (likely(gmask != ~0))
+					/* should always be true */
+					goto restart;
+			}
+		}
+	}
+
+	acl->a_flags &= ~ACL4_MASKED;
+}
+EXPORT_SYMBOL_GPL(richacl_compute_max_masks);
diff --git a/include/linux/richacl.h b/include/linux/richacl.h
index 61f1b8a..ded57e9 100644
--- a/include/linux/richacl.h
+++ b/include/linux/richacl.h
@@ -287,5 +287,6 @@ extern int richace_set_who(struct richace *, const char *);
 extern int richacl_masks_to_mode(const struct richacl *);
 extern unsigned int richacl_mode_to_mask(mode_t);
 extern unsigned int richacl_want_to_mask(unsigned int);
+extern void richacl_compute_max_masks(struct richacl *);
 
 #endif /* __RICHACL_H */
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 16/26] richacl: Update the file masks in chmod()
@ 2011-10-18 15:32   ` Aneesh Kumar K.V
  0 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

Doing a chmod() sets the file mode, which includes the file permission
bits.  When a file has a richacl, the permissions that the richacl
grants need to be limited to what the new file permission bits allow.

This is done by setting the file masks in the richacl to what the file
permission bits map to.  The richacl access check algorithm takes the
file masks into account, which ensures that the richacl cannot grant too
many permissions.

It is possible to explicitly add permissions to the file masks which go
beyond what the file permission bits can grant (like the ACE4_WRITE_ACL
permission).  The POSIX.1 standard calls this an alternate file access
control mechanism.  A subsequent chmod() would ensure that those
permissions are disabled again.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/richacl_base.c       |   40 ++++++++++++++++++++++++++++++++++++++++
 include/linux/richacl.h |    1 +
 2 files changed, 41 insertions(+), 0 deletions(-)
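
Editorial note, not part of the patch: a small user-space sketch of which
file masks a chmod() produces.  The bit values are local copies of the NFSv4
(RFC 3530) values and are assumed to match richacl.h; the sketch_* names are
hypothetical, and the reference counting and cloning done by richacl_chmod()
are not modelled.

```c
/* Local copies of the NFSv4 bit values (assumed to match richacl.h). */
#define ACE4_READ_DATA		0x00000001u
#define ACE4_WRITE_DATA		0x00000002u
#define ACE4_APPEND_DATA	0x00000004u
#define ACE4_EXECUTE		0x00000020u
#define ACE4_DELETE_CHILD	0x00000040u
#define ACE4_READ_ATTRIBUTES	0x00000080u
#define ACE4_WRITE_ATTRIBUTES	0x00000100u
#define ACE4_READ_ACL		0x00020000u
#define ACE4_WRITE_ACL		0x00040000u
#define ACE4_WRITE_OWNER	0x00080000u
#define ACE4_SYNCHRONIZE	0x00100000u

#define SKETCH_ALWAYS_ALLOWED	(ACE4_SYNCHRONIZE | ACE4_READ_ATTRIBUTES | \
				 ACE4_READ_ACL)
#define SKETCH_OWNER_ALLOWED	(ACE4_WRITE_ATTRIBUTES | ACE4_WRITE_OWNER | \
				 ACE4_WRITE_ACL)

struct sketch_masks { unsigned int owner, group, other; };

/* Map one rwx triplet (r = 4, w = 2, x = 1) to a file mask. */
static unsigned int sketch_triplet_to_mask(unsigned int rwx)
{
	unsigned int mask = SKETCH_ALWAYS_ALLOWED;

	if (rwx & 4)
		mask |= ACE4_READ_DATA;
	if (rwx & 2)
		mask |= ACE4_WRITE_DATA | ACE4_APPEND_DATA | ACE4_DELETE_CHILD;
	if (rwx & 1)
		mask |= ACE4_EXECUTE;
	return mask;
}

/*
 * File masks after chmod(mode).  Whatever the masks held before is
 * replaced, so permissions added explicitly beyond the mode are dropped.
 */
static struct sketch_masks sketch_masks_after_chmod(unsigned int mode)
{
	struct sketch_masks m;

	m.owner = sketch_triplet_to_mask(mode >> 6) | SKETCH_OWNER_ALLOWED;
	m.group = sketch_triplet_to_mask(mode >> 3);
	m.other = sketch_triplet_to_mask(mode);
	return m;
}
```

For mode 0640 this gives the owner mask ACE4_WRITE_ACL (implicitly allowed to
the owner), while the group mask carries ACE4_READ_DATA but neither
ACE4_WRITE_DATA nor ACE4_WRITE_ACL, even if the latter had been set
explicitly before the chmod().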

diff --git a/fs/richacl_base.c b/fs/richacl_base.c
index a0197ed..a5f215e 100644
--- a/fs/richacl_base.c
+++ b/fs/richacl_base.c
@@ -352,3 +352,43 @@ restart:
 	acl->a_flags &= ~ACL4_MASKED;
 }
 EXPORT_SYMBOL_GPL(richacl_compute_max_masks);
+
+/**
+ * richacl_chmod  -  update the file masks to reflect the new mode
+ * @mode:	new file permission bits
+ *
+ * Return a copy of @acl where the file masks have been replaced by the file
+ * masks corresponding to the file permission bits in @mode, or return @acl
+ * itself if the file masks are already up to date.  Takes over a reference
+ * to @acl.
+ */
+struct richacl *
+richacl_chmod(struct richacl *acl, mode_t mode)
+{
+	unsigned int owner_mask, group_mask, other_mask;
+	struct richacl *clone;
+
+	owner_mask = richacl_mode_to_mask(mode >> 6) |
+		     ACE4_POSIX_OWNER_ALLOWED;
+	group_mask = richacl_mode_to_mask(mode >> 3);
+	other_mask = richacl_mode_to_mask(mode);
+
+	if (acl->a_owner_mask == owner_mask &&
+	    acl->a_group_mask == group_mask &&
+	    acl->a_other_mask == other_mask &&
+	    (acl->a_flags & ACL4_MASKED))
+		return acl;
+
+	clone = richacl_clone(acl);
+	richacl_put(acl);
+	if (!clone)
+		return ERR_PTR(-ENOMEM);
+
+	clone->a_flags |= ACL4_MASKED;
+	clone->a_owner_mask = owner_mask;
+	clone->a_group_mask = group_mask;
+	clone->a_other_mask = other_mask;
+
+	return clone;
+}
+EXPORT_SYMBOL_GPL(richacl_chmod);
diff --git a/include/linux/richacl.h b/include/linux/richacl.h
index ded57e9..be7db1f 100644
--- a/include/linux/richacl.h
+++ b/include/linux/richacl.h
@@ -288,5 +288,6 @@ extern int richacl_masks_to_mode(const struct richacl *);
 extern unsigned int richacl_mode_to_mask(mode_t);
 extern unsigned int richacl_want_to_mask(unsigned int);
 extern void richacl_compute_max_masks(struct richacl *);
+extern struct richacl *richacl_chmod(struct richacl *, mode_t);
 
 #endif /* __RICHACL_H */
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 17/26] richacl: Permission check algorithm
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (15 preceding siblings ...)
  2011-10-18 15:32   ` Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32 ` [PATCH -V7 18/26] richacl: Create-time inheritance Aneesh Kumar K.V
                   ` (10 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

As in the standard POSIX file permission model, each process is in the
owner, group, or other file class.  A process is

  - in the owner file class if it owns the file,
  - in the group file class if it is in the file's owning group or it
    matches any of the user or group entries, and
  - in the other file class otherwise.

Each file class is associated with a file mask.

A richacl grants a requested access if the NFSv4 acl in the richacl
grants the requested permissions (according to the NFSv4 permission
check algorithm) and the file mask that applies to the process includes
the requested permissions.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/richacl_base.c       |   99 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/richacl.h |    2 +
 2 files changed, 101 insertions(+), 0 deletions(-)
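
Editorial note, not part of the patch: a minimal user-space model of the
check described above, assuming the process and file identities are passed in
explicitly instead of coming from current_fsuid(), in_group_p() and the
inode.  All sketch_* types and names are hypothetical, and inherit-only
entries are left out.

```c
#include <stddef.h>

enum sketch_who {
	WHO_OWNER, WHO_GROUP, WHO_EVERYONE, WHO_USER, WHO_NAMED_GROUP
};

struct sketch_ace {
	enum sketch_who who;
	unsigned int id;	/* uid or gid for WHO_USER / WHO_NAMED_GROUP */
	int deny;		/* 0 = allow entry, 1 = deny entry */
	unsigned int mask;
};

struct sketch_acl {
	int masked;		/* models the ACL4_MASKED flag */
	unsigned int owner_mask, group_mask, other_mask;
	const struct sketch_ace *aces;
	size_t count;
};

struct sketch_proc {
	unsigned int uid;
	const unsigned int *groups;
	size_t ngroups;
};

struct sketch_file { unsigned int uid, gid; };

static int sketch_in_group(const struct sketch_proc *p, unsigned int gid)
{
	size_t i;

	for (i = 0; i < p->ngroups; i++)
		if (p->groups[i] == gid)
			return 1;
	return 0;
}

/* Returns 0 if all bits in @mask are granted, -1 otherwise. */
static int sketch_permission(const struct sketch_proc *p,
			     const struct sketch_file *f,
			     const struct sketch_acl *acl, unsigned int mask)
{
	unsigned int requested = mask, denied = 0;
	int in_owning_group = sketch_in_group(p, f->gid);
	int in_owner_or_group_class = in_owning_group || !acl->masked;
	size_t i;

	for (i = 0; i < acl->count; i++) {
		const struct sketch_ace *ace = &acl->aces[i];
		int class_entry = 1;	/* entry places us in owner/group class */

		switch (ace->who) {
		case WHO_OWNER:
			if (p->uid != f->uid)
				continue;
			break;
		case WHO_GROUP:
			if (!in_owning_group)
				continue;
			break;
		case WHO_USER:
			if (p->uid != ace->id)
				continue;
			break;
		case WHO_NAMED_GROUP:
			if (!sketch_in_group(p, ace->id))
				continue;
			break;
		case WHO_EVERYONE:
			class_entry = 0;
			break;
		}
		if (class_entry)
			in_owner_or_group_class = 1;

		/* First matching entry decides each still-undecided bit. */
		if (ace->deny)
			denied |= ace->mask & mask;
		mask &= ~ace->mask;
		if (!mask && in_owner_or_group_class)
			break;
	}
	denied |= mask;		/* bits no entry mentioned are denied */

	if (acl->masked) {
		unsigned int file_mask;

		if (p->uid == f->uid)
			file_mask = acl->owner_mask;
		else if (in_owner_or_group_class)
			file_mask = acl->group_mask;
		else
			file_mask = acl->other_mask;
		denied |= requested & ~file_mask;
	}
	return denied ? -1 : 0;
}
```

For example, with an acl of "user 1000: rw allow, everyone@: r allow" and a
group mask of only "r", a write request by uid 1000 on a file owned by
someone else is granted by the acl but refused by the group file mask.  With
the masked flag cleared, the same request succeeds.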

diff --git a/fs/richacl_base.c b/fs/richacl_base.c
index a5f215e..e9c2f30 100644
--- a/fs/richacl_base.c
+++ b/fs/richacl_base.c
@@ -392,3 +392,102 @@ richacl_chmod(struct richacl *acl, mode_t mode)
 	return clone;
 }
 EXPORT_SYMBOL_GPL(richacl_chmod);
+
+/**
+ * richacl_permission  -  richacl permission check algorithm
+ * @inode:	inode to check
+ * @acl:	rich acl of the inode
+ * @mask:	requested access (ACE4_* bitmask)
+ *
+ * Checks if the current process is granted @mask flags in @acl.
+ */
+int
+richacl_permission(struct inode *inode, const struct richacl *acl,
+		   unsigned int mask)
+{
+	const struct richace *ace;
+	unsigned int requested = mask, denied = 0;
+	int in_owning_group = in_group_p(inode->i_gid);
+	int in_owner_or_group_class = in_owning_group;
+
+	/*
+	 * We don't need to know which class the process is in when the acl is
+	 * not masked.
+	 */
+	if (!(acl->a_flags & ACL4_MASKED))
+		in_owner_or_group_class = 1;
+
+	/*
+	 * A process is
+	 *   - in the owner file class if it owns the file,
+	 *   - in the group file class if it is in the file's owning group or
+	 *     it matches any of the user or group entries, and
+	 *   - in the other file class otherwise.
+	 */
+
+	/*
+	 * Check if the acl grants the requested access and determine which
+	 * file class the process is in.
+	 */
+	richacl_for_each_entry(ace, acl) {
+		unsigned int ace_mask = ace->e_mask;
+
+		if (richace_is_inherit_only(ace))
+			continue;
+		if (richace_is_owner(ace)) {
+			if (current_fsuid() != inode->i_uid)
+				continue;
+			goto is_owner;
+		} else if (richace_is_group(ace)) {
+			if (!in_owning_group)
+				continue;
+		} else if (richace_is_unix_id(ace)) {
+			if (ace->e_flags & ACE4_IDENTIFIER_GROUP) {
+				if (!in_group_p(ace->u.e_id))
+					continue;
+			} else {
+				if (current_fsuid() != ace->u.e_id)
+					continue;
+			}
+		} else
+			goto is_everyone;
+
+is_owner:
+		/* The process is in the owner or group file class. */
+		in_owner_or_group_class = 1;
+
+is_everyone:
+		/* Check which mask flags the ACE allows or denies. */
+		if (richace_is_deny(ace))
+			denied |= ace_mask & mask;
+		mask &= ~ace_mask;
+
+		/*
+		 * Keep going until we know which file class
+		 * the process is in.
+		 */
+		if (!mask && in_owner_or_group_class)
+			break;
+	}
+	denied |= mask;
+
+	if (acl->a_flags & ACL4_MASKED) {
+		unsigned int file_mask;
+
+		/*
+		 * The file class a process is in determines which file mask
+		 * applies.  Check if that file mask also grants the requested
+		 * access.
+		 */
+		if (current_fsuid() == inode->i_uid)
+			file_mask = acl->a_owner_mask;
+		else if (in_owner_or_group_class)
+			file_mask = acl->a_group_mask;
+		else
+			file_mask = acl->a_other_mask;
+		denied |= requested & ~file_mask;
+	}
+
+	return denied ? -EACCES : 0;
+}
+EXPORT_SYMBOL_GPL(richacl_permission);
diff --git a/include/linux/richacl.h b/include/linux/richacl.h
index be7db1f..86f7339 100644
--- a/include/linux/richacl.h
+++ b/include/linux/richacl.h
@@ -289,5 +289,7 @@ extern unsigned int richacl_mode_to_mask(mode_t);
 extern unsigned int richacl_want_to_mask(unsigned int);
 extern void richacl_compute_max_masks(struct richacl *);
 extern struct richacl *richacl_chmod(struct richacl *, mode_t);
+extern int richacl_permission(struct inode *, const struct richacl *,
+			      unsigned int);
 
 #endif /* __RICHACL_H */
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 18/26] richacl: Create-time inheritance
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (16 preceding siblings ...)
  2011-10-18 15:32 ` [PATCH -V7 17/26] richacl: Permission check algorithm Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32 ` [PATCH -V7 19/26] richacl: Check if an acl is equivalent to a file mode Aneesh Kumar K.V
                   ` (9 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

When a new file is created, it can inherit an acl from its parent
directory; this is similar to how default acls work in POSIX (draft)
ACLs.

As with POSIX ACLs, if a file inherits an acl from its parent directory,
the intersection between the create mode and the permissions granted by
the inherited acl determines the file masks and file permission bits,
and the umask is ignored.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/Makefile             |    2 +-
 fs/richacl_base.c       |   69 +++++++++++++++++++++++++++++++++++++++++++++++
 fs/richacl_inode.c      |   59 ++++++++++++++++++++++++++++++++++++++++
 include/linux/richacl.h |    4 +++
 4 files changed, 133 insertions(+), 1 deletions(-)
 create mode 100644 fs/richacl_inode.c
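
Editorial note, not part of the patch: the per-entry flag handling in
richacl_inherit() below can be sketched in user space as follows.  The flag
values are local copies of the NFSv4 (RFC 3530) values.  The sketch assumes
that richace_clear_inheritance_flags() clears all four inheritance flags;
sketch_inherit_flags() is a hypothetical helper, and the clearing of
ACE4_DELETE_CHILD from the mask for non-directories is not modelled.

```c
/* Local copies of the NFSv4 ace flag values. */
#define ACE4_FILE_INHERIT_ACE		0x00000001u
#define ACE4_DIRECTORY_INHERIT_ACE	0x00000002u
#define ACE4_NO_PROPAGATE_INHERIT_ACE	0x00000004u
#define ACE4_INHERIT_ONLY_ACE		0x00000008u

#define SKETCH_INHERITANCE_FLAGS (		\
		ACE4_FILE_INHERIT_ACE |		\
		ACE4_DIRECTORY_INHERIT_ACE |	\
		ACE4_NO_PROPAGATE_INHERIT_ACE |	\
		ACE4_INHERIT_ONLY_ACE)

/*
 * Decide whether a directory entry with @dir_flags is inherited by a new
 * child.  Returns 1 and stores the child's flags in @child if so, and 0 if
 * the entry is not inherited.
 */
static int sketch_inherit_flags(unsigned int dir_flags, int isdir,
				unsigned int *child)
{
	unsigned int flags = dir_flags;

	if (isdir) {
		/* Subdirectories take every inheritable entry. */
		if (!(dir_flags & (ACE4_FILE_INHERIT_ACE |
				   ACE4_DIRECTORY_INHERIT_ACE)))
			return 0;
		if (dir_flags & ACE4_NO_PROPAGATE_INHERIT_ACE)
			flags &= ~SKETCH_INHERITANCE_FLAGS;
		/* File-only entries pass through without taking effect. */
		if ((dir_flags & ACE4_FILE_INHERIT_ACE) &&
		    !(dir_flags & ACE4_DIRECTORY_INHERIT_ACE))
			flags |= ACE4_INHERIT_ONLY_ACE;
	} else {
		/* Non-directories take file-inherit entries, flags cleared. */
		if (!(dir_flags & ACE4_FILE_INHERIT_ACE))
			return 0;
		flags &= ~SKETCH_INHERITANCE_FLAGS;
	}
	*child = flags;
	return 1;
}
```

A file-inherit-only entry therefore reaches a new subdirectory as an
inherit-only entry (so it can be handed on to files created there), while a
new regular file gets the entry with all inheritance flags cleared.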

diff --git a/fs/Makefile b/fs/Makefile
index 7612168..1ecf9f2 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -49,7 +49,7 @@ obj-$(CONFIG_GENERIC_ACL)	+= generic_acl.o
 
 obj-$(CONFIG_FHANDLE)		+= fhandle.o
 obj-$(CONFIG_FS_RICHACL)	+= richacl.o
-richacl-y			:= richacl_base.o
+richacl-y			:= richacl_base.o richacl_inode.o
 
 obj-y				+= quota/
 
diff --git a/fs/richacl_base.c b/fs/richacl_base.c
index e9c2f30..6c7e839 100644
--- a/fs/richacl_base.c
+++ b/fs/richacl_base.c
@@ -491,3 +491,72 @@ is_everyone:
 	return denied ? -EACCES : 0;
 }
 EXPORT_SYMBOL_GPL(richacl_permission);
+
+/**
+ * richacl_inherit  -  compute the inherited acl of a new file
+ * @dir_acl:	acl of the containing directory
+ * @isdir:	inherit by a directory or non-directory?
+ *
+ * A directory can have acl entries which files and/or directories created
+ * inside the directory will inherit.  This function computes the acl for such
+ * a new file.  If there is no inheritable acl, it will return %NULL.
+ */
+struct richacl *
+richacl_inherit(const struct richacl *dir_acl, int isdir)
+{
+	const struct richace *dir_ace;
+	struct richacl *acl = NULL;
+	struct richace *ace;
+	int count = 0;
+
+	if (isdir) {
+		richacl_for_each_entry(dir_ace, dir_acl) {
+			if (!richace_is_inheritable(dir_ace))
+				continue;
+			count++;
+		}
+		if (!count)
+			return NULL;
+		acl = richacl_alloc(count);
+		if (!acl)
+			return ERR_PTR(-ENOMEM);
+		ace = acl->a_entries;
+		richacl_for_each_entry(dir_ace, dir_acl) {
+			if (!richace_is_inheritable(dir_ace))
+				continue;
+			memcpy(ace, dir_ace, sizeof(struct richace));
+			if (dir_ace->e_flags & ACE4_NO_PROPAGATE_INHERIT_ACE)
+				richace_clear_inheritance_flags(ace);
+			if ((dir_ace->e_flags & ACE4_FILE_INHERIT_ACE) &&
+			    !(dir_ace->e_flags & ACE4_DIRECTORY_INHERIT_ACE))
+				ace->e_flags |= ACE4_INHERIT_ONLY_ACE;
+			ace++;
+		}
+	} else {
+		richacl_for_each_entry(dir_ace, dir_acl) {
+			if (!(dir_ace->e_flags & ACE4_FILE_INHERIT_ACE))
+				continue;
+			count++;
+		}
+		if (!count)
+			return NULL;
+		acl = richacl_alloc(count);
+		if (!acl)
+			return ERR_PTR(-ENOMEM);
+		ace = acl->a_entries;
+		richacl_for_each_entry(dir_ace, dir_acl) {
+			if (!(dir_ace->e_flags & ACE4_FILE_INHERIT_ACE))
+				continue;
+			memcpy(ace, dir_ace, sizeof(struct richace));
+			richace_clear_inheritance_flags(ace);
+			/*
+			 * ACE4_DELETE_CHILD is meaningless for
+			 * non-directories, so clear it.
+			 */
+			ace->e_mask &= ~ACE4_DELETE_CHILD;
+			ace++;
+		}
+	}
+
+	return acl;
+}
diff --git a/fs/richacl_inode.c b/fs/richacl_inode.c
new file mode 100644
index 0000000..f590fb5
--- /dev/null
+++ b/fs/richacl_inode.c
@@ -0,0 +1,59 @@
+/*
+ * Copyright (C) 2010  Novell, Inc.
+ * Written by Andreas Gruenbacher <agruen@kernel.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2, or (at your option) any
+ * later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+#include <linux/sched.h>
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/richacl.h>
+
+/**
+ * richacl_inherit_inode  -  compute inherited acl and file mode
+ * @dir_acl:	acl of the containing directory
+ * @inode:	inode of the new file (create mode in i_mode)
+ *
+ * The file permission bits in inode->i_mode must be set to the create mode.
+ * If there is an inheritable acl, the maximum permissions that the acl grants
+ * will be computed and permissions not granted by the acl will be removed from
+ * inode->i_mode.  If there is no inheritable acl, the umask will be applied
+ * instead.
+ */
+struct richacl *
+richacl_inherit_inode(const struct richacl *dir_acl, struct inode *inode)
+{
+	struct richacl *acl;
+	mode_t mask;
+
+	acl = richacl_inherit(dir_acl, S_ISDIR(inode->i_mode));
+	if (acl) {
+
+		richacl_compute_max_masks(acl);
+
+		/*
+		 * Ensure that the acl will not grant any permissions beyond
+		 * the create mode.
+		 */
+		acl->a_flags |= ACL4_MASKED;
+		acl->a_owner_mask &= richacl_mode_to_mask(inode->i_mode >> 6) |
+				     ACE4_POSIX_OWNER_ALLOWED;
+		acl->a_group_mask &= richacl_mode_to_mask(inode->i_mode >> 3);
+		acl->a_other_mask &= richacl_mode_to_mask(inode->i_mode);
+		mask = ~S_IRWXUGO | richacl_masks_to_mode(acl);
+	} else
+		mask = ~current_umask();
+
+	inode->i_mode &= mask;
+	return acl;
+}
+EXPORT_SYMBOL_GPL(richacl_inherit_inode);
diff --git a/include/linux/richacl.h b/include/linux/richacl.h
index 86f7339..15953ae 100644
--- a/include/linux/richacl.h
+++ b/include/linux/richacl.h
@@ -291,5 +291,9 @@ extern void richacl_compute_max_masks(struct richacl *);
 extern struct richacl *richacl_chmod(struct richacl *, mode_t);
 extern int richacl_permission(struct inode *, const struct richacl *,
 			      unsigned int);
+extern struct richacl *richacl_inherit(const struct richacl *, int);
 
+/* richacl_inode.c */
+extern struct richacl *richacl_inherit_inode(const struct richacl *,
+					     struct inode *);
 #endif /* __RICHACL_H */
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 19/26] richacl: Check if an acl is equivalent to a file mode
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (17 preceding siblings ...)
  2011-10-18 15:32 ` [PATCH -V7 18/26] richacl: Create-time inheritance Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32 ` [PATCH -V7 20/26] richacl: Automatic Inheritance Aneesh Kumar K.V
                   ` (8 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

This function is used to avoid storing richacls on disk if the acl can
be computed from the file permission bits.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/richacl_base.c       |   54 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/richacl.h |    1 +
 2 files changed, 55 insertions(+), 0 deletions(-)
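
Editorial note, not part of the patch: a user-space sketch of the mask
comparison in richacl_equiv_mode().  It assumes the caller has already
verified that the acl consists of a single everyone@ allow entry with no
other flags, and passes that entry's mask as @everyone_mask.  The bit values
are local copies of the NFSv4 (RFC 3530) values; all sketch_* and SKETCH_*
names are hypothetical.

```c
#define ACE4_READ_DATA		0x00000001u
#define ACE4_WRITE_DATA		0x00000002u
#define ACE4_APPEND_DATA	0x00000004u
#define ACE4_EXECUTE		0x00000020u
#define ACE4_DELETE_CHILD	0x00000040u
#define ACE4_READ_ATTRIBUTES	0x00000080u
#define ACE4_WRITE_ATTRIBUTES	0x00000100u
#define ACE4_READ_ACL		0x00020000u
#define ACE4_WRITE_ACL		0x00040000u
#define ACE4_WRITE_OWNER	0x00080000u
#define ACE4_SYNCHRONIZE	0x00100000u

#define SKETCH_MODE_WRITE	(ACE4_WRITE_DATA | ACE4_APPEND_DATA | \
				 ACE4_DELETE_CHILD)
#define SKETCH_MODE_ALL		(ACE4_READ_DATA | SKETCH_MODE_WRITE | \
				 ACE4_EXECUTE)
#define SKETCH_ALWAYS_ALLOWED	(ACE4_SYNCHRONIZE | ACE4_READ_ATTRIBUTES | \
				 ACE4_READ_ACL)
#define SKETCH_OWNER_ALLOWED	(ACE4_WRITE_ATTRIBUTES | ACE4_WRITE_OWNER | \
				 ACE4_WRITE_ACL)

struct sketch_masks { unsigned int owner, group, other; };

static unsigned int sketch_to_mask(unsigned int rwx)
{
	unsigned int mask = SKETCH_ALWAYS_ALLOWED;

	if (rwx & 4)
		mask |= ACE4_READ_DATA;
	if (rwx & 2)
		mask |= SKETCH_MODE_WRITE;
	if (rwx & 1)
		mask |= ACE4_EXECUTE;
	return mask;
}

static unsigned int sketch_to_rwx(unsigned int mask)
{
	return ((mask & ACE4_READ_DATA) ? 4u : 0u) |
	       ((mask & SKETCH_MODE_WRITE) ? 2u : 0u) |
	       ((mask & ACE4_EXECUTE) ? 1u : 0u);
}

/* Returns 0 and stores the permission bits in @mode if representable. */
static int sketch_equiv_mode(const struct sketch_masks *m,
			     unsigned int everyone_mask, int isdir,
			     unsigned int *mode)
{
	unsigned int x = ~SKETCH_ALWAYS_ALLOWED;	/* bits we compare */
	unsigned int bits;

	if (!isdir)
		x &= ~ACE4_DELETE_CHILD;

	bits = sketch_to_rwx(m->owner) << 6 |
	       sketch_to_rwx(m->group) << 3 |
	       sketch_to_rwx(m->other);

	/* Converting mode -> mask must reproduce each mask exactly. */
	if ((m->group & x) != (sketch_to_mask(bits >> 3) & x) ||
	    (m->other & x) != (sketch_to_mask(bits) & x))
		return -1;

	x &= ~SKETCH_OWNER_ALLOWED;
	if ((m->owner & x) != (sketch_to_mask(bits >> 6) & x))
		return -1;
	if ((everyone_mask & x) != (SKETCH_MODE_ALL & x))
		return -1;

	*mode = bits;
	return 0;
}
```

Masks that came straight from a chmod(0640) round-trip cleanly and need no
stored acl.  Adding ACE4_WRITE_ACL to the group mask makes the acl
non-representable, so it would have to be kept on disk.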

diff --git a/fs/richacl_base.c b/fs/richacl_base.c
index 6c7e839..3a9842e 100644
--- a/fs/richacl_base.c
+++ b/fs/richacl_base.c
@@ -560,3 +560,57 @@ richacl_inherit(const struct richacl *dir_acl, int isdir)
 
 	return acl;
 }
+
+/**
+ * richacl_equiv_mode  -  check if @acl is equivalent to file permission bits
+ * @mode_p:	the file mode (including the file type)
+ *
+ * If @acl can be fully represented by file permission bits, this function
+ * returns 0, and the file permission bits in @mode_p are set to the equivalent
+ * of @acl.
+ *
+ * This function is used to avoid storing richacls on disk if the acl can be
+ * computed from the file permission bits.  It allows user-space to make sure
+ * that a file has no explicit richacl set.
+ */
+int
+richacl_equiv_mode(const struct richacl *acl, mode_t *mode_p)
+{
+	const struct richace *ace = acl->a_entries;
+	unsigned int x;
+	mode_t mode;
+
+	if (acl->a_count != 1 ||
+	    acl->a_flags != ACL4_MASKED ||
+	    !richace_is_everyone(ace) ||
+	    !richace_is_allow(ace) ||
+	    ace->e_flags & ~ACE4_SPECIAL_WHO)
+		return -1;
+
+	/*
+	 * Figure out the permissions we care about: ACE4_DELETE_CHILD is
+	 * meaningless for non-directories, so we ignore it.
+	 */
+	x = ~ACE4_POSIX_ALWAYS_ALLOWED;
+	if (!S_ISDIR(*mode_p))
+		x &= ~ACE4_DELETE_CHILD;
+
+	mode = richacl_masks_to_mode(acl);
+	if ((acl->a_group_mask & x) != (richacl_mode_to_mask(mode >> 3) & x) ||
+	    (acl->a_other_mask & x) != (richacl_mode_to_mask(mode) & x))
+		return -1;
+
+	/*
+	 * Ignore permissions which the owner is always allowed.
+	 */
+	x &= ~ACE4_POSIX_OWNER_ALLOWED;
+	if ((acl->a_owner_mask & x) != (richacl_mode_to_mask(mode >> 6) & x))
+		return -1;
+
+	if ((ace->e_mask & x) != (ACE4_POSIX_MODE_ALL & x))
+		return -1;
+
+	*mode_p = (*mode_p & ~S_IRWXUGO) | mode;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(richacl_equiv_mode);
diff --git a/include/linux/richacl.h b/include/linux/richacl.h
index 15953ae..4eeb22f 100644
--- a/include/linux/richacl.h
+++ b/include/linux/richacl.h
@@ -292,6 +292,7 @@ extern struct richacl *richacl_chmod(struct richacl *, mode_t);
 extern int richacl_permission(struct inode *, const struct richacl *,
 			      unsigned int);
 extern struct richacl *richacl_inherit(const struct richacl *, int);
+extern int richacl_equiv_mode(const struct richacl *, mode_t *);
 
 /* richacl_inode.c */
 extern struct richacl *richacl_inherit_inode(const struct richacl *,
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 20/26] richacl: Automatic Inheritance
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (18 preceding siblings ...)
  2011-10-18 15:32 ` [PATCH -V7 19/26] richacl: Check if an acl is equivalent to a file mode Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32   ` Aneesh Kumar K.V
                   ` (7 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

Automatic Inheritance (AI) allows changes to the acl of a directory to
recursively propagate down to files and directories in the directory.

To implement this, the kernel keeps track of which permissions have been
inherited, and makes sure that permission propagation is turned off when
the file permission bits of a file are changed (upon create or chmod).

The actual permission propagation is implemented in user space.

AI works as follows:

 - When the ACL4_AUTO_INHERIT flag in the acl of a file is cleared, the
   file is not affected by AI.

 - When the ACL4_AUTO_INHERIT flag in the acl of a directory is set and
   a file or subdirectory is created in that directory, the new file or
   subdirectory will have the ACL4_AUTO_INHERIT flag set, and all
   inherited aces will have the ACE4_INHERITED_ACE flag set.  This
   allows user space to distinguish between aces which have been
   inherited and aces which have been explicitly added.

 - When the ACL4_PROTECTED acl flag in the acl of a file is set, AI will
   not modify the acl of the file.  This does not affect propagation of
   permissions from the file to its children (if the file is a
   directory).

Linux does not have a way of creating files without setting the file
permission bits, so all files created inside a directory with
ACL4_AUTO_INHERIT set will also have the ACL4_PROTECTED flag set.  This
effectively disables AI.

Protocols which support creating files without specifying permissions
can explicitly clear the ACL4_PROTECTED flag after creating a file (and
reset the file masks to "undo" applying the create mode; see
richacl_compute_max_masks()).  This is a workaround; a per-create or
per-process flag indicating to ignore the create mode when AI is in
effect would fix this problem.
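
The flag handling described above can be sketched in user space as
follows.  This is only an illustration, not code from the patch: the
flag values are copied from the include/linux/richacl.h hunk below,
while the sketch_* helpers and SKETCH_VALID_FLAGS are hypothetical
names, and the create helper assumes that richacl_inherit() actually
returned an acl for the new object.

```c
#include <assert.h>

/* Flag values as defined in the include/linux/richacl.h hunk. */
#define ACL4_AUTO_INHERIT	0x01
#define ACL4_PROTECTED		0x02
#define ACL4_MASKED		0x80

/* Hypothetical name; mirrors ACL4_VALID_FLAGS from the patch. */
#define SKETCH_VALID_FLAGS \
	(ACL4_AUTO_INHERIT | ACL4_PROTECTED | ACL4_MASKED)

/* Hypothetical helper: a_flags of an object created in a directory. */
static unsigned char sketch_flags_on_create(unsigned char dir_flags)
{
	unsigned char flags = 0;

	assert(!(dir_flags & ~SKETCH_VALID_FLAGS));
	if (dir_flags & ACL4_AUTO_INHERIT) {
		/* richacl_inherit(): the child takes part in AI ... */
		flags = ACL4_AUTO_INHERIT;
		/* ... but applying the create mode is an implicit chmod. */
		flags |= ACL4_PROTECTED;
	}
	return flags;
}

/* Hypothetical helper: a_flags after richacl_chmod(). */
static unsigned char sketch_flags_on_chmod(unsigned char flags)
{
	assert(!(flags & ~SKETCH_VALID_FLAGS));
	if (flags & ACL4_AUTO_INHERIT)
		flags |= ACL4_PROTECTED;
	return flags;
}

/* Hypothetical helper: may a user-space AI tool rewrite this acl? */
static int sketch_ai_may_modify(unsigned char flags)
{
	return (flags & ACL4_AUTO_INHERIT) && !(flags & ACL4_PROTECTED);
}
```

With these helpers, an object created under an auto-inherit directory
ends up with both flags set, so sketch_ai_may_modify() reports that
propagation must skip it until ACL4_PROTECTED is cleared explicitly.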

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/richacl_base.c       |   10 +++++++++-
 fs/richacl_inode.c      |    7 ++++++-
 include/linux/richacl.h |   25 +++++++++++++++++++++++--
 3 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/fs/richacl_base.c b/fs/richacl_base.c
index 3a9842e..bde2eea 100644
--- a/fs/richacl_base.c
+++ b/fs/richacl_base.c
@@ -376,7 +376,8 @@ richacl_chmod(struct richacl *acl, mode_t mode)
 	if (acl->a_owner_mask == owner_mask &&
 	    acl->a_group_mask == group_mask &&
 	    acl->a_other_mask == other_mask &&
-	    (acl->a_flags & ACL4_MASKED))
+	    (acl->a_flags & ACL4_MASKED) &&
+	    (!richacl_is_auto_inherit(acl) || richacl_is_protected(acl)))
 		return acl;
 
 	clone = richacl_clone(acl);
@@ -388,6 +389,8 @@ richacl_chmod(struct richacl *acl, mode_t mode)
 	clone->a_owner_mask = owner_mask;
 	clone->a_group_mask = group_mask;
 	clone->a_other_mask = other_mask;
+	if (richacl_is_auto_inherit(clone))
+		clone->a_flags |= ACL4_PROTECTED;
 
 	return clone;
 }
@@ -557,6 +560,11 @@ richacl_inherit(const struct richacl *dir_acl, int isdir)
 			ace++;
 		}
 	}
+	if (richacl_is_auto_inherit(dir_acl)) {
+		acl->a_flags = ACL4_AUTO_INHERIT;
+		richacl_for_each_entry(ace, acl)
+			ace->e_flags |= ACE4_INHERITED_ACE;
+	}
 
 	return acl;
 }
diff --git a/fs/richacl_inode.c b/fs/richacl_inode.c
index f590fb5..c1297ad 100644
--- a/fs/richacl_inode.c
+++ b/fs/richacl_inode.c
@@ -37,9 +37,14 @@ richacl_inherit_inode(const struct richacl *dir_acl, struct inode *inode)
 
 	acl = richacl_inherit(dir_acl, S_ISDIR(inode->i_mode));
 	if (acl) {
+		/*
+		 * We need to set ACL4_PROTECTED because we are
+		 * doing an implicit chmod
+		 */
+		if (richacl_is_auto_inherit(acl))
+			acl->a_flags |= ACL4_PROTECTED;
 
 		richacl_compute_max_masks(acl);
-
 		/*
 		 * Ensure that the acl will not grant any permissions beyond
 		 * the create mode.
diff --git a/include/linux/richacl.h b/include/linux/richacl.h
index 4eeb22f..761585a 100644
--- a/include/linux/richacl.h
+++ b/include/linux/richacl.h
@@ -47,10 +47,16 @@ struct richacl {
 	     _ace != _acl->a_entries - 1; \
 	     _ace--)
 
+/* a_flags values */
+#define ACL4_AUTO_INHERIT		0x01
+#define ACL4_PROTECTED			0x02
+/* #define ACL4_DEFAULTED			0x04 */
 /* Flag values defined by rich-acl */
 #define ACL4_MASKED			0x80
 
 #define ACL4_VALID_FLAGS (			\
+		ACL4_AUTO_INHERIT |		\
+		ACL4_PROTECTED |		\
 		ACL4_MASKED)
 
 /* e_type values */
@@ -67,6 +73,7 @@ struct richacl {
 /*#define ACE4_SUCCESSFUL_ACCESS_ACE_FLAG	0x0010*/
 /*#define ACE4_FAILED_ACCESS_ACE_FLAG	0x0020*/
 #define ACE4_IDENTIFIER_GROUP		0x0040
+#define ACE4_INHERITED_ACE		0x0080
 /* in-memory representation only */
 #define ACE4_SPECIAL_WHO		0x4000
 
@@ -75,7 +82,8 @@ struct richacl {
 	ACE4_DIRECTORY_INHERIT_ACE |		\
 	ACE4_NO_PROPAGATE_INHERIT_ACE |		\
 	ACE4_INHERIT_ONLY_ACE |			\
-	ACE4_IDENTIFIER_GROUP)
+	ACE4_IDENTIFIER_GROUP |			\
+	ACE4_INHERITED_ACE)
 
 /* e_mask bitflags */
 #define ACE4_READ_DATA			0x00000001
@@ -181,6 +189,18 @@ richacl_put(struct richacl *acl)
 		kfree(acl);
 }
 
+static inline int
+richacl_is_auto_inherit(const struct richacl *acl)
+{
+	return acl->a_flags & ACL4_AUTO_INHERIT;
+}
+
+static inline int
+richacl_is_protected(const struct richacl *acl)
+{
+	return acl->a_flags & ACL4_PROTECTED;
+}
+
 /*
  * Special e_who identifiers: we use these pointer values in comparisons
  * instead of doing a strcmp.
@@ -259,7 +279,8 @@ richace_clear_inheritance_flags(struct richace *ace)
 	ace->e_flags &= ~(ACE4_FILE_INHERIT_ACE |
 			  ACE4_DIRECTORY_INHERIT_ACE |
 			  ACE4_NO_PROPAGATE_INHERIT_ACE |
-			  ACE4_INHERIT_ONLY_ACE);
+			  ACE4_INHERIT_ONLY_ACE |
+			  ACE4_INHERITED_ACE);
 }
 
 /**
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 21/26] richacl: xattr mapping functions
@ 2011-10-18 15:32   ` Aneesh Kumar K.V
  0 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

Map between "system.richacl" xattrs and the in-kernel representation.
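
As a rough illustration of the on-disk layout, the size computation done
by richacl_xattr_size() can be reproduced in user space.  This is a
sketch under stated assumptions, not part of the patch: the header sizes
are derived by hand from struct richacl_xattr (1+1+2+4+4+4 bytes) and
struct richace_xattr (2+2+4+4 bytes), and the sketch_* and SKETCH_*
names are hypothetical.  A NULL who string stands for an entry that is
identified by a numeric uid/gid.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed sizes of the packed xattr headers defined in this patch. */
#define SKETCH_ACL_HDR_SIZE	16	/* struct richacl_xattr */
#define SKETCH_ACE_HDR_SIZE	12	/* struct richace_xattr, without e_who */

/* Round n up to a multiple of 4, like ALIGN(n, 4) in the kernel. */
#define SKETCH_ALIGN4(n)	(((n) + 3u) & ~(size_t)3)

/* Size of one entry: header plus padded who string (or 4 zero bytes). */
static size_t sketch_ace_xattr_size(const char *who)
{
	if (!who)
		return SKETCH_ACE_HDR_SIZE + 4;
	return SKETCH_ACE_HDR_SIZE + SKETCH_ALIGN4(strlen(who) + 1);
}

/* Size of the whole xattr for count entries. */
static size_t sketch_acl_xattr_size(const char *const *who, size_t count)
{
	size_t size = SKETCH_ACL_HDR_SIZE;
	size_t i;

	assert(count == 0 || who != NULL);
	for (i = 0; i < count; i++)
		size += sketch_ace_xattr_size(who[i]);
	return size;
}
```

Because every who string is NUL-terminated and padded to a multiple of
four bytes, each entry header stays 4-byte aligned, which is what lets
richacl_from_xattr() step from one entry to the next with ALIGN().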

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/Makefile                   |    2 +-
 fs/richacl_xattr.c            |  156 +++++++++++++++++++++++++++++++++++++++++
 include/linux/richacl_xattr.h |   47 ++++++++++++
 3 files changed, 204 insertions(+), 1 deletions(-)
 create mode 100644 fs/richacl_xattr.c
 create mode 100644 include/linux/richacl_xattr.h

diff --git a/fs/Makefile b/fs/Makefile
index 1ecf9f2..e217c65 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -49,7 +49,7 @@ obj-$(CONFIG_GENERIC_ACL)	+= generic_acl.o
 
 obj-$(CONFIG_FHANDLE)		+= fhandle.o
 obj-$(CONFIG_FS_RICHACL)	+= richacl.o
-richacl-y			:= richacl_base.o richacl_inode.o
+richacl-y			:= richacl_base.o richacl_inode.o richacl_xattr.o
 
 obj-y				+= quota/
 
diff --git a/fs/richacl_xattr.c b/fs/richacl_xattr.c
new file mode 100644
index 0000000..02a7986
--- /dev/null
+++ b/fs/richacl_xattr.c
@@ -0,0 +1,156 @@
+/*
+ * Copyright (C) 2006, 2010  Novell, Inc.
+ * Written by Andreas Gruenbacher <agruen@kernel.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2, or (at your option) any
+ * later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/fs.h>
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/richacl_xattr.h>
+
+MODULE_LICENSE("GPL");
+
+/**
+ * richacl_from_xattr  -  convert a richacl xattr into the in-memory representation
+ */
+struct richacl *
+richacl_from_xattr(const void *value, size_t size)
+{
+	const struct richacl_xattr *xattr_acl = value;
+	const struct richace_xattr *xattr_ace = (void *)(xattr_acl + 1);
+	struct richacl *acl;
+	struct richace *ace;
+	int count;
+
+	if (size < sizeof(struct richacl_xattr) ||
+	    xattr_acl->a_version != ACL4_XATTR_VERSION ||
+	    (xattr_acl->a_flags & ~ACL4_VALID_FLAGS))
+		return ERR_PTR(-EINVAL);
+
+	count = le16_to_cpu(xattr_acl->a_count);
+	if (count > ACL4_XATTR_MAX_COUNT)
+		return ERR_PTR(-EINVAL);
+
+	acl = richacl_alloc(count);
+	if (!acl)
+		return ERR_PTR(-ENOMEM);
+
+	acl->a_flags = xattr_acl->a_flags;
+	acl->a_owner_mask = le32_to_cpu(xattr_acl->a_owner_mask);
+	if (acl->a_owner_mask & ~ACE4_VALID_MASK)
+		goto fail_einval;
+	acl->a_group_mask = le32_to_cpu(xattr_acl->a_group_mask);
+	if (acl->a_group_mask & ~ACE4_VALID_MASK)
+		goto fail_einval;
+	acl->a_other_mask = le32_to_cpu(xattr_acl->a_other_mask);
+	if (acl->a_other_mask & ~ACE4_VALID_MASK)
+		goto fail_einval;
+
+	richacl_for_each_entry(ace, acl) {
+		const char *who = (void *)(xattr_ace + 1), *end;
+		ssize_t used = (void *)who - value;
+
+		if (used > size)
+			goto fail_einval;
+		end = memchr(who, 0, size - used);
+		if (!end)
+			goto fail_einval;
+
+		ace->e_type = le16_to_cpu(xattr_ace->e_type);
+		ace->e_flags = le16_to_cpu(xattr_ace->e_flags);
+		ace->e_mask = le32_to_cpu(xattr_ace->e_mask);
+		ace->u.e_id = le32_to_cpu(xattr_ace->e_id);
+
+		if (ace->e_flags & ~ACE4_VALID_FLAGS)
+			goto fail_einval;
+		if (ace->e_type > ACE4_ACCESS_DENIED_ACE_TYPE ||
+		    (ace->e_mask & ~ACE4_VALID_MASK))
+			goto fail_einval;
+
+		if (who == end) {
+			if (ace->u.e_id == -1)
+				goto fail_einval;  /* uid/gid needed */
+		} else if (richace_set_who(ace, who))
+			goto fail_einval;
+
+		xattr_ace = (void *)who + ALIGN(end - who + 1, 4);
+	}
+
+	return acl;
+
+fail_einval:
+	richacl_put(acl);
+	return ERR_PTR(-EINVAL);
+}
+EXPORT_SYMBOL_GPL(richacl_from_xattr);
+
+/**
+ * richacl_xattr_size  -  compute the size of the xattr representation of @acl
+ */
+size_t
+richacl_xattr_size(const struct richacl *acl)
+{
+	size_t size = sizeof(struct richacl_xattr);
+	const struct richace *ace;
+
+	richacl_for_each_entry(ace, acl) {
+		size += sizeof(struct richace_xattr) +
+			(richace_is_unix_id(ace) ? 4 :
+			 ALIGN(strlen(ace->u.e_who) + 1, 4));
+	}
+	return size;
+}
+EXPORT_SYMBOL_GPL(richacl_xattr_size);
+
+/**
+ * richacl_to_xattr  -  convert @acl into its xattr representation
+ * @acl:	the richacl to convert
+ * @buffer:	buffer of size richacl_xattr_size(@acl) for the result
+ */
+void
+richacl_to_xattr(const struct richacl *acl, void *buffer)
+{
+	struct richacl_xattr *xattr_acl = buffer;
+	struct richace_xattr *xattr_ace;
+	const struct richace *ace;
+
+	xattr_acl->a_version = ACL4_XATTR_VERSION;
+	xattr_acl->a_flags = acl->a_flags;
+	xattr_acl->a_count = cpu_to_le16(acl->a_count);
+
+	xattr_acl->a_owner_mask = cpu_to_le32(acl->a_owner_mask);
+	xattr_acl->a_group_mask = cpu_to_le32(acl->a_group_mask);
+	xattr_acl->a_other_mask = cpu_to_le32(acl->a_other_mask);
+
+	xattr_ace = (void *)(xattr_acl + 1);
+	richacl_for_each_entry(ace, acl) {
+		xattr_ace->e_type = cpu_to_le16(ace->e_type);
+		xattr_ace->e_flags = cpu_to_le16(ace->e_flags &
+						 ACE4_VALID_FLAGS);
+		xattr_ace->e_mask = cpu_to_le32(ace->e_mask);
+		if (richace_is_unix_id(ace)) {
+			xattr_ace->e_id = cpu_to_le32(ace->u.e_id);
+			memset(xattr_ace->e_who, 0, 4);
+			xattr_ace = (void *)xattr_ace->e_who + 4;
+		} else {
+			int sz = ALIGN(strlen(ace->u.e_who) + 1, 4);
+
+			xattr_ace->e_id = cpu_to_le32(-1);
+			memset(xattr_ace->e_who + sz - 4, 0, 4);
+			strcpy(xattr_ace->e_who, ace->u.e_who);
+			xattr_ace = (void *)xattr_ace->e_who + sz;
+		}
+	}
+}
+EXPORT_SYMBOL_GPL(richacl_to_xattr);
diff --git a/include/linux/richacl_xattr.h b/include/linux/richacl_xattr.h
new file mode 100644
index 0000000..f79ec12
--- /dev/null
+++ b/include/linux/richacl_xattr.h
@@ -0,0 +1,47 @@
+/*
+ * Copyright (C) 2006, 2010  Novell, Inc.
+ * Written by Andreas Gruenbacher <agruen@kernel.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2, or (at your option) any
+ * later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+#ifndef __RICHACL_XATTR_H
+#define __RICHACL_XATTR_H
+
+#include <linux/richacl.h>
+
+#define RICHACL_XATTR "system.richacl"
+
+struct richace_xattr {
+	__le16		e_type;
+	__le16		e_flags;
+	__le32		e_mask;
+	__le32		e_id;
+	char		e_who[0];
+};
+
+struct richacl_xattr {
+	unsigned char	a_version;
+	unsigned char	a_flags;
+	__le16		a_count;
+	__le32		a_owner_mask;
+	__le32		a_group_mask;
+	__le32		a_other_mask;
+};
+
+#define ACL4_XATTR_VERSION	0
+#define ACL4_XATTR_MAX_COUNT	1024
+
+extern struct richacl *richacl_from_xattr(const void *, size_t);
+extern size_t richacl_xattr_size(const struct richacl *acl);
+extern void richacl_to_xattr(const struct richacl *, void *);
+
+#endif /* __RICHACL_XATTR_H */
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 22/26] vfs: Cache richacl in struct inode
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (20 preceding siblings ...)
  2011-10-18 15:32   ` Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32 ` [PATCH -V7 23/26] vfs: Add richacl permission check Aneesh Kumar K.V
                   ` (5 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: Andreas Gruenbacher <agruen@kernel.org>

Cache richacls in struct inode so that this doesn't have to be done
individually in each filesystem.
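
The cached pointer has three states, which the helpers added below
distinguish.  The following user-space sketch is an illustration only:
the sketch_* names are hypothetical, locking is left out, and the plain
counter stands in for the real reference count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct richacl. */
struct sketch_acl {
	int refcount;
};

/* Same sentinel idea as ACL_NOT_CACHED, i.e. ((void *)(-1)). */
#define SKETCH_NOT_CACHED	((struct sketch_acl *)(intptr_t)-1)

enum sketch_cache_state {
	SKETCH_UNKNOWN,		/* nothing cached yet: ask the filesystem */
	SKETCH_NEGATIVE,	/* NULL cached: the inode has no richacl */
	SKETCH_CACHED,		/* a real acl is cached */
};

/* Classify the value stored in the inode's cache slot. */
static enum sketch_cache_state sketch_classify(const struct sketch_acl *slot)
{
	if (slot == SKETCH_NOT_CACHED)
		return SKETCH_UNKNOWN;
	if (!slot)
		return SKETCH_NEGATIVE;
	return SKETCH_CACHED;
}

/* Like get_cached_richacl(): take a reference only on a real acl. */
static struct sketch_acl *sketch_get_cached(struct sketch_acl *slot)
{
	if (sketch_classify(slot) == SKETCH_CACHED) {
		assert(slot->refcount > 0);
		slot->refcount++;
	}
	return slot;
}
```

A caller that gets the sentinel back has to fetch the acl from the
filesystem, while a NULL result means it can fall back to the file mode
without any further lookup.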

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/inode.c              |   25 +++++++++++++++++----
 include/linux/fs.h      |   12 ++++++++-
 include/linux/richacl.h |   53 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 83 insertions(+), 7 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index ec79246..1b442cf 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -26,6 +26,7 @@
 #include <linux/ima.h>
 #include <linux/cred.h>
 #include <linux/buffer_head.h> /* for inode_has_buffers */
+#include <linux/richacl.h>
 #include "internal.h"
 
 /*
@@ -192,7 +193,12 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
 	inode->i_private = NULL;
 	inode->i_mapping = mapping;
 #ifdef CONFIG_FS_POSIX_ACL
-	inode->i_acl = inode->i_default_acl = ACL_NOT_CACHED;
+	if (IS_POSIXACL(inode))
+		inode->i_acl = inode->i_default_acl = ACL_NOT_CACHED;
+#endif
+#ifdef CONFIG_FS_RICHACL
+	if (IS_RICHACL(inode))
+		inode->i_richacl = ACL_NOT_CACHED;
 #endif
 
 #ifdef CONFIG_FSNOTIFY
@@ -242,10 +248,19 @@ void __destroy_inode(struct inode *inode)
 	security_inode_free(inode);
 	fsnotify_inode_delete(inode);
 #ifdef CONFIG_FS_POSIX_ACL
-	if (inode->i_acl && inode->i_acl != ACL_NOT_CACHED)
-		posix_acl_release(inode->i_acl);
-	if (inode->i_default_acl && inode->i_default_acl != ACL_NOT_CACHED)
-		posix_acl_release(inode->i_default_acl);
+	if (IS_POSIXACL(inode)) {
+		if (inode->i_acl && inode->i_acl != ACL_NOT_CACHED)
+			posix_acl_release(inode->i_acl);
+		if (inode->i_default_acl &&
+		    inode->i_default_acl != ACL_NOT_CACHED)
+			posix_acl_release(inode->i_default_acl);
+	}
+#endif
+#ifdef CONFIG_FS_RICHACL
+	if (IS_RICHACL(inode)) {
+		if (inode->i_richacl && inode->i_richacl != ACL_NOT_CACHED)
+			richacl_put(inode->i_richacl);
+	}
 #endif
 	this_cpu_dec(nr_inodes);
 }
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ac1d8e5..771955c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -755,6 +755,7 @@ static inline int mapping_writably_mapped(struct address_space *mapping)
 #endif
 
 struct posix_acl;
+struct richacl;
 #define ACL_NOT_CACHED ((void *)(-1))
 
 #define IOP_FASTPERM	0x0001
@@ -773,10 +774,17 @@ struct inode {
 	gid_t			i_gid;
 	unsigned int		i_flags;
 
+	union {
 #ifdef CONFIG_FS_POSIX_ACL
-	struct posix_acl	*i_acl;
-	struct posix_acl	*i_default_acl;
+		struct {
+			struct posix_acl *i_acl;
+			struct posix_acl *i_default_acl;
+		};
 #endif
+#ifdef CONFIG_FS_RICHACL
+		struct richacl	*i_richacl;
+#endif
+	};
 
 	const struct inode_operations	*i_op;
 	struct super_block	*i_sb;
diff --git a/include/linux/richacl.h b/include/linux/richacl.h
index 761585a..694b7dc 100644
--- a/include/linux/richacl.h
+++ b/include/linux/richacl.h
@@ -189,6 +189,59 @@ richacl_put(struct richacl *acl)
 		kfree(acl);
 }
 
+#ifdef CONFIG_FS_RICHACL
+static inline struct richacl *get_cached_richacl(struct inode *inode)
+{
+	struct richacl **p, *acl;
+
+	p = &inode->i_richacl;
+	acl = ACCESS_ONCE(*p);
+	if (acl) {
+		spin_lock(&inode->i_lock);
+		acl = *p;
+		if (acl != ACL_NOT_CACHED)
+			acl = richacl_get(acl);
+		spin_unlock(&inode->i_lock);
+	}
+	return acl;
+}
+
+static inline void set_cached_richacl(struct inode *inode,
+				      struct richacl *acl)
+{
+	struct richacl *old = NULL;
+	spin_lock(&inode->i_lock);
+	old = inode->i_richacl;
+	inode->i_richacl = richacl_get(acl);
+	spin_unlock(&inode->i_lock);
+	if (old != ACL_NOT_CACHED)
+		richacl_put(old);
+}
+
+static inline void forget_cached_richacl(struct inode *inode)
+{
+	struct richacl *old = NULL;
+	spin_lock(&inode->i_lock);
+	old = inode->i_richacl;
+	inode->i_richacl = ACL_NOT_CACHED;
+	spin_unlock(&inode->i_lock);
+	if (old != ACL_NOT_CACHED)
+		richacl_put(old);
+}
+
+static inline int negative_cached_richacl(struct inode *inode)
+{
+	struct richacl **p, *acl;
+
+	p = &inode->i_richacl;
+	acl = ACCESS_ONCE(*p);
+	if (acl)
+		return 0;
+	return 1;
+}
+
+#endif
+
 static inline int
 richacl_is_auto_inherit(const struct richacl *acl)
 {
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH -V7 23/26] vfs: Add richacl permission check
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (21 preceding siblings ...)
  2011-10-18 15:32 ` [PATCH -V7 22/26] vfs: Cache richacl in struct inode Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:32 ` [PATCH -V7 24/26] ext4: Use IS_POSIXACL() to check for POSIX ACL support Aneesh Kumar K.V
                   ` (4 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/attr.c               |    6 +++-
 fs/namei.c              |   13 ++++++++++-
 fs/richacl_base.c       |   54 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h      |    2 +-
 include/linux/richacl.h |    2 +
 5 files changed, 73 insertions(+), 4 deletions(-)

diff --git a/fs/attr.c b/fs/attr.c
index 00578b9..2b445ba 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -13,6 +13,7 @@
 #include <linux/fsnotify.h>
 #include <linux/fcntl.h>
 #include <linux/security.h>
+#include <linux/richacl.h>
 
 static int richacl_change_ok(struct inode *inode, int mask)
 {
@@ -21,8 +22,9 @@ static int richacl_change_ok(struct inode *inode, int mask)
 
 	if (inode->i_op->permission)
 		return inode->i_op->permission(inode, mask);
-
-	return check_acl(inode, mask);
+	if (inode->i_op->get_richacl)
+		return check_richacl(inode, mask);
+	return -EPERM;
 }
 
 static bool inode_uid_change_ok(struct inode *inode, uid_t ia_uid)
diff --git a/fs/namei.c b/fs/namei.c
index 0c28f95..5d8f21e 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -33,6 +33,7 @@
 #include <linux/device_cgroup.h>
 #include <linux/fs_struct.h>
 #include <linux/posix_acl.h>
+#include <linux/richacl.h>
 #include <asm/uaccess.h>
 
 #include "internal.h"
@@ -174,7 +175,7 @@ void putname(const char *name)
 EXPORT_SYMBOL(putname);
 #endif
 
-int check_acl(struct inode *inode, int mask)
+static int check_posix_acl(struct inode *inode, int mask)
 {
 #ifdef CONFIG_FS_POSIX_ACL
 	struct posix_acl *acl;
@@ -220,6 +221,16 @@ int check_acl(struct inode *inode, int mask)
 	return -EAGAIN;
 }
 
+static int check_acl(struct inode *inode, int mask)
+{
+	if (IS_POSIXACL(inode))
+		return check_posix_acl(inode, mask);
+	else if (IS_RICHACL(inode))
+		return check_richacl(inode, mask);
+	else
+		return -EAGAIN;
+}
+
 /*
  * This does the basic permission checking
  */
diff --git a/fs/richacl_base.c b/fs/richacl_base.c
index bde2eea..9a57039 100644
--- a/fs/richacl_base.c
+++ b/fs/richacl_base.c
@@ -622,3 +622,57 @@ richacl_equiv_mode(const struct richacl *acl, mode_t *mode_p)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(richacl_equiv_mode);
+
+int check_richacl(struct inode *inode, int want)
+{
+#ifdef CONFIG_FS_RICHACL
+	struct richacl *acl;
+	int richacl_mask = richacl_want_to_mask(want);
+
+	if (want & MAY_NOT_BLOCK) {
+		acl = rcu_dereference(inode->i_richacl);
+		if (!acl)
+			return -EAGAIN;
+		/* no ->get_richacl() calls in RCU mode... */
+		if (acl == ACL_NOT_CACHED)
+			return -ECHILD;
+		return richacl_permission(inode, acl, richacl_mask);
+	}
+	return richacl_check_acl(inode, richacl_mask);
+#endif
+	return -EAGAIN;
+}
+
+int richacl_check_acl(struct inode *inode, int richacl_mask)
+{
+
+#ifdef CONFIG_FS_RICHACL
+	struct richacl *acl;
+	acl = get_cached_richacl(inode);
+	/*
+	 * A filesystem can force an ACL callback by just never filling the
+	 * ACL cache. But normally you'd fill the cache either at inode
+	 * instantiation time, or on the first ->get_richacl() call.
+	 *
+	 * If the filesystem doesn't have a get_richacl() function at all,
+	 * we'll just create the negative cache entry.
+	 */
+	if (acl == ACL_NOT_CACHED) {
+		if (inode->i_op->get_richacl) {
+			acl = inode->i_op->get_richacl(inode);
+			if (IS_ERR(acl))
+				return PTR_ERR(acl);
+		} else {
+			set_cached_richacl(inode, NULL);
+			return -EAGAIN;
+		}
+	}
+	if (acl) {
+		int error = richacl_permission(inode, acl, richacl_mask);
+		richacl_put(acl);
+		return error;
+	}
+#endif
+	return -EAGAIN;
+}
+EXPORT_SYMBOL_GPL(richacl_check_acl);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 771955c..e01bad7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1624,6 +1624,7 @@ struct inode_operations {
 	void * (*follow_link) (struct dentry *, struct nameidata *);
 	int (*permission) (struct inode *, int);
 	struct posix_acl * (*get_acl)(struct inode *, int);
+	struct richacl * (*get_richacl)(struct inode *);
 
 	int (*readlink) (struct dentry *, char __user *,int);
 	void (*put_link) (struct dentry *, struct nameidata *, void *);
@@ -2243,7 +2244,6 @@ extern sector_t bmap(struct inode *, sector_t);
 extern int notify_change(struct dentry *, struct iattr *);
 extern int inode_permission(struct inode *, int);
 extern int generic_permission(struct inode *, int);
-extern int check_acl(struct inode *, int);
 
 static inline bool execute_ok(struct inode *inode)
 {
diff --git a/include/linux/richacl.h b/include/linux/richacl.h
index 694b7dc..4af6d22 100644
--- a/include/linux/richacl.h
+++ b/include/linux/richacl.h
@@ -367,6 +367,8 @@ extern int richacl_permission(struct inode *, const struct richacl *,
 			      unsigned int);
 extern struct richacl *richacl_inherit(const struct richacl *, int);
 extern int richacl_equiv_mode(const struct richacl *, mode_t *);
+extern int check_richacl(struct inode *, int);
+extern int richacl_check_acl(struct inode *, int);
 
 /* richacl_inode.c */
 extern struct richacl *richacl_inherit_inode(const struct richacl *,
-- 
1.7.5.4



* [PATCH -V7 24/26] ext4: Use IS_POSIXACL() to check for POSIX ACL support
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (22 preceding siblings ...)
  2011-10-18 15:32 ` [PATCH -V7 23/26] vfs: Add richacl permission check Aneesh Kumar K.V
@ 2011-10-18 15:32 ` Aneesh Kumar K.V
  2011-10-18 15:33 ` [PATCH -V7 25/26] ext4: Implement rich acl for ext4 Aneesh Kumar K.V
                   ` (3 subsequent siblings)
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:32 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

Use IS_POSIXACL() instead of a file system specific mount flag since we
have IS_POSIXACL() in the vfs already, anyway.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
---
 fs/ext4/acl.c   |   16 ++++++++--------
 fs/ext4/ext4.h  |    1 -
 fs/ext4/super.c |   16 +++++-----------
 3 files changed, 13 insertions(+), 20 deletions(-)

diff --git a/fs/ext4/acl.c b/fs/ext4/acl.c
index a5c29bb..525bbc3 100644
--- a/fs/ext4/acl.c
+++ b/fs/ext4/acl.c
@@ -139,7 +139,7 @@ ext4_get_acl(struct inode *inode, int type)
 	struct posix_acl *acl;
 	int retval;
 
-	if (!test_opt(inode->i_sb, POSIX_ACL))
+	if (!IS_POSIXACL(inode))
 		return NULL;
 
 	acl = get_cached_acl(inode, type);
@@ -248,7 +248,7 @@ ext4_init_acl(handle_t *handle, struct inode *inode, struct inode *dir)
 	int error = 0;
 
 	if (!S_ISLNK(inode->i_mode)) {
-		if (test_opt(dir->i_sb, POSIX_ACL)) {
+		if (IS_POSIXACL(inode)) {
 			acl = ext4_get_acl(dir, ACL_TYPE_DEFAULT);
 			if (IS_ERR(acl))
 				return PTR_ERR(acl);
@@ -256,7 +256,7 @@ ext4_init_acl(handle_t *handle, struct inode *inode, struct inode *dir)
 		if (!acl)
 			inode->i_mode &= ~current_umask();
 	}
-	if (test_opt(inode->i_sb, POSIX_ACL) && acl) {
+	if (IS_POSIXACL(inode) && acl) {
 		if (S_ISDIR(inode->i_mode)) {
 			error = ext4_set_acl(handle, inode,
 					     ACL_TYPE_DEFAULT, acl);
@@ -302,7 +302,7 @@ ext4_acl_chmod(struct inode *inode)
 
 	if (S_ISLNK(inode->i_mode))
 		return -EOPNOTSUPP;
-	if (!test_opt(inode->i_sb, POSIX_ACL))
+	if (!IS_POSIXACL(inode))
 		return 0;
 	acl = ext4_get_acl(inode, ACL_TYPE_ACCESS);
 	if (IS_ERR(acl) || !acl)
@@ -337,7 +337,7 @@ ext4_xattr_list_acl_access(struct dentry *dentry, char *list, size_t list_len,
 {
 	const size_t size = sizeof(POSIX_ACL_XATTR_ACCESS);
 
-	if (!test_opt(dentry->d_sb, POSIX_ACL))
+	if (!IS_POSIXACL(dentry->d_inode))
 		return 0;
 	if (list && size <= list_len)
 		memcpy(list, POSIX_ACL_XATTR_ACCESS, size);
@@ -350,7 +350,7 @@ ext4_xattr_list_acl_default(struct dentry *dentry, char *list, size_t list_len,
 {
 	const size_t size = sizeof(POSIX_ACL_XATTR_DEFAULT);
 
-	if (!test_opt(dentry->d_sb, POSIX_ACL))
+	if (!IS_POSIXACL(dentry->d_inode))
 		return 0;
 	if (list && size <= list_len)
 		memcpy(list, POSIX_ACL_XATTR_DEFAULT, size);
@@ -366,7 +366,7 @@ ext4_xattr_get_acl(struct dentry *dentry, const char *name, void *buffer,
 
 	if (strcmp(name, "") != 0)
 		return -EINVAL;
-	if (!test_opt(dentry->d_sb, POSIX_ACL))
+	if (!IS_POSIXACL(dentry->d_inode))
 		return -EOPNOTSUPP;
 
 	acl = ext4_get_acl(dentry->d_inode, type);
@@ -391,7 +391,7 @@ ext4_xattr_set_acl(struct dentry *dentry, const char *name, const void *value,
 
 	if (strcmp(name, "") != 0)
 		return -EINVAL;
-	if (!test_opt(inode->i_sb, POSIX_ACL))
+	if (!IS_POSIXACL(dentry->d_inode))
 		return -EOPNOTSUPP;
 	if (!inode_owner_or_capable(inode))
 		return -EPERM;
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index b7d7bd0..6627cc8 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -901,7 +901,6 @@ struct ext4_inode_info {
 #define EXT4_MOUNT_UPDATE_JOURNAL	0x01000	/* Update the journal format */
 #define EXT4_MOUNT_NO_UID32		0x02000  /* Disable 32-bit UIDs */
 #define EXT4_MOUNT_XATTR_USER		0x04000	/* Extended user attributes */
-#define EXT4_MOUNT_POSIX_ACL		0x08000	/* POSIX Access Control Lists */
 #define EXT4_MOUNT_NO_AUTO_DA_ALLOC	0x10000	/* No auto delalloc mapping */
 #define EXT4_MOUNT_BARRIER		0x20000 /* Use block barriers */
 #define EXT4_MOUNT_QUOTA		0x80000 /* Some quota option set */
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 44d0c8d..99d72cf 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1066,9 +1066,9 @@ static int ext4_show_options(struct seq_file *seq, struct vfsmount *vfs)
 		seq_puts(seq, ",nouser_xattr");
 #endif
 #ifdef CONFIG_EXT4_FS_POSIX_ACL
-	if (test_opt(sb, POSIX_ACL) && !(def_mount_opts & EXT4_DEFM_ACL))
+	if ((sb->s_flags & MS_POSIXACL) && !(def_mount_opts & EXT4_DEFM_ACL))
 		seq_puts(seq, ",acl");
-	if (!test_opt(sb, POSIX_ACL) && (def_mount_opts & EXT4_DEFM_ACL))
+	if (!(sb->s_flags & MS_POSIXACL) && (def_mount_opts & EXT4_DEFM_ACL))
 		seq_puts(seq, ",noacl");
 #endif
 	if (sbi->s_commit_interval != JBD2_DEFAULT_MAX_COMMIT_AGE*HZ) {
@@ -1587,10 +1587,10 @@ static int parse_options(char *options, struct super_block *sb,
 #endif
 #ifdef CONFIG_EXT4_FS_POSIX_ACL
 		case Opt_acl:
-			set_opt(sb, POSIX_ACL);
+			sb->s_flags |= MS_POSIXACL;
 			break;
 		case Opt_noacl:
-			clear_opt(sb, POSIX_ACL);
+			sb->s_flags &= ~MS_POSIXACL;
 			break;
 #else
 		case Opt_acl:
@@ -3170,7 +3170,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 	set_opt(sb, XATTR_USER);
 #endif
 #ifdef CONFIG_EXT4_FS_POSIX_ACL
-	set_opt(sb, POSIX_ACL);
+	sb->s_flags |= MS_POSIXACL;
 #endif
 	set_opt(sb, MBLK_IO_SUBMIT);
 	if ((def_mount_opts & EXT4_DEFM_JMODE) == EXT4_DEFM_JMODE_DATA)
@@ -3224,9 +3224,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 			   &journal_ioprio, NULL, 0))
 		goto failed_mount;
 
-	sb->s_flags = (sb->s_flags & ~MS_POSIXACL) |
-		(test_opt(sb, POSIX_ACL) ? MS_POSIXACL : 0);
-
 	if (le32_to_cpu(es->s_rev_level) == EXT4_GOOD_OLD_REV &&
 	    (EXT4_HAS_COMPAT_FEATURE(sb, ~0U) ||
 	     EXT4_HAS_RO_COMPAT_FEATURE(sb, ~0U) ||
@@ -4351,9 +4348,6 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
 	if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
 		ext4_abort(sb, "Abort forced by user");
 
-	sb->s_flags = (sb->s_flags & ~MS_POSIXACL) |
-		(test_opt(sb, POSIX_ACL) ? MS_POSIXACL : 0);
-
 	es = sbi->s_es;
 
 	if (sbi->s_journal) {
-- 
1.7.5.4



* [PATCH -V7 25/26] ext4: Implement rich acl for ext4
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (23 preceding siblings ...)
  2011-10-18 15:32 ` [PATCH -V7 24/26] ext4: Use IS_POSIXACL() to check for POSIX ACL support Aneesh Kumar K.V
@ 2011-10-18 15:33 ` Aneesh Kumar K.V
  2011-10-18 18:41   ` Andreas Dilger
  2011-10-18 15:33 ` [PATCH -V7 26/26] ext4: Add Ext4 compat richacl feature flag Aneesh Kumar K.V
                   ` (2 subsequent siblings)
  27 siblings, 1 reply; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:33 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

Support the richacl permission model in ext4.  The richacls are stored
in "system.richacl" xattrs. This needs to be enabled by tune2fs or during
mkfs.ext4.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
---
 fs/ext4/Kconfig   |   15 ++++
 fs/ext4/Makefile  |    1 +
 fs/ext4/acl.c     |    9 +-
 fs/ext4/acl.h     |    4 +-
 fs/ext4/file.c    |    4 +-
 fs/ext4/ialloc.c  |    7 ++-
 fs/ext4/inode.c   |   10 ++-
 fs/ext4/namei.c   |    7 +-
 fs/ext4/richacl.c |  227 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/ext4/richacl.h |   46 +++++++++++
 fs/ext4/xattr.c   |    6 ++
 fs/ext4/xattr.h   |    2 +
 12 files changed, 324 insertions(+), 14 deletions(-)
 create mode 100644 fs/ext4/richacl.c
 create mode 100644 fs/ext4/richacl.h

diff --git a/fs/ext4/Kconfig b/fs/ext4/Kconfig
index 9ed1bb1..a22b8f1d 100644
--- a/fs/ext4/Kconfig
+++ b/fs/ext4/Kconfig
@@ -83,3 +83,18 @@ config EXT4_DEBUG
 
 	  If you select Y here, then you will be able to turn on debugging
 	  with a command such as "echo 1 > /sys/kernel/debug/ext4/mballoc-debug"
+
+config EXT4_FS_RICHACL
+      bool "Ext4 Rich Access Control Lists (EXPERIMENTAL)"
+      depends on EXT4_FS_XATTR && EXPERIMENTAL
+      select FS_RICHACL
+      help
+	Rich ACLs are an implementation of NFSv4 ACLs, extended by file masks
+	to fit into the standard POSIX file permission model.  They are
+	designed to work seamlessly locally as well as across the NFSv4 and
+	CIFS/SMB2 network file system protocols.
+
+	To learn more about Rich ACL, visit
+	http://acl.bestbits.at/richacl/
+
+	If you don't know what Rich ACLs are, say N
diff --git a/fs/ext4/Makefile b/fs/ext4/Makefile
index 56fd8f86..9cd271a 100644
--- a/fs/ext4/Makefile
+++ b/fs/ext4/Makefile
@@ -12,3 +12,4 @@ ext4-y	:= balloc.o bitmap.o dir.o file.o fsync.o ialloc.o inode.o page-io.o \
 ext4-$(CONFIG_EXT4_FS_XATTR)		+= xattr.o xattr_user.o xattr_trusted.o
 ext4-$(CONFIG_EXT4_FS_POSIX_ACL)	+= acl.o
 ext4-$(CONFIG_EXT4_FS_SECURITY)		+= xattr_security.o
+ext4-$(CONFIG_EXT4_FS_RICHACL)		+= richacl.o
diff --git a/fs/ext4/acl.c b/fs/ext4/acl.c
index 525bbc3..00e54b8 100644
--- a/fs/ext4/acl.c
+++ b/fs/ext4/acl.c
@@ -131,8 +131,7 @@ fail:
  *
  * inode->i_mutex: don't care
  */
-struct posix_acl *
-ext4_get_acl(struct inode *inode, int type)
+struct posix_acl *ext4_get_posix_acl(struct inode *inode, int type)
 {
 	int name_index;
 	char *value = NULL;
@@ -249,7 +248,7 @@ ext4_init_acl(handle_t *handle, struct inode *inode, struct inode *dir)
 
 	if (!S_ISLNK(inode->i_mode)) {
 		if (IS_POSIXACL(inode)) {
-			acl = ext4_get_acl(dir, ACL_TYPE_DEFAULT);
+			acl = ext4_get_posix_acl(dir, ACL_TYPE_DEFAULT);
 			if (IS_ERR(acl))
 				return PTR_ERR(acl);
 		}
@@ -304,7 +303,7 @@ ext4_acl_chmod(struct inode *inode)
 		return -EOPNOTSUPP;
 	if (!IS_POSIXACL(inode))
 		return 0;
-	acl = ext4_get_acl(inode, ACL_TYPE_ACCESS);
+	acl = ext4_get_posix_acl(inode, ACL_TYPE_ACCESS);
 	if (IS_ERR(acl) || !acl)
 		return PTR_ERR(acl);
 	error = posix_acl_chmod(&acl, GFP_KERNEL, inode->i_mode);
@@ -369,7 +368,7 @@ ext4_xattr_get_acl(struct dentry *dentry, const char *name, void *buffer,
 	if (!IS_POSIXACL(dentry->d_inode))
 		return -EOPNOTSUPP;
 
-	acl = ext4_get_acl(dentry->d_inode, type);
+	acl = ext4_get_posix_acl(dentry->d_inode, type);
 	if (IS_ERR(acl))
 		return PTR_ERR(acl);
 	if (acl == NULL)
diff --git a/fs/ext4/acl.h b/fs/ext4/acl.h
index 18cb39e..ac2bad2 100644
--- a/fs/ext4/acl.h
+++ b/fs/ext4/acl.h
@@ -54,13 +54,13 @@ static inline int ext4_acl_count(size_t size)
 #ifdef CONFIG_EXT4_FS_POSIX_ACL
 
 /* acl.c */
-struct posix_acl *ext4_get_acl(struct inode *inode, int type);
+struct posix_acl *ext4_get_posix_acl(struct inode *inode, int type);
 extern int ext4_acl_chmod(struct inode *);
 extern int ext4_init_acl(handle_t *, struct inode *, struct inode *);
 
 #else  /* CONFIG_EXT4_FS_POSIX_ACL */
 #include <linux/sched.h>
-#define ext4_get_acl NULL
+#define ext4_get_posix_acl NULL
 
 static inline int
 ext4_acl_chmod(struct inode *inode)
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index e4095e9..2f515a2 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -28,6 +28,7 @@
 #include "ext4_jbd2.h"
 #include "xattr.h"
 #include "acl.h"
+#include "richacl.h"
 
 /*
  * Called when an inode is released. Note that this is different
@@ -301,7 +302,8 @@ const struct inode_operations ext4_file_inode_operations = {
 	.listxattr	= ext4_listxattr,
 	.removexattr	= generic_removexattr,
 #endif
-	.get_acl	= ext4_get_acl,
+	.get_acl	= ext4_get_posix_acl,
+	.get_richacl	= ext4_get_richacl,
 	.fiemap		= ext4_fiemap,
 };
 
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 9c63f27..77ea40b 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -28,6 +28,7 @@
 #include "ext4_jbd2.h"
 #include "xattr.h"
 #include "acl.h"
+#include "richacl.h"
 
 #include <trace/events/ext4.h>
 
@@ -1039,7 +1040,11 @@ got:
 	if (err)
 		goto fail_drop;
 
-	err = ext4_init_acl(handle, inode, dir);
+	if (EXT4_IS_RICHACL(dir))
+		err = ext4_init_richacl(handle, inode, dir);
+	else
+		err = ext4_init_acl(handle, inode, dir);
+
 	if (err)
 		goto fail_free_drop;
 
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 986e238..4b536e5 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -44,6 +44,7 @@
 #include "acl.h"
 #include "ext4_extents.h"
 #include "truncate.h"
+#include "richacl.h"
 
 #include <trace/events/ext4.h>
 
@@ -3945,9 +3946,12 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 	if (orphan && inode->i_nlink)
 		ext4_orphan_del(NULL, inode);
 
-	if (!rc && (ia_valid & ATTR_MODE))
-		rc = ext4_acl_chmod(inode);
-
+	if (!rc && (ia_valid & ATTR_MODE)) {
+		if (EXT4_IS_RICHACL(inode))
+			rc = ext4_richacl_chmod(inode);
+		else
+			rc = ext4_acl_chmod(inode);
+	}
 err_out:
 	ext4_std_error(inode->i_sb, error);
 	if (!error)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 1c924fa..b03efb5 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -39,6 +39,7 @@
 
 #include "xattr.h"
 #include "acl.h"
+#include "richacl.h"
 
 #include <trace/events/ext4.h>
 /*
@@ -2586,7 +2587,8 @@ const struct inode_operations ext4_dir_inode_operations = {
 	.listxattr	= ext4_listxattr,
 	.removexattr	= generic_removexattr,
 #endif
-	.get_acl	= ext4_get_acl,
+	.get_acl	= ext4_get_posix_acl,
+	.get_richacl	= ext4_get_richacl,
 	.fiemap         = ext4_fiemap,
 };
 
@@ -2598,5 +2600,6 @@ const struct inode_operations ext4_special_inode_operations = {
 	.listxattr	= ext4_listxattr,
 	.removexattr	= generic_removexattr,
 #endif
-	.get_acl	= ext4_get_acl,
+	.get_acl	= ext4_get_posix_acl,
+	.get_richacl	= ext4_get_richacl,
 };
diff --git a/fs/ext4/richacl.c b/fs/ext4/richacl.c
new file mode 100644
index 0000000..a0f63f8
--- /dev/null
+++ b/fs/ext4/richacl.c
@@ -0,0 +1,227 @@
+/*
+ * Copyright IBM Corporation, 2010
+ * Author Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2.1 of the GNU Lesser General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/fs.h>
+#include <linux/richacl_xattr.h>
+
+#include "ext4.h"
+#include "ext4_jbd2.h"
+#include "xattr.h"
+#include "acl.h"
+#include "richacl.h"
+
+struct richacl *
+ext4_get_richacl(struct inode *inode)
+{
+	const int name_index = EXT4_XATTR_INDEX_RICHACL;
+	void *value = NULL;
+	struct richacl *acl;
+	int retval;
+
+	if (!IS_RICHACL(inode))
+		return ERR_PTR(-EOPNOTSUPP);
+	acl = get_cached_richacl(inode);
+	if (acl != ACL_NOT_CACHED)
+		return acl;
+	retval = ext4_xattr_get(inode, name_index, "", NULL, 0);
+	if (retval > 0) {
+		value = kmalloc(retval, GFP_KERNEL);
+		if (!value)
+			return ERR_PTR(-ENOMEM);
+		retval = ext4_xattr_get(inode, name_index, "", value, retval);
+	}
+	if (retval > 0) {
+		acl = richacl_from_xattr(value, retval);
+		if (acl == ERR_PTR(-EINVAL))
+			acl = ERR_PTR(-EIO);
+	} else if (retval == -ENODATA || retval == -ENOSYS)
+		acl = NULL;
+	else
+		acl = ERR_PTR(retval);
+	kfree(value);
+
+	if (!IS_ERR_OR_NULL(acl))
+		set_cached_richacl(inode, acl);
+
+	return acl;
+}
+
+static int
+ext4_set_richacl(handle_t *handle, struct inode *inode, struct richacl *acl)
+{
+	const int name_index = EXT4_XATTR_INDEX_RICHACL;
+	size_t size = 0;
+	void *value = NULL;
+	int retval;
+
+	if (acl) {
+		mode_t mode = inode->i_mode;
+		if (richacl_equiv_mode(acl, &mode) == 0) {
+			inode->i_mode = mode;
+			ext4_mark_inode_dirty(handle, inode);
+			acl = NULL;
+		}
+	}
+	if (acl) {
+		size = richacl_xattr_size(acl);
+		value = kmalloc(size, GFP_KERNEL);
+		if (!value)
+			return -ENOMEM;
+		richacl_to_xattr(acl, value);
+	}
+	if (handle)
+		retval = ext4_xattr_set_handle(handle, inode, name_index, "",
+					       value, size, 0);
+	else
+		retval = ext4_xattr_set(inode, name_index, "", value, size, 0);
+	kfree(value);
+	if (!retval)
+		set_cached_richacl(inode, acl);
+
+	return retval;
+}
+
+int
+ext4_init_richacl(handle_t *handle, struct inode *inode, struct inode *dir)
+{
+	struct richacl *dir_acl = NULL;
+
+	if (!S_ISLNK(inode->i_mode)) {
+		dir_acl = ext4_get_richacl(dir);
+		if (IS_ERR(dir_acl))
+			return PTR_ERR(dir_acl);
+	}
+	if (dir_acl) {
+		struct richacl *acl;
+		int retval;
+
+		acl = richacl_inherit_inode(dir_acl, inode);
+		richacl_put(dir_acl);
+
+		retval = PTR_ERR(acl);
+		if (acl && !IS_ERR(acl)) {
+			retval = ext4_set_richacl(handle, inode, acl);
+			richacl_put(acl);
+		}
+		return retval;
+	} else {
+		inode->i_mode &= ~current_umask();
+		return 0;
+	}
+}
+
+int
+ext4_richacl_chmod(struct inode *inode)
+{
+	struct richacl *acl;
+	int retval;
+
+	if (S_ISLNK(inode->i_mode))
+		return -EOPNOTSUPP;
+	acl = ext4_get_richacl(inode);
+	if (IS_ERR_OR_NULL(acl))
+		return PTR_ERR(acl);
+	acl = richacl_chmod(acl, inode->i_mode);
+	if (IS_ERR(acl))
+		return PTR_ERR(acl);
+	retval = ext4_set_richacl(NULL, inode, acl);
+	richacl_put(acl);
+
+	return retval;
+}
+
+static size_t
+ext4_xattr_list_richacl(struct dentry *dentry, char *list, size_t list_len,
+			const char *name, size_t name_len, int type)
+{
+	const size_t size = sizeof(RICHACL_XATTR);
+	if (!IS_RICHACL(dentry->d_inode))
+		return 0;
+	if (list && size <= list_len)
+		memcpy(list, RICHACL_XATTR, size);
+	return size;
+}
+
+static int
+ext4_xattr_get_richacl(struct dentry *dentry, const char *name, void *buffer,
+		size_t buffer_size, int type)
+{
+	struct richacl *acl;
+	size_t size;
+
+	if (strcmp(name, "") != 0)
+		return -EINVAL;
+	acl = ext4_get_richacl(dentry->d_inode);
+	if (IS_ERR(acl))
+		return PTR_ERR(acl);
+	if (acl == NULL)
+		return -ENODATA;
+	size = richacl_xattr_size(acl);
+	if (buffer && size > buffer_size) {
+		richacl_put(acl);
+		return -ERANGE;
+	}
+	if (buffer)
+		richacl_to_xattr(acl, buffer);
+	richacl_put(acl);
+	return size;
+}
+
+static int
+ext4_xattr_set_richacl(struct dentry *dentry, const char *name,
+		const void *value, size_t size, int flags, int type)
+{
+	handle_t *handle;
+	struct richacl *acl = NULL;
+	int retval, retries = 0;
+	struct inode *inode = dentry->d_inode;
+
+	if (!IS_RICHACL(dentry->d_inode))
+		return -EOPNOTSUPP;
+	if (S_ISLNK(inode->i_mode))
+		return -EOPNOTSUPP;
+	if (strcmp(name, "") != 0)
+		return -EINVAL;
+	if (current_fsuid() != inode->i_uid &&
+	    richacl_check_acl(inode, ACE4_WRITE_ACL) &&
+	    !capable(CAP_FOWNER))
+		return -EPERM;
+	if (value) {
+		acl = richacl_from_xattr(value, size);
+		if (IS_ERR(acl))
+			return PTR_ERR(acl);
+
+		inode->i_mode &= ~S_IRWXUGO;
+		inode->i_mode |= richacl_masks_to_mode(acl);
+	}
+
+retry:
+	handle = ext4_journal_start(inode, EXT4_DATA_TRANS_BLOCKS(inode->i_sb));
+	if (IS_ERR(handle))
+		return PTR_ERR(handle);
+	retval = ext4_set_richacl(handle, inode, acl);
+	ext4_journal_stop(handle);
+	if (retval == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
+		goto retry;
+	richacl_put(acl);
+	return retval;
+}
+
+const struct xattr_handler ext4_richacl_xattr_handler = {
+	.prefix	= RICHACL_XATTR,
+	.list	= ext4_xattr_list_richacl,
+	.get	= ext4_xattr_get_richacl,
+	.set	= ext4_xattr_set_richacl,
+};
diff --git a/fs/ext4/richacl.h b/fs/ext4/richacl.h
new file mode 100644
index 0000000..2577c34
--- /dev/null
+++ b/fs/ext4/richacl.h
@@ -0,0 +1,46 @@
+/*
+ * Copyright IBM Corporation, 2010
+ * Author Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2.1 of the GNU Lesser General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ */
+
+#ifndef __FS_EXT4_RICHACL_H
+#define __FS_EXT4_RICHACL_H
+
+#include <linux/richacl.h>
+
+#ifdef CONFIG_EXT4_FS_RICHACL
+
+#define EXT4_IS_RICHACL(inode) IS_RICHACL(inode)
+
+extern struct richacl *ext4_get_richacl(struct inode *);
+extern int ext4_init_richacl(handle_t *, struct inode *, struct inode *);
+extern int ext4_richacl_chmod(struct inode *);
+
+#else  /* CONFIG_EXT4_FS_RICHACL */
+
+#define EXT4_IS_RICHACL(inode) (0)
+#define ext4_get_richacl   NULL
+
+static inline int
+ext4_init_richacl(handle_t *handle, struct inode *inode, struct inode *dir)
+{
+	return 0;
+}
+
+static inline int
+ext4_richacl_chmod(struct inode *inode)
+{
+	return 0;
+}
+
+#endif  /* CONFIG_EXT4_FS_RICHACL */
+#endif  /* __FS_EXT4_RICHACL_H */
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index c757adc..9a00772 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -107,6 +107,9 @@ static const struct xattr_handler *ext4_xattr_handler_map[] = {
 #ifdef CONFIG_EXT4_FS_SECURITY
 	[EXT4_XATTR_INDEX_SECURITY]	     = &ext4_xattr_security_handler,
 #endif
+#ifdef CONFIG_EXT4_FS_RICHACL
+	[EXT4_XATTR_INDEX_RICHACL]           = &ext4_richacl_xattr_handler,
+#endif
 };
 
 const struct xattr_handler *ext4_xattr_handlers[] = {
@@ -119,6 +122,9 @@ const struct xattr_handler *ext4_xattr_handlers[] = {
 #ifdef CONFIG_EXT4_FS_SECURITY
 	&ext4_xattr_security_handler,
 #endif
+#ifdef CONFIG_EXT4_FS_RICHACL
+	&ext4_richacl_xattr_handler,
+#endif
 	NULL
 };
 
diff --git a/fs/ext4/xattr.h b/fs/ext4/xattr.h
index 25b7387..d5ad729 100644
--- a/fs/ext4/xattr.h
+++ b/fs/ext4/xattr.h
@@ -21,6 +21,7 @@
 #define EXT4_XATTR_INDEX_TRUSTED		4
 #define	EXT4_XATTR_INDEX_LUSTRE			5
 #define EXT4_XATTR_INDEX_SECURITY	        6
+#define EXT4_XATTR_INDEX_RICHACL		7
 
 struct ext4_xattr_header {
 	__le32	h_magic;	/* magic number for identification */
@@ -70,6 +71,7 @@ extern const struct xattr_handler ext4_xattr_trusted_handler;
 extern const struct xattr_handler ext4_xattr_acl_access_handler;
 extern const struct xattr_handler ext4_xattr_acl_default_handler;
 extern const struct xattr_handler ext4_xattr_security_handler;
+extern const struct xattr_handler ext4_richacl_xattr_handler;
 
 extern ssize_t ext4_listxattr(struct dentry *, char *, size_t);
 
-- 
1.7.5.4



* [PATCH -V7 26/26] ext4: Add Ext4 compat richacl feature flag
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (24 preceding siblings ...)
  2011-10-18 15:33 ` [PATCH -V7 25/26] ext4: Implement rich acl for ext4 Aneesh Kumar K.V
@ 2011-10-18 15:33 ` Aneesh Kumar K.V
  2011-10-18 16:17 ` [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Shea Levy
  2011-10-19 22:21 ` J. Bruce Fields
  27 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-18 15:33 UTC (permalink / raw)
  To: agruen, bfields, akpm, viro, dhowells
  Cc: aneesh.kumar, linux-fsdevel, linux-nfs, linux-kernel

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

This feature flag can be used to enable richacl on
the file system. Once enabled, the "acl" mount option
will enable richacl instead of POSIX ACLs. The patch also
removes the richacl mount option.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/ext4/ext4.h  |    1 +
 fs/ext4/super.c |   49 +++++++++++++++++++++++++++++++++++--------------
 2 files changed, 36 insertions(+), 14 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 6627cc8..c71d9fe 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1350,6 +1350,7 @@ static inline void ext4_clear_state_flags(struct ext4_inode_info *ei)
 #define EXT4_FEATURE_COMPAT_EXT_ATTR		0x0008
 #define EXT4_FEATURE_COMPAT_RESIZE_INODE	0x0010
 #define EXT4_FEATURE_COMPAT_DIR_INDEX		0x0020
+#define EXT4_FEATURE_COMPAT_RICHACL		0x0200
 
 #define EXT4_FEATURE_RO_COMPAT_SPARSE_SUPER	0x0001
 #define EXT4_FEATURE_RO_COMPAT_LARGE_FILE	0x0002
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 99d72cf..4a3f0dd 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1065,10 +1065,12 @@ static int ext4_show_options(struct seq_file *seq, struct vfsmount *vfs)
 	if (!test_opt(sb, XATTR_USER))
 		seq_puts(seq, ",nouser_xattr");
 #endif
-#ifdef CONFIG_EXT4_FS_POSIX_ACL
-	if ((sb->s_flags & MS_POSIXACL) && !(def_mount_opts & EXT4_DEFM_ACL))
+#if defined(CONFIG_EXT4_FS_POSIX_ACL) || defined(CONFIG_EXT4_FS_RICHACL)
+	if ((sb->s_flags & (MS_POSIXACL|MS_RICHACL)) &&
+	    !(def_mount_opts & EXT4_DEFM_ACL))
 		seq_puts(seq, ",acl");
-	if (!(sb->s_flags & MS_POSIXACL) && (def_mount_opts & EXT4_DEFM_ACL))
+	if (!(sb->s_flags & (MS_POSIXACL|MS_RICHACL)) &&
+	    (def_mount_opts & EXT4_DEFM_ACL))
 		seq_puts(seq, ",noacl");
 #endif
 	if (sbi->s_commit_interval != JBD2_DEFAULT_MAX_COMMIT_AGE*HZ) {
@@ -1421,6 +1423,32 @@ static ext4_fsblk_t get_sb_block(void **data)
 	return sb_block;
 }
 
+static void enable_acl(struct super_block *sb)
+{
+#if !defined(CONFIG_EXT4_FS_POSIX_ACL) && !defined(CONFIG_EXT4_FS_RICHACL)
+	ext4_msg(sb, KERN_ERR, "acl options not supported");
+	return;
+#endif
+	if (EXT4_HAS_COMPAT_FEATURE(sb, EXT4_FEATURE_COMPAT_RICHACL)) {
+		sb->s_flags |= MS_RICHACL;
+		sb->s_flags &= ~MS_POSIXACL;
+	} else {
+		sb->s_flags |= MS_POSIXACL;
+		sb->s_flags &= ~MS_RICHACL;
+	}
+	return;
+}
+
+static void disable_acl(struct super_block *sb)
+{
+#if !defined(CONFIG_EXT4_FS_POSIX_ACL) && !defined(CONFIG_EXT4_FS_RICHACL)
+	ext4_msg(sb, KERN_ERR, "acl options not supported");
+	return;
+#endif
+	sb->s_flags &= ~(MS_POSIXACL | MS_RICHACL);
+	return;
+}
+
 #define DEFAULT_JOURNAL_IOPRIO (IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, 3))
 static char deprecated_msg[] = "Mount option \"%s\" will be removed by %s\n"
 	"Contact linux-ext4@vger.kernel.org if you think we should keep it.\n";
@@ -1585,19 +1613,12 @@ static int parse_options(char *options, struct super_block *sb,
 			ext4_msg(sb, KERN_ERR, "(no)user_xattr options not supported");
 			break;
 #endif
-#ifdef CONFIG_EXT4_FS_POSIX_ACL
 		case Opt_acl:
-			sb->s_flags |= MS_POSIXACL;
+			enable_acl(sb);
 			break;
 		case Opt_noacl:
-			sb->s_flags &= ~MS_POSIXACL;
+			disable_acl(sb);
 			break;
-#else
-		case Opt_acl:
-		case Opt_noacl:
-			ext4_msg(sb, KERN_ERR, "(no)acl options not supported");
-			break;
-#endif
 		case Opt_journal_update:
 			/* @@@ FIXME */
 			/* Eventually we will want to be able to create
@@ -3169,8 +3190,8 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 #ifdef CONFIG_EXT4_FS_XATTR
 	set_opt(sb, XATTR_USER);
 #endif
-#ifdef CONFIG_EXT4_FS_POSIX_ACL
-	sb->s_flags |= MS_POSIXACL;
+#if defined(CONFIG_EXT4_FS_POSIX_ACL) || defined(CONFIG_EXT4_FS_RICHACL)
+	enable_acl(sb);
 #endif
 	set_opt(sb, MBLK_IO_SUBMIT);
 	if ((def_mount_opts & EXT4_DEFM_JMODE) == EXT4_DEFM_JMODE_DATA)
-- 
1.7.5.4



* Re: [PATCH -V7 00/26]  New ACL format for better NFSv4 acl interoperability
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (25 preceding siblings ...)
  2011-10-18 15:33 ` [PATCH -V7 26/26] ext4: Add Ext4 compat richacl feature flag Aneesh Kumar K.V
@ 2011-10-18 16:17 ` Shea Levy
  2011-10-19  5:54   ` Aneesh Kumar K.V
  2011-10-19 22:21 ` J. Bruce Fields
  27 siblings, 1 reply; 66+ messages in thread
From: Shea Levy @ 2011-10-18 16:17 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: agruen, bfields, akpm, viro, dhowells, linux-fsdevel, linux-nfs,
	linux-kernel

On 10/18/11 11:32 AM, Aneesh Kumar K.V wrote:
> More details regarding richacl can be found at
> http://acl.bestbits.at/richacl/
>
FYI, this site says nfs4acls is the successor project of richacls, but 
from what I can see it is actually the predecessor. Is my understanding 
correct, or is nfs4acls the next-gen product here?

Regards,
Shea Levy


* Re: [PATCH -V7 25/26] ext4: Implement rich acl for ext4
  2011-10-18 15:33 ` [PATCH -V7 25/26] ext4: Implement rich acl for ext4 Aneesh Kumar K.V
@ 2011-10-18 18:41   ` Andreas Dilger
  2011-10-19  5:43     ` Aneesh Kumar K.V
  0 siblings, 1 reply; 66+ messages in thread
From: Andreas Dilger @ 2011-10-18 18:41 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: agruen, bfields, akpm, viro, dhowells, linux-fsdevel, linux-nfs,
	linux-kernel

On 2011-10-18, at 9:33 AM, Aneesh Kumar K.V wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> 
> Support the richacl permission model in ext4.  The richacls are stored
> in "system.richacl" xattrs. This needs to be enabled by tune2fs or during
> mkfs.ext4.

It isn't clear from your commit comment or the code what needs to be enabled by tune2fs or mkfs.ext4.  Please list the specific ext4 feature
that needs to be enabled.

> +#ifdef CONFIG_EXT4_FS_RICHACL
> +#define EXT4_IS_RICHACL(inode) IS_RICHACL(inode)
> 

> +#else  /* CONFIG_FS_EXT4_RICHACL */
> +
> +#define EXT4_IS_RICHACL(inode) (0)

It is a bit confusing that you are using both EXT4_IS_RICHACL() and
IS_RICHACL() in this code.  Initially I thought EXT4_IS_RICHACL() was
checking an ext4-specific inode flag, but it seems that it is instead
conditional upon the configure flags.

It looks like it should be possible to use EXT4_IS_RICHACL() in all
of the code, since the richacl-specific code will not be compiled
anyway.

Cheers, Andreas


* Re: [PATCH -V7 25/26] ext4: Implement rich acl for ext4
  2011-10-18 18:41   ` Andreas Dilger
@ 2011-10-19  5:43     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-19  5:43 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: agruen, bfields, akpm, viro, dhowells, linux-fsdevel, linux-nfs,
	linux-kernel

On Tue, 18 Oct 2011 12:41:15 -0600, Andreas Dilger <adilger@dilger.ca> wrote:
> On 2011-10-18, at 9:33 AM, Aneesh Kumar K.V wrote:
> > From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> > 
> > Support the richacl permission model in ext4.  The richacls are stored
> > in "system.richacl" xattrs.This need to be enabled by tune2fs or during
> > mkfs.ext4
> 
> It isn't clear from your commit comment or the code what needs to be enabled by tune2fs or mkfs.ext4.  Please list the specific ext4 feature
> that needs to be enabled.


The last patch explains the feature flag details 
http://article.gmane.org/gmane.linux.kernel/1204873

I am adding a new compat feature flag to indicate richacl is
enabled.

> 
> > +#ifdef CONFIG_EXT4_FS_RICHACL
> > +#define EXT4_IS_RICHACL(inode) IS_RICHACL(inode)
> > 
> 
> > +#else  /* CONFIG_FS_EXT4_RICHACL */
> > +
> > +#define EXT4_IS_RICHACL(inode) (0)
> 
> It is a bit confusing that you are using both EXT4_IS_RICHACL() and
> IS_RICHACL() in this code.  Initially I thought EXT4_IS_RICHACL() was
> checking an ext4-specific inode flag, but it seems that it is instead
> conditional upon the configure flags.
> 

The reason is to skip the superblock flag check when
CONFIG_EXT4_FS_RICHACL is not enabled.


> It looks like it should be possible to use EXT4_IS_RICHACL() in all
> of the code, since the richacl-specific code will not be compiled
> anyway.
> 

The reasoning is that all richacl-specific code checks whether
MS_RICHACL is enabled, while the common file system code uses
something like EXT4_IS_RICHACL(), which is (0) when the file
system is not compiled with the richacl option.
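
The pattern being discussed can be sketched in user space roughly as
follows. This is only an illustration: the struct names, the flag value
and the uses_richacl_model() caller are hypothetical stand-ins, and only
the macro names and the CONFIG_EXT4_FS_RICHACL symbol are taken from the
patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's superblock and inode. */
#define MS_RICHACL_SKETCH 0x1u	/* assumed value, illustration only */

struct sb_sketch    { unsigned int s_flags; };
struct inode_sketch { const struct sb_sketch *i_sb; };

/* Runtime check: is richacl enabled on this inode's file system? */
#define IS_RICHACL(inode) (((inode)->i_sb->s_flags & MS_RICHACL_SKETCH) != 0)

#ifdef CONFIG_EXT4_FS_RICHACL
/* Built with richacl support: defer to the runtime flag check. */
#define EXT4_IS_RICHACL(inode) IS_RICHACL(inode)
#else
/*
 * Built without it: a constant 0, so the compiler can drop the branch.
 * The (void) cast only keeps the argument "used" in this sketch.
 */
#define EXT4_IS_RICHACL(inode) ((void)(inode), 0)
#endif

/* Hypothetical common-code caller that picks the permission model. */
bool uses_richacl_model(const struct inode_sketch *inode)
{
	return EXT4_IS_RICHACL(inode) ? true : false;
}
```

When the config symbol is not defined, uses_richacl_model() returns
false even for an inode whose superblock has the flag set, which is the
difference between the two macros that the review comment asks about.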

-aneesh

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 00/26]  New ACL format for better NFSv4 acl interoperability
  2011-10-18 16:17 ` [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Shea Levy
@ 2011-10-19  5:54   ` Aneesh Kumar K.V
  0 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-19  5:54 UTC (permalink / raw)
  To: Shea Levy
  Cc: agruen, bfields, akpm, viro, dhowells, linux-fsdevel, linux-nfs,
	linux-kernel

On Tue, 18 Oct 2011 12:17:56 -0400, Shea Levy <shea@shealevy.com> wrote:
> On 10/18/11 11:32 AM, Aneesh Kumar K.V wrote:
> > More details regarding richacl can be found at
> > http://acl.bestbits.at/richacl/
> >
> FYI, this site says nfs4acls is the successor project of richacls, but 
> from what I can see it is actually the predecessor. Is my understanding 
> correct, or is nfs4acls the next-gen product here?
> 

You are correct. It is the predecessor.

-aneesh


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 08/26] vfs: Add new file and directory create permission flags
  2011-10-18 15:32 ` [PATCH -V7 08/26] vfs: Add new file and directory create permission flags Aneesh Kumar K.V
@ 2011-10-19 16:42   ` J. Bruce Fields
  2011-10-20  5:20       ` Aneesh Kumar K.V
  0 siblings, 1 reply; 66+ messages in thread
From: J. Bruce Fields @ 2011-10-19 16:42 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: agruen, akpm, viro, dhowells, linux-fsdevel, linux-nfs, linux-kernel

On Tue, Oct 18, 2011 at 09:02:43PM +0530, Aneesh Kumar K.V wrote:
> From: Andreas Gruenbacher <agruen@kernel.org>
> 
> Some permission models distinguish between the permission to create a
> non-directory and a directory.  Pass this information down to
> inode_permission() as mask flags
...
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index f3ebf86..60361c6 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -67,6 +67,8 @@ struct inodes_stat_t {
>  #define MAY_CHDIR		0x00000040
>  /* called from RCU mode, don't block */
>  #define MAY_NOT_BLOCK		0x00000080
> +#define MAY_CREATE_FILE		0x00000100
> +#define MAY_CREATE_DIR		0x00000200

Hm, are the flags in fs/nfsd/vfs.h going to need fixing up?

Looking at the nfsd code....  No, I guess it's OK, nfsd does

	err = inode_permission(inode, acc & (MAY_READ|MAY_WRITE|MAY_EXEC));

So we can wait to fix up any collisions until we need to pass these
extra bits.
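
A small user-space sketch of why that masking makes a collision
harmless for now. The MAY_* values match include/linux/fs.h with this
series applied; NFSD_PRIVATE_SKETCH is a hypothetical nfsd-private flag
assumed to reuse bit 0x100, and vfs_mask_from_nfsd() is an illustrative
name, not an nfsd function.

```c
#include <assert.h>

/* Basic permission bits and the new flag from this series. */
#define MAY_EXEC		0x00000001
#define MAY_WRITE		0x00000002
#define MAY_READ		0x00000004
#define MAY_CREATE_FILE		0x00000100

/* Hypothetical nfsd-private flag that happens to reuse bit 0x100. */
#define NFSD_PRIVATE_SKETCH	0x00000100

/* Keep only the bits that are passed on to inode_permission(). */
int vfs_mask_from_nfsd(int acc)
{
	return acc & (MAY_READ | MAY_WRITE | MAY_EXEC);
}
```

Because the private bit is stripped before the call, the VFS never sees
it as MAY_CREATE_FILE. The collision only matters once nfsd wants to
pass the new bits through.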

--b.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 09/26] vfs: Add delete child and delete self permission flags
  2011-10-18 15:32 ` [PATCH -V7 09/26] vfs: Add delete child and delete self " Aneesh Kumar K.V
@ 2011-10-19 22:09   ` J. Bruce Fields
  2011-10-20  7:35       ` Aneesh Kumar K.V
  0 siblings, 1 reply; 66+ messages in thread
From: J. Bruce Fields @ 2011-10-19 22:09 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: agruen, akpm, viro, dhowells, linux-fsdevel, linux-nfs, linux-kernel

On Tue, Oct 18, 2011 at 09:02:44PM +0530, Aneesh Kumar K.V wrote:
> From: Andreas Gruenbacher <agruen@kernel.org>
> 
> Normally, deleting a file requires write access to the parent directory.
> Some permission models use a different permission on the parent
> directory to indicate delete access.  In addition, a process can have
> per-file delete access even without delete access on the parent
> directory.
> 
> Introduce two new inode_permission() mask flags and use them in
> may_delete()
> 
> Acked-by: David Howells <dhowells@redhat.com>
> Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  fs/namei.c         |   42 ++++++++++++++++++++++++++++--------------
>  include/linux/fs.h |    2 ++
>  2 files changed, 30 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/namei.c b/fs/namei.c
> index f6184b8..7bf42e8 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -337,7 +337,7 @@ static inline int do_inode_permission(struct inode *inode, int mask)
>   * are used for other things.
>   *
>   * When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
> - * MAY_WRITE must also be set in @mask.
> + * MAY_DELETE_CHILD, MAY_DELETE_SELF, MAY_WRITE must also be set in @mask.
>   */
>  int inode_permission(struct inode *inode, int mask)
>  {
> @@ -1853,7 +1853,7 @@ static inline int check_sticky(struct inode *dir, struct inode *inode)
>  		return 0;
>  
>  other_userns:
> -	return !ns_capable(inode_userns(inode), CAP_FOWNER);
> +	return 1;
>  }
>  
>  /*
> @@ -1875,30 +1875,44 @@ other_userns:
>   * 10. We don't allow removal of NFS sillyrenamed files; it's handled by
>   *     nfs_async_unlink().
>   */
> -static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
> +static int may_delete(struct inode *dir, struct dentry *victim,
> +		      int isdir, int replace)
>  {
> -	int error;
> +	struct inode *inode = victim->d_inode;
> +	int mask, replace_mask = 0, error, is_sticky;
> +
>  
> -	if (!victim->d_inode)
> +	if (!inode)
>  		return -ENOENT;
>  
>  	BUG_ON(victim->d_parent->d_inode != dir);
>  	audit_inode_child(victim, dir);
>  
> -	error = inode_permission(dir, MAY_WRITE | MAY_EXEC);
> +	mask = MAY_WRITE | MAY_EXEC | MAY_DELETE_CHILD;
> +	if (replace)
> +		replace_mask = S_ISDIR(inode->i_mode) ?
> +				MAY_CREATE_DIR : MAY_CREATE_FILE;
> +	is_sticky = check_sticky(dir, inode);
> +	error = inode_permission(dir, mask | replace_mask);
> +	if ((error || is_sticky) && IS_RICHACL(inode) &&
> +	    (inode_permission(dir, MAY_EXEC | replace_mask) == 0) &&
> +	    (inode_permission(inode, MAY_DELETE_SELF) == 0))
> +		error = 0;
> +	else if (!error && is_sticky &&
> +		 !ns_capable(inode_userns(inode), CAP_FOWNER))
> +		error = -EPERM;

Maybe I'm dense, but that big if-else-if is still giving me a headache.

The point is just to delay the ns_capable() check to avoid setting
PF_SUPERPRIV in cases where we weren't before?

How about using a helper function for the richacl check, and
calling it from check_sticky instead? That makes the above:

	error = inode_permission(dir, mask | replace_mask);
	if (error && !richacl_may_delete(dir, inode, replace_mask))
		return error;
	if (check_sticky(dir, inode, replace_mask))
		return -EPERM;

(As in the following--totally untested and possibly wrong.)

Also: the comment before may_delete() needs updating.
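
One way to sanity-check the restructuring is a user-space model that
reduces every kernel check to a boolean input and compares the two
control flows over all combinations. This is a sketch under strong
assumptions: the function names and errno stand-ins are hypothetical,
"sticky" means the sticky-bit restriction applies to the caller before
any capability check, and side effects such as PF_SUPERPRIV are not
modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define EPERM_SKETCH	1	/* stand-ins for errno values */
#define EACCES_SKETCH	13

/* The if/else-if from the posted patch, with boolean inputs. */
int may_delete_posted(bool dir_denied, bool sticky, bool richacl_ok,
		      bool capable)
{
	int error = dir_denied ? -EACCES_SKETCH : 0;

	if ((error || sticky) && richacl_ok)
		error = 0;
	else if (!error && sticky && !capable)
		error = -EPERM_SKETCH;
	return error;
}

/* check_sticky() as proposed: richacl helper first, capability last. */
static bool check_sticky_proposed(bool sticky, bool richacl_ok, bool capable)
{
	if (!sticky)
		return false;
	if (richacl_ok)
		return false;
	return !capable;
}

/* The proposed control flow with early returns. */
int may_delete_proposed(bool dir_denied, bool sticky, bool richacl_ok,
			bool capable)
{
	int error = dir_denied ? -EACCES_SKETCH : 0;

	if (error && !richacl_ok)
		return error;
	if (check_sticky_proposed(sticky, richacl_ok, capable))
		return -EPERM_SKETCH;
	return 0;
}

/* Count the input combinations on which the two versions disagree. */
int count_mismatches(void)
{
	int mismatches = 0;

	for (int bits = 0; bits < 16; bits++) {
		bool d = bits & 1, s = bits & 2, r = bits & 4, c = bits & 8;

		if (may_delete_posted(d, s, r, c) !=
		    may_delete_proposed(d, s, r, c))
			mismatches++;
	}
	return mismatches;
}
```

Calling count_mismatches() walks all 16 input combinations. The model
only covers the permission decision, not the later IS_APPEND and
IS_IMMUTABLE checks.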

--b.

commit 7fe4b12ba6b914167ed1f1bc617af04eecbce7d1
Author: Andreas Gruenbacher <agruen@kernel.org>
Date:   Tue Oct 18 15:17:50 2011 +0530

    vfs: Add delete child and delete self permission flags
    
    Normally, deleting a file requires write access to the parent directory.
    Some permission models use a different permission on the parent
    directory to indicate delete access.  In addition, a process can have
    per-file delete access even without delete access on the parent
    directory.
    
    Introduce two new inode_permission() mask flags and use them in
    may_delete()
    
    Acked-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

diff --git a/fs/namei.c b/fs/namei.c
index f6184b8..f0cccd9 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -337,7 +337,7 @@ static inline int do_inode_permission(struct inode *inode, int mask)
  * are used for other things.
  *
  * When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
- * MAY_WRITE must also be set in @mask.
+ * MAY_DELETE_CHILD, MAY_DELETE_SELF, MAY_WRITE must also be set in @mask.
  */
 int inode_permission(struct inode *inode, int mask)
 {
@@ -1835,11 +1835,18 @@ static int user_path_parent(int dfd, const char __user *path,
 	return error;
 }
 
+static bool richacl_may_delete(struct inode *dir, struct inode *inode, int replace_mask)
+{
+	return IS_RICHACL(inode)
+		&& (inode_permission(dir, MAY_EXEC | replace_mask) == 0)
+		&& (inode_permission(inode, MAY_DELETE_SELF) == 0);
+}
+
 /*
  * It's inline, so penalty for filesystems that don't use sticky bit is
  * minimal.
  */
-static inline int check_sticky(struct inode *dir, struct inode *inode)
+static inline int check_sticky(struct inode *dir, struct inode *inode, int replace_mask)
 {
 	uid_t fsuid = current_fsuid();
 
@@ -1851,7 +1858,8 @@ static inline int check_sticky(struct inode *dir, struct inode *inode)
 		return 0;
 	if (dir->i_uid == fsuid)
 		return 0;
-
+	if (richacl_may_delete(dir, inode, replace_mask))
+		return 0;
 other_userns:
 	return !ns_capable(inode_userns(inode), CAP_FOWNER);
 }
@@ -1875,30 +1883,38 @@ other_userns:
  * 10. We don't allow removal of NFS sillyrenamed files; it's handled by
  *     nfs_async_unlink().
  */
-static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
+static int may_delete(struct inode *dir, struct dentry *victim,
+		      int isdir, int replace)
 {
-	int error;
+	struct inode *inode = victim->d_inode;
+	int mask, replace_mask = 0, error;
+
 
-	if (!victim->d_inode)
+	if (!inode)
 		return -ENOENT;
 
 	BUG_ON(victim->d_parent->d_inode != dir);
 	audit_inode_child(victim, dir);
 
-	error = inode_permission(dir, MAY_WRITE | MAY_EXEC);
-	if (error)
+	mask = MAY_WRITE | MAY_EXEC | MAY_DELETE_CHILD;
+	if (replace)
+		replace_mask = S_ISDIR(inode->i_mode) ?
+				MAY_CREATE_DIR : MAY_CREATE_FILE;
+	error = inode_permission(dir, mask | replace_mask);
+	if (error && !richacl_may_delete(dir, inode, replace_mask))
 		return error;
+	if (check_sticky(dir, inode, replace_mask))
+		return -EPERM;
 	if (IS_APPEND(dir))
 		return -EPERM;
-	if (check_sticky(dir, victim->d_inode)||IS_APPEND(victim->d_inode)||
-	    IS_IMMUTABLE(victim->d_inode) || IS_SWAPFILE(victim->d_inode))
+	if (IS_APPEND(inode) || IS_IMMUTABLE(inode) || IS_SWAPFILE(inode))
 		return -EPERM;
 	if (isdir) {
-		if (!S_ISDIR(victim->d_inode->i_mode))
+		if (!S_ISDIR(inode->i_mode))
 			return -ENOTDIR;
 		if (IS_ROOT(victim))
 			return -EBUSY;
-	} else if (S_ISDIR(victim->d_inode->i_mode))
+	} else if (S_ISDIR(inode->i_mode))
 		return -EISDIR;
 	if (IS_DEADDIR(dir))
 		return -ENOENT;
@@ -2605,7 +2621,7 @@ void dentry_unhash(struct dentry *dentry)
 
 int vfs_rmdir(struct inode *dir, struct dentry *dentry)
 {
-	int error = may_delete(dir, dentry, 1);
+	int error = may_delete(dir, dentry, 1, 0);
 
 	if (error)
 		return error;
@@ -2700,7 +2716,7 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname)
 
 int vfs_unlink(struct inode *dir, struct dentry *dentry)
 {
-	int error = may_delete(dir, dentry, 0);
+	int error = may_delete(dir, dentry, 0, 0);
 
 	if (error)
 		return error;
@@ -3096,14 +3112,14 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	if (old_dentry->d_inode == new_dentry->d_inode)
  		return 0;
  
-	error = may_delete(old_dir, old_dentry, is_dir);
+	error = may_delete(old_dir, old_dentry, is_dir, 0);
 	if (error)
 		return error;
 
 	if (!new_dentry->d_inode)
 		error = may_create(new_dir, new_dentry, is_dir);
 	else
-		error = may_delete(new_dir, new_dentry, is_dir);
+		error = may_delete(new_dir, new_dentry, is_dir, 1);
 	if (error)
 		return error;
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 60361c6..ccece40 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -69,6 +69,8 @@ struct inodes_stat_t {
 #define MAY_NOT_BLOCK		0x00000080
 #define MAY_CREATE_FILE		0x00000100
 #define MAY_CREATE_DIR		0x00000200
+#define MAY_DELETE_CHILD	0x00000400
+#define MAY_DELETE_SELF		0x00000800
 
 /*
  * flags in file.f_mode.  Note that FMODE_READ and FMODE_WRITE must correspond

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
  2011-10-18 15:32   ` Aneesh Kumar K.V
  (?)
@ 2011-10-19 22:20   ` J. Bruce Fields
  2011-10-20  8:30     ` Aneesh Kumar K.V
  -1 siblings, 1 reply; 66+ messages in thread
From: J. Bruce Fields @ 2011-10-19 22:20 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: agruen, akpm, viro, dhowells, linux-fsdevel, linux-nfs, linux-kernel

On Tue, Oct 18, 2011 at 09:02:56PM +0530, Aneesh Kumar K.V wrote:
> +#define RICHACL_XATTR "system.richacl"
> +
> +struct richace_xattr {
> +	__le16		e_type;
> +	__le16		e_flags;
> +	__le32		e_mask;
> +	__le32		e_id;
> +	char		e_who[0];
> +};

Does it really make sense to use a string here just to pick between the
three choices OWNER@, GROUP@, and EVERYONE@?  Why not just another small
integer?  Is the goal to expand this somehow eventually?
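
For illustration, the small-integer alternative could look something
like the sketch below. The enum, its values and both helper names are
hypothetical and do not appear in the patch; only the three special
identifier strings come from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ids; neither the names nor the values are from the patch. */
enum special_who_sketch {
	WHO_NONE_SKETCH = 0,	/* not a special identifier */
	WHO_OWNER_SKETCH,
	WHO_GROUP_SKETCH,
	WHO_EVERYONE_SKETCH,
};

/* Map the string form to the small-integer form. */
enum special_who_sketch special_who_from_string(const char *who)
{
	if (strcmp(who, "OWNER@") == 0)
		return WHO_OWNER_SKETCH;
	if (strcmp(who, "GROUP@") == 0)
		return WHO_GROUP_SKETCH;
	if (strcmp(who, "EVERYONE@") == 0)
		return WHO_EVERYONE_SKETCH;
	return WHO_NONE_SKETCH;
}

/* Inverse mapping; returns NULL for WHO_NONE_SKETCH or unknown ids. */
const char *special_who_to_string(enum special_who_sketch id)
{
	switch (id) {
	case WHO_OWNER_SKETCH:
		return "OWNER@";
	case WHO_GROUP_SKETCH:
		return "GROUP@";
	case WHO_EVERYONE_SKETCH:
		return "EVERYONE@";
	default:
		return NULL;
	}
}
```

With a fixed set of three identifiers, the integer form keeps every
entry the same size. The string form leaves room for identifiers that
are not known in advance.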

--b.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 00/26]  New ACL format for better NFSv4 acl interoperability
  2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
                   ` (26 preceding siblings ...)
  2011-10-18 16:17 ` [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Shea Levy
@ 2011-10-19 22:21 ` J. Bruce Fields
  27 siblings, 0 replies; 66+ messages in thread
From: J. Bruce Fields @ 2011-10-19 22:21 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: agruen, akpm, viro, dhowells, linux-fsdevel, linux-nfs, linux-kernel

On Tue, Oct 18, 2011 at 09:02:35PM +0530, Aneesh Kumar K.V wrote:
> Hi,
> 
> The following set of patches implements VFS and ext4 changes needed to implement
> a new acl model for linux. Rich ACLs are an implementation of NFSv4 ACLs,
> extended by file masks to fit into the standard POSIX file permission model.
> They are designed to work seamlessly locally as well as across the NFSv4 and
> CIFS/SMB2 network file system protocols.

Except for two questions in replies to individual patches, these look
good to me.

--b.

> 
> A user-space utility for displaying and changing richacls is available at [4]
> (a number of examples can be found at http://acl.bestbits.at/richacl/examples.html).
> 
> [4] git://github.com/kvaneesh/richacl-tools.git master
> 
> To test richacl on ext4 use tune2fs -O richacl to enable richacl feature and mount
> the file system using -o acl mount option.
> 
> More details regarding richacl can be found at
> http://acl.bestbits.at/richacl/
> 
> Changes from v6:
> a) Update patches based on review comments.
> b) Add Acked-by:
> c) rebase to 3.1-rc10
> 
> git repository With all the patches can be found at
> git://github.com/kvaneesh/linux.git richacl
> 
> IMHO the patches are ready to be merged upstream. How do we push these changes
> to Linus tree ? Andrew, Viro, any comment on how we can get this merged upstream ?
> 
> -aneesh
> 
> 

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 08/26] vfs: Add new file and directory create permission flags
@ 2011-10-20  5:20       ` Aneesh Kumar K.V
  0 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-20  5:20 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: agruen, akpm, viro, dhowells, linux-fsdevel, linux-nfs, linux-kernel

On Wed, 19 Oct 2011 12:42:16 -0400, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> On Tue, Oct 18, 2011 at 09:02:43PM +0530, Aneesh Kumar K.V wrote:
> > From: Andreas Gruenbacher <agruen@kernel.org>
> > 
> > Some permission models distinguish between the permission to create a
> > non-directory and a directory.  Pass this information down to
> > inode_permission() as mask flags
> ...
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index f3ebf86..60361c6 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -67,6 +67,8 @@ struct inodes_stat_t {
> >  #define MAY_CHDIR		0x00000040
> >  /* called from RCU mode, don't block */
> >  #define MAY_NOT_BLOCK		0x00000080
> > +#define MAY_CREATE_FILE		0x00000100
> > +#define MAY_CREATE_DIR		0x00000200
> 
> Hm, are the flags in fs/nfsd/vfs.h going to need fixing up?
> 
> Looking at the nfsd code....  No, I guess it's OK, nfsd does
> 
> 	err = inode_permission(inode, acc & (MAY_READ|MAY_WRITE|MAY_EXEC));
> 

The nfsd bits will need fixing once nfs starts doing richacl permission checks.
The changes I did can be found at
https://github.com/kvaneesh/linux/commits/richacl-fullset/

> So we can wait to fix up any collisions until we need to pass these
> extra bits.
> 

Yes, and we will do that once the VFS and local file system
changes are upstream.

-aneesh

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 09/26] vfs: Add delete child and delete self permission flags
@ 2011-10-20  7:35       ` Aneesh Kumar K.V
  0 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-20  7:35 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: agruen, akpm, viro, dhowells, linux-fsdevel, linux-nfs, linux-kernel

On Wed, 19 Oct 2011 18:09:15 -0400, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> Maybe I'm dense, but that big if-else-if is still giving me a headache.
> 
> The point is just to delay the ns_capable() check to avoid setting
> PF_SUPERPRIV in cases where we weren't before?
> 
> How about using a helper function for the richacl check, and
> calling it from check_sticky instead? That makes the above:
> 
> 	error = inode_permission(dir, mask | replace_mask);
> 	if (error && !richacl_may_delete(dir, inode, replace_mask))
> 		return error;
> 	if (check_sticky(dir, inode, replace_mask))
> 		return -EPERM;
> 
> (As in the following--totally untested and possibly wrong.)
> 
> Also: the comment before may_delete() needs updating.
> 

Thanks for the suggestion. That made the code simpler. Updated patch
below.

commit 3c92363ce2dee22aa174327c21726f8f02cbcd6e
Author: Andreas Gruenbacher <agruen@kernel.org>
Date:   Tue Oct 18 15:17:50 2011 +0530

    vfs: Add delete child and delete self permission flags
    
    Normally, deleting a file requires write access to the parent directory.
    Some permission models use a different permission on the parent
    directory to indicate delete access.  In addition, a process can have
    per-file delete access even without delete access on the parent
    directory.
    
    Introduce two new inode_permission() mask flags and use them in
    may_delete()
    
    Acked-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

diff --git a/fs/namei.c b/fs/namei.c
index f6184b8..044b6d1 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -337,7 +337,7 @@ static inline int do_inode_permission(struct inode *inode, int mask)
  * are used for other things.
  *
  * When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
- * MAY_WRITE must also be set in @mask.
+ * MAY_DELETE_CHILD, MAY_DELETE_SELF, MAY_WRITE must also be set in @mask.
  */
 int inode_permission(struct inode *inode, int mask)
 {
@@ -1835,11 +1835,25 @@ static int user_path_parent(int dfd, const char __user *path,
 	return error;
 }
 
+
+/*
+ * We should have exec permission on directory and MAY_DELETE_SELF
+ * on the object being deleted.
+ */
+static int richacl_may_selfdelete(struct inode *dir,
+				  struct inode *inode, int replace_mask)
+{
+	return (IS_RICHACL(inode) &&
+		(inode_permission(dir, MAY_EXEC | replace_mask) == 0) &&
+		(inode_permission(inode, MAY_DELETE_SELF) == 0));
+}
+
 /*
  * It's inline, so penalty for filesystems that don't use sticky bit is
  * minimal.
  */
-static inline int check_sticky(struct inode *dir, struct inode *inode)
+static inline int check_sticky(struct inode *dir,
+			       struct inode *inode, int replace_mask)
 {
 	uid_t fsuid = current_fsuid();
 
@@ -1851,7 +1865,8 @@ static inline int check_sticky(struct inode *dir, struct inode *inode)
 		return 0;
 	if (dir->i_uid == fsuid)
 		return 0;
-
+	if (richacl_may_selfdelete(dir, inode, replace_mask))
+		return 0;
 other_userns:
 	return !ns_capable(inode_userns(inode), CAP_FOWNER);
 }
@@ -1875,30 +1890,38 @@ other_userns:
  * 10. We don't allow removal of NFS sillyrenamed files; it's handled by
  *     nfs_async_unlink().
  */
-static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
+static int may_delete(struct inode *dir, struct dentry *victim,
+		      int isdir, int replace)
 {
-	int error;
+	int mask, replace_mask = 0, error;
+	struct inode *inode = victim->d_inode;
 
-	if (!victim->d_inode)
+	if (!inode)
 		return -ENOENT;
 
 	BUG_ON(victim->d_parent->d_inode != dir);
 	audit_inode_child(victim, dir);
 
-	error = inode_permission(dir, MAY_WRITE | MAY_EXEC);
+	mask = MAY_WRITE | MAY_EXEC | MAY_DELETE_CHILD;
+	if (replace)
+		replace_mask = S_ISDIR(inode->i_mode) ?
+				MAY_CREATE_DIR : MAY_CREATE_FILE;
+	error = inode_permission(dir, mask | replace_mask);
+	if (error && richacl_may_selfdelete(dir, inode, replace_mask))
+		error = 0;
 	if (error)
 		return error;
 	if (IS_APPEND(dir))
 		return -EPERM;
-	if (check_sticky(dir, victim->d_inode)||IS_APPEND(victim->d_inode)||
-	    IS_IMMUTABLE(victim->d_inode) || IS_SWAPFILE(victim->d_inode))
+	if (check_sticky(dir, inode, replace_mask) || IS_APPEND(inode) ||
+	    IS_IMMUTABLE(inode) || IS_SWAPFILE(inode))
 		return -EPERM;
 	if (isdir) {
-		if (!S_ISDIR(victim->d_inode->i_mode))
+		if (!S_ISDIR(inode->i_mode))
 			return -ENOTDIR;
 		if (IS_ROOT(victim))
 			return -EBUSY;
-	} else if (S_ISDIR(victim->d_inode->i_mode))
+	} else if (S_ISDIR(inode->i_mode))
 		return -EISDIR;
 	if (IS_DEADDIR(dir))
 		return -ENOENT;
@@ -2605,7 +2628,7 @@ void dentry_unhash(struct dentry *dentry)
 
 int vfs_rmdir(struct inode *dir, struct dentry *dentry)
 {
-	int error = may_delete(dir, dentry, 1);
+	int error = may_delete(dir, dentry, 1, 0);
 
 	if (error)
 		return error;
@@ -2700,7 +2723,7 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname)
 
 int vfs_unlink(struct inode *dir, struct dentry *dentry)
 {
-	int error = may_delete(dir, dentry, 0);
+	int error = may_delete(dir, dentry, 0, 0);
 
 	if (error)
 		return error;
@@ -3096,14 +3119,14 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	if (old_dentry->d_inode == new_dentry->d_inode)
  		return 0;
  
-	error = may_delete(old_dir, old_dentry, is_dir);
+	error = may_delete(old_dir, old_dentry, is_dir, 0);
 	if (error)
 		return error;
 
 	if (!new_dentry->d_inode)
 		error = may_create(new_dir, new_dentry, is_dir);
 	else
-		error = may_delete(new_dir, new_dentry, is_dir);
+		error = may_delete(new_dir, new_dentry, is_dir, 1);
 	if (error)
 		return error;
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 60361c6..ccece40 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -69,6 +69,8 @@ struct inodes_stat_t {
 #define MAY_NOT_BLOCK		0x00000080
 #define MAY_CREATE_FILE		0x00000100
 #define MAY_CREATE_DIR		0x00000200
+#define MAY_DELETE_CHILD	0x00000400
+#define MAY_DELETE_SELF		0x00000800
 
 /*
  * flags in file.f_mode.  Note that FMODE_READ and FMODE_WRITE must correspond


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 09/26] vfs: Add delete child and delete self permission flags
  2011-10-20  7:35       ` Aneesh Kumar K.V
  (?)
@ 2011-10-20  8:11       ` J. Bruce Fields
  -1 siblings, 0 replies; 66+ messages in thread
From: J. Bruce Fields @ 2011-10-20  8:11 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: agruen, akpm, viro, dhowells, linux-fsdevel, linux-nfs, linux-kernel

On Thu, Oct 20, 2011 at 01:05:26PM +0530, Aneesh Kumar K.V wrote:
> On Wed, 19 Oct 2011 18:09:15 -0400, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > Maybe I'm dense, but that big if-else-if is still giving me a headache.
> > 
> > The point is just to delay the ns_capable() check to avoid setting
> > PF_SUPERPRIV in cases where we weren't before?
> > 
> > How about putting using a helper function for the richacl check, and
> > calling it from check_sticky instead? That makes the above:
> > 
> > 	error = inode_permission(dir, mask | replace_mask);
> > 	if (error && !richacl_may_delete(dir, inode, replace_mask))
> > 		return error;
> > 	if (check_sticky(dir, inode, replace_mask))
> > 		return -EPERM;
> > 
> > (As in the following--totally untested and possibly wrong.)
> > 
> > Also: the comment before may_delete() needs updating.
> > 
> 
> Thanks for the suggestion. That made the code simpler. Updated patch
> below.

Looks good to me if it passes your tests, thanks!  Feel free to add
Reviewed-by or Acked-by for "J. Bruce Fields" <bfields@redhat.com>.

--b.

> 
> commit 3c92363ce2dee22aa174327c21726f8f02cbcd6e
> Author: Andreas Gruenbacher <agruen@kernel.org>
> Date:   Tue Oct 18 15:17:50 2011 +0530
> 
>     vfs: Add delete child and delete self permission flags
>     
>     Normally, deleting a file requires write access to the parent directory.
>     Some permission models use a different permission on the parent
>     directory to indicate delete access.  In addition, a process can have
>     per-file delete access even without delete access on the parent
>     directory.
>     
>     Introduce two new inode_permission() mask flags and use them in
>     may_delete()
>     
>     Acked-by: David Howells <dhowells@redhat.com>
>     Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
>     Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> 
> diff --git a/fs/namei.c b/fs/namei.c
> index f6184b8..044b6d1 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -337,7 +337,7 @@ static inline int do_inode_permission(struct inode *inode, int mask)
>   * are used for other things.
>   *
>   * When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
> - * MAY_WRITE must also be set in @mask.
> + * MAY_DELETE_CHILD, MAY_DELETE_SELF, MAY_WRITE must also be set in @mask.
>   */
>  int inode_permission(struct inode *inode, int mask)
>  {
> @@ -1835,11 +1835,25 @@ static int user_path_parent(int dfd, const char __user *path,
>  	return error;
>  }
>  
> +
> +/*
> + * We should have exec permission on directory and MAY_DELETE_SELF
> + * on the object being deleted.
> + */
> +static int richacl_may_selfdelete(struct inode *dir,
> +				  struct inode *inode, int replace_mask)
> +{
> +	return (IS_RICHACL(inode) &&
> +		(inode_permission(dir, MAY_EXEC | replace_mask) == 0) &&
> +		(inode_permission(inode, MAY_DELETE_SELF) == 0));
> +}
> +
>  /*
>   * It's inline, so penalty for filesystems that don't use sticky bit is
>   * minimal.
>   */
> -static inline int check_sticky(struct inode *dir, struct inode *inode)
> +static inline int check_sticky(struct inode *dir,
> +			       struct inode *inode, int replace_mask)
>  {
>  	uid_t fsuid = current_fsuid();
>  
> @@ -1851,7 +1865,8 @@ static inline int check_sticky(struct inode *dir, struct inode *inode)
>  		return 0;
>  	if (dir->i_uid == fsuid)
>  		return 0;
> -
> +	if (richacl_may_selfdelete(dir, inode, replace_mask))
> +		return 0;
>  other_userns:
>  	return !ns_capable(inode_userns(inode), CAP_FOWNER);
>  }
> @@ -1875,30 +1890,38 @@ other_userns:
>   * 10. We don't allow removal of NFS sillyrenamed files; it's handled by
>   *     nfs_async_unlink().
>   */
> -static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
> +static int may_delete(struct inode *dir, struct dentry *victim,
> +		      int isdir, int replace)
>  {
> -	int error;
> +	int mask, replace_mask = 0, error;
> +	struct inode *inode = victim->d_inode;
>  
> -	if (!victim->d_inode)
> +	if (!inode)
>  		return -ENOENT;
>  
>  	BUG_ON(victim->d_parent->d_inode != dir);
>  	audit_inode_child(victim, dir);
>  
> -	error = inode_permission(dir, MAY_WRITE | MAY_EXEC);
> +	mask = MAY_WRITE | MAY_EXEC | MAY_DELETE_CHILD;
> +	if (replace)
> +		replace_mask = S_ISDIR(inode->i_mode) ?
> +				MAY_CREATE_DIR : MAY_CREATE_FILE;
> +	error = inode_permission(dir, mask | replace_mask);
> +	if (error && richacl_may_selfdelete(dir, inode, replace_mask))
> +		error = 0;
>  	if (error)
>  		return error;
>  	if (IS_APPEND(dir))
>  		return -EPERM;
> -	if (check_sticky(dir, victim->d_inode)||IS_APPEND(victim->d_inode)||
> -	    IS_IMMUTABLE(victim->d_inode) || IS_SWAPFILE(victim->d_inode))
> +	if (check_sticky(dir, inode, replace_mask) || IS_APPEND(inode) ||
> +	    IS_IMMUTABLE(inode) || IS_SWAPFILE(inode))
>  		return -EPERM;
>  	if (isdir) {
> -		if (!S_ISDIR(victim->d_inode->i_mode))
> +		if (!S_ISDIR(inode->i_mode))
>  			return -ENOTDIR;
>  		if (IS_ROOT(victim))
>  			return -EBUSY;
> -	} else if (S_ISDIR(victim->d_inode->i_mode))
> +	} else if (S_ISDIR(inode->i_mode))
>  		return -EISDIR;
>  	if (IS_DEADDIR(dir))
>  		return -ENOENT;
> @@ -2605,7 +2628,7 @@ void dentry_unhash(struct dentry *dentry)
>  
>  int vfs_rmdir(struct inode *dir, struct dentry *dentry)
>  {
> -	int error = may_delete(dir, dentry, 1);
> +	int error = may_delete(dir, dentry, 1, 0);
>  
>  	if (error)
>  		return error;
> @@ -2700,7 +2723,7 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname)
>  
>  int vfs_unlink(struct inode *dir, struct dentry *dentry)
>  {
> -	int error = may_delete(dir, dentry, 0);
> +	int error = may_delete(dir, dentry, 0, 0);
>  
>  	if (error)
>  		return error;
> @@ -3096,14 +3119,14 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
>  	if (old_dentry->d_inode == new_dentry->d_inode)
>   		return 0;
>   
> -	error = may_delete(old_dir, old_dentry, is_dir);
> +	error = may_delete(old_dir, old_dentry, is_dir, 0);
>  	if (error)
>  		return error;
>  
>  	if (!new_dentry->d_inode)
>  		error = may_create(new_dir, new_dentry, is_dir);
>  	else
> -		error = may_delete(new_dir, new_dentry, is_dir);
> +		error = may_delete(new_dir, new_dentry, is_dir, 1);
>  	if (error)
>  		return error;
>  
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 60361c6..ccece40 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -69,6 +69,8 @@ struct inodes_stat_t {
>  #define MAY_NOT_BLOCK		0x00000080
>  #define MAY_CREATE_FILE		0x00000100
>  #define MAY_CREATE_DIR		0x00000200
> +#define MAY_DELETE_CHILD	0x00000400
> +#define MAY_DELETE_SELF		0x00000800
>  
>  /*
>   * flags in file.f_mode.  Note that FMODE_READ and FMODE_WRITE must correspond
> 
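
As a reading aid (not part of the patch), the mask that the patched
may_delete() hands to inode_permission() for the parent directory can be
worked out in user space. This is only a sketch: MAY_EXEC and MAY_WRITE are
assumed to have their usual values of 0x1 and 0x2, the other flags use the
values from the fs.h hunk above, and delete_dir_mask() and delete_permitted()
are made-up helper names that do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values: MAY_EXEC/MAY_WRITE as in include/linux/fs.h,
 * the rest as shown in the fs.h hunk of this series. */
#define MAY_EXEC         0x00000001
#define MAY_WRITE        0x00000002
#define MAY_CREATE_FILE  0x00000100
#define MAY_CREATE_DIR   0x00000200
#define MAY_DELETE_CHILD 0x00000400
#define MAY_DELETE_SELF  0x00000800

static_assert((MAY_DELETE_CHILD & MAY_DELETE_SELF) == 0,
	      "the two new flags must not share a bit");

/* Hypothetical helper: the mask checked on the parent directory. */
static inline int delete_dir_mask(bool victim_is_dir, bool replace)
{
	int mask = MAY_WRITE | MAY_EXEC | MAY_DELETE_CHILD;
	int replace_mask = 0;

	/* A rename over an existing entry also creates one in its place. */
	if (replace)
		replace_mask = victim_is_dir ? MAY_CREATE_DIR : MAY_CREATE_FILE;
	return mask | replace_mask;
}

/*
 * Hypothetical model of the first check only: the directory check may
 * fail as long as the per-file self-delete check succeeds. The sticky,
 * append-only and immutable checks that follow are not modelled.
 */
static inline bool delete_permitted(bool dir_check_ok, bool selfdelete_ok)
{
	return dir_check_ok || selfdelete_ok;
}
```

With these values a plain unlink checks 0x403 on the directory, while a
rename that replaces an existing directory checks 0x603.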

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
  2011-10-19 22:20   ` J. Bruce Fields
@ 2011-10-20  8:30     ` Aneesh Kumar K.V
  2011-10-20  9:14       ` J. Bruce Fields
  0 siblings, 1 reply; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-20  8:30 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: agruen, akpm, viro, dhowells, linux-fsdevel, linux-nfs, linux-kernel

On Wed, 19 Oct 2011 18:20:21 -0400, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> On Tue, Oct 18, 2011 at 09:02:56PM +0530, Aneesh Kumar K.V wrote:
> > +#define RICHACL_XATTR "system.richacl"
> > +
> > +struct richace_xattr {
> > +	__le16		e_type;
> > +	__le16		e_flags;
> > +	__le32		e_mask;
> > +	__le32		e_id;
> > +	char		e_who[0];
> > +};
> 
> Does it really make sense to use a string here just to pick between the
> three choices OWNER@, GROUP@, and EVERYONE@?  Why not just another small
> integer?  Is the goal to expand this somehow eventually?

I guess Andreas wanted the disk layout to be able to store user@domain
format if needed. That should make the layout flexible enough so that
we won't have to add another xattr later.

-aneesh

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
  2011-10-20  8:30     ` Aneesh Kumar K.V
@ 2011-10-20  9:14       ` J. Bruce Fields
  2011-10-20  9:19         ` Christoph Hellwig
  0 siblings, 1 reply; 66+ messages in thread
From: J. Bruce Fields @ 2011-10-20  9:14 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: agruen, akpm, viro, dhowells, linux-fsdevel, linux-nfs, linux-kernel

On Thu, Oct 20, 2011 at 02:00:02PM +0530, Aneesh Kumar K.V wrote:
> On Wed, 19 Oct 2011 18:20:21 -0400, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > On Tue, Oct 18, 2011 at 09:02:56PM +0530, Aneesh Kumar K.V wrote:
> > > +#define RICHACL_XATTR "system.richacl"
> > > +
> > > +struct richace_xattr {
> > > +	__le16		e_type;
> > > +	__le16		e_flags;
> > > +	__le32		e_mask;
> > > +	__le32		e_id;
> > > +	char		e_who[0];
> > > +};
> > 
> > Does it really make sense to use a string here just to pick between the
> > three choices OWNER@, GROUP@, and EVERYONE@?  Why not just another small
> > integer?  Is the goal to expand this somehow eventually?
> 
> I guess Andreas wanted the disk layout to be able to store user@domain
> format if needed.

Is that likely?  For that to be useful, tasks would need to be able to
run as user@domain strings.  And we'd probably want owners and groups to
also be user@domain strings.

The container people seem to eventually want to add some kind of
namespace identifier everywhere:

	http://marc.info/?l=linux-kernel&m=131836778427871&w=2

in which case I guess we'd likely end up with (uid, user namespace id)
instead of user@domain?

I suppose the variable-length string field could store that too.

I don't hate the idea; it would make life easier for the NFS server.

--b.

> That should make the layout flexible enough so that
> we won't have to add another xattr later.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
  2011-10-20  9:14       ` J. Bruce Fields
@ 2011-10-20  9:19         ` Christoph Hellwig
  2011-10-20 10:25             ` J. Bruce Fields
  2011-10-20 11:02             ` Aneesh Kumar K.V
  0 siblings, 2 replies; 66+ messages in thread
From: Christoph Hellwig @ 2011-10-20  9:19 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Aneesh Kumar K.V, agruen, akpm, viro, dhowells, linux-fsdevel,
	linux-nfs, linux-kernel

On Thu, Oct 20, 2011 at 05:14:34AM -0400, J. Bruce Fields wrote:
> > > Does it really make sense to use a string here just to pick between the
> > > three choices OWNER@, GROUP@, and EVERYONE@?  Why not just another small
> > > integer?  Is the goal to expand this somehow eventually?
> > 

> > I guess Andreas wanted the disk layout to be able to store user@domain
> > format if needed.
> 
> Is that likely?  For that to be useful, tasks would need to be able to
> run as user@domain strings.  And we'd probably want owners and groups to
> also be user@domain strings.
> 
> The container people seem to eventually want to add some kind of
> namespace identifier everywhere:
> 
> 	http://marc.info/?l=linux-kernel&m=131836778427871&w=2
> 
> in which case I guess we'd likely end up with (uid, user namespace id)
> instead of user@domain?


Storing strings is an extremely stupid idea.  The only thing that would
make sense would be storing a windows-style 128-bit GUID.


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
@ 2011-10-20 10:25             ` J. Bruce Fields
  0 siblings, 0 replies; 66+ messages in thread
From: J. Bruce Fields @ 2011-10-20 10:25 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Aneesh Kumar K.V, agruen, akpm, viro, dhowells, linux-fsdevel,
	linux-nfs, linux-kernel

On Thu, Oct 20, 2011 at 05:19:46AM -0400, Christoph Hellwig wrote:
> On Thu, Oct 20, 2011 at 05:14:34AM -0400, J. Bruce Fields wrote:
> > > > Does it really make sense to use a string here just to pick between the
> > > > three choices OWNER@, GROUP@, and EVERYONE@?  Why not just another small
> > > > integer?  Is the goal to expand this somehow eventually?
> > > 
> 
> > > I guess Andreas wanted the disk layout to be able to store user@domain
> > > format if needed.
> > 
> > Is that likely?  For that to be useful, tasks would need to be able to
> > run as user@domain strings.  And we'd probably want owners and groups to
> > also be user@domain strings.
> > 
> > The container people seem to eventually want to add some kind of
> > namespace identifier everywhere:
> > 
> > 	http://marc.info/?l=linux-kernel&m=131836778427871&w=2
> > 
> > in which case I guess we'd likely end up with (uid, user namespace id)
> > instead of user@domain?
> 
> 
> Storing strings is an extremely stupid idea.  The only thing that would
> make sense would be storing a windows-style 128-bit GUID.
> 

So if we want to do this without strings:

> > > +struct richace_xattr {
> > > + __le16          e_type;
> > > + __le16          e_flags;
> > > + __le32          e_mask;
> > > + __le32          e_id;
> > > + char            e_who[0];

We could drop that last field and use some predefined values for e_id to
represent owner/group/everyone in the e_type == ACE4_SPECIAL_WHO case.

Then I'm not sure how you'd extend it if you later decided to add
Windows GUIDs or whatever.

But maybe it's not realistic to expect to be able to do that without a
new interface and on-disk format: how could old software be expected to
deal with acls that didn't use uids?

--b.
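
To make the string-free variant concrete, here is a minimal sketch of an entry
without the trailing e_who field. Everything prefixed with sketch_ or SKETCH_
is invented for illustration, including the numeric values; the sketch assumes
the special-who marker is carried as a bit in e_flags (it could equally be an
e_type value), and it uses host-endian fields where the on-disk format would
use __le16/__le32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values, chosen only for this illustration. */
#define SKETCH_SPECIAL_WHO 0x4000u

enum sketch_special_id {
	SKETCH_OWNER    = 0,
	SKETCH_GROUP    = 1,
	SKETCH_EVERYONE = 2,
};

/* In-memory, host-endian view of a fixed-size entry. */
struct sketch_ace {
	uint16_t e_type;
	uint16_t e_flags;
	uint32_t e_mask;
	uint32_t e_id;   /* uid/gid, or a sketch_special_id value */
};

static_assert(sizeof(struct sketch_ace) == 12,
	      "entry has no variable-length tail");

/*
 * Return the special identifier name, or NULL when the entry names an
 * ordinary uid/gid or carries an unknown special value.
 */
static inline const char *sketch_who_name(const struct sketch_ace *ace)
{
	if (!(ace->e_flags & SKETCH_SPECIAL_WHO))
		return NULL;
	switch (ace->e_id) {
	case SKETCH_OWNER:
		return "OWNER@";
	case SKETCH_GROUP:
		return "GROUP@";
	case SKETCH_EVERYONE:
		return "EVERYONE@";
	default:
		return NULL;
	}
}
```

The point of the sketch is only that every entry then has the same size, so
nothing has to be parsed to find where the next entry starts.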

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
@ 2011-10-20 11:02             ` Aneesh Kumar K.V
  0 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-20 11:02 UTC (permalink / raw)
  To: Christoph Hellwig, J. Bruce Fields
  Cc: agruen, akpm, viro, dhowells, linux-fsdevel, linux-nfs, linux-kernel

On Thu, 20 Oct 2011 05:19:46 -0400, Christoph Hellwig <hch@infradead.org> wrote:
> On Thu, Oct 20, 2011 at 05:14:34AM -0400, J. Bruce Fields wrote:
> > > > Does it really make sense to use a string here just to pick between the
> > > > three choices OWNER@, GROUP@, and EVERYONE@?  Why not just another small
> > > > integer?  Is the goal to expand this somehow eventually?
> > > 
> 
> > > I guess Andreas wanted the disk layout to be able to store user@domain
> > > format if needed.
> > 
> > Is that likely?  For that to be useful, tasks would need to be able to
> > run as user@domain strings.  And we'd probably want owners and groups to
> > also be user@domain strings.
> > 
> > The container people seem to eventually want to add some kind of
> > namespace identifier everywhere:
> > 
> > 	http://marc.info/?l=linux-kernel&m=131836778427871&w=2
> > 
> > in which case I guess we'd likely end up with (uid, user namespace id)
> > instead of user@domain?
> 
> 
> Storing strings is an extremely stupid idea.  The only thing that would
> make sense would be storing a windows-style 128-bit GUID.
> 

How about updating struct richace_xattr as below?

struct richace_xattr {
	__le16		e_type;
	__le16		e_flags;
	__le32		e_mask;
	__le32		e_size;
	u8		e_id[0];
};

Now e_flags can contain ACE4_SPECIAL_WHO to indicate that e_id holds a
special who value (which could be a 1-byte value indicating OWNER@,
GROUP@ or EVERYONE@), ACE4_UNIXID_WHO to indicate that e_id holds the
little-endian value of a Unix id, or ACE_WINSID_WHO to indicate that
e_id is the 128-bit array containing a SID value?

-aneesh
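
For comparison, a rough sketch of what reading such variable-length entries
would involve. It assumes a 12-byte header (e_type, e_flags, e_mask, e_size)
followed directly by e_size bytes of identifier, with no padding and no
ACL-level header in front; the sketch_ names are hypothetical and not taken
from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HDR_SIZE 12u	/* e_type + e_flags + e_mask + e_size */

static_assert(SKETCH_HDR_SIZE == 2 + 2 + 4 + 4, "header field sizes");

/* Read a little-endian 32-bit value from a byte buffer. */
static inline uint32_t sketch_get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Count the entries in buf[0..len), or return -1 if a header is
 * truncated or an e_size would run past the end of the buffer.
 */
static inline int sketch_count_entries(const uint8_t *buf, size_t len)
{
	size_t off = 0;
	int n = 0;

	while (off < len) {
		uint32_t id_size;

		if (len - off < SKETCH_HDR_SIZE)
			return -1;
		id_size = sketch_get_le32(buf + off + 8); /* e_size */
		if (id_size > len - off - SKETCH_HDR_SIZE)
			return -1;
		off += SKETCH_HDR_SIZE + id_size;
		n++;
	}
	return n;
}
```

Every entry has to be bounds-checked before the next one can be located,
which is the usual cost of a length-prefixed layout.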


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
@ 2011-10-20 17:49               ` J. Bruce Fields
  0 siblings, 0 replies; 66+ messages in thread
From: J. Bruce Fields @ 2011-10-20 17:49 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Christoph Hellwig, agruen, akpm, viro, dhowells, linux-fsdevel,
	linux-nfs, linux-kernel, ebiederm

On Thu, Oct 20, 2011 at 04:32:04PM +0530, Aneesh Kumar K.V wrote:
> On Thu, 20 Oct 2011 05:19:46 -0400, Christoph Hellwig <hch@infradead.org> wrote:
> > On Thu, Oct 20, 2011 at 05:14:34AM -0400, J. Bruce Fields wrote:
> > > > > Does it really make sense to use a string here just to pick between the
> > > > > three choices OWNER@, GROUP@, and EVERYONE@?  Why not just another small
> > > > > integer?  Is the goal to expand this somehow eventually?
> > > > 
> > 
> > > > I guess Andreas wanted the disk layout to be able to store user@domain
> > > > format if needed.
> > > 
> > > Is that likely?  For that to be useful, tasks would need to be able to
> > > run as user@domain strings.  And we'd probably want owners and groups to
> > > also be user@domain strings.
> > > 
> > > The container people seem to eventually want to add some kind of
> > > namespace identifier everywhere:
> > > 
> > > 	http://marc.info/?l=linux-kernel&m=131836778427871&w=2
> > > 
> > > in which case I guess we'd likely end up with (uid, user namespace id)
> > > instead of user@domain?
> > 
> > 
> > Storing strings is an extremely stupid idea.  The only thing that would
> > make sense would be storing a windows-style 128-bit GUID.
> > 
> 
> How about updating the richacl_xattr as below 
> 
> struct richace_xattr {
> 	__le16		e_type;
> 	__le16		e_flags;
> 	__le32		e_mask;
> 	__le32		e_size;
> 	u8		e_id[0];
> };
> 
> now e_flags can contain ACE4_SPECIAL_WHO to indicate value in e_id
> indicate special who values (which could be 1 byte value indicating
> OWNER@, GROUP@ or EVERYONE@), ACE4_UNIXID_WHO, to indicate value
> in e_id is the little endian value of unix id. ACE_WINSID_WHO to
> indicate e_id is the 128 bit array containing SID value. ?

That's effectively still a string.

Would it be so bad to have to introduce another xattr type if we needed
a new id type?  You'll have to modify the filesystem and the userspace
tools and everything anyway, won't you?

But if we decide we don't need strings, then at a minimum let's make
these some fixed small size.

You could do something like:

	struct richace_xattr {
		__le16		e_type;
		__le16		e_flags;
		__le32		e_mask;
		__le32		e_id[4];
	};

and just use e_id[0] for now.  That would still leave room for a 128-bit
id, or for a 32-bit uid + some-size namespace-id.

Cc'ing Eric Biederman in hopes of finding out whether that would satisfy
whatever wacky future ideas might be expected for user namespaces.

--b.
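
A small sketch of how the fixed 128-bit layout might be used, with only
e_id[0] populated for now. The sketch_ names are invented for illustration
and the fields are host-endian here, whereas the on-disk struct would use
__le16/__le32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sketch_ace_fixed {
	uint16_t e_type;
	uint16_t e_flags;
	uint32_t e_mask;
	uint32_t e_id[4];	/* 128 bits; only e_id[0] is used for now */
};

static_assert(sizeof(struct sketch_ace_fixed) == 24,
	      "8 bytes of header plus a 16-byte id");

/* Store a plain 32-bit uid and clear the part reserved for later use. */
static inline void sketch_set_uid(struct sketch_ace_fixed *ace, uint32_t uid)
{
	memset(ace->e_id, 0, sizeof(ace->e_id));
	ace->e_id[0] = uid;
}

/* Entry i sits at a fixed offset, so no walk over earlier entries is needed. */
static inline size_t sketch_entry_offset(size_t i)
{
	return i * sizeof(struct sketch_ace_fixed);
}
```

The trade-off is 12 extra bytes per entry today in exchange for room to
grow without changing the entry size.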

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
  2011-10-20 17:49               ` J. Bruce Fields
  (?)
@ 2011-10-20 19:49               ` Andreas Dilger
  2011-11-19  9:35                 ` Eric W. Biederman
  -1 siblings, 1 reply; 66+ messages in thread
From: Andreas Dilger @ 2011-10-20 19:49 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Aneesh Kumar K.V, Christoph Hellwig, agruen, akpm, viro,
	dhowells, linux-fsdevel, linux-nfs, linux-kernel, ebiederm

On 2011-10-20, at 11:49 AM, J. Bruce Fields wrote:
> On Thu, Oct 20, 2011 at 04:32:04PM +0530, Aneesh Kumar K.V wrote:
>> On Thu, 20 Oct 2011 05:19:46 -0400, Christoph Hellwig <hch@infradead.org> wrote:
>>> Storing strings is an extremely stupid idea.  The only thing that would
>>> make sense would be storing a windows-style 128-bit GUID.
>> 
>> How about updating the richacl_xattr as below 
>> 
>> struct richace_xattr {
>> 	__le16		e_type;
>> 	__le16		e_flags;
>> 	__le32		e_mask;
>> 	__le32		e_size;
>> 	u8		e_id[0];
>> };
>> 
>> now e_flags can contain ACE4_SPECIAL_WHO to indicate value in e_id
>> indicate special who values (which could be 1 byte value indicating
>> OWNER@, GROUP@ or EVERYONE@), ACE4_UNIXID_WHO, to indicate value
>> in e_id is the little endian value of unix id. ACE_WINSID_WHO to
>> indicate e_id is the 128 bit array containing SID value. ?
> 
> That's effectively still a string.
> 
> Would it be so bad to have to introduce another xattr type if we needed
> a new id type?  You'll have to modify the filesystem and the userspace
> tools and everything anyway, won't you?
> 
> But if we decide we don't need strings, then at a minimum let's make
> these some fixed small size.
> 
> You could do something like:
> 
> 	struct richace_xattr {
> 		__le16		e_type;
> 		__le16		e_flags;
> 		__le32		e_mask;
> 		__le32		e_id[4];
> 	};
> 
> and just use e_id[0] for now.  That would still leave room for a 128-bit
> id, or for a 32-bit uid + some-size namespace-id.

Just as an FYI, from back when we were trying to port Lustre to Solaris,
Solaris itself uses a 64-bit "FUID" (32-bit UID + 32-bit namespace) to
handle this.

It has a table for arbitrary mapping of 128-bit Windows domains to a
32-bit FUID namespace (don't know much detail here, sorry), and it is
(reasonably) expected that a single system will not be in more than
2^32 namespaces at once.  This keeps the datatypes sane (u64 or 2x u32)
and doesn't put much complexity into the filesystem/kernel.  For most
uses, the high 32-bit value is 0 (local Unix domain).

Cheers, Andreas
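
The 64-bit scheme described above can be sketched as a pair of 32-bit halves,
with the namespace index in the high half so that a local Unix id keeps its
plain numeric value. This is a guess at the shape based only on the
description in this message, not Solaris code; all sketch_ names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static_assert(sizeof(uint64_t) == 2 * sizeof(uint32_t),
	      "the packed id is exactly two 32-bit halves");

/* Pack a namespace index (high half) and a uid (low half). */
static inline uint64_t sketch_fuid_pack(uint32_t ns, uint32_t uid)
{
	return ((uint64_t)ns << 32) | uid;
}

static inline uint32_t sketch_fuid_ns(uint64_t fuid)
{
	return (uint32_t)(fuid >> 32);
}

static inline uint32_t sketch_fuid_uid(uint64_t fuid)
{
	return (uint32_t)fuid;
}

/* Namespace 0 is taken to mean the local Unix domain. */
static inline bool sketch_fuid_is_local(uint64_t fuid)
{
	return sketch_fuid_ns(fuid) == 0;
}
```

Mapping a 128-bit Windows domain to the 32-bit namespace index would be done
through a separate table, which this sketch leaves out.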






^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
@ 2011-10-20 23:46               ` Andreas Gruenbacher
  0 siblings, 0 replies; 66+ messages in thread
From: Andreas Gruenbacher @ 2011-10-20 23:46 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Christoph Hellwig, Aneesh Kumar K.V, akpm, viro, dhowells,
	linux-fsdevel, linux-nfs, linux-kernel

On Thu, 2011-10-20 at 06:25 -0400, J. Bruce Fields wrote:
> On Thu, Oct 20, 2011 at 05:19:46AM -0400, Christoph Hellwig wrote:
> > On Thu, Oct 20, 2011 at 05:14:34AM -0400, J. Bruce Fields wrote:
> > > > > Does it really make sense to use a string here just to pick between the
> > > > > three choices OWNER@, GROUP@, and EVERYONE@?  Why not just another small
> > > > > integer?  Is the goal to expand this somehow eventually?
> > > > 
> > 
> > > > I guess Andreas wanted the disk layout to be able to store user@domain
> > > > format if needed.

Yep.  On the other hand, none of the code actually allows user@domain
identifiers to be used, it won't help with other identifier types like
Windows SIDs, and it doesn't make the code any prettier, so this should
probably go away.

> > > Is that likely?  For that to be useful, tasks would need to be able to
> > > run as user@domain strings.  And we'd probably want owners and groups to
> > > also be user@domain strings.

I really don't see this happening anytime soon, and likely not at all.

> > > The container people seem to eventually want to add some kind of
> > > namespace identifier everywhere:
> > > 
> > > 	http://marc.info/?l=linux-kernel&m=131836778427871&w=2
> > > 
> > > in which case I guess we'd likely end up with (uid, user namespace id)
> > > instead of user@domain?

The filesystem still wouldn't have namespace ids for the owner and
owning group, which is a much bigger issue.  I think we're safe not to
worry about namespace ids at this point; they also might never happen.

> > Storing strings is an extremely stupid idea.  The only thing that would
> > make sense would be storing a windows-style 128-bit GUID.
> > 
> 
> So if we want to do this without strings:
> 
> > > > +struct richace_xattr {
> > > > + __le16          e_type;
> > > > + __le16          e_flags;
> > > > + __le32          e_mask;
> > > > + __le32          e_id;
> > > > + char            e_who[0];
> 
> We could drop that last field and use some predefined values for e_id to
> represent owner/group/everyone in the e_type == ACE4_SPECIAL_WHO case.

That makes sense to me.

There seems to be a WELL_KNOWN_SID_TYPE enumeration which maps those
kinds of special identifiers to small integers in Windows; maybe it
makes sense to use the same numbers for OWNER@, GROUP@, and EVERYONE@.
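
A minimal sketch of that idea, assuming a flag bit marks an entry as special and e_id then carries a small integer instead of a uid or gid. The flag value, the enum values, and the names (SPECIAL_WHO_FLAG, WHO_OWNER, who_name) are placeholders for illustration; they are neither the Windows WELL_KNOWN_SID_TYPE numbers nor values from this patch set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPECIAL_WHO_FLAG 0x4000u	/* placeholder flag bit */

/* Placeholder numbering for the three special identifiers. */
enum special_who { WHO_OWNER = 1, WHO_GROUP = 2, WHO_EVERYONE = 3 };

struct ace {
	uint16_t e_type;
	uint16_t e_flags;
	uint32_t e_mask;
	uint32_t e_id;		/* uid/gid, or an enum special_who value */
};

static_assert(sizeof(struct ace) == 12,
	      "fixed-size entry with no trailing string");

/*
 * Return the NFSv4 name of a special entry, or NULL if the entry is a
 * plain uid/gid entry or carries an unknown special value.
 */
static const char *who_name(const struct ace *ace)
{
	if (!(ace->e_flags & SPECIAL_WHO_FLAG))
		return NULL;
	switch (ace->e_id) {
	case WHO_OWNER:
		return "OWNER@";
	case WHO_GROUP:
		return "GROUP@";
	case WHO_EVERYONE:
		return "EVERYONE@";
	default:
		return NULL;
	}
}
```

Comparing two entries then reduces to comparing the flag bits and e_id, with no string handling at all.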

> Then I'm not sure how you'd extend it if you later decided to add
> Windows GUID's or whatever.
> 
> But maybe it's not realistic to expect to be able to do that without a
> new interface and on-disk format: how could old software be expected to
> deal with acls that didn't use uid's?

The acl itself has a version field, so new formats could be introduced
in the future with a new version.
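
As a rough illustration of how a version field keeps that door open, the sketch below rejects any xattr whose version it does not understand. It assumes the version is stored in the first byte of the xattr value; the constant ACL_XATTR_VERSION_1 and the function acl_xattr_check_version are hypothetical names, not taken from the patches.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ACL_XATTR_VERSION_1 1	/* placeholder value */

static_assert(ACL_XATTR_VERSION_1 <= UINT8_MAX,
	      "the version must fit in one byte");

/*
 * Return 0 if the xattr value starts with a version this code can parse,
 * -EINVAL otherwise.  Assumes the version is the first byte of the value.
 */
static int acl_xattr_check_version(const void *value, size_t size)
{
	const uint8_t *p = value;

	if (size < 1)
		return -EINVAL;
	switch (p[0]) {
	case ACL_XATTR_VERSION_1:
		return 0;	/* fixed-size, uid/gid based entries */
	default:
		return -EINVAL;	/* unknown (possibly newer) format */
	}
}
```

Old software that sees a newer version fails cleanly instead of misreading entries that no longer hold uids.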

Thanks,
Andreas


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
  2011-10-20 23:46               ` Andreas Gruenbacher
  (?)
@ 2011-10-21  0:45               ` J. Bruce Fields
  -1 siblings, 0 replies; 66+ messages in thread
From: J. Bruce Fields @ 2011-10-21  0:45 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Christoph Hellwig, Aneesh Kumar K.V, akpm, viro, dhowells,
	linux-fsdevel, linux-nfs, linux-kernel

On Fri, Oct 21, 2011 at 01:46:29AM +0200, Andreas Gruenbacher wrote:
> On Thu, 2011-10-20 at 06:25 -0400, J. Bruce Fields wrote:
> > So if we want to do this without strings:
> > 
> > > > > +struct richace_xattr {
> > > > > + __le16          e_type;
> > > > > + __le16          e_flags;
> > > > > + __le32          e_mask;
> > > > > + __le32          e_id;
> > > > > + char            e_who[0];
> > 
> > We could drop that last field and use some predefined values for e_id to
> > represent owner/group/everyone in the e_type == ACE4_SPECIAL_WHO case.
> 
> That makes sense to me.
> 
> There seems to be a WELL_KNOWN_SID_TYPE enumeration which maps those
> kinds of special identifiers to small integers in Windows; maybe it
> makes sense to use the same numbers for OWNER@, GROUP@, and EVERYONE@.
> 
> > Then I'm not sure how you'd extend it if you later decided to add
> > Windows GUID's or whatever.
> > 
> > But maybe it's not realistic to expect to be able to do that without a
> > new interface and on-disk format: how could old software be expected to
> > deal with acls that didn't use uid's?
> 
> The acl itself has a version field, so new formats could be introduced
> in the future with a new version.

OK, sounds good.  So let's just assume uid's, and wait to deal with
anything more complicated until we know what's going to happen.  Aneesh,
does that sound good?

And then I think the patches are ready....

--b.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
@ 2011-10-21  9:40                 ` Aneesh Kumar K.V
  0 siblings, 0 replies; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-21  9:40 UTC (permalink / raw)
  To: Andreas Gruenbacher, J. Bruce Fields
  Cc: Christoph Hellwig, akpm, viro, dhowells, linux-fsdevel,
	linux-nfs, linux-kernel

On Fri, 21 Oct 2011 01:46:29 +0200, Andreas Gruenbacher <agruen@kernel.org> wrote:
> On Thu, 2011-10-20 at 06:25 -0400, J. Bruce Fields wrote:
> > On Thu, Oct 20, 2011 at 05:19:46AM -0400, Christoph Hellwig wrote:
> > > On Thu, Oct 20, 2011 at 05:14:34AM -0400, J. Bruce Fields wrote:
> > > > > > Does it really make sense to use a string here just to pick between the
> > > > > > three choices OWNER@, GROUP@, and EVERYONE@?  Why not just another small
> > > > > > integer?  Is the goal to expand this somehow eventually?
> > > > > 
> > > 
> > > > > I guess Andreas wanted the disk layout to be able to store user@domain
> > > > > format if needed.
> 
> Yep.  On the other hand, none of the code actually allows user@domain
> identifiers to be used, it won't help with other identifier types like
> Windows SIDs, and it doesn't make the code any prettier, so this should
> probably go away.
> 
> > > > Is that likely?  For that to be useful, tasks would need to be able to
> > > > run as user@domain strings.  And we'd probably want owners and groups to
> > > > also be user@domain strings.
> 
> I really don't see this happening anytime soon, and likely not at all.
> 
> > > > The container people seem to eventually want to add some kind of
> > > > namespace identifier everywhere:
> > > > 
> > > > 	http://marc.info/?l=linux-kernel&m=131836778427871&w=2
> > > > 
> > > > in which case I guess we'd likely end up with (uid, user namespace id)
> > > > instead of user@domain?
> 
> The filesystem still wouldn't have namespace ids for the owner and
> owning group, which is a much bigger issue.  I think we're safe not to
> worry about namespace ids at this point; they also might never happen.
> 
> > > Storing strings is an extremely stupid idea.  The only thing that would
> > > make sense would be storing a windows-style 128-bit GUID.
> > > 
> > 
> > So if we want to do this without strings:
> > 
> > > > > +struct richace_xattr {
> > > > > + __le16          e_type;
> > > > > + __le16          e_flags;
> > > > > + __le32          e_mask;
> > > > > + __le32          e_id;
> > > > > + char            e_who[0];
> > 
> > We could drop that last field and use some predefined values for e_id to
> > represent owner/group/everyone in the e_type == ACE4_SPECIAL_WHO case.
> 
> That makes sense to me.
> 
> There seems to be a WELL_KNOWN_SID_TYPE enumeration which maps those
> kinds of special identifiers to small integers in Windows; maybe it
> makes sense to use the same numbers for OWNER@, GROUP@, and EVERYONE@.
> 
> > Then I'm not sure how you'd extend it if you later decided to add
> > Windows GUID's or whatever.
> > 
> > But maybe it's not realistic to expect to be able to do that without a
> > new interface and on-disk format: how could old software be expected to
> > deal with acls that didn't use uid's?
> 
> The acl itself has a version field, so new formats could be introduced
> in the future with a new version.

How about the change below?  This will also require a change to the
richacl tools.  I made e_flags 32 bits wide to make sure we don't take
the space needed for NFSv4 ACL related flags.

diff --git a/fs/richacl_base.c b/fs/richacl_base.c
index 9a57039..9179fcd 100644
--- a/fs/richacl_base.c
+++ b/fs/richacl_base.c
@@ -20,19 +20,6 @@
 
 MODULE_LICENSE("GPL");
 
-/*
- * Special e_who identifiers:  ACEs which have ACE4_SPECIAL_WHO set in
- * ace->e_flags use these constants in ace->u.e_who.
- *
- * For efficiency, we compare pointers instead of comparing strings.
- */
-const char richace_owner_who[]	  = "OWNER@";
-EXPORT_SYMBOL_GPL(richace_owner_who);
-const char richace_group_who[]	  = "GROUP@";
-EXPORT_SYMBOL_GPL(richace_group_who);
-const char richace_everyone_who[] = "EVERYONE@";
-EXPORT_SYMBOL_GPL(richace_everyone_who);
-
 /**
  * richacl_alloc  -  allocate a richacl
  * @count:	number of entries
@@ -191,41 +178,14 @@ EXPORT_SYMBOL_GPL(richacl_want_to_mask);
 int
 richace_is_same_identifier(const struct richace *a, const struct richace *b)
 {
-#define WHO_FLAGS (ACE4_SPECIAL_WHO | ACE4_IDENTIFIER_GROUP)
+#define WHO_FLAGS (ACE4_SPECIAL_WHO | ACE4_UNIXID_WHO | ACE4_IDENTIFIER_GROUP)
 	if ((a->e_flags & WHO_FLAGS) != (b->e_flags & WHO_FLAGS))
 		return 0;
-	if (a->e_flags & ACE4_SPECIAL_WHO)
-		return a->u.e_who == b->u.e_who;
-	else
-		return a->u.e_id == b->u.e_id;
+	return a->e_id == b->e_id;
 #undef WHO_FLAGS
 }
 
 /**
- * richacl_set_who  -  set a special who value
- * @ace:	acl entry
- * @who:	who value to use
- */
-int
-richace_set_who(struct richace *ace, const char *who)
-{
-	if (!strcmp(who, richace_owner_who))
-		who = richace_owner_who;
-	else if (!strcmp(who, richace_group_who))
-		who = richace_group_who;
-	else if (!strcmp(who, richace_everyone_who))
-		who = richace_everyone_who;
-	else
-		return -EINVAL;
-
-	ace->u.e_who = who;
-	ace->e_flags |= ACE4_SPECIAL_WHO;
-	ace->e_flags &= ~ACE4_IDENTIFIER_GROUP;
-	return 0;
-}
-EXPORT_SYMBOL_GPL(richace_set_who);
-
-/**
  * richacl_allowed_to_who  -  mask flags allowed to a specific who value
  *
  * Computes the mask values allowed to a specific who value, taking
@@ -446,10 +406,10 @@ richacl_permission(struct inode *inode, const struct richacl *acl,
 				continue;
 		} else if (richace_is_unix_id(ace)) {
 			if (ace->e_flags & ACE4_IDENTIFIER_GROUP) {
-				if (!in_group_p(ace->u.e_id))
+				if (!in_group_p(ace->e_id))
 					continue;
 			} else {
-				if (current_fsuid() != ace->u.e_id)
+				if (current_fsuid() != ace->e_id)
 					continue;
 			}
 		} else
diff --git a/fs/richacl_xattr.c b/fs/richacl_xattr.c
index 02a7986..3f1f557 100644
--- a/fs/richacl_xattr.c
+++ b/fs/richacl_xattr.c
@@ -58,19 +58,14 @@ richacl_from_xattr(const void *value, size_t size)
 		goto fail_einval;
 
 	richacl_for_each_entry(ace, acl) {
-		const char *who = (void *)(xattr_ace + 1), *end;
-		ssize_t used = (void *)who - value;
 
-		if (used > size)
-			goto fail_einval;
-		end = memchr(who, 0, size - used);
-		if (!end)
+		if (((void *)xattr_ace + sizeof(*xattr_ace)) > value + size)
 			goto fail_einval;
 
-		ace->e_type = le16_to_cpu(xattr_ace->e_type);
-		ace->e_flags = le16_to_cpu(xattr_ace->e_flags);
-		ace->e_mask = le32_to_cpu(xattr_ace->e_mask);
-		ace->u.e_id = le32_to_cpu(xattr_ace->e_id);
+		ace->e_type  = le16_to_cpu(xattr_ace->e_type);
+		ace->e_flags = le32_to_cpu(xattr_ace->e_flags);
+		ace->e_mask  = le32_to_cpu(xattr_ace->e_mask);
+		ace->e_id    = le32_to_cpu(xattr_ace->e_id);
 
 		if (ace->e_flags & ~ACE4_VALID_FLAGS)
 			goto fail_einval;
@@ -78,13 +73,7 @@ richacl_from_xattr(const void *value, size_t size)
 		    (ace->e_mask & ~ACE4_VALID_MASK))
 			goto fail_einval;
 
-		if (who == end) {
-			if (ace->u.e_id == -1)
-				goto fail_einval;  /* uid/gid needed */
-		} else if (richace_set_who(ace, who))
-			goto fail_einval;
-
-		xattr_ace = (void *)who + ALIGN(end - who + 1, 4);
+		xattr_ace = xattr_ace + 1;
 	}
 
 	return acl;
@@ -102,13 +91,8 @@ size_t
 richacl_xattr_size(const struct richacl *acl)
 {
 	size_t size = sizeof(struct richacl_xattr);
-	const struct richace *ace;
 
-	richacl_for_each_entry(ace, acl) {
-		size += sizeof(struct richace_xattr) +
-			(richace_is_unix_id(ace) ? 4 :
-			 ALIGN(strlen(ace->u.e_who) + 1, 4));
-	}
+	size += sizeof(struct richace_xattr) * acl->a_count;
 	return size;
 }
 EXPORT_SYMBOL_GPL(richacl_xattr_size);
@@ -136,21 +120,11 @@ richacl_to_xattr(const struct richacl *acl, void *buffer)
 	xattr_ace = (void *)(xattr_acl + 1);
 	richacl_for_each_entry(ace, acl) {
 		xattr_ace->e_type = cpu_to_le16(ace->e_type);
-		xattr_ace->e_flags = cpu_to_le16(ace->e_flags &
+		xattr_ace->e_flags = cpu_to_le32(ace->e_flags &
 						 ACE4_VALID_FLAGS);
 		xattr_ace->e_mask = cpu_to_le32(ace->e_mask);
-		if (richace_is_unix_id(ace)) {
-			xattr_ace->e_id = cpu_to_le32(ace->u.e_id);
-			memset(xattr_ace->e_who, 0, 4);
-			xattr_ace = (void *)xattr_ace->e_who + 4;
-		} else {
-			int sz = ALIGN(strlen(ace->u.e_who) + 1, 4);
-
-			xattr_ace->e_id = cpu_to_le32(-1);
-			memset(xattr_ace->e_who + sz - 4, 0, 4);
-			strcpy(xattr_ace->e_who, ace->u.e_who);
-			xattr_ace = (void *)xattr_ace->e_who + sz;
-		}
+		xattr_ace->e_id = cpu_to_le32(ace->e_id);
+		xattr_ace = xattr_ace + 1;
 	}
 }
 EXPORT_SYMBOL_GPL(richacl_to_xattr);
diff --git a/include/linux/richacl.h b/include/linux/richacl.h
index 4af6d22..e4c5156 100644
--- a/include/linux/richacl.h
+++ b/include/linux/richacl.h
@@ -17,14 +17,15 @@
 #define __RICHACL_H
 #include <linux/slab.h>
 
+#define ACE_OWNER_ID		130
+#define ACE_GROUP_ID		131
+#define ACE_EVERYONE_ID		110
+
 struct richace {
 	unsigned short	e_type;
-	unsigned short	e_flags;
+	unsigned int	e_flags;
 	unsigned int	e_mask;
-	union {
-		unsigned int	e_id;
-		const char	*e_who;
-	} u;
+	unsigned int	e_id;
 };
 
 struct richacl {
@@ -74,8 +75,10 @@ struct richacl {
 /*#define ACE4_FAILED_ACCESS_ACE_FLAG	0x0020*/
 #define ACE4_IDENTIFIER_GROUP		0x0040
 #define ACE4_INHERITED_ACE		0x0080
-/* in-memory representation only */
-#define ACE4_SPECIAL_WHO		0x4000
+/* richacl specific flag values */
+#define ACE4_SPECIAL_WHO		0x80000000
+#define ACE4_UNIXID_WHO			0x40000000
+
 
 #define ACE4_VALID_FLAGS (			\
 	ACE4_FILE_INHERIT_ACE |			\
@@ -83,7 +86,9 @@ struct richacl {
 	ACE4_NO_PROPAGATE_INHERIT_ACE |		\
 	ACE4_INHERIT_ONLY_ACE |			\
 	ACE4_IDENTIFIER_GROUP |			\
-	ACE4_INHERITED_ACE)
+	ACE4_INHERITED_ACE |			\
+	ACE4_SPECIAL_WHO |			\
+	ACE4_UNIXID_WHO)
 
 /* e_mask bitflags */
 #define ACE4_READ_DATA			0x00000001
@@ -254,14 +259,6 @@ richacl_is_protected(const struct richacl *acl)
 	return acl->a_flags & ACL4_PROTECTED;
 }
 
-/*
- * Special e_who identifiers: we use these pointer values in comparisons
- * instead of doing a strcmp.
- */
-extern const char richace_owner_who[];
-extern const char richace_group_who[];
-extern const char richace_everyone_who[];
-
 /**
  * richace_is_owner  -  check if @ace is an OWNER@ entry
  */
@@ -269,7 +266,7 @@ static inline int
 richace_is_owner(const struct richace *ace)
 {
 	return (ace->e_flags & ACE4_SPECIAL_WHO) &&
-	       ace->u.e_who == richace_owner_who;
+	       ace->e_id == ACE_OWNER_ID;
 }
 
 /**
@@ -279,7 +276,7 @@ static inline int
 richace_is_group(const struct richace *ace)
 {
 	return (ace->e_flags & ACE4_SPECIAL_WHO) &&
-	       ace->u.e_who == richace_group_who;
+	       ace->e_id == ACE_GROUP_ID;
 }
 
 /**
@@ -289,7 +286,7 @@ static inline int
 richace_is_everyone(const struct richace *ace)
 {
 	return (ace->e_flags & ACE4_SPECIAL_WHO) &&
-	       ace->u.e_who == richace_everyone_who;
+	       ace->e_id == ACE_EVERYONE_ID;
 }
 
 /**
@@ -298,7 +295,7 @@ richace_is_everyone(const struct richace *ace)
 static inline int
 richace_is_unix_id(const struct richace *ace)
 {
-	return !(ace->e_flags & ACE4_SPECIAL_WHO);
+	return (ace->e_flags & ACE4_UNIXID_WHO);
 }
 
 /**
@@ -357,7 +354,7 @@ richace_is_deny(const struct richace *ace)
 extern struct richacl *richacl_alloc(int);
 extern int richace_is_same_identifier(const struct richace *,
 				      const struct richace *);
-extern int richace_set_who(struct richace *, const char *);
+extern int richace_set_who(struct richace *, const u8*, u_int32_t);
 extern int richacl_masks_to_mode(const struct richacl *);
 extern unsigned int richacl_mode_to_mask(mode_t);
 extern unsigned int richacl_want_to_mask(unsigned int);
diff --git a/include/linux/richacl_xattr.h b/include/linux/richacl_xattr.h
index f79ec12..19cb61e 100644
--- a/include/linux/richacl_xattr.h
+++ b/include/linux/richacl_xattr.h
@@ -22,10 +22,9 @@
 
 struct richace_xattr {
 	__le16		e_type;
-	__le16		e_flags;
+	__le32		e_flags;
 	__le32		e_mask;
 	__le32		e_id;
-	char		e_who[0];
 };
 
 struct richacl_xattr {
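
With fixed-size entries, the bounds check can be done once up front instead of scanning for a string terminator per entry. The user-space sketch below shows that technique under stated assumptions: it uses a packed 12-byte entry with a 16-bit e_flags (not the widened field in the diff above), and the names ACE_XATTR_SIZE, get_le16, get_le32, and parse_aces are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ACE_XATTR_SIZE 12u	/* 2 + 2 + 4 + 4 bytes, assumed layout */

static_assert(ACE_XATTR_SIZE == 2 * sizeof(uint16_t) + 2 * sizeof(uint32_t),
	      "entry size must match the field widths");

struct ace {
	uint16_t e_type;
	uint16_t e_flags;
	uint32_t e_mask;
	uint32_t e_id;
};

/* Read little-endian values byte by byte, independent of host endianness. */
static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode count entries from buf; return 0, or -1 if buf is too short. */
static int parse_aces(const uint8_t *buf, size_t size,
		      struct ace *out, size_t count)
{
	size_t i;

	if (count > size / ACE_XATTR_SIZE)
		return -1;	/* single check covers every entry */
	for (i = 0; i < count; i++, buf += ACE_XATTR_SIZE) {
		out[i].e_type  = get_le16(buf);
		out[i].e_flags = get_le16(buf + 2);
		out[i].e_mask  = get_le32(buf + 4);
		out[i].e_id    = get_le32(buf + 8);
	}
	return 0;
}
```

Decoding from byte offsets rather than through a struct pointer also sidesteps any question of compiler-inserted padding in the on-disk layout.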


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
  2011-10-21  9:40                 ` Aneesh Kumar K.V
  (?)
@ 2011-10-21 10:52                 ` Andreas Gruenbacher
  2011-10-21 13:12                   ` Aneesh Kumar K.V
  -1 siblings, 1 reply; 66+ messages in thread
From: Andreas Gruenbacher @ 2011-10-21 10:52 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: J. Bruce Fields, Christoph Hellwig, akpm, viro, dhowells,
	linux-fsdevel, linux-nfs, linux-kernel

On Fri, 2011-10-21 at 15:10 +0530, Aneesh Kumar K.V wrote:
> How about the below change. This will require richacl tools change
> also.

>  I made e_flags 32 bits wide so that we don't take up the space
>  needed for NFSv4 ACL-related flags.

But struct richace_xattr has a hole now.  

There's ample space left in the 16-bit field; I don't think there is
a need to extend it.  If the need should ever arise, we can still define
a new version of the xattr format.  Also, this change creates a hole in
struct richace_xattr; we can't do that.

> +#define ACE4_SPECIAL_WHO		0x80000000
> +#define ACE4_UNIXID_WHO			0x40000000

Can the ACE4_UNIXID_WHO flag please be removed again?  It isn't needed,
it just creates a mess.

Thanks,
Andreas
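
The "hole" objection above can be checked in userspace. The following is a minimal sketch, assuming that uint16_t/uint32_t have the same size and alignment as the kernel's __le16/__le32 on the target ABI; the struct and function names are hypothetical and only mirror the two layouts under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout with e_flags kept at 16 bits (stand-in for struct richace_xattr). */
struct ace_xattr_flags16 {
	uint16_t e_type;
	uint16_t e_flags;
	uint32_t e_mask;
	uint32_t e_id;
};

/* Layout with e_flags widened to 32 bits, as in the earlier proposal. */
struct ace_xattr_flags32 {
	uint16_t e_type;
	uint32_t e_flags;
	uint32_t e_mask;
	uint32_t e_id;
};

/* Padding bytes the compiler inserts between e_type and e_flags. */
static size_t hole_flags16(void)
{
	return offsetof(struct ace_xattr_flags16, e_flags) -
	       (offsetof(struct ace_xattr_flags16, e_type) + sizeof(uint16_t));
}

static size_t hole_flags32(void)
{
	return offsetof(struct ace_xattr_flags32, e_flags) -
	       (offsetof(struct ace_xattr_flags32, e_type) + sizeof(uint16_t));
}
```

On ABIs where a 32-bit integer is 4-byte aligned, the widened layout leaves two unnamed padding bytes after e_type, so the on-disk format would contain bytes that no field describes.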

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
  2011-10-21 10:52                 ` Andreas Gruenbacher
@ 2011-10-21 13:12                   ` Aneesh Kumar K.V
  2011-10-21 23:58                     ` Andreas Gruenbacher
  0 siblings, 1 reply; 66+ messages in thread
From: Aneesh Kumar K.V @ 2011-10-21 13:12 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: J. Bruce Fields, Christoph Hellwig, akpm, viro, dhowells,
	linux-fsdevel, linux-nfs, linux-kernel

On Fri, 21 Oct 2011 12:52:10 +0200, Andreas Gruenbacher <agruen@kernel.org> wrote:
> On Fri, 2011-10-21 at 15:10 +0530, Aneesh Kumar K.V wrote:
> > How about the below change. This will require richacl tools change
> > also.
> 
> > >  I made e_flags 32 bits wide so that we don't take up the space
> > >  needed for NFSv4 ACL-related flags.
> 
> But struct richace_xattr has a hole now.  
> 
> There's ample space left in the 16-bit field; I don't think there is
> a need to extend it.  If the need should ever arise, we can still define
> a new version of the xattr format.  Also, this change creates a hole in
> struct richace_xattr; we can't do that.
> 
> > +#define ACE4_SPECIAL_WHO		0x80000000
> > +#define ACE4_UNIXID_WHO			0x40000000
> 
> Can the ACE4_UNIXID_WHO flag please be removed again?  It isn't needed,
> it just creates a mess.
> 

Updated patch below.

diff --git a/fs/richacl_base.c b/fs/richacl_base.c
index 9a57039..fcc37d6 100644
--- a/fs/richacl_base.c
+++ b/fs/richacl_base.c
@@ -20,19 +20,6 @@
 
 MODULE_LICENSE("GPL");
 
-/*
- * Special e_who identifiers:  ACEs which have ACE4_SPECIAL_WHO set in
- * ace->e_flags use these constants in ace->u.e_who.
- *
- * For efficiency, we compare pointers instead of comparing strings.
- */
-const char richace_owner_who[]	  = "OWNER@";
-EXPORT_SYMBOL_GPL(richace_owner_who);
-const char richace_group_who[]	  = "GROUP@";
-EXPORT_SYMBOL_GPL(richace_group_who);
-const char richace_everyone_who[] = "EVERYONE@";
-EXPORT_SYMBOL_GPL(richace_everyone_who);
-
 /**
  * richacl_alloc  -  allocate a richacl
  * @count:	number of entries
@@ -194,38 +181,11 @@ richace_is_same_identifier(const struct richace *a, const struct richace *b)
 #define WHO_FLAGS (ACE4_SPECIAL_WHO | ACE4_IDENTIFIER_GROUP)
 	if ((a->e_flags & WHO_FLAGS) != (b->e_flags & WHO_FLAGS))
 		return 0;
-	if (a->e_flags & ACE4_SPECIAL_WHO)
-		return a->u.e_who == b->u.e_who;
-	else
-		return a->u.e_id == b->u.e_id;
+	return a->e_id == b->e_id;
 #undef WHO_FLAGS
 }
 
 /**
- * richacl_set_who  -  set a special who value
- * @ace:	acl entry
- * @who:	who value to use
- */
-int
-richace_set_who(struct richace *ace, const char *who)
-{
-	if (!strcmp(who, richace_owner_who))
-		who = richace_owner_who;
-	else if (!strcmp(who, richace_group_who))
-		who = richace_group_who;
-	else if (!strcmp(who, richace_everyone_who))
-		who = richace_everyone_who;
-	else
-		return -EINVAL;
-
-	ace->u.e_who = who;
-	ace->e_flags |= ACE4_SPECIAL_WHO;
-	ace->e_flags &= ~ACE4_IDENTIFIER_GROUP;
-	return 0;
-}
-EXPORT_SYMBOL_GPL(richace_set_who);
-
-/**
  * richacl_allowed_to_who  -  mask flags allowed to a specific who value
  *
  * Computes the mask values allowed to a specific who value, taking
@@ -446,10 +406,10 @@ richacl_permission(struct inode *inode, const struct richacl *acl,
 				continue;
 		} else if (richace_is_unix_id(ace)) {
 			if (ace->e_flags & ACE4_IDENTIFIER_GROUP) {
-				if (!in_group_p(ace->u.e_id))
+				if (!in_group_p(ace->e_id))
 					continue;
 			} else {
-				if (current_fsuid() != ace->u.e_id)
+				if (current_fsuid() != ace->e_id)
 					continue;
 			}
 		} else
diff --git a/fs/richacl_xattr.c b/fs/richacl_xattr.c
index 02a7986..31e33b5 100644
--- a/fs/richacl_xattr.c
+++ b/fs/richacl_xattr.c
@@ -58,19 +58,14 @@ richacl_from_xattr(const void *value, size_t size)
 		goto fail_einval;
 
 	richacl_for_each_entry(ace, acl) {
-		const char *who = (void *)(xattr_ace + 1), *end;
-		ssize_t used = (void *)who - value;
 
-		if (used > size)
-			goto fail_einval;
-		end = memchr(who, 0, size - used);
-		if (!end)
+		if (((void *)xattr_ace + sizeof(*xattr_ace)) > (value + size))
 			goto fail_einval;
 
-		ace->e_type = le16_to_cpu(xattr_ace->e_type);
+		ace->e_type  = le16_to_cpu(xattr_ace->e_type);
 		ace->e_flags = le16_to_cpu(xattr_ace->e_flags);
-		ace->e_mask = le32_to_cpu(xattr_ace->e_mask);
-		ace->u.e_id = le32_to_cpu(xattr_ace->e_id);
+		ace->e_mask  = le32_to_cpu(xattr_ace->e_mask);
+		ace->e_id    = le32_to_cpu(xattr_ace->e_id);
 
 		if (ace->e_flags & ~ACE4_VALID_FLAGS)
 			goto fail_einval;
@@ -78,13 +73,7 @@ richacl_from_xattr(const void *value, size_t size)
 		    (ace->e_mask & ~ACE4_VALID_MASK))
 			goto fail_einval;
 
-		if (who == end) {
-			if (ace->u.e_id == -1)
-				goto fail_einval;  /* uid/gid needed */
-		} else if (richace_set_who(ace, who))
-			goto fail_einval;
-
-		xattr_ace = (void *)who + ALIGN(end - who + 1, 4);
+		xattr_ace++;
 	}
 
 	return acl;
@@ -102,13 +91,8 @@ size_t
 richacl_xattr_size(const struct richacl *acl)
 {
 	size_t size = sizeof(struct richacl_xattr);
-	const struct richace *ace;
 
-	richacl_for_each_entry(ace, acl) {
-		size += sizeof(struct richace_xattr) +
-			(richace_is_unix_id(ace) ? 4 :
-			 ALIGN(strlen(ace->u.e_who) + 1, 4));
-	}
+	size += sizeof(struct richace_xattr) * acl->a_count;
 	return size;
 }
 EXPORT_SYMBOL_GPL(richacl_xattr_size);
@@ -139,18 +123,8 @@ richacl_to_xattr(const struct richacl *acl, void *buffer)
 		xattr_ace->e_flags = cpu_to_le16(ace->e_flags &
 						 ACE4_VALID_FLAGS);
 		xattr_ace->e_mask = cpu_to_le32(ace->e_mask);
-		if (richace_is_unix_id(ace)) {
-			xattr_ace->e_id = cpu_to_le32(ace->u.e_id);
-			memset(xattr_ace->e_who, 0, 4);
-			xattr_ace = (void *)xattr_ace->e_who + 4;
-		} else {
-			int sz = ALIGN(strlen(ace->u.e_who) + 1, 4);
-
-			xattr_ace->e_id = cpu_to_le32(-1);
-			memset(xattr_ace->e_who + sz - 4, 0, 4);
-			strcpy(xattr_ace->e_who, ace->u.e_who);
-			xattr_ace = (void *)xattr_ace->e_who + sz;
-		}
+		xattr_ace->e_id = cpu_to_le32(ace->e_id);
+		xattr_ace++;
 	}
 }
 EXPORT_SYMBOL_GPL(richacl_to_xattr);
diff --git a/include/linux/richacl.h b/include/linux/richacl.h
index 4af6d22..3fc6be2 100644
--- a/include/linux/richacl.h
+++ b/include/linux/richacl.h
@@ -17,14 +17,15 @@
 #define __RICHACL_H
 #include <linux/slab.h>
 
+#define ACE_OWNER_ID		130
+#define ACE_GROUP_ID		131
+#define ACE_EVERYONE_ID		110
+
 struct richace {
 	unsigned short	e_type;
 	unsigned short	e_flags;
 	unsigned int	e_mask;
-	union {
-		unsigned int	e_id;
-		const char	*e_who;
-	} u;
+	unsigned int	e_id;
 };
 
 struct richacl {
@@ -74,7 +75,7 @@ struct richacl {
 /*#define ACE4_FAILED_ACCESS_ACE_FLAG	0x0020*/
 #define ACE4_IDENTIFIER_GROUP		0x0040
 #define ACE4_INHERITED_ACE		0x0080
-/* in-memory representation only */
+/* richacl specific flag values */
 #define ACE4_SPECIAL_WHO		0x4000
 
 #define ACE4_VALID_FLAGS (			\
@@ -83,7 +84,9 @@ struct richacl {
 	ACE4_NO_PROPAGATE_INHERIT_ACE |		\
 	ACE4_INHERIT_ONLY_ACE |			\
 	ACE4_IDENTIFIER_GROUP |			\
-	ACE4_INHERITED_ACE)
+	ACE4_INHERITED_ACE |			\
+	ACE4_SPECIAL_WHO)
+
 
 /* e_mask bitflags */
 #define ACE4_READ_DATA			0x00000001
@@ -254,14 +257,6 @@ richacl_is_protected(const struct richacl *acl)
 	return acl->a_flags & ACL4_PROTECTED;
 }
 
-/*
- * Special e_who identifiers: we use these pointer values in comparisons
- * instead of doing a strcmp.
- */
-extern const char richace_owner_who[];
-extern const char richace_group_who[];
-extern const char richace_everyone_who[];
-
 /**
  * richace_is_owner  -  check if @ace is an OWNER@ entry
  */
@@ -269,7 +264,7 @@ static inline int
 richace_is_owner(const struct richace *ace)
 {
 	return (ace->e_flags & ACE4_SPECIAL_WHO) &&
-	       ace->u.e_who == richace_owner_who;
+	       ace->e_id == ACE_OWNER_ID;
 }
 
 /**
@@ -279,7 +274,7 @@ static inline int
 richace_is_group(const struct richace *ace)
 {
 	return (ace->e_flags & ACE4_SPECIAL_WHO) &&
-	       ace->u.e_who == richace_group_who;
+	       ace->e_id == ACE_GROUP_ID;
 }
 
 /**
@@ -289,7 +284,7 @@ static inline int
 richace_is_everyone(const struct richace *ace)
 {
 	return (ace->e_flags & ACE4_SPECIAL_WHO) &&
-	       ace->u.e_who == richace_everyone_who;
+	       ace->e_id == ACE_EVERYONE_ID;
 }
 
 /**
@@ -357,7 +352,6 @@ richace_is_deny(const struct richace *ace)
 extern struct richacl *richacl_alloc(int);
 extern int richace_is_same_identifier(const struct richace *,
 				      const struct richace *);
-extern int richace_set_who(struct richace *, const char *);
 extern int richacl_masks_to_mode(const struct richacl *);
 extern unsigned int richacl_mode_to_mask(mode_t);
 extern unsigned int richacl_want_to_mask(unsigned int);
diff --git a/include/linux/richacl_xattr.h b/include/linux/richacl_xattr.h
index f79ec12..792abcc 100644
--- a/include/linux/richacl_xattr.h
+++ b/include/linux/richacl_xattr.h
@@ -25,7 +25,6 @@ struct richace_xattr {
 	__le16		e_flags;
 	__le32		e_mask;
 	__le32		e_id;
-	char		e_who[0];
 };
 
 struct richacl_xattr {
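
As a reading aid for the patch above, here is a minimal userspace sketch of how a fixed-size entry distinguishes a special who value from an ordinary unix id. The flag and id constants are copied from the patch; the struct and function names (ace, ace_is_owner, and so on) are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Values as they appear in the patch. */
#define ACE4_IDENTIFIER_GROUP	0x0040
#define ACE4_SPECIAL_WHO	0x4000
#define ACE_OWNER_ID		130
#define ACE_GROUP_ID		131
#define ACE_EVERYONE_ID		110

/* Hypothetical stand-in for the in-memory entry after the union is removed. */
struct ace {
	uint16_t e_type;
	uint16_t e_flags;
	uint32_t e_mask;
	uint32_t e_id;
};

/* An entry is OWNER@ only if the flag is set AND the id matches. */
static int ace_is_owner(const struct ace *a)
{
	return (a->e_flags & ACE4_SPECIAL_WHO) && a->e_id == ACE_OWNER_ID;
}

/* Without the flag, e_id is an ordinary uid or gid. */
static int ace_is_unix_id(const struct ace *a)
{
	return !(a->e_flags & ACE4_SPECIAL_WHO);
}

/* Two entries name the same identity when the who-related flags and ids agree. */
static int ace_same_identifier(const struct ace *a, const struct ace *b)
{
	const uint16_t who_flags = ACE4_SPECIAL_WHO | ACE4_IDENTIFIER_GROUP;

	if ((a->e_flags & who_flags) != (b->e_flags & who_flags))
		return 0;
	return a->e_id == b->e_id;
}
```

The point of the flag is that a real user with uid 130 and the OWNER@ entry share the same e_id but still compare as different identities.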


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
  2011-10-21 13:12                   ` Aneesh Kumar K.V
@ 2011-10-21 23:58                     ` Andreas Gruenbacher
  0 siblings, 0 replies; 66+ messages in thread
From: Andreas Gruenbacher @ 2011-10-21 23:58 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: J. Bruce Fields, Christoph Hellwig, akpm, viro, dhowells,
	linux-fsdevel, linux-nfs, linux-kernel

On Fri, 2011-10-21 at 18:42 +0530, Aneesh Kumar K.V wrote:
> diff --git a/fs/richacl_xattr.c b/fs/richacl_xattr.c
> index 02a7986..31e33b5 100644
> --- a/fs/richacl_xattr.c
> +++ b/fs/richacl_xattr.c
> @@ -58,19 +58,14 @@ richacl_from_xattr(const void *value, size_t size)
>  		goto fail_einval;
>  
>  	richacl_for_each_entry(ace, acl) {
> -		const char *who = (void *)(xattr_ace + 1), *end;
> -		ssize_t used = (void *)who - value;
>  
> -		if (used > size)
> -			goto fail_einval;
> -		end = memchr(who, 0, size - used);
> -		if (!end)
> +		if (((void *)xattr_ace + sizeof(*xattr_ace)) > (value + size))
>  			goto fail_einval;

This check can be moved out of the loop now.

Other than that, I'm happy with the patch; acked.

Thanks,
Andreas
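
Because every entry now has the same size, the bounds check suggested above can be done once before the loop. A minimal sketch under stated assumptions: xattr_size_ok is a hypothetical helper, and the header size, entry size and entry count are passed in as parameters because the header layout is not shown in this part of the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 if a buffer of `size` bytes can hold a header of `hdr_size`
 * bytes followed by `count` fixed-size entries of `entry_size` bytes.
 * Dividing instead of multiplying avoids overflow for a huge `count`.
 */
static int xattr_size_ok(size_t size, size_t hdr_size,
			 size_t entry_size, size_t count)
{
	if (entry_size == 0 || size < hdr_size)
		return 0;
	return count <= (size - hdr_size) / entry_size;
}
```

With a single up-front check like this, the per-entry pointer comparison inside the loop is no longer needed. Whether the real function should also reject trailing bytes (an exact-size match) is a separate decision that this sketch does not make.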


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
  2011-10-20 17:49               ` J. Bruce Fields
  (?)
  (?)
@ 2011-11-19  9:28               ` Eric W. Biederman
  2011-11-21 13:35                 ` J. Bruce Fields
  -1 siblings, 1 reply; 66+ messages in thread
From: Eric W. Biederman @ 2011-11-19  9:28 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Aneesh Kumar K.V, Christoph Hellwig, agruen, akpm, viro,
	dhowells, linux-fsdevel, linux-nfs, linux-kernel,
	Serge E. Hallyn, Andreas Dilger

"J. Bruce Fields" <bfields@fieldses.org> writes:

> On Thu, Oct 20, 2011 at 04:32:04PM +0530, Aneesh Kumar K.V wrote:
>> On Thu, 20 Oct 2011 05:19:46 -0400, Christoph Hellwig <hch@infradead.org> wrote:
>> > On Thu, Oct 20, 2011 at 05:14:34AM -0400, J. Bruce Fields wrote:
>> > > > > Does it really make sense to use a string here just to pick between the
>> > > > > three choices OWNER@, GROUP@, and EVERYONE@?  Why not just another small
>> > > > > integer?  Is the goal to expand this somehow eventually?
>> > > > 
>> > 
>> > > > I guess Andreas wanted the disk layout to be able to store user@domain
>> > > > format if needed.
>> > > 
>> > > Is that likely?  For that to be useful, tasks would need to be able to
>> > > run as user@domain strings.  And we'd probably want owners and groups to
>> > > also be user@domain strings.
>> > > 
>> > > The container people seem to eventually want to add some kind of
>> > > namespace identifier everywhere:
>> > > 
>> > > 	http://marc.info/?l=linux-kernel&m=131836778427871&w=2
>> > > 
>> > > in which case I guess we'd likely end up with (uid, user namespace id)
>> > > instead of user@domain?
>> > 
>> > 
>> > Storing strings is an extremely stupid idea.  The only thing that would
>> > make sense would be storing a windows-style 128-bit GUID.
>> > 
>> 
>> How about updating the richacl_xattr as below 
>> 
>> struct richace_xattr {
>> 	__le16		e_type;
>> 	__le16		e_flags;
>> 	__le32		e_mask;
>> 	__le32		e_size;
>> 	u8		e_id[0];
>> };
>> 
>> Now e_flags can contain ACE4_SPECIAL_WHO to indicate that e_id holds a
>> special who value (which could be a 1-byte value indicating OWNER@,
>> GROUP@ or EVERYONE@), ACE4_UNIXID_WHO to indicate that e_id holds the
>> little-endian value of a unix id, or ACE_WINSID_WHO to indicate that
>> e_id is a 128-bit array containing a SID value?
>
> That's effectively still a string.
>
> Would it be so bad to have to introduce another xattr type if we needed
> a new id type?  You'll have to modify the filesystem and the userspace
> tools and everything anyway, won't you?
>
> But if we decide we don't need strings, then at a minimum let's make
> these some fixed small size.
>
> You could do something like:
>
> 	struct richace_xattr {
> 		__le16		e_type;
> 		__le16		e_flags;
> 		__le32		e_mask;
> 		__le32		e_id[4];
> 	}
>
> and just use e_id[0] for now.  That would still leave room for a 128-bit
> id, or for a 32-bit uid + some-size namespace-id.
>
> Cc'ing Eric Biederman in hopes of finding out whether that would satisfy
> whatever wacky future ideas might be expected for user namespaces.

Thanks for the cc.  After looking at the user namespace issues it looks
like the sane thing is really to map the user namespace uids into
appropriate uids for storing on the filesystem.  Anything else
seems to be a lot of pain for very little gain.

If a filesystem went as far as storing string ids, I think I would
be happy to use different domains for different user namespaces, but
for anything else I just don't see the point.

What it does look like to me is that at some point we will want to
support > 32bit uids.  There are 7 billion people on the planet and we
only have 4 billion user ids.  The biggest individual organizations have
3 million users, which keeps us safe for now.  However my forecast is
each user namespace is going to wind up giving each user a bunch of
uids.  That will accelerate the point at which we find 32bit uids tight.
How fast being generous and assigning 10k uids per user is going to get
us into trouble I don't know. 

Eric
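
The arithmetic behind the forecast above can be worked through directly. This is a rough sketch only: the 3 million users and 10k uids per user are the figures quoted in the message, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct values in a 32-bit uid: 2^32 = 4294967296. */
#define UID_SPACE_32 (UINT64_C(1) << 32)

/* How many users fit if each one is handed `uids_per_user` uids (must be > 0). */
static uint64_t users_that_fit(uint64_t uids_per_user)
{
	return UID_SPACE_32 / uids_per_user;
}

/* Does an organization of `users` fit into 32 bits at that allocation? */
static int fits_in_32bit(uint64_t users, uint64_t uids_per_user)
{
	return users <= users_that_fit(uids_per_user);
}
```

At 10k uids per user, a 32-bit space covers roughly 429 thousand users, so an organization with 3 million users would already need about 30 billion ids, well beyond 2^32.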

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
  2011-10-20 19:49               ` Andreas Dilger
@ 2011-11-19  9:35                 ` Eric W. Biederman
  0 siblings, 0 replies; 66+ messages in thread
From: Eric W. Biederman @ 2011-11-19  9:35 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: J. Bruce Fields, Aneesh Kumar K.V, Christoph Hellwig, agruen,
	akpm, viro, dhowells, linux-fsdevel, linux-nfs, linux-kernel,
	Serge E. Hallyn

Andreas Dilger <adilger@dilger.ca> writes:

> Just as an FYI, from back when we were trying to port Lustre to Solaris,
> Solaris itself uses a 64-bit "FUID" (32-bit UID + 32-bit namespace) to
> handle this.
>
> It has a table for arbitrary mapping of 128-bit Windows domains to a
> 32-bit FUID namespace (don't know much detail here, sorry), and it is
> (reasonably) expected that a single system will not be in more than
> 2^32 namespaces at once.  This keeps the datatypes sane (u64 or 2x u32)
> and doesn't put much complexity into the filesystem/kernel.  For most
> uses, the high 32-bit value is 0 (local Unix domain).

Interesting.  For now it looks to me like a fixed partition of a uid
into a namespace identifier and a normal uid is a brittle path to walk.

I am looking at using something slightly more dynamic: for your user
namespace, you get this range of uids.  That nests better and
allows for more flexibility in the future.

I think I can mostly keep filesystems from having to care, with changes
in just a few places: where we take the filesystem uid and map it into
what we store in the vfs cache, and where we read the uid value out
of the vfs cache and map it into the id value that the filesystem
needs.

Eric
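
One way to picture the range-based idea described above is a single contiguous window of filesystem uids assigned to a namespace. This is purely an illustrative sketch: struct uid_range and both helpers are hypothetical, and nothing in this thread specifies how such a mapping would actually be stored or nested.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mapping: namespace uids [ns_first, ns_first + count)
 * correspond to filesystem uids [fs_first, fs_first + count).
 * Assumes the range was validated so that neither end overflows. */
struct uid_range {
	uint32_t ns_first;
	uint32_t fs_first;
	uint32_t count;
};

/* Map a namespace uid to the uid stored on the filesystem.
 * Returns 0 on success, -1 if the uid is outside the range. */
static int ns_to_fs_uid(const struct uid_range *r, uint32_t ns_uid,
			uint32_t *fs_uid)
{
	if (ns_uid < r->ns_first || ns_uid - r->ns_first >= r->count)
		return -1;
	*fs_uid = r->fs_first + (ns_uid - r->ns_first);
	return 0;
}

/* Inverse direction: filesystem uid back to the namespace's view. */
static int fs_to_ns_uid(const struct uid_range *r, uint32_t fs_uid,
			uint32_t *ns_uid)
{
	if (fs_uid < r->fs_first || fs_uid - r->fs_first >= r->count)
		return -1;
	*ns_uid = r->ns_first + (fs_uid - r->fs_first);
	return 0;
}
```

Unlike a fixed split of the uid into a namespace half and a user half, the size and position of each window are data, so a namespace can be given a larger or smaller range without changing the id format.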

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH -V7 21/26] richacl: xattr mapping functions
  2011-11-19  9:28               ` Eric W. Biederman
@ 2011-11-21 13:35                 ` J. Bruce Fields
  0 siblings, 0 replies; 66+ messages in thread
From: J. Bruce Fields @ 2011-11-21 13:35 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Aneesh Kumar K.V, Christoph Hellwig, agruen, akpm, viro,
	dhowells, linux-fsdevel, linux-nfs, linux-kernel,
	Serge E. Hallyn, Andreas Dilger

On Sat, Nov 19, 2011 at 01:28:10AM -0800, Eric W. Biederman wrote:
> Thanks for the cc.  After looking at the user namespace issues it looks
> like the sane thing is really to map the user namespace uids into
> appropriate uids for storing on the filesystem.  Anything else
> seems to be a lot of pain for very little gain.
> 
> If a filesystem went as far as storing string ids, I think I would
> be happy to use different domains for different user namespaces, but
> for anything else I just don't see the point.
> 
> What it does look like to me is that at some point we will want to
> support > 32bit uids.  There are 7 billion people on the planet and we
> only have 4 billion user ids.  The biggest individual organizations have
> 3 million users, which keeps us safe for now.  However my forecast is
> each user namespace is going to wind up giving each user a bunch of
> uids.  That will accelerate the point at which we find 32bit uids tight.
> How fast being generous and assigning 10k uids per user is going to get
> us into trouble I don't know. 

Yes, bigger uids make sense to me.

But at the point when we make that transition I think updating the ACL
format will be the least of our troubles.  So I think we'll leave it
alone rather than try to guess the right type now.

--b.

^ permalink raw reply	[flat|nested] 66+ messages in thread

end of thread, other threads:[~2011-11-21 13:35 UTC | newest]

Thread overview: 66+ messages
2011-10-18 15:32 [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 01/26] vfs: Indicate that the permission functions take all the MAY_* flags Aneesh Kumar K.V
2011-10-18 15:32   ` Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 02/26] vfs: Add hex format for MAY_* flag values Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 03/26] vfs: Pass all mask flags down to iop->check_acl Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 04/26] vfs: Add a comment to inode_permission() Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 05/26] vfs: Add generic IS_ACL() test for acl support Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 06/26] vfs: Add IS_RICHACL() test for richacl support Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 07/26] vfs: Optimize out IS_RICHACL() if CONFIG_FS_RICHACL is not defined Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 08/26] vfs: Add new file and directory create permission flags Aneesh Kumar K.V
2011-10-19 16:42   ` J. Bruce Fields
2011-10-20  5:20     ` Aneesh Kumar K.V
2011-10-20  5:20       ` Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 09/26] vfs: Add delete child and delete self " Aneesh Kumar K.V
2011-10-19 22:09   ` J. Bruce Fields
2011-10-20  7:35     ` Aneesh Kumar K.V
2011-10-20  7:35       ` Aneesh Kumar K.V
2011-10-20  8:11       ` J. Bruce Fields
2011-10-18 15:32 ` [PATCH -V7 10/26] vfs: Make the inode passed to inode_change_ok non-const Aneesh Kumar K.V
2011-10-18 15:32   ` Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 11/26] vfs: Add permission flags for setting file attributes Aneesh Kumar K.V
2011-10-18 15:32   ` Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 12/26] vfs: Make acl_permission_check() work for richacls Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 13/26] richacl: In-memory representation and helper functions Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 14/26] richacl: Permission mapping functions Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 15/26] richacl: Compute maximum file masks from an acl Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 16/26] richacl: Update the file masks in chmod() Aneesh Kumar K.V
2011-10-18 15:32   ` Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 17/26] richacl: Permission check algorithm Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 18/26] richacl: Create-time inheritance Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 19/26] richacl: Check if an acl is equivalent to a file mode Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 20/26] richacl: Automatic Inheritance Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 21/26] richacl: xattr mapping functions Aneesh Kumar K.V
2011-10-18 15:32   ` Aneesh Kumar K.V
2011-10-19 22:20   ` J. Bruce Fields
2011-10-20  8:30     ` Aneesh Kumar K.V
2011-10-20  9:14       ` J. Bruce Fields
2011-10-20  9:19         ` Christoph Hellwig
2011-10-20 10:25           ` J. Bruce Fields
2011-10-20 10:25             ` J. Bruce Fields
2011-10-20 23:46             ` Andreas Gruenbacher
2011-10-20 23:46               ` Andreas Gruenbacher
2011-10-21  0:45               ` J. Bruce Fields
2011-10-21  9:40               ` Aneesh Kumar K.V
2011-10-21  9:40                 ` Aneesh Kumar K.V
2011-10-21 10:52                 ` Andreas Gruenbacher
2011-10-21 13:12                   ` Aneesh Kumar K.V
2011-10-21 23:58                     ` Andreas Gruenbacher
2011-10-20 11:02           ` Aneesh Kumar K.V
2011-10-20 11:02             ` Aneesh Kumar K.V
2011-10-20 17:49             ` J. Bruce Fields
2011-10-20 17:49               ` J. Bruce Fields
2011-10-20 19:49               ` Andreas Dilger
2011-11-19  9:35                 ` Eric W. Biederman
2011-11-19  9:28               ` Eric W. Biederman
2011-11-21 13:35                 ` J. Bruce Fields
2011-10-18 15:32 ` [PATCH -V7 22/26] vfs: Cache richacl in struct inode Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 23/26] vfs: Add richacl permission check Aneesh Kumar K.V
2011-10-18 15:32 ` [PATCH -V7 24/26] ext4: Use IS_POSIXACL() to check for POSIX ACL support Aneesh Kumar K.V
2011-10-18 15:33 ` [PATCH -V7 25/26] ext4: Implement rich acl for ext4 Aneesh Kumar K.V
2011-10-18 18:41   ` Andreas Dilger
2011-10-19  5:43     ` Aneesh Kumar K.V
2011-10-18 15:33 ` [PATCH -V7 26/26] ext4: Add Ext4 compat richacl feature flag Aneesh Kumar K.V
2011-10-18 16:17 ` [PATCH -V7 00/26] New ACL format for better NFSv4 acl interoperability Shea Levy
2011-10-19  5:54   ` Aneesh Kumar K.V
2011-10-19 22:21 ` J. Bruce Fields
