linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Support user xattrs in cgroupfs
@ 2020-03-03  1:38 Daniel Xu
  2020-03-03  1:39 ` [PATCH 1/2] kernfs: Add option to enable user xattrs Daniel Xu
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Daniel Xu @ 2020-03-03  1:38 UTC (permalink / raw)
  To: cgroups, tj, lizefan, hannes; +Cc: Daniel Xu, linux-kernel, gregkh, kernel-team

User extended attributes are useful as metadata storage for kernfs
consumers like cgroups. Especially in the case of cgroups, it is useful
to have a central metadata store that multiple processes/services can
use to coordinate actions.

A concrete example is for userspace out of memory killers. We want to
let delegated cgroup subtree owners (running as non-root) to be able to
say "please avoid killing this cgroup". In server environments this is
less important as everyone is running as root. But for desktop linux,
this is more important.

The first patch introduces a new flag, KERNFS_ROOT_SUPPORT_USER_XATTR,
that lets kernfs consumers enable user xattr support. The second patch
turns on this feature for cgroupfs.

Daniel Xu (2):
  kernfs: Add option to enable user xattrs
  cgroupfs: Support user xattrs

 fs/kernfs/inode.c           | 47 +++++++++++++++++++++++++++++++++++++
 fs/kernfs/kernfs-internal.h |  1 +
 include/linux/kernfs.h      |  6 +++++
 kernel/cgroup/cgroup.c      |  3 ++-
 4 files changed, 56 insertions(+), 1 deletion(-)

-- 
2.21.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] kernfs: Add option to enable user xattrs
  2020-03-03  1:38 [PATCH 0/2] Support user xattrs in cgroupfs Daniel Xu
@ 2020-03-03  1:39 ` Daniel Xu
  2020-03-03 13:19   ` Tejun Heo
  2020-03-03  1:39 ` [PATCH 2/2] cgroupfs: Support " Daniel Xu
  2020-03-03 16:50 ` [PATCH 0/2] Support user xattrs in cgroupfs Shakeel Butt
  2 siblings, 1 reply; 9+ messages in thread
From: Daniel Xu @ 2020-03-03  1:39 UTC (permalink / raw)
  To: cgroups, tj, lizefan, hannes; +Cc: Daniel Xu, linux-kernel, gregkh, kernel-team

User extended attributes are useful as metadata storage for kernfs
consumers like cgroups. Especially in the case of cgroups, it is useful
to have a central metadata store that multiple processes/services can
use to coordinate actions.

A concrete example is for userspace out of memory killers. We want to
let delegated cgroup subtree owners (running as non-root) to be able to
say "please avoid killing this cgroup". In server environments this is
less important as everyone is running as root. But for desktop linux,
this is more important.

This patch introduces a new flag, KERNFS_ROOT_SUPPORT_USER_XATTR, that
lets kernfs consumers enable user xattr support. An initial limit of 128
entries is placed because xattrs come from kernel memory and we don't
want to let unprivileged users accidentally eat up too much kernel
memory.

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
---
 fs/kernfs/inode.c           | 47 +++++++++++++++++++++++++++++++++++++
 fs/kernfs/kernfs-internal.h |  1 +
 include/linux/kernfs.h      |  6 +++++
 3 files changed, 54 insertions(+)

diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index d0f7a5abd9a9..6d603d1177c4 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -53,6 +53,7 @@ static struct kernfs_iattrs *__kernfs_iattrs(struct kernfs_node *kn, int alloc)
 	kn->iattr->ia_ctime = kn->iattr->ia_atime;
 
 	simple_xattrs_init(&kn->iattr->xattrs);
+	atomic_set(&kn->iattr->nr_user_xattrs, 0);
 out_unlock:
 	ret = kn->iattr;
 	mutex_unlock(&iattr_mutex);
@@ -327,6 +328,45 @@ static int kernfs_vfs_xattr_set(const struct xattr_handler *handler,
 	return kernfs_xattr_set(kn, name, value, size, flags);
 }
 
+static int kernfs_vfs_user_xattr_set(const struct xattr_handler *handler,
+				     struct dentry *unused, struct inode *inode,
+				     const char *suffix, const void *value,
+				     size_t size, int flags)
+{
+	struct kernfs_node *kn = inode->i_private;
+	atomic_t *nr = &kn->iattr->nr_user_xattrs;
+	int ret;
+
+	if (!(kernfs_root(kn)->flags & KERNFS_ROOT_SUPPORT_USER_XATTR)) {
+		ret = -EOPNOTSUPP;
+		goto out;
+	}
+
+	if (value && atomic_inc_return(nr) > KERNFS_MAX_USER_XATTRS) {
+		ret = -ENOSPC;
+		goto dec_out;
+	}
+
+	ret = kernfs_vfs_xattr_set(handler, unused, inode, suffix, value,
+				   size, flags);
+
+	/*
+	 * Don't decrement counter if we successfully added an xattr
+	 */
+	if (value && !ret)
+		goto out;
+
+dec_out:
+	/*
+	 * Removing a nonexistent xattr without XATTR_REPLACE returns success
+	 * so we have to make sure here we don't go negative.
+	 */
+	if (atomic_dec_return(nr) < 0)
+		atomic_inc(nr);
+out:
+	return ret;
+}
+
 static const struct xattr_handler kernfs_trusted_xattr_handler = {
 	.prefix = XATTR_TRUSTED_PREFIX,
 	.get = kernfs_vfs_xattr_get,
@@ -339,8 +379,15 @@ static const struct xattr_handler kernfs_security_xattr_handler = {
 	.set = kernfs_vfs_xattr_set,
 };
 
+static const struct xattr_handler kernfs_user_xattr_handler = {
+	.prefix = XATTR_USER_PREFIX,
+	.get = kernfs_vfs_xattr_get,
+	.set = kernfs_vfs_user_xattr_set,
+};
+
 const struct xattr_handler *kernfs_xattr_handlers[] = {
 	&kernfs_trusted_xattr_handler,
 	&kernfs_security_xattr_handler,
+	&kernfs_user_xattr_handler,
 	NULL
 };
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 2f3c51d55261..745505ce1f37 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -26,6 +26,7 @@ struct kernfs_iattrs {
 	struct timespec64	ia_ctime;
 
 	struct simple_xattrs	xattrs;
+	atomic_t		nr_user_xattrs;
 };
 
 /* +1 to avoid triggering overflow warning when negating it */
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index dded2e5a9f42..0e06d67db05f 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -39,6 +39,7 @@ enum kernfs_node_type {
 
 #define KERNFS_TYPE_MASK	0x000f
 #define KERNFS_FLAG_MASK	~KERNFS_TYPE_MASK
+#define KERNFS_MAX_USER_XATTRS  128
 
 enum kernfs_node_flag {
 	KERNFS_ACTIVATED	= 0x0010,
@@ -78,6 +79,11 @@ enum kernfs_root_flag {
 	 * fhandle to access nodes of the fs.
 	 */
 	KERNFS_ROOT_SUPPORT_EXPORTOP		= 0x0004,
+
+	/*
+	 * Support user xattrs to be written to nodes rooted at this root.
+	 */
+	KERNFS_ROOT_SUPPORT_USER_XATTR		= 0x0008,
 };
 
 /* type-specific structures for kernfs_node union members */
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] cgroupfs: Support user xattrs
  2020-03-03  1:38 [PATCH 0/2] Support user xattrs in cgroupfs Daniel Xu
  2020-03-03  1:39 ` [PATCH 1/2] kernfs: Add option to enable user xattrs Daniel Xu
@ 2020-03-03  1:39 ` Daniel Xu
  2020-03-03 16:50 ` [PATCH 0/2] Support user xattrs in cgroupfs Shakeel Butt
  2 siblings, 0 replies; 9+ messages in thread
From: Daniel Xu @ 2020-03-03  1:39 UTC (permalink / raw)
  To: cgroups, tj, lizefan, hannes; +Cc: Daniel Xu, linux-kernel, gregkh, kernel-team

This patch turns on xattr support for cgroupfs. This is useful for
letting non-root owners of delegated subtrees attach metadata to
cgroups.

One use case is for subtree owners to tell a userspace out of memory
killer to bias away from killing specific subtrees.

Tests:

    [/sys/fs/cgroup]# for i in $(seq 0 130); \
        do setfattr workload.slice -n user.name$i -v wow; done
    setfattr: workload.slice: No space left on device
    setfattr: workload.slice: No space left on device
    setfattr: workload.slice: No space left on device

    [/sys/fs/cgroup]# for i in $(seq 0 130); \
        do setfattr workload.slice --remove user.name$i; done
    setfattr: workload.slice: No such attribute
    setfattr: workload.slice: No such attribute
    setfattr: workload.slice: No such attribute

    [/sys/fs/cgroup]# for i in $(seq 0 130); \
        do setfattr workload.slice -n user.name$i -v wow; done
    setfattr: workload.slice: No space left on device
    setfattr: workload.slice: No space left on device
    setfattr: workload.slice: No space left on device

`seq 0 130` is inclusive, and 131 - 128 = 3, which is the number of
errors we expect to see.

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
---
 kernel/cgroup/cgroup.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 75f687301bbf..21621b43a8ab 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1954,7 +1954,8 @@ int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask)
 
 	root->kf_root = kernfs_create_root(kf_sops,
 					   KERNFS_ROOT_CREATE_DEACTIVATED |
-					   KERNFS_ROOT_SUPPORT_EXPORTOP,
+					   KERNFS_ROOT_SUPPORT_EXPORTOP |
+					   KERNFS_ROOT_SUPPORT_USER_XATTR,
 					   root_cgrp);
 	if (IS_ERR(root->kf_root)) {
 		ret = PTR_ERR(root->kf_root);
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] kernfs: Add option to enable user xattrs
  2020-03-03  1:39 ` [PATCH 1/2] kernfs: Add option to enable user xattrs Daniel Xu
@ 2020-03-03 13:19   ` Tejun Heo
  2020-03-03 19:19     ` Daniel Xu
  0 siblings, 1 reply; 9+ messages in thread
From: Tejun Heo @ 2020-03-03 13:19 UTC (permalink / raw)
  To: Daniel Xu; +Cc: cgroups, lizefan, hannes, linux-kernel, gregkh, kernel-team

Hello,

On Mon, Mar 02, 2020 at 05:39:00PM -0800, Daniel Xu wrote:
> +static int kernfs_vfs_user_xattr_set(const struct xattr_handler *handler,
> +				     struct dentry *unused, struct inode *inode,
> +				     const char *suffix, const void *value,
> +				     size_t size, int flags)
> +{
...
> +	if (value && atomic_inc_return(nr) > KERNFS_MAX_USER_XATTRS) {
> +		ret = -ENOSPC;
> +		goto dec_out;
> +	}

So, we limit the number of user xattrs here but

> +	ret = kernfs_vfs_xattr_set(handler, unused, inode, suffix, value,
> +				   size, flags);

This will call into simple_xattr_set() which doesn't put any further
restriction on size and just calls GFP_KERNEL kmalloc on it allowing
users incur high-order allocations. Maybe it'd make sense to limit
both the number and size?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/2] Support user xattrs in cgroupfs
  2020-03-03  1:38 [PATCH 0/2] Support user xattrs in cgroupfs Daniel Xu
  2020-03-03  1:39 ` [PATCH 1/2] kernfs: Add option to enable user xattrs Daniel Xu
  2020-03-03  1:39 ` [PATCH 2/2] cgroupfs: Support " Daniel Xu
@ 2020-03-03 16:50 ` Shakeel Butt
  2 siblings, 0 replies; 9+ messages in thread
From: Shakeel Butt @ 2020-03-03 16:50 UTC (permalink / raw)
  To: Daniel Xu
  Cc: Cgroups, Tejun Heo, Li Zefan, Johannes Weiner, LKML,
	Greg Kroah-Hartman, Kernel Team

Hi Daniel,

On Mon, Mar 2, 2020 at 5:42 PM Daniel Xu <dxu@dxuuu.xyz> wrote:
>
> User extended attributes are useful as metadata storage for kernfs
> consumers like cgroups. Especially in the case of cgroups, it is useful
> to have a central metadata store that multiple processes/services can
> use to coordinate actions.
>
> A concrete example is for userspace out of memory killers. We want to
> let delegated cgroup subtree owners (running as non-root) to be able to
> say "please avoid killing this cgroup". In server environments this is
> less important as everyone is running as root.

I would recommend removing the "everyone is running as root" statement
as it is not generally true.

Shakeel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] kernfs: Add option to enable user xattrs
  2020-03-03 13:19   ` Tejun Heo
@ 2020-03-03 19:19     ` Daniel Xu
  2020-03-03 19:37       ` Daniel Xu
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Xu @ 2020-03-03 19:19 UTC (permalink / raw)
  To: Tejun Heo; +Cc: cgroups, lizefan, hannes, linux-kernel, gregkh, kernel-team

On Tue Mar 3, 2020 at 8:19 AM, Tejun Heo wrote:
> Hello,
>
> 
> On Mon, Mar 02, 2020 at 05:39:00PM -0800, Daniel Xu wrote:
> > +static int kernfs_vfs_user_xattr_set(const struct xattr_handler *handler,
> > +				     struct dentry *unused, struct inode *inode,
> > +				     const char *suffix, const void *value,
> > +				     size_t size, int flags)
> > +{
> ...
> > +	if (value && atomic_inc_return(nr) > KERNFS_MAX_USER_XATTRS) {
> > +		ret = -ENOSPC;
> > +		goto dec_out;
> > +	}
>
> 
> So, we limit the number of user xattrs here but
>
> 
> > +	ret = kernfs_vfs_xattr_set(handler, unused, inode, suffix, value,
> > +				   size, flags);
>
> 
> This will call into simple_xattr_set() which doesn't put any further
> restriction on size and just calls GFP_KERNEL kmalloc on it allowing
> users incur high-order allocations. Maybe it'd make sense to limit
> both the number and size?

Ah yeah good point. Will add.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] kernfs: Add option to enable user xattrs
  2020-03-03 19:19     ` Daniel Xu
@ 2020-03-03 19:37       ` Daniel Xu
  2020-03-04 16:39         ` Tejun Heo
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Xu @ 2020-03-03 19:37 UTC (permalink / raw)
  To: Daniel Xu, Tejun Heo
  Cc: cgroups, lizefan, hannes, linux-kernel, gregkh, kernel-team

On Tue Mar 3, 2020 at 11:19 AM, Daniel Xu wrote:
> On Tue Mar 3, 2020 at 8:19 AM, Tejun Heo wrote:
> > Hello,
> >
> > 
> > On Mon, Mar 02, 2020 at 05:39:00PM -0800, Daniel Xu wrote:
> > > +static int kernfs_vfs_user_xattr_set(const struct xattr_handler *handler,
> > > +				     struct dentry *unused, struct inode *inode,
> > > +				     const char *suffix, const void *value,
> > > +				     size_t size, int flags)
> > > +{
> > ...
> > > +	if (value && atomic_inc_return(nr) > KERNFS_MAX_USER_XATTRS) {
> > > +		ret = -ENOSPC;
> > > +		goto dec_out;
> > > +	}
> >
> > 
> > So, we limit the number of user xattrs here but
> >
> > 
> > > +	ret = kernfs_vfs_xattr_set(handler, unused, inode, suffix, value,
> > > +				   size, flags);
> >
> > 
> > This will call into simple_xattr_set() which doesn't put any further
> > restriction on size and just calls GFP_KERNEL kmalloc on it allowing
> > users incur high-order allocations. Maybe it'd make sense to limit
> > both the number and size?
>
> 
> Ah yeah good point. Will add.
>

It looks like in fs/xattr.c:setxattr, there is already:

    ...
    if (size) {
        if (size > XATTR_SIZE_MAX)
            return -E2BIG;
    ...

where XATTR_SIZE_MAX is defined as 64k. Do you want it even smaller?


Thanks,
Daniel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] kernfs: Add option to enable user xattrs
  2020-03-03 19:37       ` Daniel Xu
@ 2020-03-04 16:39         ` Tejun Heo
  0 siblings, 0 replies; 9+ messages in thread
From: Tejun Heo @ 2020-03-04 16:39 UTC (permalink / raw)
  To: Daniel Xu; +Cc: cgroups, lizefan, hannes, linux-kernel, gregkh, kernel-team

Hello,

On Tue, Mar 03, 2020 at 11:37:28AM -0800, Daniel Xu wrote:
> It looks like in fs/xattr.c:setxattr, there is already:
> 
>     ...
>     if (size) {
>         if (size > XATTR_SIZE_MAX)
>             return -E2BIG;
>     ...
> 
> where XATTR_SIZE_MAX is defined as 64k. Do you want it even smaller?

Oh, I missed that. Order 5 allocations can still be on the big side
tho. Ideally, something like the following?

* Total number of bytes limit as the primary limit so that we can say
  that for a given cgroup user xattrs don't consume more than X bytes.
  We can pick an arbitrary number which is large enough for most use
  cases but not too big. e.g. 128k or whatever.

* Total number of xattrs limit. Again, some arbitrary not too low, not
  too high limit.

* Switch to kvmalloc() for attr allocation so that we don't have to
  worry about high order allocations.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/2] Support user xattrs in cgroupfs
       [not found] <CALvZod5m3otRRqcLBebbgiZbhoYWAMbMg+ESkacJuj64OP =H4Q@mail.gmail.com>
@ 2020-03-03 19:20 ` Daniel Xu
  0 siblings, 0 replies; 9+ messages in thread
From: Daniel Xu @ 2020-03-03 19:20 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Cgroups, Tejun Heo, Li Zefan, Johannes Weiner, LKML,
	Greg Kroah-Hartman, Kernel Team

On Tue Mar 3, 2020 at 8:50 AM, Shakeel Butt wrote:
> Hi Daniel,
>
> 
> On Mon, Mar 2, 2020 at 5:42 PM Daniel Xu <dxu@dxuuu.xyz> wrote:
> >
> > User extended attributes are useful as metadata storage for kernfs
> > consumers like cgroups. Especially in the case of cgroups, it is useful
> > to have a central metadata store that multiple processes/services can
> > use to coordinate actions.
> >
> > A concrete example is for userspace out of memory killers. We want to
> > let delegated cgroup subtree owners (running as non-root) to be able to
> > say "please avoid killing this cgroup". In server environments this is
> > less important as everyone is running as root.
>
> 
> I would recommend removing the "everyone is running as root" statement
> as it is not generally true.
>
> 
> Shakeel

Good point, thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-03-04 16:39 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-03  1:38 [PATCH 0/2] Support user xattrs in cgroupfs Daniel Xu
2020-03-03  1:39 ` [PATCH 1/2] kernfs: Add option to enable user xattrs Daniel Xu
2020-03-03 13:19   ` Tejun Heo
2020-03-03 19:19     ` Daniel Xu
2020-03-03 19:37       ` Daniel Xu
2020-03-04 16:39         ` Tejun Heo
2020-03-03  1:39 ` [PATCH 2/2] cgroupfs: Support " Daniel Xu
2020-03-03 16:50 ` [PATCH 0/2] Support user xattrs in cgroupfs Shakeel Butt
     [not found] <CALvZod5m3otRRqcLBebbgiZbhoYWAMbMg+ESkacJuj64OP =H4Q@mail.gmail.com>
2020-03-03 19:20 ` Daniel Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).