* [PATCH 0/2] fs: fix capable() call in simple_xattr_list()
@ 2022-09-01 15:26 Ondrej Mosnacek
  2022-09-01 15:26 ` [PATCH 1/2] fs: convert simple_xattrs to RCU list Ondrej Mosnacek
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Ondrej Mosnacek @ 2022-09-01 15:26 UTC
  To: Alexander Viro
  Cc: linux-fsdevel, linux-security-module, selinux, rcu, linux-kernel,
	Martin Pitt

The goal of these patches is to avoid calling capable() unconditionally
in simple_xattr_list(), which causes issues under SELinux (see
explanation in the second patch).

The first patch tries to make this change safer by converting
simple_xattrs to use the RCU mechanism, so that capable() is not called
while the xattrs->lock is held. I didn't find evidence that this is an
issue in the current code, but it can't hurt to make that change
either way (and it was quite straightforward).

Ondrej Mosnacek (2):
  fs: convert simple_xattrs to RCU list
  fs: don't call capable() prematurely in simple_xattr_list()

 fs/xattr.c            | 39 +++++++++++++++++++++++----------------
 include/linux/xattr.h |  1 +
 2 files changed, 24 insertions(+), 16 deletions(-)

-- 
2.37.2



* [PATCH 1/2] fs: convert simple_xattrs to RCU list
  2022-09-01 15:26 [PATCH 0/2] fs: fix capable() call in simple_xattr_list() Ondrej Mosnacek
@ 2022-09-01 15:26 ` Ondrej Mosnacek
  2022-09-01 15:26 ` [PATCH 2/2] fs: don't call capable() prematurely in simple_xattr_list() Ondrej Mosnacek
  2022-09-05  9:08 ` [PATCH 0/2] fs: fix capable() call " Christian Brauner
  2 siblings, 0 replies; 11+ messages in thread
From: Ondrej Mosnacek @ 2022-09-01 15:26 UTC
  To: Alexander Viro
  Cc: linux-fsdevel, linux-security-module, selinux, rcu, linux-kernel,
	Martin Pitt

Use the RCU list mechanism instead of a simple lock to access/modify
simple_xattrs. The performance benefit is probably negligible, but it
will help avoid lock nesting concerns for an upcoming patch.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
 fs/xattr.c            | 36 ++++++++++++++++++++++--------------
 include/linux/xattr.h |  1 +
 2 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/fs/xattr.c b/fs/xattr.c
index a1f4998bc6be..fad2344f1168 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -22,6 +22,8 @@
 #include <linux/audit.h>
 #include <linux/vmalloc.h>
 #include <linux/posix_acl_xattr.h>
+#include <linux/rculist.h>
+#include <linux/rcupdate.h>
 
 #include <linux/uaccess.h>
 
@@ -1030,8 +1032,8 @@ int simple_xattr_get(struct simple_xattrs *xattrs, const char *name,
 	struct simple_xattr *xattr;
 	int ret = -ENODATA;
 
-	spin_lock(&xattrs->lock);
-	list_for_each_entry(xattr, &xattrs->head, list) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(xattr, &xattrs->head, list) {
 		if (strcmp(name, xattr->name))
 			continue;
 
@@ -1044,10 +1046,18 @@ int simple_xattr_get(struct simple_xattrs *xattrs, const char *name,
 		}
 		break;
 	}
-	spin_unlock(&xattrs->lock);
+	rcu_read_unlock();
 	return ret;
 }
 
+static void simple_xattr_free_rcu(struct rcu_head *rcu)
+{
+	struct simple_xattr *xattr = container_of(rcu, struct simple_xattr, rcu);
+
+	kfree(xattr->name);
+	kvfree(xattr);
+}
+
 /**
  * simple_xattr_set - xattr SET operation for in-memory/pseudo filesystems
  * @xattrs: target simple_xattr list
@@ -1094,11 +1104,11 @@ int simple_xattr_set(struct simple_xattrs *xattrs, const char *name,
 				xattr = new_xattr;
 				err = -EEXIST;
 			} else if (new_xattr) {
-				list_replace(&xattr->list, &new_xattr->list);
+				list_replace_rcu(&xattr->list, &new_xattr->list);
 				if (removed_size)
 					*removed_size = xattr->size;
 			} else {
-				list_del(&xattr->list);
+				list_del_rcu(&xattr->list);
 				if (removed_size)
 					*removed_size = xattr->size;
 			}
@@ -1109,15 +1119,13 @@ int simple_xattr_set(struct simple_xattrs *xattrs, const char *name,
 		xattr = new_xattr;
 		err = -ENODATA;
 	} else {
-		list_add(&new_xattr->list, &xattrs->head);
+		list_add_rcu(&new_xattr->list, &xattrs->head);
 		xattr = NULL;
 	}
 out:
 	spin_unlock(&xattrs->lock);
-	if (xattr) {
-		kfree(xattr->name);
-		kvfree(xattr);
-	}
+	if (xattr)
+		call_rcu(&xattr->rcu, simple_xattr_free_rcu);
 	return err;
 
 }
@@ -1169,8 +1177,8 @@ ssize_t simple_xattr_list(struct inode *inode, struct simple_xattrs *xattrs,
 	}
 #endif
 
-	spin_lock(&xattrs->lock);
-	list_for_each_entry(xattr, &xattrs->head, list) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(xattr, &xattrs->head, list) {
 		/* skip "trusted." attributes for unprivileged callers */
 		if (!trusted && xattr_is_trusted(xattr->name))
 			continue;
@@ -1179,7 +1187,7 @@ ssize_t simple_xattr_list(struct inode *inode, struct simple_xattrs *xattrs,
 		if (err)
 			break;
 	}
-	spin_unlock(&xattrs->lock);
+	rcu_read_unlock();
 
 	return err ? err : size - remaining_size;
 }
@@ -1191,6 +1199,6 @@ void simple_xattr_list_add(struct simple_xattrs *xattrs,
 			   struct simple_xattr *new_xattr)
 {
 	spin_lock(&xattrs->lock);
-	list_add(&new_xattr->list, &xattrs->head);
+	list_add_rcu(&new_xattr->list, &xattrs->head);
 	spin_unlock(&xattrs->lock);
 }
diff --git a/include/linux/xattr.h b/include/linux/xattr.h
index 979a9d3e5bfb..3236c469aaac 100644
--- a/include/linux/xattr.h
+++ b/include/linux/xattr.h
@@ -86,6 +86,7 @@ struct simple_xattrs {
 
 struct simple_xattr {
 	struct list_head list;
+	struct rcu_head rcu;
 	char *name;
 	size_t size;
 	char value[];
-- 
2.37.2



* [PATCH 2/2] fs: don't call capable() prematurely in simple_xattr_list()
  2022-09-01 15:26 [PATCH 0/2] fs: fix capable() call in simple_xattr_list() Ondrej Mosnacek
  2022-09-01 15:26 ` [PATCH 1/2] fs: convert simple_xattrs to RCU list Ondrej Mosnacek
@ 2022-09-01 15:26 ` Ondrej Mosnacek
  2022-09-05  9:08 ` [PATCH 0/2] fs: fix capable() call " Christian Brauner
  2 siblings, 0 replies; 11+ messages in thread
From: Ondrej Mosnacek @ 2022-09-01 15:26 UTC
  To: Alexander Viro
  Cc: linux-fsdevel, linux-security-module, selinux, rcu, linux-kernel,
	Martin Pitt

Calling capable() pre-emptively causes a problem for SELinux, which will
normally log a denial whenever capable() is called and the task's
SELinux context doesn't have the corresponding capability permission
allowed.

With the current implementation of simple_xattr_list(), any time a
process without CAP_SYS_ADMIN calls listxattr(2) or similar on a
filesystem that uses this function, a denial is logged even if there are
no trusted.* xattrs on the inode in question. In such a situation, policy
writers are forced to choose one of the following options:

1. Grant CAP_SYS_ADMIN to the given SELinux domain even though it
   doesn't really need it. (Not good for security.)
2. Add a rule to the policy that will silence CAP_SYS_ADMIN denials for
   the given domain without actually granting it. (Not good, because now
   denials that make actual difference may be hidden, making
   troubleshooting harder.)
3. Do nothing and let the denials appear. (Not good, because the audit
   spam could obscure actual important denials.)

To avoid this misery, only call capable() when an actual trusted.* xattr
is encountered. This is somewhat less optimal, since capable() will now
be called once for each trusted.* xattr, but that's pretty unlikely to
matter in practice.

Even after this fix, any process listing xattrs on an inode that has one
or more trusted.* ones may trigger an "irrelevant" denial if it doesn't
actually care about the trusted.* xattrs, but such cases should be rare,
so silencing the denial in those cases would not be as big a deal.

Fixes: b09e0fa4b4ea ("tmpfs: implement generic xattr support")
Reported-by: Martin Pitt <mpitt@redhat.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
 fs/xattr.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/xattr.c b/fs/xattr.c
index fad2344f1168..84a459ac779a 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -1155,7 +1155,6 @@ static int xattr_list_one(char **buffer, ssize_t *remaining_size,
 ssize_t simple_xattr_list(struct inode *inode, struct simple_xattrs *xattrs,
 			  char *buffer, size_t size)
 {
-	bool trusted = capable(CAP_SYS_ADMIN);
 	struct simple_xattr *xattr;
 	ssize_t remaining_size = size;
 	int err = 0;
@@ -1180,7 +1179,7 @@ ssize_t simple_xattr_list(struct inode *inode, struct simple_xattrs *xattrs,
 	rcu_read_lock();
 	list_for_each_entry_rcu(xattr, &xattrs->head, list) {
 		/* skip "trusted." attributes for unprivileged callers */
-		if (!trusted && xattr_is_trusted(xattr->name))
+		if (xattr_is_trusted(xattr->name) && !capable(CAP_SYS_ADMIN))
 			continue;
 
 		err = xattr_list_one(&buffer, &remaining_size, xattr->name);
-- 
2.37.2



* Re: [PATCH 0/2] fs: fix capable() call in simple_xattr_list()
  2022-09-01 15:26 [PATCH 0/2] fs: fix capable() call in simple_xattr_list() Ondrej Mosnacek
  2022-09-01 15:26 ` [PATCH 1/2] fs: convert simple_xattrs to RCU list Ondrej Mosnacek
  2022-09-01 15:26 ` [PATCH 2/2] fs: don't call capable() prematurely in simple_xattr_list() Ondrej Mosnacek
@ 2022-09-05  9:08 ` Christian Brauner
  2022-09-05 10:15   ` Ondrej Mosnacek
  2 siblings, 1 reply; 11+ messages in thread
From: Christian Brauner @ 2022-09-05  9:08 UTC
  To: Ondrej Mosnacek
  Cc: Alexander Viro, linux-fsdevel, linux-security-module, selinux,
	rcu, linux-kernel, Martin Pitt, Vasily Averin

On Thu, Sep 01, 2022 at 05:26:30PM +0200, Ondrej Mosnacek wrote:
> The goal of these patches is to avoid calling capable() unconditionally
> in simple_xattr_list(), which causes issues under SELinux (see
> explanation in the second patch).
> 
> The first patch tries to make this change safer by converting
> simple_xattrs to use the RCU mechanism, so that capable() is not called
> while the xattrs->lock is held. I didn't find evidence that this is an
> issue in the current code, but it can't hurt to make that change
> either way (and it was quite straightforward).

Hey Ondrej,

There's another patchset I'd like to see first which switches from a
linked list to an rbtree to get rid of performance issues in this code
that can be used to DoS tmpfs in containers:

https://lore.kernel.org/lkml/d73bd478-e373-f759-2acb-2777f6bba06f@openvz.org

I don't think Vasily has time to continue with this, so I'll just pick
it up, hopefully this week or the week after LPC.

Christian


* Re: [PATCH 0/2] fs: fix capable() call in simple_xattr_list()
  2022-09-05  9:08 ` [PATCH 0/2] fs: fix capable() call " Christian Brauner
@ 2022-09-05 10:15   ` Ondrej Mosnacek
  2022-09-05 15:30     ` Christian Brauner
  0 siblings, 1 reply; 11+ messages in thread
From: Ondrej Mosnacek @ 2022-09-05 10:15 UTC
  To: Christian Brauner
  Cc: Alexander Viro, Linux FS Devel, Linux Security Module list,
	SElinux list, rcu, Linux kernel mailing list, Martin Pitt,
	Vasily Averin

On Mon, Sep 5, 2022 at 11:08 AM Christian Brauner <brauner@kernel.org> wrote:
> On Thu, Sep 01, 2022 at 05:26:30PM +0200, Ondrej Mosnacek wrote:
> > The goal of these patches is to avoid calling capable() unconditionally
> > in simple_xattr_list(), which causes issues under SELinux (see
> > explanation in the second patch).
> >
> > The first patch tries to make this change safer by converting
> > simple_xattrs to use the RCU mechanism, so that capable() is not called
> > while the xattrs->lock is held. I didn't find evidence that this is an
> > issue in the current code, but it can't hurt to make that change
> > either way (and it was quite straightforward).
>
> Hey Ondrej,
>
> There's another patchset I'd like to see first which switches from a
> linked list to an rbtree to get rid of performance issues in this code
> that can be used to DoS tmpfs in containers:
>
> https://lore.kernel.org/lkml/d73bd478-e373-f759-2acb-2777f6bba06f@openvz.org
>
> I don't think Vasily has time to continue with this, so I'll just pick
> it up, hopefully this week or the week after LPC.

Hm... does rbtree support lockless traversal? Because if not, that
would make it impossible to fix the issue without calling capable()
inside the critical section (or doing something complicated), AFAICT.
Would rhashtable be a workable alternative to rbtree for this use
case? Skimming <linux/rhashtable.h> it seems to support both lockless
lookup and traversal using RCU. And according to its manpage,
*listxattr(2) doesn't guarantee that the returned names are sorted.
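
Since the order of the returned names is not guaranteed, a userspace
consumer that wants a stable order has to sort the names itself. A
minimal sketch, assuming a buffer in the listxattr(2) layout
(NUL-terminated names stored back to back) whose last name is
NUL-terminated within len; sorted_xattr_names() and cmp_names() are
hypothetical helpers, not an existing API:

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for an array of C string pointers. */
static int cmp_names(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/*
 * Split a listxattr(2)-style buffer into 'out' (at most 'max' entries)
 * and sort the pointers by name. Returns the number of names stored.
 * Assumes every name in 'buf' is NUL-terminated within 'len' bytes.
 */
static size_t sorted_xattr_names(const char *buf, size_t len,
				 const char **out, size_t max)
{
	size_t n = 0;

	for (size_t off = 0; off < len && n < max; off += strlen(buf + off) + 1)
		out[n++] = buf + off;

	qsort(out, n, sizeof(*out), cmp_names);
	return n;
}
```

The pointers in 'out' refer into 'buf', so the buffer has to stay alive
for as long as the sorted names are used.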

--
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.



* Re: [PATCH 0/2] fs: fix capable() call in simple_xattr_list()
  2022-09-05 10:15   ` Ondrej Mosnacek
@ 2022-09-05 15:30     ` Christian Brauner
  2022-11-02 18:24       ` Christian Brauner
  0 siblings, 1 reply; 11+ messages in thread
From: Christian Brauner @ 2022-09-05 15:30 UTC
  To: Ondrej Mosnacek
  Cc: Alexander Viro, Linux FS Devel, Linux Security Module list,
	SElinux list, rcu, Linux kernel mailing list, Martin Pitt,
	Vasily Averin

On Mon, Sep 05, 2022 at 12:15:01PM +0200, Ondrej Mosnacek wrote:
> On Mon, Sep 5, 2022 at 11:08 AM Christian Brauner <brauner@kernel.org> wrote:
> > On Thu, Sep 01, 2022 at 05:26:30PM +0200, Ondrej Mosnacek wrote:
> > > The goal of these patches is to avoid calling capable() unconditionally
> > > in simple_xattr_list(), which causes issues under SELinux (see
> > > explanation in the second patch).
> > >
> > > The first patch tries to make this change safer by converting
> > > simple_xattrs to use the RCU mechanism, so that capable() is not called
> > > while the xattrs->lock is held. I didn't find evidence that this is an
> > > issue in the current code, but it can't hurt to make that change
> > > either way (and it was quite straightforward).
> >
> > Hey Ondrej,
> >
> > There's another patchset I'd like to see first which switches from a
> > linked list to an rbtree to get rid of performance issues in this code
> > that can be used to DoS tmpfs in containers:
> >
> > https://lore.kernel.org/lkml/d73bd478-e373-f759-2acb-2777f6bba06f@openvz.org
> >
> > I don't think Vasily has time to continue with this, so I'll just pick
> > it up, hopefully this week or the week after LPC.
> 
> Hm... does rbtree support lockless traversal? Because if not, that

The rfc that Vasily sent didn't allow for that at least.

> would make it impossible to fix the issue without calling capable()
> inside the critical section (or doing something complicated), AFAICT.
> Would rhashtable be a workable alternative to rbtree for this use
> case? Skimming <linux/rhashtable.h> it seems to support both lockless
> lookup and traversal using RCU. And according to its manpage,
> *listxattr(2) doesn't guarantee that the returned names are sorted.

I've never used the rhashtable infrastructure in any meaningful way. All
I can say from looking at current users is that it looks like it could
work well for us here:

struct simple_xattr {
	struct rhlist_head rhlist_head;
	char *name;
	size_t size;
	char value[];
};

static const struct rhashtable_params simple_xattr_rhashtable = {
	.head_offset = offsetof(struct simple_xattr, rhlist_head),
	.key_offset = offsetof(struct simple_xattr, name),
	/* ... remaining fields not spelled out here ... */
};

or something like this.


* Re: [PATCH 0/2] fs: fix capable() call in simple_xattr_list()
  2022-09-05 15:30     ` Christian Brauner
@ 2022-11-02 18:24       ` Christian Brauner
  2022-11-03  1:59         ` Serge E. Hallyn
  2022-11-03  9:04         ` Ondrej Mosnacek
  0 siblings, 2 replies; 11+ messages in thread
From: Christian Brauner @ 2022-11-02 18:24 UTC
  To: Ondrej Mosnacek, Vasily Averin
  Cc: Alexander Viro, Linux FS Devel, Linux Security Module list,
	SElinux list, rcu, Martin Pitt

On Mon, Sep 05, 2022 at 05:30:36PM +0200, Christian Brauner wrote:
> On Mon, Sep 05, 2022 at 12:15:01PM +0200, Ondrej Mosnacek wrote:
> > On Mon, Sep 5, 2022 at 11:08 AM Christian Brauner <brauner@kernel.org> wrote:
> > > On Thu, Sep 01, 2022 at 05:26:30PM +0200, Ondrej Mosnacek wrote:
> > > > The goal of these patches is to avoid calling capable() unconditionally
> > > > in simple_xattr_list(), which causes issues under SELinux (see
> > > > explanation in the second patch).
> > > >
> > > > The first patch tries to make this change safer by converting
> > > > simple_xattrs to use the RCU mechanism, so that capable() is not called
> > > > while the xattrs->lock is held. I didn't find evidence that this is an
> > > > issue in the current code, but it can't hurt to make that change
> > > > either way (and it was quite straightforward).
> > >
> > > Hey Ondrej,
> > >
> > > There's another patchset I'd like to see first which switches from a
> > > linked list to an rbtree to get rid of performance issues in this code
> > > that can be used to DoS tmpfs in containers:
> > >
> > > https://lore.kernel.org/lkml/d73bd478-e373-f759-2acb-2777f6bba06f@openvz.org
> > >
> > > I don't think Vasily has time to continue with this, so I'll just pick
> > > it up, hopefully this week or the week after LPC.
> > 
> > Hm... does rbtree support lockless traversal? Because if not, that
> 
> The rfc that Vasily sent didn't allow for that at least.
> 
> > would make it impossible to fix the issue without calling capable()
> > inside the critical section (or doing something complicated), AFAICT.
> > Would rhashtable be a workable alternative to rbtree for this use
> > case? Skimming <linux/rhashtable.h> it seems to support both lockless
> > lookup and traversal using RCU. And according to its manpage,
> > *listxattr(2) doesn't guarantee that the returned names are sorted.
> 
> I've never used the rhashtable infrastructure in any meaningful way. All
> I can say from looking at current users is that it looks like it could
> work well for us here:
> 
> struct simple_xattr {
> 	struct rhlist_head rhlist_head;
> 	char *name;
> 	size_t size;
> 	char value[];
> };
> 
> static const struct rhashtable_params simple_xattr_rhashtable = {
> 	.head_offset = offsetof(struct simple_xattr, rhlist_head),
> 	.key_offset = offsetof(struct simple_xattr, name),
> 	/* ... remaining fields not spelled out here ... */
> };
> 
> or something like this.

I have a patch in rough shape that converts struct simple_xattr to use
an rhashtable:

https://gitlab.com/brauner/linux/-/commits/fs.xattr.simple.rework/

It has seen only light testing, has few useful comments, and has no
meaningful commit message yet, but I'll get to that.

Even though your issue is orthogonal to the performance issues I'm
trying to fix, I went back to your patch, Ondrej, to apply it on top.
But I think it has one problem.

AFAICT, by moving the capable() call from the top of the function into
the actual traversal portion, an unprivileged user can potentially learn
whether a file has trusted.* xattrs set, at least if dmesg isn't
restricted on the kernel. That may very well be the reason why the
capable() call is at the top.
(Because the straightforward fix for this would otherwise be to just
call capable() a single time if at least one trusted xattr is
encountered and store the result. That's pretty easy to do by turning
the trusted variable into an int, setting it to -1, and only calling
capable() and storing the result if it's still -1 when a trusted xattr
is found.)
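
For illustration only, the tri-state variable described in the
parenthesis could look roughly like the userspace model below. It is a
sketch, not kernel code: fake_capable(), name_is_trusted() and
count_visible() are hypothetical stand-ins for capable(),
xattr_is_trusted() and the listing loop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for capable(); counts how often it is called. */
static int capable_calls;
static bool fake_capable_result;

static bool fake_capable(void)
{
	capable_calls++;
	return fake_capable_result;
}

/* Userspace stand-in for xattr_is_trusted(): matches the "trusted." prefix. */
static bool name_is_trusted(const char *name)
{
	return strncmp(name, "trusted.", strlen("trusted.")) == 0;
}

/*
 * Count the names a caller would get back. 'trusted' is tri-state:
 * -1 = not checked yet, 0 = denied, 1 = allowed.
 */
static size_t count_visible(const char *const *names, size_t n)
{
	int trusted = -1;
	size_t visible = 0;

	for (size_t i = 0; i < n; i++) {
		if (name_is_trusted(names[i])) {
			if (trusted < 0)
				trusted = fake_capable();	/* first trusted.* name only */
			if (!trusted)
				continue;
		}
		visible++;
	}
	return visible;
}
```

In this model the check runs at most once per listing, and not at all
when no trusted.* name is present.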

One option to fix all of that is to switch simple_xattr_list() to use

        ns_capable_noaudit(&init_user_ns, CAP_SYS_ADMIN)

which doesn't generate an audit event.

I think this is even the correct thing to do as listing xattrs isn't a
targeted operation. IOW, if the user had used getxattr() to request
a trusted.* xattr then logging a denial makes sense as the user
explicitly wanted to retrieve a trusted.* xattr. But if the user just
requested to list all xattrs then silently skipping trusted.* xattrs
without logging an explicit denial makes sense.
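
A rough userspace model of the difference, assuming the check stays at
the top of the function and only its auditing behaviour changes.
model_capable(), model_list() and the audit_records counter are
hypothetical; they only mimic an audited versus a non-audited check for
a caller that lacks CAP_SYS_ADMIN:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static unsigned int audit_records;	/* hypothetical audit log, modeled as a counter */

/* Models a capability check for a caller without CAP_SYS_ADMIN. */
static bool model_capable(bool audit)
{
	if (audit)
		audit_records++;	/* the audited variant logs the denial */
	return false;
}

static bool name_is_trusted(const char *name)
{
	return strncmp(name, "trusted.", strlen("trusted.")) == 0;
}

/* Returns how many names are listed; the check happens once, up front. */
static size_t model_list(const char *const *names, size_t n, bool audit)
{
	bool trusted = model_capable(audit);
	size_t listed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!trusted && name_is_trusted(names[i]))
			continue;
		listed++;
	}
	return listed;
}
```

Both variants list the same names in this model; the only observable
difference is whether a denial record is produced.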

Does that sound acceptable?


* Re: [PATCH 0/2] fs: fix capable() call in simple_xattr_list()
  2022-11-02 18:24       ` Christian Brauner
@ 2022-11-03  1:59         ` Serge E. Hallyn
  2022-11-03  9:04         ` Ondrej Mosnacek
  1 sibling, 0 replies; 11+ messages in thread
From: Serge E. Hallyn @ 2022-11-03  1:59 UTC
  To: Christian Brauner
  Cc: Ondrej Mosnacek, Vasily Averin, Alexander Viro, Linux FS Devel,
	Linux Security Module list, SElinux list, rcu, Martin Pitt

On Wed, Nov 02, 2022 at 07:24:51PM +0100, Christian Brauner wrote:
> On Mon, Sep 05, 2022 at 05:30:36PM +0200, Christian Brauner wrote:
> > On Mon, Sep 05, 2022 at 12:15:01PM +0200, Ondrej Mosnacek wrote:
> > > On Mon, Sep 5, 2022 at 11:08 AM Christian Brauner <brauner@kernel.org> wrote:
> > > > On Thu, Sep 01, 2022 at 05:26:30PM +0200, Ondrej Mosnacek wrote:
> > > > > The goal of these patches is to avoid calling capable() unconditionally
> > > > > in simple_xattr_list(), which causes issues under SELinux (see
> > > > > explanation in the second patch).
> > > > >
> > > > > The first patch tries to make this change safer by converting
> > > > > simple_xattrs to use the RCU mechanism, so that capable() is not called
> > > > > while the xattrs->lock is held. I didn't find evidence that this is an
> > > > > issue in the current code, but it can't hurt to make that change
> > > > > either way (and it was quite straightforward).
> > > >
> > > > Hey Ondrej,
> > > >
> > > > There's another patchset I'd like to see first which switches from a
> > > > linked list to an rbtree to get rid of performance issues in this code
> > > > that can be used to DoS tmpfs in containers:
> > > >
> > > > https://lore.kernel.org/lkml/d73bd478-e373-f759-2acb-2777f6bba06f@openvz.org
> > > >
> > > > I don't think Vasily has time to continue with this, so I'll just pick
> > > > it up, hopefully this week or the week after LPC.
> > > 
> > > Hm... does rbtree support lockless traversal? Because if not, that
> > 
> > The rfc that Vasily sent didn't allow for that at least.
> > 
> > > would make it impossible to fix the issue without calling capable()
> > > inside the critical section (or doing something complicated), AFAICT.
> > > Would rhashtable be a workable alternative to rbtree for this use
> > > case? Skimming <linux/rhashtable.h> it seems to support both lockless
> > > lookup and traversal using RCU. And according to its manpage,
> > > *listxattr(2) doesn't guarantee that the returned names are sorted.
> > 
> > I've never used the rhashtable infrastructure in any meaningful way. All
> > I can say from looking at current users is that it looks like it could
> > work well for us here:
> > 
> > struct simple_xattr {
> > 	struct rhlist_head rhlist_head;
> > 	char *name;
> > 	size_t size;
> > 	char value[];
> > };
> > 
> > static const struct rhashtable_params simple_xattr_rhashtable = {
> > 	.head_offset = offsetof(struct simple_xattr, rhlist_head),
> > 	.key_offset = offsetof(struct simple_xattr, name),
> > 	/* ... remaining fields not spelled out here ... */
> > };
> > 
> > or something like this.
> 
> I have a patch in rough shape that converts struct simple_xattr to use
> an rhashtable:
> 
> https://gitlab.com/brauner/linux/-/commits/fs.xattr.simple.rework/
> 
> It has seen only light testing, has few useful comments, and has no
> meaningful commit message yet, but I'll get to that.
> 
> Even though your issue is orthogonal to the performance issues I'm
> trying to fix, I went back to your patch, Ondrej, to apply it on top.
> But I think it has one problem.
> 
> AFAICT, by moving the capable() call from the top of the function into
> the actual traversal portion, an unprivileged user can potentially learn
> whether a file has trusted.* xattrs set, at least if dmesg isn't
> restricted on the kernel. That may very well be the reason why the
> capable() call is at the top.
> (Because the straightforward fix for this would otherwise be to just
> call capable() a single time if at least one trusted xattr is
> encountered and store the result. That's pretty easy to do by turning
> the trusted variable into an int, setting it to -1, and only calling
> capable() and storing the result if it's still -1 when a trusted xattr
> is found.)
> 
> One option to fix all of that is to switch simple_xattr_list() to use
> 
>         ns_capable_noaudit(&init_user_ns, CAP_SYS_ADMIN)
> 
> which doesn't generate an audit event.
> 
> I think this is even the correct thing to do as listing xattrs isn't a
> targeted operation. IOW, if the user had used getxattr() to request
> a trusted.* xattr then logging a denial makes sense as the user
> explicitly wanted to retrieve a trusted.* xattr. But if the user just
> requested to list all xattrs then silently skipping trusted.* xattrs
> without logging an explicit denial makes sense.
> 
> Does that sound acceptable?

Agreed, auditing that seems like unwanted noise.


* Re: [PATCH 0/2] fs: fix capable() call in simple_xattr_list()
  2022-11-02 18:24       ` Christian Brauner
  2022-11-03  1:59         ` Serge E. Hallyn
@ 2022-11-03  9:04         ` Ondrej Mosnacek
  2022-11-03  9:12           ` Christian Brauner
  1 sibling, 1 reply; 11+ messages in thread
From: Ondrej Mosnacek @ 2022-11-03  9:04 UTC
  To: Christian Brauner
  Cc: Vasily Averin, Alexander Viro, Linux FS Devel,
	Linux Security Module list, SElinux list, rcu, Martin Pitt

On Wed, Nov 2, 2022 at 7:25 PM Christian Brauner <brauner@kernel.org> wrote:
> On Mon, Sep 05, 2022 at 05:30:36PM +0200, Christian Brauner wrote:
> > On Mon, Sep 05, 2022 at 12:15:01PM +0200, Ondrej Mosnacek wrote:
> > > On Mon, Sep 5, 2022 at 11:08 AM Christian Brauner <brauner@kernel.org> wrote:
> > > > On Thu, Sep 01, 2022 at 05:26:30PM +0200, Ondrej Mosnacek wrote:
> > > > > The goal of these patches is to avoid calling capable() unconditionally
> > > > > in simple_xattr_list(), which causes issues under SELinux (see
> > > > > explanation in the second patch).
> > > > >
> > > > > The first patch tries to make this change safer by converting
> > > > > simple_xattrs to use the RCU mechanism, so that capable() is not called
> > > > > while the xattrs->lock is held. I didn't find evidence that this is an
> > > > > issue in the current code, but it can't hurt to make that change
> > > > > either way (and it was quite straightforward).
> > > >
> > > > Hey Ondrej,
> > > >
> > > > There's another patchset I'd like to see first which switches from a
> > > > linked list to an rbtree to get rid of performance issues in this code
> > > > that can be used to DoS tmpfs in containers:
> > > >
> > > > https://lore.kernel.org/lkml/d73bd478-e373-f759-2acb-2777f6bba06f@openvz.org
> > > >
> > > > I don't think Vasily has time to continue with this, so I'll just pick
> > > > it up, hopefully this week or the week after LPC.
> > >
> > > Hm... does rbtree support lockless traversal? Because if not, that
> >
> > The rfc that Vasily sent didn't allow for that at least.
> >
> > > would make it impossible to fix the issue without calling capable()
> > > inside the critical section (or doing something complicated), AFAICT.
> > > Would rhashtable be a workable alternative to rbtree for this use
> > > case? Skimming <linux/rhashtable.h> it seems to support both lockless
> > > lookup and traversal using RCU. And according to its manpage,
> > > *listxattr(2) doesn't guarantee that the returned names are sorted.
> >
> > I've never used the rhashtable infrastructure in any meaningful way. All
> > I can say from looking at current users is that it looks like it could
> > work well for us here:
> >
> > struct simple_xattr {
> >       struct rhlist_head rhlist_head;
> >       char *name;
> >       size_t size;
> >       char value[];
> > };
> >
> > static const struct rhashtable_params simple_xattr_rhashtable = {
> >       .head_offset = offsetof(struct simple_xattr, rhlist_head),
> >       .key_offset = offsetof(struct simple_xattr, name),
> >       /* ... remaining fields not spelled out here ... */
> > };
> >
> > or something like this.
>
> I have a patch in rough shape that converts struct simple_xattr to use
> an rhashtable:
>
> https://gitlab.com/brauner/linux/-/commits/fs.xattr.simple.rework/
>
> It has seen only light testing, has few useful comments, and has no
> meaningful commit message yet, but I'll get to that.

Looks mostly good at first glance. I left comments for some minor
stuff I noticed.

> Even though your issue is orthogonal to the performance issues I'm
> trying to fix, I went back to your patch, Ondrej, to apply it on top.
> But I think it has one problem.
>
> AFAICT, by moving the capable() call from the top of the function into
> the actual traversal portion, an unprivileged user can potentially learn
> whether a file has trusted.* xattrs set, at least if dmesg isn't
> restricted on the kernel. That may very well be the reason why the
> capable() call is at the top.

Technically it would be possible, for example with SELinux if the
audit daemon is dead. Not a likely situation, but I agree it's better
to be safe.

> (Because the straightforward fix for this would otherwise be to just
> call capable() a single time if at least one trusted xattr is
> encountered and store the result. That's pretty easy to do by turning
> the trusted variable into an int, setting it to -1, and only calling
> capable() and storing the result if it's still -1 when a trusted xattr
> is found.)

That would also run into the conundrum of holding a lock while
(potentially) calling into the LSM subsystem. And would it even fix
the information leak? Unless I'm missing something it would only
prevent a leak of the trusted xattr count, but not the presence of any
trusted xattr.
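
To make that concrete, here is a small model of how many denials an
unprivileged listing would log under each placement of the check. It is
only a sketch under the assumptions discussed in this thread; the enum
and denials_logged() are hypothetical names:

```c
#include <stddef.h>

enum check_strategy {
	CHECK_UP_FRONT,		/* one check before the walk (current code) */
	CHECK_PER_XATTR,	/* one check per trusted.* xattr (patch 2/2) */
	CHECK_CACHED,		/* check at the first trusted.* xattr, result stored */
};

/* Denials logged for one listing by a caller without CAP_SYS_ADMIN. */
static unsigned int denials_logged(enum check_strategy s, size_t n_trusted)
{
	unsigned int denials = 0;
	int cached = -1;	/* -1 = not checked yet, 0 = denied */

	if (s == CHECK_UP_FRONT)
		denials++;	/* denied once, before any xattr is looked at */

	for (size_t i = 0; i < n_trusted; i++) {
		if (s == CHECK_PER_XATTR) {
			denials++;
		} else if (s == CHECK_CACHED && cached < 0) {
			cached = 0;	/* denied; remember the answer */
			denials++;
		}
	}
	return denials;
}
```

With the up-front check the count is the same whether or not the inode
has trusted.* xattrs. The per-xattr check tracks their number, and the
cached check still differs between "none" and "at least one".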

> One option to fix all of that is to switch simple_xattr_list() to use
>
>         ns_capable_noaudit(&init_user_ns, CAP_SYS_ADMIN)
>
> which doesn't generate an audit event.
>
> I think this is even the correct thing to do as listing xattrs isn't a
> targeted operation. IOW, if the user had used getxattr() to request
> a trusted.* xattr then logging a denial makes sense as the user
> explicitly wanted to retrieve a trusted.* xattr. But if the user just
> requested to list all xattrs then silently skipping trusted.* xattrs
> without logging an explicit denial makes sense.
>
> Does that sound acceptable?

Yes, I can't see any reason why that wouldn't be the best solution.
Why haven't I thought of that? :)

I guess you will want to submit a patch for it along with your
rhashtable patch to avoid a conflict? Or would you like me to submit
it separately?


--
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.



* Re: [PATCH 0/2] fs: fix capable() call in simple_xattr_list()
  2022-11-03  9:04         ` Ondrej Mosnacek
@ 2022-11-03  9:12           ` Christian Brauner
  2022-11-03 10:51             ` Ondrej Mosnacek
  0 siblings, 1 reply; 11+ messages in thread
From: Christian Brauner @ 2022-11-03  9:12 UTC
  To: Ondrej Mosnacek
  Cc: Vasily Averin, Alexander Viro, Linux FS Devel,
	Linux Security Module list, SElinux list, rcu, Martin Pitt

On Thu, Nov 03, 2022 at 10:04:25AM +0100, Ondrej Mosnacek wrote:
> On Wed, Nov 2, 2022 at 7:25 PM Christian Brauner <brauner@kernel.org> wrote:
> > On Mon, Sep 05, 2022 at 05:30:36PM +0200, Christian Brauner wrote:
> > > On Mon, Sep 05, 2022 at 12:15:01PM +0200, Ondrej Mosnacek wrote:
> > > > On Mon, Sep 5, 2022 at 11:08 AM Christian Brauner <brauner@kernel.org> wrote:
> > > > > On Thu, Sep 01, 2022 at 05:26:30PM +0200, Ondrej Mosnacek wrote:
> > > > > > The goal of these patches is to avoid calling capable() unconditionally
> > > > > > in simple_xattr_list(), which causes issues under SELinux (see
> > > > > > explanation in the second patch).
> > > > > >
> > > > > > The first patch tries to make this change safer by converting
> > > > > > simple_xattrs to use the RCU mechanism, so that capable() is not called
> > > > > > while the xattrs->lock is held. I didn't find evidence that this is an
> > > > > > issue in the current code, but it can't hurt to make that change
> > > > > > either way (and it was quite straightforward).
> > > > >
> > > > > Hey Ondrey,
> > > > >
> > > > > There's another patchset I'd like to see first which switches from a
> > > > > linked list to an rbtree to get rid of performance issues in this code
> > > > > that can be used to dos tmpfs in containers:
> > > > >
> > > > > https://lore.kernel.org/lkml/d73bd478-e373-f759-2acb-2777f6bba06f@openvz.org
> > > > >
> > > > > I don't think Vasily has time to continue with this so I'll just pick it
> > > > > up hopefully this or the week after LPC.
> > > >
> > > > Hm... does rbtree support lockless traversal? Because if not, that
> > >
> > > The rfc that Vasily sent didn't allow for that at least.
> > >
> > > > would make it impossible to fix the issue without calling capable()
> > > > inside the critical section (or doing something complicated), AFAICT.
> > > > Would rhashtable be a workable alternative to rbtree for this use
> > > > case? Skimming <linux/rhashtable.h> it seems to support both lockless
> > > > lookup and traversal using RCU. And according to its manpage,
> > > > *listxattr(2) doesn't guarantee that the returned names are sorted.
> > >
> > > I've never used the rhashtable infrastructure in any meaningful way. All
> > > I can say from looking at current users that it looks like it could work
> > > well for us here:
> > >
> > > struct simple_xattr {
> > >       struct rhlist_head rhlist_head;
> > >       char *name;
> > >       size_t size;
> > >       char value[];
> > > };
> > >
> > > static const struct rhashtable_params simple_xattr_rhashtable = {
> > >       .head_offset = offsetof(struct simple_xattr, rhlist_head),
> > >       .key_offset = offsetof(struct simple_xattr, name),
> > >
> > > or sm like this.
> >
> > I have a patch in rough shape that converts struct simple_xattr to use
> > an rhashtable:
> >
> > https://gitlab.com/brauner/linux/-/commits/fs.xattr.simple.rework/
> >
> > Light testing, not a lot of useful comments and no meaningful commit
> > message as of yet but I'll get to that.
> 
> Looks mostly good at first glance. I left comments for some minor
> stuff I noticed.
> 
> > Even though your issue is orthogonal to the performance issues I'm
> > trying to fix, I went back to your patch, Ondrej, to apply it on top.
> > But I think it has one problem.
> >
> > Afaict, by moving the capable() call from the top of the function into
> > the actual traversal portion an unprivileged user can potentially learn
> > whether a file has trusted.* xattrs set. At least if dmesg isn't
> > restricted on the kernel. That may very well be the reason why the
> > capable() call is on top.
> 
> Technically it would be possible, for example with SELinux if the
> audit daemon is dead. Not a likely situation, but I agree it's better
> to be safe.
> 
> > (Because the straightforward fix for this would be to just call
> > capable() a single time if at least one trusted xattr is encountered and
> > store the result. That's pretty easy to do by turning the trusted
> > variable into an int, setting it to -1, and only if it's -1 and a
> > trusted xattr has been found call capable() and store the result.)
> 
> That would also run into the conundrum of holding a lock while
> (potentially) calling into the LSM subsystem. And would it even fix
> the information leak? Unless I'm missing something it would only
> prevent a leak of the trusted xattr count, but not the presence of any
> trusted xattr.

No, it wouldn't. I just meant this to illustrate that with your patch we
could've made it so that capable() would've only been called once.
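
A minimal user-space sketch of the "call capable() once" idea discussed
above; it is not kernel code. The names mock_capable(), is_trusted() and
count_visible() are hypothetical, and mock_capable() merely stands in for
capable(CAP_SYS_ADMIN). It also shows why the idea does not close the
leak: the check only fires when a trusted.* name is present, so any
record the check produces still reveals that such an xattr exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for capable(CAP_SYS_ADMIN); counts its calls. */
static int mock_capable_calls;
static bool mock_capable_result;

static bool mock_capable(void)
{
	mock_capable_calls++;
	return mock_capable_result;
}

/* True if the name lives in the trusted.* namespace. */
static bool is_trusted(const char *name)
{
	return strncmp(name, "trusted.", 8) == 0;
}

/*
 * Count the names the caller may see. The capability check is deferred
 * until the first trusted.* name is met and its result is cached, so it
 * runs at most once per listing and not at all without trusted names.
 */
static size_t count_visible(const char *const *names, size_t n)
{
	int trusted = -1; /* -1: not checked yet, 0: denied, 1: allowed */
	size_t visible = 0;

	assert(names != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (is_trusted(names[i])) {
			if (trusted < 0)
				trusted = mock_capable() ? 1 : 0;
			if (!trusted)
				continue; /* hide trusted.* from this caller */
		}
		visible++;
	}
	return visible;
}
```

With a list that holds no trusted.* names, mock_capable_calls stays at
zero; with any number of trusted.* names it ends up at exactly one.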

> 
> > One option to fix all of that is to switch simple_xattr_list() to use
> >
> >         ns_capable_noaudit(&init_user_ns, CAP_SYS_ADMIN)
> >
> > which doesn't generate an audit event.
> >
> > I think this is even the correct thing to do as listing xattrs isn't a
> > targeted operation. IOW, if the user had used getxattr() to request
> > a trusted.* xattr then logging a denial makes sense as the user
> > explicitly wanted to retrieve a trusted.* xattr. But if the user just
> > requested to list all xattrs then silently skipping trusted xattrs
> > without logging an explicit denial makes sense.
> >
> > Does that sound acceptable?
> 
> Yes, I can't see any reason why that wouldn't be the best solution.
> Why haven't I thought of that? :)
> 
> I guess you will want to submit a patch for it along with your
> rhashtable patch to avoid a conflict? Or would you like me to submit
> it separately?

I think you can send a patch for this separately as we don't need to
massage the data structure for this.

I think we can reasonably give this a

Fixes: 38f38657444d ("xattr: extract simple_xattr code from tmpfs") # no backport

But note the "# no backport" as imho it isn't worth backporting this to
older kernels unless that's really desirable.
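
A user-space sketch of the behavioural difference between an audited and
a non-audited capability check in a listing function; it is not the
actual patch. mock_capable(), mock_capable_noaudit(), mock_audit_events
and list_visible() are hypothetical names that only model capable() and
ns_capable_noaudit(), and the sketch assumes the check stays at the top
of the function, as in the code under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool mock_has_cap;     /* does the caller hold the capability? */
static int mock_audit_events; /* denials that would reach the audit log */

/* Models an audited check: a denial leaves a record behind. */
static bool mock_capable(void)
{
	if (!mock_has_cap)
		mock_audit_events++;
	return mock_has_cap;
}

/* Models a non-audited check: same answer, no record. */
static bool mock_capable_noaudit(void)
{
	return mock_has_cap;
}

static bool is_trusted(const char *name)
{
	return strncmp(name, "trusted.", 8) == 0;
}

/*
 * Count the names visible to the caller. The check runs once, up front,
 * whether or not any trusted.* name exists, so its outcome cannot reveal
 * their presence. The audit flag selects which kind of check is used.
 */
static size_t list_visible(const char *const *names, size_t n, bool audit)
{
	bool trusted = audit ? mock_capable() : mock_capable_noaudit();
	size_t visible = 0;

	assert(names != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!trusted && is_trusted(names[i]))
			continue; /* silently skip trusted.* names */
		visible++;
	}
	return visible;
}
```

For an unprivileged caller, the audited variant records a denial even
when the file has no trusted.* xattrs at all, while the non-audited
variant returns the same listing and records nothing.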


* Re: [PATCH 0/2] fs: fix capable() call in simple_xattr_list()
  2022-11-03  9:12           ` Christian Brauner
@ 2022-11-03 10:51             ` Ondrej Mosnacek
  0 siblings, 0 replies; 11+ messages in thread
From: Ondrej Mosnacek @ 2022-11-03 10:51 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Vasily Averin, Alexander Viro, Linux FS Devel,
	Linux Security Module list, SElinux list, rcu, Martin Pitt

On Thu, Nov 3, 2022 at 10:12 AM Christian Brauner <brauner@kernel.org> wrote:
> On Thu, Nov 03, 2022 at 10:04:25AM +0100, Ondrej Mosnacek wrote:
> > On Wed, Nov 2, 2022 at 7:25 PM Christian Brauner <brauner@kernel.org> wrote:
> > > On Mon, Sep 05, 2022 at 05:30:36PM +0200, Christian Brauner wrote:
> > > > On Mon, Sep 05, 2022 at 12:15:01PM +0200, Ondrej Mosnacek wrote:
> > > > > On Mon, Sep 5, 2022 at 11:08 AM Christian Brauner <brauner@kernel.org> wrote:
> > > > > > On Thu, Sep 01, 2022 at 05:26:30PM +0200, Ondrej Mosnacek wrote:
> > > > > > > The goal of these patches is to avoid calling capable() unconditionally
> > > > > > > in simple_xattr_list(), which causes issues under SELinux (see
> > > > > > > explanation in the second patch).
> > > > > > >
> > > > > > > The first patch tries to make this change safer by converting
> > > > > > > simple_xattrs to use the RCU mechanism, so that capable() is not called
> > > > > > > while the xattrs->lock is held. I didn't find evidence that this is an
> > > > > > > issue in the current code, but it can't hurt to make that change
> > > > > > > either way (and it was quite straightforward).
> > > > > >
> > > > > > Hey Ondrey,
> > > > > >
> > > > > > There's another patchset I'd like to see first which switches from a
> > > > > > linked list to an rbtree to get rid of performance issues in this code
> > > > > > that can be used to dos tmpfs in containers:
> > > > > >
> > > > > > https://lore.kernel.org/lkml/d73bd478-e373-f759-2acb-2777f6bba06f@openvz.org
> > > > > >
> > > > > > I don't think Vasily has time to continue with this so I'll just pick it
> > > > > > up hopefully this or the week after LPC.
> > > > >
> > > > > Hm... does rbtree support lockless traversal? Because if not, that
> > > >
> > > > The rfc that Vasily sent didn't allow for that at least.
> > > >
> > > > > would make it impossible to fix the issue without calling capable()
> > > > > inside the critical section (or doing something complicated), AFAICT.
> > > > > Would rhashtable be a workable alternative to rbtree for this use
> > > > > case? Skimming <linux/rhashtable.h> it seems to support both lockless
> > > > > lookup and traversal using RCU. And according to its manpage,
> > > > > *listxattr(2) doesn't guarantee that the returned names are sorted.
> > > >
> > > > I've never used the rhashtable infrastructure in any meaningful way. All
> > > > I can say from looking at current users that it looks like it could work
> > > > well for us here:
> > > >
> > > > struct simple_xattr {
> > > >       struct rhlist_head rhlist_head;
> > > >       char *name;
> > > >       size_t size;
> > > >       char value[];
> > > > };
> > > >
> > > > static const struct rhashtable_params simple_xattr_rhashtable = {
> > > >       .head_offset = offsetof(struct simple_xattr, rhlist_head),
> > > >       .key_offset = offsetof(struct simple_xattr, name),
> > > >
> > > > or sm like this.
> > >
> > > I have a patch in rough shape that converts struct simple_xattr to use
> > > an rhashtable:
> > >
> > > https://gitlab.com/brauner/linux/-/commits/fs.xattr.simple.rework/
> > >
> > > Light testing, not a lot of useful comments and no meaningful commit
> > > message as of yet but I'll get to that.
> >
> > Looks mostly good at first glance. I left comments for some minor
> > stuff I noticed.
> >
> > > Even though your issue is orthogonal to the performance issues I'm
> > > trying to fix, I went back to your patch, Ondrej, to apply it on top.
> > > But I think it has one problem.
> > >
> > > Afaict, by moving the capable() call from the top of the function into
> > > the actual traversal portion an unprivileged user can potentially learn
> > > whether a file has trusted.* xattrs set. At least if dmesg isn't
> > > restricted on the kernel. That may very well be the reason why the
> > > capable() call is on top.
> >
> > Technically it would be possible, for example with SELinux if the
> > audit daemon is dead. Not a likely situation, but I agree it's better
> > to be safe.
> >
> > > (Because the straightforward fix for this would be to just call
> > > capable() a single time if at least one trusted xattr is encountered and
> > > store the result. That's pretty easy to do by turning the trusted
> > > variable into an int, setting it to -1, and only if it's -1 and a
> > > trusted xattr has been found call capable() and store the result.)
> >
> > That would also run into the conundrum of holding a lock while
> > (potentially) calling into the LSM subsystem. And would it even fix
> > the information leak? Unless I'm missing something it would only
> > prevent a leak of the trusted xattr count, but not the presence of any
> > trusted xattr.
>
> No, it wouldn't. I just meant this to illustrate that with your patch we
> could've made it so that capable() would've only been called once.
>
> >
> > > One option to fix all of that is to switch simple_xattr_list() to use
> > >
> > >         ns_capable_noaudit(&init_user_ns, CAP_SYS_ADMIN)
> > >
> > > which doesn't generate an audit event.
> > >
> > > I think this is even the correct thing to do as listing xattrs isn't a
> > > targeted operation. IOW, if the user had used getxattr() to request
> > > a trusted.* xattr then logging a denial makes sense as the user
> > > explicitly wanted to retrieve a trusted.* xattr. But if the user just
> > > requested to list all xattrs then silently skipping trusted xattrs
> > > without logging an explicit denial makes sense.
> > >
> > > Does that sound acceptable?
> >
> > Yes, I can't see any reason why that wouldn't be the best solution.
> > Why haven't I thought of that? :)
> >
> > I guess you will want to submit a patch for it along with your
> > rhashtable patch to avoid a conflict? Or would you like me to submit
> > it separately?
>
> I think you can send a patch for this separately as we don't need to
> massage the data structure for this.

Ok, will do.

> I think we can reasonably give this a
>
> Fixes: 38f38657444d ("xattr: extract simple_xattr code from tmpfs") # no backport
>
> But note the "# no backport" as imho it isn't worth backporting this to
> older kernels unless that's really desirable.

Actually, it would be valuable to have it backported to linux-stable
at least, since we have users encountering this on Fedora:
https://bugzilla.redhat.com/show_bug.cgi?id=2122888

In the end it's up to the backporter to assess each commit, but at
least I wouldn't want to outright discourage the backport in the
commit message.

-- 
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.



end of thread, other threads:[~2022-11-03 10:53 UTC | newest]

Thread overview: 11+ messages
2022-09-01 15:26 [PATCH 0/2] fs: fix capable() call in simple_xattr_list() Ondrej Mosnacek
2022-09-01 15:26 ` [PATCH 1/2] fs: convert simple_xattrs to RCU list Ondrej Mosnacek
2022-09-01 15:26 ` [PATCH 2/2] fs: don't call capable() prematurely in simple_xattr_list() Ondrej Mosnacek
2022-09-05  9:08 ` [PATCH 0/2] fs: fix capable() call " Christian Brauner
2022-09-05 10:15   ` Ondrej Mosnacek
2022-09-05 15:30     ` Christian Brauner
2022-11-02 18:24       ` Christian Brauner
2022-11-03  1:59         ` Serge E. Hallyn
2022-11-03  9:04         ` Ondrej Mosnacek
2022-11-03  9:12           ` Christian Brauner
2022-11-03 10:51             ` Ondrej Mosnacek
