selinux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Broken SELinux/LSM labeling with MPTCP and accept(2)
@ 2022-12-01 13:42 Ondrej Mosnacek
  2022-12-01 18:26 ` Paolo Abeni
  0 siblings, 1 reply; 10+ messages in thread
From: Ondrej Mosnacek @ 2022-12-01 13:42 UTC (permalink / raw)
  To: SElinux list, Linux Security Module list, mptcp
  Cc: network dev, Mat Martineau, Matthieu Baerts, Paul Moore, Paolo Abeni

Hi all,

As discovered by our QE, there is a problem with how the
(userspace-facing) sockets returned by accept(2) are labeled when
using MPTCP. Currently they always end up with the label representing
the kernel (typically system_u:system_r:kernel_t:s0), white they
should inherit the context from the parent socket (the one that is
passed to accept(2)).

A minimal reproducer on a Fedora/CentOS/RHEL system:

# Install dependencies
dnf install -y mptcpd nginx curl
# Disable rules that silence some SELinux denials
semodule -DB
# Set up a dummy file to be served by nginx
echo test > /usr/share/nginx/html/testfile
chmod +r /usr/share/nginx/html/testfile
# Set up nginx to use MPTCP
sysctl -w net.mptcp.enabled=1
systemctl stop nginx
mptcpize enable nginx
systemctl start nginx
# This will fail (no reply from server)
mptcpize run curl -k -o /dev/null http://127.0.0.1/testfile
# This will show the SELinux denial that caused the failure
ausearch -i -m avc | grep httpd

It is also possible to trigger the issue by running the
selinux-testsuite [1] under `mptcpize run` (it will fail on the
inet_socket test in multiple places).

Based on what I could infer from the net & mptcp code, this is roughly
how it happens (may be inaccurate or incorrect - the maze of the
networking stack is not easy to navigate for me):
1. When the server starts, the main mptcp socket is created:
   socket(2) -> ... -> socket_create() -> inet_create() ->
mptcp_init_sock() -> __mptcp_socket_create()
2. __mptcp_socket_create() calls mptcp_subflow_create_socket(), which
creates another "kern" socket, which represents the initial(?)
subflow.
3. This subflow socket goes through security_socket_post_create() ->
selinux_socket_post_create(), which gives it a kernel label based on
kern == 1, which indicates that it's a kernel-internal socket.
4. The main socket goes through its own selinux_socket_post_create(),
which gives it the label based on the current task.
5. Later, when the client connection is accepted via accept(2) on the
main socket, an underlying accept operation is performed on the
subflow socket, which is then returned directly as the result of the
accept(2) syscall.
6. Since this socket is cloned from the subflow socket, it inherits
the kernel label from the original subflow socket (via
selinux_inet_conn_request() and selinux_inet_csk_clone()).
selinux_sock_graft() then also copies the label onto the inode
representing the socket.
7. When nginx later calls writev(2) on the new socket,
selinux_file_permission() uses the inode label as the target in a
tcp_socket::write permission check. This is denied, as in the Fedora
policy httpd_t isn't allowed to write to kernel_t TCP sockets.

Side note: There is currently an odd conditional in sock_has_perm() in
security/selinux/hooks.c that skips SELinux permission checking for
sockets that have the kernel label, so native socket operations (such
as recv(2), send(2), recvmsg(2), ...) will not uncover this problem,
only generic file operations such as read(2), write(2), writev(2),
etc. I believe that check shouldn't be there, but that's for another
discussion...

So now the big question is: How to fix this? I can think of several
possible solutions, but neither of them seems to be the obvious
correct one:
1. Wrap the socket cloned from the subflow socket in another socket
(similar to how the main socket + subflow(s) are handled), which would
be cloned from the non-kern outer socket that has the right label.
This could have the disadvantage of adding unnecessary overhead, but
would probably be simpler to do.
2. Somehow ensure that the cloned socket gets the label from the main
socket instead of the subflow socket. This would probably require
adding a new LSM hook and I'm not sure at all what would be the best
way to implement this.
3. Somehow communicate the subflow socket <-> main socket relationship
to the LSM layer so that it can switch to use the label of the main
socket when handling an operation on a subflow socket (thus copying
the label correctly on accept(2)). Not a great solution, as it
requires each LSM that labels sockets to duplicate the indirection
logic.
4. Do not create the subflow sockets as "kern". (Not sure if that
would be desirable.)
5. Stop labeling kern sockets with the kernel's label on the SELinux
side and just label them based on the current task as usual. (This
would probably cause other issues, but maybe not...)

Any ideas, suggestions, or patches welcome!

[1] https://github.com/SELinuxProject/selinux-testsuite/

--
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Broken SELinux/LSM labeling with MPTCP and accept(2)
  2022-12-01 13:42 Broken SELinux/LSM labeling with MPTCP and accept(2) Ondrej Mosnacek
@ 2022-12-01 18:26 ` Paolo Abeni
  2022-12-02  2:06   ` Paul Moore
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Abeni @ 2022-12-01 18:26 UTC (permalink / raw)
  To: Ondrej Mosnacek, SElinux list, Linux Security Module list, mptcp
  Cc: network dev, Mat Martineau, Matthieu Baerts, Paul Moore

Hello,

On Thu, 2022-12-01 at 14:42 +0100, Ondrej Mosnacek wrote:
> As discovered by our QE, there is a problem with how the
> (userspace-facing) sockets returned by accept(2) are labeled when
> using MPTCP. Currently they always end up with the label representing
> the kernel (typically system_u:system_r:kernel_t:s0), white they
> should inherit the context from the parent socket (the one that is
> passed to accept(2)).
> 
> A minimal reproducer on a Fedora/CentOS/RHEL system:
> 
> # Install dependencies
> dnf install -y mptcpd nginx curl
> # Disable rules that silence some SELinux denials
> semodule -DB
> # Set up a dummy file to be served by nginx
> echo test > /usr/share/nginx/html/testfile
> chmod +r /usr/share/nginx/html/testfile
> # Set up nginx to use MPTCP
> sysctl -w net.mptcp.enabled=1
> systemctl stop nginx
> mptcpize enable nginx
> systemctl start nginx
> # This will fail (no reply from server)
> mptcpize run curl -k -o /dev/null http://127.0.0.1/testfile
> # This will show the SELinux denial that caused the failure
> ausearch -i -m avc | grep httpd
> 
> It is also possible to trigger the issue by running the
> selinux-testsuite [1] under `mptcpize run` (it will fail on the
> inet_socket test in multiple places).
> 
> Based on what I could infer from the net & mptcp code, this is roughly
> how it happens (may be inaccurate or incorrect - the maze of the
> networking stack is not easy to navigate for me):
> 1. When the server starts, the main mptcp socket is created:
>    socket(2) -> ... -> socket_create() -> inet_create() ->
> mptcp_init_sock() -> __mptcp_socket_create()
> 2. __mptcp_socket_create() calls mptcp_subflow_create_socket(), which
> creates another "kern" socket, which represents the initial(?)
> subflow.
> 3. This subflow socket goes through security_socket_post_create() ->
> selinux_socket_post_create(), which gives it a kernel label based on
> kern == 1, which indicates that it's a kernel-internal socket.
> 4. The main socket goes through its own selinux_socket_post_create(),
> which gives it the label based on the current task.
> 5. Later, when the client connection is accepted via accept(2) on the
> main socket, an underlying accept operation is performed on the
> subflow socket, which is then returned directly as the result of the
> accept(2) syscall.
> 6. Since this socket is cloned from the subflow socket, it inherits
> the kernel label from the original subflow socket (via
> selinux_inet_conn_request() and selinux_inet_csk_clone()).
> selinux_sock_graft() then also copies the label onto the inode
> representing the socket.
> 7. When nginx later calls writev(2) on the new socket,
> selinux_file_permission() uses the inode label as the target in a
> tcp_socket::write permission check. This is denied, as in the Fedora
> policy httpd_t isn't allowed to write to kernel_t TCP sockets.
> 
> Side note: There is currently an odd conditional in sock_has_perm() in
> security/selinux/hooks.c that skips SELinux permission checking for
> sockets that have the kernel label, so native socket operations (such
> as recv(2), send(2), recvmsg(2), ...) will not uncover this problem,
> only generic file operations such as read(2), write(2), writev(2),
> etc. I believe that check shouldn't be there, but that's for another
> discussion...

That comes from:

commit e2943dca2d5b67e9578111986495483fe720d58b
Author: James Morris <jmorris@redhat.com>
Date:   Sat May 8 01:00:33 2004 -0700

    [NET]: Add sock_create_kern()

"""
    This addresses a class of potential issues in SELinux where, for example,
    a TCP NFS session times out, then the kernel re-establishes an RPC
    connection upon further user activity.  We do not want such kernel
    created sockets to be labeled with user security contexts.
"""

> So now the big question is: How to fix this? I can think of several
> possible solutions, but neither of them seems to be the obvious
> correct one:
> 1. Wrap the socket cloned from the subflow socket in another socket
> (similar to how the main socket + subflow(s) are handled), which would
> be cloned from the non-kern outer socket that has the right label.
> This could have the disadvantage of adding unnecessary overhead, but
> would probably be simpler to do.

I would avoid that option: we already have a suboptimal amount of
indirection.

> 2. Somehow ensure that the cloned socket gets the label from the main
> socket instead of the subflow socket. This would probably require
> adding a new LSM hook and I'm not sure at all what would be the best
> way to implement this.
> 3. Somehow communicate the subflow socket <-> main socket relationship
> to the LSM layer so that it can switch to use the label of the main
> socket when handling an operation on a subflow socket (thus copying
> the label correctly on accept(2)). Not a great solution, as it
> requires each LSM that labels sockets to duplicate the indirection
> logic.
> 4. Do not create the subflow sockets as "kern". (Not sure if that
> would be desirable.)

No, we need subflow being kernel sockets. Lockdep will bail otherwise,
and the would be the tip of the iceberg.

> 5. Stop labeling kern sockets with the kernel's label on the SELinux
> side and just label them based on the current task as usual. (This
> would probably cause other issues, but maybe not...)
> 
> Any ideas, suggestions, or patches welcome!

I think something alike the following could work - not even tested,
with comments on bold assumptions.

I'll try to do some testing later.

---
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 99f5e51d5ca4..6cad50c6fd24 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -102,6 +102,20 @@ static int __mptcp_socket_create(struct mptcp_sock *msk)
 	if (err)
 		return err;
 
+	/* The first subflow can later be indirectly exposed to security
+	 * relevant syscall alike accept() and bind(), and at this point
+	 * carries a 'kern' related security context.
+	 * Reset the security context to the relevant user-space one.
+	 * Note that the following assumes security_socket_post_create()
+	 * being idempotent
+	 */
+	err = security_socket_post_create(ssock, sk->sk_family, SOCK_STREAM,
+					  IPPROTO_TCP, 0);
+	if (err) {
+		sock_release(ssock);
+		return err;
+	}
+
 	msk->first = ssock->sk;
 	msk->subflow = ssock;
 	subflow = mptcp_subflow_ctx(ssock->sk);

	


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: Broken SELinux/LSM labeling with MPTCP and accept(2)
  2022-12-01 18:26 ` Paolo Abeni
@ 2022-12-02  2:06   ` Paul Moore
  2022-12-02 12:07     ` Paolo Abeni
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Moore @ 2022-12-02  2:06 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Ondrej Mosnacek, SElinux list, Linux Security Module list, mptcp,
	network dev, Mat Martineau, Matthieu Baerts

On Thu, Dec 1, 2022 at 1:26 PM Paolo Abeni <pabeni@redhat.com> wrote:
> Hello,

Hello all.

> On Thu, 2022-12-01 at 14:42 +0100, Ondrej Mosnacek wrote:
> > As discovered by our QE, there is a problem with how the
> > (userspace-facing) sockets returned by accept(2) are labeled when
> > using MPTCP. Currently they always end up with the label representing
> > the kernel (typically system_u:system_r:kernel_t:s0), white they
> > should inherit the context from the parent socket (the one that is
> > passed to accept(2)) ...

Ooof, that's unfortunate :/  Thanks for reporting it though ...

> > So now the big question is: How to fix this? I can think of several
> > possible solutions, but neither of them seems to be the obvious
> > correct one:
> > 1. Wrap the socket cloned from the subflow socket in another socket
> > (similar to how the main socket + subflow(s) are handled), which would
> > be cloned from the non-kern outer socket that has the right label.
> > This could have the disadvantage of adding unnecessary overhead, but
> > would probably be simpler to do.
>
> I would avoid that option: we already have a suboptimal amount of
> indirection.

Agreed, I'm not sure I want to push for this as a fix.

> > 2. Somehow ensure that the cloned socket gets the label from the main
> > socket instead of the subflow socket. This would probably require
> > adding a new LSM hook and I'm not sure at all what would be the best
> > way to implement this.

2a. (?)  Another option would simply be ensuring that the subflow
socket has the proper LSM/SELinux label from the start.  This is
arguably The Right Thing To Do regardless, and in cases where explicit
packet labeling is used, e.g. CALIPSO, it may be necessary to do this
to ensure the traffic is labeled correctly from the start (I would
need to spend more time looking at the MPTCP code to say for certain
here).

> > 3. Somehow communicate the subflow socket <-> main socket relationship
> > to the LSM layer so that it can switch to use the label of the main
> > socket when handling an operation on a subflow socket (thus copying
> > the label correctly on accept(2)). Not a great solution, as it
> > requires each LSM that labels sockets to duplicate the indirection
> > logic.
> > 4. Do not create the subflow sockets as "kern". (Not sure if that
> > would be desirable.)
>
> No, we need subflow being kernel sockets. Lockdep will bail otherwise,
> and the would be the tip of the iceberg.
>
> > 5. Stop labeling kern sockets with the kernel's label on the SELinux
> > side and just label them based on the current task as usual. (This
> > would probably cause other issues, but maybe not...)

I'm not sure we want to open that can of worms right now.

> > Any ideas, suggestions, or patches welcome!
>
> I think something alike the following could work - not even tested,
> with comments on bold assumptions.
>
> I'll try to do some testing later.
>
> ---
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 99f5e51d5ca4..6cad50c6fd24 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -102,6 +102,20 @@ static int __mptcp_socket_create(struct mptcp_sock *msk)
>         if (err)
>                 return err;
>
> +       /* The first subflow can later be indirectly exposed to security
> +        * relevant syscall alike accept() and bind(), and at this point
> +        * carries a 'kern' related security context.
> +        * Reset the security context to the relevant user-space one.
> +        * Note that the following assumes security_socket_post_create()
> +        * being idempotent
> +        */
> +       err = security_socket_post_create(ssock, sk->sk_family, SOCK_STREAM,
> +                                         IPPROTO_TCP, 0);
> +       if (err) {
> +               sock_release(ssock);
> +               return err;
> +       }

I'm not sure we want to start calling security_socket_post_create()
twice on a given socket, *maybe* it works okay now but that seems like
an awkward constraint to put on future LSMs (or changes to existing
ones).  If we decide that the best approach is to add a LSM hook call
here, we should create a new hook instead of reusing an existing one;
I think this falls under Ondrej's possible solution #2.

However, I think this simplest solution might be what I mentioned
above as #2a, simply labeling the subflow socket properly from the
beginning.  In the case of SELinux I think we could do that by simply
clearing the @kern flag in the case of IPPROTO_MPTCP; completely
untested (and likely whitespace mangled) hack shown below:

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index f553c370397e..de6aa80b2319 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -4562,6 +4562,7 @@ static int selinux_socket_create(int family, int type,
       u16 secclass;
       int rc;

+       kern = (protocol == IPPROTO_MPTCP ? 0 : kern);
       if (kern)
               return 0;

@@ -4584,6 +4585,7 @@ static int selinux_socket_post_create(struct
socket *sock, int family,
       u32 sid = SECINITSID_KERNEL;
       int err = 0;

+       kern = (protocol == IPPROTO_MPTCP ? 0 : kern);
       if (!kern) {
               err = socket_sockcreate_sid(tsec, sclass, &sid);
               if (err)

... of course the downside is that this is not a general cross-LSM
solution, but those are very hard, if not impossible, to create as the
individual LSMs can vary tremendously.  There is also the downside of
having to have a protocol specific hack in the LSM socket creation
hooks, which is a bit of a bummer, but then again we already have to
do so much worse elsewhere in the LSM networking hooks that this is
far from the worst thing we've had to do.

All this of course assumes that the quick little hack even works, I
may be missing something very obvious ... ;)

-- 
paul-moore.com

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: Broken SELinux/LSM labeling with MPTCP and accept(2)
  2022-12-02  2:06   ` Paul Moore
@ 2022-12-02 12:07     ` Paolo Abeni
  2022-12-02 12:23       ` Florian Westphal
  2022-12-02 20:16       ` Paul Moore
  0 siblings, 2 replies; 10+ messages in thread
From: Paolo Abeni @ 2022-12-02 12:07 UTC (permalink / raw)
  To: Paul Moore
  Cc: Ondrej Mosnacek, SElinux list, Linux Security Module list, mptcp,
	network dev, Mat Martineau, Matthieu Baerts

On Thu, 2022-12-01 at 21:06 -0500, Paul Moore wrote:
> > > Any ideas, suggestions, or patches welcome!
> > 
> > I think something alike the following could work - not even tested,
> > with comments on bold assumptions.
> > 
> > I'll try to do some testing later.
> > 
> > ---
> > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> > index 99f5e51d5ca4..6cad50c6fd24 100644
> > --- a/net/mptcp/protocol.c
> > +++ b/net/mptcp/protocol.c
> > @@ -102,6 +102,20 @@ static int __mptcp_socket_create(struct mptcp_sock *msk)
> >         if (err)
> >                 return err;
> > 
> > +       /* The first subflow can later be indirectly exposed to security
> > +        * relevant syscall alike accept() and bind(), and at this point
> > +        * carries a 'kern' related security context.
> > +        * Reset the security context to the relevant user-space one.
> > +        * Note that the following assumes security_socket_post_create()
> > +        * being idempotent
> > +        */
> > +       err = security_socket_post_create(ssock, sk->sk_family, SOCK_STREAM,
> > +                                         IPPROTO_TCP, 0);
> > +       if (err) {
> > +               sock_release(ssock);
> > +               return err;
> > +       }
> 
> I'm not sure we want to start calling security_socket_post_create()
> twice on a given socket, *maybe* it works okay now but that seems like
> an awkward constraint to put on future LSMs (or changes to existing
> ones). If we decide that the best approach is to add a LSM hook call
> here, we should create a new hook instead of reusing an existing one;
> I think this falls under Ondrej's possible solution #2.

I agree the above code is not very nice and an additional selinux hook
would be much cleaner. I tried the above path as a possible quick
fixup. AFAICS all the existing selinux modules implement the
socket_post_create() hook in such a way that calling it on an already
initialized socket yeld to the desidered result.

I agree putting additional, currently non exiting, constraint on
existing hooks is definitelly not nice. 
> 
> However, I think this simplest solution might be what I mentioned
> above as #2a, simply labeling the subflow socket properly from the
> beginning.  In the case of SELinux I think we could do that by simply
> clearing the @kern flag in the case of IPPROTO_MPTCP; completely
> untested (and likely whitespace mangled) hack shown below:
> 
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index f553c370397e..de6aa80b2319 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -4562,6 +4562,7 @@ static int selinux_socket_create(int family, int type,
>        u16 secclass;
>        int rc;
> 
> +       kern = (protocol == IPPROTO_MPTCP ? 0 : kern);
>        if (kern)
>                return 0;
> 
> @@ -4584,6 +4585,7 @@ static int selinux_socket_post_create(struct
> socket *sock, int family,
>        u32 sid = SECINITSID_KERNEL;
>        int err = 0;
> 
> +       kern = (protocol == IPPROTO_MPTCP ? 0 : kern);
>        if (!kern) {
>                err = socket_sockcreate_sid(tsec, sclass, &sid);
>                if (err)
> 
> ... of course the downside is that this is not a general cross-LSM
> solution, but those are very hard, if not impossible, to create as the
> individual LSMs can vary tremendously.  There is also the downside of
> having to have a protocol specific hack in the LSM socket creation
> hooks, which is a bit of a bummer, but then again we already have to
> do so much worse elsewhere in the LSM networking hooks that this is
> far from the worst thing we've had to do.

There is a problem with the above: the relevant/affected socket is an
MPTCP subflow, which is actually a TCP socket (protocol ==
IPPROTO_TCP). Yep, MPTCP is quite a mes... complex.

I think we can't follow this later path.

If even the initially proposed hack is a no-go, another option could
possibly be the following - that is: do not touch the subflows, and try
to initilize the accepted msk context correctly.

Note that the relevant context does not held the 'sk' socket lock, nor
it could acquire it, but at least 'sk' can't be freed under the hood
and there are exiting places e.g. the unix socket accept() where
security_sock_graft() is invoked with similar (lack of) locking.

Side note: I'm confused by the selinux_sock_graft() implementation:

https://elixir.bootlin.com/linux/v6.1-rc7/source/security/selinux/hooks.c#L5243

it looks like the 'sid' is copied from the 'child' socket into the
'parent', while the sclass from the 'parent' into the 'child'. Am I
misreading it? is that intended? I would have expeted both 'sid' and
'sclass' being copied from the parent into the child. Other LSM's
sock_graft() initilize the child and leave alone the parent.

---
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 99f5e51d5ca4..b8095b8df71d 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3085,7 +3085,10 @@ struct sock *mptcp_sk_clone(const struct sock *sk,
 	/* will be fully established after successful MPC subflow creation */
 	inet_sk_state_store(nsk, TCP_SYN_RECV);
 
-	security_inet_csk_clone(nsk, req);
+	/* let's the new socket inherit the security label from the msk
+	 * listener, as the TCP reqest socket carries a kernel context
+	 */
+	security_sock_graft(nsk, sk->sk_socket);
 	bh_unlock_sock(nsk);
 
 	/* keep a single reference */



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: Broken SELinux/LSM labeling with MPTCP and accept(2)
  2022-12-02 12:07     ` Paolo Abeni
@ 2022-12-02 12:23       ` Florian Westphal
  2022-12-02 20:16       ` Paul Moore
  1 sibling, 0 replies; 10+ messages in thread
From: Florian Westphal @ 2022-12-02 12:23 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Paul Moore, Ondrej Mosnacek, SElinux list,
	Linux Security Module list, mptcp, network dev, Mat Martineau,
	Matthieu Baerts

Paolo Abeni <pabeni@redhat.com> wrote:
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 99f5e51d5ca4..b8095b8df71d 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -3085,7 +3085,10 @@ struct sock *mptcp_sk_clone(const struct sock *sk,
>  	/* will be fully established after successful MPC subflow creation */
>  	inet_sk_state_store(nsk, TCP_SYN_RECV);
>  
> -	security_inet_csk_clone(nsk, req);
> +	/* let's the new socket inherit the security label from the msk
> +	 * listener, as the TCP reqest socket carries a kernel context
> +	 */
> +	security_sock_graft(nsk, sk->sk_socket);
>  	bh_unlock_sock(nsk);

FWIW this makes Ondrejs test case work:

before:
mptcp successfully enabled on unit /usr/lib/systemd/system/nginx.service
% Total    % Received % Xferd  Average Speed   Time    Time     Time % Current
Dload  Upload   Total   Spent    Left Speed
0     0    0     0    0     0 0      0 --:--:-- --:--:-- --:--:-- 0
curl: (52) Empty reply from server

With above change:
mptcp successfully enabled on unit /usr/lib/systemd/system/nginx.service
% Total    % Received % Xferd  Average Speed   Time    Time     Time % Current
Dload  Upload   Total   Spent    Left Speed 100     5  100     5    0     0 1770      0 --:--:-- --:--:-- --:--:--  5000

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Broken SELinux/LSM labeling with MPTCP and accept(2)
  2022-12-02 12:07     ` Paolo Abeni
  2022-12-02 12:23       ` Florian Westphal
@ 2022-12-02 20:16       ` Paul Moore
  2022-12-05 20:58         ` Paolo Abeni
  1 sibling, 1 reply; 10+ messages in thread
From: Paul Moore @ 2022-12-02 20:16 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Ondrej Mosnacek, SElinux list, Linux Security Module list, mptcp,
	network dev, Mat Martineau, Matthieu Baerts

On Fri, Dec 2, 2022 at 7:07 AM Paolo Abeni <pabeni@redhat.com> wrote:
> On Thu, 2022-12-01 at 21:06 -0500, Paul Moore wrote:

...

> > However, I think this simplest solution might be what I mentioned
> > above as #2a, simply labeling the subflow socket properly from the
> > beginning.  In the case of SELinux I think we could do that by simply
> > clearing the @kern flag in the case of IPPROTO_MPTCP; completely
> > untested (and likely whitespace mangled) hack shown below:
> >
> > diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> > index f553c370397e..de6aa80b2319 100644
> > --- a/security/selinux/hooks.c
> > +++ b/security/selinux/hooks.c
> > @@ -4562,6 +4562,7 @@ static int selinux_socket_create(int family, int type,
> >        u16 secclass;
> >        int rc;
> >
> > +       kern = (protocol == IPPROTO_MPTCP ? 0 : kern);
> >        if (kern)
> >                return 0;
> >
> > @@ -4584,6 +4585,7 @@ static int selinux_socket_post_create(struct
> > socket *sock, int family,
> >        u32 sid = SECINITSID_KERNEL;
> >        int err = 0;
> >
> > +       kern = (protocol == IPPROTO_MPTCP ? 0 : kern);
> >        if (!kern) {
> >                err = socket_sockcreate_sid(tsec, sclass, &sid);
> >                if (err)
> >
> > ... of course the downside is that this is not a general cross-LSM
> > solution, but those are very hard, if not impossible, to create as the
> > individual LSMs can vary tremendously.  There is also the downside of
> > having to have a protocol specific hack in the LSM socket creation
> > hooks, which is a bit of a bummer, but then again we already have to
> > do so much worse elsewhere in the LSM networking hooks that this is
> > far from the worst thing we've had to do.
>
> There is a problem with the above: the relevant/affected socket is an
> MPTCP subflow, which is actually a TCP socket (protocol ==
> IPPROTO_TCP). Yep, MPTCP is quite a mes... complex.
>
> I think we can't follow this later path.

Bummer.  I was afraid I was missing something, that was too easy of a "fix" ;)

As I continue to look at the MPTCP code, I'm also now growing a little
concerned that we may have another issue regarding the number of
subflows attached to a sock; more on this below.

> Side note: I'm confused by the selinux_sock_graft() implementation:
>
> https://elixir.bootlin.com/linux/v6.1-rc7/source/security/selinux/hooks.c#L5243
>
> it looks like the 'sid' is copied from the 'child' socket into the
> 'parent', while the sclass from the 'parent' into the 'child'. Am I
> misreading it? is that intended? I would have expeted both 'sid' and
> 'sclass' being copied from the parent into the child. Other LSM's
> sock_graft() initilize the child and leave alone the parent.

MPTCP isn't the only thing that is ... complex ;)

Believe it or not, selinux_sock_graft() is correct.  The reasoning is
that the new connection sock has been labeled based on the incoming
connection, which can be influenced by a variety of sources including
the security attributes of the remote endpoint; however, the
associated child socket is always labeled based on the security
context of the task calling accept(2).  Transfering the sock's label
(sid) to the child socket during the accept(2) operation ensures that
the newly created socket is labeled based on the inbound connection
labeling rules, and not simply the security context of the calling
process.

Transferring the object class (sclass) from the socket/inode to the
newly grafted sock just ensures that the sock's object class is set
correctly.

> ---
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 99f5e51d5ca4..b8095b8df71d 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -3085,7 +3085,10 @@ struct sock *mptcp_sk_clone(const struct sock *sk,
>         /* will be fully established after successful MPC subflow creation */
>         inet_sk_state_store(nsk, TCP_SYN_RECV);
>
> -       security_inet_csk_clone(nsk, req);
> +       /* let's the new socket inherit the security label from the msk
> +        * listener, as the TCP reqest socket carries a kernel context
> +        */
> +       security_sock_graft(nsk, sk->sk_socket);
>         bh_unlock_sock(nsk);

As a quick aside, I'm working under the assumption that a MPTCP
request_sock goes through the same sort of logic/processing as TCP, if
that is wrong we likely have additional issues.

I think one potential problem with the code above is that if the
subflow socket, @sk above, has multiple inbound connections (is that
legal with MPTCP?) it is going to be relabeled multiple times which is
not good (label's shouldn't change once set, unless there is an
explicit relabel event).  I think we need to focus on ensuring that
the subflow socket is labeled properly at creation time, and that has
me looking more at mptcp_subflow_create_socket() ...

What if we added a new LSM call in mptcp_subflow_create_socket(), just
after the sock_create_kern() call?  Inside
mptcp_subflow_create_socket() we have a pointer to the top level MPTCP
sock that we should serve as the source of the subflow's label, so for
SELinux it's just a matter of copying the label information and doing
basically the same setup we do in selinux_socket_post_create().  The
only wrinkle that I can see is that this new LSM/SELinux hook would be
called *after* security_socket_post_create() so we would need to add a
special comment about that and write the code accordingly.  That said,
it *shouldn't* be an issue for the SELinux code right now, but if this
approach looks reasonable to you we should probably double check that
(the NetLabel/CIPSO and NetLabel/CALIPSO code would be my main concern
here).

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Broken SELinux/LSM labeling with MPTCP and accept(2)
  2022-12-02 20:16       ` Paul Moore
@ 2022-12-05 20:58         ` Paolo Abeni
  2022-12-06 14:43           ` Ondrej Mosnacek
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Abeni @ 2022-12-05 20:58 UTC (permalink / raw)
  To: Paul Moore
  Cc: Ondrej Mosnacek, SElinux list, Linux Security Module list, mptcp,
	network dev, Mat Martineau, Matthieu Baerts

On Fri, 2022-12-02 at 15:16 -0500, Paul Moore wrote:
> On Fri, Dec 2, 2022 at 7:07 AM Paolo Abeni <pabeni@redhat.com> wrote:
> 
> > Side note: I'm confused by the selinux_sock_graft() implementation:
> > 
> > https://elixir.bootlin.com/linux/v6.1-rc7/source/security/selinux/hooks.c#L5243
> > 
> > it looks like the 'sid' is copied from the 'child' socket into the
> > 'parent', while the sclass from the 'parent' into the 'child'. Am I
> > misreading it? is that intended? I would have expeted both 'sid' and
> > 'sclass' being copied from the parent into the child. Other LSM's
> > sock_graft() initilize the child and leave alone the parent.
> 
> MPTCP isn't the only thing that is ... complex ;)
> 
> Believe it or not, selinux_sock_graft() is correct.  The reasoning is
> that the new connection sock has been labeled based on the incoming
> connection, which can be influenced by a variety of sources including
> the security attributes of the remote endpoint; however, the
> associated child socket is always labeled based on the security
> context of the task calling accept(2).  Transfering the sock's label
> (sid) to the child socket during the accept(2) operation ensures that
> the newly created socket is labeled based on the inbound connection
> labeling rules, and not simply the security context of the calling
> process.
> 
> Transferring the object class (sclass) from the socket/inode to the
> newly grafted sock just ensures that the sock's object class is set
> correctly.

Thank you for the explaination. Hopefully I'm less confused now;)

> > ---
> > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> > index 99f5e51d5ca4..b8095b8df71d 100644
> > --- a/net/mptcp/protocol.c
> > +++ b/net/mptcp/protocol.c
> > @@ -3085,7 +3085,10 @@ struct sock *mptcp_sk_clone(const struct
> > sock *sk,
> >         /* will be fully established after successful MPC subflow
> > creation */
> >         inet_sk_state_store(nsk, TCP_SYN_RECV);
> > 
> > - security_inet_csk_clone(nsk, req);
> > + /* let's the new socket inherit the security label from the msk
> > + * listener, as the TCP reqest socket carries a kernel context
> > + */
> > + security_sock_graft(nsk, sk->sk_socket);
> >         bh_unlock_sock(nsk);
> 
> As a quick aside, I'm working under the assumption that a MPTCP
> request_sock goes through the same sort of logic/processing as TCP, 

The above assumption is correct.

> if that is wrong we likely have additional issues.
> 
> I think one potential problem with the code above is that if the
> subflow socket, @sk above, has multiple inbound connections (is that
> legal with MPTCP?) 

Here there are few things that need some clarifications. In the above
chunk of code, 'sk' is the main mptcp socket, not a subflow. Insite the
mptcp code, subflow sockets variable name is usually 'ssk' (== subflow
sk).

An mptcp socket allows multiple inbound connections (each of them is a
subflow).

> it is going to be relabeled multiple times which is
> not good (label's shouldn't change once set, unless there is an
> explicit relabel event).  

I now see that even my 2nd proposal is wrong, thanks for pointing that
out!

> I think we need to focus on ensuring that
> the subflow socket is labeled properly at creation time, and that has
> me looking more at mptcp_subflow_create_socket() ...

Agreed.

> What if we added a new LSM call in mptcp_subflow_create_socket(), just
> after the sock_create_kern() call?  

That should work, I think. I would like to propose a (last) attempt
that will not need an additional selinux hook - to try to minimize the
required changes and avoid unnecessary addional work for current and
future LSM mainteniance and creation.

I tested the following patch and passes the reproducer (and mptcp self-
tests). Basically it introduces and uses a sock_create_nosec variant,
to allow mptcp_subflow_create_socket() calling
security_socket_post_create() with the corrct arguments. WDYT?

---
 include/linux/net.h |  2 ++
 net/mptcp/subflow.c | 18 ++++++++++++--
 net/socket.c        | 60 ++++++++++++++++++++++++++++++---------------
 3 files changed, 58 insertions(+), 22 deletions(-)

diff --git a/include/linux/net.h b/include/linux/net.h
index b73ad8e3c212..91713012504d 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -251,6 +251,8 @@ int sock_wake_async(struct socket_wq *sk_wq, int how, int band);
 int sock_register(const struct net_proto_family *fam);
 void sock_unregister(int family);
 bool sock_is_registered(int family);
+int __sock_create_nosec(struct net *net, int family, int type, int proto,
+			struct socket **res, int kern);
 int __sock_create(struct net *net, int family, int type, int proto,
 		  struct socket **res, int kern);
 int sock_create(int family, int type, int proto, struct socket **res);
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 29904303f5c2..9341f9313154 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1646,11 +1646,25 @@ int mptcp_subflow_create_socket(struct sock *sk, struct socket **new_sock)
 	if (unlikely(!sk->sk_socket))
 		return -EINVAL;
 
-	err = sock_create_kern(net, sk->sk_family, SOCK_STREAM, IPPROTO_TCP,
-			       &sf);
+	/* the subflow is created by the kernel, and we need kernel annotation
+	 * for lockdep's sake...
+	 */
+	err = __sock_create_nosec(net, sk->sk_family, SOCK_STREAM, IPPROTO_TCP,
+				  &sf, 1);
 	if (err)
 		return err;
 
+	/* ... but the first subflow will be indirectly exposed to the
+	 * user-space via accept(). Let's attach the current user security
+	 * label
+	 */
+	err = security_socket_post_create(sf, sk->sk_family, SOCK_STREAM,
+					  IPPROTO_TCP, 0);
+	if (err) {
+		sock_release(sf);
+		return err;
+	}
+
 	lock_sock(sf->sk);
 
 	/* the newly created socket has to be in the same cgroup as its parent */
diff --git a/net/socket.c b/net/socket.c
index 55c5d536e5f6..d5d51e4e26ae 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1426,23 +1426,11 @@ int sock_wake_async(struct socket_wq *wq, int how, int band)
 }
 EXPORT_SYMBOL(sock_wake_async);
 
-/**
- *	__sock_create - creates a socket
- *	@net: net namespace
- *	@family: protocol family (AF_INET, ...)
- *	@type: communication type (SOCK_STREAM, ...)
- *	@protocol: protocol (0, ...)
- *	@res: new socket
- *	@kern: boolean for kernel space sockets
- *
- *	Creates a new socket and assigns it to @res, passing through LSM.
- *	Returns 0 or an error. On failure @res is set to %NULL. @kern must
- *	be set to true if the socket resides in kernel space.
- *	This function internally uses GFP_KERNEL.
- */
 
-int __sock_create(struct net *net, int family, int type, int protocol,
-			 struct socket **res, int kern)
+
+/* Creates a socket leaving LSM post-creation checks to the caller */
+int __sock_create_nosec(struct net *net, int family, int type, int protocol,
+			struct socket **res, int kern)
 {
 	int err;
 	struct socket *sock;
@@ -1528,11 +1516,8 @@ int __sock_create(struct net *net, int family, int type, int protocol,
 	 * module can have its refcnt decremented
 	 */
 	module_put(pf->owner);
-	err = security_socket_post_create(sock, family, type, protocol, kern);
-	if (err)
-		goto out_sock_release;
-	*res = sock;
 
+	*res = sock;
 	return 0;
 
 out_module_busy:
@@ -1548,6 +1533,41 @@ int __sock_create(struct net *net, int family, int type, int protocol,
 	rcu_read_unlock();
 	goto out_sock_release;
 }
+
+/**
+ *	__sock_create - creates a socket
+ *	@net: net namespace
+ *	@family: protocol family (AF_INET, ...)
+ *	@type: communication type (SOCK_STREAM, ...)
+ *	@protocol: protocol (0, ...)
+ *	@res: new socket
+ *	@kern: boolean for kernel space sockets
+ *
+ *	Creates a new socket and assigns it to @res, passing through LSM.
+ *	Returns 0 or an error. On failure @res is set to %NULL. @kern must
+ *	be set to true if the socket resides in kernel space.
+ *	This function internally uses GFP_KERNEL.
+ */
+
+int __sock_create(struct net *net, int family, int type, int protocol,
+		  struct socket **res, int kern)
+{
+	struct socket *sock;
+	int err;
+
+	err = __sock_create_nosec(net, family, type, protocol, &sock, kern);
+	if (err)
+		return err;
+
+	err = security_socket_post_create(sock, family, type, protocol, kern);
+	if (err) {
+		sock_release(sock);
+		return err;
+	}
+
+	*res = sock;
+	return 0;
+}
 EXPORT_SYMBOL(__sock_create);
 
 /**


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: Broken SELinux/LSM labeling with MPTCP and accept(2)
  2022-12-05 20:58         ` Paolo Abeni
@ 2022-12-06 14:43           ` Ondrej Mosnacek
  2022-12-06 16:51             ` Paolo Abeni
  2022-12-08 22:45             ` Paul Moore
  0 siblings, 2 replies; 10+ messages in thread
From: Ondrej Mosnacek @ 2022-12-06 14:43 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Paul Moore, SElinux list, Linux Security Module list, mptcp,
	network dev, Mat Martineau, Matthieu Baerts

On Mon, Dec 5, 2022 at 9:58 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Fri, 2022-12-02 at 15:16 -0500, Paul Moore wrote:
[...]
> > What if we added a new LSM call in mptcp_subflow_create_socket(), just
> > after the sock_create_kern() call?
>
> That should work, I think. I would like to propose a (last) attempt
> that will not need an additional selinux hook - to try to minimize the
> required changes and avoid unnecessary addional work for current and
> future LSM mainteniance and creation.
>
> I tested the following patch and passes the reproducer (and mptcp self-
> tests). Basically it introduces and uses a sock_create_nosec variant,
> to allow mptcp_subflow_create_socket() calling
> security_socket_post_create() with the corrct arguments. WDYT?

This seems like a step in the right direction, but I wonder if we
shouldn't solve the current overloading of the "kern" flag more
explicitly - i.e. split it into two flags: one to indicate that the
socket will only be used internally by the kernel ("internal") and
another one to indicate if it should be labeled according to the
current task or as a kernel-created socket ("kern"?). Technically,
each combination could have a valid use case:
- !internal && !kern -> a regular userspace-created socket,
- !internal && kern -> a socket that is exposed to userspace, but
created by the kernel outside of a syscall (e.g. some global socket
created during initcall phase and later returned to userspace via an
ioctl or something),
- internal && !kern -> our MPTCP case, where the socket itself is
internal, but the label is still important so it can be passed onto
its accept-offspring (which may no longer be internal),
- internal && kern -> a completely kernel-internal socket.

Another concern I have about this approach is whether it is possible
(in some more advanced scenario) for mptcp_subflow_create_socket() to
be called in the context of a different task than the one
creating/handling the main socket. Because then a potential socket
accepted from the new subflow socket would end up with an unexpected
(and probably semantically wrong) label. Glancing over the call tree,
it seems it can be called via some netlink commands - presumably
intended to be used by mptcpd?


>
> ---
>  include/linux/net.h |  2 ++
>  net/mptcp/subflow.c | 18 ++++++++++++--
>  net/socket.c        | 60 ++++++++++++++++++++++++++++++---------------
>  3 files changed, 58 insertions(+), 22 deletions(-)
>
> diff --git a/include/linux/net.h b/include/linux/net.h
> index b73ad8e3c212..91713012504d 100644
> --- a/include/linux/net.h
> +++ b/include/linux/net.h
> @@ -251,6 +251,8 @@ int sock_wake_async(struct socket_wq *sk_wq, int how, int band);
>  int sock_register(const struct net_proto_family *fam);
>  void sock_unregister(int family);
>  bool sock_is_registered(int family);
> +int __sock_create_nosec(struct net *net, int family, int type, int proto,
> +                       struct socket **res, int kern);
>  int __sock_create(struct net *net, int family, int type, int proto,
>                   struct socket **res, int kern);
>  int sock_create(int family, int type, int proto, struct socket **res);
> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> index 29904303f5c2..9341f9313154 100644
> --- a/net/mptcp/subflow.c
> +++ b/net/mptcp/subflow.c
> @@ -1646,11 +1646,25 @@ int mptcp_subflow_create_socket(struct sock *sk, struct socket **new_sock)
>         if (unlikely(!sk->sk_socket))
>                 return -EINVAL;
>
> -       err = sock_create_kern(net, sk->sk_family, SOCK_STREAM, IPPROTO_TCP,
> -                              &sf);
> +       /* the subflow is created by the kernel, and we need kernel annotation
> +        * for lockdep's sake...
> +        */
> +       err = __sock_create_nosec(net, sk->sk_family, SOCK_STREAM, IPPROTO_TCP,
> +                                 &sf, 1);
>         if (err)
>                 return err;
>
> +       /* ... but the first subflow will be indirectly exposed to the
> +        * user-space via accept(). Let's attach the current user security
> +        * label
> +        */
> +       err = security_socket_post_create(sf, sk->sk_family, SOCK_STREAM,
> +                                         IPPROTO_TCP, 0);
> +       if (err) {
> +               sock_release(sf);
> +               return err;
> +       }
> +
>         lock_sock(sf->sk);
>
>         /* the newly created socket has to be in the same cgroup as its parent */
> diff --git a/net/socket.c b/net/socket.c
> index 55c5d536e5f6..d5d51e4e26ae 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -1426,23 +1426,11 @@ int sock_wake_async(struct socket_wq *wq, int how, int band)
>  }
>  EXPORT_SYMBOL(sock_wake_async);
>
> -/**
> - *     __sock_create - creates a socket
> - *     @net: net namespace
> - *     @family: protocol family (AF_INET, ...)
> - *     @type: communication type (SOCK_STREAM, ...)
> - *     @protocol: protocol (0, ...)
> - *     @res: new socket
> - *     @kern: boolean for kernel space sockets
> - *
> - *     Creates a new socket and assigns it to @res, passing through LSM.
> - *     Returns 0 or an error. On failure @res is set to %NULL. @kern must
> - *     be set to true if the socket resides in kernel space.
> - *     This function internally uses GFP_KERNEL.
> - */
>
> -int __sock_create(struct net *net, int family, int type, int protocol,
> -                        struct socket **res, int kern)
> +
> +/* Creates a socket leaving LSM post-creation checks to the caller */
> +int __sock_create_nosec(struct net *net, int family, int type, int protocol,
> +                       struct socket **res, int kern)
>  {
>         int err;
>         struct socket *sock;
> @@ -1528,11 +1516,8 @@ int __sock_create(struct net *net, int family, int type, int protocol,
>          * module can have its refcnt decremented
>          */
>         module_put(pf->owner);
> -       err = security_socket_post_create(sock, family, type, protocol, kern);
> -       if (err)
> -               goto out_sock_release;
> -       *res = sock;
>
> +       *res = sock;
>         return 0;
>
>  out_module_busy:
> @@ -1548,6 +1533,41 @@ int __sock_create(struct net *net, int family, int type, int protocol,
>         rcu_read_unlock();
>         goto out_sock_release;
>  }
> +
> +/**
> + *     __sock_create - creates a socket
> + *     @net: net namespace
> + *     @family: protocol family (AF_INET, ...)
> + *     @type: communication type (SOCK_STREAM, ...)
> + *     @protocol: protocol (0, ...)
> + *     @res: new socket
> + *     @kern: boolean for kernel space sockets
> + *
> + *     Creates a new socket and assigns it to @res, passing through LSM.
> + *     Returns 0 or an error. On failure @res is set to %NULL. @kern must
> + *     be set to true if the socket resides in kernel space.
> + *     This function internally uses GFP_KERNEL.
> + */
> +
> +int __sock_create(struct net *net, int family, int type, int protocol,
> +                 struct socket **res, int kern)
> +{
> +       struct socket *sock;
> +       int err;
> +
> +       err = __sock_create_nosec(net, family, type, protocol, &sock, kern);
> +       if (err)
> +               return err;
> +
> +       err = security_socket_post_create(sock, family, type, protocol, kern);
> +       if (err) {
> +               sock_release(sock);
> +               return err;
> +       }
> +
> +       *res = sock;
> +       return 0;
> +}
>  EXPORT_SYMBOL(__sock_create);
>
>  /**
>

--
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Broken SELinux/LSM labeling with MPTCP and accept(2)
  2022-12-06 14:43           ` Ondrej Mosnacek
@ 2022-12-06 16:51             ` Paolo Abeni
  2022-12-08 22:45             ` Paul Moore
  1 sibling, 0 replies; 10+ messages in thread
From: Paolo Abeni @ 2022-12-06 16:51 UTC (permalink / raw)
  To: Ondrej Mosnacek
  Cc: Paul Moore, SElinux list, Linux Security Module list, mptcp,
	network dev, Mat Martineau, Matthieu Baerts

On Tue, 2022-12-06 at 15:43 +0100, Ondrej Mosnacek wrote:
> On Mon, Dec 5, 2022 at 9:58 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > 
> > On Fri, 2022-12-02 at 15:16 -0500, Paul Moore wrote:
> [...]
> > > What if we added a new LSM call in mptcp_subflow_create_socket(), just
> > > after the sock_create_kern() call?
> > 
> > That should work, I think. I would like to propose a (last) attempt
> > that will not need an additional selinux hook - to try to minimize the
> > required changes and avoid unnecessary addional work for current and
> > future LSM mainteniance and creation.
> > 
> > I tested the following patch and passes the reproducer (and mptcp self-
> > tests). Basically it introduces and uses a sock_create_nosec variant,
> > to allow mptcp_subflow_create_socket() calling
> > security_socket_post_create() with the corrct arguments. WDYT?
> 
> This seems like a step in the right direction, but I wonder if we
> shouldn't solve the current overloading of the "kern" flag more
> explicitly - i.e. split it into two flags: one to indicate that the
> socket will only be used internally by the kernel ("internal") and
> another one to indicate if it should be labeled according to the
> current task or as a kernel-created socket ("kern"?). Technically,
> each combination could have a valid use case:
> - !internal && !kern -> a regular userspace-created socket,
> - !internal && kern -> a socket that is exposed to userspace, but
> created by the kernel outside of a syscall (e.g. some global socket
> created during initcall phase and later returned to userspace via an
> ioctl or something),
> - internal && !kern -> our MPTCP case, where the socket itself is
> internal, but the label is still important so it can be passed onto
> its accept-offspring (which may no longer be internal),
> - internal && kern -> a completely kernel-internal socket.

I would say perfect is the enemy of good ;) it would be nice to have a
fix sometime soon, and we can improve as needed.

> Another concern I have about this approach is whether it is possible
> (in some more advanced scenario) for mptcp_subflow_create_socket() to
> be called in the context of a different task than the one
> creating/handling the main socket. Because then a potential socket
> accepted from the new subflow socket would end up with an unexpected
> (and probably semantically wrong) label. Glancing over the call tree,
> it seems it can be called via some netlink commands - presumably
> intended to be used by mptcpd?

Yes, the above can happen, but I think it does not have LSM-related
implications, as subflows created in the above scenario can be MP_JOIN
only - that is, will never be even indirectly exposed to user-space.


Cheers,

Paolo


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Broken SELinux/LSM labeling with MPTCP and accept(2)
  2022-12-06 14:43           ` Ondrej Mosnacek
  2022-12-06 16:51             ` Paolo Abeni
@ 2022-12-08 22:45             ` Paul Moore
  1 sibling, 0 replies; 10+ messages in thread
From: Paul Moore @ 2022-12-08 22:45 UTC (permalink / raw)
  To: Ondrej Mosnacek
  Cc: Paolo Abeni, SElinux list, Linux Security Module list, mptcp,
	network dev, Mat Martineau, Matthieu Baerts

On Tue, Dec 6, 2022 at 9:43 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> On Mon, Dec 5, 2022 at 9:58 PM Paolo Abeni <pabeni@redhat.com> wrote:
> >
> > On Fri, 2022-12-02 at 15:16 -0500, Paul Moore wrote:
> [...]
> > > What if we added a new LSM call in mptcp_subflow_create_socket(), just
> > > after the sock_create_kern() call?
> >
> > That should work, I think. I would like to propose a (last) attempt
> > that will not need an additional selinux hook - to try to minimize the
> > required changes and avoid unnecessary addional work for current and
> > future LSM mainteniance and creation.
> >
> > I tested the following patch and passes the reproducer (and mptcp self-
> > tests). Basically it introduces and uses a sock_create_nosec variant,
> > to allow mptcp_subflow_create_socket() calling
> > security_socket_post_create() with the corrct arguments. WDYT?

I'm still working my way through the recent patch posted by Paolo
(sorry, I lost some time due to build failures and other discussions),
but I wanted to comment on a few things in Ondrej's email ...

> This seems like a step in the right direction, but I wonder if we
> shouldn't solve the current overloading of the "kern" flag more
> explicitly - i.e. split it into two flags: one to indicate that the
> socket will only be used internally by the kernel ("internal") and
> another one to indicate if it should be labeled according to the
> current task or as a kernel-created socket ("kern"?).

The problem here is that labeling behavior isn't always going to be
the same across LSMs, or rather I don't want to force a specific
labeling behavior at the LSM layer.  We pass the @kern flag to
indicate a kernel socket that is not visible to userspace, but we
leave it up to the individual LSMs to handle that as they see fit
(including if they should label sockets based on @current, an
associated socket, or something else).

> Another concern I have about this approach is whether it is possible
> (in some more advanced scenario) for mptcp_subflow_create_socket() to
> be called in the context of a different task than the one
> creating/handling the main socket. Because then a potential socket
> accepted from the new subflow socket would end up with an unexpected
> (and probably semantically wrong) label. Glancing over the call tree,
> it seems it can be called via some netlink commands - presumably
> intended to be used by mptcpd?

Yes, this is something we need to be careful about, if we want to
treat all of these subflows just as we would treat the main socket,
the subflows need to be labeled the same as the main socket,
regardless of the process which creates them.

As a FYI to the non-SELinux folks, a socket's label can be important
even when it is not visible to userspace as the label is used both to
set the on-the-wire label (CIPSO, CALIPSO, labeled IPsec) and in the
per-packet access controls (see security_sock_rcv_skb()).

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-12-08 22:46 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-01 13:42 Broken SELinux/LSM labeling with MPTCP and accept(2) Ondrej Mosnacek
2022-12-01 18:26 ` Paolo Abeni
2022-12-02  2:06   ` Paul Moore
2022-12-02 12:07     ` Paolo Abeni
2022-12-02 12:23       ` Florian Westphal
2022-12-02 20:16       ` Paul Moore
2022-12-05 20:58         ` Paolo Abeni
2022-12-06 14:43           ` Ondrej Mosnacek
2022-12-06 16:51             ` Paolo Abeni
2022-12-08 22:45             ` Paul Moore

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).