netdev.vger.kernel.org archive mirror
* [PATCH net] tcp: fix tcp_set_congestion_control() use from bpf hook
@ 2019-07-19  2:28 Eric Dumazet
  2019-07-19  2:50 ` Neal Cardwell
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Eric Dumazet @ 2019-07-19  2:28 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Eric Dumazet, Lawrence Brakmo, Neal Cardwell

Neal reported an incorrect use of ns_capable() from a bpf hook.

bpf_setsockopt(...TCP_CONGESTION...)
  -> tcp_set_congestion_control()
   -> ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)
    -> ns_capable_common()
     -> current_cred()
      -> rcu_dereference_protected(current->cred, 1)

Accessing 'current' in bpf context makes no sense: packets are
processed in softirq context, where 'current' is whichever task
happened to be interrupted.

As Neal stated: the capability check in tcp_set_congestion_control()
was written assuming a system call context, and then was reused from
a BPF call site.

The fix is to add a new cap_net_admin parameter to
tcp_set_congestion_control(), so that ns_capable() is only called
from the system call path, where 'current' is meaningful. The bpf
hook passes true instead.

Fixes: 91b5b21c7c16 ("bpf: Add support for changing congestion control")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Lawrence Brakmo <brakmo@fb.com>
Reported-by: Neal Cardwell <ncardwell@google.com>
---
 include/net/tcp.h   | 3 ++-
 net/core/filter.c   | 2 +-
 net/ipv4/tcp.c      | 4 +++-
 net/ipv4/tcp_cong.c | 6 +++---
 4 files changed, 9 insertions(+), 6 deletions(-)
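
Not part of the patch: a minimal user-space sketch of the pattern the
change applies, where the caller evaluates the capability in its own
context and hands the result down as a bool. Every name below (ca_ops,
CA_NON_RESTRICTED, fake_ns_capable, set_congestion_control,
from_setsockopt, from_bpf_hook) is a hypothetical stand-in for
illustration, not a kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for TCP_CONG_NON_RESTRICTED. */
#define CA_NON_RESTRICTED 0x1u

/* Hypothetical, heavily reduced model of struct tcp_congestion_ops. */
struct ca_ops {
	const char *name;
	unsigned int flags;
};

/* Counts how often the (fake) capability check ran. */
static int cap_checks;

/*
 * Stand-in for ns_capable(): it models a check that is only valid in
 * process context, because the real one inspects current->cred.
 */
static bool fake_ns_capable(bool task_is_admin)
{
	cap_checks++;
	return task_is_admin;
}

/*
 * Shared helper: it no longer looks at the calling task at all and
 * relies on the verdict passed in by the caller.
 */
static int set_congestion_control(const struct ca_ops *ca, bool cap_net_admin)
{
	assert(ca != NULL);
	if (!((ca->flags & CA_NON_RESTRICTED) || cap_net_admin))
		return -EPERM;
	return 0;
}

/* System call path: 'current' is meaningful, so check here. */
static int from_setsockopt(const struct ca_ops *ca, bool task_is_admin)
{
	return set_congestion_control(ca, fake_ns_capable(task_is_admin));
}

/* bpf hook path: no capability check, pass true as the patch does. */
static int from_bpf_hook(const struct ca_ops *ca)
{
	return set_congestion_control(ca, true);
}
```

In this sketch, from_setsockopt() on a restricted algorithm fails with
-EPERM for an unprivileged task, while from_bpf_hook() succeeds and
never runs the capability check.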

diff --git a/include/net/tcp.h b/include/net/tcp.h
index cca3c59b98bf85c2bdd7adf79157159df163b1ae..f42d300f0cfaa87520320dd287a7b4750adf7d8a 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1064,7 +1064,8 @@ void tcp_get_default_congestion_control(struct net *net, char *name);
 void tcp_get_available_congestion_control(char *buf, size_t len);
 void tcp_get_allowed_congestion_control(char *buf, size_t len);
 int tcp_set_allowed_congestion_control(char *allowed);
-int tcp_set_congestion_control(struct sock *sk, const char *name, bool load, bool reinit);
+int tcp_set_congestion_control(struct sock *sk, const char *name, bool load,
+			       bool reinit, bool cap_net_admin);
 u32 tcp_slow_start(struct tcp_sock *tp, u32 acked);
 void tcp_cong_avoid_ai(struct tcp_sock *tp, u32 w, u32 acked);
 
diff --git a/net/core/filter.c b/net/core/filter.c
index 0f6854ccf8949f131f7e229d552f9f947dc205a2..4e2a79b2fd77f36ba2a31e9e43af1abc1207766e 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4335,7 +4335,7 @@ BPF_CALL_5(bpf_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
 						    TCP_CA_NAME_MAX-1));
 			name[TCP_CA_NAME_MAX-1] = 0;
 			ret = tcp_set_congestion_control(sk, name, false,
-							 reinit);
+							 reinit, true);
 		} else {
 			struct tcp_sock *tp = tcp_sk(sk);
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 7846afacdf0bfdbc5ba5c6d48b2c5873df1309c9..776905899ac06bcbaa7ece1f580303478e736d56 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2785,7 +2785,9 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
 		name[val] = 0;
 
 		lock_sock(sk);
-		err = tcp_set_congestion_control(sk, name, true, true);
+		err = tcp_set_congestion_control(sk, name, true, true,
+						 ns_capable(sock_net(sk)->user_ns,
+							    CAP_NET_ADMIN));
 		release_sock(sk);
 		return err;
 	}
diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index e1862b64a90fba25b84dd9d5584e1f843406edd0..c445a81d144ea4ed1c67ad80a96433df35f5f8de 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -333,7 +333,8 @@ int tcp_set_allowed_congestion_control(char *val)
  * tcp_reinit_congestion_control (if the current congestion control was
  * already initialized.
  */
-int tcp_set_congestion_control(struct sock *sk, const char *name, bool load, bool reinit)
+int tcp_set_congestion_control(struct sock *sk, const char *name, bool load,
+			       bool reinit, bool cap_net_admin)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
 	const struct tcp_congestion_ops *ca;
@@ -369,8 +370,7 @@ int tcp_set_congestion_control(struct sock *sk, const char *name, bool load, boo
 		} else {
 			err = -EBUSY;
 		}
-	} else if (!((ca->flags & TCP_CONG_NON_RESTRICTED) ||
-		     ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))) {
+	} else if (!((ca->flags & TCP_CONG_NON_RESTRICTED) || cap_net_admin)) {
 		err = -EPERM;
 	} else if (!try_module_get(ca->owner)) {
 		err = -EBUSY;
-- 
2.22.0.657.g960e92d24f-goog



* Re: [PATCH net] tcp: fix tcp_set_congestion_control() use from bpf hook
  2019-07-19  2:28 [PATCH net] tcp: fix tcp_set_congestion_control() use from bpf hook Eric Dumazet
@ 2019-07-19  2:50 ` Neal Cardwell
  2019-07-19  2:55 ` Lawrence Brakmo
  2019-07-19  3:34 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Neal Cardwell @ 2019-07-19  2:50 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S . Miller, netdev, Eric Dumazet, Lawrence Brakmo

On Thu, Jul 18, 2019 at 10:28 PM Eric Dumazet <edumazet@google.com> wrote:
>
> Neal reported incorrect use of ns_capable() from bpf hook.
>
> bpf_setsockopt(...TCP_CONGESTION...)
>   -> tcp_set_congestion_control()
>    -> ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)
>     -> ns_capable_common()
>      -> current_cred()
>       -> rcu_dereference_protected(current->cred, 1)
>
> Accessing 'current' in bpf context makes no sense, since packets
> are processed from softirq context.
>
> As Neal stated : The capability check in tcp_set_congestion_control()
> was written assuming a system call context, and then was reused from
> a BPF call site.
>
> The fix is to add a new parameter to tcp_set_congestion_control(),
> so that the ns_capable() call is only performed under the right
> context.
>
> Fixes: 91b5b21c7c16 ("bpf: Add support for changing congestion control")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Lawrence Brakmo <brakmo@fb.com>
> Reported-by: Neal Cardwell <ncardwell@google.com>
> ---

Acked-by: Neal Cardwell <ncardwell@google.com>

Thanks, Eric!

neal


* Re: [PATCH net] tcp: fix tcp_set_congestion_control() use from bpf hook
  2019-07-19  2:28 [PATCH net] tcp: fix tcp_set_congestion_control() use from bpf hook Eric Dumazet
  2019-07-19  2:50 ` Neal Cardwell
@ 2019-07-19  2:55 ` Lawrence Brakmo
  2019-07-19  3:34 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Lawrence Brakmo @ 2019-07-19  2:55 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller; +Cc: netdev, Eric Dumazet, Neal Cardwell

On 7/18/19, 7:28 PM, "Eric Dumazet" <edumazet@google.com> wrote:

    Neal reported incorrect use of ns_capable() from bpf hook.
    
    bpf_setsockopt(...TCP_CONGESTION...)
      -> tcp_set_congestion_control()
       -> ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)
        -> ns_capable_common()
         -> current_cred()
          -> rcu_dereference_protected(current->cred, 1)
    
    Accessing 'current' in bpf context makes no sense, since packets
    are processed from softirq context.
    
    As Neal stated : The capability check in tcp_set_congestion_control()
    was written assuming a system call context, and then was reused from
    a BPF call site.
    
    The fix is to add a new parameter to tcp_set_congestion_control(),
    so that the ns_capable() call is only performed under the right
    context.
    
    Fixes: 91b5b21c7c16 ("bpf: Add support for changing congestion control")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Lawrence Brakmo <brakmo@fb.com>
    Reported-by: Neal Cardwell <ncardwell@google.com>
    ---

Acked-by: Lawrence Brakmo <brakmo@fb.com>
Thanks, Eric!
 



* Re: [PATCH net] tcp: fix tcp_set_congestion_control() use from bpf hook
  2019-07-19  2:28 [PATCH net] tcp: fix tcp_set_congestion_control() use from bpf hook Eric Dumazet
  2019-07-19  2:50 ` Neal Cardwell
  2019-07-19  2:55 ` Lawrence Brakmo
@ 2019-07-19  3:34 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: David Miller @ 2019-07-19  3:34 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet, brakmo, ncardwell

From: Eric Dumazet <edumazet@google.com>
Date: Thu, 18 Jul 2019 19:28:14 -0700

> Neal reported incorrect use of ns_capable() from bpf hook.
> 
> bpf_setsockopt(...TCP_CONGESTION...)
>   -> tcp_set_congestion_control()
>    -> ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)
>     -> ns_capable_common()
>      -> current_cred()
>       -> rcu_dereference_protected(current->cred, 1)
> 
> Accessing 'current' in bpf context makes no sense, since packets
> are processed from softirq context.
> 
> As Neal stated : The capability check in tcp_set_congestion_control()
> was written assuming a system call context, and then was reused from
> a BPF call site.
> 
> The fix is to add a new parameter to tcp_set_congestion_control(),
> so that the ns_capable() call is only performed under the right
> context.
> 
> Fixes: 91b5b21c7c16 ("bpf: Add support for changing congestion control")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Lawrence Brakmo <brakmo@fb.com>
> Reported-by: Neal Cardwell <ncardwell@google.com>

Applied and queued up for -stable.

