bpf.vger.kernel.org archive mirror
* [PATCH bpf v4 1/2] bpf: don't return EINVAL from {get,set}sockopt when optlen > PAGE_SIZE
@ 2020-06-16 22:52 Stanislav Fomichev
  2020-06-16 22:52 ` [PATCH bpf v4 2/2] selftests/bpf: make sure optvals > PAGE_SIZE are bypassed Stanislav Fomichev
  2020-06-16 23:05 ` [PATCH bpf v4 1/2] bpf: don't return EINVAL from {get,set}sockopt when optlen > PAGE_SIZE Alexei Starovoitov
  0 siblings, 2 replies; 5+ messages in thread
From: Stanislav Fomichev @ 2020-06-16 22:52 UTC
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, David Laight

Attaching to these hooks can break iptables because its optval is
usually quite big, or at least bigger than the current PAGE_SIZE limit.
David also mentioned some SCTP options can be big (around 256k).

There are two possible ways to fix it:
1. Increase the limit to match iptables max optval. There is, however,
   no clear upper limit. Technically, iptables can accept up to
   512M of data (not sure how practical it is though).

2. Bypass the value (don't expose to BPF) if it's too big and trigger
   BPF only with level/optname so BPF can still decide whether
   to allow/deny big sockopts.

The initial attempt was implemented using strategy #1. Due to
the listed shortcomings, let's switch to strategy #2. When there is
a legitimate real use-case for iptables/SCTP, we can consider increasing
the PAGE_SIZE limit.

To support the cases where len(optval) > PAGE_SIZE we can
leverage upcoming sleepable BPF work by providing a helper
which can do copy_from_user (sleepable) at the given offset
from the original large buffer.
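
As a rough illustration of the "copy at a given offset" idea, here is a
user-space sketch. No such helper exists in this series; the name
copy_optval_chunk() and its signature are hypothetical, and memcpy()
stands in for copy_from_user() so that the example runs outside the
kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a "copy at offset" helper. In the kernel the
 * source would be a user pointer read with copy_from_user(); here it is
 * an ordinary buffer so the sketch runs in user space.
 */
static long copy_optval_chunk(void *dst, size_t dst_len,
			      const void *src, size_t src_len, size_t offset)
{
	size_t n;

	assert(dst != NULL && src != NULL);

	if (offset > src_len)
		return -1;	/* offset is outside the original buffer */

	/* Copy at most dst_len bytes and never read past the end of src. */
	n = src_len - offset;
	if (n > dst_len)
		n = dst_len;

	memcpy(dst, (const char *)src + offset, n);
	return (long)n;
}
```

A caller would walk a large optval in small windows, advancing the
offset by the returned byte count until the helper returns 0.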

v4:
* use temporary buffer to avoid optval == optval_end == NULL;
  this removes the corner case in the verifier that might assume
  non-zero PTR_TO_PACKET/PTR_TO_PACKET_END.

v3:
* don't increase the limit, bypass the argument

v2:
* proper comments formatting (Jakub Kicinski)

Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
Cc: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/linux/filter.h |  1 +
 kernel/bpf/cgroup.c    | 31 +++++++++++++++++++++++++------
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 259377723603..f4565a70f8ba 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1276,6 +1276,7 @@ struct bpf_sockopt_kern {
 	s32		optname;
 	s32		optlen;
 	s32		retval;
+	u8		optval_too_large;
 };
 
 #endif /* __LINUX_FILTER_H__ */
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 4d76f16524cc..be78c01bf459 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -1276,9 +1276,18 @@ static bool __cgroup_bpf_prog_array_is_empty(struct cgroup *cgrp,
 
 static int sockopt_alloc_buf(struct bpf_sockopt_kern *ctx, int max_optlen)
 {
-	if (unlikely(max_optlen > PAGE_SIZE) || max_optlen < 0)
+	if (unlikely(max_optlen < 0))
 		return -EINVAL;
 
+	if (unlikely(max_optlen > PAGE_SIZE)) {
+		/* We don't expose optvals that are greater than PAGE_SIZE
+		 * to the BPF program.
+		 */
+		ctx->optval = &ctx->optval_too_large;
+		ctx->optval_end = &ctx->optval_too_large;
+		return 0;
+	}
+
 	ctx->optval = kzalloc(max_optlen, GFP_USER);
 	if (!ctx->optval)
 		return -ENOMEM;
@@ -1288,9 +1297,15 @@ static int sockopt_alloc_buf(struct bpf_sockopt_kern *ctx, int max_optlen)
 	return 0;
 }
 
+static int sockopt_has_optval(struct bpf_sockopt_kern *ctx)
+{
+	return ctx->optval != &ctx->optval_too_large;
+}
+
 static void sockopt_free_buf(struct bpf_sockopt_kern *ctx)
 {
-	kfree(ctx->optval);
+	if (sockopt_has_optval(ctx))
+		kfree(ctx->optval);
 }
 
 int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
@@ -1325,7 +1340,8 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
 
 	ctx.optlen = *optlen;
 
-	if (copy_from_user(ctx.optval, optval, *optlen) != 0) {
+	if (sockopt_has_optval(&ctx) &&
+	    copy_from_user(ctx.optval, optval, *optlen) != 0) {
 		ret = -EFAULT;
 		goto out;
 	}
@@ -1354,7 +1370,8 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
 		*level = ctx.level;
 		*optname = ctx.optname;
 		*optlen = ctx.optlen;
-		*kernel_optval = ctx.optval;
+		if (sockopt_has_optval(&ctx))
+			*kernel_optval = ctx.optval;
 	}
 
 out:
@@ -1407,7 +1424,8 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
 		if (ctx.optlen > max_optlen)
 			ctx.optlen = max_optlen;
 
-		if (copy_from_user(ctx.optval, optval, ctx.optlen) != 0) {
+		if (sockopt_has_optval(&ctx) &&
+		    copy_from_user(ctx.optval, optval, ctx.optlen) != 0) {
 			ret = -EFAULT;
 			goto out;
 		}
@@ -1436,7 +1454,8 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
 		goto out;
 	}
 
-	if (copy_to_user(optval, ctx.optval, ctx.optlen) ||
+	if ((sockopt_has_optval(&ctx) &&
+	     copy_to_user(optval, ctx.optval, ctx.optlen)) ||
 	    put_user(ctx.optlen, optlen)) {
 		ret = -EFAULT;
 		goto out;
-- 
2.27.0.290.gba653c62da-goog
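
The sentinel-pointer approach taken by the patch above can be modelled
in user space roughly as follows. This is only a sketch: struct
sockopt_model, the model_* functions and MODEL_PAGE_SIZE are stand-ins
invented for illustration, and calloc()/free() replace kzalloc()/kfree().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096	/* assumption: stand-in for PAGE_SIZE */

struct sockopt_model {
	uint8_t *optval;
	uint8_t *optval_end;
	uint8_t optval_too_large;	/* sentinel; only its address is used */
};

/* Returns 0 on success, -1 on a negative length or allocation failure. */
static int model_alloc_buf(struct sockopt_model *ctx, int max_optlen)
{
	assert(ctx != NULL);

	if (max_optlen < 0)
		return -1;

	if (max_optlen > MODEL_PAGE_SIZE) {
		/* Too big: expose an empty, non-NULL window instead. */
		ctx->optval = &ctx->optval_too_large;
		ctx->optval_end = &ctx->optval_too_large;
		return 0;
	}

	/* Allocate at least one byte so a zero length still yields non-NULL. */
	ctx->optval = calloc(1, max_optlen ? (size_t)max_optlen : 1);
	if (!ctx->optval)
		return -1;
	ctx->optval_end = ctx->optval + max_optlen;
	return 0;
}

/* Non-zero when optval points at a real buffer rather than the sentinel. */
static int model_has_optval(const struct sockopt_model *ctx)
{
	return ctx->optval != &ctx->optval_too_large;
}

static void model_free_buf(struct sockopt_model *ctx)
{
	/* The sentinel lives inside ctx and must never be passed to free(). */
	if (model_has_optval(ctx))
		free(ctx->optval);
}
```

Because the sentinel is a member of the context itself, optval and
optval_end are equal and non-NULL in the bypass case, which is the
property the v4 changelog entry asks for.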



* [PATCH bpf v4 2/2] selftests/bpf: make sure optvals > PAGE_SIZE are bypassed
  2020-06-16 22:52 [PATCH bpf v4 1/2] bpf: don't return EINVAL from {get,set}sockopt when optlen > PAGE_SIZE Stanislav Fomichev
@ 2020-06-16 22:52 ` Stanislav Fomichev
  2020-06-16 23:06   ` Alexei Starovoitov
  2020-06-16 23:05 ` [PATCH bpf v4 1/2] bpf: don't return EINVAL from {get,set}sockopt when optlen > PAGE_SIZE Alexei Starovoitov
  1 sibling, 1 reply; 5+ messages in thread
From: Stanislav Fomichev @ 2020-06-16 22:52 UTC
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev

We are relying on the fact that we can pass > sizeof(int) optvals
to the SOL_IP+IP_FREEBIND option (the kernel will take the first 4 bytes).
In the BPF program, we return EPERM if optval is greater than optval_end
(implemented via PTR_TO_PACKET/PTR_TO_PACKET_END) and rely on the verifier
to enforce the fact that this data cannot be touched.
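
The kind of bounds check the verifier insists on before any optval byte
is read can be modelled in user space as below. This is a sketch under
assumptions: struct optval_window and both functions are hypothetical
names, not part of the selftest or of the BPF API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the optval/optval_end pair seen by a program. */
struct optval_window {
	const uint8_t *optval;
	const uint8_t *optval_end;
};

/* Non-zero if the first n bytes of the window may be read. */
static int window_can_read(const struct optval_window *w, size_t n)
{
	assert(w != NULL && w->optval <= w->optval_end);
	return n <= (size_t)(w->optval_end - w->optval);
}

/* Reads byte 0 only after the bounds check; returns -1 when it is hidden. */
static int window_first_byte(const struct optval_window *w)
{
	if (!window_can_read(w, 1))
		return -1;
	return w->optval[0];
}
```

With a bypassed optval both pointers are equal, so the window is empty
and every read of one or more bytes is refused, while level and optname
remain available for an allow/deny decision.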

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../selftests/bpf/prog_tests/sockopt_sk.c     | 26 +++++++++++++++++++
 .../testing/selftests/bpf/progs/sockopt_sk.c  | 20 ++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c b/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c
index 2061a6beac0f..eae1c8a1fee0 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockopt_sk.c
@@ -13,6 +13,7 @@ static int getsetsockopt(void)
 		char cc[16]; /* TCP_CA_NAME_MAX */
 	} buf = {};
 	socklen_t optlen;
+	char *big_buf;
 
 	fd = socket(AF_INET, SOCK_STREAM, 0);
 	if (fd < 0) {
@@ -78,6 +79,31 @@ static int getsetsockopt(void)
 		goto err;
 	}
 
+	/* IP_FREEBIND - BPF can't access optval when optlen > PAGE_SIZE */
+
+	optlen = getpagesize() * 2;
+	big_buf = calloc(1, optlen);
+	if (!big_buf) {
+		log_err("Couldn't allocate two pages");
+		goto err;
+	}
+
+	err = setsockopt(fd, SOL_IP, IP_FREEBIND, big_buf, optlen);
+	if (err != 0) {
+		log_err("Failed to call setsockopt, ret=%d", err);
+		free(big_buf);
+		goto err;
+	}
+
+	err = getsockopt(fd, SOL_IP, IP_FREEBIND, big_buf, &optlen);
+	if (err != 0) {
+		log_err("Failed to call getsockopt, ret=%d", err);
+		free(big_buf);
+		goto err;
+	}
+
+	free(big_buf);
+
 	/* SO_SNDBUF is overwritten */
 
 	buf.u32 = 0x01010101;
diff --git a/tools/testing/selftests/bpf/progs/sockopt_sk.c b/tools/testing/selftests/bpf/progs/sockopt_sk.c
index d5a5eeb5fb52..933a2ef9c930 100644
--- a/tools/testing/selftests/bpf/progs/sockopt_sk.c
+++ b/tools/testing/selftests/bpf/progs/sockopt_sk.c
@@ -51,6 +51,16 @@ int _getsockopt(struct bpf_sockopt *ctx)
 		return 1;
 	}
 
+	if (ctx->level == SOL_IP && ctx->optname == IP_FREEBIND) {
+		if (optval > optval_end) {
+			/* For optval > PAGE_SIZE, the actual data
+			 * is not provided.
+			 */
+			return 0; /* EPERM, unexpected data size */
+		}
+		return 1;
+	}
+
 	if (ctx->level != SOL_CUSTOM)
 		return 0; /* EPERM, deny everything except custom level */
 
@@ -112,6 +122,16 @@ int _setsockopt(struct bpf_sockopt *ctx)
 		return 1;
 	}
 
+	if (ctx->level == SOL_IP && ctx->optname == IP_FREEBIND) {
+		if (optval > optval_end) {
+			/* For optval > PAGE_SIZE, the actual data
+			 * is not provided.
+			 */
+			return 0; /* EPERM, unexpected data size */
+		}
+		return 1;
+	}
+
 	if (ctx->level != SOL_CUSTOM)
 		return 0; /* EPERM, deny everything except custom level */
 
-- 
2.27.0.290.gba653c62da-goog



* Re: [PATCH bpf v4 1/2] bpf: don't return EINVAL from {get,set}sockopt when optlen > PAGE_SIZE
  2020-06-16 22:52 [PATCH bpf v4 1/2] bpf: don't return EINVAL from {get,set}sockopt when optlen > PAGE_SIZE Stanislav Fomichev
  2020-06-16 22:52 ` [PATCH bpf v4 2/2] selftests/bpf: make sure optvals > PAGE_SIZE are bypassed Stanislav Fomichev
@ 2020-06-16 23:05 ` Alexei Starovoitov
  2020-06-16 23:20   ` Stanislav Fomichev
  1 sibling, 1 reply; 5+ messages in thread
From: Alexei Starovoitov @ 2020-06-16 23:05 UTC
  To: Stanislav Fomichev
  Cc: Network Development, bpf, David S. Miller, Alexei Starovoitov,
	Daniel Borkmann, David Laight

On Tue, Jun 16, 2020 at 3:53 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> Attaching to these hooks can break iptables because its optval is
> usually quite big, or at least bigger than the current PAGE_SIZE limit.
> David also mentioned some SCTP options can be big (around 256k).
>
> There are two possible ways to fix it:
> 1. Increase the limit to match iptables max optval. There is, however,
>    no clear upper limit. Technically, iptables can accept up to
>    512M of data (not sure how practical it is though).
>
> 2. Bypass the value (don't expose to BPF) if it's too big and trigger
>    BPF only with level/optname so BPF can still decide whether
>    to allow/deny big sockopts.
>
> The initial attempt was implemented using strategy #1. Due to
> the listed shortcomings, let's switch to strategy #2. When there is
> a legitimate real use-case for iptables/SCTP, we can consider increasing
> the PAGE_SIZE limit.
>
> To support the cases where len(optval) > PAGE_SIZE we can
> leverage upcoming sleepable BPF work by providing a helper
> which can do copy_from_user (sleepable) at the given offset
> from the original large buffer.
>
> v4:
> * use temporary buffer to avoid optval == optval_end == NULL;
>   this removes the corner case in the verifier that might assume
>   non-zero PTR_TO_PACKET/PTR_TO_PACKET_END.

just replied with another idea in v3 thread...


* Re: [PATCH bpf v4 2/2] selftests/bpf: make sure optvals > PAGE_SIZE are bypassed
  2020-06-16 22:52 ` [PATCH bpf v4 2/2] selftests/bpf: make sure optvals > PAGE_SIZE are bypassed Stanislav Fomichev
@ 2020-06-16 23:06   ` Alexei Starovoitov
  0 siblings, 0 replies; 5+ messages in thread
From: Alexei Starovoitov @ 2020-06-16 23:06 UTC
  To: Stanislav Fomichev
  Cc: Network Development, bpf, David S. Miller, Alexei Starovoitov,
	Daniel Borkmann

On Tue, Jun 16, 2020 at 3:53 PM Stanislav Fomichev <sdf@google.com> wrote:
> +       if (ctx->level == SOL_IP && ctx->optname == IP_FREEBIND) {
> +               if (optval > optval_end) {

same issue as before ?
see reply in v3.

> +                       /* For optval > PAGE_SIZE, the actual data
> +                        * is not provided.
> +                        */
> +                       return 0; /* EPERM, unexpected data size */
> +               }
> +               return 1;
> +       }
> +


* Re: [PATCH bpf v4 1/2] bpf: don't return EINVAL from {get,set}sockopt when optlen > PAGE_SIZE
  2020-06-16 23:05 ` [PATCH bpf v4 1/2] bpf: don't return EINVAL from {get,set}sockopt when optlen > PAGE_SIZE Alexei Starovoitov
@ 2020-06-16 23:20   ` Stanislav Fomichev
  0 siblings, 0 replies; 5+ messages in thread
From: Stanislav Fomichev @ 2020-06-16 23:20 UTC
  To: Alexei Starovoitov
  Cc: Network Development, bpf, David S. Miller, Alexei Starovoitov,
	Daniel Borkmann, David Laight

On Tue, Jun 16, 2020 at 4:05 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Jun 16, 2020 at 3:53 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > Attaching to these hooks can break iptables because its optval is
> > usually quite big, or at least bigger than the current PAGE_SIZE limit.
> > David also mentioned some SCTP options can be big (around 256k).
> >
> > There are two possible ways to fix it:
> > 1. Increase the limit to match iptables max optval. There is, however,
> >    no clear upper limit. Technically, iptables can accept up to
> >    512M of data (not sure how practical it is though).
> >
> > 2. Bypass the value (don't expose to BPF) if it's too big and trigger
> >    BPF only with level/optname so BPF can still decide whether
> >    to allow/deny big sockopts.
> >
> > The initial attempt was implemented using strategy #1. Due to
> > the listed shortcomings, let's switch to strategy #2. When there is
> > a legitimate real use-case for iptables/SCTP, we can consider increasing
> > the PAGE_SIZE limit.
> >
> > To support the cases where len(optval) > PAGE_SIZE we can
> > leverage upcoming sleepable BPF work by providing a helper
> > which can do copy_from_user (sleepable) at the given offset
> > from the original large buffer.
> >
> > v4:
> > * use temporary buffer to avoid optval == optval_end == NULL;
> >   this removes the corner case in the verifier that might assume
> >   non-zero PTR_TO_PACKET/PTR_TO_PACKET_END.
>
> just replied with another idea in v3 thread...
Yeah, sorry about that, posted 5 mins before your reply :-( Sorry for the noise.
