* [PATCH bpf-next v2] bpf: bpf_{g,s}etsockopt for struct bpf_sock
@ 2020-04-29 17:05 Stanislav Fomichev
From: Stanislav Fomichev @ 2020-04-29 17:05 UTC (permalink / raw)
To: netdev, bpf
Cc: davem, ast, daniel, Stanislav Fomichev, John Fastabend, Martin KaFai Lau
Currently, the bpf_getsockopt and bpf_setsockopt helpers operate on the
'struct bpf_sock_ops' context in BPF_PROG_TYPE_SOCK_OPS programs.
Let's generalize them and make the first argument be 'struct bpf_sock'.
That way, in the future, we can allow those helpers in more places.
BPF_PROG_TYPE_SOCK_OPS still has the existing helpers that operate
on 'struct bpf_sock_ops', but we add new bpf_{g,s}etsockopt that work
on 'struct bpf_sock'. [Alternatively, for BPF_PROG_TYPE_SOCK_OPS,
we can enable them both and teach verifier to pick the right one
based on the context (bpf_sock_ops vs bpf_sock).]
As an example, let's allow those 'struct bpf_sock' based helpers to
be called from the BPF_CGROUP_INET{4,6}_CONNECT hooks. That way
we can override CC before the connection is made.
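A minimal sketch of what this enables from a connect hook (modeled on the
connect4 selftest added by this patch; the program name and the
TCP_CA_NAME_MAX fallback are illustrative, not part of the patch):

```c
/* Sketch only: assumes the usual selftest includes
 * (<bpf/bpf_helpers.h>, <netinet/tcp.h>, <sys/socket.h>). */
#ifndef TCP_CA_NAME_MAX
#define TCP_CA_NAME_MAX 16
#endif

SEC("cgroup/connect4")
int override_cc(struct bpf_sock_addr *ctx)
{
	char cc[TCP_CA_NAME_MAX] = "dctcp";

	/* ctx->sk is the context socket, already held by the caller,
	 * so the new 'struct bpf_sock'-based helper can act on it. */
	if (ctx->type == SOCK_STREAM &&
	    bpf_setsockopt(ctx->sk, SOL_TCP, TCP_CONGESTION,
			   &cc, sizeof(cc)))
		return 0; /* reject the connect on failure */

	return 1;
}
```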
v2:
* s/BPF_PROG_TYPE_CGROUP_SOCKOPT/BPF_PROG_TYPE_SOCK_OPS/
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
include/uapi/linux/bpf.h | 12 +-
net/core/filter.c | 113 ++++++++++++++----
tools/include/uapi/linux/bpf.h | 12 +-
tools/testing/selftests/bpf/config | 1 +
.../selftests/bpf/progs/connect4_prog.c | 46 +++++++
5 files changed, 159 insertions(+), 25 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4a6c47f3febe..b50a316c6289 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1564,7 +1564,7 @@ union bpf_attr {
* Return
* 0
*
- * int bpf_setsockopt(struct bpf_sock_ops *bpf_socket, int level, int optname, void *optval, int optlen)
+ * int bpf_setsockopt(void *bpf_socket, int level, int optname, void *optval, int optlen)
* Description
* Emulate a call to **setsockopt()** on the socket associated to
* *bpf_socket*, which must be a full socket. The *level* at
@@ -1572,6 +1572,10 @@ union bpf_attr {
* must be specified, see **setsockopt(2)** for more information.
* The option value of length *optlen* is pointed by *optval*.
*
+ * *bpf_socket* should be one of the following:
+ * * **struct bpf_sock_ops** for **BPF_PROG_TYPE_SOCK_OPS**.
+ * * **struct bpf_sock** for the other types.
+ *
* This helper actually implements a subset of **setsockopt()**.
* It supports the following *level*\ s:
*
@@ -1766,7 +1770,7 @@ union bpf_attr {
* Return
* 0 on success, or a negative error in case of failure.
*
- * int bpf_getsockopt(struct bpf_sock_ops *bpf_socket, int level, int optname, void *optval, int optlen)
+ * int bpf_getsockopt(void *bpf_socket, int level, int optname, void *optval, int optlen)
* Description
* Emulate a call to **getsockopt()** on the socket associated to
* *bpf_socket*, which must be a full socket. The *level* at
@@ -1775,6 +1779,10 @@ union bpf_attr {
* The retrieved value is stored in the structure pointed by
* *opval* and of length *optlen*.
*
+ * *bpf_socket* should be one of the following:
+ * * **struct bpf_sock_ops** for **BPF_PROG_TYPE_SOCK_OPS**.
+ * * **struct bpf_sock** for the other types.
+ *
* This helper actually implements a subset of **getsockopt()**.
* It supports the following *level*\ s:
*
diff --git a/net/core/filter.c b/net/core/filter.c
index da3b7a72c37c..0e8f2c9cfe27 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4194,16 +4194,19 @@ static const struct bpf_func_proto bpf_get_socket_uid_proto = {
.arg1_type = ARG_PTR_TO_CTX,
};
-BPF_CALL_5(bpf_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
- int, level, int, optname, char *, optval, int, optlen)
+#define SOCKOPT_CC_REINIT (1 << 0)
+
+static int _bpf_setsockopt(struct sock *sk, int level, int optname,
+ char *optval, int optlen, u32 flags)
{
- struct sock *sk = bpf_sock->sk;
int ret = 0;
int val;
if (!sk_fullsock(sk))
return -EINVAL;
+ sock_owned_by_me(sk);
+
if (level == SOL_SOCKET) {
if (optlen != sizeof(int))
return -EINVAL;
@@ -4298,7 +4301,7 @@ BPF_CALL_5(bpf_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
sk->sk_prot->setsockopt == tcp_setsockopt) {
if (optname == TCP_CONGESTION) {
char name[TCP_CA_NAME_MAX];
- bool reinit = bpf_sock->op > BPF_SOCK_OPS_NEEDS_ECN;
+ bool reinit = flags & SOCKOPT_CC_REINIT;
strncpy(name, optval, min_t(long, optlen,
TCP_CA_NAME_MAX-1));
@@ -4345,24 +4348,14 @@ BPF_CALL_5(bpf_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
return ret;
}
-static const struct bpf_func_proto bpf_setsockopt_proto = {
- .func = bpf_setsockopt,
- .gpl_only = false,
- .ret_type = RET_INTEGER,
- .arg1_type = ARG_PTR_TO_CTX,
- .arg2_type = ARG_ANYTHING,
- .arg3_type = ARG_ANYTHING,
- .arg4_type = ARG_PTR_TO_MEM,
- .arg5_type = ARG_CONST_SIZE,
-};
-
-BPF_CALL_5(bpf_getsockopt, struct bpf_sock_ops_kern *, bpf_sock,
- int, level, int, optname, char *, optval, int, optlen)
+static int _bpf_getsockopt(struct sock *sk, int level, int optname,
+ char *optval, int optlen)
{
- struct sock *sk = bpf_sock->sk;
-
if (!sk_fullsock(sk))
goto err_clear;
+
+ sock_owned_by_me(sk);
+
#ifdef CONFIG_INET
if (level == SOL_TCP && sk->sk_prot->getsockopt == tcp_getsockopt) {
struct inet_connection_sock *icsk;
@@ -4428,10 +4421,72 @@ BPF_CALL_5(bpf_getsockopt, struct bpf_sock_ops_kern *, bpf_sock,
return -EINVAL;
}
+BPF_CALL_5(bpf_setsockopt, struct sock *, sk,
+ int, level, int, optname, char *, optval, int, optlen)
+{
+ u32 flags = 0;
+ return _bpf_setsockopt(sk, level, optname, optval, optlen, flags);
+}
+
+static const struct bpf_func_proto bpf_setsockopt_proto = {
+ .func = bpf_setsockopt,
+ .gpl_only = false,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_PTR_TO_SOCKET,
+ .arg2_type = ARG_ANYTHING,
+ .arg3_type = ARG_ANYTHING,
+ .arg4_type = ARG_PTR_TO_MEM,
+ .arg5_type = ARG_CONST_SIZE,
+};
+
+BPF_CALL_5(bpf_getsockopt, struct sock *, sk,
+ int, level, int, optname, char *, optval, int, optlen)
+{
+ return _bpf_getsockopt(sk, level, optname, optval, optlen);
+}
+
static const struct bpf_func_proto bpf_getsockopt_proto = {
.func = bpf_getsockopt,
.gpl_only = false,
.ret_type = RET_INTEGER,
+ .arg1_type = ARG_PTR_TO_SOCKET,
+ .arg2_type = ARG_ANYTHING,
+ .arg3_type = ARG_ANYTHING,
+ .arg4_type = ARG_PTR_TO_UNINIT_MEM,
+ .arg5_type = ARG_CONST_SIZE,
+};
+
+BPF_CALL_5(bpf_sock_ops_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
+ int, level, int, optname, char *, optval, int, optlen)
+{
+ u32 flags = 0;
+ if (bpf_sock->op > BPF_SOCK_OPS_NEEDS_ECN)
+ flags |= SOCKOPT_CC_REINIT;
+ return _bpf_setsockopt(bpf_sock->sk, level, optname, optval, optlen,
+ flags);
+}
+
+static const struct bpf_func_proto bpf_sock_ops_setsockopt_proto = {
+ .func = bpf_sock_ops_setsockopt,
+ .gpl_only = false,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_PTR_TO_CTX,
+ .arg2_type = ARG_ANYTHING,
+ .arg3_type = ARG_ANYTHING,
+ .arg4_type = ARG_PTR_TO_MEM,
+ .arg5_type = ARG_CONST_SIZE,
+};
+
+BPF_CALL_5(bpf_sock_ops_getsockopt, struct bpf_sock_ops_kern *, bpf_sock,
+ int, level, int, optname, char *, optval, int, optlen)
+{
+ return _bpf_getsockopt(bpf_sock->sk, level, optname, optval, optlen);
+}
+
+static const struct bpf_func_proto bpf_sock_ops_getsockopt_proto = {
+ .func = bpf_sock_ops_getsockopt,
+ .gpl_only = false,
+ .ret_type = RET_INTEGER,
.arg1_type = ARG_PTR_TO_CTX,
.arg2_type = ARG_ANYTHING,
.arg3_type = ARG_ANYTHING,
@@ -6043,6 +6098,22 @@ sock_addr_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
return &bpf_sk_storage_get_proto;
case BPF_FUNC_sk_storage_delete:
return &bpf_sk_storage_delete_proto;
+ case BPF_FUNC_setsockopt:
+ switch (prog->expected_attach_type) {
+ case BPF_CGROUP_INET4_CONNECT:
+ case BPF_CGROUP_INET6_CONNECT:
+ return &bpf_setsockopt_proto;
+ default:
+ return NULL;
+ }
+ case BPF_FUNC_getsockopt:
+ switch (prog->expected_attach_type) {
+ case BPF_CGROUP_INET4_CONNECT:
+ case BPF_CGROUP_INET6_CONNECT:
+ return &bpf_getsockopt_proto;
+ default:
+ return NULL;
+ }
default:
return bpf_base_func_proto(func_id);
}
@@ -6261,9 +6332,9 @@ sock_ops_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
{
switch (func_id) {
case BPF_FUNC_setsockopt:
- return &bpf_setsockopt_proto;
+ return &bpf_sock_ops_setsockopt_proto;
case BPF_FUNC_getsockopt:
- return &bpf_getsockopt_proto;
+ return &bpf_sock_ops_getsockopt_proto;
case BPF_FUNC_sock_ops_cb_flags_set:
return &bpf_sock_ops_cb_flags_set_proto;
case BPF_FUNC_sock_map_update:
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 4a6c47f3febe..b50a316c6289 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1564,7 +1564,7 @@ union bpf_attr {
* Return
* 0
*
- * int bpf_setsockopt(struct bpf_sock_ops *bpf_socket, int level, int optname, void *optval, int optlen)
+ * int bpf_setsockopt(void *bpf_socket, int level, int optname, void *optval, int optlen)
* Description
* Emulate a call to **setsockopt()** on the socket associated to
* *bpf_socket*, which must be a full socket. The *level* at
@@ -1572,6 +1572,10 @@ union bpf_attr {
* must be specified, see **setsockopt(2)** for more information.
* The option value of length *optlen* is pointed by *optval*.
*
+ * *bpf_socket* should be one of the following:
+ * * **struct bpf_sock_ops** for **BPF_PROG_TYPE_SOCK_OPS**.
+ * * **struct bpf_sock** for the other types.
+ *
* This helper actually implements a subset of **setsockopt()**.
* It supports the following *level*\ s:
*
@@ -1766,7 +1770,7 @@ union bpf_attr {
* Return
* 0 on success, or a negative error in case of failure.
*
- * int bpf_getsockopt(struct bpf_sock_ops *bpf_socket, int level, int optname, void *optval, int optlen)
+ * int bpf_getsockopt(void *bpf_socket, int level, int optname, void *optval, int optlen)
* Description
* Emulate a call to **getsockopt()** on the socket associated to
* *bpf_socket*, which must be a full socket. The *level* at
@@ -1775,6 +1779,10 @@ union bpf_attr {
* The retrieved value is stored in the structure pointed by
* *opval* and of length *optlen*.
*
+ * *bpf_socket* should be one of the following:
+ * * **struct bpf_sock_ops** for **BPF_PROG_TYPE_SOCK_OPS**.
+ * * **struct bpf_sock** for the other types.
+ *
* This helper actually implements a subset of **getsockopt()**.
* It supports the following *level*\ s:
*
diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index 60e3ae5d4e48..6e5b94c036ca 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -37,3 +37,4 @@ CONFIG_IPV6_SIT=m
CONFIG_BPF_JIT=y
CONFIG_BPF_LSM=y
CONFIG_SECURITY=y
+CONFIG_TCP_CONG_DCTCP=y
diff --git a/tools/testing/selftests/bpf/progs/connect4_prog.c b/tools/testing/selftests/bpf/progs/connect4_prog.c
index ad3c498a8150..656eb8c1ffc8 100644
--- a/tools/testing/selftests/bpf/progs/connect4_prog.c
+++ b/tools/testing/selftests/bpf/progs/connect4_prog.c
@@ -8,6 +8,7 @@
#include <linux/in.h>
#include <linux/in6.h>
#include <sys/socket.h>
+#include <netinet/tcp.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>
@@ -16,6 +17,10 @@
#define DST_REWRITE_IP4 0x7f000001U
#define DST_REWRITE_PORT4 4444
+#ifndef TCP_CA_NAME_MAX
+#define TCP_CA_NAME_MAX 16
+#endif
+
int _version SEC("version") = 1;
__attribute__ ((noinline))
@@ -33,6 +38,43 @@ int do_bind(struct bpf_sock_addr *ctx)
return 1;
}
+static __inline int verify_cc(struct bpf_sock *sk,
+ char expected[TCP_CA_NAME_MAX])
+{
+ char buf[TCP_CA_NAME_MAX];
+ int i;
+
+ if (bpf_getsockopt(sk, SOL_TCP, TCP_CONGESTION, &buf, sizeof(buf)))
+ return 1;
+
+ for (i = 0; i < TCP_CA_NAME_MAX; i++) {
+ if (buf[i] != expected[i])
+ return 1;
+ if (buf[i] == 0)
+ break;
+ }
+
+ return 0;
+}
+
+static __inline int set_cc(struct bpf_sock *sk)
+{
+ char dctcp[TCP_CA_NAME_MAX] = "dctcp";
+ char cubic[TCP_CA_NAME_MAX] = "cubic";
+
+ if (bpf_setsockopt(sk, SOL_TCP, TCP_CONGESTION, &dctcp, sizeof(dctcp)))
+ return 1;
+ if (verify_cc(sk, dctcp))
+ return 1;
+
+ if (bpf_setsockopt(sk, SOL_TCP, TCP_CONGESTION, &cubic, sizeof(cubic)))
+ return 1;
+ if (verify_cc(sk, cubic))
+ return 1;
+
+ return 0;
+}
+
SEC("cgroup/connect4")
int connect_v4_prog(struct bpf_sock_addr *ctx)
{
@@ -66,6 +108,10 @@ int connect_v4_prog(struct bpf_sock_addr *ctx)
bpf_sk_release(sk);
+ /* Rewrite congestion control. */
+ if (ctx->type == SOCK_STREAM && set_cc(ctx->sk))
+ return 0;
+
/* Rewrite destination. */
ctx->user_ip4 = bpf_htonl(DST_REWRITE_IP4);
ctx->user_port = bpf_htons(DST_REWRITE_PORT4);
--
2.26.2.303.gf8c07b1a785-goog
* Re: [PATCH bpf-next v2] bpf: bpf_{g,s}etsockopt for struct bpf_sock
From: Daniel Borkmann @ 2020-04-29 23:08 UTC (permalink / raw)
To: Stanislav Fomichev, netdev, bpf
Cc: davem, ast, John Fastabend, Martin KaFai Lau
On 4/29/20 7:05 PM, Stanislav Fomichev wrote:
> Currently, the bpf_getsockopt and bpf_setsockopt helpers operate on the
> 'struct bpf_sock_ops' context in BPF_PROG_TYPE_SOCK_OPS programs.
> Let's generalize them and make the first argument be 'struct bpf_sock'.
> That way, in the future, we can allow those helpers in more places.
>
> BPF_PROG_TYPE_SOCK_OPS still has the existing helpers that operate
> on 'struct bpf_sock_ops', but we add new bpf_{g,s}etsockopt that work
> on 'struct bpf_sock'. [Alternatively, for BPF_PROG_TYPE_SOCK_OPS,
> we can enable them both and teach verifier to pick the right one
> based on the context (bpf_sock_ops vs bpf_sock).]
>
> As an example, let's allow those 'struct bpf_sock' based helpers to
> be called from the BPF_CGROUP_INET{4,6}_CONNECT hooks. That way
> we can override CC before the connection is made.
>
> v2:
> * s/BPF_PROG_TYPE_CGROUP_SOCKOPT/BPF_PROG_TYPE_SOCK_OPS/
>
> Acked-by: John Fastabend <john.fastabend@gmail.com>
> Acked-by: Martin KaFai Lau <kafai@fb.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
[...]
> +BPF_CALL_5(bpf_setsockopt, struct sock *, sk,
> + int, level, int, optname, char *, optval, int, optlen)
> +{
> + u32 flags = 0;
> + return _bpf_setsockopt(sk, level, optname, optval, optlen, flags);
> +}
> +
> +static const struct bpf_func_proto bpf_setsockopt_proto = {
> + .func = bpf_setsockopt,
> + .gpl_only = false,
> + .ret_type = RET_INTEGER,
> + .arg1_type = ARG_PTR_TO_SOCKET,
> + .arg2_type = ARG_ANYTHING,
> + .arg3_type = ARG_ANYTHING,
> + .arg4_type = ARG_PTR_TO_MEM,
> + .arg5_type = ARG_CONST_SIZE,
> +};
> +
> +BPF_CALL_5(bpf_getsockopt, struct sock *, sk,
> + int, level, int, optname, char *, optval, int, optlen)
> +{
> + return _bpf_getsockopt(sk, level, optname, optval, optlen);
> +}
> +
> static const struct bpf_func_proto bpf_getsockopt_proto = {
> .func = bpf_getsockopt,
> .gpl_only = false,
> .ret_type = RET_INTEGER,
> + .arg1_type = ARG_PTR_TO_SOCKET,
> + .arg2_type = ARG_ANYTHING,
> + .arg3_type = ARG_ANYTHING,
> + .arg4_type = ARG_PTR_TO_UNINIT_MEM,
> + .arg5_type = ARG_CONST_SIZE,
> +};
> +
[...]
> @@ -6043,6 +6098,22 @@ sock_addr_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> return &bpf_sk_storage_get_proto;
> case BPF_FUNC_sk_storage_delete:
> return &bpf_sk_storage_delete_proto;
> + case BPF_FUNC_setsockopt:
> + switch (prog->expected_attach_type) {
> + case BPF_CGROUP_INET4_CONNECT:
> + case BPF_CGROUP_INET6_CONNECT:
> + return &bpf_setsockopt_proto;
Hm, I'm not sure this is safe. In the sock_addr_func_proto() we also have
other helpers callable from connect hooks like sk_lookup_{tcp,udp} which
return a PTR_TO_SOCKET_OR_NULL, and now we can pass those sockets also into
bpf_{get,set}sockopt() helper after lookup to change various sk related stuff
but w/o being under lock. Doesn't the sock_owned_by_me() yell here at minimum
(I'd expect so)?
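Concretely, the pattern this concern is about might look like the following
(a hypothetical program, not code from the patch; the tuple setup is
abbreviated and the CC name is arbitrary):

```c
SEC("cgroup/connect4")
int lookup_then_set(struct bpf_sock_addr *ctx)
{
	struct bpf_sock_tuple tuple = {}; /* ipv4 fields elided */
	char cc[16] = "reno";
	struct bpf_sock *sk;

	/* Returns PTR_TO_SOCKET_OR_NULL: a socket the program does
	 * NOT own the lock on. */
	sk = bpf_sk_lookup_tcp(ctx, &tuple, sizeof(tuple.ipv4), -1, 0);
	if (!sk)
		return 1;

	/* With ARG_PTR_TO_SOCKET this now passes the verifier, but it
	 * mutates sk without lock_sock(); sock_owned_by_me() should
	 * complain here under lockdep. */
	bpf_setsockopt(sk, SOL_TCP, TCP_CONGESTION, &cc, sizeof(cc));

	bpf_sk_release(sk);
	return 1;
}
```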
> + default:
> + return NULL;
> + }
> + case BPF_FUNC_getsockopt:
> + switch (prog->expected_attach_type) {
> + case BPF_CGROUP_INET4_CONNECT:
> + case BPF_CGROUP_INET6_CONNECT:
> + return &bpf_getsockopt_proto;
> + default:
> + return NULL;
> + }
> default:
* Re: [PATCH bpf-next v2] bpf: bpf_{g,s}etsockopt for struct bpf_sock
From: sdf @ 2020-04-29 23:33 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: netdev, bpf, davem, ast, John Fastabend, Martin KaFai Lau
On 04/30, Daniel Borkmann wrote:
> On 4/29/20 7:05 PM, Stanislav Fomichev wrote:
> > Currently, the bpf_getsockopt and bpf_setsockopt helpers operate on the
> > 'struct bpf_sock_ops' context in BPF_PROG_TYPE_SOCK_OPS programs.
> > Let's generalize them and make the first argument be 'struct bpf_sock'.
> > That way, in the future, we can allow those helpers in more places.
> >
> > BPF_PROG_TYPE_SOCK_OPS still has the existing helpers that operate
> > on 'struct bpf_sock_ops', but we add new bpf_{g,s}etsockopt that work
> > on 'struct bpf_sock'. [Alternatively, for BPF_PROG_TYPE_SOCK_OPS,
> > we can enable them both and teach verifier to pick the right one
> > based on the context (bpf_sock_ops vs bpf_sock).]
> >
> > As an example, let's allow those 'struct bpf_sock' based helpers to
> > be called from the BPF_CGROUP_INET{4,6}_CONNECT hooks. That way
> > we can override CC before the connection is made.
> >
> > v2:
> > * s/BPF_PROG_TYPE_CGROUP_SOCKOPT/BPF_PROG_TYPE_SOCK_OPS/
> >
> > Acked-by: John Fastabend <john.fastabend@gmail.com>
> > Acked-by: Martin KaFai Lau <kafai@fb.com>
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> [...]
> > +BPF_CALL_5(bpf_setsockopt, struct sock *, sk,
> > + int, level, int, optname, char *, optval, int, optlen)
> > +{
> > + u32 flags = 0;
> > + return _bpf_setsockopt(sk, level, optname, optval, optlen, flags);
> > +}
> > +
> > +static const struct bpf_func_proto bpf_setsockopt_proto = {
> > + .func = bpf_setsockopt,
> > + .gpl_only = false,
> > + .ret_type = RET_INTEGER,
> > + .arg1_type = ARG_PTR_TO_SOCKET,
> > + .arg2_type = ARG_ANYTHING,
> > + .arg3_type = ARG_ANYTHING,
> > + .arg4_type = ARG_PTR_TO_MEM,
> > + .arg5_type = ARG_CONST_SIZE,
> > +};
> > +
> > +BPF_CALL_5(bpf_getsockopt, struct sock *, sk,
> > + int, level, int, optname, char *, optval, int, optlen)
> > +{
> > + return _bpf_getsockopt(sk, level, optname, optval, optlen);
> > +}
> > +
> > static const struct bpf_func_proto bpf_getsockopt_proto = {
> > .func = bpf_getsockopt,
> > .gpl_only = false,
> > .ret_type = RET_INTEGER,
> > + .arg1_type = ARG_PTR_TO_SOCKET,
> > + .arg2_type = ARG_ANYTHING,
> > + .arg3_type = ARG_ANYTHING,
> > + .arg4_type = ARG_PTR_TO_UNINIT_MEM,
> > + .arg5_type = ARG_CONST_SIZE,
> > +};
> > +
> [...]
> > @@ -6043,6 +6098,22 @@ sock_addr_func_proto(enum bpf_func_id func_id,
> const struct bpf_prog *prog)
> > return &bpf_sk_storage_get_proto;
> > case BPF_FUNC_sk_storage_delete:
> > return &bpf_sk_storage_delete_proto;
> > + case BPF_FUNC_setsockopt:
> > + switch (prog->expected_attach_type) {
> > + case BPF_CGROUP_INET4_CONNECT:
> > + case BPF_CGROUP_INET6_CONNECT:
> > + return &bpf_setsockopt_proto;
> Hm, I'm not sure this is safe. In the sock_addr_func_proto() we also have
> other helpers callable from connect hooks like sk_lookup_{tcp,udp} which
> return a PTR_TO_SOCKET_OR_NULL, and now we can pass those sockets also
> into
> bpf_{get,set}sockopt() helper after lookup to change various sk related
> stuff
> but w/o being under lock. Doesn't the sock_owned_by_me() yell here at
> minimum
> (I'd expect so)?
Ugh, good point, I missed the fact that sk_lookup_{tcp,udp} are there
for sock_addr :-( I can try to do a simple test case to verify
that sock_owned_by_me triggers, but I'm pretty certain it should
(I've been calling bpf_{s,g}etsockopt for the context socket, so it's quiet).
I don't think there is any helper similar to sock_owned_by_me() that
I can call to verify that the socket is held by the current thread
(without the lockdep splat) and bail out?
In this case, is adding something like a new PTR_TO_LOCKED_SOCKET_OR_NULL
the way to go? Any other ideas?
* Re: [PATCH bpf-next v2] bpf: bpf_{g,s}etsockopt for struct bpf_sock
From: Alexei Starovoitov @ 2020-04-30 2:24 UTC (permalink / raw)
To: sdf
Cc: Daniel Borkmann, netdev, bpf, davem, ast, John Fastabend,
Martin KaFai Lau
On Wed, Apr 29, 2020 at 04:33:12PM -0700, sdf@google.com wrote:
> On 04/30, Daniel Borkmann wrote:
> > On 4/29/20 7:05 PM, Stanislav Fomichev wrote:
> > > Currently, the bpf_getsockopt and bpf_setsockopt helpers operate on the
> > > 'struct bpf_sock_ops' context in BPF_PROG_TYPE_SOCK_OPS programs.
> > > Let's generalize them and make the first argument be 'struct bpf_sock'.
> > > That way, in the future, we can allow those helpers in more places.
> > >
> > > BPF_PROG_TYPE_SOCK_OPS still has the existing helpers that operate
> > > on 'struct bpf_sock_ops', but we add new bpf_{g,s}etsockopt that work
> > > on 'struct bpf_sock'. [Alternatively, for BPF_PROG_TYPE_SOCK_OPS,
> > > we can enable them both and teach verifier to pick the right one
> > > based on the context (bpf_sock_ops vs bpf_sock).]
> > >
> > > As an example, let's allow those 'struct bpf_sock' based helpers to
> > > be called from the BPF_CGROUP_INET{4,6}_CONNECT hooks. That way
> > > we can override CC before the connection is made.
> > >
> > > v2:
> > > * s/BPF_PROG_TYPE_CGROUP_SOCKOPT/BPF_PROG_TYPE_SOCK_OPS/
> > >
> > > Acked-by: John Fastabend <john.fastabend@gmail.com>
> > > Acked-by: Martin KaFai Lau <kafai@fb.com>
> > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > [...]
> > > +BPF_CALL_5(bpf_setsockopt, struct sock *, sk,
> > > + int, level, int, optname, char *, optval, int, optlen)
> > > +{
> > > + u32 flags = 0;
> > > + return _bpf_setsockopt(sk, level, optname, optval, optlen, flags);
> > > +}
> > > +
> > > +static const struct bpf_func_proto bpf_setsockopt_proto = {
> > > + .func = bpf_setsockopt,
> > > + .gpl_only = false,
> > > + .ret_type = RET_INTEGER,
> > > + .arg1_type = ARG_PTR_TO_SOCKET,
> > > + .arg2_type = ARG_ANYTHING,
> > > + .arg3_type = ARG_ANYTHING,
> > > + .arg4_type = ARG_PTR_TO_MEM,
> > > + .arg5_type = ARG_CONST_SIZE,
> > > +};
> > > +
> > > +BPF_CALL_5(bpf_getsockopt, struct sock *, sk,
> > > + int, level, int, optname, char *, optval, int, optlen)
> > > +{
> > > + return _bpf_getsockopt(sk, level, optname, optval, optlen);
> > > +}
> > > +
> > > static const struct bpf_func_proto bpf_getsockopt_proto = {
> > > .func = bpf_getsockopt,
> > > .gpl_only = false,
> > > .ret_type = RET_INTEGER,
> > > + .arg1_type = ARG_PTR_TO_SOCKET,
> > > + .arg2_type = ARG_ANYTHING,
> > > + .arg3_type = ARG_ANYTHING,
> > > + .arg4_type = ARG_PTR_TO_UNINIT_MEM,
> > > + .arg5_type = ARG_CONST_SIZE,
> > > +};
> > > +
> > [...]
> > > @@ -6043,6 +6098,22 @@ sock_addr_func_proto(enum bpf_func_id func_id,
> > const struct bpf_prog *prog)
> > > return &bpf_sk_storage_get_proto;
> > > case BPF_FUNC_sk_storage_delete:
> > > return &bpf_sk_storage_delete_proto;
> > > + case BPF_FUNC_setsockopt:
> > > + switch (prog->expected_attach_type) {
> > > + case BPF_CGROUP_INET4_CONNECT:
> > > + case BPF_CGROUP_INET6_CONNECT:
> > > + return &bpf_setsockopt_proto;
>
> > Hm, I'm not sure this is safe. In the sock_addr_func_proto() we also have
> > other helpers callable from connect hooks like sk_lookup_{tcp,udp} which
> > return a PTR_TO_SOCKET_OR_NULL, and now we can pass those sockets also
> > into
> > bpf_{get,set}sockopt() helper after lookup to change various sk related
> > stuff
> > but w/o being under lock. Doesn't the sock_owned_by_me() yell here at
> > minimum
> > (I'd expect so)?
> Ugh, good point, I missed the fact that sk_lookup_{tcp,udp} are there
> for sock_addr :-( I can try to do a simple test case to verify
> that sock_owned_by_me triggers, but I'm pretty certain it should
> (I've been calling bpf_{s,g}etsockopt for context socket so it's quiet).
>
> I don't think there is any helper similar to sock_owned_by_me() that
> I can call to verify that the socket is held by current thread
> (without the lockdep splat) and bail out?
>
> In this case, is adding something like a new PTR_TO_LOCKED_SOCKET_OR_NULL
> the way to go? Any other ideas?
Looks like networking will benefit from sleepable progs too.
We could have just done lock_sock() inside bpf_setsockopt
before setting the congestion control.
In the meantime, how about introducing try_lock_sock(),
which bails out if it cannot grab the lock?
For most practical cases that would work, and eventually we
can convert it to a full lock_sock()?
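One possible shape for such a helper, loosely modeled on the existing
lock_sock()/lock_sock_fast() internals (purely a sketch: the field accesses
are assumptions based on the socket-lock code, lockdep annotations are
omitted, and no such helper exists in this patch):

```c
/* Hypothetical: take socket ownership only if nobody holds it,
 * without sleeping; callers must bail out (or retry) on false. */
static inline bool try_lock_sock(struct sock *sk)
{
	spin_lock_bh(&sk->sk_lock.slock);
	if (sk->sk_lock.owned) {
		/* Owned by another context; don't wait. */
		spin_unlock_bh(&sk->sk_lock.slock);
		return false;
	}
	sk->sk_lock.owned = 1;
	spin_unlock_bh(&sk->sk_lock.slock);
	return true;
}
```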
* Re: [PATCH bpf-next v2] bpf: bpf_{g,s}etsockopt for struct bpf_sock
From: Daniel Borkmann @ 2020-04-30 14:39 UTC (permalink / raw)
To: Alexei Starovoitov, sdf
Cc: netdev, bpf, davem, ast, John Fastabend, Martin KaFai Lau
On 4/30/20 4:24 AM, Alexei Starovoitov wrote:
> On Wed, Apr 29, 2020 at 04:33:12PM -0700, sdf@google.com wrote:
>> On 04/30, Daniel Borkmann wrote:
>>> On 4/29/20 7:05 PM, Stanislav Fomichev wrote:
>>>> Currently, the bpf_getsockopt and bpf_setsockopt helpers operate on the
>>>> 'struct bpf_sock_ops' context in BPF_PROG_TYPE_SOCK_OPS programs.
>>>> Let's generalize them and make the first argument be 'struct bpf_sock'.
>>>> That way, in the future, we can allow those helpers in more places.
>>>>
>>>> BPF_PROG_TYPE_SOCK_OPS still has the existing helpers that operate
>>>> on 'struct bpf_sock_ops', but we add new bpf_{g,s}etsockopt that work
>>>> on 'struct bpf_sock'. [Alternatively, for BPF_PROG_TYPE_SOCK_OPS,
>>>> we can enable them both and teach verifier to pick the right one
>>>> based on the context (bpf_sock_ops vs bpf_sock).]
>>>>
>>>> As an example, let's allow those 'struct bpf_sock' based helpers to
>>>> be called from the BPF_CGROUP_INET{4,6}_CONNECT hooks. That way
>>>> we can override CC before the connection is made.
>>>>
>>>> v2:
>>>> * s/BPF_PROG_TYPE_CGROUP_SOCKOPT/BPF_PROG_TYPE_SOCK_OPS/
>>>>
>>>> Acked-by: John Fastabend <john.fastabend@gmail.com>
>>>> Acked-by: Martin KaFai Lau <kafai@fb.com>
>>>> Signed-off-by: Stanislav Fomichev <sdf@google.com>
>>> [...]
>>>> +BPF_CALL_5(bpf_setsockopt, struct sock *, sk,
>>>> + int, level, int, optname, char *, optval, int, optlen)
>>>> +{
>>>> + u32 flags = 0;
>>>> + return _bpf_setsockopt(sk, level, optname, optval, optlen, flags);
>>>> +}
>>>> +
>>>> +static const struct bpf_func_proto bpf_setsockopt_proto = {
>>>> + .func = bpf_setsockopt,
>>>> + .gpl_only = false,
>>>> + .ret_type = RET_INTEGER,
>>>> + .arg1_type = ARG_PTR_TO_SOCKET,
>>>> + .arg2_type = ARG_ANYTHING,
>>>> + .arg3_type = ARG_ANYTHING,
>>>> + .arg4_type = ARG_PTR_TO_MEM,
>>>> + .arg5_type = ARG_CONST_SIZE,
>>>> +};
>>>> +
>>>> +BPF_CALL_5(bpf_getsockopt, struct sock *, sk,
>>>> + int, level, int, optname, char *, optval, int, optlen)
>>>> +{
>>>> + return _bpf_getsockopt(sk, level, optname, optval, optlen);
>>>> +}
>>>> +
>>>> static const struct bpf_func_proto bpf_getsockopt_proto = {
>>>> .func = bpf_getsockopt,
>>>> .gpl_only = false,
>>>> .ret_type = RET_INTEGER,
>>>> + .arg1_type = ARG_PTR_TO_SOCKET,
>>>> + .arg2_type = ARG_ANYTHING,
>>>> + .arg3_type = ARG_ANYTHING,
>>>> + .arg4_type = ARG_PTR_TO_UNINIT_MEM,
>>>> + .arg5_type = ARG_CONST_SIZE,
>>>> +};
>>>> +
>>> [...]
>>>> @@ -6043,6 +6098,22 @@ sock_addr_func_proto(enum bpf_func_id func_id,
>>> const struct bpf_prog *prog)
>>>> return &bpf_sk_storage_get_proto;
>>>> case BPF_FUNC_sk_storage_delete:
>>>> return &bpf_sk_storage_delete_proto;
>>>> + case BPF_FUNC_setsockopt:
>>>> + switch (prog->expected_attach_type) {
>>>> + case BPF_CGROUP_INET4_CONNECT:
>>>> + case BPF_CGROUP_INET6_CONNECT:
>>>> + return &bpf_setsockopt_proto;
>>
>>> Hm, I'm not sure this is safe. In the sock_addr_func_proto() we also have
>>> other helpers callable from connect hooks like sk_lookup_{tcp,udp} which
>>> return a PTR_TO_SOCKET_OR_NULL, and now we can pass those sockets also
>>> into
>>> bpf_{get,set}sockopt() helper after lookup to change various sk related
>>> stuff
>>> but w/o being under lock. Doesn't the sock_owned_by_me() yell here at
>>> minimum
>>> (I'd expect so)?
>> Ugh, good point, I missed the fact that sk_lookup_{tcp,udp} are there
>> for sock_addr :-( I can try to do a simple test case to verify
>> that sock_owned_by_me triggers, but I'm pretty certain it should
>> (I've been calling bpf_{s,g}etsockopt for context socket so it's quiet).
>>
>> I don't think there is any helper similar to sock_owned_by_me() that
>> I can call to verify that the socket is held by current thread
>> (without the lockdep splat) and bail out?
>>
>> In this case, is adding something like a new PTR_TO_LOCKED_SOCKET_OR_NULL
>> the way to go? Any other ideas?
>
> Looks like networking will benefit from sleepable progs too.
> We could have just done lock_sock() inside bpf_setsockopt
> before setting the congestion control.
> In the meantime, how about introducing try_lock_sock(),
> which bails out if it cannot grab the lock?
> For most practical cases that would work, and eventually we
> can convert it to a full lock_sock()?
Right, and worst case we could also go back to having ctx as the input arg.
Thanks,
Daniel