* [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt()
@ 2022-08-10 19:07 Martin KaFai Lau
2022-08-10 19:07 ` [PATCH v3 bpf-next 01/15] net: Add sk_setsockopt() to take the sk ptr instead of the sock ptr Martin KaFai Lau
` (15 more replies)
0 siblings, 16 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:07 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
The code in bpf_setsockopt() is mostly a copy-and-paste from
the sock_setsockopt(), do_tcp_setsockopt(), do_ipv6_setsockopt(),
and do_ip_setsockopt(). As the allowed optnames in bpf_setsockopt()
grows, so are the duplicated code. The code between the copies
also slowly drifted.
This set is an effort to clean this up and reuse the existing
{sock,do_tcp,do_ipv6,do_ip}_setsockopt() as much as possible.
After the clean up, this set also adds a few allowed optnames
that we need to the bpf_setsockopt().
The initial attempt was to clean up both bpf_setsockopt() and
bpf_getsockopt() together. However, the patch set was getting
too long. It is beneficial to leave the bpf_getsockopt()
out for another patch set. Thus, this set is focusing
on the bpf_setsockopt().
v3:
- s/in_bpf/has_current_bpf_ctx/ (Andrii)
- Add comments to has_current_bpf_ctx() and sockopt_lock_sock()
(Stanislav)
- Use vmlinux.h in selftest and add defines to bpf_tracing_net.h
(Stanislav)
- Use bpf_getsockopt(SO_MARK) in selftest (Stanislav)
- Use BPF_CORE_READ_BITFIELD in selftest (Yonghong)
v2:
- A major change is to use in_bpf() to test if a setsockopt()
is called by a bpf prog and use in_bpf() to skip capable
check. Suggested by Stanislav.
- Instead of passing is_locked through sockptr_t or through an extra
argument to sk_setsockopt, v2 uses in_bpf() to skip the lock_sock()
also because bpf prog has the lock acquired.
- No change to the current sockptr_t in this revision
- s/codes/code/
Martin KaFai Lau (15):
net: Add sk_setsockopt() to take the sk ptr instead of the sock ptr
bpf: net: Avoid sk_setsockopt() taking sk lock when called from bpf
bpf: net: Consider has_current_bpf_ctx() when testing capable() in
sk_setsockopt()
bpf: net: Change do_tcp_setsockopt() to use the sockopt's lock_sock()
and capable()
bpf: net: Change do_ip_setsockopt() to use the sockopt's lock_sock()
and capable()
bpf: net: Change do_ipv6_setsockopt() to use the sockopt's lock_sock()
and capable()
bpf: Initialize the bpf_run_ctx in bpf_iter_run_prog()
bpf: Embed kernel CONFIG check into the if statement in bpf_setsockopt
bpf: Change bpf_setsockopt(SOL_SOCKET) to reuse sk_setsockopt()
bpf: Refactor bpf specific tcp optnames to a new function
bpf: Change bpf_setsockopt(SOL_TCP) to reuse do_tcp_setsockopt()
bpf: Change bpf_setsockopt(SOL_IP) to reuse do_ip_setsockopt()
bpf: Change bpf_setsockopt(SOL_IPV6) to reuse do_ipv6_setsockopt()
bpf: Add a few optnames to bpf_setsockopt
selftests/bpf: bpf_setsockopt tests
include/linux/bpf.h | 14 +
include/net/ip.h | 2 +
include/net/ipv6.h | 2 +
include/net/ipv6_stubs.h | 2 +
include/net/sock.h | 7 +
include/net/tcp.h | 2 +
kernel/bpf/bpf_iter.c | 5 +
net/core/filter.c | 377 ++++++---------
net/core/sock.c | 81 +++-
net/ipv4/ip_sockglue.c | 16 +-
net/ipv4/tcp.c | 22 +-
net/ipv6/af_inet6.c | 1 +
net/ipv6/ipv6_sockglue.c | 18 +-
.../selftests/bpf/prog_tests/setget_sockopt.c | 125 +++++
.../selftests/bpf/progs/bpf_tracing_net.h | 31 +-
.../selftests/bpf/progs/setget_sockopt.c | 451 ++++++++++++++++++
16 files changed, 883 insertions(+), 273 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
create mode 100644 tools/testing/selftests/bpf/progs/setget_sockopt.c
--
2.30.2
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v3 bpf-next 01/15] net: Add sk_setsockopt() to take the sk ptr instead of the sock ptr
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
@ 2022-08-10 19:07 ` Martin KaFai Lau
2022-08-10 19:07 ` [PATCH v3 bpf-next 02/15] bpf: net: Avoid sk_setsockopt() taking sk lock when called from bpf Martin KaFai Lau
` (14 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:07 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
A latter patch refactors bpf_setsockopt(SOL_SOCKET) with the
sock_setsockopt() to avoid code duplication and code
drift between the two duplicates.
The current sock_setsockopt() takes sock ptr as the argument.
The very first thing of this function is to get back the sk ptr
by 'sk = sock->sk'.
bpf_setsockopt() could be called when the sk does not have
the sock ptr created. Meaning sk->sk_socket is NULL. For example,
when a passive tcp connection has just been established but has yet
been accept()-ed. Thus, it cannot use the sock_setsockopt(sk->sk_socket)
or else it will pass a NULL ptr.
This patch moves all sock_setsockopt implementation to the newly
added sk_setsockopt(). The new sk_setsockopt() takes a sk ptr
and immediately gets the sock ptr by 'sock = sk->sk_socket'
The existing sock_setsockopt(sock) is changed to call
sk_setsockopt(sock->sk). All existing callers have both sock->sk
and sk->sk_socket pointer.
The latter patch will make bpf_setsockopt(SOL_SOCKET) call
sk_setsockopt(sk) directly. The bpf_setsockopt(SOL_SOCKET) does
not use the optnames that require sk->sk_socket, so it will
be safe.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
net/core/sock.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/net/core/sock.c b/net/core/sock.c
index 4cb957d934a2..20269c37ab3b 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1041,12 +1041,12 @@ static int sock_reserve_memory(struct sock *sk, int bytes)
* at the socket level. Everything here is generic.
*/
-int sock_setsockopt(struct socket *sock, int level, int optname,
- sockptr_t optval, unsigned int optlen)
+static int sk_setsockopt(struct sock *sk, int level, int optname,
+ sockptr_t optval, unsigned int optlen)
{
struct so_timestamping timestamping;
+ struct socket *sock = sk->sk_socket;
struct sock_txtime sk_txtime;
- struct sock *sk = sock->sk;
int val;
int valbool;
struct linger ling;
@@ -1499,6 +1499,13 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
release_sock(sk);
return ret;
}
+
+int sock_setsockopt(struct socket *sock, int level, int optname,
+ sockptr_t optval, unsigned int optlen)
+{
+ return sk_setsockopt(sock->sk, level, optname,
+ optval, optlen);
+}
EXPORT_SYMBOL(sock_setsockopt);
static const struct cred *sk_get_peer_cred(struct sock *sk)
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v3 bpf-next 02/15] bpf: net: Avoid sk_setsockopt() taking sk lock when called from bpf
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
2022-08-10 19:07 ` [PATCH v3 bpf-next 01/15] net: Add sk_setsockopt() to take the sk ptr instead of the sock ptr Martin KaFai Lau
@ 2022-08-10 19:07 ` Martin KaFai Lau
2022-08-16 3:32 ` Andrii Nakryiko
2022-08-10 19:07 ` [PATCH v3 bpf-next 03/15] bpf: net: Consider has_current_bpf_ctx() when testing capable() in sk_setsockopt() Martin KaFai Lau
` (13 subsequent siblings)
15 siblings, 1 reply; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:07 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
Most of the code in bpf_setsockopt(SOL_SOCKET) are duplicated from
the sk_setsockopt(). The number of supported optnames are
increasing ever and so as the duplicated code.
One issue in reusing sk_setsockopt() is that the bpf prog
has already acquired the sk lock. This patch adds a
has_current_bpf_ctx() to tell if the sk_setsockopt() is called from
a bpf prog. The bpf prog calling bpf_setsockopt() is either running
in_task() or in_serving_softirq(). Both cases have the current->bpf_ctx
initialized. Thus, the has_current_bpf_ctx() only needs to
test !!current->bpf_ctx.
This patch also adds sockopt_{lock,release}_sock() helpers
for sk_setsockopt() to use. These helpers will test
has_current_bpf_ctx() before acquiring/releasing the lock. They are
in EXPORT_SYMBOL for the ipv6 module to use in a latter patch.
Note on the change in sock_setbindtodevice(). sockopt_lock_sock()
is done in sock_setbindtodevice() instead of doing the lock_sock
in sock_bindtoindex(..., lock_sk = true).
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
include/linux/bpf.h | 14 ++++++++++++++
include/net/sock.h | 3 +++
net/core/sock.c | 30 +++++++++++++++++++++++++++---
3 files changed, 44 insertions(+), 3 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a627a02cf8ab..0a600b2013cc 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1966,6 +1966,16 @@ static inline bool unprivileged_ebpf_enabled(void)
return !sysctl_unprivileged_bpf_disabled;
}
+/* Not all bpf prog type has the bpf_ctx.
+ * Only trampoline and cgroup-bpf have it.
+ * For the bpf prog type that has initialized the bpf_ctx,
+ * this function can be used to decide if a kernel function
+ * is called by a bpf program.
+ */
+static inline bool has_current_bpf_ctx(void)
+{
+ return !!current->bpf_ctx;
+}
#else /* !CONFIG_BPF_SYSCALL */
static inline struct bpf_prog *bpf_prog_get(u32 ufd)
{
@@ -2175,6 +2185,10 @@ static inline bool unprivileged_ebpf_enabled(void)
return false;
}
+static inline bool has_current_bpf_ctx(void)
+{
+ return false;
+}
#endif /* CONFIG_BPF_SYSCALL */
void __bpf_free_used_btfs(struct bpf_prog_aux *aux,
diff --git a/include/net/sock.h b/include/net/sock.h
index a7273b289188..b2ff230860c6 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1721,6 +1721,9 @@ static inline void unlock_sock_fast(struct sock *sk, bool slow)
}
}
+void sockopt_lock_sock(struct sock *sk);
+void sockopt_release_sock(struct sock *sk);
+
/* Used by processes to "lock" a socket state, so that
* interrupts and bottom half handlers won't change it
* from under us. It essentially blocks any incoming
diff --git a/net/core/sock.c b/net/core/sock.c
index 20269c37ab3b..d3683228376f 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -703,7 +703,9 @@ static int sock_setbindtodevice(struct sock *sk, sockptr_t optval, int optlen)
goto out;
}
- return sock_bindtoindex(sk, index, true);
+ sockopt_lock_sock(sk);
+ ret = sock_bindtoindex_locked(sk, index);
+ sockopt_release_sock(sk);
out:
#endif
@@ -1036,6 +1038,28 @@ static int sock_reserve_memory(struct sock *sk, int bytes)
return 0;
}
+void sockopt_lock_sock(struct sock *sk)
+{
+ /* When current->bpf_ctx is set, the setsockopt is called from
+ * a bpf prog. bpf has ensured the sk lock has been
+ * acquired before calling setsockopt().
+ */
+ if (has_current_bpf_ctx())
+ return;
+
+ lock_sock(sk);
+}
+EXPORT_SYMBOL(sockopt_lock_sock);
+
+void sockopt_release_sock(struct sock *sk)
+{
+ if (has_current_bpf_ctx())
+ return;
+
+ release_sock(sk);
+}
+EXPORT_SYMBOL(sockopt_release_sock);
+
/*
* This is meant for all protocols to use and covers goings on
* at the socket level. Everything here is generic.
@@ -1067,7 +1091,7 @@ static int sk_setsockopt(struct sock *sk, int level, int optname,
valbool = val ? 1 : 0;
- lock_sock(sk);
+ sockopt_lock_sock(sk);
switch (optname) {
case SO_DEBUG:
@@ -1496,7 +1520,7 @@ static int sk_setsockopt(struct sock *sk, int level, int optname,
ret = -ENOPROTOOPT;
break;
}
- release_sock(sk);
+ sockopt_release_sock(sk);
return ret;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v3 bpf-next 03/15] bpf: net: Consider has_current_bpf_ctx() when testing capable() in sk_setsockopt()
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
2022-08-10 19:07 ` [PATCH v3 bpf-next 01/15] net: Add sk_setsockopt() to take the sk ptr instead of the sock ptr Martin KaFai Lau
2022-08-10 19:07 ` [PATCH v3 bpf-next 02/15] bpf: net: Avoid sk_setsockopt() taking sk lock when called from bpf Martin KaFai Lau
@ 2022-08-10 19:07 ` Martin KaFai Lau
2022-08-10 19:07 ` [PATCH v3 bpf-next 04/15] bpf: net: Change do_tcp_setsockopt() to use the sockopt's lock_sock() and capable() Martin KaFai Lau
` (12 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:07 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
When bpf program calling bpf_setsockopt(SOL_SOCKET),
it could be run in softirq and doesn't make sense to do the capable
check. There was a similar situation in bpf_setsockopt(TCP_CONGESTION).
In commit 8d650cdedaab ("tcp: fix tcp_set_congestion_control() use from bpf hook"),
tcp_set_congestion_control(..., cap_net_admin) was added to skip
the cap check for bpf prog.
This patch adds sockopt_ns_capable() and sockopt_capable() for
the sk_setsockopt() to use. They will consider the
has_current_bpf_ctx() before doing the ns_capable() and capable() test.
They are in EXPORT_SYMBOL for the ipv6 module to use in a latter patch.
Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
include/net/sock.h | 2 ++
net/core/sock.c | 38 +++++++++++++++++++++++++-------------
2 files changed, 27 insertions(+), 13 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h
index b2ff230860c6..72b78c2b6f83 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1723,6 +1723,8 @@ static inline void unlock_sock_fast(struct sock *sk, bool slow)
void sockopt_lock_sock(struct sock *sk);
void sockopt_release_sock(struct sock *sk);
+bool sockopt_ns_capable(struct user_namespace *ns, int cap);
+bool sockopt_capable(int cap);
/* Used by processes to "lock" a socket state, so that
* interrupts and bottom half handlers won't change it
diff --git a/net/core/sock.c b/net/core/sock.c
index d3683228376f..7ea46e4700fd 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1060,6 +1060,18 @@ void sockopt_release_sock(struct sock *sk)
}
EXPORT_SYMBOL(sockopt_release_sock);
+bool sockopt_ns_capable(struct user_namespace *ns, int cap)
+{
+ return has_current_bpf_ctx() || ns_capable(ns, cap);
+}
+EXPORT_SYMBOL(sockopt_ns_capable);
+
+bool sockopt_capable(int cap)
+{
+ return has_current_bpf_ctx() || capable(cap);
+}
+EXPORT_SYMBOL(sockopt_capable);
+
/*
* This is meant for all protocols to use and covers goings on
* at the socket level. Everything here is generic.
@@ -1095,7 +1107,7 @@ static int sk_setsockopt(struct sock *sk, int level, int optname,
switch (optname) {
case SO_DEBUG:
- if (val && !capable(CAP_NET_ADMIN))
+ if (val && !sockopt_capable(CAP_NET_ADMIN))
ret = -EACCES;
else
sock_valbool_flag(sk, SOCK_DBG, valbool);
@@ -1139,7 +1151,7 @@ static int sk_setsockopt(struct sock *sk, int level, int optname,
break;
case SO_SNDBUFFORCE:
- if (!capable(CAP_NET_ADMIN)) {
+ if (!sockopt_capable(CAP_NET_ADMIN)) {
ret = -EPERM;
break;
}
@@ -1161,7 +1173,7 @@ static int sk_setsockopt(struct sock *sk, int level, int optname,
break;
case SO_RCVBUFFORCE:
- if (!capable(CAP_NET_ADMIN)) {
+ if (!sockopt_capable(CAP_NET_ADMIN)) {
ret = -EPERM;
break;
}
@@ -1188,8 +1200,8 @@ static int sk_setsockopt(struct sock *sk, int level, int optname,
case SO_PRIORITY:
if ((val >= 0 && val <= 6) ||
- ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) ||
- ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
+ sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) ||
+ sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
sk->sk_priority = val;
else
ret = -EPERM;
@@ -1334,8 +1346,8 @@ static int sk_setsockopt(struct sock *sk, int level, int optname,
clear_bit(SOCK_PASSSEC, &sock->flags);
break;
case SO_MARK:
- if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) &&
- !ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
+ if (!sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) &&
+ !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
ret = -EPERM;
break;
}
@@ -1343,8 +1355,8 @@ static int sk_setsockopt(struct sock *sk, int level, int optname,
__sock_set_mark(sk, val);
break;
case SO_RCVMARK:
- if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) &&
- !ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
+ if (!sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) &&
+ !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
ret = -EPERM;
break;
}
@@ -1378,7 +1390,7 @@ static int sk_setsockopt(struct sock *sk, int level, int optname,
#ifdef CONFIG_NET_RX_BUSY_POLL
case SO_BUSY_POLL:
/* allow unprivileged users to decrease the value */
- if ((val > sk->sk_ll_usec) && !capable(CAP_NET_ADMIN))
+ if ((val > sk->sk_ll_usec) && !sockopt_capable(CAP_NET_ADMIN))
ret = -EPERM;
else {
if (val < 0)
@@ -1388,13 +1400,13 @@ static int sk_setsockopt(struct sock *sk, int level, int optname,
}
break;
case SO_PREFER_BUSY_POLL:
- if (valbool && !capable(CAP_NET_ADMIN))
+ if (valbool && !sockopt_capable(CAP_NET_ADMIN))
ret = -EPERM;
else
WRITE_ONCE(sk->sk_prefer_busy_poll, valbool);
break;
case SO_BUSY_POLL_BUDGET:
- if (val > READ_ONCE(sk->sk_busy_poll_budget) && !capable(CAP_NET_ADMIN)) {
+ if (val > READ_ONCE(sk->sk_busy_poll_budget) && !sockopt_capable(CAP_NET_ADMIN)) {
ret = -EPERM;
} else {
if (val < 0 || val > U16_MAX)
@@ -1465,7 +1477,7 @@ static int sk_setsockopt(struct sock *sk, int level, int optname,
* scheduler has enough safe guards.
*/
if (sk_txtime.clockid != CLOCK_MONOTONIC &&
- !ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
+ !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
ret = -EPERM;
break;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v3 bpf-next 04/15] bpf: net: Change do_tcp_setsockopt() to use the sockopt's lock_sock() and capable()
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
` (2 preceding siblings ...)
2022-08-10 19:07 ` [PATCH v3 bpf-next 03/15] bpf: net: Consider has_current_bpf_ctx() when testing capable() in sk_setsockopt() Martin KaFai Lau
@ 2022-08-10 19:07 ` Martin KaFai Lau
2022-08-10 19:07 ` [PATCH v3 bpf-next 05/15] bpf: net: Change do_ip_setsockopt() " Martin KaFai Lau
` (11 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:07 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
Similar to the earlier patch that avoids sk_setsockopt() from
taking sk lock and doing capable test when called by bpf. This patch
changes do_tcp_setsockopt() to use the sockopt_{lock,release}_sock()
and sockopt_[ns_]capable().
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
net/ipv4/tcp.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 970e9a2cca4a..cfed84b1883f 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3202,7 +3202,7 @@ EXPORT_SYMBOL(tcp_disconnect);
static inline bool tcp_can_repair_sock(const struct sock *sk)
{
- return ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN) &&
+ return sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN) &&
(sk->sk_state != TCP_LISTEN);
}
@@ -3502,11 +3502,11 @@ static int do_tcp_setsockopt(struct sock *sk, int level, int optname,
return -EFAULT;
name[val] = 0;
- lock_sock(sk);
+ sockopt_lock_sock(sk);
err = tcp_set_congestion_control(sk, name, true,
- ns_capable(sock_net(sk)->user_ns,
- CAP_NET_ADMIN));
- release_sock(sk);
+ sockopt_ns_capable(sock_net(sk)->user_ns,
+ CAP_NET_ADMIN));
+ sockopt_release_sock(sk);
return err;
}
case TCP_ULP: {
@@ -3522,9 +3522,9 @@ static int do_tcp_setsockopt(struct sock *sk, int level, int optname,
return -EFAULT;
name[val] = 0;
- lock_sock(sk);
+ sockopt_lock_sock(sk);
err = tcp_set_ulp(sk, name);
- release_sock(sk);
+ sockopt_release_sock(sk);
return err;
}
case TCP_FASTOPEN_KEY: {
@@ -3557,7 +3557,7 @@ static int do_tcp_setsockopt(struct sock *sk, int level, int optname,
if (copy_from_sockptr(&val, optval, sizeof(val)))
return -EFAULT;
- lock_sock(sk);
+ sockopt_lock_sock(sk);
switch (optname) {
case TCP_MAXSEG:
@@ -3779,7 +3779,7 @@ static int do_tcp_setsockopt(struct sock *sk, int level, int optname,
break;
}
- release_sock(sk);
+ sockopt_release_sock(sk);
return err;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v3 bpf-next 05/15] bpf: net: Change do_ip_setsockopt() to use the sockopt's lock_sock() and capable()
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
` (3 preceding siblings ...)
2022-08-10 19:07 ` [PATCH v3 bpf-next 04/15] bpf: net: Change do_tcp_setsockopt() to use the sockopt's lock_sock() and capable() Martin KaFai Lau
@ 2022-08-10 19:07 ` Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 06/15] bpf: net: Change do_ipv6_setsockopt() " Martin KaFai Lau
` (10 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:07 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
Similar to the earlier patch that avoids sk_setsockopt() from
taking sk lock and doing capable test when called by bpf. This patch
changes do_ip_setsockopt() to use the sockopt_{lock,release}_sock()
and sockopt_[ns_]capable().
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
net/ipv4/ip_sockglue.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index a8a323ecbb54..a3c496580e6b 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -944,7 +944,7 @@ static int do_ip_setsockopt(struct sock *sk, int level, int optname,
err = 0;
if (needs_rtnl)
rtnl_lock();
- lock_sock(sk);
+ sockopt_lock_sock(sk);
switch (optname) {
case IP_OPTIONS:
@@ -1333,14 +1333,14 @@ static int do_ip_setsockopt(struct sock *sk, int level, int optname,
case IP_IPSEC_POLICY:
case IP_XFRM_POLICY:
err = -EPERM;
- if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
+ if (!sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
break;
err = xfrm_user_policy(sk, optname, optval, optlen);
break;
case IP_TRANSPARENT:
- if (!!val && !ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) &&
- !ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
+ if (!!val && !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) &&
+ !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
err = -EPERM;
break;
}
@@ -1368,13 +1368,13 @@ static int do_ip_setsockopt(struct sock *sk, int level, int optname,
err = -ENOPROTOOPT;
break;
}
- release_sock(sk);
+ sockopt_release_sock(sk);
if (needs_rtnl)
rtnl_unlock();
return err;
e_inval:
- release_sock(sk);
+ sockopt_release_sock(sk);
if (needs_rtnl)
rtnl_unlock();
return -EINVAL;
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v3 bpf-next 06/15] bpf: net: Change do_ipv6_setsockopt() to use the sockopt's lock_sock() and capable()
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
` (4 preceding siblings ...)
2022-08-10 19:07 ` [PATCH v3 bpf-next 05/15] bpf: net: Change do_ip_setsockopt() " Martin KaFai Lau
@ 2022-08-10 19:08 ` Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 07/15] bpf: Initialize the bpf_run_ctx in bpf_iter_run_prog() Martin KaFai Lau
` (9 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:08 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
Similar to the earlier patch that avoids sk_setsockopt() from
taking sk lock and doing capable test when called by bpf. This patch
changes do_ipv6_setsockopt() to use the sockopt_{lock,release}_sock()
and sockopt_[ns_]capable().
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
net/ipv6/ipv6_sockglue.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 222f6bf220ba..d23de48ff612 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -327,7 +327,7 @@ static int ipv6_set_opt_hdr(struct sock *sk, int optname, sockptr_t optval,
int err;
/* hop-by-hop / destination options are privileged option */
- if (optname != IPV6_RTHDR && !ns_capable(net->user_ns, CAP_NET_RAW))
+ if (optname != IPV6_RTHDR && !sockopt_ns_capable(net->user_ns, CAP_NET_RAW))
return -EPERM;
/* remove any sticky options header with a zero option
@@ -417,7 +417,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
if (needs_rtnl)
rtnl_lock();
- lock_sock(sk);
+ sockopt_lock_sock(sk);
switch (optname) {
@@ -634,8 +634,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
break;
case IPV6_TRANSPARENT:
- if (valbool && !ns_capable(net->user_ns, CAP_NET_RAW) &&
- !ns_capable(net->user_ns, CAP_NET_ADMIN)) {
+ if (valbool && !sockopt_ns_capable(net->user_ns, CAP_NET_RAW) &&
+ !sockopt_ns_capable(net->user_ns, CAP_NET_ADMIN)) {
retv = -EPERM;
break;
}
@@ -946,7 +946,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
case IPV6_IPSEC_POLICY:
case IPV6_XFRM_POLICY:
retv = -EPERM;
- if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
+ if (!sockopt_ns_capable(net->user_ns, CAP_NET_ADMIN))
break;
retv = xfrm_user_policy(sk, optname, optval, optlen);
break;
@@ -994,14 +994,14 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
break;
}
- release_sock(sk);
+ sockopt_release_sock(sk);
if (needs_rtnl)
rtnl_unlock();
return retv;
e_inval:
- release_sock(sk);
+ sockopt_release_sock(sk);
if (needs_rtnl)
rtnl_unlock();
return -EINVAL;
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v3 bpf-next 07/15] bpf: Initialize the bpf_run_ctx in bpf_iter_run_prog()
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
` (5 preceding siblings ...)
2022-08-10 19:08 ` [PATCH v3 bpf-next 06/15] bpf: net: Change do_ipv6_setsockopt() " Martin KaFai Lau
@ 2022-08-10 19:08 ` Martin KaFai Lau
2022-08-16 3:33 ` Andrii Nakryiko
2022-08-10 19:08 ` [PATCH v3 bpf-next 08/15] bpf: Embed kernel CONFIG check into the if statement in bpf_setsockopt Martin KaFai Lau
` (8 subsequent siblings)
15 siblings, 1 reply; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:08 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
The bpf-iter-prog for tcp and unix sk can do bpf_setsockopt()
which needs has_current_bpf_ctx() to decide if it is called by a
bpf prog. This patch initializes the bpf_run_ctx in
bpf_iter_run_prog() for the has_current_bpf_ctx() to use.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
include/linux/bpf.h | 2 +-
kernel/bpf/bpf_iter.c | 5 +++++
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 0a600b2013cc..15ab980e9525 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1967,7 +1967,7 @@ static inline bool unprivileged_ebpf_enabled(void)
}
/* Not all bpf prog type has the bpf_ctx.
- * Only trampoline and cgroup-bpf have it.
+ * Only trampoline, cgroup-bpf, and iter have it.
* For the bpf prog type that has initialized the bpf_ctx,
* this function can be used to decide if a kernel function
* is called by a bpf program.
diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c
index 4b112aa8bba3..6476b2c03527 100644
--- a/kernel/bpf/bpf_iter.c
+++ b/kernel/bpf/bpf_iter.c
@@ -685,19 +685,24 @@ struct bpf_prog *bpf_iter_get_info(struct bpf_iter_meta *meta, bool in_stop)
int bpf_iter_run_prog(struct bpf_prog *prog, void *ctx)
{
+ struct bpf_run_ctx run_ctx, *old_run_ctx;
int ret;
if (prog->aux->sleepable) {
rcu_read_lock_trace();
migrate_disable();
might_fault();
+ old_run_ctx = bpf_set_run_ctx(&run_ctx);
ret = bpf_prog_run(prog, ctx);
+ bpf_reset_run_ctx(old_run_ctx);
migrate_enable();
rcu_read_unlock_trace();
} else {
rcu_read_lock();
migrate_disable();
+ old_run_ctx = bpf_set_run_ctx(&run_ctx);
ret = bpf_prog_run(prog, ctx);
+ bpf_reset_run_ctx(old_run_ctx);
migrate_enable();
rcu_read_unlock();
}
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v3 bpf-next 08/15] bpf: Embed kernel CONFIG check into the if statement in bpf_setsockopt
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
` (6 preceding siblings ...)
2022-08-10 19:08 ` [PATCH v3 bpf-next 07/15] bpf: Initialize the bpf_run_ctx in bpf_iter_run_prog() Martin KaFai Lau
@ 2022-08-10 19:08 ` Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 09/15] bpf: Change bpf_setsockopt(SOL_SOCKET) to reuse sk_setsockopt() Martin KaFai Lau
` (7 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:08 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
This patch moves the "#ifdef CONFIG_XXX" check into the "if/else"
statement itself. The change is done for the bpf_setsockopt()
function only. It will make the latter patches easier to follow
without the surrounding ifdef macro.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
net/core/filter.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/net/core/filter.c b/net/core/filter.c
index 5669248aff25..01cb4a01b1c1 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5113,8 +5113,7 @@ static int __bpf_setsockopt(struct sock *sk, int level, int optname,
default:
ret = -EINVAL;
}
-#ifdef CONFIG_INET
- } else if (level == SOL_IP) {
+ } else if (IS_ENABLED(CONFIG_INET) && level == SOL_IP) {
if (optlen != sizeof(int) || sk->sk_family != AF_INET)
return -EINVAL;
@@ -5135,8 +5134,7 @@ static int __bpf_setsockopt(struct sock *sk, int level, int optname,
default:
ret = -EINVAL;
}
-#if IS_ENABLED(CONFIG_IPV6)
- } else if (level == SOL_IPV6) {
+ } else if (IS_ENABLED(CONFIG_IPV6) && level == SOL_IPV6) {
if (optlen != sizeof(int) || sk->sk_family != AF_INET6)
return -EINVAL;
@@ -5157,8 +5155,7 @@ static int __bpf_setsockopt(struct sock *sk, int level, int optname,
default:
ret = -EINVAL;
}
-#endif
- } else if (level == SOL_TCP &&
+ } else if (IS_ENABLED(CONFIG_INET) && level == SOL_TCP &&
sk->sk_prot->setsockopt == tcp_setsockopt) {
if (optname == TCP_CONGESTION) {
char name[TCP_CA_NAME_MAX];
@@ -5250,7 +5247,6 @@ static int __bpf_setsockopt(struct sock *sk, int level, int optname,
ret = -EINVAL;
}
}
-#endif
} else {
ret = -EINVAL;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v3 bpf-next 09/15] bpf: Change bpf_setsockopt(SOL_SOCKET) to reuse sk_setsockopt()
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
` (7 preceding siblings ...)
2022-08-10 19:08 ` [PATCH v3 bpf-next 08/15] bpf: Embed kernel CONFIG check into the if statement in bpf_setsockopt Martin KaFai Lau
@ 2022-08-10 19:08 ` Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 10/15] bpf: Refactor bpf specific tcp optnames to a new function Martin KaFai Lau
` (6 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:08 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
After the prep work in the previous patches,
this patch removes most of the dup code from bpf_setsockopt(SOL_SOCKET)
and reuses them from sk_setsockopt().
The only exception is SO_RCVLOWAT. The bpf_setsockopt() does
not always have the sock ptr (sk->sk_socket) and sock->ops->set_rcvlowat
is needed in sk_setsockopt. tcp_set_rcvlowat is the only
implementation for set_rcvlowat. bpf_setsockopt() needs
one special handling for SO_RCVLOWAT for tcp_sock, so leave
SO_RCVLOWAT in bpf_setsockopt() for now.
The existing optname white-list is refactored into a new
function sol_socket_setsockopt().
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
include/net/sock.h | 2 +
net/core/filter.c | 130 +++++++++++++--------------------------------
net/core/sock.c | 4 +-
3 files changed, 42 insertions(+), 94 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h
index 72b78c2b6f83..b7e159f9d7bf 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1800,6 +1800,8 @@ void sock_pfree(struct sk_buff *skb);
#define sock_edemux sock_efree
#endif
+int sk_setsockopt(struct sock *sk, int level, int optname,
+ sockptr_t optval, unsigned int optlen);
int sock_setsockopt(struct socket *sock, int level, int op,
sockptr_t optval, unsigned int optlen);
diff --git a/net/core/filter.c b/net/core/filter.c
index 01cb4a01b1c1..d1f9f8360f60 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5013,106 +5013,52 @@ static const struct bpf_func_proto bpf_get_socket_uid_proto = {
.arg1_type = ARG_PTR_TO_CTX,
};
+static int sol_socket_setsockopt(struct sock *sk, int optname,
+ char *optval, int optlen)
+{
+ switch (optname) {
+ case SO_SNDBUF:
+ case SO_RCVBUF:
+ case SO_KEEPALIVE:
+ case SO_PRIORITY:
+ case SO_REUSEPORT:
+ case SO_RCVLOWAT:
+ case SO_MARK:
+ case SO_MAX_PACING_RATE:
+ case SO_BINDTOIFINDEX:
+ case SO_TXREHASH:
+ if (optlen != sizeof(int))
+ return -EINVAL;
+ break;
+ case SO_BINDTODEVICE:
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if (optname == SO_RCVLOWAT) {
+ int val = *(int *)optval;
+
+ if (val < 0)
+ val = INT_MAX;
+ WRITE_ONCE(sk->sk_rcvlowat, val ? : 1);
+ return 0;
+ }
+
+ return sk_setsockopt(sk, SOL_SOCKET, optname,
+ KERNEL_SOCKPTR(optval), optlen);
+}
+
static int __bpf_setsockopt(struct sock *sk, int level, int optname,
char *optval, int optlen)
{
- char devname[IFNAMSIZ];
- int val, valbool;
- struct net *net;
- int ifindex;
- int ret = 0;
+ int val, ret = 0;
if (!sk_fullsock(sk))
return -EINVAL;
if (level == SOL_SOCKET) {
- if (optlen != sizeof(int) && optname != SO_BINDTODEVICE)
- return -EINVAL;
- val = *((int *)optval);
- valbool = val ? 1 : 0;
-
- /* Only some socketops are supported */
- switch (optname) {
- case SO_RCVBUF:
- val = min_t(u32, val, sysctl_rmem_max);
- val = min_t(int, val, INT_MAX / 2);
- sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
- WRITE_ONCE(sk->sk_rcvbuf,
- max_t(int, val * 2, SOCK_MIN_RCVBUF));
- break;
- case SO_SNDBUF:
- val = min_t(u32, val, sysctl_wmem_max);
- val = min_t(int, val, INT_MAX / 2);
- sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
- WRITE_ONCE(sk->sk_sndbuf,
- max_t(int, val * 2, SOCK_MIN_SNDBUF));
- break;
- case SO_MAX_PACING_RATE: /* 32bit version */
- if (val != ~0U)
- cmpxchg(&sk->sk_pacing_status,
- SK_PACING_NONE,
- SK_PACING_NEEDED);
- sk->sk_max_pacing_rate = (val == ~0U) ?
- ~0UL : (unsigned int)val;
- sk->sk_pacing_rate = min(sk->sk_pacing_rate,
- sk->sk_max_pacing_rate);
- break;
- case SO_PRIORITY:
- sk->sk_priority = val;
- break;
- case SO_RCVLOWAT:
- if (val < 0)
- val = INT_MAX;
- WRITE_ONCE(sk->sk_rcvlowat, val ? : 1);
- break;
- case SO_MARK:
- if (sk->sk_mark != val) {
- sk->sk_mark = val;
- sk_dst_reset(sk);
- }
- break;
- case SO_BINDTODEVICE:
- optlen = min_t(long, optlen, IFNAMSIZ - 1);
- strncpy(devname, optval, optlen);
- devname[optlen] = 0;
-
- ifindex = 0;
- if (devname[0] != '\0') {
- struct net_device *dev;
-
- ret = -ENODEV;
-
- net = sock_net(sk);
- dev = dev_get_by_name(net, devname);
- if (!dev)
- break;
- ifindex = dev->ifindex;
- dev_put(dev);
- }
- fallthrough;
- case SO_BINDTOIFINDEX:
- if (optname == SO_BINDTOIFINDEX)
- ifindex = val;
- ret = sock_bindtoindex(sk, ifindex, false);
- break;
- case SO_KEEPALIVE:
- if (sk->sk_prot->keepalive)
- sk->sk_prot->keepalive(sk, valbool);
- sock_valbool_flag(sk, SOCK_KEEPOPEN, valbool);
- break;
- case SO_REUSEPORT:
- sk->sk_reuseport = valbool;
- break;
- case SO_TXREHASH:
- if (val < -1 || val > 1) {
- ret = -EINVAL;
- break;
- }
- sk->sk_txrehash = (u8)val;
- break;
- default:
- ret = -EINVAL;
- }
+ return sol_socket_setsockopt(sk, optname, optval, optlen);
} else if (IS_ENABLED(CONFIG_INET) && level == SOL_IP) {
if (optlen != sizeof(int) || sk->sk_family != AF_INET)
return -EINVAL;
diff --git a/net/core/sock.c b/net/core/sock.c
index 7ea46e4700fd..29f161ab717f 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1077,8 +1077,8 @@ EXPORT_SYMBOL(sockopt_capable);
* at the socket level. Everything here is generic.
*/
-static int sk_setsockopt(struct sock *sk, int level, int optname,
- sockptr_t optval, unsigned int optlen)
+int sk_setsockopt(struct sock *sk, int level, int optname,
+ sockptr_t optval, unsigned int optlen)
{
struct so_timestamping timestamping;
struct socket *sock = sk->sk_socket;
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v3 bpf-next 10/15] bpf: Refactor bpf specific tcp optnames to a new function
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
` (8 preceding siblings ...)
2022-08-10 19:08 ` [PATCH v3 bpf-next 09/15] bpf: Change bpf_setsockopt(SOL_SOCKET) to reuse sk_setsockopt() Martin KaFai Lau
@ 2022-08-10 19:08 ` Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 11/15] bpf: Change bpf_setsockopt(SOL_TCP) to reuse do_tcp_setsockopt() Martin KaFai Lau
` (5 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:08 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
The patch moves all bpf specific tcp optnames (TCP_BPF_XXX)
to a new function bpf_sol_tcp_setsockopt(). This will make
the next patch easier to follow.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
net/core/filter.c | 79 ++++++++++++++++++++++++++++++-----------------
1 file changed, 50 insertions(+), 29 deletions(-)
diff --git a/net/core/filter.c b/net/core/filter.c
index d1f9f8360f60..200e79a1fbfd 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5049,6 +5049,52 @@ static int sol_socket_setsockopt(struct sock *sk, int optname,
KERNEL_SOCKPTR(optval), optlen);
}
+static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname,
+ char *optval, int optlen)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ unsigned long timeout;
+ int val;
+
+ if (optlen != sizeof(int))
+ return -EINVAL;
+
+ val = *(int *)optval;
+
+ /* Only some options are supported */
+ switch (optname) {
+ case TCP_BPF_IW:
+ if (val <= 0 || tp->data_segs_out > tp->syn_data)
+ return -EINVAL;
+ tcp_snd_cwnd_set(tp, val);
+ break;
+ case TCP_BPF_SNDCWND_CLAMP:
+ if (val <= 0)
+ return -EINVAL;
+ tp->snd_cwnd_clamp = val;
+ tp->snd_ssthresh = val;
+ break;
+ case TCP_BPF_DELACK_MAX:
+ timeout = usecs_to_jiffies(val);
+ if (timeout > TCP_DELACK_MAX ||
+ timeout < TCP_TIMEOUT_MIN)
+ return -EINVAL;
+ inet_csk(sk)->icsk_delack_max = timeout;
+ break;
+ case TCP_BPF_RTO_MIN:
+ timeout = usecs_to_jiffies(val);
+ if (timeout > TCP_RTO_MIN ||
+ timeout < TCP_TIMEOUT_MIN)
+ return -EINVAL;
+ inet_csk(sk)->icsk_rto_min = timeout;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
static int __bpf_setsockopt(struct sock *sk, int level, int optname,
char *optval, int optlen)
{
@@ -5103,6 +5149,10 @@ static int __bpf_setsockopt(struct sock *sk, int level, int optname,
}
} else if (IS_ENABLED(CONFIG_INET) && level == SOL_TCP &&
sk->sk_prot->setsockopt == tcp_setsockopt) {
+ if (optname >= TCP_BPF_IW)
+ return bpf_sol_tcp_setsockopt(sk, optname,
+ optval, optlen);
+
if (optname == TCP_CONGESTION) {
char name[TCP_CA_NAME_MAX];
@@ -5113,7 +5163,6 @@ static int __bpf_setsockopt(struct sock *sk, int level, int optname,
} else {
struct inet_connection_sock *icsk = inet_csk(sk);
struct tcp_sock *tp = tcp_sk(sk);
- unsigned long timeout;
if (optlen != sizeof(int))
return -EINVAL;
@@ -5121,34 +5170,6 @@ static int __bpf_setsockopt(struct sock *sk, int level, int optname,
val = *((int *)optval);
/* Only some options are supported */
switch (optname) {
- case TCP_BPF_IW:
- if (val <= 0 || tp->data_segs_out > tp->syn_data)
- ret = -EINVAL;
- else
- tcp_snd_cwnd_set(tp, val);
- break;
- case TCP_BPF_SNDCWND_CLAMP:
- if (val <= 0) {
- ret = -EINVAL;
- } else {
- tp->snd_cwnd_clamp = val;
- tp->snd_ssthresh = val;
- }
- break;
- case TCP_BPF_DELACK_MAX:
- timeout = usecs_to_jiffies(val);
- if (timeout > TCP_DELACK_MAX ||
- timeout < TCP_TIMEOUT_MIN)
- return -EINVAL;
- inet_csk(sk)->icsk_delack_max = timeout;
- break;
- case TCP_BPF_RTO_MIN:
- timeout = usecs_to_jiffies(val);
- if (timeout > TCP_RTO_MIN ||
- timeout < TCP_TIMEOUT_MIN)
- return -EINVAL;
- inet_csk(sk)->icsk_rto_min = timeout;
- break;
case TCP_SAVE_SYN:
if (val < 0 || val > 1)
ret = -EINVAL;
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v3 bpf-next 11/15] bpf: Change bpf_setsockopt(SOL_TCP) to reuse do_tcp_setsockopt()
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
` (9 preceding siblings ...)
2022-08-10 19:08 ` [PATCH v3 bpf-next 10/15] bpf: Refactor bpf specific tcp optnames to a new function Martin KaFai Lau
@ 2022-08-10 19:08 ` Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 12/15] bpf: Change bpf_setsockopt(SOL_IP) to reuse do_ip_setsockopt() Martin KaFai Lau
` (4 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:08 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
After the prep work in the previous patches,
this patch removes all the dup code from bpf_setsockopt(SOL_TCP)
and reuses the do_tcp_setsockopt().
The existing optname white-list is refactored into a new
function sol_tcp_setsockopt(). The sol_tcp_setsockopt()
also calls the bpf_sol_tcp_setsockopt() to handle
the TCP_BPF_XXX specific optnames.
bpf_setsockopt(TCP_SAVE_SYN) now also allows a value 2 to
save the eth header also and it comes for free from
do_tcp_setsockopt().
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
include/net/tcp.h | 2 +
net/core/filter.c | 97 +++++++++++++++--------------------------------
net/ipv4/tcp.c | 4 +-
3 files changed, 34 insertions(+), 69 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index d10962b9f0d0..c03a50c72f40 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -405,6 +405,8 @@ __poll_t tcp_poll(struct file *file, struct socket *sock,
int tcp_getsockopt(struct sock *sk, int level, int optname,
char __user *optval, int __user *optlen);
bool tcp_bpf_bypass_getsockopt(int level, int optname);
+int do_tcp_setsockopt(struct sock *sk, int level, int optname,
+ sockptr_t optval, unsigned int optlen);
int tcp_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
unsigned int optlen);
void tcp_set_keepalive(struct sock *sk, int val);
diff --git a/net/core/filter.c b/net/core/filter.c
index 200e79a1fbfd..0c5361b8906d 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5095,6 +5095,34 @@ static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname,
return 0;
}
+static int sol_tcp_setsockopt(struct sock *sk, int optname,
+ char *optval, int optlen)
+{
+ if (sk->sk_prot->setsockopt != tcp_setsockopt)
+ return -EINVAL;
+
+ switch (optname) {
+ case TCP_KEEPIDLE:
+ case TCP_KEEPINTVL:
+ case TCP_KEEPCNT:
+ case TCP_SYNCNT:
+ case TCP_WINDOW_CLAMP:
+ case TCP_USER_TIMEOUT:
+ case TCP_NOTSENT_LOWAT:
+ case TCP_SAVE_SYN:
+ if (optlen != sizeof(int))
+ return -EINVAL;
+ break;
+ case TCP_CONGESTION:
+ break;
+ default:
+ return bpf_sol_tcp_setsockopt(sk, optname, optval, optlen);
+ }
+
+ return do_tcp_setsockopt(sk, SOL_TCP, optname,
+ KERNEL_SOCKPTR(optval), optlen);
+}
+
static int __bpf_setsockopt(struct sock *sk, int level, int optname,
char *optval, int optlen)
{
@@ -5147,73 +5175,8 @@ static int __bpf_setsockopt(struct sock *sk, int level, int optname,
default:
ret = -EINVAL;
}
- } else if (IS_ENABLED(CONFIG_INET) && level == SOL_TCP &&
- sk->sk_prot->setsockopt == tcp_setsockopt) {
- if (optname >= TCP_BPF_IW)
- return bpf_sol_tcp_setsockopt(sk, optname,
- optval, optlen);
-
- if (optname == TCP_CONGESTION) {
- char name[TCP_CA_NAME_MAX];
-
- strncpy(name, optval, min_t(long, optlen,
- TCP_CA_NAME_MAX-1));
- name[TCP_CA_NAME_MAX-1] = 0;
- ret = tcp_set_congestion_control(sk, name, false, true);
- } else {
- struct inet_connection_sock *icsk = inet_csk(sk);
- struct tcp_sock *tp = tcp_sk(sk);
-
- if (optlen != sizeof(int))
- return -EINVAL;
-
- val = *((int *)optval);
- /* Only some options are supported */
- switch (optname) {
- case TCP_SAVE_SYN:
- if (val < 0 || val > 1)
- ret = -EINVAL;
- else
- tp->save_syn = val;
- break;
- case TCP_KEEPIDLE:
- ret = tcp_sock_set_keepidle_locked(sk, val);
- break;
- case TCP_KEEPINTVL:
- if (val < 1 || val > MAX_TCP_KEEPINTVL)
- ret = -EINVAL;
- else
- tp->keepalive_intvl = val * HZ;
- break;
- case TCP_KEEPCNT:
- if (val < 1 || val > MAX_TCP_KEEPCNT)
- ret = -EINVAL;
- else
- tp->keepalive_probes = val;
- break;
- case TCP_SYNCNT:
- if (val < 1 || val > MAX_TCP_SYNCNT)
- ret = -EINVAL;
- else
- icsk->icsk_syn_retries = val;
- break;
- case TCP_USER_TIMEOUT:
- if (val < 0)
- ret = -EINVAL;
- else
- icsk->icsk_user_timeout = val;
- break;
- case TCP_NOTSENT_LOWAT:
- tp->notsent_lowat = val;
- sk->sk_write_space(sk);
- break;
- case TCP_WINDOW_CLAMP:
- ret = tcp_set_window_clamp(sk, val);
- break;
- default:
- ret = -EINVAL;
- }
- }
+ } else if (IS_ENABLED(CONFIG_INET) && level == SOL_TCP) {
+ return sol_tcp_setsockopt(sk, optname, optval, optlen);
} else {
ret = -EINVAL;
}
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index cfed84b1883f..a6986f201f92 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3479,8 +3479,8 @@ int tcp_set_window_clamp(struct sock *sk, int val)
/*
* Socket option code for TCP.
*/
-static int do_tcp_setsockopt(struct sock *sk, int level, int optname,
- sockptr_t optval, unsigned int optlen)
+int do_tcp_setsockopt(struct sock *sk, int level, int optname,
+ sockptr_t optval, unsigned int optlen)
{
struct tcp_sock *tp = tcp_sk(sk);
struct inet_connection_sock *icsk = inet_csk(sk);
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v3 bpf-next 12/15] bpf: Change bpf_setsockopt(SOL_IP) to reuse do_ip_setsockopt()
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
` (10 preceding siblings ...)
2022-08-10 19:08 ` [PATCH v3 bpf-next 11/15] bpf: Change bpf_setsockopt(SOL_TCP) to reuse do_tcp_setsockopt() Martin KaFai Lau
@ 2022-08-10 19:08 ` Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 13/15] bpf: Change bpf_setsockopt(SOL_IPV6) to reuse do_ipv6_setsockopt() Martin KaFai Lau
` (3 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:08 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
After the prep work in the previous patches,
this patch removes the dup code from bpf_setsockopt(SOL_IP)
and reuses the implementation in do_ip_setsockopt().
The existing optname white-list is refactored into a new
function sol_ip_setsockopt().
NOTE,
the current bpf_setsockopt(IP_TOS) is quite different from the
the do_ip_setsockopt(IP_TOS). For example, it does not take
the INET_ECN_MASK into the account for tcp and also does not adjust
sk->sk_priority. It looks like the current bpf_setsockopt(IP_TOS)
was referencing the IPV6_TCLASS implementation instead of IP_TOS.
This patch tries to rectify that by using the do_ip_setsockopt(IP_TOS).
While this is a behavior change, the do_ip_setsockopt(IP_TOS) behavior
is arguably what the user is expecting. At least, the INET_ECN_MASK bits
should be masked out for tcp.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
include/net/ip.h | 2 ++
net/core/filter.c | 40 ++++++++++++++++++++--------------------
net/ipv4/ip_sockglue.c | 4 ++--
3 files changed, 24 insertions(+), 22 deletions(-)
diff --git a/include/net/ip.h b/include/net/ip.h
index 1c979fd1904c..34fa5b0f0a0e 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -743,6 +743,8 @@ void ip_cmsg_recv_offset(struct msghdr *msg, struct sock *sk,
int ip_cmsg_send(struct sock *sk, struct msghdr *msg,
struct ipcm_cookie *ipc, bool allow_ipv6);
DECLARE_STATIC_KEY_FALSE(ip4_min_ttl);
+int do_ip_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
+ unsigned int optlen);
int ip_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
unsigned int optlen);
int ip_getsockopt(struct sock *sk, int level, int optname, char __user *optval,
diff --git a/net/core/filter.c b/net/core/filter.c
index 0c5361b8906d..d236f71ff501 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5123,6 +5123,25 @@ static int sol_tcp_setsockopt(struct sock *sk, int optname,
KERNEL_SOCKPTR(optval), optlen);
}
+static int sol_ip_setsockopt(struct sock *sk, int optname,
+ char *optval, int optlen)
+{
+ if (sk->sk_family != AF_INET)
+ return -EINVAL;
+
+ switch (optname) {
+ case IP_TOS:
+ if (optlen != sizeof(int))
+ return -EINVAL;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return do_ip_setsockopt(sk, SOL_IP, optname,
+ KERNEL_SOCKPTR(optval), optlen);
+}
+
static int __bpf_setsockopt(struct sock *sk, int level, int optname,
char *optval, int optlen)
{
@@ -5134,26 +5153,7 @@ static int __bpf_setsockopt(struct sock *sk, int level, int optname,
if (level == SOL_SOCKET) {
return sol_socket_setsockopt(sk, optname, optval, optlen);
} else if (IS_ENABLED(CONFIG_INET) && level == SOL_IP) {
- if (optlen != sizeof(int) || sk->sk_family != AF_INET)
- return -EINVAL;
-
- val = *((int *)optval);
- /* Only some options are supported */
- switch (optname) {
- case IP_TOS:
- if (val < -1 || val > 0xff) {
- ret = -EINVAL;
- } else {
- struct inet_sock *inet = inet_sk(sk);
-
- if (val == -1)
- val = 0;
- inet->tos = val;
- }
- break;
- default:
- ret = -EINVAL;
- }
+ return sol_ip_setsockopt(sk, optname, optval, optlen);
} else if (IS_ENABLED(CONFIG_IPV6) && level == SOL_IPV6) {
if (optlen != sizeof(int) || sk->sk_family != AF_INET6)
return -EINVAL;
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index a3c496580e6b..751fa69cb557 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -888,8 +888,8 @@ static int compat_ip_mcast_join_leave(struct sock *sk, int optname,
DEFINE_STATIC_KEY_FALSE(ip4_min_ttl);
-static int do_ip_setsockopt(struct sock *sk, int level, int optname,
- sockptr_t optval, unsigned int optlen)
+int do_ip_setsockopt(struct sock *sk, int level, int optname,
+ sockptr_t optval, unsigned int optlen)
{
struct inet_sock *inet = inet_sk(sk);
struct net *net = sock_net(sk);
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v3 bpf-next 13/15] bpf: Change bpf_setsockopt(SOL_IPV6) to reuse do_ipv6_setsockopt()
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
` (11 preceding siblings ...)
2022-08-10 19:08 ` [PATCH v3 bpf-next 12/15] bpf: Change bpf_setsockopt(SOL_IP) to reuse do_ip_setsockopt() Martin KaFai Lau
@ 2022-08-10 19:08 ` Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 14/15] bpf: Add a few optnames to bpf_setsockopt Martin KaFai Lau
` (2 subsequent siblings)
15 siblings, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:08 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
After the prep work in the previous patches,
this patch removes the dup code from bpf_setsockopt(SOL_IPV6)
and reuses the implementation in do_ipv6_setsockopt().
ipv6 could be compiled as a module. Like how other code solved it
with stubs in ipv6_stubs.h, this patch adds the do_ipv6_setsockopt
to the ipv6_bpf_stub.
The current bpf_setsockopt(IPV6_TCLASS) does not take the
INET_ECN_MASK into the account for tcp. The
do_ipv6_setsockopt(IPV6_TCLASS) will handle it correctly.
The existing optname white-list is refactored into a new
function sol_ipv6_setsockopt().
After this last SOL_IPV6 dup code removal, the __bpf_setsockopt()
is simplified enough that the extra "{ }" around the if statement
can be removed.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
include/net/ipv6.h | 2 ++
include/net/ipv6_stubs.h | 2 ++
net/core/filter.c | 56 +++++++++++++++++++---------------------
net/ipv6/af_inet6.c | 1 +
net/ipv6/ipv6_sockglue.c | 4 +--
5 files changed, 33 insertions(+), 32 deletions(-)
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index de9dcc5652c4..c110d9032083 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1156,6 +1156,8 @@ struct in6_addr *fl6_update_dst(struct flowi6 *fl6,
*/
DECLARE_STATIC_KEY_FALSE(ip6_min_hopcount);
+int do_ipv6_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
+ unsigned int optlen);
int ipv6_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
unsigned int optlen);
int ipv6_getsockopt(struct sock *sk, int level, int optname,
diff --git a/include/net/ipv6_stubs.h b/include/net/ipv6_stubs.h
index 45e0339be6fa..8692698b01cf 100644
--- a/include/net/ipv6_stubs.h
+++ b/include/net/ipv6_stubs.h
@@ -81,6 +81,8 @@ struct ipv6_bpf_stub {
const struct in6_addr *daddr, __be16 dport,
int dif, int sdif, struct udp_table *tbl,
struct sk_buff *skb);
+ int (*ipv6_setsockopt)(struct sock *sk, int level, int optname,
+ sockptr_t optval, unsigned int optlen);
};
extern const struct ipv6_bpf_stub *ipv6_bpf_stub __read_mostly;
diff --git a/net/core/filter.c b/net/core/filter.c
index d236f71ff501..de6f90a2ec0d 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5142,45 +5142,41 @@ static int sol_ip_setsockopt(struct sock *sk, int optname,
KERNEL_SOCKPTR(optval), optlen);
}
+static int sol_ipv6_setsockopt(struct sock *sk, int optname,
+ char *optval, int optlen)
+{
+ if (sk->sk_family != AF_INET6)
+ return -EINVAL;
+
+ switch (optname) {
+ case IPV6_TCLASS:
+ if (optlen != sizeof(int))
+ return -EINVAL;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return ipv6_bpf_stub->ipv6_setsockopt(sk, SOL_IPV6, optname,
+ KERNEL_SOCKPTR(optval), optlen);
+}
+
static int __bpf_setsockopt(struct sock *sk, int level, int optname,
char *optval, int optlen)
{
- int val, ret = 0;
-
if (!sk_fullsock(sk))
return -EINVAL;
- if (level == SOL_SOCKET) {
+ if (level == SOL_SOCKET)
return sol_socket_setsockopt(sk, optname, optval, optlen);
- } else if (IS_ENABLED(CONFIG_INET) && level == SOL_IP) {
+ else if (IS_ENABLED(CONFIG_INET) && level == SOL_IP)
return sol_ip_setsockopt(sk, optname, optval, optlen);
- } else if (IS_ENABLED(CONFIG_IPV6) && level == SOL_IPV6) {
- if (optlen != sizeof(int) || sk->sk_family != AF_INET6)
- return -EINVAL;
-
- val = *((int *)optval);
- /* Only some options are supported */
- switch (optname) {
- case IPV6_TCLASS:
- if (val < -1 || val > 0xff) {
- ret = -EINVAL;
- } else {
- struct ipv6_pinfo *np = inet6_sk(sk);
-
- if (val == -1)
- val = 0;
- np->tclass = val;
- }
- break;
- default:
- ret = -EINVAL;
- }
- } else if (IS_ENABLED(CONFIG_INET) && level == SOL_TCP) {
+ else if (IS_ENABLED(CONFIG_IPV6) && level == SOL_IPV6)
+ return sol_ipv6_setsockopt(sk, optname, optval, optlen);
+ else if (IS_ENABLED(CONFIG_INET) && level == SOL_TCP)
return sol_tcp_setsockopt(sk, optname, optval, optlen);
- } else {
- ret = -EINVAL;
- }
- return ret;
+
+ return -EINVAL;
}
static int _bpf_setsockopt(struct sock *sk, int level, int optname,
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 2ce0c44d0081..cadc97852787 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -1057,6 +1057,7 @@ static const struct ipv6_stub ipv6_stub_impl = {
static const struct ipv6_bpf_stub ipv6_bpf_stub_impl = {
.inet6_bind = __inet6_bind,
.udp6_lib_lookup = __udp6_lib_lookup,
+ .ipv6_setsockopt = do_ipv6_setsockopt,
};
static int __init inet6_init(void)
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index d23de48ff612..a4535bdbd310 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -391,8 +391,8 @@ static int ipv6_set_opt_hdr(struct sock *sk, int optname, sockptr_t optval,
return err;
}
-static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
- sockptr_t optval, unsigned int optlen)
+int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
+ sockptr_t optval, unsigned int optlen)
{
struct ipv6_pinfo *np = inet6_sk(sk);
struct net *net = sock_net(sk);
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v3 bpf-next 14/15] bpf: Add a few optnames to bpf_setsockopt
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
` (12 preceding siblings ...)
2022-08-10 19:08 ` [PATCH v3 bpf-next 13/15] bpf: Change bpf_setsockopt(SOL_IPV6) to reuse do_ipv6_setsockopt() Martin KaFai Lau
@ 2022-08-10 19:08 ` Martin KaFai Lau
2022-08-10 19:09 ` [PATCH v3 bpf-next 15/15] selftests/bpf: bpf_setsockopt tests Martin KaFai Lau
2022-08-11 17:04 ` [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() sdf
15 siblings, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:08 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
This patch adds a few optnames for bpf_setsockopt:
SO_REUSEADDR, IPV6_AUTOFLOWLABEL, TCP_MAXSEG, TCP_NODELAY,
and TCP_THIN_LINEAR_TIMEOUTS.
Thanks to the previous patches of this set, all additions can reuse
the sk_setsockopt(), do_ipv6_setsockopt(), and do_tcp_setsockopt().
The only change here is to allow them in bpf_setsockopt.
The bpf prog has been able to read all members of a sk by
using PTR_TO_BTF_ID of a sk. The optname additions here can also be
read by the same approach. Meaning there is a way to read
the values back.
These optnames can also be added to bpf_getsockopt() later with
another patch set that makes the bpf_getsockopt() to reuse
the sock_getsockopt(), tcp_getsockopt(), and ip[v6]_getsockopt().
Thus, this patch does not add more duplicated code to
bpf_getsockopt() now.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
net/core/filter.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/net/core/filter.c b/net/core/filter.c
index de6f90a2ec0d..52736c833a5f 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5017,6 +5017,7 @@ static int sol_socket_setsockopt(struct sock *sk, int optname,
char *optval, int optlen)
{
switch (optname) {
+ case SO_REUSEADDR:
case SO_SNDBUF:
case SO_RCVBUF:
case SO_KEEPALIVE:
@@ -5102,11 +5103,14 @@ static int sol_tcp_setsockopt(struct sock *sk, int optname,
return -EINVAL;
switch (optname) {
+ case TCP_NODELAY:
+ case TCP_MAXSEG:
case TCP_KEEPIDLE:
case TCP_KEEPINTVL:
case TCP_KEEPCNT:
case TCP_SYNCNT:
case TCP_WINDOW_CLAMP:
+ case TCP_THIN_LINEAR_TIMEOUTS:
case TCP_USER_TIMEOUT:
case TCP_NOTSENT_LOWAT:
case TCP_SAVE_SYN:
@@ -5150,6 +5154,7 @@ static int sol_ipv6_setsockopt(struct sock *sk, int optname,
switch (optname) {
case IPV6_TCLASS:
+ case IPV6_AUTOFLOWLABEL:
if (optlen != sizeof(int))
return -EINVAL;
break;
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v3 bpf-next 15/15] selftests/bpf: bpf_setsockopt tests
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
` (13 preceding siblings ...)
2022-08-10 19:08 ` [PATCH v3 bpf-next 14/15] bpf: Add a few optnames to bpf_setsockopt Martin KaFai Lau
@ 2022-08-10 19:09 ` Martin KaFai Lau
2022-08-11 17:04 ` [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() sdf
15 siblings, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-10 19:09 UTC (permalink / raw)
To: bpf, netdev
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni, Stanislav Fomichev
This patch adds tests to exercise optnames that are allowed
in bpf_setsockopt().
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
.../selftests/bpf/prog_tests/setget_sockopt.c | 125 +++++
.../selftests/bpf/progs/bpf_tracing_net.h | 31 +-
.../selftests/bpf/progs/setget_sockopt.c | 451 ++++++++++++++++++
3 files changed, 606 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
create mode 100644 tools/testing/selftests/bpf/progs/setget_sockopt.c
diff --git a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
new file mode 100644
index 000000000000..018611e6b248
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c
@@ -0,0 +1,125 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) Meta Platforms, Inc. and affiliates. */
+
+#define _GNU_SOURCE
+#include <sched.h>
+#include <linux/socket.h>
+#include <net/if.h>
+
+#include "test_progs.h"
+#include "cgroup_helpers.h"
+#include "network_helpers.h"
+
+#include "setget_sockopt.skel.h"
+
+#define CG_NAME "/setget-sockopt-test"
+
+static const char addr4_str[] = "127.0.0.1";
+static const char addr6_str[] = "::1";
+static struct setget_sockopt *skel;
+static int cg_fd;
+
+static int create_netns(void)
+{
+ if (!ASSERT_OK(unshare(CLONE_NEWNET), "create netns"))
+ return -1;
+
+ if (!ASSERT_OK(system("ip link set dev lo up"), "set lo up"))
+ return -1;
+
+ if (!ASSERT_OK(system("ip link add dev binddevtest1 type veth peer name binddevtest2"),
+ "add veth"))
+ return -1;
+
+ if (!ASSERT_OK(system("ip link set dev binddevtest1 up"),
+ "bring veth up"))
+ return -1;
+
+ return 0;
+}
+
+static void test_tcp(int family)
+{
+ struct setget_sockopt__bss *bss = skel->bss;
+ int sfd, cfd;
+
+ memset(bss, 0, sizeof(*bss));
+
+ sfd = start_server(family, SOCK_STREAM,
+ family == AF_INET6 ? addr6_str : addr4_str, 0, 0);
+ if (!ASSERT_GE(sfd, 0, "start_server"))
+ return;
+
+ cfd = connect_to_fd(sfd, 0);
+ if (!ASSERT_GE(cfd, 0, "connect_to_fd_server")) {
+ close(sfd);
+ return;
+ }
+ close(sfd);
+ close(cfd);
+
+ ASSERT_EQ(bss->nr_listen, 1, "nr_listen");
+ ASSERT_EQ(bss->nr_connect, 1, "nr_connect");
+ ASSERT_EQ(bss->nr_active, 1, "nr_active");
+ ASSERT_EQ(bss->nr_passive, 1, "nr_passive");
+ ASSERT_EQ(bss->nr_socket_post_create, 2, "nr_socket_post_create");
+ ASSERT_EQ(bss->nr_binddev, 2, "nr_bind");
+}
+
+static void test_udp(int family)
+{
+ struct setget_sockopt__bss *bss = skel->bss;
+ int sfd;
+
+ memset(bss, 0, sizeof(*bss));
+
+ sfd = start_server(family, SOCK_DGRAM,
+ family == AF_INET6 ? addr6_str : addr4_str, 0, 0);
+ if (!ASSERT_GE(sfd, 0, "start_server"))
+ return;
+ close(sfd);
+
+ ASSERT_GE(bss->nr_socket_post_create, 1, "nr_socket_post_create");
+ ASSERT_EQ(bss->nr_binddev, 1, "nr_bind");
+}
+
+void test_setget_sockopt(void)
+{
+ cg_fd = test__join_cgroup(CG_NAME);
+ if (cg_fd < 0)
+ return;
+
+ if (create_netns())
+ goto done;
+
+ skel = setget_sockopt__open();
+ if (!ASSERT_OK_PTR(skel, "open skel"))
+ goto done;
+
+ strcpy(skel->rodata->veth, "binddevtest1");
+ skel->rodata->veth_ifindex = if_nametoindex("binddevtest1");
+ if (!ASSERT_GT(skel->rodata->veth_ifindex, 0, "if_nametoindex"))
+ goto done;
+
+ if (!ASSERT_OK(setget_sockopt__load(skel), "load skel"))
+ goto done;
+
+ skel->links.skops_sockopt =
+ bpf_program__attach_cgroup(skel->progs.skops_sockopt, cg_fd);
+ if (!ASSERT_OK_PTR(skel->links.skops_sockopt, "attach cgroup"))
+ goto done;
+
+ skel->links.socket_post_create =
+ bpf_program__attach_cgroup(skel->progs.socket_post_create, cg_fd);
+ if (!ASSERT_OK_PTR(skel->links.socket_post_create, "attach_cgroup"))
+ goto done;
+
+ test_tcp(AF_INET6);
+ test_tcp(AF_INET);
+ test_udp(AF_INET6);
+ test_udp(AF_INET);
+
+done:
+ setget_sockopt__destroy(skel);
+ close(cg_fd);
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
index 98dd2c4815f0..5ebc6dabef84 100644
--- a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
+++ b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
@@ -6,13 +6,40 @@
#define AF_INET6 10
#define SOL_SOCKET 1
+#define SO_REUSEADDR 2
#define SO_SNDBUF 7
-#define __SO_ACCEPTCON (1 << 16)
+#define SO_RCVBUF 8
+#define SO_KEEPALIVE 9
#define SO_PRIORITY 12
+#define SO_REUSEPORT 15
+#define SO_RCVLOWAT 18
+#define SO_BINDTODEVICE 25
+#define SO_MARK 36
+#define SO_MAX_PACING_RATE 47
+#define SO_BINDTOIFINDEX 62
+#define SO_TXREHASH 74
+#define __SO_ACCEPTCON (1 << 16)
+
+#define IP_TOS 1
+
+#define IPV6_TCLASS 67
+#define IPV6_AUTOFLOWLABEL 70
#define SOL_TCP 6
+#define TCP_NODELAY 1
+#define TCP_MAXSEG 2
+#define TCP_KEEPIDLE 4
+#define TCP_KEEPINTVL 5
+#define TCP_KEEPCNT 6
+#define TCP_SYNCNT 7
+#define TCP_WINDOW_CLAMP 10
#define TCP_CONGESTION 13
+#define TCP_THIN_LINEAR_TIMEOUTS 16
+#define TCP_USER_TIMEOUT 18
+#define TCP_NOTSENT_LOWAT 25
+#define TCP_SAVE_SYN 27
#define TCP_CA_NAME_MAX 16
+#define TCP_NAGLE_OFF 1
#define ICSK_TIME_RETRANS 1
#define ICSK_TIME_PROBE0 3
@@ -49,6 +76,8 @@
#define sk_state __sk_common.skc_state
#define sk_v6_daddr __sk_common.skc_v6_daddr
#define sk_v6_rcv_saddr __sk_common.skc_v6_rcv_saddr
+#define sk_flags __sk_common.skc_flags
+#define sk_reuse __sk_common.skc_reuse
#define s6_addr32 in6_u.u6_addr32
diff --git a/tools/testing/selftests/bpf/progs/setget_sockopt.c b/tools/testing/selftests/bpf/progs/setget_sockopt.c
new file mode 100644
index 000000000000..08b2597ce4b8
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/setget_sockopt.c
@@ -0,0 +1,451 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) Meta Platforms, Inc. and affiliates. */
+
+#include "vmlinux.h"
+#include "bpf_tracing_net.h"
+#include <bpf/bpf_core_read.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+#ifndef ARRAY_SIZE
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+#endif
+
+extern unsigned long CONFIG_HZ __kconfig;
+
+const volatile char veth[IFNAMSIZ];
+const volatile int veth_ifindex;
+
+int nr_listen;
+int nr_passive;
+int nr_active;
+int nr_connect;
+int nr_binddev;
+int nr_socket_post_create;
+
+struct sockopt_test {
+ int opt;
+ int new;
+ int restore;
+ int expected;
+ int tcp_expected;
+ int flip:1;
+};
+
+static const char cubic_cc[] = "cubic";
+static const char reno_cc[] = "reno";
+
+static const struct sockopt_test sol_socket_tests[] = {
+ { .opt = SO_REUSEADDR, .flip = 1, },
+ { .opt = SO_SNDBUF, .new = 8123, .expected = 8123 * 2, },
+ { .opt = SO_RCVBUF, .new = 8123, .expected = 8123 * 2, },
+ { .opt = SO_KEEPALIVE, .flip = 1, },
+ { .opt = SO_PRIORITY, .new = 0xeb9f, .expected = 0xeb9f, },
+ { .opt = SO_REUSEPORT, .flip = 1, },
+ { .opt = SO_RCVLOWAT, .new = 8123, .expected = 8123, },
+ { .opt = SO_MARK, .new = 0xeb9f, .expected = 0xeb9f, },
+ { .opt = SO_MAX_PACING_RATE, .new = 0xeb9f, .expected = 0xeb9f, },
+ { .opt = SO_TXREHASH, .flip = 1, },
+ { .opt = 0, },
+};
+
+static const struct sockopt_test sol_tcp_tests[] = {
+ { .opt = TCP_NODELAY, .flip = 1, },
+ { .opt = TCP_MAXSEG, .new = 1314, .expected = 1314, },
+ { .opt = TCP_KEEPIDLE, .new = 123, .expected = 123, .restore = 321, },
+ { .opt = TCP_KEEPINTVL, .new = 123, .expected = 123, .restore = 321, },
+ { .opt = TCP_KEEPCNT, .new = 123, .expected = 123, .restore = 124, },
+ { .opt = TCP_SYNCNT, .new = 123, .expected = 123, .restore = 124, },
+ { .opt = TCP_WINDOW_CLAMP, .new = 8123, .expected = 8123, .restore = 8124, },
+ { .opt = TCP_CONGESTION, },
+ { .opt = TCP_THIN_LINEAR_TIMEOUTS, .flip = 1, },
+ { .opt = TCP_USER_TIMEOUT, .new = 123400, .expected = 123400, },
+ { .opt = TCP_NOTSENT_LOWAT, .new = 1314, .expected = 1314, },
+ { .opt = TCP_SAVE_SYN, .new = 1, .expected = 1, },
+ { .opt = 0, },
+};
+
+static const struct sockopt_test sol_ip_tests[] = {
+ { .opt = IP_TOS, .new = 0xe1, .expected = 0xe1, .tcp_expected = 0xe0, },
+ { .opt = 0, },
+};
+
+static const struct sockopt_test sol_ipv6_tests[] = {
+ { .opt = IPV6_TCLASS, .new = 0xe1, .expected = 0xe1, .tcp_expected = 0xe0, },
+ { .opt = IPV6_AUTOFLOWLABEL, .flip = 1, },
+ { .opt = 0, },
+};
+
+struct loop_ctx {
+ void *ctx;
+ struct sock *sk;
+};
+
+static int __bpf_getsockopt(void *ctx, struct sock *sk,
+ int level, int opt, int *optval,
+ int optlen)
+{
+ if (level == SOL_SOCKET) {
+ switch (opt) {
+ case SO_REUSEADDR:
+ *optval = !!BPF_CORE_READ_BITFIELD(sk, sk_reuse);
+ break;
+ case SO_KEEPALIVE:
+ *optval = !!(sk->sk_flags & (1UL << 3));
+ break;
+ case SO_RCVLOWAT:
+ *optval = sk->sk_rcvlowat;
+ break;
+ case SO_MAX_PACING_RATE:
+ *optval = sk->sk_max_pacing_rate;
+ break;
+ default:
+ return bpf_getsockopt(ctx, level, opt, optval, optlen);
+ }
+ return 0;
+ }
+
+ if (level == IPPROTO_TCP) {
+ struct tcp_sock *tp = bpf_skc_to_tcp_sock(sk);
+
+ if (!tp)
+ return -1;
+
+ switch (opt) {
+ case TCP_NODELAY:
+ *optval = !!(BPF_CORE_READ_BITFIELD(tp, nonagle) & TCP_NAGLE_OFF);
+ break;
+ case TCP_MAXSEG:
+ *optval = tp->rx_opt.user_mss;
+ break;
+ case TCP_KEEPIDLE:
+ *optval = tp->keepalive_time / CONFIG_HZ;
+ break;
+ case TCP_SYNCNT:
+ *optval = tp->inet_conn.icsk_syn_retries;
+ break;
+ case TCP_KEEPINTVL:
+ *optval = tp->keepalive_intvl / CONFIG_HZ;
+ break;
+ case TCP_KEEPCNT:
+ *optval = tp->keepalive_probes;
+ break;
+ case TCP_WINDOW_CLAMP:
+ *optval = tp->window_clamp;
+ break;
+ case TCP_THIN_LINEAR_TIMEOUTS:
+ *optval = !!BPF_CORE_READ_BITFIELD(tp, thin_lto);
+ break;
+ case TCP_USER_TIMEOUT:
+ *optval = tp->inet_conn.icsk_user_timeout;
+ break;
+ case TCP_NOTSENT_LOWAT:
+ *optval = tp->notsent_lowat;
+ break;
+ case TCP_SAVE_SYN:
+ *optval = BPF_CORE_READ_BITFIELD(tp, save_syn);
+ break;
+ default:
+ return bpf_getsockopt(ctx, level, opt, optval, optlen);
+ }
+ return 0;
+ }
+
+ if (level == IPPROTO_IPV6) {
+ switch (opt) {
+ case IPV6_AUTOFLOWLABEL: {
+ __u16 proto = sk->sk_protocol;
+ struct inet_sock *inet_sk;
+
+ if (proto == IPPROTO_TCP)
+ inet_sk = (struct inet_sock *)bpf_skc_to_tcp_sock(sk);
+ else
+ inet_sk = (struct inet_sock *)bpf_skc_to_udp6_sock(sk);
+
+ if (!inet_sk)
+ return -1;
+
+ *optval = !!inet_sk->pinet6->autoflowlabel;
+ break;
+ }
+ default:
+ return bpf_getsockopt(ctx, level, opt, optval, optlen);
+ }
+ return 0;
+ }
+
+ return bpf_getsockopt(ctx, level, opt, optval, optlen);
+}
+
+static int bpf_test_sockopt_flip(void *ctx, struct sock *sk,
+ const struct sockopt_test *t,
+ int level)
+{
+ int old, tmp, new, opt = t->opt;
+
+ opt = t->opt;
+
+ if (__bpf_getsockopt(ctx, sk, level, opt, &old, sizeof(old)))
+ return 1;
+ /* kernel initialized txrehash to 255 */
+ if (level == SOL_SOCKET && opt == SO_TXREHASH && old != 0 && old != 1)
+ old = 1;
+
+ new = !old;
+ if (bpf_setsockopt(ctx, level, opt, &new, sizeof(new)))
+ return 1;
+ if (__bpf_getsockopt(ctx, sk, level, opt, &tmp, sizeof(tmp)) ||
+ tmp != new)
+ return 1;
+
+ if (bpf_setsockopt(ctx, level, opt, &old, sizeof(old)))
+ return 1;
+
+ return 0;
+}
+
+static int bpf_test_sockopt_int(void *ctx, struct sock *sk,
+ const struct sockopt_test *t,
+ int level)
+{
+ int old, tmp, new, expected, opt;
+
+ opt = t->opt;
+ new = t->new;
+ if (sk->sk_type == SOCK_STREAM && t->tcp_expected)
+ expected = t->tcp_expected;
+ else
+ expected = t->expected;
+
+ if (__bpf_getsockopt(ctx, sk, level, opt, &old, sizeof(old)) ||
+ old == new)
+ return 1;
+
+ if (bpf_setsockopt(ctx, level, opt, &new, sizeof(new)))
+ return 1;
+ if (__bpf_getsockopt(ctx, sk, level, opt, &tmp, sizeof(tmp)) ||
+ tmp != expected)
+ return 1;
+
+ if (t->restore)
+ old = t->restore;
+ if (bpf_setsockopt(ctx, level, opt, &old, sizeof(old)))
+ return 1;
+
+ return 0;
+}
+
+static int bpf_test_socket_sockopt(__u32 i, struct loop_ctx *lc)
+{
+ const struct sockopt_test *t;
+
+ if (i >= ARRAY_SIZE(sol_socket_tests))
+ return 1;
+
+ t = &sol_socket_tests[i];
+ if (!t->opt)
+ return 1;
+
+ if (t->flip)
+ return bpf_test_sockopt_flip(lc->ctx, lc->sk, t, SOL_SOCKET);
+
+ return bpf_test_sockopt_int(lc->ctx, lc->sk, t, SOL_SOCKET);
+}
+
+static int bpf_test_ip_sockopt(__u32 i, struct loop_ctx *lc)
+{
+ const struct sockopt_test *t;
+
+ if (i >= ARRAY_SIZE(sol_ip_tests))
+ return 1;
+
+ t = &sol_ip_tests[i];
+ if (!t->opt)
+ return 1;
+
+ if (t->flip)
+ return bpf_test_sockopt_flip(lc->ctx, lc->sk, t, IPPROTO_IP);
+
+ return bpf_test_sockopt_int(lc->ctx, lc->sk, t, IPPROTO_IP);
+}
+
+static int bpf_test_ipv6_sockopt(__u32 i, struct loop_ctx *lc)
+{
+ const struct sockopt_test *t;
+
+ if (i >= ARRAY_SIZE(sol_ipv6_tests))
+ return 1;
+
+ t = &sol_ipv6_tests[i];
+ if (!t->opt)
+ return 1;
+
+ if (t->flip)
+ return bpf_test_sockopt_flip(lc->ctx, lc->sk, t, IPPROTO_IPV6);
+
+ return bpf_test_sockopt_int(lc->ctx, lc->sk, t, IPPROTO_IPV6);
+}
+
+static int bpf_test_tcp_sockopt(__u32 i, struct loop_ctx *lc)
+{
+ const struct sockopt_test *t;
+ struct sock *sk;
+ void *ctx;
+
+ if (i >= ARRAY_SIZE(sol_tcp_tests))
+ return 1;
+
+ t = &sol_tcp_tests[i];
+ if (!t->opt)
+ return 1;
+
+ ctx = lc->ctx;
+ sk = lc->sk;
+
+ if (t->opt == TCP_CONGESTION) {
+ char old_cc[16], tmp_cc[16];
+ const char *new_cc;
+
+ if (bpf_getsockopt(ctx, IPPROTO_TCP, TCP_CONGESTION, old_cc, sizeof(old_cc)))
+ return 1;
+ if (!bpf_strncmp(old_cc, sizeof(old_cc), cubic_cc))
+ new_cc = reno_cc;
+ else
+ new_cc = cubic_cc;
+ if (bpf_setsockopt(ctx, IPPROTO_TCP, TCP_CONGESTION, (void *)new_cc,
+ sizeof(new_cc)))
+ return 1;
+ if (bpf_getsockopt(ctx, IPPROTO_TCP, TCP_CONGESTION, tmp_cc, sizeof(tmp_cc)))
+ return 1;
+ if (bpf_strncmp(tmp_cc, sizeof(tmp_cc), new_cc))
+ return 1;
+ if (bpf_setsockopt(ctx, IPPROTO_TCP, TCP_CONGESTION, old_cc, sizeof(old_cc)))
+ return 1;
+ return 0;
+ }
+
+ if (t->flip)
+ return bpf_test_sockopt_flip(ctx, sk, t, IPPROTO_TCP);
+
+ return bpf_test_sockopt_int(ctx, sk, t, IPPROTO_TCP);
+}
+
+static int bpf_test_sockopt(void *ctx, struct sock *sk)
+{
+ struct loop_ctx lc = { .ctx = ctx, .sk = sk, };
+ __u16 family, proto;
+ int n;
+
+ family = sk->sk_family;
+ proto = sk->sk_protocol;
+
+ n = bpf_loop(ARRAY_SIZE(sol_socket_tests), bpf_test_socket_sockopt, &lc, 0);
+ if (n != ARRAY_SIZE(sol_socket_tests))
+ return -1;
+
+ if (proto == IPPROTO_TCP) {
+ n = bpf_loop(ARRAY_SIZE(sol_tcp_tests), bpf_test_tcp_sockopt, &lc, 0);
+ if (n != ARRAY_SIZE(sol_tcp_tests))
+ return -1;
+ }
+
+ if (family == AF_INET) {
+ n = bpf_loop(ARRAY_SIZE(sol_ip_tests), bpf_test_ip_sockopt, &lc, 0);
+ if (n != ARRAY_SIZE(sol_ip_tests))
+ return -1;
+ } else {
+ n = bpf_loop(ARRAY_SIZE(sol_ipv6_tests), bpf_test_ipv6_sockopt, &lc, 0);
+ if (n != ARRAY_SIZE(sol_ipv6_tests))
+ return -1;
+ }
+
+ return 0;
+}
+
+static int binddev_test(void *ctx)
+{
+ const char empty_ifname[] = "";
+ int ifindex, zero = 0;
+
+ if (bpf_setsockopt(ctx, SOL_SOCKET, SO_BINDTODEVICE,
+ (void *)veth, sizeof(veth)))
+ return -1;
+ if (bpf_getsockopt(ctx, SOL_SOCKET, SO_BINDTOIFINDEX,
+ &ifindex, sizeof(int)) ||
+ ifindex != veth_ifindex)
+ return -1;
+
+ if (bpf_setsockopt(ctx, SOL_SOCKET, SO_BINDTODEVICE,
+ (void *)empty_ifname, sizeof(empty_ifname)))
+ return -1;
+ if (bpf_getsockopt(ctx, SOL_SOCKET, SO_BINDTOIFINDEX,
+ &ifindex, sizeof(int)) ||
+ ifindex != 0)
+ return -1;
+
+ if (bpf_setsockopt(ctx, SOL_SOCKET, SO_BINDTOIFINDEX,
+ (void *)&veth_ifindex, sizeof(int)))
+ return -1;
+ if (bpf_getsockopt(ctx, SOL_SOCKET, SO_BINDTOIFINDEX,
+ &ifindex, sizeof(int)) ||
+ ifindex != veth_ifindex)
+ return -1;
+
+ if (bpf_setsockopt(ctx, SOL_SOCKET, SO_BINDTOIFINDEX,
+ &zero, sizeof(int)))
+ return -1;
+ if (bpf_getsockopt(ctx, SOL_SOCKET, SO_BINDTOIFINDEX,
+ &ifindex, sizeof(int)) ||
+ ifindex != 0)
+ return -1;
+
+ return 0;
+}
+
+SEC("lsm_cgroup/socket_post_create")
+int BPF_PROG(socket_post_create, struct socket *sock, int family,
+ int type, int protocol, int kern)
+{
+ struct sock *sk = sock->sk;
+
+ if (!sk)
+ return 1;
+
+ nr_socket_post_create += !bpf_test_sockopt(sk, sk);
+ nr_binddev += !binddev_test(sk);
+
+ return 1;
+}
+
+SEC("sockops")
+int skops_sockopt(struct bpf_sock_ops *skops)
+{
+ struct bpf_sock *bpf_sk = skops->sk;
+ struct sock *sk;
+
+ if (!bpf_sk)
+ return 1;
+
+ sk = (struct sock *)bpf_skc_to_tcp_sock(bpf_sk);
+ if (!sk)
+ return 1;
+
+ switch (skops->op) {
+ case BPF_SOCK_OPS_TCP_LISTEN_CB:
+ nr_listen += !bpf_test_sockopt(skops, sk);
+ break;
+ case BPF_SOCK_OPS_TCP_CONNECT_CB:
+ nr_connect += !bpf_test_sockopt(skops, sk);
+ break;
+ case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB:
+ nr_active += !bpf_test_sockopt(skops, sk);
+ break;
+ case BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB:
+ nr_passive += !bpf_test_sockopt(skops, sk);
+ break;
+ }
+
+ return 1;
+}
+
+char _license[] SEC("license") = "GPL";
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt()
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
` (14 preceding siblings ...)
2022-08-10 19:09 ` [PATCH v3 bpf-next 15/15] selftests/bpf: bpf_setsockopt tests Martin KaFai Lau
@ 2022-08-11 17:04 ` sdf
2022-08-15 22:04 ` Daniel Borkmann
15 siblings, 1 reply; 21+ messages in thread
From: sdf @ 2022-08-11 17:04 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: bpf, netdev, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, David Miller, Eric Dumazet, Jakub Kicinski,
kernel-team, Paolo Abeni
On 08/10, Martin KaFai Lau wrote:
> The code in bpf_setsockopt() is mostly a copy-and-paste from
> the sock_setsockopt(), do_tcp_setsockopt(), do_ipv6_setsockopt(),
> and do_ip_setsockopt(). As the allowed optnames in bpf_setsockopt()
> grows, so are the duplicated code. The code between the copies
> also slowly drifted.
> This set is an effort to clean this up and reuse the existing
> {sock,do_tcp,do_ipv6,do_ip}_setsockopt() as much as possible.
> After the clean up, this set also adds a few allowed optnames
> that we need to the bpf_setsockopt().
> The initial attempt was to clean up both bpf_setsockopt() and
> bpf_getsockopt() together. However, the patch set was getting
> too long. It is beneficial to leave the bpf_getsockopt()
> out for another patch set. Thus, this set is focusing
> on the bpf_setsockopt().
> v3:
> - s/in_bpf/has_current_bpf_ctx/ (Andrii)
> - Add comments to has_current_bpf_ctx() and sockopt_lock_sock()
> (Stanislav)
> - Use vmlinux.h in selftest and add defines to bpf_tracing_net.h
> (Stanislav)
> - Use bpf_getsockopt(SO_MARK) in selftest (Stanislav)
> - Use BPF_CORE_READ_BITFIELD in selftest (Yonghong)
Reviewed-by: Stanislav Fomichev <sdf@google.com>
(I didn't go super deep on the selftest)
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt()
2022-08-11 17:04 ` [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() sdf
@ 2022-08-15 22:04 ` Daniel Borkmann
2022-08-17 5:23 ` Martin KaFai Lau
0 siblings, 1 reply; 21+ messages in thread
From: Daniel Borkmann @ 2022-08-15 22:04 UTC (permalink / raw)
To: sdf, Martin KaFai Lau
Cc: bpf, netdev, Alexei Starovoitov, Andrii Nakryiko, David Miller,
Eric Dumazet, Jakub Kicinski, kernel-team, Paolo Abeni
On 8/11/22 7:04 PM, sdf@google.com wrote:
> On 08/10, Martin KaFai Lau wrote:
>> The code in bpf_setsockopt() is mostly a copy-and-paste from
>> the sock_setsockopt(), do_tcp_setsockopt(), do_ipv6_setsockopt(),
>> and do_ip_setsockopt(). As the allowed optnames in bpf_setsockopt()
>> grows, so are the duplicated code. The code between the copies
>> also slowly drifted.
>
>> This set is an effort to clean this up and reuse the existing
>> {sock,do_tcp,do_ipv6,do_ip}_setsockopt() as much as possible.
>
>> After the clean up, this set also adds a few allowed optnames
>> that we need to the bpf_setsockopt().
>
>> The initial attempt was to clean up both bpf_setsockopt() and
>> bpf_getsockopt() together. However, the patch set was getting
>> too long. It is beneficial to leave the bpf_getsockopt()
>> out for another patch set. Thus, this set is focusing
>> on the bpf_setsockopt().
>
>> v3:
>> - s/in_bpf/has_current_bpf_ctx/ (Andrii)
>> - Add comments to has_current_bpf_ctx() and sockopt_lock_sock()
>> (Stanislav)
>> - Use vmlinux.h in selftest and add defines to bpf_tracing_net.h
>> (Stanislav)
>> - Use bpf_getsockopt(SO_MARK) in selftest (Stanislav)
>> - Use BPF_CORE_READ_BITFIELD in selftest (Yonghong)
>
> Reviewed-by: Stanislav Fomichev <sdf@google.com>
>
> (I didn't go super deep on the selftest)
Looks like that one throws a build error, fwiw:
https://github.com/kernel-patches/bpf/runs/7844497492?check_suite_focus=true
[...]
CLNG-BPF [test_maps] kfunc_call_test_subprog.o
CLNG-BPF [test_maps] bpf_iter_test_kern6.o
progs/setget_sockopt.c:39:33: error: implicit truncation from 'int' to bit-field changes value from 1 to -1 [-Werror,-Wbitfield-constant-conversion]
{ .opt = SO_REUSEADDR, .flip = 1, },
^
progs/setget_sockopt.c:42:33: error: implicit truncation from 'int' to bit-field changes value from 1 to -1 [-Werror,-Wbitfield-constant-conversion]
{ .opt = SO_KEEPALIVE, .flip = 1, },
^
progs/setget_sockopt.c:44:33: error: implicit truncation from 'int' to bit-field changes value from 1 to -1 [-Werror,-Wbitfield-constant-conversion]
{ .opt = SO_REUSEPORT, .flip = 1, },
^
CLNG-BPF [test_maps] btf__core_reloc_type_id.o
progs/setget_sockopt.c:48:32: error: implicit truncation from 'int' to bit-field changes value from 1 to -1 [-Werror,-Wbitfield-constant-conversion]
{ .opt = SO_TXREHASH, .flip = 1, },
^
progs/setget_sockopt.c:53:32: error: implicit truncation from 'int' to bit-field changes value from 1 to -1 [-Werror,-Wbitfield-constant-conversion]
{ .opt = TCP_NODELAY, .flip = 1, },
^
progs/setget_sockopt.c:61:45: error: implicit truncation from 'int' to bit-field changes value from 1 to -1 [-Werror,-Wbitfield-constant-conversion]
{ .opt = TCP_THIN_LINEAR_TIMEOUTS, .flip = 1, },
^
progs/setget_sockopt.c:75:39: error: implicit truncation from 'int' to bit-field changes value from 1 to -1 [-Werror,-Wbitfield-constant-conversion]
{ .opt = IPV6_AUTOFLOWLABEL, .flip = 1, },
^
7 errors generated.
make: *** [Makefile:521: /tmp/runner/work/bpf/bpf/tools/testing/selftests/bpf/setget_sockopt.o] Error 1
make: *** Waiting for unfinished jobs....
make: Leaving directory '/tmp/runner/work/bpf/bpf/tools/testing/selftests/bpf'
Error: Process completed with exit code 2.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v3 bpf-next 02/15] bpf: net: Avoid sk_setsockopt() taking sk lock when called from bpf
2022-08-10 19:07 ` [PATCH v3 bpf-next 02/15] bpf: net: Avoid sk_setsockopt() taking sk lock when called from bpf Martin KaFai Lau
@ 2022-08-16 3:32 ` Andrii Nakryiko
0 siblings, 0 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2022-08-16 3:32 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: bpf, netdev, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, David Miller, Eric Dumazet, Jakub Kicinski,
kernel-team, Paolo Abeni, Stanislav Fomichev
On Wed, Aug 10, 2022 at 12:10 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> Most of the code in bpf_setsockopt(SOL_SOCKET) are duplicated from
> the sk_setsockopt(). The number of supported optnames are
> increasing ever and so as the duplicated code.
>
> One issue in reusing sk_setsockopt() is that the bpf prog
> has already acquired the sk lock. This patch adds a
> has_current_bpf_ctx() to tell if the sk_setsockopt() is called from
> a bpf prog. The bpf prog calling bpf_setsockopt() is either running
> in_task() or in_serving_softirq(). Both cases have the current->bpf_ctx
> initialized. Thus, the has_current_bpf_ctx() only needs to
> test !!current->bpf_ctx.
>
> This patch also adds sockopt_{lock,release}_sock() helpers
> for sk_setsockopt() to use. These helpers will test
> has_current_bpf_ctx() before acquiring/releasing the lock. They are
> in EXPORT_SYMBOL for the ipv6 module to use in a latter patch.
>
> Note on the change in sock_setbindtodevice(). sockopt_lock_sock()
> is done in sock_setbindtodevice() instead of doing the lock_sock
> in sock_bindtoindex(..., lock_sk = true).
>
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> ---
> include/linux/bpf.h | 14 ++++++++++++++
> include/net/sock.h | 3 +++
> net/core/sock.c | 30 +++++++++++++++++++++++++++---
> 3 files changed, 44 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index a627a02cf8ab..0a600b2013cc 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1966,6 +1966,16 @@ static inline bool unprivileged_ebpf_enabled(void)
> return !sysctl_unprivileged_bpf_disabled;
> }
>
> +/* Not all bpf prog type has the bpf_ctx.
> + * Only trampoline and cgroup-bpf have it.
this is not true already (perf_event and kprobe/uprobe/tp progs have
bpf_ctx as well) and can easily get out of sync in the future, so I'd
drop the list of types that support bpf_ctx.
> + * For the bpf prog type that has initialized the bpf_ctx,
> + * this function can be used to decide if a kernel function
> + * is called by a bpf program.
> + */
> +static inline bool has_current_bpf_ctx(void)
> +{
> + return !!current->bpf_ctx;
> +}
> #else /* !CONFIG_BPF_SYSCALL */
[...]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v3 bpf-next 07/15] bpf: Initialize the bpf_run_ctx in bpf_iter_run_prog()
2022-08-10 19:08 ` [PATCH v3 bpf-next 07/15] bpf: Initialize the bpf_run_ctx in bpf_iter_run_prog() Martin KaFai Lau
@ 2022-08-16 3:33 ` Andrii Nakryiko
0 siblings, 0 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2022-08-16 3:33 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: bpf, netdev, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, David Miller, Eric Dumazet, Jakub Kicinski,
kernel-team, Paolo Abeni, Stanislav Fomichev
On Wed, Aug 10, 2022 at 12:11 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> The bpf-iter-prog for tcp and unix sk can do bpf_setsockopt()
> which needs has_current_bpf_ctx() to decide if it is called by a
> bpf prog. This patch initializes the bpf_run_ctx in
> bpf_iter_run_prog() for the has_current_bpf_ctx() to use.
>
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> ---
> include/linux/bpf.h | 2 +-
> kernel/bpf/bpf_iter.c | 5 +++++
> 2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 0a600b2013cc..15ab980e9525 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1967,7 +1967,7 @@ static inline bool unprivileged_ebpf_enabled(void)
> }
>
> /* Not all bpf prog type has the bpf_ctx.
> - * Only trampoline and cgroup-bpf have it.
> + * Only trampoline, cgroup-bpf, and iter have it.
Apart from this part which I'd drop, lgtm:
Acked-by: Andrii Nakryiko <andrii@kernel.org>
> * For the bpf prog type that has initialized the bpf_ctx,
> * this function can be used to decide if a kernel function
> * is called by a bpf program.
> diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c
> index 4b112aa8bba3..6476b2c03527 100644
> --- a/kernel/bpf/bpf_iter.c
> +++ b/kernel/bpf/bpf_iter.c
> @@ -685,19 +685,24 @@ struct bpf_prog *bpf_iter_get_info(struct bpf_iter_meta *meta, bool in_stop)
>
> int bpf_iter_run_prog(struct bpf_prog *prog, void *ctx)
> {
> + struct bpf_run_ctx run_ctx, *old_run_ctx;
> int ret;
>
> if (prog->aux->sleepable) {
> rcu_read_lock_trace();
> migrate_disable();
> might_fault();
> + old_run_ctx = bpf_set_run_ctx(&run_ctx);
> ret = bpf_prog_run(prog, ctx);
> + bpf_reset_run_ctx(old_run_ctx);
> migrate_enable();
> rcu_read_unlock_trace();
> } else {
> rcu_read_lock();
> migrate_disable();
> + old_run_ctx = bpf_set_run_ctx(&run_ctx);
> ret = bpf_prog_run(prog, ctx);
> + bpf_reset_run_ctx(old_run_ctx);
> migrate_enable();
> rcu_read_unlock();
> }
> --
> 2.30.2
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt()
2022-08-15 22:04 ` Daniel Borkmann
@ 2022-08-17 5:23 ` Martin KaFai Lau
0 siblings, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2022-08-17 5:23 UTC (permalink / raw)
To: Daniel Borkmann
Cc: sdf, bpf, netdev, Alexei Starovoitov, Andrii Nakryiko,
David Miller, Eric Dumazet, Jakub Kicinski, kernel-team,
Paolo Abeni
On Tue, Aug 16, 2022 at 12:04:52AM +0200, Daniel Borkmann wrote:
> On 8/11/22 7:04 PM, sdf@google.com wrote:
> > On 08/10, Martin KaFai Lau wrote:
> > > The code in bpf_setsockopt() is mostly a copy-and-paste from
> > > the sock_setsockopt(), do_tcp_setsockopt(), do_ipv6_setsockopt(),
> > > and do_ip_setsockopt(). As the allowed optnames in bpf_setsockopt()
> > > grows, so are the duplicated code. The code between the copies
> > > also slowly drifted.
> >
> > > This set is an effort to clean this up and reuse the existing
> > > {sock,do_tcp,do_ipv6,do_ip}_setsockopt() as much as possible.
> >
> > > After the clean up, this set also adds a few allowed optnames
> > > that we need to the bpf_setsockopt().
> >
> > > The initial attempt was to clean up both bpf_setsockopt() and
> > > bpf_getsockopt() together. However, the patch set was getting
> > > too long. It is beneficial to leave the bpf_getsockopt()
> > > out for another patch set. Thus, this set is focusing
> > > on the bpf_setsockopt().
> >
> > > v3:
> > > - s/in_bpf/has_current_bpf_ctx/ (Andrii)
> > > - Add comments to has_current_bpf_ctx() and sockopt_lock_sock()
> > > (Stanislav)
> > > - Use vmlinux.h in selftest and add defines to bpf_tracing_net.h
> > > (Stanislav)
> > > - Use bpf_getsockopt(SO_MARK) in selftest (Stanislav)
> > > - Use BPF_CORE_READ_BITFIELD in selftest (Yonghong)
> >
> > Reviewed-by: Stanislav Fomichev <sdf@google.com>
> >
> > (I didn't go super deep on the selftest)
>
> Looks like that one throws a build error, fwiw:
>
> https://github.com/kernel-patches/bpf/runs/7844497492?check_suite_focus=true
>
> [...]
> CLNG-BPF [test_maps] kfunc_call_test_subprog.o
> CLNG-BPF [test_maps] bpf_iter_test_kern6.o
> progs/setget_sockopt.c:39:33: error: implicit truncation from 'int' to bit-field changes value from 1 to -1 [-Werror,-Wbitfield-constant-conversion]
> { .opt = SO_REUSEADDR, .flip = 1, },
> ^
> progs/setget_sockopt.c:42:33: error: implicit truncation from 'int' to bit-field changes value from 1 to -1 [-Werror,-Wbitfield-constant-conversion]
> { .opt = SO_KEEPALIVE, .flip = 1, },
> ^
> progs/setget_sockopt.c:44:33: error: implicit truncation from 'int' to bit-field changes value from 1 to -1 [-Werror,-Wbitfield-constant-conversion]
> { .opt = SO_REUSEPORT, .flip = 1, },
> ^
> CLNG-BPF [test_maps] btf__core_reloc_type_id.o
> progs/setget_sockopt.c:48:32: error: implicit truncation from 'int' to bit-field changes value from 1 to -1 [-Werror,-Wbitfield-constant-conversion]
> { .opt = SO_TXREHASH, .flip = 1, },
> ^
> progs/setget_sockopt.c:53:32: error: implicit truncation from 'int' to bit-field changes value from 1 to -1 [-Werror,-Wbitfield-constant-conversion]
> { .opt = TCP_NODELAY, .flip = 1, },
> ^
> progs/setget_sockopt.c:61:45: error: implicit truncation from 'int' to bit-field changes value from 1 to -1 [-Werror,-Wbitfield-constant-conversion]
> { .opt = TCP_THIN_LINEAR_TIMEOUTS, .flip = 1, },
> ^
> progs/setget_sockopt.c:75:39: error: implicit truncation from 'int' to bit-field changes value from 1 to -1 [-Werror,-Wbitfield-constant-conversion]
> { .opt = IPV6_AUTOFLOWLABEL, .flip = 1, },
> ^
> 7 errors generated.
> make: *** [Makefile:521: /tmp/runner/work/bpf/bpf/tools/testing/selftests/bpf/setget_sockopt.o] Error 1
> make: *** Waiting for unfinished jobs....
> make: Leaving directory '/tmp/runner/work/bpf/bpf/tools/testing/selftests/bpf'
> Error: Process completed with exit code 2.
Thanks for the report. I also see it after moving from clang 15 to 16.
I will address it in v4.
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2022-08-17 5:24 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-10 19:07 [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() Martin KaFai Lau
2022-08-10 19:07 ` [PATCH v3 bpf-next 01/15] net: Add sk_setsockopt() to take the sk ptr instead of the sock ptr Martin KaFai Lau
2022-08-10 19:07 ` [PATCH v3 bpf-next 02/15] bpf: net: Avoid sk_setsockopt() taking sk lock when called from bpf Martin KaFai Lau
2022-08-16 3:32 ` Andrii Nakryiko
2022-08-10 19:07 ` [PATCH v3 bpf-next 03/15] bpf: net: Consider has_current_bpf_ctx() when testing capable() in sk_setsockopt() Martin KaFai Lau
2022-08-10 19:07 ` [PATCH v3 bpf-next 04/15] bpf: net: Change do_tcp_setsockopt() to use the sockopt's lock_sock() and capable() Martin KaFai Lau
2022-08-10 19:07 ` [PATCH v3 bpf-next 05/15] bpf: net: Change do_ip_setsockopt() " Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 06/15] bpf: net: Change do_ipv6_setsockopt() " Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 07/15] bpf: Initialize the bpf_run_ctx in bpf_iter_run_prog() Martin KaFai Lau
2022-08-16 3:33 ` Andrii Nakryiko
2022-08-10 19:08 ` [PATCH v3 bpf-next 08/15] bpf: Embed kernel CONFIG check into the if statement in bpf_setsockopt Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 09/15] bpf: Change bpf_setsockopt(SOL_SOCKET) to reuse sk_setsockopt() Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 10/15] bpf: Refactor bpf specific tcp optnames to a new function Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 11/15] bpf: Change bpf_setsockopt(SOL_TCP) to reuse do_tcp_setsockopt() Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 12/15] bpf: Change bpf_setsockopt(SOL_IP) to reuse do_ip_setsockopt() Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 13/15] bpf: Change bpf_setsockopt(SOL_IPV6) to reuse do_ipv6_setsockopt() Martin KaFai Lau
2022-08-10 19:08 ` [PATCH v3 bpf-next 14/15] bpf: Add a few optnames to bpf_setsockopt Martin KaFai Lau
2022-08-10 19:09 ` [PATCH v3 bpf-next 15/15] selftests/bpf: bpf_setsockopt tests Martin KaFai Lau
2022-08-11 17:04 ` [PATCH v3 bpf-next 00/15] bpf: net: Remove duplicated code from bpf_setsockopt() sdf
2022-08-15 22:04 ` Daniel Borkmann
2022-08-17 5:23 ` Martin KaFai Lau
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).