* [PATCH bpf-next v3 0/5] Align BPF TCP CCs implementing cong_control() with non-BPF CCs
@ 2022-06-14 10:44 Jörn-Thorben Hinz
2022-06-14 10:44 ` [PATCH bpf-next v3 1/5] bpf: Allow a TCP CC to write sk_pacing_rate and sk_pacing_status Jörn-Thorben Hinz
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Jörn-Thorben Hinz @ 2022-06-14 10:44 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, netdev, Jörn-Thorben Hinz
Addressed feedback by Martin KaFai Lau in this v3 of the series. See the
details below.
This series corrects some inconveniences for a BPF TCP CC that
implements and uses tcp_congestion_ops.cong_control(). Until now, such a
CC did not have all necessary write access to struct sock and
unnecessarily needed to implement cong_avoid().
v3:
- Add a selftest writing sk_pacing_*
- Add a selftest with incomplete tcp_congestion_ops
- Add a selftest with unsupported get_info()
- Remove an unused variable
- Reword a comment about reg() in bpf_struct_ops_map_update_elem()
v2:
- Drop redundant check for required functions and just rely on
tcp_register_congestion_control() (Martin KaFai Lau)
Jörn-Thorben Hinz (5):
bpf: Allow a TCP CC to write sk_pacing_rate and sk_pacing_status
bpf: Require only one of cong_avoid() and cong_control() from a TCP CC
selftests/bpf: Test a BPF CC writing sk_pacing_*
selftests/bpf: Test an incomplete BPF CC
selftests/bpf: Test a BPF CC implementing the unsupported get_info()
kernel/bpf/bpf_struct_ops.c | 7 +-
net/ipv4/bpf_tcp_ca.c | 39 ++---------
.../selftests/bpf/prog_tests/bpf_tcp_ca.c | 66 +++++++++++++++++++
.../bpf/progs/tcp_ca_incompl_cong_ops.c | 35 ++++++++++
.../bpf/progs/tcp_ca_unsupp_cong_op.c | 21 ++++++
.../bpf/progs/tcp_ca_write_sk_pacing.c | 60 +++++++++++++++++
6 files changed, 191 insertions(+), 37 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/tcp_ca_incompl_cong_ops.c
create mode 100644 tools/testing/selftests/bpf/progs/tcp_ca_unsupp_cong_op.c
create mode 100644 tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
--
2.30.2
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH bpf-next v3 1/5] bpf: Allow a TCP CC to write sk_pacing_rate and sk_pacing_status
2022-06-14 10:44 [PATCH bpf-next v3 0/5] Align BPF TCP CCs implementing cong_control() with non-BPF CCs Jörn-Thorben Hinz
@ 2022-06-14 10:44 ` Jörn-Thorben Hinz
2022-06-17 20:21 ` Martin KaFai Lau
2022-06-14 10:44 ` [PATCH bpf-next v3 2/5] bpf: Require only one of cong_avoid() and cong_control() from a TCP CC Jörn-Thorben Hinz
` (3 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Jörn-Thorben Hinz @ 2022-06-14 10:44 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, netdev, Jörn-Thorben Hinz
A CC that implements tcp_congestion_ops.cong_control() should be able to
control sk_pacing_rate and set sk_pacing_status, since
tcp_update_pacing_rate() is never called in this case. A built-in CC or
one from a kernel module is already able to write to both struct sock
members. For a BPF program, write access has not been allowed, yet.
Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
---
net/ipv4/bpf_tcp_ca.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c
index f79ab942f03b..1f5c53ede4e5 100644
--- a/net/ipv4/bpf_tcp_ca.c
+++ b/net/ipv4/bpf_tcp_ca.c
@@ -111,6 +111,12 @@ static int bpf_tcp_ca_btf_struct_access(struct bpf_verifier_log *log,
}
switch (off) {
+ case offsetof(struct sock, sk_pacing_rate):
+ end = offsetofend(struct sock, sk_pacing_rate);
+ break;
+ case offsetof(struct sock, sk_pacing_status):
+ end = offsetofend(struct sock, sk_pacing_status);
+ break;
case bpf_ctx_range(struct inet_connection_sock, icsk_ca_priv):
end = offsetofend(struct inet_connection_sock, icsk_ca_priv);
break;
--
2.30.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH bpf-next v3 2/5] bpf: Require only one of cong_avoid() and cong_control() from a TCP CC
2022-06-14 10:44 [PATCH bpf-next v3 0/5] Align BPF TCP CCs implementing cong_control() with non-BPF CCs Jörn-Thorben Hinz
2022-06-14 10:44 ` [PATCH bpf-next v3 1/5] bpf: Allow a TCP CC to write sk_pacing_rate and sk_pacing_status Jörn-Thorben Hinz
@ 2022-06-14 10:44 ` Jörn-Thorben Hinz
2022-06-17 20:26 ` Martin KaFai Lau
2022-06-14 10:44 ` [PATCH bpf-next v3 3/5] selftests/bpf: Test a BPF CC writing sk_pacing_* Jörn-Thorben Hinz
` (2 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Jörn-Thorben Hinz @ 2022-06-14 10:44 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, netdev, Jörn-Thorben Hinz
Remove the check for required and optional functions in a struct
tcp_congestion_ops from bpf_tcp_ca.c. Rely on
tcp_register_congestion_control() to reject a BPF CC that does not
implement all required functions, as it will do for a non-BPF CC.
When a CC implements tcp_congestion_ops.cong_control(), the alternate
cong_avoid() is not in use in the TCP stack. Previously, a BPF CC was
still forced to implement cong_avoid() as a no-op since it was
non-optional in bpf_tcp_ca.c.
Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
---
kernel/bpf/bpf_struct_ops.c | 7 +++----
net/ipv4/bpf_tcp_ca.c | 33 ---------------------------------
2 files changed, 3 insertions(+), 37 deletions(-)
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index d9a3c9207240..7e0068c3399c 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -503,10 +503,9 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
goto unlock;
}
- /* Error during st_ops->reg(). It is very unlikely since
- * the above init_member() should have caught it earlier
- * before reg(). The only possibility is if there was a race
- * in registering the struct_ops (under the same name) to
+ /* Error during st_ops->reg(). Can happen if this struct_ops needs to be
+ * verified as a whole, after all init_member() calls. Can also happen if
+ * there was a race in registering the struct_ops (under the same name) to
* a sub-system through different struct_ops's maps.
*/
set_memory_nx((long)st_map->image, 1);
diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c
index 1f5c53ede4e5..7a181631b995 100644
--- a/net/ipv4/bpf_tcp_ca.c
+++ b/net/ipv4/bpf_tcp_ca.c
@@ -14,18 +14,6 @@
/* "extern" is to avoid sparse warning. It is only used in bpf_struct_ops.c. */
extern struct bpf_struct_ops bpf_tcp_congestion_ops;
-static u32 optional_ops[] = {
- offsetof(struct tcp_congestion_ops, init),
- offsetof(struct tcp_congestion_ops, release),
- offsetof(struct tcp_congestion_ops, set_state),
- offsetof(struct tcp_congestion_ops, cwnd_event),
- offsetof(struct tcp_congestion_ops, in_ack_event),
- offsetof(struct tcp_congestion_ops, pkts_acked),
- offsetof(struct tcp_congestion_ops, min_tso_segs),
- offsetof(struct tcp_congestion_ops, sndbuf_expand),
- offsetof(struct tcp_congestion_ops, cong_control),
-};
-
static u32 unsupported_ops[] = {
offsetof(struct tcp_congestion_ops, get_info),
};
@@ -51,18 +39,6 @@ static int bpf_tcp_ca_init(struct btf *btf)
return 0;
}
-static bool is_optional(u32 member_offset)
-{
- unsigned int i;
-
- for (i = 0; i < ARRAY_SIZE(optional_ops); i++) {
- if (member_offset == optional_ops[i])
- return true;
- }
-
- return false;
-}
-
static bool is_unsupported(u32 member_offset)
{
unsigned int i;
@@ -246,7 +222,6 @@ static int bpf_tcp_ca_init_member(const struct btf_type *t,
{
const struct tcp_congestion_ops *utcp_ca;
struct tcp_congestion_ops *tcp_ca;
- int prog_fd;
u32 moff;
utcp_ca = (const struct tcp_congestion_ops *)udata;
@@ -268,14 +243,6 @@ static int bpf_tcp_ca_init_member(const struct btf_type *t,
return 1;
}
- if (!btf_type_resolve_func_ptr(btf_vmlinux, member->type, NULL))
- return 0;
-
- /* Ensure bpf_prog is provided for compulsory func ptr */
- prog_fd = (int)(*(unsigned long *)(udata + moff));
- if (!prog_fd && !is_optional(moff) && !is_unsupported(moff))
- return -EINVAL;
-
return 0;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH bpf-next v3 3/5] selftests/bpf: Test a BPF CC writing sk_pacing_*
2022-06-14 10:44 [PATCH bpf-next v3 0/5] Align BPF TCP CCs implementing cong_control() with non-BPF CCs Jörn-Thorben Hinz
2022-06-14 10:44 ` [PATCH bpf-next v3 1/5] bpf: Allow a TCP CC to write sk_pacing_rate and sk_pacing_status Jörn-Thorben Hinz
2022-06-14 10:44 ` [PATCH bpf-next v3 2/5] bpf: Require only one of cong_avoid() and cong_control() from a TCP CC Jörn-Thorben Hinz
@ 2022-06-14 10:44 ` Jörn-Thorben Hinz
2022-06-17 21:04 ` Martin KaFai Lau
2022-06-14 10:44 ` [PATCH bpf-next v3 4/5] selftests/bpf: Test an incomplete BPF CC Jörn-Thorben Hinz
2022-06-14 10:44 ` [PATCH bpf-next v3 5/5] selftests/bpf: Test a BPF CC implementing the unsupported get_info() Jörn-Thorben Hinz
4 siblings, 1 reply; 14+ messages in thread
From: Jörn-Thorben Hinz @ 2022-06-14 10:44 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, netdev, Jörn-Thorben Hinz
Test whether a TCP CC implemented in BPF is allowed to write
sk_pacing_rate and sk_pacing_status in struct sock. This is needed when
cong_control() is implemented and used.
Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
---
.../selftests/bpf/prog_tests/bpf_tcp_ca.c | 21 +++++++
.../bpf/progs/tcp_ca_write_sk_pacing.c | 60 +++++++++++++++++++
2 files changed, 81 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
index e9a9a31b2ffe..a797497e2864 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
@@ -9,6 +9,7 @@
#include "bpf_cubic.skel.h"
#include "bpf_tcp_nogpl.skel.h"
#include "bpf_dctcp_release.skel.h"
+#include "tcp_ca_write_sk_pacing.skel.h"
#ifndef ENOTSUPP
#define ENOTSUPP 524
@@ -322,6 +323,24 @@ static void test_rel_setsockopt(void)
bpf_dctcp_release__destroy(rel_skel);
}
+static void test_write_sk_pacing(void)
+{
+ struct tcp_ca_write_sk_pacing *skel;
+ struct bpf_link *link;
+
+ skel = tcp_ca_write_sk_pacing__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "open_and_load")) {
+ return;
+ }
+
+ link = bpf_map__attach_struct_ops(skel->maps.write_sk_pacing);
+ if (ASSERT_OK_PTR(link, "attach_struct_ops")) {
+ bpf_link__destroy(link);
+ }
+
+ tcp_ca_write_sk_pacing__destroy(skel);
+}
+
void test_bpf_tcp_ca(void)
{
if (test__start_subtest("dctcp"))
@@ -334,4 +353,6 @@ void test_bpf_tcp_ca(void)
test_dctcp_fallback();
if (test__start_subtest("rel_setsockopt"))
test_rel_setsockopt();
+ if (test__start_subtest("write_sk_pacing"))
+ test_write_sk_pacing();
}
diff --git a/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c b/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
new file mode 100644
index 000000000000..43447704cf0e
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+#define USEC_PER_SEC 1000000UL
+
+#define min(a, b) ((a) < (b) ? (a) : (b))
+
+static inline struct tcp_sock *tcp_sk(const struct sock *sk)
+{
+ return (struct tcp_sock *)sk;
+}
+
+SEC("struct_ops/write_sk_pacing_init")
+void BPF_PROG(write_sk_pacing_init, struct sock *sk)
+{
+#ifdef ENABLE_ATOMICS_TESTS
+ __sync_bool_compare_and_swap(&sk->sk_pacing_status, SK_PACING_NONE,
+ SK_PACING_NEEDED);
+#else
+ sk->sk_pacing_status = SK_PACING_NEEDED;
+#endif
+}
+
+SEC("struct_ops/write_sk_pacing_cong_control")
+void BPF_PROG(write_sk_pacing_cong_control, struct sock *sk,
+ const struct rate_sample *rs)
+{
+ const struct tcp_sock *tp = tcp_sk(sk);
+ unsigned long rate =
+ ((tp->snd_cwnd * tp->mss_cache * USEC_PER_SEC) << 3) /
+ (tp->srtt_us ?: 1U << 3);
+ sk->sk_pacing_rate = min(rate, sk->sk_max_pacing_rate);
+}
+
+SEC("struct_ops/write_sk_pacing_ssthresh")
+__u32 BPF_PROG(write_sk_pacing_ssthresh, struct sock *sk)
+{
+ return tcp_sk(sk)->snd_ssthresh;
+}
+
+SEC("struct_ops/write_sk_pacing_undo_cwnd")
+__u32 BPF_PROG(write_sk_pacing_undo_cwnd, struct sock *sk)
+{
+ return tcp_sk(sk)->snd_cwnd;
+}
+
+SEC(".struct_ops")
+struct tcp_congestion_ops write_sk_pacing = {
+ .init = (void *)write_sk_pacing_init,
+ .cong_control = (void *)write_sk_pacing_cong_control,
+ .ssthresh = (void *)write_sk_pacing_ssthresh,
+ .undo_cwnd = (void *)write_sk_pacing_undo_cwnd,
+ .name = "bpf_w_sk_pacing",
+};
--
2.30.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH bpf-next v3 4/5] selftests/bpf: Test an incomplete BPF CC
2022-06-14 10:44 [PATCH bpf-next v3 0/5] Align BPF TCP CCs implementing cong_control() with non-BPF CCs Jörn-Thorben Hinz
` (2 preceding siblings ...)
2022-06-14 10:44 ` [PATCH bpf-next v3 3/5] selftests/bpf: Test a BPF CC writing sk_pacing_* Jörn-Thorben Hinz
@ 2022-06-14 10:44 ` Jörn-Thorben Hinz
2022-06-14 10:44 ` [PATCH bpf-next v3 5/5] selftests/bpf: Test a BPF CC implementing the unsupported get_info() Jörn-Thorben Hinz
4 siblings, 0 replies; 14+ messages in thread
From: Jörn-Thorben Hinz @ 2022-06-14 10:44 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, netdev, Jörn-Thorben Hinz
Test whether a TCP CC implemented in BPF providing neither cong_avoid()
nor cong_control() is correctly rejected. This check solely depends on
tcp_register_congestion_control() now, which is invoked during
bpf_map__attach_struct_ops().
Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
---
.../selftests/bpf/prog_tests/bpf_tcp_ca.c | 25 +++++++++++++
.../bpf/progs/tcp_ca_incompl_cong_ops.c | 35 +++++++++++++++++++
2 files changed, 60 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/tcp_ca_incompl_cong_ops.c
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
index a797497e2864..cc2a8a12ca67 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
@@ -10,6 +10,7 @@
#include "bpf_tcp_nogpl.skel.h"
#include "bpf_dctcp_release.skel.h"
#include "tcp_ca_write_sk_pacing.skel.h"
+#include "tcp_ca_incompl_cong_ops.skel.h"
#ifndef ENOTSUPP
#define ENOTSUPP 524
@@ -341,6 +342,28 @@ static void test_write_sk_pacing(void)
tcp_ca_write_sk_pacing__destroy(skel);
}
+static void test_incompl_cong_ops(void)
+{
+ struct tcp_ca_incompl_cong_ops *skel;
+ struct bpf_link *link;
+
+ skel = tcp_ca_incompl_cong_ops__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "open_and_load")) {
+ return;
+ }
+
+ /* That cong_avoid() and cong_control() are missing is only reported at
+ * this point:
+ */
+ link = bpf_map__attach_struct_ops(skel->maps.incompl_cong_ops);
+ if (!ASSERT_ERR_PTR(link, "attach_struct_ops")) {
+ /* Unexpected success?! */
+ bpf_link__destroy(link);
+ }
+
+ tcp_ca_incompl_cong_ops__destroy(skel);
+}
+
void test_bpf_tcp_ca(void)
{
if (test__start_subtest("dctcp"))
@@ -355,4 +378,6 @@ void test_bpf_tcp_ca(void)
test_rel_setsockopt();
if (test__start_subtest("write_sk_pacing"))
test_write_sk_pacing();
+ if (test__start_subtest("incompl_cong_ops"))
+ test_incompl_cong_ops();
}
diff --git a/tools/testing/selftests/bpf/progs/tcp_ca_incompl_cong_ops.c b/tools/testing/selftests/bpf/progs/tcp_ca_incompl_cong_ops.c
new file mode 100644
index 000000000000..7bb872fb22dd
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/tcp_ca_incompl_cong_ops.c
@@ -0,0 +1,35 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+static inline struct tcp_sock *tcp_sk(const struct sock *sk)
+{
+ return (struct tcp_sock *)sk;
+}
+
+SEC("struct_ops/incompl_cong_ops_ssthresh")
+__u32 BPF_PROG(incompl_cong_ops_ssthresh, struct sock *sk)
+{
+ return tcp_sk(sk)->snd_ssthresh;
+}
+
+SEC("struct_ops/incompl_cong_ops_undo_cwnd")
+__u32 BPF_PROG(incompl_cong_ops_undo_cwnd, struct sock *sk)
+{
+ return tcp_sk(sk)->snd_cwnd;
+}
+
+SEC(".struct_ops")
+struct tcp_congestion_ops incompl_cong_ops = {
+ /* Intentionally leaving out any of the required cong_avoid() and
+ * cong_control() here.
+ */
+ .ssthresh = (void *)incompl_cong_ops_ssthresh,
+ .undo_cwnd = (void *)incompl_cong_ops_undo_cwnd,
+ .name = "bpf_incompl_ops",
+};
--
2.30.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH bpf-next v3 5/5] selftests/bpf: Test a BPF CC implementing the unsupported get_info()
2022-06-14 10:44 [PATCH bpf-next v3 0/5] Align BPF TCP CCs implementing cong_control() with non-BPF CCs Jörn-Thorben Hinz
` (3 preceding siblings ...)
2022-06-14 10:44 ` [PATCH bpf-next v3 4/5] selftests/bpf: Test an incomplete BPF CC Jörn-Thorben Hinz
@ 2022-06-14 10:44 ` Jörn-Thorben Hinz
2022-06-17 21:22 ` Martin KaFai Lau
4 siblings, 1 reply; 14+ messages in thread
From: Jörn-Thorben Hinz @ 2022-06-14 10:44 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, netdev, Jörn-Thorben Hinz
Test whether a TCP CC implemented in BPF providing get_info() is
rejected correctly. get_info() is unsupported in a BPF CC. The check for
required functions in a BPF CC has moved, this test ensures unsupported
functions are still rejected correctly.
Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
---
.../selftests/bpf/prog_tests/bpf_tcp_ca.c | 20 ++++++++++++++++++
.../bpf/progs/tcp_ca_unsupp_cong_op.c | 21 +++++++++++++++++++
2 files changed, 41 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/tcp_ca_unsupp_cong_op.c
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
index cc2a8a12ca67..b15a4b6ea4ae 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
@@ -11,6 +11,7 @@
#include "bpf_dctcp_release.skel.h"
#include "tcp_ca_write_sk_pacing.skel.h"
#include "tcp_ca_incompl_cong_ops.skel.h"
+#include "tcp_ca_unsupp_cong_op.skel.h"
#ifndef ENOTSUPP
#define ENOTSUPP 524
@@ -364,6 +365,23 @@ static void test_incompl_cong_ops(void)
tcp_ca_incompl_cong_ops__destroy(skel);
}
+static void test_unsupp_cong_op(void)
+{
+ libbpf_print_fn_t old_print_fn;
+ struct tcp_ca_unsupp_cong_op *skel;
+
+ err_str = "attach to unsupported member get_info";
+ found = false;
+ old_print_fn = libbpf_set_print(libbpf_debug_print);
+
+ skel = tcp_ca_unsupp_cong_op__open_and_load();
+ ASSERT_NULL(skel, "open_and_load");
+ ASSERT_EQ(found, true, "expected_err_msg");
+
+ tcp_ca_unsupp_cong_op__destroy(skel);
+ libbpf_set_print(old_print_fn);
+}
+
void test_bpf_tcp_ca(void)
{
if (test__start_subtest("dctcp"))
@@ -380,4 +398,6 @@ void test_bpf_tcp_ca(void)
test_write_sk_pacing();
if (test__start_subtest("incompl_cong_ops"))
test_incompl_cong_ops();
+ if (test__start_subtest("unsupp_cong_op"))
+ test_unsupp_cong_op();
}
diff --git a/tools/testing/selftests/bpf/progs/tcp_ca_unsupp_cong_op.c b/tools/testing/selftests/bpf/progs/tcp_ca_unsupp_cong_op.c
new file mode 100644
index 000000000000..c06f4a41c21a
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/tcp_ca_unsupp_cong_op.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+SEC("struct_ops/unsupp_cong_op_get_info")
+size_t BPF_PROG(unsupp_cong_op_get_info, struct sock *sk, u32 ext, int *attr,
+ union tcp_cc_info *info)
+{
+ return 0;
+}
+
+SEC(".struct_ops")
+struct tcp_congestion_ops unsupp_cong_op = {
+ .get_info = (void *)unsupp_cong_op_get_info,
+ .name = "bpf_unsupp_op",
+};
--
2.30.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH bpf-next v3 1/5] bpf: Allow a TCP CC to write sk_pacing_rate and sk_pacing_status
2022-06-14 10:44 ` [PATCH bpf-next v3 1/5] bpf: Allow a TCP CC to write sk_pacing_rate and sk_pacing_status Jörn-Thorben Hinz
@ 2022-06-17 20:21 ` Martin KaFai Lau
0 siblings, 0 replies; 14+ messages in thread
From: Martin KaFai Lau @ 2022-06-17 20:21 UTC (permalink / raw)
To: Jörn-Thorben Hinz
Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, netdev
On Tue, Jun 14, 2022 at 12:44:48PM +0200, Jörn-Thorben Hinz wrote:
> A CC that implements tcp_congestion_ops.cong_control() should be able to
> control sk_pacing_rate and set sk_pacing_status, since
> tcp_update_pacing_rate() is never called in this case. A built-in CC or
> one from a kernel module is already able to write to both struct sock
> members. For a BPF program, write access has not been allowed, yet.
>
> Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
Reviewed-by: Martin KaFai Lau <kafai@fb.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf-next v3 2/5] bpf: Require only one of cong_avoid() and cong_control() from a TCP CC
2022-06-14 10:44 ` [PATCH bpf-next v3 2/5] bpf: Require only one of cong_avoid() and cong_control() from a TCP CC Jörn-Thorben Hinz
@ 2022-06-17 20:26 ` Martin KaFai Lau
0 siblings, 0 replies; 14+ messages in thread
From: Martin KaFai Lau @ 2022-06-17 20:26 UTC (permalink / raw)
To: Jörn-Thorben Hinz
Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, netdev
On Tue, Jun 14, 2022 at 12:44:49PM +0200, Jörn-Thorben Hinz wrote:
> Remove the check for required and optional functions in a struct
> tcp_congestion_ops from bpf_tcp_ca.c. Rely on
> tcp_register_congestion_control() to reject a BPF CC that does not
> implement all required functions, as it will do for a non-BPF CC.
>
> When a CC implements tcp_congestion_ops.cong_control(), the alternate
> cong_avoid() is not in use in the TCP stack. Previously, a BPF CC was
> still forced to implement cong_avoid() as a no-op since it was
> non-optional in bpf_tcp_ca.c.
>
> Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
Reviewed-by: Martin KaFai Lau <kafai@fb.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf-next v3 3/5] selftests/bpf: Test a BPF CC writing sk_pacing_*
2022-06-14 10:44 ` [PATCH bpf-next v3 3/5] selftests/bpf: Test a BPF CC writing sk_pacing_* Jörn-Thorben Hinz
@ 2022-06-17 21:04 ` Martin KaFai Lau
2022-06-18 16:43 ` Jörn-Thorben Hinz
0 siblings, 1 reply; 14+ messages in thread
From: Martin KaFai Lau @ 2022-06-17 21:04 UTC (permalink / raw)
To: Jörn-Thorben Hinz
Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, netdev
On Tue, Jun 14, 2022 at 12:44:50PM +0200, Jörn-Thorben Hinz wrote:
> Test whether a TCP CC implemented in BPF is allowed to write
> sk_pacing_rate and sk_pacing_status in struct sock. This is needed when
> cong_control() is implemented and used.
>
> Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
> ---
> .../selftests/bpf/prog_tests/bpf_tcp_ca.c | 21 +++++++
> .../bpf/progs/tcp_ca_write_sk_pacing.c | 60 +++++++++++++++++++
> 2 files changed, 81 insertions(+)
> create mode 100644 tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
> index e9a9a31b2ffe..a797497e2864 100644
> --- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
> +++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
> @@ -9,6 +9,7 @@
> #include "bpf_cubic.skel.h"
> #include "bpf_tcp_nogpl.skel.h"
> #include "bpf_dctcp_release.skel.h"
> +#include "tcp_ca_write_sk_pacing.skel.h"
>
> #ifndef ENOTSUPP
> #define ENOTSUPP 524
> @@ -322,6 +323,24 @@ static void test_rel_setsockopt(void)
> bpf_dctcp_release__destroy(rel_skel);
> }
>
> +static void test_write_sk_pacing(void)
> +{
> + struct tcp_ca_write_sk_pacing *skel;
> + struct bpf_link *link;
> +
> + skel = tcp_ca_write_sk_pacing__open_and_load();
> + if (!ASSERT_OK_PTR(skel, "open_and_load")) {
nit. Remove this single line '{'.
./scripts/checkpatch.pl has reported that also:
WARNING: braces {} are not necessary for single statement blocks
#43: FILE: tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c:332:
+ if (!ASSERT_OK_PTR(skel, "open_and_load")) {
+ return;
+ }
> + return;
> + }
> +
> + link = bpf_map__attach_struct_ops(skel->maps.write_sk_pacing);
> + if (ASSERT_OK_PTR(link, "attach_struct_ops")) {
Same here.
and no need to check the link before bpf_link__destroy.
bpf_link__destroy can handle error link. Something like:
ASSERT_OK_PTR(link, "attach_struct_ops");
bpf_link__destroy(link);
tcp_ca_write_sk_pacing__destroy(skel);
The earlier examples in test_cubic and test_dctcp were
written before bpf_link__destroy can handle error link.
> + bpf_link__destroy(link);
> + }
> +
> + tcp_ca_write_sk_pacing__destroy(skel);
> +}
> +
> void test_bpf_tcp_ca(void)
> {
> if (test__start_subtest("dctcp"))
> @@ -334,4 +353,6 @@ void test_bpf_tcp_ca(void)
> test_dctcp_fallback();
> if (test__start_subtest("rel_setsockopt"))
> test_rel_setsockopt();
> + if (test__start_subtest("write_sk_pacing"))
> + test_write_sk_pacing();
> }
> diff --git a/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c b/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
> new file mode 100644
> index 000000000000..43447704cf0e
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
> @@ -0,0 +1,60 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include "vmlinux.h"
> +
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +
> +char _license[] SEC("license") = "GPL";
> +
> +#define USEC_PER_SEC 1000000UL
> +
> +#define min(a, b) ((a) < (b) ? (a) : (b))
> +
> +static inline struct tcp_sock *tcp_sk(const struct sock *sk)
> +{
This helper is already available in bpf_tcp_helpers.h.
Is there a reason not to use that one and redefine
it in both patch 3 and 4? The mss_cache and srtt_us can be added
to bpf_tcp_helpers.h. It will need another effort to move
all selftest's bpf-cc to vmlinux.h.
> + return (struct tcp_sock *)sk;
> +}
> +
> +SEC("struct_ops/write_sk_pacing_init")
> +void BPF_PROG(write_sk_pacing_init, struct sock *sk)
> +{
> +#ifdef ENABLE_ATOMICS_TESTS
> + __sync_bool_compare_and_swap(&sk->sk_pacing_status, SK_PACING_NONE,
> + SK_PACING_NEEDED);
> +#else
> + sk->sk_pacing_status = SK_PACING_NEEDED;
> +#endif
> +}
> +
> +SEC("struct_ops/write_sk_pacing_cong_control")
> +void BPF_PROG(write_sk_pacing_cong_control, struct sock *sk,
> + const struct rate_sample *rs)
> +{
> + const struct tcp_sock *tp = tcp_sk(sk);
> + unsigned long rate =
> + ((tp->snd_cwnd * tp->mss_cache * USEC_PER_SEC) << 3) /
> + (tp->srtt_us ?: 1U << 3);
> + sk->sk_pacing_rate = min(rate, sk->sk_max_pacing_rate);
> +}
> +
> +SEC("struct_ops/write_sk_pacing_ssthresh")
> +__u32 BPF_PROG(write_sk_pacing_ssthresh, struct sock *sk)
> +{
> + return tcp_sk(sk)->snd_ssthresh;
> +}
> +
> +SEC("struct_ops/write_sk_pacing_undo_cwnd")
> +__u32 BPF_PROG(write_sk_pacing_undo_cwnd, struct sock *sk)
> +{
> + return tcp_sk(sk)->snd_cwnd;
> +}
> +
> +SEC(".struct_ops")
> +struct tcp_congestion_ops write_sk_pacing = {
> + .init = (void *)write_sk_pacing_init,
> + .cong_control = (void *)write_sk_pacing_cong_control,
> + .ssthresh = (void *)write_sk_pacing_ssthresh,
> + .undo_cwnd = (void *)write_sk_pacing_undo_cwnd,
> + .name = "bpf_w_sk_pacing",
> +};
> --
> 2.30.2
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf-next v3 5/5] selftests/bpf: Test a BPF CC implementing the unsupported get_info()
2022-06-14 10:44 ` [PATCH bpf-next v3 5/5] selftests/bpf: Test a BPF CC implementing the unsupported get_info() Jörn-Thorben Hinz
@ 2022-06-17 21:22 ` Martin KaFai Lau
0 siblings, 0 replies; 14+ messages in thread
From: Martin KaFai Lau @ 2022-06-17 21:22 UTC (permalink / raw)
To: Jörn-Thorben Hinz
Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, netdev
On Tue, Jun 14, 2022 at 12:44:52PM +0200, Jörn-Thorben Hinz wrote:
> Test whether a TCP CC implemented in BPF providing get_info() is
> rejected correctly. get_info() is unsupported in a BPF CC. The check for
> required functions in a BPF CC has moved, this test ensures unsupported
> functions are still rejected correctly.
>
> Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
Reviewed-by: Martin KaFai Lau <kafai@fb.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf-next v3 3/5] selftests/bpf: Test a BPF CC writing sk_pacing_*
2022-06-17 21:04 ` Martin KaFai Lau
@ 2022-06-18 16:43 ` Jörn-Thorben Hinz
2022-06-20 16:06 ` Yonghong Song
0 siblings, 1 reply; 14+ messages in thread
From: Jörn-Thorben Hinz @ 2022-06-18 16:43 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, netdev
On Fri, 2022-06-17 at 14:04 -0700, Martin KaFai Lau wrote:
> On Tue, Jun 14, 2022 at 12:44:50PM +0200, Jörn-Thorben Hinz wrote:
> > Test whether a TCP CC implemented in BPF is allowed to write
> > sk_pacing_rate and sk_pacing_status in struct sock. This is needed
> > when
> > cong_control() is implemented and used.
> >
> > Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
> > ---
> > .../selftests/bpf/prog_tests/bpf_tcp_ca.c | 21 +++++++
> > .../bpf/progs/tcp_ca_write_sk_pacing.c | 60
> > +++++++++++++++++++
> > 2 files changed, 81 insertions(+)
> > create mode 100644
> > tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
> > b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
> > index e9a9a31b2ffe..a797497e2864 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
> > @@ -9,6 +9,7 @@
> > #include "bpf_cubic.skel.h"
> > #include "bpf_tcp_nogpl.skel.h"
> > #include "bpf_dctcp_release.skel.h"
> > +#include "tcp_ca_write_sk_pacing.skel.h"
> >
> > #ifndef ENOTSUPP
> > #define ENOTSUPP 524
> > @@ -322,6 +323,24 @@ static void test_rel_setsockopt(void)
> > bpf_dctcp_release__destroy(rel_skel);
> > }
> >
> > +static void test_write_sk_pacing(void)
> > +{
> > + struct tcp_ca_write_sk_pacing *skel;
> > + struct bpf_link *link;
> > +
> > + skel = tcp_ca_write_sk_pacing__open_and_load();
> > + if (!ASSERT_OK_PTR(skel, "open_and_load")) {
> nit. Remove this single line '{'.
>
> ./scripts/checkpatch.pl has reported that also:
> WARNING: braces {} are not necessary for single statement blocks
> #43: FILE: tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c:332:
> + if (!ASSERT_OK_PTR(skel, "open_and_load")) {
> + return;
> + }
Have to admit I knowingly disregarded that warning as more of a
recommendation. Out of habit and since I personally don’t see any
compelling reason to generally use single-line statements after ifs,
only multiple disadvantages.
But wrong place to argue here, of course. Will bow to the warning.
>
>
> > + return;
> > + }
> > +
> > + link = bpf_map__attach_struct_ops(skel-
> > >maps.write_sk_pacing);
> > + if (ASSERT_OK_PTR(link, "attach_struct_ops")) {
> Same here.
>
> and no need to check the link before bpf_link__destroy.
> bpf_link__destroy can handle error link. Something like:
>
> ASSERT_OK_PTR(link, "attach_struct_ops");
> bpf_link__destroy(link);
> tcp_ca_write_sk_pacing__destroy(skel);
>
> The earlier examples in test_cubic and test_dctcp were
> written before bpf_link__destroy can handle error link.
You are right, I followed the other two test_*() functions there. Good
to know that it behaves similar to (k)free() and others. Will remove
the ifs around bpf_link__destroy().
>
> > + bpf_link__destroy(link);
> > + }
> > +
> > + tcp_ca_write_sk_pacing__destroy(skel);
> > +}
> > +
> > void test_bpf_tcp_ca(void)
> > {
> > if (test__start_subtest("dctcp"))
> > @@ -334,4 +353,6 @@ void test_bpf_tcp_ca(void)
> > test_dctcp_fallback();
> > if (test__start_subtest("rel_setsockopt"))
> > test_rel_setsockopt();
> > + if (test__start_subtest("write_sk_pacing"))
> > + test_write_sk_pacing();
> > }
> > diff --git
> > a/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
> > b/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
> > new file mode 100644
> > index 000000000000..43447704cf0e
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
> > @@ -0,0 +1,60 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include "vmlinux.h"
> > +
> > +#include <bpf/bpf_helpers.h>
> > +#include <bpf/bpf_tracing.h>
> > +
> > +char _license[] SEC("license") = "GPL";
> > +
> > +#define USEC_PER_SEC 1000000UL
> > +
> > +#define min(a, b) ((a) < (b) ? (a) : (b))
> > +
> > +static inline struct tcp_sock *tcp_sk(const struct sock *sk)
> > +{
> This helper is already available in bpf_tcp_helpers.h.
> Is there a reason not to use that one and redefine
> it in both patch 3 and 4? The mss_cache and srtt_us can be added
> to bpf_tcp_helpers.h. It will need another effort to move
> all selftest's bpf-cc to vmlinux.h.
I fully agree it’s not elegant to redefine tcp_sk() twice more.
It was between either using bpf_tcp_helpers.h and adding and
maintaining additional struct members there. Or using the (as I
understood it) more “modern” approach with vmlinux.h and redefining the
trivial tcp_sk(). I chose the later. Didn’t see a reason not to slowly
introduce vmlinux.h into the CA tests.
I had the same dilemma for the algorithm I’m implementing: Reuse
bpf_tcp_helpers.h from the kernel tree and extend it. Or use vmlinux.h
and copy only some of the (mostly trivial) helper functions. Also chose
the later here.
While doing the above, I also considered extracting the type
declarations from bpf_tcp_helpers.h into an, e.g.,
bpf_tcp_types_helper.h, keeping only the functions in
bpf_tcp_helpers.h. bpf_tcp_helpers.h could have been a base helper for
any BPF CA implementation then and used with either vmlinux.h or the
“old-school” includes. Similar to the way bpf_helpers.h is used. But at
that point, a bpf_tcp_types_helper.h could have probably just been
dropped for good and in favor of vmlinux.h. So I didn’t continue with
that.
Do you insist to use bpf_tcp_helpers.h instead of vmlinux.h? Or could
the described split into two headers make sense after all?
(Will wait for your reply here before sending a v4.)
>
> > + return (struct tcp_sock *)sk;
> > +}
> > +
> > +SEC("struct_ops/write_sk_pacing_init")
> > +void BPF_PROG(write_sk_pacing_init, struct sock *sk)
> > +{
> > +#ifdef ENABLE_ATOMICS_TESTS
> > + __sync_bool_compare_and_swap(&sk->sk_pacing_status,
> > SK_PACING_NONE,
> > + SK_PACING_NEEDED);
> > +#else
> > + sk->sk_pacing_status = SK_PACING_NEEDED;
> > +#endif
> > +}
> > +
> > +SEC("struct_ops/write_sk_pacing_cong_control")
> > +void BPF_PROG(write_sk_pacing_cong_control, struct sock *sk,
> > + const struct rate_sample *rs)
> > +{
> > + const struct tcp_sock *tp = tcp_sk(sk);
> > + unsigned long rate =
> > + ((tp->snd_cwnd * tp->mss_cache * USEC_PER_SEC) <<
> > 3) /
> > + (tp->srtt_us ?: 1U << 3);
> > + sk->sk_pacing_rate = min(rate, sk->sk_max_pacing_rate);
> > +}
> > +
> > +SEC("struct_ops/write_sk_pacing_ssthresh")
> > +__u32 BPF_PROG(write_sk_pacing_ssthresh, struct sock *sk)
> > +{
> > + return tcp_sk(sk)->snd_ssthresh;
> > +}
> > +
> > +SEC("struct_ops/write_sk_pacing_undo_cwnd")
> > +__u32 BPF_PROG(write_sk_pacing_undo_cwnd, struct sock *sk)
> > +{
> > + return tcp_sk(sk)->snd_cwnd;
> > +}
> > +
> > +SEC(".struct_ops")
> > +struct tcp_congestion_ops write_sk_pacing = {
> > + .init = (void *)write_sk_pacing_init,
> > + .cong_control = (void *)write_sk_pacing_cong_control,
> > + .ssthresh = (void *)write_sk_pacing_ssthresh,
> > + .undo_cwnd = (void *)write_sk_pacing_undo_cwnd,
> > + .name = "bpf_w_sk_pacing",
> > +};
> > --
> > 2.30.2
> >
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf-next v3 3/5] selftests/bpf: Test a BPF CC writing sk_pacing_*
2022-06-18 16:43 ` Jörn-Thorben Hinz
@ 2022-06-20 16:06 ` Yonghong Song
2022-06-20 18:08 ` Martin KaFai Lau
0 siblings, 1 reply; 14+ messages in thread
From: Yonghong Song @ 2022-06-20 16:06 UTC (permalink / raw)
To: Jörn-Thorben Hinz, Martin KaFai Lau
Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, netdev
On 6/18/22 9:43 AM, Jörn-Thorben Hinz wrote:
> On Fri, 2022-06-17 at 14:04 -0700, Martin KaFai Lau wrote:
>> On Tue, Jun 14, 2022 at 12:44:50PM +0200, Jörn-Thorben Hinz wrote:
>>> Test whether a TCP CC implemented in BPF is allowed to write
>>> sk_pacing_rate and sk_pacing_status in struct sock. This is needed
>>> when
>>> cong_control() is implemented and used.
>>>
>>> Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
>>> ---
>>> .../selftests/bpf/prog_tests/bpf_tcp_ca.c | 21 +++++++
>>> .../bpf/progs/tcp_ca_write_sk_pacing.c | 60
>>> +++++++++++++++++++
>>> 2 files changed, 81 insertions(+)
>>> create mode 100644
>>> tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
>>>
>>> diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
>>> b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
>>> index e9a9a31b2ffe..a797497e2864 100644
>>> --- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
>>> +++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
>>> @@ -9,6 +9,7 @@
>>> #include "bpf_cubic.skel.h"
>>> #include "bpf_tcp_nogpl.skel.h"
>>> #include "bpf_dctcp_release.skel.h"
>>> +#include "tcp_ca_write_sk_pacing.skel.h"
>>>
>>> #ifndef ENOTSUPP
>>> #define ENOTSUPP 524
>>> @@ -322,6 +323,24 @@ static void test_rel_setsockopt(void)
>>> bpf_dctcp_release__destroy(rel_skel);
>>> }
>>>
>>> +static void test_write_sk_pacing(void)
>>> +{
>>> + struct tcp_ca_write_sk_pacing *skel;
>>> + struct bpf_link *link;
>>> +
>>> + skel = tcp_ca_write_sk_pacing__open_and_load();
>>> + if (!ASSERT_OK_PTR(skel, "open_and_load")) {
>> nit. Remove this single line '{'.
>>
>> ./scripts/checkpatch.pl has reported that also:
>> WARNING: braces {} are not necessary for single statement blocks
>> #43: FILE: tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c:332:
>> + if (!ASSERT_OK_PTR(skel, "open_and_load")) {
>> + return;
>> + }
> Have to admit I knowingly disregarded that warning as more of a
> recommendation. Out of habit and since I personally don’t see any
> compelling reason to generally use single-line statements after ifs,
> only multiple disadvantages.
>
> But wrong place to argue here, of course. Will bow to the warning.
>
>>
>>
>>> + return;
>>> + }
>>> +
>>> + link = bpf_map__attach_struct_ops(skel-
>>>> maps.write_sk_pacing);
>>> + if (ASSERT_OK_PTR(link, "attach_struct_ops")) {
>> Same here.
>>
>> and no need to check the link before bpf_link__destroy.
>> bpf_link__destroy can handle error link. Something like:
>>
>> ASSERT_OK_PTR(link, "attach_struct_ops");
>> bpf_link__destroy(link);
>> tcp_ca_write_sk_pacing__destroy(skel);
>>
>> The earlier examples in test_cubic and test_dctcp were
>> written before bpf_link__destroy can handle error link.
> You are right, I followed the other two test_*() functions there. Good
> to know that it behaves similar to (k)free() and others. Will remove
> the ifs around bpf_link__destroy().
>
>>
>>> + bpf_link__destroy(link);
>>> + }
>>> +
>>> + tcp_ca_write_sk_pacing__destroy(skel);
>>> +}
>>> +
>>> void test_bpf_tcp_ca(void)
>>> {
>>> if (test__start_subtest("dctcp"))
>>> @@ -334,4 +353,6 @@ void test_bpf_tcp_ca(void)
>>> test_dctcp_fallback();
>>> if (test__start_subtest("rel_setsockopt"))
>>> test_rel_setsockopt();
>>> + if (test__start_subtest("write_sk_pacing"))
>>> + test_write_sk_pacing();
>>> }
>>> diff --git
>>> a/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
>>> b/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
>>> new file mode 100644
>>> index 000000000000..43447704cf0e
>>> --- /dev/null
>>> +++ b/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
>>> @@ -0,0 +1,60 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +
>>> +#include "vmlinux.h"
>>> +
>>> +#include <bpf/bpf_helpers.h>
>>> +#include <bpf/bpf_tracing.h>
>>> +
>>> +char _license[] SEC("license") = "GPL";
>>> +
>>> +#define USEC_PER_SEC 1000000UL
>>> +
>>> +#define min(a, b) ((a) < (b) ? (a) : (b))
>>> +
>>> +static inline struct tcp_sock *tcp_sk(const struct sock *sk)
>>> +{
>> This helper is already available in bpf_tcp_helpers.h.
>> Is there a reason not to use that one and redefine
>> it in both patch 3 and 4? The mss_cache and srtt_us can be added
>> to bpf_tcp_helpers.h. It will need another effort to move
>> all selftest's bpf-cc to vmlinux.h.
> I fully agree it’s not elegant to redefine tcp_sk() twice more.
>
> It was between either using bpf_tcp_helpers.h and adding and
> maintaining additional struct members there. Or using the (as I
> understood it) more “modern” approach with vmlinux.h and redefining the
> trivial tcp_sk(). I chose the later. Didn’t see a reason not to slowly
> introduce vmlinux.h into the CA tests.
>
> I had the same dilemma for the algorithm I’m implementing: Reuse
> bpf_tcp_helpers.h from the kernel tree and extend it. Or use vmlinux.h
> and copy only some of the (mostly trivial) helper functions. Also chose
> the later here.
>
> While doing the above, I also considered extracting the type
> declarations from bpf_tcp_helpers.h into an, e.g.,
> bpf_tcp_types_helper.h, keeping only the functions in
> bpf_tcp_helpers.h. bpf_tcp_helpers.h could have been a base helper for
> any BPF CA implementation then and used with either vmlinux.h or the
> “old-school” includes. Similar to the way bpf_helpers.h is used. But at
> that point, a bpf_tcp_types_helper.h could have probably just been
> dropped for good and in favor of vmlinux.h. So I didn’t continue with
> that.
>
> Do you insist to use bpf_tcp_helpers.h instead of vmlinux.h? Or could
> the described split into two headers make sense after all?
I prefer to use vmlinux.h. Eventually we would like to use vmlinux.h
for progs which include bpf_tcp_healpers.h. Basically remove the struct
definitions in bpf_tcp_helpers.h and replacing "bpf.h, stddef.h, tcp.h
..." with vmlinux.h. We may not be there yet, but that is the goal.
>
> (Will wait for your reply here before sending a v4.)
>
>>
>>> + return (struct tcp_sock *)sk;
>>> +}
>>> +
>>> +SEC("struct_ops/write_sk_pacing_init")
>>> +void BPF_PROG(write_sk_pacing_init, struct sock *sk)
>>> +{
>>> +#ifdef ENABLE_ATOMICS_TESTS
>>> + __sync_bool_compare_and_swap(&sk->sk_pacing_status,
>>> SK_PACING_NONE,
>>> + SK_PACING_NEEDED);
>>> +#else
>>> + sk->sk_pacing_status = SK_PACING_NEEDED;
>>> +#endif
>>> +}
>>> +
>>> +SEC("struct_ops/write_sk_pacing_cong_control")
>>> +void BPF_PROG(write_sk_pacing_cong_control, struct sock *sk,
>>> + const struct rate_sample *rs)
>>> +{
>>> + const struct tcp_sock *tp = tcp_sk(sk);
>>> + unsigned long rate =
>>> + ((tp->snd_cwnd * tp->mss_cache * USEC_PER_SEC) <<
>>> 3) /
>>> + (tp->srtt_us ?: 1U << 3);
>>> + sk->sk_pacing_rate = min(rate, sk->sk_max_pacing_rate);
>>> +}
>>> +
>>> +SEC("struct_ops/write_sk_pacing_ssthresh")
>>> +__u32 BPF_PROG(write_sk_pacing_ssthresh, struct sock *sk)
>>> +{
>>> + return tcp_sk(sk)->snd_ssthresh;
>>> +}
>>> +
>>> +SEC("struct_ops/write_sk_pacing_undo_cwnd")
>>> +__u32 BPF_PROG(write_sk_pacing_undo_cwnd, struct sock *sk)
>>> +{
>>> + return tcp_sk(sk)->snd_cwnd;
>>> +}
>>> +
>>> +SEC(".struct_ops")
>>> +struct tcp_congestion_ops write_sk_pacing = {
>>> + .init = (void *)write_sk_pacing_init,
>>> + .cong_control = (void *)write_sk_pacing_cong_control,
>>> + .ssthresh = (void *)write_sk_pacing_ssthresh,
>>> + .undo_cwnd = (void *)write_sk_pacing_undo_cwnd,
>>> + .name = "bpf_w_sk_pacing",
>>> +};
>>> --
>>> 2.30.2
>>>
>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf-next v3 3/5] selftests/bpf: Test a BPF CC writing sk_pacing_*
2022-06-20 16:06 ` Yonghong Song
@ 2022-06-20 18:08 ` Martin KaFai Lau
2022-06-22 18:48 ` Jörn-Thorben Hinz
0 siblings, 1 reply; 14+ messages in thread
From: Martin KaFai Lau @ 2022-06-20 18:08 UTC (permalink / raw)
To: Jörn-Thorben Hinz, Yonghong Song
Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, netdev
On Mon, Jun 20, 2022 at 09:06:13AM -0700, Yonghong Song wrote:
[ ... ]
> > > > a/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
> > > > b/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
> > > > new file mode 100644
> > > > index 000000000000..43447704cf0e
> > > > --- /dev/null
> > > > +++ b/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
> > > > @@ -0,0 +1,60 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +
> > > > +#include "vmlinux.h"
> > > > +
> > > > +#include <bpf/bpf_helpers.h>
> > > > +#include <bpf/bpf_tracing.h>
> > > > +
> > > > +char _license[] SEC("license") = "GPL";
> > > > +
> > > > +#define USEC_PER_SEC 1000000UL
> > > > +
> > > > +#define min(a, b) ((a) < (b) ? (a) : (b))
> > > > +
> > > > +static inline struct tcp_sock *tcp_sk(const struct sock *sk)
> > > > +{
> > > This helper is already available in bpf_tcp_helpers.h.
> > > Is there a reason not to use that one and redefine
> > > it in both patch 3 and 4? The mss_cache and srtt_us can be added
> > > to bpf_tcp_helpers.h. It will need another effort to move
> > > all selftest's bpf-cc to vmlinux.h.
> > I fully agree it’s not elegant to redefine tcp_sk() twice more.
> >
> > It was between either using bpf_tcp_helpers.h and adding and
> > maintaining additional struct members there. Or using the (as I
> > understood it) more “modern” approach with vmlinux.h and redefining the
> > trivial tcp_sk(). I chose the later. Didn’t see a reason not to slowly
> > introduce vmlinux.h into the CA tests.
> >
> > I had the same dilemma for the algorithm I’m implementing: Reuse
> > bpf_tcp_helpers.h from the kernel tree and extend it. Or use vmlinux.h
> > and copy only some of the (mostly trivial) helper functions. Also chose
> > the later here.
> >
> > While doing the above, I also considered extracting the type
> > declarations from bpf_tcp_helpers.h into an, e.g.,
> > bpf_tcp_types_helper.h, keeping only the functions in
> > bpf_tcp_helpers.h. bpf_tcp_helpers.h could have been a base helper for
> > any BPF CA implementation then and used with either vmlinux.h or the
> > “old-school” includes. Similar to the way bpf_helpers.h is used. But at
> > that point, a bpf_tcp_types_helper.h could have probably just been
> > dropped for good and in favor of vmlinux.h. So I didn’t continue with
> > that.
I think a trimmed down bpf_tcp_helpers.h + vmlinux.h is good. Basically
what Yonghong has suggested. Not sure what you meant by 'old-school' includes.
I don't think it needs a new bpf_tcp_types_helper.h also.
I think it makes sense to remove everything from bpf_tcp_helpers.h
that is already available from vmlinux.h. bpf_tcp_helpers.h
should only have some macros and helpers left. Then move
bpf_dctcp.c, bpf_cubic.c, and a few others under progs/ to
use vmlinux.h. I haven't tried but it should be doable
from a quick look at bpf_cubic.c and bpf_dctcp.c.
I agree it is better to directly use the struct tcp_sock, inet_connection_sock,
inet_sock... from the vmlinux.h. However, bpf_tcp_helpers.h does not
only have the struct and enum defined in the kernel. It has some
helpers and macros (e.g. TCP_CONG_NEEDS_ECN, TCP_ECN_*) that are missing
from vmlinux.h. These are actually used by the realistic bpf-tcp-cc like
bpf_cubic.c and bpf_dctcp.c.
The simple test in this patch is not a fully implemented bpf-tcp-cc and
it only needs to duplicate the tcp_sk() helper which looks ok at the beginning.
Later, this simple test will be copied-and-pasted to create another test.
These new tests may then need to duplicate more helpers and macros.
It was what I meant it needs separate patches to migrate all bpf-tcp-cc
tests to vmlinux.h. Otherwise, when they are migrated to vmlinux.h later,
we have another pattern of tests that can be cleared up to remove
these helpers/macros duplication.
I don't mind to keep a duplicate tcp_sk() in this set for now
if you can do a follow up to move all bpf-tcp-cc tests
to this path (vmlinux.h + a trimmed down bpf_tcp_helpers.h) and then
remove the tcp_sk() duplication here. This will be very useful.
> >
> > Do you insist to use bpf_tcp_helpers.h instead of vmlinux.h? Or could
> > the described split into two headers make sense after all?
>
> I prefer to use vmlinux.h. Eventually we would like to use vmlinux.h
> for progs which include bpf_tcp_healpers.h. Basically remove the struct
> definitions in bpf_tcp_helpers.h and replacing "bpf.h, stddef.h, tcp.h ..."
> with vmlinux.h. We may not be there yet, but that is the goal.
>
> >
> > (Will wait for your reply here before sending a v4.)
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf-next v3 3/5] selftests/bpf: Test a BPF CC writing sk_pacing_*
2022-06-20 18:08 ` Martin KaFai Lau
@ 2022-06-22 18:48 ` Jörn-Thorben Hinz
0 siblings, 0 replies; 14+ messages in thread
From: Jörn-Thorben Hinz @ 2022-06-22 18:48 UTC (permalink / raw)
To: Martin KaFai Lau, Yonghong Song
Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, netdev
On Mon, 2022-06-20 at 11:08 -0700, Martin KaFai Lau wrote:
> On Mon, Jun 20, 2022 at 09:06:13AM -0700, Yonghong Song wrote:
> [ ... ]
> > > > > a/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
> > > > > b/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
> > > > > new file mode 100644
> > > > > index 000000000000..43447704cf0e
> > > > > --- /dev/null
> > > > > +++
> > > > > b/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
> > > > > @@ -0,0 +1,60 @@
> > > > > +// SPDX-License-Identifier: GPL-2.0
> > > > > +
> > > > > +#include "vmlinux.h"
> > > > > +
> > > > > +#include <bpf/bpf_helpers.h>
> > > > > +#include <bpf/bpf_tracing.h>
> > > > > +
> > > > > +char _license[] SEC("license") = "GPL";
> > > > > +
> > > > > +#define USEC_PER_SEC 1000000UL
> > > > > +
> > > > > +#define min(a, b) ((a) < (b) ? (a) : (b))
> > > > > +
> > > > > +static inline struct tcp_sock *tcp_sk(const struct sock *sk)
> > > > > +{
> > > > This helper is already available in bpf_tcp_helpers.h.
> > > > Is there a reason not to use that one and redefine
> > > > it in both patch 3 and 4? The mss_cache and srtt_us can be
> > > > added
> > > > to bpf_tcp_helpers.h. It will need another effort to move
> > > > all selftest's bpf-cc to vmlinux.h.
> > > I fully agree it’s not elegant to redefine tcp_sk() twice more.
> > >
> > > It was between either using bpf_tcp_helpers.h and adding and
> > > maintaining additional struct members there. Or using the (as I
> > > understood it) more “modern” approach with vmlinux.h and
> > > redefining the
> > > trivial tcp_sk(). I chose the later. Didn’t see a reason not to
> > > slowly
> > > introduce vmlinux.h into the CA tests.
> > >
> > > I had the same dilemma for the algorithm I’m implementing: Reuse
> > > bpf_tcp_helpers.h from the kernel tree and extend it. Or use
> > > vmlinux.h
> > > and copy only some of the (mostly trivial) helper functions. Also
> > > chose
> > > the later here.
> > >
> > > While doing the above, I also considered extracting the type
> > > declarations from bpf_tcp_helpers.h into an, e.g.,
> > > bpf_tcp_types_helper.h, keeping only the functions in
> > > bpf_tcp_helpers.h. bpf_tcp_helpers.h could have been a base
> > > helper for
> > > any BPF CA implementation then and used with either vmlinux.h or
> > > the
> > > “old-school” includes. Similar to the way bpf_helpers.h is used.
> > > But at
> > > that point, a bpf_tcp_types_helper.h could have probably just
> > > been
> > > dropped for good and in favor of vmlinux.h. So I didn’t continue
> > > with
> > > that.
> I think a trimmed down bpf_tcp_helpers.h + vmlinux.h is good.
> Basically
> what Yonghong has suggested. Not sure what you meant by 'old-school'
> includes.
That was badly worded by me. I was referring to linux/types.h and co.
> I don't think it needs a new bpf_tcp_types_helper.h also.
I thought that might be helpful as a first step towards using
vmlinux.h, without fully migrating all the users of bpf_tcp_helpers.h.
But that’s probably unnecessary.
>
> I think it makes sense to remove everything from bpf_tcp_helpers.h
> that is already available from vmlinux.h. bpf_tcp_helpers.h
> should only have some macros and helpers left. Then move
> bpf_dctcp.c, bpf_cubic.c, and a few others under progs/ to
> use vmlinux.h. I haven't tried but it should be doable
> from a quick look at bpf_cubic.c and bpf_dctcp.c.
>
> I agree it is better to directly use the struct tcp_sock,
> inet_connection_sock,
> inet_sock... from the vmlinux.h. However, bpf_tcp_helpers.h does not
> only have the struct and enum defined in the kernel. It has some
> helpers and macros (e.g. TCP_CONG_NEEDS_ECN, TCP_ECN_*) that are
> missing
> from vmlinux.h.
It’s unfortunate that especially those many, tiny non-macro helpers
like tcp_sk() and e.g. tcp_stamp_us_delta() still have to be redefined.
Could it make sense to provide those, in the same way tcp_slow_start(),
tcp_cong_avoid_ai(), and tcp_reno_*() are “exported” as kfuncs by
bpf_tcp_ca.c (if I read that correctly)? Even though the list of these
functions could grow quickly.
> These are actually used by the realistic bpf-tcp-cc like
> bpf_cubic.c and bpf_dctcp.c.
>
> The simple test in this patch is not a fully implemented bpf-tcp-cc
> and
> it only needs to duplicate the tcp_sk() helper which looks ok at the
> beginning.
> Later, this simple test will be copied-and-pasted to create another
> test.
> These new tests may then need to duplicate more helpers and macros.
> It was what I meant it needs separate patches to migrate all bpf-tcp-
> cc
> tests to vmlinux.h. Otherwise, when they are migrated to vmlinux.h
> later,
> we have another pattern of tests that can be cleared up to remove
> these helpers/macros duplication.
I agree with you there and also don’t favor copying stuff around. Only
did it here, since something had to be copied: either a field into one
of the structs in bpf_tcp_helpers or tcp_sk.
>
> I don't mind to keep a duplicate tcp_sk() in this set for now
> if you can do a follow up to move all bpf-tcp-cc tests
> to this path (vmlinux.h + a trimmed down bpf_tcp_helpers.h) and then
> remove the tcp_sk() duplication here. This will be very useful.
Took a look at this over the last few days, see [1]. Happy about
feedback.
[1]
https://lore.kernel.org/bpf/20220622181015.892445-1-jthinz@mailbox.tu-berlin.de/
>
> > >
> > > Do you insist to use bpf_tcp_helpers.h instead of vmlinux.h? Or
> > > could
> > > the described split into two headers make sense after all?
> >
> > I prefer to use vmlinux.h. Eventually we would like to use
> > vmlinux.h
> > for progs which include bpf_tcp_healpers.h. Basically remove the
> > struct
> > definitions in bpf_tcp_helpers.h and replacing "bpf.h, stddef.h,
> > tcp.h ..."
> > with vmlinux.h. We may not be there yet, but that is the goal.
> >
> > >
> > > (Will wait for your reply here before sending a v4.)
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2022-06-22 18:48 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-14 10:44 [PATCH bpf-next v3 0/5] Align BPF TCP CCs implementing cong_control() with non-BPF CCs Jörn-Thorben Hinz
2022-06-14 10:44 ` [PATCH bpf-next v3 1/5] bpf: Allow a TCP CC to write sk_pacing_rate and sk_pacing_status Jörn-Thorben Hinz
2022-06-17 20:21 ` Martin KaFai Lau
2022-06-14 10:44 ` [PATCH bpf-next v3 2/5] bpf: Require only one of cong_avoid() and cong_control() from a TCP CC Jörn-Thorben Hinz
2022-06-17 20:26 ` Martin KaFai Lau
2022-06-14 10:44 ` [PATCH bpf-next v3 3/5] selftests/bpf: Test a BPF CC writing sk_pacing_* Jörn-Thorben Hinz
2022-06-17 21:04 ` Martin KaFai Lau
2022-06-18 16:43 ` Jörn-Thorben Hinz
2022-06-20 16:06 ` Yonghong Song
2022-06-20 18:08 ` Martin KaFai Lau
2022-06-22 18:48 ` Jörn-Thorben Hinz
2022-06-14 10:44 ` [PATCH bpf-next v3 4/5] selftests/bpf: Test an incomplete BPF CC Jörn-Thorben Hinz
2022-06-14 10:44 ` [PATCH bpf-next v3 5/5] selftests/bpf: Test a BPF CC implementing the unsupported get_info() Jörn-Thorben Hinz
2022-06-17 21:22 ` Martin KaFai Lau
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).