* [PATCH bpf-next 0/2] Split bpf_sock dst_port field
@ 2022-01-27 17:24 Jakub Sitnicki
2022-01-27 17:24 ` [PATCH bpf-next 1/2] bpf: Make dst_port field in struct bpf_sock 16-bit wide Jakub Sitnicki
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Jakub Sitnicki @ 2022-01-27 17:24 UTC (permalink / raw)
To: bpf
Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Menglong Dong, Martin KaFai Lau
This is a follow-up to discussion around the idea of making dst_port in struct
bpf_sock a 16-bit field that happened in [1]. I have fleshed it out further:
v1:
- keep dst_field offset unchanged to prevent existing BPF program breakage
(Martin)
- allow 8-bit loads from dst_port[0] and [1]
- add test coverage for the verifier and the context access converter
[1] https://lore.kernel.org/bpf/87sftbobys.fsf@cloudflare.com/
Jakub Sitnicki (2):
bpf: Make dst_port field in struct bpf_sock 16-bit wide
selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads
include/uapi/linux/bpf.h | 3 +-
net/core/filter.c | 9 ++-
tools/include/uapi/linux/bpf.h | 3 +-
.../selftests/bpf/prog_tests/sock_fields.c | 58 +++++++++----
.../selftests/bpf/progs/test_sock_fields.c | 41 ++++++++++
tools/testing/selftests/bpf/verifier/sock.c | 81 ++++++++++++++++++-
6 files changed, 172 insertions(+), 23 deletions(-)
--
2.31.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH bpf-next 1/2] bpf: Make dst_port field in struct bpf_sock 16-bit wide
2022-01-27 17:24 [PATCH bpf-next 0/2] Split bpf_sock dst_port field Jakub Sitnicki
@ 2022-01-27 17:24 ` Jakub Sitnicki
2022-01-28 18:17 ` Alexei Starovoitov
2022-01-27 17:24 ` [PATCH bpf-next 2/2] selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads Jakub Sitnicki
2022-01-28 6:31 ` [PATCH bpf-next 0/2] Split bpf_sock dst_port field Martin KaFai Lau
2 siblings, 1 reply; 6+ messages in thread
From: Jakub Sitnicki @ 2022-01-27 17:24 UTC (permalink / raw)
To: bpf
Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Menglong Dong, Martin KaFai Lau
Menglong Dong reports that the documentation for the dst_port field in
struct bpf_sock is inaccurate and confusing. From the BPF program PoV, the
field is a zero-padded 16-bit integer in network byte order. The value
appears to the BPF user as if laid out in memory as so:
offsetof(struct bpf_sock, dst_port) + 0 <port MSB>
+ 8 <port LSB>
+16 0x00
+24 0x00
32-, 16-, and 8-bit wide loads from the field are all allowed, but only if
the offset into the field is 0.
32-bit wide loads from dst_port are especially confusing. The loaded value,
after converting to host byte order with bpf_ntohl(dst_port), contains the
port number in the upper 16-bits.
Remove the confusion by splitting the field into two 16-bit fields. For
backward compatibility, allow 32-bit wide loads from offsetof(struct
bpf_sock, dst_port).
While at it, allow loads 8-bit loads at offset [0] and [1] from dst_port.
Reported-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
include/uapi/linux/bpf.h | 3 ++-
net/core/filter.c | 9 ++++++++-
2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4a2f7041ebae..027e84b18b51 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5574,7 +5574,8 @@ struct bpf_sock {
__u32 src_ip4;
__u32 src_ip6[4];
__u32 src_port; /* host byte order */
- __u32 dst_port; /* network byte order */
+ __be16 dst_port; /* network byte order */
+ __u16 zero_padding;
__u32 dst_ip4;
__u32 dst_ip6[4];
__u32 state;
diff --git a/net/core/filter.c b/net/core/filter.c
index a06931c27eeb..ef8473163696 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -8265,6 +8265,7 @@ bool bpf_sock_is_valid_access(int off, int size, enum bpf_access_type type,
struct bpf_insn_access_aux *info)
{
const int size_default = sizeof(__u32);
+ int field_size;
if (off < 0 || off >= sizeof(struct bpf_sock))
return false;
@@ -8276,7 +8277,6 @@ bool bpf_sock_is_valid_access(int off, int size, enum bpf_access_type type,
case offsetof(struct bpf_sock, family):
case offsetof(struct bpf_sock, type):
case offsetof(struct bpf_sock, protocol):
- case offsetof(struct bpf_sock, dst_port):
case offsetof(struct bpf_sock, src_port):
case offsetof(struct bpf_sock, rx_queue_mapping):
case bpf_ctx_range(struct bpf_sock, src_ip4):
@@ -8285,6 +8285,13 @@ bool bpf_sock_is_valid_access(int off, int size, enum bpf_access_type type,
case bpf_ctx_range_till(struct bpf_sock, dst_ip6[0], dst_ip6[3]):
bpf_ctx_record_field_size(info, size_default);
return bpf_ctx_narrow_access_ok(off, size, size_default);
+ case bpf_ctx_range(struct bpf_sock, dst_port):
+ field_size = size == size_default ?
+ size_default : sizeof_field(struct bpf_sock, dst_port);
+ bpf_ctx_record_field_size(info, field_size);
+ return bpf_ctx_narrow_access_ok(off, size, field_size);
+ case bpf_ctx_range(struct bpf_sock, zero_padding):
+ return false;
}
return size == size_default;
--
2.31.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH bpf-next 2/2] selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads
2022-01-27 17:24 [PATCH bpf-next 0/2] Split bpf_sock dst_port field Jakub Sitnicki
2022-01-27 17:24 ` [PATCH bpf-next 1/2] bpf: Make dst_port field in struct bpf_sock 16-bit wide Jakub Sitnicki
@ 2022-01-27 17:24 ` Jakub Sitnicki
2022-01-28 6:31 ` [PATCH bpf-next 0/2] Split bpf_sock dst_port field Martin KaFai Lau
2 siblings, 0 replies; 6+ messages in thread
From: Jakub Sitnicki @ 2022-01-27 17:24 UTC (permalink / raw)
To: bpf
Cc: netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Menglong Dong, Martin KaFai Lau
Add coverage to the verifier tests and tests for reading bpf_sock fields to
ensure that 32-bit, 16-bit, and 8-bit loads from dst_port field are allowed
only at intended offsets and produce expected values.
While 16-bit and 8-bit access to dst_port field is straight-forward, 32-bit
wide loads need be allowed and produce a zero-padded 16-bit value for
backward compatibility.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
tools/include/uapi/linux/bpf.h | 3 +-
.../selftests/bpf/prog_tests/sock_fields.c | 58 +++++++++----
.../selftests/bpf/progs/test_sock_fields.c | 41 ++++++++++
tools/testing/selftests/bpf/verifier/sock.c | 81 ++++++++++++++++++-
4 files changed, 162 insertions(+), 21 deletions(-)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 4a2f7041ebae..027e84b18b51 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5574,7 +5574,8 @@ struct bpf_sock {
__u32 src_ip4;
__u32 src_ip6[4];
__u32 src_port; /* host byte order */
- __u32 dst_port; /* network byte order */
+ __be16 dst_port; /* network byte order */
+ __u16 zero_padding;
__u32 dst_ip4;
__u32 dst_ip6[4];
__u32 state;
diff --git a/tools/testing/selftests/bpf/prog_tests/sock_fields.c b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
index 9fc040eaa482..9d211b5c22c4 100644
--- a/tools/testing/selftests/bpf/prog_tests/sock_fields.c
+++ b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
@@ -1,9 +1,11 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2019 Facebook */
+#define _GNU_SOURCE
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
+#include <sched.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
@@ -20,6 +22,7 @@
enum bpf_linum_array_idx {
EGRESS_LINUM_IDX,
INGRESS_LINUM_IDX,
+ READ_SK_DST_PORT_LINUM_IDX,
__NR_BPF_LINUM_ARRAY_IDX,
};
@@ -42,8 +45,16 @@ static __u64 child_cg_id;
static int linum_map_fd;
static __u32 duration;
-static __u32 egress_linum_idx = EGRESS_LINUM_IDX;
-static __u32 ingress_linum_idx = INGRESS_LINUM_IDX;
+static bool create_netns(void)
+{
+ if (!ASSERT_OK(unshare(CLONE_NEWNET), "create netns"))
+ return false;
+
+ if (!ASSERT_OK(system("ip link set dev lo up"), "bring up lo"))
+ return false;
+
+ return true;
+}
static void print_sk(const struct bpf_sock *sk, const char *prefix)
{
@@ -91,19 +102,24 @@ static void check_result(void)
{
struct bpf_tcp_sock srv_tp, cli_tp, listen_tp;
struct bpf_sock srv_sk, cli_sk, listen_sk;
- __u32 ingress_linum, egress_linum;
+ __u32 idx, ingress_linum, egress_linum, linum;
int err;
- err = bpf_map_lookup_elem(linum_map_fd, &egress_linum_idx,
- &egress_linum);
+ idx = EGRESS_LINUM_IDX;
+ err = bpf_map_lookup_elem(linum_map_fd, &idx, &egress_linum);
CHECK(err < 0, "bpf_map_lookup_elem(linum_map_fd)",
"err:%d errno:%d\n", err, errno);
- err = bpf_map_lookup_elem(linum_map_fd, &ingress_linum_idx,
- &ingress_linum);
+ idx = INGRESS_LINUM_IDX;
+ err = bpf_map_lookup_elem(linum_map_fd, &idx, &ingress_linum);
CHECK(err < 0, "bpf_map_lookup_elem(linum_map_fd)",
"err:%d errno:%d\n", err, errno);
+ idx = READ_SK_DST_PORT_LINUM_IDX;
+ err = bpf_map_lookup_elem(linum_map_fd, &idx, &linum);
+ ASSERT_OK(err, "bpf_map_lookup_elem(linum_map_fd, READ_SK_DST_PORT_IDX)");
+ ASSERT_EQ(linum, 0, "failure in read_sk_dst_port on line");
+
memcpy(&srv_sk, &skel->bss->srv_sk, sizeof(srv_sk));
memcpy(&srv_tp, &skel->bss->srv_tp, sizeof(srv_tp));
memcpy(&cli_sk, &skel->bss->cli_sk, sizeof(cli_sk));
@@ -262,7 +278,7 @@ static void test(void)
char buf[DATA_LEN];
/* Prepare listen_fd */
- listen_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0);
+ listen_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0xcafe, 0);
/* start_server() has logged the error details */
if (CHECK_FAIL(listen_fd == -1))
goto done;
@@ -330,8 +346,12 @@ static void test(void)
void serial_test_sock_fields(void)
{
- struct bpf_link *egress_link = NULL, *ingress_link = NULL;
int parent_cg_fd = -1, child_cg_fd = -1;
+ struct bpf_link *link;
+
+ /* Use a dedicated netns to have a fixed listen port */
+ if (!create_netns())
+ return;
/* Create a cgroup, get fd, and join it */
parent_cg_fd = test__join_cgroup(PARENT_CGROUP);
@@ -352,15 +372,20 @@ void serial_test_sock_fields(void)
if (CHECK(!skel, "test_sock_fields__open_and_load", "failed\n"))
goto done;
- egress_link = bpf_program__attach_cgroup(skel->progs.egress_read_sock_fields,
- child_cg_fd);
- if (!ASSERT_OK_PTR(egress_link, "attach_cgroup(egress)"))
+ link = bpf_program__attach_cgroup(skel->progs.egress_read_sock_fields, child_cg_fd);
+ if (!ASSERT_OK_PTR(link, "attach_cgroup(egress_read_sock_fields)"))
+ goto done;
+ skel->links.egress_read_sock_fields = link;
+
+ link = bpf_program__attach_cgroup(skel->progs.ingress_read_sock_fields, child_cg_fd);
+ if (!ASSERT_OK_PTR(link, "attach_cgroup(ingress_read_sock_fields)"))
goto done;
+ skel->links.ingress_read_sock_fields = link;
- ingress_link = bpf_program__attach_cgroup(skel->progs.ingress_read_sock_fields,
- child_cg_fd);
- if (!ASSERT_OK_PTR(ingress_link, "attach_cgroup(ingress)"))
+ link = bpf_program__attach_cgroup(skel->progs.read_sk_dst_port, child_cg_fd);
+ if (!ASSERT_OK_PTR(link, "attach_cgroup(read_sk_dst_port"))
goto done;
+ skel->links.read_sk_dst_port = link;
linum_map_fd = bpf_map__fd(skel->maps.linum_map);
sk_pkt_out_cnt_fd = bpf_map__fd(skel->maps.sk_pkt_out_cnt);
@@ -369,8 +394,7 @@ void serial_test_sock_fields(void)
test();
done:
- bpf_link__destroy(egress_link);
- bpf_link__destroy(ingress_link);
+ test_sock_fields__detach(skel);
test_sock_fields__destroy(skel);
if (child_cg_fd >= 0)
close(child_cg_fd);
diff --git a/tools/testing/selftests/bpf/progs/test_sock_fields.c b/tools/testing/selftests/bpf/progs/test_sock_fields.c
index 81b57b9aaaea..246f1f001813 100644
--- a/tools/testing/selftests/bpf/progs/test_sock_fields.c
+++ b/tools/testing/selftests/bpf/progs/test_sock_fields.c
@@ -12,6 +12,7 @@
enum bpf_linum_array_idx {
EGRESS_LINUM_IDX,
INGRESS_LINUM_IDX,
+ READ_SK_DST_PORT_LINUM_IDX,
__NR_BPF_LINUM_ARRAY_IDX,
};
@@ -250,4 +251,44 @@ int ingress_read_sock_fields(struct __sk_buff *skb)
return CG_OK;
}
+static __noinline bool sk_dst_port__load_word(struct bpf_sock *sk)
+{
+ __u32 *word = (__u32 *)&sk->dst_port;
+ return word[0] == bpf_htonl(0xcafe0000);
+}
+
+static __noinline bool sk_dst_port__load_half(struct bpf_sock *sk)
+{
+ __u16 *half = (__u16 *)&sk->dst_port;
+ return half[0] == bpf_htons(0xcafe);
+}
+
+static __noinline bool sk_dst_port__load_byte(struct bpf_sock *sk)
+{
+ __u8 *byte = (__u8 *)&sk->dst_port;
+ return byte[0] == 0xca && byte[1] == 0xfe;
+}
+
+SEC("cgroup_skb/egress")
+int read_sk_dst_port(struct __sk_buff *skb)
+{
+ __u32 linum, linum_idx;
+ struct bpf_sock *sk;
+
+ linum_idx = READ_SK_DST_PORT_LINUM_IDX;
+
+ sk = skb->sk;
+ if (!sk)
+ RET_LOG();
+
+ if (!sk_dst_port__load_word(sk))
+ RET_LOG();
+ if (!sk_dst_port__load_half(sk))
+ RET_LOG();
+ if (!sk_dst_port__load_byte(sk))
+ RET_LOG();
+
+ return CG_OK;
+}
+
char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/verifier/sock.c b/tools/testing/selftests/bpf/verifier/sock.c
index ce13ece08d51..3d9d89c5fe65 100644
--- a/tools/testing/selftests/bpf/verifier/sock.c
+++ b/tools/testing/selftests/bpf/verifier/sock.c
@@ -121,7 +121,25 @@
.result = ACCEPT,
},
{
- "sk_fullsock(skb->sk): sk->dst_port [narrow load]",
+ "sk_fullsock(skb->sk): sk->dst_port [word load] (backward compatibility)",
+ .insns = {
+ BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, offsetof(struct __sk_buff, sk)),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_0, offsetof(struct bpf_sock, dst_port)),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .prog_type = BPF_PROG_TYPE_CGROUP_SKB,
+ .result = ACCEPT,
+},
+{
+ "sk_fullsock(skb->sk): sk->dst_port [half load]",
.insns = {
BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, offsetof(struct __sk_buff, sk)),
BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 2),
@@ -139,7 +157,64 @@
.result = ACCEPT,
},
{
- "sk_fullsock(skb->sk): sk->dst_port [load 2nd byte]",
+ "sk_fullsock(skb->sk): sk->dst_port [half load] (invalid)",
+ .insns = {
+ BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, offsetof(struct __sk_buff, sk)),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_0, offsetof(struct bpf_sock, dst_port) + 2),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .prog_type = BPF_PROG_TYPE_CGROUP_SKB,
+ .result = REJECT,
+ .errstr = "invalid sock access",
+},
+{
+ "sk_fullsock(skb->sk): sk->dst_port [byte load]",
+ .insns = {
+ BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, offsetof(struct __sk_buff, sk)),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ BPF_LDX_MEM(BPF_B, BPF_REG_2, BPF_REG_0, offsetof(struct bpf_sock, dst_port)),
+ BPF_LDX_MEM(BPF_B, BPF_REG_2, BPF_REG_0, offsetof(struct bpf_sock, dst_port) + 1),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .prog_type = BPF_PROG_TYPE_CGROUP_SKB,
+ .result = ACCEPT,
+},
+{
+ "sk_fullsock(skb->sk): sk->dst_port [byte load] (invalid)",
+ .insns = {
+ BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, offsetof(struct __sk_buff, sk)),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, offsetof(struct bpf_sock, dst_port) + 2),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .prog_type = BPF_PROG_TYPE_CGROUP_SKB,
+ .result = REJECT,
+ .errstr = "invalid sock access",
+},
+{
+ "sk_fullsock(skb->sk): sk->zero_padding [half load] (invalid)",
.insns = {
BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, offsetof(struct __sk_buff, sk)),
BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 2),
@@ -149,7 +224,7 @@
BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 2),
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
- BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_0, offsetof(struct bpf_sock, dst_port) + 1),
+ BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_0, offsetof(struct bpf_sock, zero_padding)),
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
--
2.31.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH bpf-next 0/2] Split bpf_sock dst_port field
2022-01-27 17:24 [PATCH bpf-next 0/2] Split bpf_sock dst_port field Jakub Sitnicki
2022-01-27 17:24 ` [PATCH bpf-next 1/2] bpf: Make dst_port field in struct bpf_sock 16-bit wide Jakub Sitnicki
2022-01-27 17:24 ` [PATCH bpf-next 2/2] selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads Jakub Sitnicki
@ 2022-01-28 6:31 ` Martin KaFai Lau
2 siblings, 0 replies; 6+ messages in thread
From: Martin KaFai Lau @ 2022-01-28 6:31 UTC (permalink / raw)
To: Jakub Sitnicki
Cc: bpf, netdev, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Menglong Dong
On Thu, Jan 27, 2022 at 06:24:46PM +0100, Jakub Sitnicki wrote:
> This is a follow-up to discussion around the idea of making dst_port in struct
> bpf_sock a 16-bit field that happened in [1]. I have fleshed it out further:
>
> v1:
> - keep dst_field offset unchanged to prevent existing BPF program breakage
> (Martin)
> - allow 8-bit loads from dst_port[0] and [1]
> - add test coverage for the verifier and the context access converter
lgtm. Thanks for the patches.
Acked-by: Martin KaFai Lau <kafai@fb.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: Make dst_port field in struct bpf_sock 16-bit wide
2022-01-27 17:24 ` [PATCH bpf-next 1/2] bpf: Make dst_port field in struct bpf_sock 16-bit wide Jakub Sitnicki
@ 2022-01-28 18:17 ` Alexei Starovoitov
2022-01-30 11:58 ` Jakub Sitnicki
0 siblings, 1 reply; 6+ messages in thread
From: Alexei Starovoitov @ 2022-01-28 18:17 UTC (permalink / raw)
To: Jakub Sitnicki
Cc: bpf, Network Development, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Menglong Dong, Martin KaFai Lau
On Thu, Jan 27, 2022 at 9:24 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> Menglong Dong reports that the documentation for the dst_port field in
> struct bpf_sock is inaccurate and confusing. From the BPF program PoV, the
> field is a zero-padded 16-bit integer in network byte order. The value
> appears to the BPF user as if laid out in memory as so:
>
> offsetof(struct bpf_sock, dst_port) + 0 <port MSB>
> + 8 <port LSB>
> +16 0x00
> +24 0x00
>
> 32-, 16-, and 8-bit wide loads from the field are all allowed, but only if
> the offset into the field is 0.
>
> 32-bit wide loads from dst_port are especially confusing. The loaded value,
> after converting to host byte order with bpf_ntohl(dst_port), contains the
> port number in the upper 16-bits.
>
> Remove the confusion by splitting the field into two 16-bit fields. For
> backward compatibility, allow 32-bit wide loads from offsetof(struct
> bpf_sock, dst_port).
>
> While at it, allow loads 8-bit loads at offset [0] and [1] from dst_port.
>
> Reported-by: Menglong Dong <imagedong@tencent.com>
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
> include/uapi/linux/bpf.h | 3 ++-
> net/core/filter.c | 9 ++++++++-
> 2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 4a2f7041ebae..027e84b18b51 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -5574,7 +5574,8 @@ struct bpf_sock {
> __u32 src_ip4;
> __u32 src_ip6[4];
> __u32 src_port; /* host byte order */
> - __u32 dst_port; /* network byte order */
> + __be16 dst_port; /* network byte order */
> + __u16 zero_padding;
I was wondering can we do '__u16 :16' here ?
Should we do the same for bpf_sk_lookup->remote_port as well
for consistency?
Thanks for the idea and the patches!
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: Make dst_port field in struct bpf_sock 16-bit wide
2022-01-28 18:17 ` Alexei Starovoitov
@ 2022-01-30 11:58 ` Jakub Sitnicki
0 siblings, 0 replies; 6+ messages in thread
From: Jakub Sitnicki @ 2022-01-30 11:58 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: bpf, Network Development, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Menglong Dong, Martin KaFai Lau
On Fri, Jan 28, 2022 at 07:17 PM CET, Alexei Starovoitov wrote:
> On Thu, Jan 27, 2022 at 9:24 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>
>> Menglong Dong reports that the documentation for the dst_port field in
>> struct bpf_sock is inaccurate and confusing. From the BPF program PoV, the
>> field is a zero-padded 16-bit integer in network byte order. The value
>> appears to the BPF user as if laid out in memory as so:
>>
>> offsetof(struct bpf_sock, dst_port) + 0 <port MSB>
>> + 8 <port LSB>
>> +16 0x00
>> +24 0x00
>>
>> 32-, 16-, and 8-bit wide loads from the field are all allowed, but only if
>> the offset into the field is 0.
>>
>> 32-bit wide loads from dst_port are especially confusing. The loaded value,
>> after converting to host byte order with bpf_ntohl(dst_port), contains the
>> port number in the upper 16-bits.
>>
>> Remove the confusion by splitting the field into two 16-bit fields. For
>> backward compatibility, allow 32-bit wide loads from offsetof(struct
>> bpf_sock, dst_port).
>>
>> While at it, allow loads 8-bit loads at offset [0] and [1] from dst_port.
>>
>> Reported-by: Menglong Dong <imagedong@tencent.com>
>> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
>> ---
>> include/uapi/linux/bpf.h | 3 ++-
>> net/core/filter.c | 9 ++++++++-
>> 2 files changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 4a2f7041ebae..027e84b18b51 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -5574,7 +5574,8 @@ struct bpf_sock {
>> __u32 src_ip4;
>> __u32 src_ip6[4];
>> __u32 src_port; /* host byte order */
>> - __u32 dst_port; /* network byte order */
>> + __be16 dst_port; /* network byte order */
>> + __u16 zero_padding;
>
> I was wondering can we do '__u16 :16' here ?
Great idea. Now done in v2:
https://lore.kernel.org/bpf/20220130115518.213259-1-jakub@cloudflare.com/
> Should we do the same for bpf_sk_lookup->remote_port as well
> for consistency?
I can tend to that this upcoming week.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2022-01-30 11:58 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-27 17:24 [PATCH bpf-next 0/2] Split bpf_sock dst_port field Jakub Sitnicki
2022-01-27 17:24 ` [PATCH bpf-next 1/2] bpf: Make dst_port field in struct bpf_sock 16-bit wide Jakub Sitnicki
2022-01-28 18:17 ` Alexei Starovoitov
2022-01-30 11:58 ` Jakub Sitnicki
2022-01-27 17:24 ` [PATCH bpf-next 2/2] selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads Jakub Sitnicki
2022-01-28 6:31 ` [PATCH bpf-next 0/2] Split bpf_sock dst_port field Martin KaFai Lau
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).