* [PATCH RFC bpf-next 0/3] bpf_sk_lookup.remote_port fixes
From: Ilya Leoshkevich @ 2022-02-22 18:25 UTC
To: Jakub Sitnicki, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

Hi,

These are my current patches for the discussion from [1] and [2].
With these changes, tests pass on both x86_64 and s390x, but there are
quite a few things I'm not sure about, hence it's just an RFC:

- Can moving the size < target_size check lead to incorrect shifts in
  some corner cases that I missed?
- Are the different layouts of remote_port on little- and big-endian a
  bug or a feature? Do we want things to be this way? If not, are we
  bound by the ABI anyway?
- Is there any way to make the uapi changes look nicer? A wall of
  nested structs, unions and ifdefs in an otherwise clean struct
  definition doesn't look particularly good.
- What is the Officially Approved way to access the remote_port field
  from C code? I'm leaning towards bpf_ntohs((__u16)remote_port), as
  in [3], and I adjusted the test accordingly.
[1] https://lore.kernel.org/bpf/CAEf4BzaRNLw9_EnaMo5e46CdEkzbJiVU3j9oxnsemBKjNFf3wQ@mail.gmail.com/
[2] https://lore.kernel.org/bpf/20220221180358.169101-1-jakub@cloudflare.com/
[3] https://lore.kernel.org/bpf/20220113070245.791577-1-imagedong@tencent.com/

Ilya Leoshkevich (3):
  bpf: Fix certain narrow loads with offsets
  bpf: Fix bpf_sk_lookup.remote_port on big-endian
  selftests/bpf: Adapt bpf_sk_lookup.remote_port loads

 include/uapi/linux/bpf.h                       | 17 +++++++++++++++--
 kernel/bpf/verifier.c                          | 14 +++++++++-----
 net/core/filter.c                              |  5 ++---
 tools/include/uapi/linux/bpf.h                 | 17 +++++++++++++++--
 .../selftests/bpf/progs/test_sk_lookup.c       | 17 +++++++++++------
 5 files changed, 52 insertions(+), 18 deletions(-)

--
2.34.1
* [PATCH RFC bpf-next 1/3] bpf: Fix certain narrow loads with offsets
From: Ilya Leoshkevich @ 2022-02-22 18:25 UTC
To: Jakub Sitnicki, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

Verifier treats bpf_sk_lookup.remote_port as a 32-bit field for
backward compatibility, regardless of what the uapi headers say.
This field is mapped onto the 16-bit bpf_sk_lookup_kern.sport field.
Therefore, accessing the most significant 16 bits of
bpf_sk_lookup.remote_port must produce 0, which is currently not
the case.

The problem is that narrow loads with offset - commit 46f53a65d2de
("bpf: Allow narrow loads with offset > 0") - don't play nicely with
the masking optimization - commit 239946314e57 ("bpf: possibly avoid
extra masking for narrower load in verifier"). In particular, when we
suppress extra masking, we suppress shifting as well, which is not
correct.

Fix by moving the masking suppression check to BPF_AND generation.
Fixes: 46f53a65d2de ("bpf: Allow narrow loads with offset > 0")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 kernel/bpf/verifier.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d7473fee247c..195f2e9b5a47 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12848,7 +12848,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 			return -EINVAL;
 		}
 
-		if (is_narrower_load && size < target_size) {
+		if (is_narrower_load) {
 			u8 shift = bpf_ctx_narrow_access_offset(
 				off, size, size_default) * 8;
 			if (shift && cnt + 1 >= ARRAY_SIZE(insn_buf)) {
@@ -12860,15 +12860,19 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 				insn_buf[cnt++] = BPF_ALU32_IMM(BPF_RSH,
 								insn->dst_reg,
 								shift);
-				insn_buf[cnt++] = BPF_ALU32_IMM(BPF_AND, insn->dst_reg,
-								(1 << size * 8) - 1);
+				if (size < target_size)
+					insn_buf[cnt++] = BPF_ALU32_IMM(
+						BPF_AND, insn->dst_reg,
+						(1 << size * 8) - 1);
 			} else {
 				if (shift)
 					insn_buf[cnt++] = BPF_ALU64_IMM(BPF_RSH,
 									insn->dst_reg,
 									shift);
-				insn_buf[cnt++] = BPF_ALU64_IMM(BPF_AND, insn->dst_reg,
-								(1ULL << size * 8) - 1);
+				if (size < target_size)
+					insn_buf[cnt++] = BPF_ALU64_IMM(
+						BPF_AND, insn->dst_reg,
+						(1ULL << size * 8) - 1);
 			}
 		}
--
2.34.1
* Re: [PATCH RFC bpf-next 1/3] bpf: Fix certain narrow loads with offsets
From: Jakub Sitnicki @ 2022-03-08 15:01 UTC
To: Ilya Leoshkevich
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, Heiko Carstens, Vasily Gorbik

On Tue, Feb 22, 2022 at 07:25 PM +01, Ilya Leoshkevich wrote:
> Verifier treats bpf_sk_lookup.remote_port as a 32-bit field for
> backward compatibility, regardless of what the uapi headers say.
> This field is mapped onto the 16-bit bpf_sk_lookup_kern.sport field.
> Therefore, accessing the most significant 16 bits of
> bpf_sk_lookup.remote_port must produce 0, which is currently not
> the case.
>
> The problem is that narrow loads with offset - commit 46f53a65d2de
> ("bpf: Allow narrow loads with offset > 0"), don't play nicely with
> the masking optimization - commit 239946314e57 ("bpf: possibly avoid
> extra masking for narrower load in verifier"). In particular, when we
> suppress extra masking, we suppress shifting as well, which is not
> correct.
>
> Fix by moving the masking suppression check to BPF_AND generation.
[...]

Thanks for your patience. I'm coming back to this.

This fix affects the 2-byte load from bpf_sk_lookup.remote_port.
Dumping the xlated BPF code confirms it.

On LE (x86-64) things look good.
Before this patch:

* size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36)
0: (69) r2 = *(u16 *)(r1 +4)
1: (b7) r0 = 0
2: (95) exit

* size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38)
0: (69) r2 = *(u16 *)(r1 +4)
1: (b7) r0 = 0
2: (95) exit

After this patch:

* size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36)
0: (69) r2 = *(u16 *)(r1 +4)
1: (b7) r0 = 0
2: (95) exit

* size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38)
0: (69) r2 = *(u16 *)(r1 +4)
1: (74) w2 >>= 16
2: (b7) r0 = 0
3: (95) exit

Which works great, because the JIT generates a zero-extended load,
movzwq:

* size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36)
bpf_prog_5e4fe3dbdcb18fd3:
 0: nopl 0x0(%rax,%rax,1)
 5: xchg %ax,%ax
 7: push %rbp
 8: mov %rsp,%rbp
 b: movzwq 0x4(%rdi),%rsi
10: xor %eax,%eax
12: leave
13: ret

* size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38)
bpf_prog_4a6336c64a340b96:
 0: nopl 0x0(%rax,%rax,1)
 5: xchg %ax,%ax
 7: push %rbp
 8: mov %rsp,%rbp
 b: movzwq 0x4(%rdi),%rsi
10: shr $0x10,%esi
13: xor %eax,%eax
15: leave
16: ret

Runtime checks for the bpf_sk_lookup.remote_port load and the 2 bytes
of zero padding following it, like below, pass with flying colors:

	ok = ctx->remote_port == bpf_htons(8008);
	if (!ok)
		return SK_DROP;
	ok = *((__u16 *)&ctx->remote_port + 1) == 0;
	if (!ok)
		return SK_DROP;

(The above checks compile to half-word (2-byte) loads.)
On BE (s390x) things look different:

Before the patch:

* size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36)
0: (69) r2 = *(u16 *)(r1 +4)
1: (bc) w2 = w2
2: (b7) r0 = 0
3: (95) exit

* size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38)
0: (69) r2 = *(u16 *)(r1 +4)
1: (bc) w2 = w2
2: (b7) r0 = 0
3: (95) exit

After the patch:

* size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36)
0: (69) r2 = *(u16 *)(r1 +4)
1: (bc) w2 = w2
2: (74) w2 >>= 16
3: (bc) w2 = w2
4: (b7) r0 = 0
5: (95) exit

* size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38)
0: (69) r2 = *(u16 *)(r1 +4)
1: (bc) w2 = w2
2: (b7) r0 = 0
3: (95) exit

These compile to:

* size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36)
bpf_prog_fdd58b8caca29f00:
 0: j 0x0000000000000006
 4: nopr
 6: stmg %r11,%r15,112(%r15)
 c: la %r13,64(%r15)
10: aghi %r15,-96
14: llgh %r3,4(%r2,%r0)
1a: srl %r3,16
1e: llgfr %r3,%r3
22: lgfi %r14,0
28: lgr %r2,%r14
2c: lmg %r11,%r15,208(%r15)
32: br %r14

* size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38)
bpf_prog_5e3d8e92223c6841:
 0: j 0x0000000000000006
 4: nopr
 6: stmg %r11,%r15,112(%r15)
 c: la %r13,64(%r15)
10: aghi %r15,-96
14: llgh %r3,4(%r2,%r0)
1a: lgfi %r14,0
20: lgr %r2,%r14
24: lmg %r11,%r15,208(%r15)
2a: br %r14

Now, we right shift the value when loading

	*(u16 *)(r1 +36)

which in C BPF is equivalent to

	*((__u16 *)&ctx->remote_port + 0)

due to how the shift is calculated by bpf_ctx_narrow_access_offset().

This makes the expected typical use-case

	ctx->remote_port == bpf_htons(8008)

fail on s390x, because llgh (Load Logical Halfword (64<-16)) seems to
lay out the data in the destination register so that it holds
0x0000_0000_0000_1f48.

I don't know if that was the intention here, as it makes the BPF C
code non-portable.

WDYT?

BTW. Out of curiosity, how does a Logical Load Halfword (llgh) differ
from a non-logical Load Halfword (lgh) on s390x? Compiler Explorer
generates a non-logical load for similar C code.
* Re: [PATCH RFC bpf-next 1/3] bpf: Fix certain narrow loads with offsets 2022-03-08 15:01 ` Jakub Sitnicki @ 2022-03-08 23:58 ` Ilya Leoshkevich 2022-03-09 8:36 ` Jakub Sitnicki 0 siblings, 1 reply; 18+ messages in thread From: Ilya Leoshkevich @ 2022-03-08 23:58 UTC (permalink / raw) To: Jakub Sitnicki Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, Heiko Carstens, Vasily Gorbik On Tue, 2022-03-08 at 16:01 +0100, Jakub Sitnicki wrote: > On Tue, Feb 22, 2022 at 07:25 PM +01, Ilya Leoshkevich wrote: > > Verifier treats bpf_sk_lookup.remote_port as a 32-bit field for > > backward compatibility, regardless of what the uapi headers say. > > This field is mapped onto the 16-bit bpf_sk_lookup_kern.sport > > field. > > Therefore, accessing the most significant 16 bits of > > bpf_sk_lookup.remote_port must produce 0, which is currently not > > the case. > > > > The problem is that narrow loads with offset - commit 46f53a65d2de > > ("bpf: Allow narrow loads with offset > 0"), don't play nicely with > > the masking optimization - commit 239946314e57 ("bpf: possibly > > avoid > > extra masking for narrower load in verifier"). In particular, when > > we > > suppress extra masking, we suppress shifting as well, which is not > > correct. > > > > Fix by moving the masking suppression check to BPF_AND generation. 
> > > > Fixes: 46f53a65d2de ("bpf: Allow narrow loads with offset > 0") > > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> > > --- > > kernel/bpf/verifier.c | 14 +++++++++----- > > 1 file changed, 9 insertions(+), 5 deletions(-) > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index d7473fee247c..195f2e9b5a47 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -12848,7 +12848,7 @@ static int convert_ctx_accesses(struct > > bpf_verifier_env *env) > > return -EINVAL; > > } > > > > - if (is_narrower_load && size < target_size) { > > + if (is_narrower_load) { > > u8 shift = bpf_ctx_narrow_access_offset( > > off, size, size_default) * 8; > > if (shift && cnt + 1 >= > > ARRAY_SIZE(insn_buf)) { > > @@ -12860,15 +12860,19 @@ static int convert_ctx_accesses(struct > > bpf_verifier_env *env) > > insn_buf[cnt++] = > > BPF_ALU32_IMM(BPF_RSH, > > > > insn->dst_reg, > > > > shift); > > - insn_buf[cnt++] = > > BPF_ALU32_IMM(BPF_AND, insn->dst_reg, > > - (1 > > << size * 8) - 1); > > + if (size < target_size) > > + insn_buf[cnt++] = > > BPF_ALU32_IMM( > > + BPF_AND, insn- > > >dst_reg, > > + (1 << size * 8) - > > 1); > > } else { > > if (shift) > > insn_buf[cnt++] = > > BPF_ALU64_IMM(BPF_RSH, > > > > insn->dst_reg, > > > > shift); > > - insn_buf[cnt++] = > > BPF_ALU64_IMM(BPF_AND, insn->dst_reg, > > - > > (1ULL > > << size * 8) - 1); > > + if (size < target_size) > > + insn_buf[cnt++] = > > BPF_ALU64_IMM( > > + BPF_AND, insn- > > >dst_reg, > > + (1ULL << size * 8) > > - 1); > > } > > } > > Thanks for patience. I'm coming back to this. > > This fix affects the 2-byte load from bpf_sk_lookup.remote_port. > Dumping the xlated BPF code confirms it. > > On LE (x86-64) things look well. 
> > Before this patch: > > * size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36) > 0: (69) r2 = *(u16 *)(r1 +4) > 1: (b7) r0 = 0 > 2: (95) exit > > * size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38) > 0: (69) r2 = *(u16 *)(r1 +4) > 1: (b7) r0 = 0 > 2: (95) exit > > After this patch: > > * size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36) > 0: (69) r2 = *(u16 *)(r1 +4) > 1: (b7) r0 = 0 > 2: (95) exit > > * size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38) > 0: (69) r2 = *(u16 *)(r1 +4) > 1: (74) w2 >>= 16 > 2: (b7) r0 = 0 > 3: (95) exit > > Which works great because the JIT generates a zero-extended load > movzwq: > > * size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36) > bpf_prog_5e4fe3dbdcb18fd3: > 0: nopl 0x0(%rax,%rax,1) > 5: xchg %ax,%ax > 7: push %rbp > 8: mov %rsp,%rbp > b: movzwq 0x4(%rdi),%rsi > 10: xor %eax,%eax > 12: leave > 13: ret > > > * size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38) > bpf_prog_4a6336c64a340b96: > 0: nopl 0x0(%rax,%rax,1) > 5: xchg %ax,%ax > 7: push %rbp > 8: mov %rsp,%rbp > b: movzwq 0x4(%rdi),%rsi > 10: shr $0x10,%esi > 13: xor %eax,%eax > 15: leave > 16: ret > > Runtime checks for bpf_sk_lookup.remote_port load and the 2-bytes of > zero padding following it, like below, pass with flying colors: > > ok = ctx->remote_port == bpf_htons(8008); > if (!ok) > return SK_DROP; > ok = *((__u16 *)&ctx->remote_port + 1) == 0; > if (!ok) > return SK_DROP; > > (The above checks compile to half-word (2-byte) loads.) 
> On BE (s390x) things look different:

[...]

> Now, we right shift the value when loading
>
> 	*(u16 *)(r1 +36)
>
> which in C BPF is equivalent to
>
> 	*((__u16 *)&ctx->remote_port + 0)
>
> due to how the shift is calculated by bpf_ctx_narrow_access_offset().

Right, that's exactly the intention here.
The way I see the situation is: the ABI forces us to treat remote_port
as a 32-bit field, even though the updated header now says otherwise.
And this:

	unsigned int remote_port;
	unsigned short result = *(unsigned short *)&remote_port;

should be the same as:

	unsigned short result = remote_port >> 16;

on big-endian. Note that this is inherently non-portable.
> This makes the expected typical use-case
>
> 	ctx->remote_port == bpf_htons(8008)
>
> fail on s390x, because llgh (Load Logical Halfword (64<-16)) seems to
> lay out the data in the destination register so that it holds
> 0x0000_0000_0000_1f48.
>
> I don't know if that was the intention here, as it makes the BPF C
> code non-portable.
>
> WDYT?

This depends on how we define the remote_port field. I would argue
that the definition from patch 2 - even though ugly - is the correct
one. It is consistent with both the little-endian (1f 48 00 00) and
big-endian (00 00 1f 48) ABIs.

I don't think the current definition is correct, because it expects
1f 48 00 00 on big-endian, and this is not the case. We can verify
this by taking 9a69e2^ and applying:

--- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c
+++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c
@@ -417,6 +417,8 @@ int ctx_narrow_access(struct bpf_sk_lookup *ctx)
 		return SK_DROP;
 	if (LSW(ctx->remote_port, 0) != SRC_PORT)
 		return SK_DROP;
+	if (ctx->remote_port != SRC_PORT)
+		return SK_DROP;
 
 	/* Narrow loads from local_port field. Expect DST_PORT. */
 	if (LSB(ctx->local_port, 0) != ((DST_PORT >> 0) & 0xff) ||

Therefore, the fact that

	ctx->remote_port == bpf_htons(8008)

fails without patch 2 is expected.

> BTW. Out of curiosity, how does a Logical Load Halfword (llgh) differ
> from a non-logical Load Halfword (lgh) on s390x? Compiler Explorer
> generates a non-logical load for similar C code.

The logical one does zero extension, and the regular one does sign
extension. The following

	unsigned long foo(unsigned short *bar)
	{
		return *bar;
	}

is compiled to

	foo(unsigned short*):
		llgh	%r2,0(%r2)
		br	%r14

with -O3. Without -O3, zeroing out the upper bits is done using sllg
and srlg (left and right logical shifts, respectively).
* Re: [PATCH RFC bpf-next 1/3] bpf: Fix certain narrow loads with offsets 2022-03-08 23:58 ` Ilya Leoshkevich @ 2022-03-09 8:36 ` Jakub Sitnicki 2022-03-09 12:34 ` Ilya Leoshkevich 0 siblings, 1 reply; 18+ messages in thread From: Jakub Sitnicki @ 2022-03-09 8:36 UTC (permalink / raw) To: Ilya Leoshkevich Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, Heiko Carstens, Vasily Gorbik On Wed, Mar 09, 2022 at 12:58 AM +01, Ilya Leoshkevich wrote: > On Tue, 2022-03-08 at 16:01 +0100, Jakub Sitnicki wrote: >> On Tue, Feb 22, 2022 at 07:25 PM +01, Ilya Leoshkevich wrote: >> > Verifier treats bpf_sk_lookup.remote_port as a 32-bit field for >> > backward compatibility, regardless of what the uapi headers say. >> > This field is mapped onto the 16-bit bpf_sk_lookup_kern.sport >> > field. >> > Therefore, accessing the most significant 16 bits of >> > bpf_sk_lookup.remote_port must produce 0, which is currently not >> > the case. >> > >> > The problem is that narrow loads with offset - commit 46f53a65d2de >> > ("bpf: Allow narrow loads with offset > 0"), don't play nicely with >> > the masking optimization - commit 239946314e57 ("bpf: possibly >> > avoid >> > extra masking for narrower load in verifier"). In particular, when >> > we >> > suppress extra masking, we suppress shifting as well, which is not >> > correct. >> > >> > Fix by moving the masking suppression check to BPF_AND generation. 
>> > >> > Fixes: 46f53a65d2de ("bpf: Allow narrow loads with offset > 0") >> > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> >> > --- >> > kernel/bpf/verifier.c | 14 +++++++++----- >> > 1 file changed, 9 insertions(+), 5 deletions(-) >> > >> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >> > index d7473fee247c..195f2e9b5a47 100644 >> > --- a/kernel/bpf/verifier.c >> > +++ b/kernel/bpf/verifier.c >> > @@ -12848,7 +12848,7 @@ static int convert_ctx_accesses(struct >> > bpf_verifier_env *env) >> > return -EINVAL; >> > } >> > >> > - if (is_narrower_load && size < target_size) { >> > + if (is_narrower_load) { >> > u8 shift = bpf_ctx_narrow_access_offset( >> > off, size, size_default) * 8; >> > if (shift && cnt + 1 >= >> > ARRAY_SIZE(insn_buf)) { >> > @@ -12860,15 +12860,19 @@ static int convert_ctx_accesses(struct >> > bpf_verifier_env *env) >> > insn_buf[cnt++] = >> > BPF_ALU32_IMM(BPF_RSH, >> > >> > insn->dst_reg, >> > >> > shift); >> > - insn_buf[cnt++] = >> > BPF_ALU32_IMM(BPF_AND, insn->dst_reg, >> > - (1 >> > << size * 8) - 1); >> > + if (size < target_size) >> > + insn_buf[cnt++] = >> > BPF_ALU32_IMM( >> > + BPF_AND, insn- >> > >dst_reg, >> > + (1 << size * 8) - >> > 1); >> > } else { >> > if (shift) >> > insn_buf[cnt++] = >> > BPF_ALU64_IMM(BPF_RSH, >> > >> > insn->dst_reg, >> > >> > shift); >> > - insn_buf[cnt++] = >> > BPF_ALU64_IMM(BPF_AND, insn->dst_reg, >> > - >> > (1ULL >> > << size * 8) - 1); >> > + if (size < target_size) >> > + insn_buf[cnt++] = >> > BPF_ALU64_IMM( >> > + BPF_AND, insn- >> > >dst_reg, >> > + (1ULL << size * 8) >> > - 1); >> > } >> > } >> >> Thanks for patience. I'm coming back to this. >> >> This fix affects the 2-byte load from bpf_sk_lookup.remote_port. >> Dumping the xlated BPF code confirms it. >> >> On LE (x86-64) things look well. 
>> >> Before this patch: >> >> * size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36) >> 0: (69) r2 = *(u16 *)(r1 +4) >> 1: (b7) r0 = 0 >> 2: (95) exit >> >> * size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38) >> 0: (69) r2 = *(u16 *)(r1 +4) >> 1: (b7) r0 = 0 >> 2: (95) exit >> >> After this patch: >> >> * size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36) >> 0: (69) r2 = *(u16 *)(r1 +4) >> 1: (b7) r0 = 0 >> 2: (95) exit >> >> * size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38) >> 0: (69) r2 = *(u16 *)(r1 +4) >> 1: (74) w2 >>= 16 >> 2: (b7) r0 = 0 >> 3: (95) exit >> >> Which works great because the JIT generates a zero-extended load >> movzwq: >> >> * size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36) >> bpf_prog_5e4fe3dbdcb18fd3: >> 0: nopl 0x0(%rax,%rax,1) >> 5: xchg %ax,%ax >> 7: push %rbp >> 8: mov %rsp,%rbp >> b: movzwq 0x4(%rdi),%rsi >> 10: xor %eax,%eax >> 12: leave >> 13: ret >> >> >> * size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38) >> bpf_prog_4a6336c64a340b96: >> 0: nopl 0x0(%rax,%rax,1) >> 5: xchg %ax,%ax >> 7: push %rbp >> 8: mov %rsp,%rbp >> b: movzwq 0x4(%rdi),%rsi >> 10: shr $0x10,%esi >> 13: xor %eax,%eax >> 15: leave >> 16: ret >> >> Runtime checks for bpf_sk_lookup.remote_port load and the 2-bytes of >> zero padding following it, like below, pass with flying colors: >> >> ok = ctx->remote_port == bpf_htons(8008); >> if (!ok) >> return SK_DROP; >> ok = *((__u16 *)&ctx->remote_port + 1) == 0; >> if (!ok) >> return SK_DROP; >> >> (The above checks compile to half-word (2-byte) loads.) 
>> >> >> On BE (s390x) things look different: >> >> Before the patch: >> >> * size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36) >> 0: (69) r2 = *(u16 *)(r1 +4) >> 1: (bc) w2 = w2 >> 2: (b7) r0 = 0 >> 3: (95) exit >> >> * size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38) >> 0: (69) r2 = *(u16 *)(r1 +4) >> 1: (bc) w2 = w2 >> 2: (b7) r0 = 0 >> 3: (95) exit >> >> After the patch: >> >> * size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36) >> 0: (69) r2 = *(u16 *)(r1 +4) >> 1: (bc) w2 = w2 >> 2: (74) w2 >>= 16 >> 3: (bc) w2 = w2 >> 4: (b7) r0 = 0 >> 5: (95) exit >> >> * size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38) >> 0: (69) r2 = *(u16 *)(r1 +4) >> 1: (bc) w2 = w2 >> 2: (b7) r0 = 0 >> 3: (95) exit >> >> These compile to: >> >> * size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36) >> bpf_prog_fdd58b8caca29f00: >> 0: j 0x0000000000000006 >> 4: nopr >> 6: stmg %r11,%r15,112(%r15) >> c: la %r13,64(%r15) >> 10: aghi %r15,-96 >> 14: llgh %r3,4(%r2,%r0) >> 1a: srl %r3,16 >> 1e: llgfr %r3,%r3 >> 22: lgfi %r14,0 >> 28: lgr %r2,%r14 >> 2c: lmg %r11,%r15,208(%r15) >> 32: br %r14 >> >> >> * size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38) >> bpf_prog_5e3d8e92223c6841: >> 0: j 0x0000000000000006 >> 4: nopr >> 6: stmg %r11,%r15,112(%r15) >> c: la %r13,64(%r15) >> 10: aghi %r15,-96 >> 14: llgh %r3,4(%r2,%r0) >> 1a: lgfi %r14,0 >> 20: lgr %r2,%r14 >> 24: lmg %r11,%r15,208(%r15) >> 2a: br %r14 >> >> Now, we right shift the value when loading >> >> *(u16 *)(r1 +36) >> >> which in C BPF is equivalent to >> >> *((__u16 *)&ctx->remote_port + 0) >> >> due to how the shift is calculated by bpf_ctx_narrow_access_offset(). > > Right, that's exactly the intention here. > The way I see the situation is: the ABI forces us to treat remote_port > as a 32-bit field, even though the updated header now says otherwise. > And this: > > unsigned int remote_port; > unsigned short result = *(unsigned short *)remote_port; > > should be the same as: > > unsigned short result = remote_port >> 16; > > on big-endian. 
Note that this is inherently non-portable. > >> This makes the expected typical use-case >> >> ctx->remote_port == bpf_htons(8008) >> >> fail on s390x because llgh (Load Logical Halfword (64<-16)) seems to >> lay >> out the data in the destination register so that it holds >> 0x0000_0000_0000_1f48. >> >> I don't know that was the intention here, as it makes the BPF C code >> non-portable. >> >> WDYT? > > This depends on how we define the remote_port field. I would argue that > the definition from patch 2 - even though ugly - is the correct one. > It is consistent with both the little-endian (1f 48 00 00) and > big-endian (00 00 1f 48) ABIs. > > I don't think the current definition is correct, because it expects > 1f 48 00 00 on big-endian, and this is not the case. We can verify this > by taking 9a69e2^ and applying > > --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c > +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c > @@ -417,6 +417,8 @@ int ctx_narrow_access(struct bpf_sk_lookup *ctx) > return SK_DROP; > if (LSW(ctx->remote_port, 0) != SRC_PORT) > return SK_DROP; > + if (ctx->remote_port != SRC_PORT) > + return SK_DROP; > > /* Narrow loads from local_port field. Expect DST_PORT. */ > if (LSB(ctx->local_port, 0) != ((DST_PORT >> 0) & 0xff) || > > Therefore that > > ctx->remote_port == bpf_htons(8008) > > fails without patch 2 is as expected. > Consider this - today the below is true on both LE and BE, right? *(u32 *)&ctx->remote_port == *(u16 *)&ctx->remote_port because the loads get converted to: *(u16 *)&ctx_kern->sport == *(u16 *)&ctx_kern->sport IOW, today, because of the bug that you are fixing here, the data layout changes from the PoV of the BPF program depending on the load size. With 2-byte loads, without this patch, the data layout appears as: struct bpf_sk_lookup { ... __be16 remote_port; __be16 remote_port; ... } While for 4-byte loads, it appears as in your 2nd patch: struct bpf_sk_lookup { ... 
#if little-endian
	__be16 remote_port;
	__u16 :16; /* zero padding */
#elif big-endian
	__u16 :16; /* zero padding */
	__be16 remote_port;
#endif
	...
}

Because of that, I don't see how we could keep complete ABI
compatibility and have just one definition of struct bpf_sk_lookup
that reflects it. These are conflicting requirements.

I'd bite the bullet for 4-byte loads, for the sake of having an
endian-agnostic struct bpf_sk_lookup and struct bpf_sock definition
in the uAPI header. The sacrifice here is that the access converter
will have to keep rewriting 4-byte accesses to
bpf_sk_lookup.remote_port and bpf_sock.dst_port in this unexpected,
quirky manner.

The expectation is that, with time, users will recompile their BPF
progs against the updated bpf.h and switch to 2-byte loads. That will
make the quirk in the access converter dead code in time.

I don't have any better ideas. Sorry.

[...]
* Re: [PATCH RFC bpf-next 1/3] bpf: Fix certain narrow loads with offsets 2022-03-09 8:36 ` Jakub Sitnicki @ 2022-03-09 12:34 ` Ilya Leoshkevich 2022-03-10 22:57 ` Jakub Sitnicki 0 siblings, 1 reply; 18+ messages in thread From: Ilya Leoshkevich @ 2022-03-09 12:34 UTC (permalink / raw) To: Jakub Sitnicki Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, Heiko Carstens, Vasily Gorbik On Wed, 2022-03-09 at 09:36 +0100, Jakub Sitnicki wrote: > On Wed, Mar 09, 2022 at 12:58 AM +01, Ilya Leoshkevich wrote: > > On Tue, 2022-03-08 at 16:01 +0100, Jakub Sitnicki wrote: > > > On Tue, Feb 22, 2022 at 07:25 PM +01, Ilya Leoshkevich wrote: > > > > Verifier treats bpf_sk_lookup.remote_port as a 32-bit field for > > > > backward compatibility, regardless of what the uapi headers > > > > say. > > > > This field is mapped onto the 16-bit bpf_sk_lookup_kern.sport > > > > field. > > > > Therefore, accessing the most significant 16 bits of > > > > bpf_sk_lookup.remote_port must produce 0, which is currently > > > > not > > > > the case. > > > > > > > > The problem is that narrow loads with offset - commit > > > > 46f53a65d2de > > > > ("bpf: Allow narrow loads with offset > 0"), don't play nicely > > > > with > > > > the masking optimization - commit 239946314e57 ("bpf: possibly > > > > avoid > > > > extra masking for narrower load in verifier"). In particular, > > > > when > > > > we > > > > suppress extra masking, we suppress shifting as well, which is > > > > not > > > > correct. > > > > > > > > Fix by moving the masking suppression check to BPF_AND > > > > generation. 
> > > > > > > > Fixes: 46f53a65d2de ("bpf: Allow narrow loads with offset > 0") > > > > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> > > > > --- > > > > kernel/bpf/verifier.c | 14 +++++++++----- > > > > 1 file changed, 9 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > > > index d7473fee247c..195f2e9b5a47 100644 > > > > --- a/kernel/bpf/verifier.c > > > > +++ b/kernel/bpf/verifier.c > > > > @@ -12848,7 +12848,7 @@ static int convert_ctx_accesses(struct > > > > bpf_verifier_env *env) > > > > return -EINVAL; > > > > } > > > > > > > > - if (is_narrower_load && size < target_size) { > > > > + if (is_narrower_load) { > > > > u8 shift = > > > > bpf_ctx_narrow_access_offset( > > > > off, size, size_default) * 8; > > > > if (shift && cnt + 1 >= > > > > ARRAY_SIZE(insn_buf)) { > > > > @@ -12860,15 +12860,19 @@ static int > > > > convert_ctx_accesses(struct > > > > bpf_verifier_env *env) > > > > insn_buf[cnt++] = > > > > BPF_ALU32_IMM(BPF_RSH, > > > > > > > > > > > > insn->dst_reg, > > > > > > > > > > > > shift); > > > > - insn_buf[cnt++] = > > > > BPF_ALU32_IMM(BPF_AND, insn->dst_reg, > > > > - > > > > ( > > > > 1 > > > > << size * 8) - 1); > > > > + if (size < target_size) > > > > + insn_buf[cnt++] = > > > > BPF_ALU32_IMM( > > > > + BPF_AND, insn- > > > > > dst_reg, > > > > + (1 << size * 8) > > > > - > > > > 1); > > > > } else { > > > > if (shift) > > > > insn_buf[cnt++] = > > > > BPF_ALU64_IMM(BPF_RSH, > > > > > > > > > > > > insn->dst_reg, > > > > > > > > > > > > shift); > > > > - insn_buf[cnt++] = > > > > BPF_ALU64_IMM(BPF_AND, insn->dst_reg, > > > > - > > > > > > > > (1ULL > > > > << size * 8) - 1); > > > > + if (size < target_size) > > > > + insn_buf[cnt++] = > > > > BPF_ALU64_IMM( > > > > + BPF_AND, insn- > > > > > dst_reg, > > > > + (1ULL << size * > > > > 8) > > > > - 1); > > > > } > > > > } > > > > > > Thanks for patience. I'm coming back to this. 
> > > > > > This fix affects the 2-byte load from bpf_sk_lookup.remote_port. > > > Dumping the xlated BPF code confirms it. > > > > > > On LE (x86-64) things look well. > > > > > > Before this patch: > > > > > > * size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36) > > > 0: (69) r2 = *(u16 *)(r1 +4) > > > 1: (b7) r0 = 0 > > > 2: (95) exit > > > > > > * size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38) > > > 0: (69) r2 = *(u16 *)(r1 +4) > > > 1: (b7) r0 = 0 > > > 2: (95) exit > > > > > > After this patch: > > > > > > * size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36) > > > 0: (69) r2 = *(u16 *)(r1 +4) > > > 1: (b7) r0 = 0 > > > 2: (95) exit > > > > > > * size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38) > > > 0: (69) r2 = *(u16 *)(r1 +4) > > > 1: (74) w2 >>= 16 > > > 2: (b7) r0 = 0 > > > 3: (95) exit > > > > > > Which works great because the JIT generates a zero-extended load > > > movzwq: > > > > > > * size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36) > > > bpf_prog_5e4fe3dbdcb18fd3: > > > 0: nopl 0x0(%rax,%rax,1) > > > 5: xchg %ax,%ax > > > 7: push %rbp > > > 8: mov %rsp,%rbp > > > b: movzwq 0x4(%rdi),%rsi > > > 10: xor %eax,%eax > > > 12: leave > > > 13: ret > > > > > > > > > * size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38) > > > bpf_prog_4a6336c64a340b96: > > > 0: nopl 0x0(%rax,%rax,1) > > > 5: xchg %ax,%ax > > > 7: push %rbp > > > 8: mov %rsp,%rbp > > > b: movzwq 0x4(%rdi),%rsi > > > 10: shr $0x10,%esi > > > 13: xor %eax,%eax > > > 15: leave > > > 16: ret > > > > > > Runtime checks for bpf_sk_lookup.remote_port load and the 2-bytes > > > of > > > zero padding following it, like below, pass with flying colors: > > > > > > ok = ctx->remote_port == bpf_htons(8008); > > > if (!ok) > > > return SK_DROP; > > > ok = *((__u16 *)&ctx->remote_port + 1) == 0; > > > if (!ok) > > > return SK_DROP; > > > > > > (The above checks compile to half-word (2-byte) loads.) 
> > > > > > > > > On BE (s390x) things look different: > > > > > > Before the patch: > > > > > > * size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36) > > > 0: (69) r2 = *(u16 *)(r1 +4) > > > 1: (bc) w2 = w2 > > > 2: (b7) r0 = 0 > > > 3: (95) exit > > > > > > * size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38) > > > 0: (69) r2 = *(u16 *)(r1 +4) > > > 1: (bc) w2 = w2 > > > 2: (b7) r0 = 0 > > > 3: (95) exit > > > > > > After the patch: > > > > > > * size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36) > > > 0: (69) r2 = *(u16 *)(r1 +4) > > > 1: (bc) w2 = w2 > > > 2: (74) w2 >>= 16 > > > 3: (bc) w2 = w2 > > > 4: (b7) r0 = 0 > > > 5: (95) exit > > > > > > * size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38) > > > 0: (69) r2 = *(u16 *)(r1 +4) > > > 1: (bc) w2 = w2 > > > 2: (b7) r0 = 0 > > > 3: (95) exit > > > > > > These compile to: > > > > > > * size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36) > > > bpf_prog_fdd58b8caca29f00: > > > 0: j 0x0000000000000006 > > > 4: nopr > > > 6: stmg %r11,%r15,112(%r15) > > > c: la %r13,64(%r15) > > > 10: aghi %r15,-96 > > > 14: llgh %r3,4(%r2,%r0) > > > 1a: srl %r3,16 > > > 1e: llgfr %r3,%r3 > > > 22: lgfi %r14,0 > > > 28: lgr %r2,%r14 > > > 2c: lmg %r11,%r15,208(%r15) > > > 32: br %r14 > > > > > > > > > * size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38) > > > bpf_prog_5e3d8e92223c6841: > > > 0: j 0x0000000000000006 > > > 4: nopr > > > 6: stmg %r11,%r15,112(%r15) > > > c: la %r13,64(%r15) > > > 10: aghi %r15,-96 > > > 14: llgh %r3,4(%r2,%r0) > > > 1a: lgfi %r14,0 > > > 20: lgr %r2,%r14 > > > 24: lmg %r11,%r15,208(%r15) > > > 2a: br %r14 > > > > > > Now, we right shift the value when loading > > > > > > *(u16 *)(r1 +36) > > > > > > which in C BPF is equivalent to > > > > > > *((__u16 *)&ctx->remote_port + 0) > > > > > > due to how the shift is calculated by > > > bpf_ctx_narrow_access_offset(). > > > > Right, that's exactly the intention here. 
> > The way I see the situation is: the ABI forces us to treat > > remote_port > > as a 32-bit field, even though the updated header now says > > otherwise. > > And this: > > > > unsigned int remote_port; > > unsigned short result = *(unsigned short *)remote_port; > > > > should be the same as: > > > > unsigned short result = remote_port >> 16; > > > > on big-endian. Note that this is inherently non-portable. > > > > > > > > > > This makes the expected typical use-case > > > > > > ctx->remote_port == bpf_htons(8008) > > > > > > fail on s390x because llgh (Load Logical Halfword (64<-16)) seems > > > to > > > lay > > > out the data in the destination register so that it holds > > > 0x0000_0000_0000_1f48. > > > > > > I don't know that was the intention here, as it makes the BPF C > > > code > > > non-portable. > > > > > > WDYT? > > > > This depends on how we define the remote_port field. I would argue > > that > > the definition from patch 2 - even though ugly - is the correct > > one. > > It is consistent with both the little-endian (1f 48 00 00) and > > big-endian (00 00 1f 48) ABIs. > > > > I don't think the current definition is correct, because it expects > > 1f 48 00 00 on big-endian, and this is not the case. We can verify > > this > > by taking 9a69e2^ and applying > > > > --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > @@ -417,6 +417,8 @@ int ctx_narrow_access(struct bpf_sk_lookup > > *ctx) > > return SK_DROP; > > if (LSW(ctx->remote_port, 0) != SRC_PORT) > > return SK_DROP; > > + if (ctx->remote_port != SRC_PORT) > > + return SK_DROP; > > > > /* Narrow loads from local_port field. Expect DST_PORT. */ > > if (LSB(ctx->local_port, 0) != ((DST_PORT >> 0) & 0xff) || > > > > Therefore that > > > > ctx->remote_port == bpf_htons(8008) > > > > fails without patch 2 is as expected. > > > > Consider this - today the below is true on both LE and BE, right? 
> > *(u32 *)&ctx->remote_port == *(u16 *)&ctx->remote_port > > because the loads get converted to: > > *(u16 *)&ctx_kern->sport == *(u16 *)&ctx_kern->sport > > IOW, today, because of the bug that you are fixing here, the data > layout > changes from the PoV of the BPF program depending on the load size. > > With 2-byte loads, without this patch, the data layout appears as: > > struct bpf_sk_lookup { > ... > __be16 remote_port; > __be16 remote_port; > ... > } I see, one can indeed argue that this is also a part of the ABI now. So we're stuck between a rock and a hard place. > While for 4-byte loads, it appears as in your 2nd patch: > > struct bpf_sk_lookup { > ... > #if little-endian > __be16 remote_port; > __u16 :16; /* zero padding */ > #elif big-endian > __u16 :16; /* zero padding */ > __be16 remote_port; > #endif > ... > } > > Because of that I don't see how we could keep complete ABI > compatiblity, > and have just one definition of struct bpf_sk_lookup that reflects > it. These are conflicting requirements. > > I'd bite the bullet for 4-byte loads, for the sake of having an > endian-agnostic struct bpf_sk_lookup and struct bpf_sock definition > in > the uAPI header. > > The sacrifice here is that the access converter will have to keep > rewriting 4-byte access to bpf_sk_lookup.remote_port and > bpf_sock.dst_port in this unexpected, quirky manner. > > The expectation is that with time users will recompile their BPF > progs > against the updated bpf.h, and switch to 2-byte loads. That will make > the quirk in the access converter dead code in time. > > I don't have any better ideas. Sorry. > > [...] I agree, let's go ahead with this solution. The only remaining problem that I see is: the bug is in the common code, and it will affect the fields that we add in the future. Can we either document this state of things in a comment, or fix the bug and emulate the old behavior for certain fields? ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH RFC bpf-next 1/3] bpf: Fix certain narrow loads with offsets 2022-03-09 12:34 ` Ilya Leoshkevich @ 2022-03-10 22:57 ` Jakub Sitnicki 2022-03-14 17:35 ` Jakub Sitnicki 0 siblings, 1 reply; 18+ messages in thread From: Jakub Sitnicki @ 2022-03-10 22:57 UTC (permalink / raw) To: Ilya Leoshkevich Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, Heiko Carstens, Vasily Gorbik On Wed, Mar 09, 2022 at 01:34 PM +01, Ilya Leoshkevich wrote: > On Wed, 2022-03-09 at 09:36 +0100, Jakub Sitnicki wrote: [...] >> >> Consider this - today the below is true on both LE and BE, right? >> >> *(u32 *)&ctx->remote_port == *(u16 *)&ctx->remote_port >> >> because the loads get converted to: >> >> *(u16 *)&ctx_kern->sport == *(u16 *)&ctx_kern->sport >> >> IOW, today, because of the bug that you are fixing here, the data >> layout >> changes from the PoV of the BPF program depending on the load size. >> >> With 2-byte loads, without this patch, the data layout appears as: >> >> struct bpf_sk_lookup { >> ... >> __be16 remote_port; >> __be16 remote_port; >> ... >> } > > I see, one can indeed argue that this is also a part of the ABI now. > So we're stuck between a rock and a hard place. > >> While for 4-byte loads, it appears as in your 2nd patch: >> >> struct bpf_sk_lookup { >> ... >> #if little-endian >> __be16 remote_port; >> __u16 :16; /* zero padding */ >> #elif big-endian >> __u16 :16; /* zero padding */ >> __be16 remote_port; >> #endif >> ... >> } >> >> Because of that I don't see how we could keep complete ABI >> compatiblity, >> and have just one definition of struct bpf_sk_lookup that reflects >> it. These are conflicting requirements. >> >> I'd bite the bullet for 4-byte loads, for the sake of having an >> endian-agnostic struct bpf_sk_lookup and struct bpf_sock definition >> in >> the uAPI header. 
>> >> The sacrifice here is that the access converter will have to keep >> rewriting 4-byte access to bpf_sk_lookup.remote_port and >> bpf_sock.dst_port in this unexpected, quirky manner. >> >> The expectation is that with time users will recompile their BPF >> progs >> against the updated bpf.h, and switch to 2-byte loads. That will make >> the quirk in the access converter dead code in time. >> >> I don't have any better ideas. Sorry. >> >> [...] > > I agree, let's go ahead with this solution. > > The only remaining problem that I see is: the bug is in the common > code, and it will affect the fields that we add in the future. > > Can we either document this state of things in a comment, or fix the > bug and emulate the old behavior for certain fields? I think we can fix the bug in the common code, and compensate for the quirky 4-byte access to bpf_sk_lookup.remote_port and bpf_sock.dst_port in the is_valid_access and convert_ctx_access. With the patch as below, access to remote_port gets rewritten to: * size=1, offset=0, r2 = *(u8 *)(r1 +36) 0: (69) r2 = *(u16 *)(r1 +4) 1: (54) w2 &= 255 2: (b7) r0 = 0 3: (95) exit * size=1, offset=1, r2 = *(u8 *)(r1 +37) 0: (69) r2 = *(u16 *)(r1 +4) 1: (74) w2 >>= 8 2: (54) w2 &= 255 3: (b7) r0 = 0 4: (95) exit * size=1, offset=2, r2 = *(u8 *)(r1 +38) 0: (b4) w2 = 0 1: (54) w2 &= 255 2: (b7) r0 = 0 3: (95) exit * size=1, offset=3, r2 = *(u8 *)(r1 +39) 0: (b4) w2 = 0 1: (74) w2 >>= 8 2: (54) w2 &= 255 3: (b7) r0 = 0 4: (95) exit * size=2, offset=0, r2 = *(u16 *)(r1 +36) 0: (69) r2 = *(u16 *)(r1 +4) 1: (b7) r0 = 0 2: (95) exit * size=2, offset=2, r2 = *(u16 *)(r1 +38) 0: (b4) w2 = 0 1: (b7) r0 = 0 2: (95) exit * size=4, offset=0, r2 = *(u32 *)(r1 +36) 0: (69) r2 = *(u16 *)(r1 +4) 1: (b7) r0 = 0 2: (95) exit How does that look to you? Still need to give it a test on s390x. 
--8<-- diff --git a/net/core/filter.c b/net/core/filter.c index 65869fd510e8..2625a1d2dabc 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -10856,13 +10856,24 @@ static bool sk_lookup_is_valid_access(int off, int size, case bpf_ctx_range(struct bpf_sk_lookup, local_ip4): case bpf_ctx_range_till(struct bpf_sk_lookup, remote_ip6[0], remote_ip6[3]): case bpf_ctx_range_till(struct bpf_sk_lookup, local_ip6[0], local_ip6[3]): - case offsetof(struct bpf_sk_lookup, remote_port) ... - offsetof(struct bpf_sk_lookup, local_ip4) - 1: case bpf_ctx_range(struct bpf_sk_lookup, local_port): case bpf_ctx_range(struct bpf_sk_lookup, ingress_ifindex): bpf_ctx_record_field_size(info, sizeof(__u32)); return bpf_ctx_narrow_access_ok(off, size, sizeof(__u32)); + case bpf_ctx_range(struct bpf_sk_lookup, remote_port): + /* Allow 4-byte access to 2-byte field for backward compatibility */ + if (size == sizeof(__u32)) + return off == offsetof(struct bpf_sk_lookup, remote_port); + bpf_ctx_record_field_size(info, sizeof(__be16)); + return bpf_ctx_narrow_access_ok(off, size, sizeof(__be16)); + + case offsetofend(struct bpf_sk_lookup, remote_port) ... + offsetof(struct bpf_sk_lookup, local_ip4) - 1: + /* Allow access to zero padding for backward compatiblity */ + bpf_ctx_record_field_size(info, sizeof(__u16)); + return bpf_ctx_narrow_access_ok(off, size, sizeof(__u16)); + default: return false; } @@ -10944,6 +10955,11 @@ static u32 sk_lookup_convert_ctx_access(enum bpf_access_type type, sport, 2, target_size)); break; + case offsetofend(struct bpf_sk_lookup, remote_port): + *target_size = 2; + *insn++ = BPF_MOV32_IMM(si->dst_reg, 0); + break; + case offsetof(struct bpf_sk_lookup, local_port): *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->src_reg, bpf_target_off(struct bpf_sk_lookup_kern, ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH RFC bpf-next 1/3] bpf: Fix certain narrow loads with offsets 2022-03-10 22:57 ` Jakub Sitnicki @ 2022-03-14 17:35 ` Jakub Sitnicki 2022-03-14 18:25 ` Ilya Leoshkevich 0 siblings, 1 reply; 18+ messages in thread From: Jakub Sitnicki @ 2022-03-14 17:35 UTC (permalink / raw) To: Ilya Leoshkevich Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, Heiko Carstens, Vasily Gorbik On Thu, Mar 10, 2022 at 11:57 PM +01, Jakub Sitnicki wrote: > On Wed, Mar 09, 2022 at 01:34 PM +01, Ilya Leoshkevich wrote: >> On Wed, 2022-03-09 at 09:36 +0100, Jakub Sitnicki wrote: > > [...] > >>> >>> Consider this - today the below is true on both LE and BE, right? >>> >>> *(u32 *)&ctx->remote_port == *(u16 *)&ctx->remote_port >>> >>> because the loads get converted to: >>> >>> *(u16 *)&ctx_kern->sport == *(u16 *)&ctx_kern->sport >>> >>> IOW, today, because of the bug that you are fixing here, the data >>> layout >>> changes from the PoV of the BPF program depending on the load size. >>> >>> With 2-byte loads, without this patch, the data layout appears as: >>> >>> struct bpf_sk_lookup { >>> ... >>> __be16 remote_port; >>> __be16 remote_port; >>> ... >>> } >> >> I see, one can indeed argue that this is also a part of the ABI now. >> So we're stuck between a rock and a hard place. >> >>> While for 4-byte loads, it appears as in your 2nd patch: >>> >>> struct bpf_sk_lookup { >>> ... >>> #if little-endian >>> __be16 remote_port; >>> __u16 :16; /* zero padding */ >>> #elif big-endian >>> __u16 :16; /* zero padding */ >>> __be16 remote_port; >>> #endif >>> ... >>> } >>> >>> Because of that I don't see how we could keep complete ABI >>> compatiblity, >>> and have just one definition of struct bpf_sk_lookup that reflects >>> it. These are conflicting requirements. >>> >>> I'd bite the bullet for 4-byte loads, for the sake of having an >>> endian-agnostic struct bpf_sk_lookup and struct bpf_sock definition >>> in >>> the uAPI header. 
>>> >>> The sacrifice here is that the access converter will have to keep >>> rewriting 4-byte access to bpf_sk_lookup.remote_port and >>> bpf_sock.dst_port in this unexpected, quirky manner. >>> >>> The expectation is that with time users will recompile their BPF >>> progs >>> against the updated bpf.h, and switch to 2-byte loads. That will make >>> the quirk in the access converter dead code in time. >>> >>> I don't have any better ideas. Sorry. >>> >>> [...] >> >> I agree, let's go ahead with this solution. >> >> The only remaining problem that I see is: the bug is in the common >> code, and it will affect the fields that we add in the future. >> >> Can we either document this state of things in a comment, or fix the >> bug and emulate the old behavior for certain fields? > > I think we can fix the bug in the common code, and compensate for the > quirky 4-byte access to bpf_sk_lookup.remote_port and bpf_sock.dst_port > in the is_valid_access and convert_ctx_access. > > With the patch as below, access to remote_port gets rewritten to: > > * size=1, offset=0, r2 = *(u8 *)(r1 +36) > 0: (69) r2 = *(u16 *)(r1 +4) > 1: (54) w2 &= 255 > 2: (b7) r0 = 0 > 3: (95) exit > > * size=1, offset=1, r2 = *(u8 *)(r1 +37) > 0: (69) r2 = *(u16 *)(r1 +4) > 1: (74) w2 >>= 8 > 2: (54) w2 &= 255 > 3: (b7) r0 = 0 > 4: (95) exit > > * size=1, offset=2, r2 = *(u8 *)(r1 +38) > 0: (b4) w2 = 0 > 1: (54) w2 &= 255 > 2: (b7) r0 = 0 > 3: (95) exit > > * size=1, offset=3, r2 = *(u8 *)(r1 +39) > 0: (b4) w2 = 0 > 1: (74) w2 >>= 8 > 2: (54) w2 &= 255 > 3: (b7) r0 = 0 > 4: (95) exit > > * size=2, offset=0, r2 = *(u16 *)(r1 +36) > 0: (69) r2 = *(u16 *)(r1 +4) > 1: (b7) r0 = 0 > 2: (95) exit > > * size=2, offset=2, r2 = *(u16 *)(r1 +38) > 0: (b4) w2 = 0 > 1: (b7) r0 = 0 > 2: (95) exit > > * size=4, offset=0, r2 = *(u32 *)(r1 +36) > 0: (69) r2 = *(u16 *)(r1 +4) > 1: (b7) r0 = 0 > 2: (95) exit > > How does that look to you? > > Still need to give it a test on s390x. 
Context conversion with patch below applied looks correct to me on s390x as well: * size=1, offset=0, r2 = *(u8 *)(r1 +36) 0: (69) r2 = *(u16 *)(r1 +4) 1: (bc) w2 = w2 2: (74) w2 >>= 8 3: (bc) w2 = w2 4: (54) w2 &= 255 5: (bc) w2 = w2 6: (b7) r0 = 0 7: (95) exit * size=1, offset=1, r2 = *(u8 *)(r1 +37) 0: (69) r2 = *(u16 *)(r1 +4) 1: (bc) w2 = w2 2: (54) w2 &= 255 3: (bc) w2 = w2 4: (b7) r0 = 0 5: (95) exit * size=1, offset=2, r2 = *(u8 *)(r1 +38) 0: (b4) w2 = 0 1: (bc) w2 = w2 2: (74) w2 >>= 8 3: (bc) w2 = w2 4: (54) w2 &= 255 5: (bc) w2 = w2 6: (b7) r0 = 0 7: (95) exit * size=1, offset=3, r2 = *(u8 *)(r1 +39) 0: (b4) w2 = 0 1: (bc) w2 = w2 2: (54) w2 &= 255 3: (bc) w2 = w2 4: (b7) r0 = 0 5: (95) exit * size=2, offset=0, r2 = *(u16 *)(r1 +36) 0: (69) r2 = *(u16 *)(r1 +4) 1: (bc) w2 = w2 2: (b7) r0 = 0 3: (95) exit * size=2, offset=2, r2 = *(u16 *)(r1 +38) 0: (b4) w2 = 0 1: (bc) w2 = w2 2: (b7) r0 = 0 3: (95) exit * size=4, offset=0, r2 = *(u32 *)(r1 +36) 0: (69) r2 = *(u16 *)(r1 +4) 1: (bc) w2 = w2 2: (b7) r0 = 0 3: (95) exit If we go this way, this should unbreak the bpf selftests on BE, independently of the patch 1 from this series. Will send it as a patch, so that we continue the review discussion. > > --8<-- > > diff --git a/net/core/filter.c b/net/core/filter.c > index 65869fd510e8..2625a1d2dabc 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -10856,13 +10856,24 @@ static bool sk_lookup_is_valid_access(int off, int size, > case bpf_ctx_range(struct bpf_sk_lookup, local_ip4): > case bpf_ctx_range_till(struct bpf_sk_lookup, remote_ip6[0], remote_ip6[3]): > case bpf_ctx_range_till(struct bpf_sk_lookup, local_ip6[0], local_ip6[3]): > - case offsetof(struct bpf_sk_lookup, remote_port) ... 
> - offsetof(struct bpf_sk_lookup, local_ip4) - 1: > case bpf_ctx_range(struct bpf_sk_lookup, local_port): > case bpf_ctx_range(struct bpf_sk_lookup, ingress_ifindex): > bpf_ctx_record_field_size(info, sizeof(__u32)); > return bpf_ctx_narrow_access_ok(off, size, sizeof(__u32)); > > + case bpf_ctx_range(struct bpf_sk_lookup, remote_port): > + /* Allow 4-byte access to 2-byte field for backward compatibility */ > + if (size == sizeof(__u32)) > + return off == offsetof(struct bpf_sk_lookup, remote_port); > + bpf_ctx_record_field_size(info, sizeof(__be16)); > + return bpf_ctx_narrow_access_ok(off, size, sizeof(__be16)); > + > + case offsetofend(struct bpf_sk_lookup, remote_port) ... > + offsetof(struct bpf_sk_lookup, local_ip4) - 1: > + /* Allow access to zero padding for backward compatiblity */ > + bpf_ctx_record_field_size(info, sizeof(__u16)); > + return bpf_ctx_narrow_access_ok(off, size, sizeof(__u16)); > + > default: > return false; > } > @@ -10944,6 +10955,11 @@ static u32 sk_lookup_convert_ctx_access(enum bpf_access_type type, > sport, 2, target_size)); > break; > > + case offsetofend(struct bpf_sk_lookup, remote_port): > + *target_size = 2; > + *insn++ = BPF_MOV32_IMM(si->dst_reg, 0); > + break; > + > case offsetof(struct bpf_sk_lookup, local_port): > *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->src_reg, > bpf_target_off(struct bpf_sk_lookup_kern, ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH RFC bpf-next 1/3] bpf: Fix certain narrow loads with offsets 2022-03-14 17:35 ` Jakub Sitnicki @ 2022-03-14 18:25 ` Ilya Leoshkevich 2022-03-14 20:57 ` Jakub Sitnicki 0 siblings, 1 reply; 18+ messages in thread From: Ilya Leoshkevich @ 2022-03-14 18:25 UTC (permalink / raw) To: Jakub Sitnicki Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, Heiko Carstens, Vasily Gorbik On Mon, 2022-03-14 at 18:35 +0100, Jakub Sitnicki wrote: > On Thu, Mar 10, 2022 at 11:57 PM +01, Jakub Sitnicki wrote: > > On Wed, Mar 09, 2022 at 01:34 PM +01, Ilya Leoshkevich wrote: > > > On Wed, 2022-03-09 at 09:36 +0100, Jakub Sitnicki wrote: > > > > [...] > > > > > > > > > > Consider this - today the below is true on both LE and BE, > > > > right? > > > > > > > > *(u32 *)&ctx->remote_port == *(u16 *)&ctx->remote_port > > > > > > > > because the loads get converted to: > > > > > > > > *(u16 *)&ctx_kern->sport == *(u16 *)&ctx_kern->sport > > > > > > > > IOW, today, because of the bug that you are fixing here, the > > > > data > > > > layout > > > > changes from the PoV of the BPF program depending on the load > > > > size. > > > > > > > > With 2-byte loads, without this patch, the data layout appears > > > > as: > > > > > > > > struct bpf_sk_lookup { > > > > ... > > > > __be16 remote_port; > > > > __be16 remote_port; > > > > ... > > > > } > > > > > > I see, one can indeed argue that this is also a part of the ABI > > > now. > > > So we're stuck between a rock and a hard place. > > > > > > > While for 4-byte loads, it appears as in your 2nd patch: > > > > > > > > struct bpf_sk_lookup { > > > > ... > > > > #if little-endian > > > > __be16 remote_port; > > > > __u16 :16; /* zero padding */ > > > > #elif big-endian > > > > __u16 :16; /* zero padding */ > > > > __be16 remote_port; > > > > #endif > > > > ... 
> > > > } > > > > > > > > Because of that I don't see how we could keep complete ABI > > > > compatiblity, > > > > and have just one definition of struct bpf_sk_lookup that > > > > reflects > > > > it. These are conflicting requirements. > > > > > > > > I'd bite the bullet for 4-byte loads, for the sake of having an > > > > endian-agnostic struct bpf_sk_lookup and struct bpf_sock > > > > definition > > > > in > > > > the uAPI header. > > > > > > > > The sacrifice here is that the access converter will have to > > > > keep > > > > rewriting 4-byte access to bpf_sk_lookup.remote_port and > > > > bpf_sock.dst_port in this unexpected, quirky manner. > > > > > > > > The expectation is that with time users will recompile their > > > > BPF > > > > progs > > > > against the updated bpf.h, and switch to 2-byte loads. That > > > > will make > > > > the quirk in the access converter dead code in time. > > > > > > > > I don't have any better ideas. Sorry. > > > > > > > > [...] > > > > > > I agree, let's go ahead with this solution. > > > > > > The only remaining problem that I see is: the bug is in the > > > common > > > code, and it will affect the fields that we add in the future. > > > > > > Can we either document this state of things in a comment, or fix > > > the > > > bug and emulate the old behavior for certain fields? > > > > I think we can fix the bug in the common code, and compensate for > > the > > quirky 4-byte access to bpf_sk_lookup.remote_port and > > bpf_sock.dst_port > > in the is_valid_access and convert_ctx_access. 
> > > > With the patch as below, access to remote_port gets rewritten to: > > > > * size=1, offset=0, r2 = *(u8 *)(r1 +36) > > 0: (69) r2 = *(u16 *)(r1 +4) > > 1: (54) w2 &= 255 > > 2: (b7) r0 = 0 > > 3: (95) exit > > > > * size=1, offset=1, r2 = *(u8 *)(r1 +37) > > 0: (69) r2 = *(u16 *)(r1 +4) > > 1: (74) w2 >>= 8 > > 2: (54) w2 &= 255 > > 3: (b7) r0 = 0 > > 4: (95) exit > > > > * size=1, offset=2, r2 = *(u8 *)(r1 +38) > > 0: (b4) w2 = 0 > > 1: (54) w2 &= 255 > > 2: (b7) r0 = 0 > > 3: (95) exit > > > > * size=1, offset=3, r2 = *(u8 *)(r1 +39) > > 0: (b4) w2 = 0 > > 1: (74) w2 >>= 8 > > 2: (54) w2 &= 255 > > 3: (b7) r0 = 0 > > 4: (95) exit > > > > * size=2, offset=0, r2 = *(u16 *)(r1 +36) > > 0: (69) r2 = *(u16 *)(r1 +4) > > 1: (b7) r0 = 0 > > 2: (95) exit > > > > * size=2, offset=2, r2 = *(u16 *)(r1 +38) > > 0: (b4) w2 = 0 > > 1: (b7) r0 = 0 > > 2: (95) exit > > > > * size=4, offset=0, r2 = *(u32 *)(r1 +36) > > 0: (69) r2 = *(u16 *)(r1 +4) > > 1: (b7) r0 = 0 > > 2: (95) exit > > > > How does that look to you? > > > > Still need to give it a test on s390x. 
> > Context conversion with patch below applied looks correct to me on > s390x > as well: > > * size=1, offset=0, r2 = *(u8 *)(r1 +36) > 0: (69) r2 = *(u16 *)(r1 +4) > 1: (bc) w2 = w2 > 2: (74) w2 >>= 8 > 3: (bc) w2 = w2 > 4: (54) w2 &= 255 > 5: (bc) w2 = w2 > 6: (b7) r0 = 0 > 7: (95) exit > > * size=1, offset=1, r2 = *(u8 *)(r1 +37) > 0: (69) r2 = *(u16 *)(r1 +4) > 1: (bc) w2 = w2 > 2: (54) w2 &= 255 > 3: (bc) w2 = w2 > 4: (b7) r0 = 0 > 5: (95) exit > > * size=1, offset=2, r2 = *(u8 *)(r1 +38) > 0: (b4) w2 = 0 > 1: (bc) w2 = w2 > 2: (74) w2 >>= 8 > 3: (bc) w2 = w2 > 4: (54) w2 &= 255 > 5: (bc) w2 = w2 > 6: (b7) r0 = 0 > 7: (95) exit > > * size=1, offset=3, r2 = *(u8 *)(r1 +39) > 0: (b4) w2 = 0 > 1: (bc) w2 = w2 > 2: (54) w2 &= 255 > 3: (bc) w2 = w2 > 4: (b7) r0 = 0 > 5: (95) exit > > * size=2, offset=0, r2 = *(u16 *)(r1 +36) > 0: (69) r2 = *(u16 *)(r1 +4) > 1: (bc) w2 = w2 > 2: (b7) r0 = 0 > 3: (95) exit > > * size=2, offset=2, r2 = *(u16 *)(r1 +38) > 0: (b4) w2 = 0 > 1: (bc) w2 = w2 > 2: (b7) r0 = 0 > 3: (95) exit > > * size=4, offset=0, r2 = *(u32 *)(r1 +36) > 0: (69) r2 = *(u16 *)(r1 +4) > 1: (bc) w2 = w2 > 2: (b7) r0 = 0 > 3: (95) exit > > If we go this way, this should unbreak the bpf selftests on BE, > independently of the patch 1 from this series. > > Will send it as a patch, so that we continue the review discussion. I applied this patch to bpf-next, and while it looks reasonable, the test still fails, e.g. 
here: /* Load from remote_port field with zero padding (backward compatibility) */ val_u32 = *(__u32 *)&ctx->remote_port; if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << 16)) return SK_DROP; > > > > > --8<-- > > > > diff --git a/net/core/filter.c b/net/core/filter.c > > index 65869fd510e8..2625a1d2dabc 100644 > > --- a/net/core/filter.c > > +++ b/net/core/filter.c > > @@ -10856,13 +10856,24 @@ static bool sk_lookup_is_valid_access(int > > off, int size, > > case bpf_ctx_range(struct bpf_sk_lookup, local_ip4): > > case bpf_ctx_range_till(struct bpf_sk_lookup, > > remote_ip6[0], remote_ip6[3]): > > case bpf_ctx_range_till(struct bpf_sk_lookup, local_ip6[0], > > local_ip6[3]): > > - case offsetof(struct bpf_sk_lookup, remote_port) ... > > - offsetof(struct bpf_sk_lookup, local_ip4) - 1: > > case bpf_ctx_range(struct bpf_sk_lookup, local_port): > > case bpf_ctx_range(struct bpf_sk_lookup, ingress_ifindex): > > bpf_ctx_record_field_size(info, sizeof(__u32)); > > return bpf_ctx_narrow_access_ok(off, size, > > sizeof(__u32)); > > > > + case bpf_ctx_range(struct bpf_sk_lookup, remote_port): > > + /* Allow 4-byte access to 2-byte field for backward > > compatibility */ > > + if (size == sizeof(__u32)) > > + return off == offsetof(struct > > bpf_sk_lookup, remote_port); > > + bpf_ctx_record_field_size(info, sizeof(__be16)); > > + return bpf_ctx_narrow_access_ok(off, size, > > sizeof(__be16)); > > + > > + case offsetofend(struct bpf_sk_lookup, remote_port) ... 
> > + offsetof(struct bpf_sk_lookup, local_ip4) - 1: > > + /* Allow access to zero padding for backward > > compatiblity */ > > + bpf_ctx_record_field_size(info, sizeof(__u16)); > > + return bpf_ctx_narrow_access_ok(off, size, > > sizeof(__u16)); > > + > > default: > > return false; > > } > > @@ -10944,6 +10955,11 @@ static u32 > > sk_lookup_convert_ctx_access(enum bpf_access_type type, > > sport, 2, > > target_size)); > > break; > > > > + case offsetofend(struct bpf_sk_lookup, remote_port): > > + *target_size = 2; > > + *insn++ = BPF_MOV32_IMM(si->dst_reg, 0); > > + break; > > + > > case offsetof(struct bpf_sk_lookup, local_port): > > *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si- > > >src_reg, > > bpf_target_off(struct > > bpf_sk_lookup_kern, > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH RFC bpf-next 1/3] bpf: Fix certain narrow loads with offsets 2022-03-14 18:25 ` Ilya Leoshkevich @ 2022-03-14 20:57 ` Jakub Sitnicki 0 siblings, 0 replies; 18+ messages in thread From: Jakub Sitnicki @ 2022-03-14 20:57 UTC (permalink / raw) To: Ilya Leoshkevich Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, Heiko Carstens, Vasily Gorbik On Mon, Mar 14, 2022 at 07:25 PM +01, Ilya Leoshkevich wrote: > On Mon, 2022-03-14 at 18:35 +0100, Jakub Sitnicki wrote: >> On Thu, Mar 10, 2022 at 11:57 PM +01, Jakub Sitnicki wrote: >> > On Wed, Mar 09, 2022 at 01:34 PM +01, Ilya Leoshkevich wrote: >> > > On Wed, 2022-03-09 at 09:36 +0100, Jakub Sitnicki wrote: >> > >> > [...] >> > >> > > > >> > > > Consider this - today the below is true on both LE and BE, >> > > > right? >> > > > >> > > > *(u32 *)&ctx->remote_port == *(u16 *)&ctx->remote_port >> > > > >> > > > because the loads get converted to: >> > > > >> > > > *(u16 *)&ctx_kern->sport == *(u16 *)&ctx_kern->sport >> > > > >> > > > IOW, today, because of the bug that you are fixing here, the >> > > > data >> > > > layout >> > > > changes from the PoV of the BPF program depending on the load >> > > > size. >> > > > >> > > > With 2-byte loads, without this patch, the data layout appears >> > > > as: >> > > > >> > > > struct bpf_sk_lookup { >> > > > ... >> > > > __be16 remote_port; >> > > > __be16 remote_port; >> > > > ... >> > > > } >> > > >> > > I see, one can indeed argue that this is also a part of the ABI >> > > now. >> > > So we're stuck between a rock and a hard place. >> > > >> > > > While for 4-byte loads, it appears as in your 2nd patch: >> > > > >> > > > struct bpf_sk_lookup { >> > > > ... >> > > > #if little-endian >> > > > __be16 remote_port; >> > > > __u16 :16; /* zero padding */ >> > > > #elif big-endian >> > > > __u16 :16; /* zero padding */ >> > > > __be16 remote_port; >> > > > #endif >> > > > ... 
>> > > > } >> > > > >> > > > Because of that I don't see how we could keep complete ABI >> > > > compatiblity, >> > > > and have just one definition of struct bpf_sk_lookup that >> > > > reflects >> > > > it. These are conflicting requirements. >> > > > >> > > > I'd bite the bullet for 4-byte loads, for the sake of having an >> > > > endian-agnostic struct bpf_sk_lookup and struct bpf_sock >> > > > definition >> > > > in >> > > > the uAPI header. >> > > > >> > > > The sacrifice here is that the access converter will have to >> > > > keep >> > > > rewriting 4-byte access to bpf_sk_lookup.remote_port and >> > > > bpf_sock.dst_port in this unexpected, quirky manner. >> > > > >> > > > The expectation is that with time users will recompile their >> > > > BPF >> > > > progs >> > > > against the updated bpf.h, and switch to 2-byte loads. That >> > > > will make >> > > > the quirk in the access converter dead code in time. >> > > > >> > > > I don't have any better ideas. Sorry. >> > > > >> > > > [...] >> > > >> > > I agree, let's go ahead with this solution. >> > > >> > > The only remaining problem that I see is: the bug is in the >> > > common >> > > code, and it will affect the fields that we add in the future. >> > > >> > > Can we either document this state of things in a comment, or fix >> > > the >> > > bug and emulate the old behavior for certain fields? >> > >> > I think we can fix the bug in the common code, and compensate for >> > the >> > quirky 4-byte access to bpf_sk_lookup.remote_port and >> > bpf_sock.dst_port >> > in the is_valid_access and convert_ctx_access. 
>> >
>> > With the patch as below, access to remote_port gets rewritten to:
>> >
>> > * size=1, offset=0, r2 = *(u8 *)(r1 +36)
>> > 0: (69) r2 = *(u16 *)(r1 +4)
>> > 1: (54) w2 &= 255
>> > 2: (b7) r0 = 0
>> > 3: (95) exit
>> >
>> > * size=1, offset=1, r2 = *(u8 *)(r1 +37)
>> > 0: (69) r2 = *(u16 *)(r1 +4)
>> > 1: (74) w2 >>= 8
>> > 2: (54) w2 &= 255
>> > 3: (b7) r0 = 0
>> > 4: (95) exit
>> >
>> > * size=1, offset=2, r2 = *(u8 *)(r1 +38)
>> > 0: (b4) w2 = 0
>> > 1: (54) w2 &= 255
>> > 2: (b7) r0 = 0
>> > 3: (95) exit
>> >
>> > * size=1, offset=3, r2 = *(u8 *)(r1 +39)
>> > 0: (b4) w2 = 0
>> > 1: (74) w2 >>= 8
>> > 2: (54) w2 &= 255
>> > 3: (b7) r0 = 0
>> > 4: (95) exit
>> >
>> > * size=2, offset=0, r2 = *(u16 *)(r1 +36)
>> > 0: (69) r2 = *(u16 *)(r1 +4)
>> > 1: (b7) r0 = 0
>> > 2: (95) exit
>> >
>> > * size=2, offset=2, r2 = *(u16 *)(r1 +38)
>> > 0: (b4) w2 = 0
>> > 1: (b7) r0 = 0
>> > 2: (95) exit
>> >
>> > * size=4, offset=0, r2 = *(u32 *)(r1 +36)
>> > 0: (69) r2 = *(u16 *)(r1 +4)
>> > 1: (b7) r0 = 0
>> > 2: (95) exit
>> >
>> > How does that look to you?
>> >
>> > Still need to give it a test on s390x.
>>
>> Context conversion with patch below applied looks correct to me on
>> s390x as well:
>>
>> * size=1, offset=0, r2 = *(u8 *)(r1 +36)
>> 0: (69) r2 = *(u16 *)(r1 +4)
>> 1: (bc) w2 = w2
>> 2: (74) w2 >>= 8
>> 3: (bc) w2 = w2
>> 4: (54) w2 &= 255
>> 5: (bc) w2 = w2
>> 6: (b7) r0 = 0
>> 7: (95) exit
>>
>> * size=1, offset=1, r2 = *(u8 *)(r1 +37)
>> 0: (69) r2 = *(u16 *)(r1 +4)
>> 1: (bc) w2 = w2
>> 2: (54) w2 &= 255
>> 3: (bc) w2 = w2
>> 4: (b7) r0 = 0
>> 5: (95) exit
>>
>> * size=1, offset=2, r2 = *(u8 *)(r1 +38)
>> 0: (b4) w2 = 0
>> 1: (bc) w2 = w2
>> 2: (74) w2 >>= 8
>> 3: (bc) w2 = w2
>> 4: (54) w2 &= 255
>> 5: (bc) w2 = w2
>> 6: (b7) r0 = 0
>> 7: (95) exit
>>
>> * size=1, offset=3, r2 = *(u8 *)(r1 +39)
>> 0: (b4) w2 = 0
>> 1: (bc) w2 = w2
>> 2: (54) w2 &= 255
>> 3: (bc) w2 = w2
>> 4: (b7) r0 = 0
>> 5: (95) exit
>>
>> * size=2, offset=0, r2 = *(u16 *)(r1 +36)
>> 0: (69) r2 = *(u16 *)(r1 +4)
>> 1: (bc) w2 = w2
>> 2: (b7) r0 = 0
>> 3: (95) exit
>>
>> * size=2, offset=2, r2 = *(u16 *)(r1 +38)
>> 0: (b4) w2 = 0
>> 1: (bc) w2 = w2
>> 2: (b7) r0 = 0
>> 3: (95) exit
>>
>> * size=4, offset=0, r2 = *(u32 *)(r1 +36)
>> 0: (69) r2 = *(u16 *)(r1 +4)
>> 1: (bc) w2 = w2
>> 2: (b7) r0 = 0
>> 3: (95) exit
>>
>> If we go this way, this should unbreak the bpf selftests on BE,
>> independently of the patch 1 from this series.
>>
>> Will send it as a patch, so that we continue the review discussion.
>
> I applied this patch to bpf-next, and while it looks reasonable, the
> test still fails, e.g. here:
>
>	/* Load from remote_port field with zero padding (backward compatibility) */
>	val_u32 = *(__u32 *)&ctx->remote_port;
>	if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << 16))
>		return SK_DROP;
>

You are right. That's the check I recently added that broke the bpf CI
runs for s390x [1] and started our thread. I had a stale build of
test_progs with a fix akin to patch [2] that I was testing and missed
that. Thanks for giving it a run.
If we go with Martin's suggestion [3] here and avoid bpf_htonl(), then
we could make it work and save ourselves endianness checks.

IOW, a patch like below would be also needed to unbreak the tests.
First chunk is copied from your fixes to test_sk_lookup in patch 3 in
this RFC series, naturally.

[1] https://lore.kernel.org/bpf/CAEf4BzaRNLw9_EnaMo5e46CdEkzbJiVU3j9oxnsemBKjNFf3wQ@mail.gmail.com/
[2] https://lore.kernel.org/bpf/20220227202757.519015-4-jakub@cloudflare.com/
[3] https://lore.kernel.org/bpf/20220301062207.d3aqge5qg623asr6@kafai-mbp.dhcp.thefacebook.com/

---8<---
diff --git a/tools/testing/selftests/bpf/progs/test_sk_lookup.c b/tools/testing/selftests/bpf/progs/test_sk_lookup.c
index bf5b7caefdd0..2765a4bd500c 100644
--- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c
+++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c
@@ -413,15 +413,14 @@ int ctx_narrow_access(struct bpf_sk_lookup *ctx)
 
 	/* Narrow loads from remote_port field. Expect SRC_PORT. */
 	if (LSB(ctx->remote_port, 0) != ((SRC_PORT >> 0) & 0xff) ||
-	    LSB(ctx->remote_port, 1) != ((SRC_PORT >> 8) & 0xff) ||
-	    LSB(ctx->remote_port, 2) != 0 || LSB(ctx->remote_port, 3) != 0)
+	    LSB(ctx->remote_port, 1) != ((SRC_PORT >> 8) & 0xff))
 		return SK_DROP;
 	if (LSW(ctx->remote_port, 0) != SRC_PORT)
 		return SK_DROP;
 
 	/* Load from remote_port field with zero padding (backward compatibility) */
 	val_u32 = *(__u32 *)&ctx->remote_port;
-	if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << 16))
+	if (val_u32 != SRC_PORT)
 		return SK_DROP;
 
 	/* Narrow loads from local_port field. Expect DST_PORT. */

^ permalink raw reply related [flat|nested] 18+ messages in thread
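[Editor's note: the x86_64 and s390x instruction dumps compared in this message can be modelled in ordinary userspace C. The sketch below is an emulation written for this summary; it mirrors the logic of the kernel's bpf_ctx_narrow_access_offset() helper with the 2-byte sport field as the load target, and is not kernel code.]

```c
#include <assert.h>
#include <stdint.h>

/*
 * Emulate a narrow load of `size` bytes at byte offset `off` within the
 * 4-byte remote_port context slot, where the kernel backing field
 * (sport) is only 2 bytes wide and offsets past it read as zero.
 * The shift amount depends on the target's byte order, which is why the
 * x86_64 and s390x dumps differ.
 */
static uint32_t narrow_load(uint16_t sport, unsigned int off,
			    unsigned int size, int big_endian)
{
	const unsigned int size_default = 2;	/* sizeof(sport) */
	unsigned int access_off = off & (size_default - 1);
	uint32_t val = (off % 4) < size_default ? sport : 0;

	if (size < size_default) {
		unsigned int shift = big_endian ?
			(size_default - access_off - size) * 8 :
			access_off * 8;

		val >>= shift;			/* in-field byte select */
		val &= (1u << (size * 8)) - 1;	/* mask to the load size */
	}
	return val;
}
```

For instance, with sport = 0x1f48, narrow_load(0x1f48, 1, 1, 0) yields 0x1f (load, shift by 8, mask), while narrow_load(0x1f48, 1, 1, 1) yields 0x48, matching the little- and big-endian sequences quoted above.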
* [PATCH RFC bpf-next 2/3] bpf: Fix bpf_sk_lookup.remote_port on big-endian 2022-02-22 18:25 [PATCH RFC bpf-next 0/3] bpf_sk_lookup.remote_port fixes Ilya Leoshkevich 2022-02-22 18:25 ` [PATCH RFC bpf-next 1/3] bpf: Fix certain narrow loads with offsets Ilya Leoshkevich @ 2022-02-22 18:25 ` Ilya Leoshkevich 2022-02-27 2:44 ` Alexei Starovoitov 2022-02-22 18:25 ` [PATCH RFC bpf-next 3/3] selftests/bpf: Adapt bpf_sk_lookup.remote_port loads Ilya Leoshkevich 2 siblings, 1 reply; 18+ messages in thread From: Ilya Leoshkevich @ 2022-02-22 18:25 UTC (permalink / raw) To: Jakub Sitnicki, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich On big-endian, the port is available in the second __u16, not the first one. Therefore, provide a big-endian-specific definition that reflects that. Also, define remote_port_compat in order to have nicer architecture-agnostic code in the verifier and in tests. Fixes: 9a69e2b385f4 ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> --- include/uapi/linux/bpf.h | 17 +++++++++++++++-- net/core/filter.c | 5 ++--- tools/include/uapi/linux/bpf.h | 17 +++++++++++++++-- 3 files changed, 32 insertions(+), 7 deletions(-) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index afe3d0d7f5f2..7b0e5efa58e0 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -10,6 +10,7 @@ #include <linux/types.h> #include <linux/bpf_common.h> +#include <asm/byteorder.h> /* Extended instruction set based on top of classic BPF */ @@ -6453,8 +6454,20 @@ struct bpf_sk_lookup { __u32 protocol; /* IP protocol (IPPROTO_TCP, IPPROTO_UDP) */ __u32 remote_ip4; /* Network byte order */ __u32 remote_ip6[4]; /* Network byte order */ - __be16 remote_port; /* Network byte order */ - __u16 :16; /* Zero padding */ + union { + struct { +#if defined(__BYTE_ORDER) ? 
__BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN) + __be16 remote_port; /* Network byte order */ + __u16 :16; /* Zero padding */ +#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN) + __u16 :16; /* Zero padding */ + __be16 remote_port; /* Network byte order */ +#else +#error unspecified endianness +#endif + }; + __u32 remote_port_compat; + }; __u32 local_ip4; /* Network byte order */ __u32 local_ip6[4]; /* Network byte order */ __u32 local_port; /* Host byte order */ diff --git a/net/core/filter.c b/net/core/filter.c index 65869fd510e8..4b247d5aebe8 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -10856,8 +10856,7 @@ static bool sk_lookup_is_valid_access(int off, int size, case bpf_ctx_range(struct bpf_sk_lookup, local_ip4): case bpf_ctx_range_till(struct bpf_sk_lookup, remote_ip6[0], remote_ip6[3]): case bpf_ctx_range_till(struct bpf_sk_lookup, local_ip6[0], local_ip6[3]): - case offsetof(struct bpf_sk_lookup, remote_port) ... - offsetof(struct bpf_sk_lookup, local_ip4) - 1: + case bpf_ctx_range(struct bpf_sk_lookup, remote_port_compat): case bpf_ctx_range(struct bpf_sk_lookup, local_port): case bpf_ctx_range(struct bpf_sk_lookup, ingress_ifindex): bpf_ctx_record_field_size(info, sizeof(__u32)); @@ -10938,7 +10937,7 @@ static u32 sk_lookup_convert_ctx_access(enum bpf_access_type type, #endif break; } - case offsetof(struct bpf_sk_lookup, remote_port): + case offsetof(struct bpf_sk_lookup, remote_port_compat): *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->src_reg, bpf_target_off(struct bpf_sk_lookup_kern, sport, 2, target_size)); diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index afe3d0d7f5f2..7b0e5efa58e0 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -10,6 +10,7 @@ #include <linux/types.h> #include <linux/bpf_common.h> +#include <asm/byteorder.h> /* Extended instruction set based on top of classic BPF */ @@ -6453,8 +6454,20 @@ struct 
bpf_sk_lookup { __u32 protocol; /* IP protocol (IPPROTO_TCP, IPPROTO_UDP) */ __u32 remote_ip4; /* Network byte order */ __u32 remote_ip6[4]; /* Network byte order */ - __be16 remote_port; /* Network byte order */ - __u16 :16; /* Zero padding */ + union { + struct { +#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN) + __be16 remote_port; /* Network byte order */ + __u16 :16; /* Zero padding */ +#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN) + __u16 :16; /* Zero padding */ + __be16 remote_port; /* Network byte order */ +#else +#error unspecified endianness +#endif + }; + __u32 remote_port_compat; + }; __u32 local_ip4; /* Network byte order */ __u32 local_ip6[4]; /* Network byte order */ __u32 local_port; /* Host byte order */ -- 2.34.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
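[Editor's note: the layout question this patch answers can be made concrete with a small userspace model, written for illustration and not part of the patch. A 16-bit network-byte-order port stored in the first two bytes of a 4-byte slot reads back into different halves of a 32-bit word depending on the host byte order, which is the reason the struct needs per-endianness variants.]

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the remote_port + padding pair: the port
 * sits in network byte order in the first two bytes of a 4-byte slot,
 * followed by zero padding. */
static void make_ctx(uint8_t bytes[4], uint16_t host_port)
{
	bytes[0] = (uint8_t)(host_port >> 8);	/* network byte order */
	bytes[1] = (uint8_t)host_port;
	bytes[2] = 0;				/* zero padding */
	bytes[3] = 0;
}

/* A full 4-byte host-order load of the slot, as a u32 access to
 * the remote_port offset would see it. */
static uint32_t load_word(const uint8_t bytes[4])
{
	uint32_t w;

	memcpy(&w, bytes, sizeof(w));
	return w;
}

/* Returns 1 if this host is little-endian. */
static int host_is_le(void)
{
	uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 1;
}
```

On a little-endian host the port lands in the low half of the loaded word and the padding in the high half; on big-endian the halves swap, matching the two branches of the #if above.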
* Re: [PATCH RFC bpf-next 2/3] bpf: Fix bpf_sk_lookup.remote_port on big-endian 2022-02-22 18:25 ` [PATCH RFC bpf-next 2/3] bpf: Fix bpf_sk_lookup.remote_port on big-endian Ilya Leoshkevich @ 2022-02-27 2:44 ` Alexei Starovoitov 2022-02-27 20:30 ` Jakub Sitnicki 0 siblings, 1 reply; 18+ messages in thread From: Alexei Starovoitov @ 2022-02-27 2:44 UTC (permalink / raw) To: Ilya Leoshkevich Cc: Jakub Sitnicki, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, Heiko Carstens, Vasily Gorbik On Tue, Feb 22, 2022 at 07:25:58PM +0100, Ilya Leoshkevich wrote: > On big-endian, the port is available in the second __u16, not the first > one. Therefore, provide a big-endian-specific definition that reflects > that. Also, define remote_port_compat in order to have nicer > architecture-agnostic code in the verifier and in tests. > > Fixes: 9a69e2b385f4 ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide") > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> > --- > include/uapi/linux/bpf.h | 17 +++++++++++++++-- > net/core/filter.c | 5 ++--- > tools/include/uapi/linux/bpf.h | 17 +++++++++++++++-- > 3 files changed, 32 insertions(+), 7 deletions(-) > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index afe3d0d7f5f2..7b0e5efa58e0 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -10,6 +10,7 @@ > > #include <linux/types.h> > #include <linux/bpf_common.h> > +#include <asm/byteorder.h> > > /* Extended instruction set based on top of classic BPF */ > > @@ -6453,8 +6454,20 @@ struct bpf_sk_lookup { > __u32 protocol; /* IP protocol (IPPROTO_TCP, IPPROTO_UDP) */ > __u32 remote_ip4; /* Network byte order */ > __u32 remote_ip6[4]; /* Network byte order */ > - __be16 remote_port; /* Network byte order */ > - __u16 :16; /* Zero padding */ > + union { > + struct { > +#if defined(__BYTE_ORDER) ? 
__BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN) > + __be16 remote_port; /* Network byte order */ > + __u16 :16; /* Zero padding */ > +#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN) > + __u16 :16; /* Zero padding */ > + __be16 remote_port; /* Network byte order */ > +#else > +#error unspecified endianness > +#endif > + }; > + __u32 remote_port_compat; Sorry this hack is not an option. Don't have any suggestions at this point. Pls come up with something else. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH RFC bpf-next 2/3] bpf: Fix bpf_sk_lookup.remote_port on big-endian 2022-02-27 2:44 ` Alexei Starovoitov @ 2022-02-27 20:30 ` Jakub Sitnicki 2022-02-28 10:19 ` Ilya Leoshkevich 0 siblings, 1 reply; 18+ messages in thread From: Jakub Sitnicki @ 2022-02-27 20:30 UTC (permalink / raw) To: Alexei Starovoitov Cc: Ilya Leoshkevich, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, Heiko Carstens, Vasily Gorbik On Sat, Feb 26, 2022 at 06:44 PM -08, Alexei Starovoitov wrote: > On Tue, Feb 22, 2022 at 07:25:58PM +0100, Ilya Leoshkevich wrote: >> On big-endian, the port is available in the second __u16, not the first >> one. Therefore, provide a big-endian-specific definition that reflects >> that. Also, define remote_port_compat in order to have nicer >> architecture-agnostic code in the verifier and in tests. >> >> Fixes: 9a69e2b385f4 ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide") >> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> >> --- >> include/uapi/linux/bpf.h | 17 +++++++++++++++-- >> net/core/filter.c | 5 ++--- >> tools/include/uapi/linux/bpf.h | 17 +++++++++++++++-- >> 3 files changed, 32 insertions(+), 7 deletions(-) >> >> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h >> index afe3d0d7f5f2..7b0e5efa58e0 100644 >> --- a/include/uapi/linux/bpf.h >> +++ b/include/uapi/linux/bpf.h >> @@ -10,6 +10,7 @@ >> >> #include <linux/types.h> >> #include <linux/bpf_common.h> >> +#include <asm/byteorder.h> >> >> /* Extended instruction set based on top of classic BPF */ >> >> @@ -6453,8 +6454,20 @@ struct bpf_sk_lookup { >> __u32 protocol; /* IP protocol (IPPROTO_TCP, IPPROTO_UDP) */ >> __u32 remote_ip4; /* Network byte order */ >> __u32 remote_ip6[4]; /* Network byte order */ >> - __be16 remote_port; /* Network byte order */ >> - __u16 :16; /* Zero padding */ >> + union { >> + struct { >> +#if defined(__BYTE_ORDER) ? 
__BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN) >> + __be16 remote_port; /* Network byte order */ >> + __u16 :16; /* Zero padding */ >> +#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN) >> + __u16 :16; /* Zero padding */ >> + __be16 remote_port; /* Network byte order */ >> +#else >> +#error unspecified endianness >> +#endif >> + }; >> + __u32 remote_port_compat; > > Sorry this hack is not an option. > Don't have any suggestions at this point. Pls come up with something else. I think we can keep the bpf_sk_lookup definition as is, if we leave the 4-byte load from remote_port offset quirky behavior on little-endian. Please take a look at the test fix I've posted for 4-byte load from bpf_sock dst_port that works for me on x86_64 and s390. It is exactly the same case as we're dealing with here: https://lore.kernel.org/bpf/20220227202757.519015-4-jakub@cloudflare.com/T/#u ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH RFC bpf-next 2/3] bpf: Fix bpf_sk_lookup.remote_port on big-endian 2022-02-27 20:30 ` Jakub Sitnicki @ 2022-02-28 10:19 ` Ilya Leoshkevich 2022-02-28 13:26 ` Jakub Sitnicki 0 siblings, 1 reply; 18+ messages in thread From: Ilya Leoshkevich @ 2022-02-28 10:19 UTC (permalink / raw) To: Jakub Sitnicki, Alexei Starovoitov Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, Heiko Carstens, Vasily Gorbik On Sun, 2022-02-27 at 21:30 +0100, Jakub Sitnicki wrote: > > On Sat, Feb 26, 2022 at 06:44 PM -08, Alexei Starovoitov wrote: > > On Tue, Feb 22, 2022 at 07:25:58PM +0100, Ilya Leoshkevich wrote: > > > On big-endian, the port is available in the second __u16, not the > > > first > > > one. Therefore, provide a big-endian-specific definition that > > > reflects > > > that. Also, define remote_port_compat in order to have nicer > > > architecture-agnostic code in the verifier and in tests. > > > > > > Fixes: 9a69e2b385f4 ("bpf: Make remote_port field in struct > > > bpf_sk_lookup 16-bit wide") > > > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> > > > --- > > > include/uapi/linux/bpf.h | 17 +++++++++++++++-- > > > net/core/filter.c | 5 ++--- > > > tools/include/uapi/linux/bpf.h | 17 +++++++++++++++-- > > > 3 files changed, 32 insertions(+), 7 deletions(-) > > > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > > > index afe3d0d7f5f2..7b0e5efa58e0 100644 > > > --- a/include/uapi/linux/bpf.h > > > +++ b/include/uapi/linux/bpf.h > > > @@ -10,6 +10,7 @@ > > > > > > #include <linux/types.h> > > > #include <linux/bpf_common.h> > > > +#include <asm/byteorder.h> > > > > > > /* Extended instruction set based on top of classic BPF */ > > > > > > @@ -6453,8 +6454,20 @@ struct bpf_sk_lookup { > > > __u32 protocol; /* IP protocol (IPPROTO_TCP, > > > IPPROTO_UDP) */ > > > __u32 remote_ip4; /* Network byte order */ > > > __u32 remote_ip6[4]; /* Network byte order */ > > > - __be16 remote_port; /* Network byte order */ > > > - __u16 :16; 
/* Zero padding */ > > > + union { > > > + struct { > > > +#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN : > > > defined(__LITTLE_ENDIAN) > > > + __be16 remote_port; /* Network byte > > > order */ > > > + __u16 :16; /* Zero padding > > > */ > > > +#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : > > > defined(__BIG_ENDIAN) > > > + __u16 :16; /* Zero padding > > > */ > > > + __be16 remote_port; /* Network byte > > > order */ > > > +#else > > > +#error unspecified endianness > > > +#endif > > > + }; > > > + __u32 remote_port_compat; > > > > Sorry this hack is not an option. > > Don't have any suggestions at this point. Pls come up with > > something else. > > I think we can keep the bpf_sk_lookup definition as is, if we leave > the > 4-byte load from remote_port offset quirky behavior on little-endian. > > Please take a look at the test fix I've posted for 4-byte load from > bpf_sock dst_port that works for me on x86_64 and s390. It is exactly > the same case as we're dealing with here: > > https://lore.kernel.org/bpf/20220227202757.519015-4-jakub@cloudflare.com/T/#u > What about 2-byte loads? static __noinline bool sk_dst_port__load_half(struct bpf_sock *sk) { __u16 *half = (__u16 *)&sk->dst_port; return half[0] == bpf_htons(0xcafe); } requires "ca fe ?? ??" in memory on BE, while static __noinline bool sk_dst_port__load_word(struct bpf_sock *sk) { #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ const __u8 SHIFT = 16; #elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ const __u8 SHIFT = 0; #else #error "Unrecognized __BYTE_ORDER__" #endif __u32 *word = (__u32 *)&sk->dst_port; return word[0] == bpf_htonl(0xcafe << SHIFT); } requires "00 00 ca fe". This is inconsistent. Furthermore, one cannot see it with bpf_sock thanks to case offsetofend(struct bpf_sock, dst_port) ... offsetof(struct bpf_sock, dst_ip4) - 1: return false; however, with sk_lookup this is the case: loading the most significant half of the port produces non-zero! 
So, it's not simply a quirkiness of the 4-byte load, it's a mutual inconsistency between LSW loads, MSW loads and 4-byte loads. One might argue that we can live with that, especially since all the user-relevant tests pass - here I can only say that an inconsistency on such a fundamental level makes me nervous. In order to resolve this inconsistency I've implemented patch 1 of this series. With that, "sk->dst_port == bpf_htons(0xcafe)" starts to fail, and that's where one needs something like this patch. Best regards, Ilya ^ permalink raw reply [flat|nested] 18+ messages in thread
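[Editor's note: the half-word vs word conflict described in this message can be checked without any BPF. The sketch below, illustrative and not from the thread, models big-endian loads of a 4-byte slot laid out the way the 2-byte check requires ("ca fe ?? ??") and shows that the 4-byte check, which expects "00 00 ca fe" on big-endian, cannot pass against the same bytes.]

```c
#include <assert.h>
#include <stdint.h>

/* Byte image that the 2-byte check requires on big-endian:
 * the port 0xcafe in the first half-word. */
static const uint8_t ctx_bytes[4] = { 0xca, 0xfe, 0x00, 0x00 };

/* Model a big-endian 2-byte load of the first half-word. */
static uint16_t be_load_half(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Model a big-endian 4-byte load of the whole slot. */
static uint32_t be_load_word(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

One memory image cannot satisfy both checks at once, which is the mutual inconsistency between 2-byte and 4-byte loads being pointed out.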
* Re: [PATCH RFC bpf-next 2/3] bpf: Fix bpf_sk_lookup.remote_port on big-endian 2022-02-28 10:19 ` Ilya Leoshkevich @ 2022-02-28 13:26 ` Jakub Sitnicki 2022-03-01 0:39 ` Ilya Leoshkevich 2022-03-01 0:40 ` Ilya Leoshkevich 0 siblings, 2 replies; 18+ messages in thread From: Jakub Sitnicki @ 2022-02-28 13:26 UTC (permalink / raw) To: Ilya Leoshkevich Cc: Alexei Starovoitov, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, Heiko Carstens, Vasily Gorbik On Mon, Feb 28, 2022 at 11:19 AM +01, Ilya Leoshkevich wrote: > In order to resolve this inconsistency I've implemented patch 1 of this > series. With that, "sk->dst_port == bpf_htons(0xcafe)" starts to fail, > and that's where one needs something like this patch. Truth be told I can't reproduce the said failure. I've extended the test with an additional check: 306 bool ok = sk->dst_port == bpf_htons(0xcafe); 307 if (!ok) 308 RET_LOG(); 309 if (!sk_dst_port__load_word(sk)) 310 RET_LOG(); ... but it translates to the same BPF assembly as sk_dst_port__load_half. That is: ; bool ok = sk->dst_port == bpf_htons(0xcafe); 9: (69) r1 = *(u16 *)(r6 +12) 10: (bc) w1 = w1 ; if (!ok) 11: (16) if w1 == 0xcafe goto pc+2 12: (b4) w1 = 308 13: (05) goto pc+14 During the test I had patch 1 from this series applied on top of [1]. The latter series should not matter, though, I didn't touch the access converter. Mind sharing what does the `bpftool prog objdump` output look like for the failing test on your side? [1] https://lore.kernel.org/bpf/20220227202757.519015-1-jakub@cloudflare.com/ ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH RFC bpf-next 2/3] bpf: Fix bpf_sk_lookup.remote_port on big-endian 2022-02-28 13:26 ` Jakub Sitnicki @ 2022-03-01 0:39 ` Ilya Leoshkevich 2022-03-01 0:40 ` Ilya Leoshkevich 1 sibling, 0 replies; 18+ messages in thread From: Ilya Leoshkevich @ 2022-03-01 0:39 UTC (permalink / raw) To: Jakub Sitnicki Cc: Alexei Starovoitov, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, Heiko Carstens, Vasily Gorbik On Mon, 2022-02-28 at 14:26 +0100, Jakub Sitnicki wrote: > On Mon, Feb 28, 2022 at 11:19 AM +01, Ilya Leoshkevich wrote: > > In order to resolve this inconsistency I've implemented patch 1 of > > this > > series. With that, "sk->dst_port == bpf_htons(0xcafe)" starts to > > fail, > > and that's where one needs something like this patch. > > Truth be told I can't reproduce the said failure. > I've extended the test with an additional check: > > 306 bool ok = sk->dst_port == bpf_htons(0xcafe); > 307 if (!ok) > 308 RET_LOG(); > 309 if (!sk_dst_port__load_word(sk)) > 310 RET_LOG(); > > ... but it translates to the same BPF assembly as > sk_dst_port__load_half. That is: > > ; bool ok = sk->dst_port == bpf_htons(0xcafe); > 9: (69) r1 = *(u16 *)(r6 +12) > 10: (bc) w1 = w1 > ; if (!ok) > 11: (16) if w1 == 0xcafe goto pc+2 > 12: (b4) w1 = 308 > 13: (05) goto pc+14 > > During the test I had patch 1 from this series applied on top of [1]. > The latter series should not matter, though, I didn't touch the > access > converter. > > Mind sharing what does the `bpftool prog objdump` output look like > for > the failing test on your side? > > [1] > https://lore.kernel.org/bpf/20220227202757.519015-1-jakub@cloudflare.com/ > Sure. As I mentioned, it's best demonstrated with sk_lookup, so that's what I'll be using. 
Apply this on top of bpf-next: --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c @@ -392,7 +392,7 @@ int ctx_narrow_access(struct bpf_sk_lookup *ctx) { struct bpf_sock *sk; int err, family; - __u32 val_u32; + __u32 *ptr_u32; bool v4; v4 = (ctx->family == AF_INET); @@ -411,17 +411,12 @@ int ctx_narrow_access(struct bpf_sk_lookup *ctx) if (LSW(ctx->protocol, 0) != IPPROTO_TCP) return SK_DROP; - /* Narrow loads from remote_port field. Expect SRC_PORT. */ - if (LSB(ctx->remote_port, 0) != ((SRC_PORT >> 0) & 0xff) || - LSB(ctx->remote_port, 1) != ((SRC_PORT >> 8) & 0xff) || - LSB(ctx->remote_port, 2) != 0 || LSB(ctx->remote_port, 3) != 0) + ptr_u32 = (__u32 *)&ctx->remote_port; + if (LSW(*ptr_u32, 0) != SRC_PORT) return SK_DROP; - if (LSW(ctx->remote_port, 0) != SRC_PORT) + if (LSW(*ptr_u32, 1) != 0) return SK_DROP; - - /* Load from remote_port field with zero padding (backward compatibility) */ - val_u32 = *(__u32 *)&ctx->remote_port; - if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << 16)) + if (*ptr_u32 != SRC_PORT) return SK_DROP; /* Narrow loads from local_port field. Expect DST_PORT. 
*/ The checks look as follows: $ llvm-objdump -d -l tools/testing/selftests/bpf/test_sk_lookup.o ; if (LSW(*ptr_u32, 0) != SRC_PORT) 370: 69 26 00 26 00 00 00 00 r2 = *(u16 *)(r6 + 38) 371: 56 20 01 00 00 00 1f 48 if w2 != 8008 goto +256 <LBB15_135> ; if (LSW(*ptr_u32, 1) != 0) 372: 69 26 00 24 00 00 00 00 r2 = *(u16 *)(r6 + 36) 373: 56 20 00 fe 00 00 00 00 if w2 != 0 goto +254 <LBB15_135> ; if (*ptr_u32 != SRC_PORT) 374: 61 26 00 24 00 00 00 00 r2 = *(u32 *)(r6 + 36) 375: 56 20 00 fc 00 00 1f 48 if w2 != 8008 goto +252 <LBB15_135> After loading: # tools/testing/selftests/bpf/tools/sbin/bpftool prog dump xlated id 141 ; if (LSW(*ptr_u32, 0) != SRC_PORT) 49: (69) r2 = *(u16 *)(r6 +4) 50: (bc) w2 = w2 51: (56) if w2 != 0x1f48 goto pc+938 ; if (LSW(*ptr_u32, 1) != 0) 52: (69) r2 = *(u16 *)(r6 +4) 53: (bc) w2 = w2 54: (56) if w2 != 0x0 goto pc+935 ; if (*ptr_u32 != SRC_PORT) 55: (69) r2 = *(u16 *)(r6 +4) 56: (bc) w2 = w2 57: (56) if w2 != 0x1f48 goto pc+932 Note that both LSW(0) and LSW(1) loads result in the same code after rewriting. This is because the offset is ignored for this particular combination of sizes. Best regards, Ilya ^ permalink raw reply [flat|nested] 18+ messages in thread
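[Editor's note on notation: the LSB() and LSW() helpers in the dumps above come from test_sk_lookup.c. The stand-in definitions below are a reconstruction assumed from their usage, not a quote from the selftest: LSB(v, i) selects the i-th least significant byte of v and LSW(v, i) the i-th least significant 16-bit word, flipping the array index on big-endian hosts so the result is endian-independent.]

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-ins for the LSB()/LSW() helpers used by test_sk_lookup.c
 * (bodies assumed for illustration): index bytes/half-words from the
 * least significant end of the value, whatever the host byte order.
 */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define LSB(value, index) (((uint8_t *)&(value))[index] & 0xff)
#define LSW(value, index) (((uint16_t *)&(value))[index] & 0xffff)
#else
#define LSB(value, index) \
	(((uint8_t *)&(value))[sizeof(value) - (index) - 1] & 0xff)
#define LSW(value, index) \
	(((uint16_t *)&(value))[sizeof(value) / 2 - (index) - 1] & 0xffff)
#endif
```

If the selftest helpers do flip the index this way, that would explain why, in the big-endian object dump above, LSW(*ptr_u32, 0) compiles to a load at r6 + 38 (the second half-word of the 4-byte slot) while LSW(*ptr_u32, 1) loads r6 + 36.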
* [PATCH RFC bpf-next 3/3] selftests/bpf: Adapt bpf_sk_lookup.remote_port loads 2022-02-22 18:25 [PATCH RFC bpf-next 0/3] bpf_sk_lookup.remote_port fixes Ilya Leoshkevich 2022-02-22 18:25 ` [PATCH RFC bpf-next 1/3] bpf: Fix certain narrow loads with offsets Ilya Leoshkevich 2022-02-22 18:25 ` [PATCH RFC bpf-next 2/3] bpf: Fix bpf_sk_lookup.remote_port on big-endian Ilya Leoshkevich @ 2022-02-22 18:25 ` Ilya Leoshkevich 2 siblings, 0 replies; 18+ messages in thread From: Ilya Leoshkevich @ 2022-02-22 18:25 UTC (permalink / raw) To: Jakub Sitnicki, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich - Remove some remote_port tests that are not applicable to u16. - Use remote_port_compat for backward compatibility tests. - Add LSB/LSW backward compatibility tests. - u32 load produces SRC_PORT on both little- and big-endian machines, so check just that. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> --- .../selftests/bpf/progs/test_sk_lookup.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/bpf/progs/test_sk_lookup.c b/tools/testing/selftests/bpf/progs/test_sk_lookup.c index bf5b7caefdd0..7106fedfd2cc 100644 --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c @@ -392,7 +392,7 @@ int ctx_narrow_access(struct bpf_sk_lookup *ctx) { struct bpf_sock *sk; int err, family; - __u32 val_u32; + __u32 *ptr_u32; bool v4; v4 = (ctx->family == AF_INET); @@ -413,15 +413,20 @@ int ctx_narrow_access(struct bpf_sk_lookup *ctx) /* Narrow loads from remote_port field. Expect SRC_PORT. 
*/ if (LSB(ctx->remote_port, 0) != ((SRC_PORT >> 0) & 0xff) || - LSB(ctx->remote_port, 1) != ((SRC_PORT >> 8) & 0xff) || - LSB(ctx->remote_port, 2) != 0 || LSB(ctx->remote_port, 3) != 0) + LSB(ctx->remote_port, 1) != ((SRC_PORT >> 8) & 0xff)) return SK_DROP; - if (LSW(ctx->remote_port, 0) != SRC_PORT) + if (ctx->remote_port != SRC_PORT) return SK_DROP; /* Load from remote_port field with zero padding (backward compatibility) */ - val_u32 = *(__u32 *)&ctx->remote_port; - if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << 16)) + ptr_u32 = &ctx->remote_port_compat; + if (LSB(*ptr_u32, 0) != ((SRC_PORT >> 0) & 0xff) || + LSB(*ptr_u32, 1) != ((SRC_PORT >> 8) & 0xff) || + LSB(*ptr_u32, 2) != 0 || LSB(*ptr_u32, 3) != 0) + return SK_DROP; + if (LSW(*ptr_u32, 0) != SRC_PORT || LSW(*ptr_u32, 1) != 0) + return SK_DROP; + if (*ptr_u32 != SRC_PORT) return SK_DROP; /* Narrow loads from local_port field. Expect DST_PORT. */ -- 2.34.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
end of thread, newest: 2022-03-14 21:49 UTC

Thread overview: 18+ messages
2022-02-22 18:25 [PATCH RFC bpf-next 0/3] bpf_sk_lookup.remote_port fixes Ilya Leoshkevich
2022-02-22 18:25 ` [PATCH RFC bpf-next 1/3] bpf: Fix certain narrow loads with offsets Ilya Leoshkevich
2022-03-08 15:01   ` Jakub Sitnicki
2022-03-08 23:58     ` Ilya Leoshkevich
2022-03-09  8:36       ` Jakub Sitnicki
2022-03-09 12:34         ` Ilya Leoshkevich
2022-03-10 22:57           ` Jakub Sitnicki
2022-03-14 17:35             ` Jakub Sitnicki
2022-03-14 18:25               ` Ilya Leoshkevich
2022-03-14 20:57                 ` Jakub Sitnicki
2022-02-22 18:25 ` [PATCH RFC bpf-next 2/3] bpf: Fix bpf_sk_lookup.remote_port on big-endian Ilya Leoshkevich
2022-02-27  2:44   ` Alexei Starovoitov
2022-02-27 20:30     ` Jakub Sitnicki
2022-02-28 10:19       ` Ilya Leoshkevich
2022-02-28 13:26         ` Jakub Sitnicki
2022-03-01  0:39           ` Ilya Leoshkevich
2022-03-01  0:40           ` Ilya Leoshkevich
2022-02-22 18:25 ` [PATCH RFC bpf-next 3/3] selftests/bpf: Adapt bpf_sk_lookup.remote_port loads Ilya Leoshkevich