All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jakub Sitnicki <jakub@cloudflare.com>
To: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	bpf@vger.kernel.org, Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>
Subject: Re: [PATCH RFC bpf-next 1/3] bpf: Fix certain narrow loads with offsets
Date: Tue, 08 Mar 2022 16:01:05 +0100	[thread overview]
Message-ID: <87bkygzbg5.fsf@cloudflare.com> (raw)
In-Reply-To: <20220222182559.2865596-2-iii@linux.ibm.com>

On Tue, Feb 22, 2022 at 07:25 PM +01, Ilya Leoshkevich wrote:
> Verifier treats bpf_sk_lookup.remote_port as a 32-bit field for
> backward compatibility, regardless of what the uapi headers say.
> This field is mapped onto the 16-bit bpf_sk_lookup_kern.sport field.
> Therefore, accessing the most significant 16 bits of
> bpf_sk_lookup.remote_port must produce 0, which is currently not
> the case.
>
> The problem is that narrow loads with offset - commit 46f53a65d2de
> ("bpf: Allow narrow loads with offset > 0"), don't play nicely with
> the masking optimization - commit 239946314e57 ("bpf: possibly avoid
> extra masking for narrower load in verifier"). In particular, when we
> suppress extra masking, we suppress shifting as well, which is not
> correct.
>
> Fix by moving the masking suppression check to BPF_AND generation.
>
> Fixes: 46f53a65d2de ("bpf: Allow narrow loads with offset > 0")
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---
>  kernel/bpf/verifier.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index d7473fee247c..195f2e9b5a47 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -12848,7 +12848,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
>  			return -EINVAL;
>  		}
>  
> -		if (is_narrower_load && size < target_size) {
> +		if (is_narrower_load) {
>  			u8 shift = bpf_ctx_narrow_access_offset(
>  				off, size, size_default) * 8;
>  			if (shift && cnt + 1 >= ARRAY_SIZE(insn_buf)) {
> @@ -12860,15 +12860,19 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
>  					insn_buf[cnt++] = BPF_ALU32_IMM(BPF_RSH,
>  									insn->dst_reg,
>  									shift);
> -				insn_buf[cnt++] = BPF_ALU32_IMM(BPF_AND, insn->dst_reg,
> -								(1 << size * 8) - 1);
> +				if (size < target_size)
> +					insn_buf[cnt++] = BPF_ALU32_IMM(
> +						BPF_AND, insn->dst_reg,
> +						(1 << size * 8) - 1);
>  			} else {
>  				if (shift)
>  					insn_buf[cnt++] = BPF_ALU64_IMM(BPF_RSH,
>  									insn->dst_reg,
>  									shift);
> -				insn_buf[cnt++] = BPF_ALU64_IMM(BPF_AND, insn->dst_reg,
> -								(1ULL << size * 8) - 1);
> +				if (size < target_size)
> +					insn_buf[cnt++] = BPF_ALU64_IMM(
> +						BPF_AND, insn->dst_reg,
> +						(1ULL << size * 8) - 1);
>  			}
>  		}

Thanks for patience. I'm coming back to this.

This fix affects the 2-byte load from bpf_sk_lookup.remote_port.
Dumping the xlated BPF code confirms it.

On LE (x86-64) things look well.

Before this patch:

* size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36)
   0: (69) r2 = *(u16 *)(r1 +4)
   1: (b7) r0 = 0
   2: (95) exit

* size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38)
   0: (69) r2 = *(u16 *)(r1 +4)
   1: (b7) r0 = 0
   2: (95) exit

After this patch:

* size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36)
   0: (69) r2 = *(u16 *)(r1 +4)
   1: (b7) r0 = 0
   2: (95) exit

* size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38)
   0: (69) r2 = *(u16 *)(r1 +4)
   1: (74) w2 >>= 16
   2: (b7) r0 = 0
   3: (95) exit

Which works great because the JIT generates a zero-extended load movzwq:

* size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36)
bpf_prog_5e4fe3dbdcb18fd3:
   0:   nopl   0x0(%rax,%rax,1)
   5:   xchg   %ax,%ax
   7:   push   %rbp
   8:   mov    %rsp,%rbp
   b:   movzwq 0x4(%rdi),%rsi
  10:   xor    %eax,%eax
  12:   leave
  13:   ret


* size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38)
bpf_prog_4a6336c64a340b96:
   0:   nopl   0x0(%rax,%rax,1)
   5:   xchg   %ax,%ax
   7:   push   %rbp
   8:   mov    %rsp,%rbp
   b:   movzwq 0x4(%rdi),%rsi
  10:   shr    $0x10,%esi
  13:   xor    %eax,%eax
  15:   leave
  16:   ret

Runtime checks for bpf_sk_lookup.remote_port load and the 2-bytes of
zero padding following it, like below, pass with flying colors:

	ok = ctx->remote_port == bpf_htons(8008);
	if (!ok)
		return SK_DROP;
	ok = *((__u16 *)&ctx->remote_port + 1) == 0;
	if (!ok)
		return SK_DROP;

(The above checks compile to half-word (2-byte) loads.)


On BE (s390x) things look different:

Before the patch:

* size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36)
   0: (69) r2 = *(u16 *)(r1 +4)
   1: (bc) w2 = w2
   2: (b7) r0 = 0
   3: (95) exit

* size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38)
   0: (69) r2 = *(u16 *)(r1 +4)
   1: (bc) w2 = w2
   2: (b7) r0 = 0
   3: (95) exit

After the patch:

* size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36)
   0: (69) r2 = *(u16 *)(r1 +4)
   1: (bc) w2 = w2
   2: (74) w2 >>= 16
   3: (bc) w2 = w2
   4: (b7) r0 = 0
   5: (95) exit

* size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38)
   0: (69) r2 = *(u16 *)(r1 +4)
   1: (bc) w2 = w2
   2: (b7) r0 = 0
   3: (95) exit

These compile to:

* size=2, offset=0, 0: (69) r2 = *(u16 *)(r1 +36)
bpf_prog_fdd58b8caca29f00:
   0:   j       0x0000000000000006
   4:   nopr
   6:   stmg    %r11,%r15,112(%r15)
   c:   la      %r13,64(%r15)
  10:   aghi    %r15,-96
  14:   llgh    %r3,4(%r2,%r0)
  1a:   srl     %r3,16
  1e:   llgfr   %r3,%r3
  22:   lgfi    %r14,0
  28:   lgr     %r2,%r14
  2c:   lmg     %r11,%r15,208(%r15)
  32:   br      %r14


* size=2, offset=2, 0: (69) r2 = *(u16 *)(r1 +38)
bpf_prog_5e3d8e92223c6841:
   0:   j       0x0000000000000006
   4:   nopr
   6:   stmg    %r11,%r15,112(%r15)
   c:   la      %r13,64(%r15)
  10:   aghi    %r15,-96
  14:   llgh    %r3,4(%r2,%r0)
  1a:   lgfi    %r14,0
  20:   lgr     %r2,%r14
  24:   lmg     %r11,%r15,208(%r15)
  2a:   br      %r14

Now, we right shift the value when loading

  *(u16 *)(r1 +36)

which in C BPF is equivalent to

  *((__u16 *)&ctx->remote_port + 0)

due to how the shift is calculated by bpf_ctx_narrow_access_offset().

This makes the expected typical use-case

  ctx->remote_port == bpf_htons(8008)

fail on s390x because llgh (Load Logical Halfword (64<-16)) seems to lay
out the data in the destination register so that it holds
0x0000_0000_0000_1f48.

I don't know that was the intention here, as it makes the BPF C code
non-portable.

WDYT?


BTW. Out of curiosity, how does a Logical Load Halfword (llgh) differ
differ from a non-logical Load Halfword (lgh) on s390x? Compiler
Explorer generates a non-logical load for similar C code.

  reply	other threads:[~2022-03-08 15:54 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-22 18:25 [PATCH RFC bpf-next 0/3] bpf_sk_lookup.remote_port fixes Ilya Leoshkevich
2022-02-22 18:25 ` [PATCH RFC bpf-next 1/3] bpf: Fix certain narrow loads with offsets Ilya Leoshkevich
2022-03-08 15:01   ` Jakub Sitnicki [this message]
2022-03-08 23:58     ` Ilya Leoshkevich
2022-03-09  8:36       ` Jakub Sitnicki
2022-03-09 12:34         ` Ilya Leoshkevich
2022-03-10 22:57           ` Jakub Sitnicki
2022-03-14 17:35             ` Jakub Sitnicki
2022-03-14 18:25               ` Ilya Leoshkevich
2022-03-14 20:57                 ` Jakub Sitnicki
2022-02-22 18:25 ` [PATCH RFC bpf-next 2/3] bpf: Fix bpf_sk_lookup.remote_port on big-endian Ilya Leoshkevich
2022-02-27  2:44   ` Alexei Starovoitov
2022-02-27 20:30     ` Jakub Sitnicki
2022-02-28 10:19       ` Ilya Leoshkevich
2022-02-28 13:26         ` Jakub Sitnicki
2022-03-01  0:39           ` Ilya Leoshkevich
2022-03-01  0:40           ` Ilya Leoshkevich
2022-02-22 18:25 ` [PATCH RFC bpf-next 3/3] selftests/bpf: Adapt bpf_sk_lookup.remote_port loads Ilya Leoshkevich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bkygzbg5.fsf@cloudflare.com \
    --to=jakub@cloudflare.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=iii@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.