From: Yonghong Song <yhs@fb.com>
To: Jakub Sitnicki <jakub@cloudflare.com>, <bpf@vger.kernel.org>
Cc: <netdev@vger.kernel.org>, <kernel-team@cloudflare.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>
Subject: Re: [PATCH bpf] bpf: Shift and mask loads narrower than context field size
Date: Tue, 14 Jul 2020 23:44:01 -0700	[thread overview]
Message-ID: <c98aaa5e-9347-c23f-cfa6-e267f2485c5b@fb.com> (raw)
In-Reply-To: <20200710173123.427983-1-jakub@cloudflare.com>



On 7/10/20 10:31 AM, Jakub Sitnicki wrote:
> When the size of a load from context is the same as the target field size,
> but less than the context field size, the verifier does not emit the shift
> and mask instructions for loads at a non-zero offset.
> 
> This has the unexpected effect of loading the same data no matter what the
> offset was, while the expected behavior would be to load zeros for offsets
> greater than the target field size.
> 
> For instance, a u16 load from a u32 context field backed by a u16 target
> field at an offset of 2 bytes results in:
> 
>    SEC("sk_reuseport/narrow_half")
>    int reuseport_narrow_half(struct sk_reuseport_md *ctx)
>    {
>    	__u16 *half;
> 
>    	half = (__u16 *)&ctx->ip_protocol;
>    	if (half[0] == 0xaaaa)
>    		return SK_DROP;
>    	if (half[1] == 0xbbbb)
>    		return SK_DROP;
>    	return SK_PASS;
>    }

It would be good if you could include the llvm asm output like below so
people can correlate source => asm => xlated code:

        0:       w0 = 0
        1:       r2 = *(u16 *)(r1 + 24)
        2:       if w2 == 43690 goto +4 <LBB0_3>
        3:       r1 = *(u16 *)(r1 + 26)
        4:       w0 = 1
        5:       if w1 != 48059 goto +1 <LBB0_3>
        6:       w0 = 0

0000000000000038 <LBB0_3>:
        7:       exit

> 
>    int reuseport_narrow_half(struct sk_reuseport_md * ctx):
>    ; int reuseport_narrow_half(struct sk_reuseport_md *ctx)
>       0: (b4) w0 = 0
>    ; if (half[0] == 0xaaaa)
>       1: (79) r2 = *(u64 *)(r1 +8)
>       2: (69) r2 = *(u16 *)(r2 +924)
>    ; if (half[0] == 0xaaaa)
>       3: (16) if w2 == 0xaaaa goto pc+5
>    ; if (half[1] == 0xbbbb)
>       4: (79) r1 = *(u64 *)(r1 +8)
>       5: (69) r1 = *(u16 *)(r1 +924)
>       6: (b4) w0 = 1
>    ; if (half[1] == 0xbbbb)
>       7: (56) if w1 != 0xbbbb goto pc+1
>       8: (b4) w0 = 0
>    ; }
>       9: (95) exit

Indeed we have an issue here. Insn 5 is not correct;
the original assembly is.

Internally, ip_protocol is backed by 2 bytes in sk_reuseport_kern.
The current verifier implementation makes an important assumption:
    all user load requests fall within the kernel-internal field size
In this case, the verifier actually only correctly supports
    . one byte from offset 0
    . one byte from offset 1
    . two bytes from offset 0
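
For example, an in-range one-byte user load from offset 1 ends up as the
2-byte kernel load followed by a shift and mask, along these lines (an
illustrative xlated sketch, not verbatim verifier output, reusing the
+924 offset from the listings above):

       r2 = *(u16 *)(r2 +924)
       w2 >>= 8
       w2 &= 0xff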

The original assembly code tries to access 2 bytes from offset 2,
and the verifier performed an incorrect transformation.

This assumption actually makes sense, since any other read is
ambiguous. For example, for ip_protocol, if someone wants to
load 2 bytes from offset 2, what should we return? 0? In that case,
the verifier can actually convert it to 0 without doing a load at all.


> In this case half[0] == half[1] == sk->sk_protocol, which backs the
> ctx->ip_protocol field.
> 
> Fix it by shifting and masking any load from context that is narrower than
> the context field size (is_narrower_load = size < ctx_field_size), in
> addition to loads that are narrower than the target field size.

The fix can work around the issue, but I think we should generate
better code for such cases.

> 
> The "size < target_size" check is left in place to cover the case when a
> context field is narrower than its target field, even if we might not have
> such case now. (It would have to be a u32 context field backed by a u64
> target field, with context fields all being 4-bytes or wider.)
> 
> Going back to the example, with the fix in place, the upper half load from
> ctx->ip_protocol yields zero:
> 
>    int reuseport_narrow_half(struct sk_reuseport_md * ctx):
>    ; int reuseport_narrow_half(struct sk_reuseport_md *ctx)
>       0: (b4) w0 = 0
>    ; if (half[0] == 0xaaaa)
>       1: (79) r2 = *(u64 *)(r1 +8)
>       2: (69) r2 = *(u16 *)(r2 +924)
>       3: (54) w2 &= 65535
>    ; if (half[0] == 0xaaaa)
>       4: (16) if w2 == 0xaaaa goto pc+7
>    ; if (half[1] == 0xbbbb)
>       5: (79) r1 = *(u64 *)(r1 +8)
>       6: (69) r1 = *(u16 *)(r1 +924)

The load is still 2 bytes from offset 0, with the upper 48 bits being 0.

>       7: (74) w1 >>= 16

w1 will be 0 now, so this will work.

>       8: (54) w1 &= 65535

For the above insns 5-8, the verifier, based on target information, can
directly generate w1 = 0 since:
   . the target kernel field size is 2 and the ctx field size is 4.
   . the user tries to access 2 bytes at offset 2.

Here, we need to decide whether we permit users to do a partial read
beyond the narrow kernel field or not (e.g., this example). I would
say yes, but Daniel or Alexei can provide additional comments.

If we allow such accesses, I would like the verifier to generate better
code as I illustrated above. This can be implemented in the verifier
itself, with the target passing the additional kernel field size to
the verifier; the target already passes the ctx field size back to
the verifier.
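
To make that concrete, below is a minimal sketch against
convert_ctx_accesses() in kernel/bpf/verifier.c. It assumes a
hypothetical insn_aux_data member, target_field_size, recorded by
is_valid_access() the same way ctx_field_size is recorded today;
the name and the plumbing are illustrative only:

    /* Hypothetical: kernel field size reported by the target via
     * is_valid_access(), like the existing ctx_field_size.
     */
    u32 target_field_size = env->insn_aux_data[i + delta].target_field_size;
    /* Byte offset of the narrow load within the ctx field. */
    u32 field_off = off & (size_default - 1);

    if (is_narrower_load && field_off >= target_field_size) {
            /* The load reads only bytes beyond the backing kernel
             * field, so the result is a constant zero; no load,
             * shift, or mask is needed.
             */
            insn_buf[0] = BPF_MOV32_IMM(insn->dst_reg, 0);
            cnt = 1;
    } else {
            /* Fall through to the existing convert_ctx_access()
             * plus shift-and-mask sequence.
             */
    }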

>       9: (b4) w0 = 1
>    ; if (half[1] == 0xbbbb)
>      10: (56) if w1 != 0xbbbb goto pc+1
>      11: (b4) w0 = 0
>    ; }
>      12: (95) exit
> 
> Fixes: f96da09473b5 ("bpf: simplify narrower ctx access")
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>   kernel/bpf/verifier.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 94cead5a43e5..1c4d0e24a5a2 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -9760,7 +9760,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
>   			return -EINVAL;
>   		}
>   
> -		if (is_narrower_load && size < target_size) {
> +		if (is_narrower_load || size < target_size) {
>   			u8 shift = bpf_ctx_narrow_access_offset(
>   				off, size, size_default) * 8;
>   			if (ctx_field_size <= 4) {
> 
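
For reference, the bpf_ctx_narrow_access_offset() helper used in the
diff computes the byte offset of the narrow load within the
default-sized field, taking endianness into account. Paraphrased from
include/linux/filter.h (roughly, not verbatim):

    static inline u32
    bpf_ctx_narrow_access_offset(u32 off, u32 size, u32 size_default)
    {
            u8 access_off = off & (size_default - 1);

    #ifdef __LITTLE_ENDIAN
            return access_off;
    #else
            return size_default - (access_off + size);
    #endif
    }

On little-endian the shift selects the requested bytes from the low
end of the loaded value; on big-endian it counts from the high end
instead.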

Thread overview: 6+ messages
2020-07-10 17:31 [PATCH bpf] bpf: Shift and mask loads narrower than context field size Jakub Sitnicki
2020-07-15  6:44 ` Yonghong Song [this message]
2020-07-15 19:26   ` Jakub Sitnicki
2020-07-15 20:59     ` Yonghong Song
2020-07-16 11:48       ` Jakub Sitnicki
2020-07-16 17:38         ` Yonghong Song
