All of lore.kernel.org
 help / color / mirror / Atom feed
* Funky verifier packet range error (> check works, != does not).
@ 2023-12-30  1:31 Maciej Żenczykowski
  2024-01-02 16:39 ` Eduard Zingerman
  2024-01-02 21:45 ` Andrii Nakryiko
  0 siblings, 2 replies; 13+ messages in thread
From: Maciej Żenczykowski @ 2023-12-30  1:31 UTC (permalink / raw)
  To: BPF Mailing List, Alexei Starovoitov

I have a relatively complex program that fails to load on 6.5.6 with a

if (data + 98 != data_end) return TC_ACT_SHOT;

check, that loads fine if I change the above != to (a you would think
weaker) > check.

It's not important, hit this while debugging, and I don't know if the
cause is the verifier treating != differently than > or the compiler
optimizing != somehow... but my gut feeling is on the former: some
verifier logic special cases > without doing something similar for the
stronger != comparison.

...
453: (85) call bpf_trace_printk#6     ; R0_w=scalar()
; if (data + 98 != data_end) return TC_ACT_SHOT;
454: (bf) r1 = r6                     ; R1_w=pkt(off=0,r=42,imm=0)
R6=pkt(off=0,r=42,imm=0)
455: (07) r1 += 98                    ; R1_w=pkt(off=98,r=42,imm=0)
; if (data + 98 != data_end) return TC_ACT_SHOT;
456: (5d) if r1 != r9 goto pc-23      ; R1_w=pkt(off=98,r=42,imm=0)
R9=pkt_end(off=0,imm=0)
*** IMHO here r=42 should be bumped to 98 ***
457: (bf) r3 = r6                     ; R3_w=pkt(off=0,r=42,imm=0)
R6=pkt(off=0,r=42,imm=0)
458: (07) r3 += 34                    ; R3_w=pkt(off=34,r=42,imm=0)
; uint64_t cs = bpf_csum_diff(NULL, 0, data + 14 + 20, 98 - 14 - 20, 0xFFFF);
459: (b7) r1 = 0                      ; R1_w=0
460: (b7) r2 = 0                      ; R2_w=0
461: (b7) r4 = 64                     ; R4_w=64
462: (b7) r5 = 65535                  ; R5_w=65535
463: (85) call bpf_csum_diff#28
invalid access to packet, off=34 size=64, R3(id=0,off=34,r=42)
R3 offset is outside of the packet

Side note: bpf_csum_diff() is super non user-friendly, but that's for
another thread...

Happy New Year,
Maciej

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Funky verifier packet range error (> check works, != does not).
  2023-12-30  1:31 Funky verifier packet range error (> check works, != does not) Maciej Żenczykowski
@ 2024-01-02 16:39 ` Eduard Zingerman
  2024-01-02 18:30   ` Maciej Żenczykowski
  2024-01-02 21:45 ` Andrii Nakryiko
  1 sibling, 1 reply; 13+ messages in thread
From: Eduard Zingerman @ 2024-01-02 16:39 UTC (permalink / raw)
  To: Maciej Żenczykowski, BPF Mailing List, Alexei Starovoitov

On Fri, 2023-12-29 at 17:31 -0800, Maciej Żenczykowski wrote:
> I have a relatively complex program that fails to load on 6.5.6 with a
> 
> if (data + 98 != data_end) return TC_ACT_SHOT;
> 
> check, that loads fine if I change the above != to (a you would think
> weaker) > check.
> 
> It's not important, hit this while debugging, and I don't know if the
> cause is the verifier treating != differently than > or the compiler
> optimizing != somehow... but my gut feeling is on the former: some
> verifier logic special cases > without doing something similar for the
> stronger != comparison.

Please note the following comment in verifier.c:find_good_pkt_pointers():

    /* Examples for register markings:
     *
     * pkt_data in dst register:
     *
     *   r2 = r3;
     *   r2 += 8;
     *   if (r2 > pkt_end) goto <handle exception>
     *   <access okay>
     *
     *   r2 = r3;
     *   r2 += 8;
     *   if (r2 < pkt_end) goto <access okay>
     *   <handle exception>
     *
     *   Where:
     *     r2 == dst_reg, pkt_end == src_reg
     *     r2=pkt(id=n,off=8,r=0)
     *     r3=pkt(id=n,off=0,r=0)
     *
       ... a few lines skipped ...
     *
     * Find register r3 and mark its range as r3=pkt(id=n,off=0,r=8)
     * or r3=pkt(id=n,off=0,r=8-1), so that range of bytes [r3, r3 + 8)
     * and [r3, r3 + 8-1) respectively is safe to access depending on
     * the check.
     */

In other words, from 'data + 98 > data_end' follows that 'data + 98 <= data_end',
which means that accessible range for 'data' pointer could be incremented by 97 bytes.
However, the 'data + 98 != data_end' is not sufficient to conclude that 98 more bytes
are available, as e.g. the following: 'data + 42 == data_end' could be true at the same time.
Does this makes sense?

Thanks,
Eduard

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Funky verifier packet range error (> check works, != does not).
  2024-01-02 16:39 ` Eduard Zingerman
@ 2024-01-02 18:30   ` Maciej Żenczykowski
  2024-01-02 19:23     ` Eduard Zingerman
  0 siblings, 1 reply; 13+ messages in thread
From: Maciej Żenczykowski @ 2024-01-02 18:30 UTC (permalink / raw)
  To: Eduard Zingerman; +Cc: BPF Mailing List, Alexei Starovoitov

On Tue, Jan 2, 2024 at 8:40 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2023-12-29 at 17:31 -0800, Maciej Żenczykowski wrote:
> > I have a relatively complex program that fails to load on 6.5.6 with a
> >
> > if (data + 98 != data_end) return TC_ACT_SHOT;
> >
> > check, that loads fine if I change the above != to (a you would think
> > weaker) > check.
> >
> > It's not important, hit this while debugging, and I don't know if the
> > cause is the verifier treating != differently than > or the compiler
> > optimizing != somehow... but my gut feeling is on the former: some
> > verifier logic special cases > without doing something similar for the
> > stronger != comparison.
>
> Please note the following comment in verifier.c:find_good_pkt_pointers():
>
>     /* Examples for register markings:
>      *
>      * pkt_data in dst register:
>      *
>      *   r2 = r3;
>      *   r2 += 8;
>      *   if (r2 > pkt_end) goto <handle exception>
>      *   <access okay>
>      *
>      *   r2 = r3;
>      *   r2 += 8;
>      *   if (r2 < pkt_end) goto <access okay>
>      *   <handle exception>
>      *
>      *   Where:
>      *     r2 == dst_reg, pkt_end == src_reg
>      *     r2=pkt(id=n,off=8,r=0)
>      *     r3=pkt(id=n,off=0,r=0)
>      *
>        ... a few lines skipped ...
>      *
>      * Find register r3 and mark its range as r3=pkt(id=n,off=0,r=8)
>      * or r3=pkt(id=n,off=0,r=8-1), so that range of bytes [r3, r3 + 8)
>      * and [r3, r3 + 8-1) respectively is safe to access depending on
>      * the check.
>      */
>
> In other words, from 'data + 98 > data_end' follows that 'data + 98 <= data_end',
> which means that accessible range for 'data' pointer could be incremented by 97 bytes.
> However, the 'data + 98 != data_end' is not sufficient to conclude that 98 more bytes
> are available, as e.g. the following: 'data + 42 == data_end' could be true at the same time.
> Does this makes sense?

Nope, you got your logic reversed.

The check is:
  if (data + 98 != data_end) return;
so now (after this check) you *know* that 'data + 98 == data_end' and
thus you know there are *exactly* 98 valid bytes.

> Thanks,
> Eduard

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Funky verifier packet range error (> check works, != does not).
  2024-01-02 18:30   ` Maciej Żenczykowski
@ 2024-01-02 19:23     ` Eduard Zingerman
  2024-01-02 20:36       ` Maciej Żenczykowski
  0 siblings, 1 reply; 13+ messages in thread
From: Eduard Zingerman @ 2024-01-02 19:23 UTC (permalink / raw)
  To: Maciej Żenczykowski; +Cc: BPF Mailing List, Alexei Starovoitov

On Tue, 2024-01-02 at 10:30 -0800, Maciej Żenczykowski wrote:
[...]
> 
> The check is:
>   if (data + 98 != data_end) return;
> so now (after this check) you *know* that 'data + 98 == data_end' and
> thus you know there are *exactly* 98 valid bytes.

Apologies, you are correct.
So you want to have something along the following lines:

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a376eb609c41..6ddb34d5b9aa 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -14712,6 +14712,28 @@ static bool try_match_pkt_pointers(const struct bpf_insn *insn,
                        return false;
                }
                break;
+       case BPF_JEQ:
+       case BPF_JNE:
+               if ((dst_reg->type == PTR_TO_PACKET &&
+                    src_reg->type == PTR_TO_PACKET_END) ||
+                   (dst_reg->type == PTR_TO_PACKET_META &&
+                    reg_is_init_pkt_pointer(src_reg, PTR_TO_PACKET)) ||
+                   (dst_reg->type == PTR_TO_PACKET_END &&
+                    src_reg->type == PTR_TO_PACKET) ||
+                   (reg_is_init_pkt_pointer(dst_reg, PTR_TO_PACKET) &&
+                    src_reg->type == PTR_TO_PACKET_META)) {
+                       /* pkt_data' != pkt_end, pkt_meta' != pkt_data,
+                        * pkt_end != pkt_data', pkt_data != pkt_meta'
+                        */
+                       struct bpf_verifier_state *eq_branch;
+
+                       eq_branch = BPF_OP(insn->code) == BPF_JEQ ? other_branch : this_branch;
+                       find_good_pkt_pointers(eq_branch, dst_reg, dst_reg->type, true);
+                       mark_pkt_end(eq_branch, insn->dst_reg, false);
+               } else {
+                       return false;
+               }
+               break;
        case BPF_JGE:
                if ((dst_reg->type == PTR_TO_PACKET &&
                     src_reg->type == PTR_TO_PACKET_END) ||

Right?

This passes the following test case:

SEC("tc")
__success
__naked void data_plus_const_eq_pkt_end(void)
{
        asm volatile ("                                 \n\
        r9 = r1;                                        \n\
        r1 = *(u32*)(r9 + %[__sk_buff_data]);           \n\
        r2 = *(u32*)(r9 + %[__sk_buff_data_end]);       \n\
        r3 = r1;                                        \n\
        r3 += 8;                                        \n\
        if r3 != r2 goto 1f;                            \n\
        r1 = *(u64 *)(r1 + 0);                          \n\
1:                                                      \n\
        r0 = 0;                                         \n\
        exit;                                           \n\
"       :
        : __imm_const(__sk_buff_data, offsetof(struct __sk_buff, data)),
          __imm_const(__sk_buff_data_end, offsetof(struct __sk_buff, data_end))
        : __clobber_all);
}

And it's variations for EQ/NEQ, positive/negative.

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: Funky verifier packet range error (> check works, != does not).
  2024-01-02 19:23     ` Eduard Zingerman
@ 2024-01-02 20:36       ` Maciej Żenczykowski
  2024-01-02 23:13         ` Eduard Zingerman
  0 siblings, 1 reply; 13+ messages in thread
From: Maciej Żenczykowski @ 2024-01-02 20:36 UTC (permalink / raw)
  To: Eduard Zingerman; +Cc: BPF Mailing List, Alexei Starovoitov

On Tue, Jan 2, 2024 at 11:24 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2024-01-02 at 10:30 -0800, Maciej Żenczykowski wrote:
> [...]
> >
> > The check is:
> >   if (data + 98 != data_end) return;
> > so now (after this check) you *know* that 'data + 98 == data_end' and
> > thus you know there are *exactly* 98 valid bytes.
>
> Apologies, you are correct.
> So you want to have something along the following lines:
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index a376eb609c41..6ddb34d5b9aa 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -14712,6 +14712,28 @@ static bool try_match_pkt_pointers(const struct bpf_insn *insn,
>                         return false;
>                 }
>                 break;
> +       case BPF_JEQ:
> +       case BPF_JNE:
> +               if ((dst_reg->type == PTR_TO_PACKET &&
> +                    src_reg->type == PTR_TO_PACKET_END) ||
> +                   (dst_reg->type == PTR_TO_PACKET_META &&
> +                    reg_is_init_pkt_pointer(src_reg, PTR_TO_PACKET)) ||
> +                   (dst_reg->type == PTR_TO_PACKET_END &&
> +                    src_reg->type == PTR_TO_PACKET) ||
> +                   (reg_is_init_pkt_pointer(dst_reg, PTR_TO_PACKET) &&
> +                    src_reg->type == PTR_TO_PACKET_META)) {
> +                       /* pkt_data' != pkt_end, pkt_meta' != pkt_data,
> +                        * pkt_end != pkt_data', pkt_data != pkt_meta'
> +                        */
> +                       struct bpf_verifier_state *eq_branch;
> +
> +                       eq_branch = BPF_OP(insn->code) == BPF_JEQ ? other_branch : this_branch;
> +                       find_good_pkt_pointers(eq_branch, dst_reg, dst_reg->type, true);
> +                       mark_pkt_end(eq_branch, insn->dst_reg, false);
> +               } else {
> +                       return false;
> +               }
> +               break;
>         case BPF_JGE:
>                 if ((dst_reg->type == PTR_TO_PACKET &&
>                      src_reg->type == PTR_TO_PACKET_END) ||
>
> Right?

I think so? I don't fully follow the logic.

I wonder if JEQ/JNE couldn't simply be folded into the existing cases though...
Or perhaps some other refactoring to tri-state the jmps...

switch (opcode) {
case BPF_JEQ: eq_branch = this_branch; lt_branch = gt_branch =
other_branch; break;
case BPF_JNE: lt_branch = gt_branch = this_branch; eq_branch =
other_branch; break;
case BPF_LT: lt_branch = this_branch; eq_branch = gt_branch =
other_branch; break;
...
}
and then you can ignore opcode...

> This passes the following test case:
>
> SEC("tc")
> __success
> __naked void data_plus_const_eq_pkt_end(void)
> {
>         asm volatile ("                                 \n\
>         r9 = r1;                                        \n\
>         r1 = *(u32*)(r9 + %[__sk_buff_data]);           \n\
>         r2 = *(u32*)(r9 + %[__sk_buff_data_end]);       \n\
>         r3 = r1;                                        \n\
>         r3 += 8;                                        \n\
>         if r3 != r2 goto 1f;                            \n\
>         r1 = *(u64 *)(r1 + 0);                          \n\
> 1:                                                      \n\
>         r0 = 0;                                         \n\
>         exit;                                           \n\
> "       :
>         : __imm_const(__sk_buff_data, offsetof(struct __sk_buff, data)),
>           __imm_const(__sk_buff_data_end, offsetof(struct __sk_buff, data_end))
>         : __clobber_all);
> }
>
> And it's variations for EQ/NEQ, positive/negative.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Funky verifier packet range error (> check works, != does not).
  2023-12-30  1:31 Funky verifier packet range error (> check works, != does not) Maciej Żenczykowski
  2024-01-02 16:39 ` Eduard Zingerman
@ 2024-01-02 21:45 ` Andrii Nakryiko
  2024-01-02 22:45   ` Maciej Żenczykowski
  1 sibling, 1 reply; 13+ messages in thread
From: Andrii Nakryiko @ 2024-01-02 21:45 UTC (permalink / raw)
  To: Maciej Żenczykowski; +Cc: BPF Mailing List, Alexei Starovoitov

On Fri, Dec 29, 2023 at 5:31 PM Maciej Żenczykowski
<zenczykowski@gmail.com> wrote:
>
> I have a relatively complex program that fails to load on 6.5.6 with a
>
> if (data + 98 != data_end) return TC_ACT_SHOT;
>

How realistic is such code in practice? Is there a situation in which
it's critical to ensure that the packet has exactly X bytes in [data,
data_end) range? Even in that case we can have data in frags, though,
right? So I'm just wondering if we are discussing some rather
theoretical situation?

> check, that loads fine if I change the above != to (a you would think
> weaker) > check.
>
> It's not important, hit this while debugging, and I don't know if the
> cause is the verifier treating != differently than > or the compiler
> optimizing != somehow... but my gut feeling is on the former: some
> verifier logic special cases > without doing something similar for the
> stronger != comparison.
>
> ...
> 453: (85) call bpf_trace_printk#6     ; R0_w=scalar()
> ; if (data + 98 != data_end) return TC_ACT_SHOT;
> 454: (bf) r1 = r6                     ; R1_w=pkt(off=0,r=42,imm=0)
> R6=pkt(off=0,r=42,imm=0)
> 455: (07) r1 += 98                    ; R1_w=pkt(off=98,r=42,imm=0)
> ; if (data + 98 != data_end) return TC_ACT_SHOT;
> 456: (5d) if r1 != r9 goto pc-23      ; R1_w=pkt(off=98,r=42,imm=0)
> R9=pkt_end(off=0,imm=0)
> *** IMHO here r=42 should be bumped to 98 ***
> 457: (bf) r3 = r6                     ; R3_w=pkt(off=0,r=42,imm=0)
> R6=pkt(off=0,r=42,imm=0)
> 458: (07) r3 += 34                    ; R3_w=pkt(off=34,r=42,imm=0)
> ; uint64_t cs = bpf_csum_diff(NULL, 0, data + 14 + 20, 98 - 14 - 20, 0xFFFF);
> 459: (b7) r1 = 0                      ; R1_w=0
> 460: (b7) r2 = 0                      ; R2_w=0
> 461: (b7) r4 = 64                     ; R4_w=64
> 462: (b7) r5 = 65535                  ; R5_w=65535
> 463: (85) call bpf_csum_diff#28
> invalid access to packet, off=34 size=64, R3(id=0,off=34,r=42)
> R3 offset is outside of the packet
>
> Side note: bpf_csum_diff() is super non user-friendly, but that's for
> another thread...
>
> Happy New Year,
> Maciej
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Funky verifier packet range error (> check works, != does not).
  2024-01-02 21:45 ` Andrii Nakryiko
@ 2024-01-02 22:45   ` Maciej Żenczykowski
  2024-01-02 23:56     ` Yonghong Song
  2024-01-03  0:06     ` Andrii Nakryiko
  0 siblings, 2 replies; 13+ messages in thread
From: Maciej Żenczykowski @ 2024-01-02 22:45 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: BPF Mailing List, Alexei Starovoitov

On Tue, Jan 2, 2024 at 1:46 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Dec 29, 2023 at 5:31 PM Maciej Żenczykowski
> <zenczykowski@gmail.com> wrote:
> >
> > I have a relatively complex program that fails to load on 6.5.6 with a
> >
> > if (data + 98 != data_end) return TC_ACT_SHOT;
> >
>
> How realistic is such code in practice? Is there a situation in which
> it's critical to ensure that the packet has exactly X bytes in [data,
> data_end) range? Even in that case we can have data in frags, though,
> right? So I'm just wondering if we are discussing some rather
> theoretical situation?

So, as I mentioned I hit this while debugging some other complex code,
so the example 98 isn't a particularly good value
(when I actually hit this I think I was trying to match some ping packets).

However, elsewhere in the same program I need to match and reply to
IPv6 NS packets
(for an IPv6 address the kernel doesn't own, to workaround a pair of
kernel bugs / missing features in ipv6 neigh proxy).

In practice the NS I receive and need to handle are always:
  14 ethernet + 40 ipv6 + 8 icmp6 + 16 target + 8 byte link address
option = 86 bytes long
(and if they're not, then my current code can't parse them anyway)
so it's natural to do something like:

handle_ns_with_laddr(struct __sk_buff *skb) {
  if (skb->data + 86 != skb->data_end) return;
  struct ethernet_ipv6_ns *pkt = data;
  if (pkt->ether.h_proto != htons(ETH_P_IPV6)) return;
  if (pkt->ip6.nexthdr != IPPROTO_ICMPV6) return;
  ...etc...
}

Yeah, there's lots of caveats in the above (lack of pull, etc), but it
is enough to handle the case I need handled...

Obviously I could rewrite the above as:
  if (skb->data + 86 > skb->data_end) return;
  if (skb->data + 86 < skb->data_end) return;

though I guess a too smart compiler could optimize that back down to == ...

> > check, that loads fine if I change the above != to (a you would think
> > weaker) > check.
> >
> > It's not important, hit this while debugging, and I don't know if the
> > cause is the verifier treating != differently than > or the compiler
> > optimizing != somehow... but my gut feeling is on the former: some
> > verifier logic special cases > without doing something similar for the
> > stronger != comparison.
> >
> > ...
> > 453: (85) call bpf_trace_printk#6     ; R0_w=scalar()
> > ; if (data + 98 != data_end) return TC_ACT_SHOT;
> > 454: (bf) r1 = r6                     ; R1_w=pkt(off=0,r=42,imm=0)
> > R6=pkt(off=0,r=42,imm=0)
> > 455: (07) r1 += 98                    ; R1_w=pkt(off=98,r=42,imm=0)
> > ; if (data + 98 != data_end) return TC_ACT_SHOT;
> > 456: (5d) if r1 != r9 goto pc-23      ; R1_w=pkt(off=98,r=42,imm=0)
> > R9=pkt_end(off=0,imm=0)
> > *** IMHO here r=42 should be bumped to 98 ***
> > 457: (bf) r3 = r6                     ; R3_w=pkt(off=0,r=42,imm=0)
> > R6=pkt(off=0,r=42,imm=0)
> > 458: (07) r3 += 34                    ; R3_w=pkt(off=34,r=42,imm=0)
> > ; uint64_t cs = bpf_csum_diff(NULL, 0, data + 14 + 20, 98 - 14 - 20, 0xFFFF);
> > 459: (b7) r1 = 0                      ; R1_w=0
> > 460: (b7) r2 = 0                      ; R2_w=0
> > 461: (b7) r4 = 64                     ; R4_w=64
> > 462: (b7) r5 = 65535                  ; R5_w=65535
> > 463: (85) call bpf_csum_diff#28
> > invalid access to packet, off=34 size=64, R3(id=0,off=34,r=42)
> > R3 offset is outside of the packet
> >
> > Side note: bpf_csum_diff() is super non user-friendly, but that's for
> > another thread...
> >
> > Happy New Year,
> > Maciej
> >

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Funky verifier packet range error (> check works, != does not).
  2024-01-02 20:36       ` Maciej Żenczykowski
@ 2024-01-02 23:13         ` Eduard Zingerman
  2024-01-04 16:41           ` Eduard Zingerman
  0 siblings, 1 reply; 13+ messages in thread
From: Eduard Zingerman @ 2024-01-02 23:13 UTC (permalink / raw)
  To: Maciej Żenczykowski; +Cc: BPF Mailing List, Alexei Starovoitov

On Tue, 2024-01-02 at 12:36 -0800, Maciej Żenczykowski wrote:
[...]
> 
> I wonder if JEQ/JNE couldn't simply be folded into the existing cases though...
> Or perhaps some other refactoring to tri-state the jmps...
> 
> switch (opcode) {
> case BPF_JEQ: eq_branch = this_branch; lt_branch = gt_branch =
> other_branch; break;
> case BPF_JNE: lt_branch = gt_branch = this_branch; eq_branch =
> other_branch; break;
> case BPF_LT: lt_branch = this_branch; eq_branch = gt_branch =
> other_branch; break;
> ...
> }
> and then you can ignore opcode...

The code could probably be simplified a bit (actually, I'm thinking
about pulling all dst/src type checks as bool variables at the beginning).

Suppose Andrii accepts this change, would you want to submit the patch?
(or I can wrap-up what I have).


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Funky verifier packet range error (> check works, != does not).
  2024-01-02 22:45   ` Maciej Żenczykowski
@ 2024-01-02 23:56     ` Yonghong Song
  2024-01-03  0:06     ` Andrii Nakryiko
  1 sibling, 0 replies; 13+ messages in thread
From: Yonghong Song @ 2024-01-02 23:56 UTC (permalink / raw)
  To: Maciej Żenczykowski, Andrii Nakryiko
  Cc: BPF Mailing List, Alexei Starovoitov


On 1/2/24 2:45 PM, Maciej Żenczykowski wrote:
> On Tue, Jan 2, 2024 at 1:46 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
>> On Fri, Dec 29, 2023 at 5:31 PM Maciej Żenczykowski
>> <zenczykowski@gmail.com> wrote:
>>> I have a relatively complex program that fails to load on 6.5.6 with a
>>>
>>> if (data + 98 != data_end) return TC_ACT_SHOT;
>>>
>> How realistic is such code in practice? Is there a situation in which
>> it's critical to ensure that the packet has exactly X bytes in [data,
>> data_end) range? Even in that case we can have data in frags, though,
>> right? So I'm just wondering if we are discussing some rather
>> theoretical situation?
> So, as I mentioned I hit this while debugging some other complex code,
> so the example 98 isn't a particularly good value
> (when I actually hit this I think I was trying to match some ping packets).
>
> However, elsewhere in the same program I need to match and reply to
> IPv6 NS packets
> (for an IPv6 address the kernel doesn't own, to workaround a pair of
> kernel bugs / missing features in ipv6 neigh proxy).
>
> In practice the NS I receive and need to handle are always:
>    14 ethernet + 40 ipv6 + 8 icmp6 + 16 target + 8 byte link address
> option = 86 bytes long
> (and if they're not, then my current code can't parse them anyway)
> so it's natural to do something like:
>
> handle_ns_with_laddr(struct __sk_buff *skb) {
>    if (skb->data + 86 != skb->data_end) return;

So you want to test the precise packet length, right?
Does the following work?
    if (skb->data + 86 > skb->data_end) return;
    /* skb->data + 86 <= sbk->data_end, so you can access up to 86 bytes */

>    struct ethernet_ipv6_ns *pkt = data;
>    if (pkt->ether.h_proto != htons(ETH_P_IPV6)) return;
>    if (pkt->ip6.nexthdr != IPPROTO_ICMPV6) return;
>    ...etc...
> }
>
> Yeah, there's lots of caveats in the above (lack of pull, etc), but it
> is enough to handle the case I need handled...
>
> Obviously I could rewrite the above as:
>    if (skb->data + 86 > skb->data_end) return;
>    if (skb->data + 86 < skb->data_end) return;
>
> though I guess a too smart compiler could optimize that back down to == ...

I didn't try. but yes, it is possible.

>
>>> check, that loads fine if I change the above != to (a you would think
>>> weaker) > check.
>>>
>>> It's not important, hit this while debugging, and I don't know if the
>>> cause is the verifier treating != differently than > or the compiler
>>> optimizing != somehow... but my gut feeling is on the former: some
>>> verifier logic special cases > without doing something similar for the
>>> stronger != comparison.
>>>
>>> ...
>>> 453: (85) call bpf_trace_printk#6     ; R0_w=scalar()
>>> ; if (data + 98 != data_end) return TC_ACT_SHOT;
>>> 454: (bf) r1 = r6                     ; R1_w=pkt(off=0,r=42,imm=0)
>>> R6=pkt(off=0,r=42,imm=0)
>>> 455: (07) r1 += 98                    ; R1_w=pkt(off=98,r=42,imm=0)
>>> ; if (data + 98 != data_end) return TC_ACT_SHOT;
>>> 456: (5d) if r1 != r9 goto pc-23      ; R1_w=pkt(off=98,r=42,imm=0)
>>> R9=pkt_end(off=0,imm=0)
>>> *** IMHO here r=42 should be bumped to 98 ***
>>> 457: (bf) r3 = r6                     ; R3_w=pkt(off=0,r=42,imm=0)
>>> R6=pkt(off=0,r=42,imm=0)
>>> 458: (07) r3 += 34                    ; R3_w=pkt(off=34,r=42,imm=0)
>>> ; uint64_t cs = bpf_csum_diff(NULL, 0, data + 14 + 20, 98 - 14 - 20, 0xFFFF);
>>> 459: (b7) r1 = 0                      ; R1_w=0
>>> 460: (b7) r2 = 0                      ; R2_w=0
>>> 461: (b7) r4 = 64                     ; R4_w=64
>>> 462: (b7) r5 = 65535                  ; R5_w=65535
>>> 463: (85) call bpf_csum_diff#28
>>> invalid access to packet, off=34 size=64, R3(id=0,off=34,r=42)
>>> R3 offset is outside of the packet
>>>
>>> Side note: bpf_csum_diff() is super non user-friendly, but that's for
>>> another thread...
>>>
>>> Happy New Year,
>>> Maciej
>>>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Funky verifier packet range error (> check works, != does not).
  2024-01-02 22:45   ` Maciej Żenczykowski
  2024-01-02 23:56     ` Yonghong Song
@ 2024-01-03  0:06     ` Andrii Nakryiko
  2024-01-03  0:29       ` Eduard Zingerman
  1 sibling, 1 reply; 13+ messages in thread
From: Andrii Nakryiko @ 2024-01-03  0:06 UTC (permalink / raw)
  To: Maciej Żenczykowski; +Cc: BPF Mailing List, Alexei Starovoitov

On Tue, Jan 2, 2024 at 2:45 PM Maciej Żenczykowski
<zenczykowski@gmail.com> wrote:
>
> On Tue, Jan 2, 2024 at 1:46 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Fri, Dec 29, 2023 at 5:31 PM Maciej Żenczykowski
> > <zenczykowski@gmail.com> wrote:
> > >
> > > I have a relatively complex program that fails to load on 6.5.6 with a
> > >
> > > if (data + 98 != data_end) return TC_ACT_SHOT;
> > >
> >
> > How realistic is such code in practice? Is there a situation in which
> > it's critical to ensure that the packet has exactly X bytes in [data,
> > data_end) range? Even in that case we can have data in frags, though,
> > right? So I'm just wondering if we are discussing some rather
> > theoretical situation?
>
> So, as I mentioned I hit this while debugging some other complex code,
> so the example 98 isn't a particularly good value
> (when I actually hit this I think I was trying to match some ping packets).
>
> However, elsewhere in the same program I need to match and reply to
> IPv6 NS packets
> (for an IPv6 address the kernel doesn't own, to workaround a pair of
> kernel bugs / missing features in ipv6 neigh proxy).
>
> In practice the NS I receive and need to handle are always:
>   14 ethernet + 40 ipv6 + 8 icmp6 + 16 target + 8 byte link address
> option = 86 bytes long
> (and if they're not, then my current code can't parse them anyway)
> so it's natural to do something like:
>
> handle_ns_with_laddr(struct __sk_buff *skb) {
>   if (skb->data + 86 != skb->data_end) return;
>   struct ethernet_ipv6_ns *pkt = data;
>   if (pkt->ether.h_proto != htons(ETH_P_IPV6)) return;
>   if (pkt->ip6.nexthdr != IPPROTO_ICMPV6) return;
>   ...etc...
> }
>
> Yeah, there's lots of caveats in the above (lack of pull, etc), but it
> is enough to handle the case I need handled...

Yeah, that's what I was getting at, just using data_end as a marker
for packet length seems incorrect. But I do see the point that it's
just an unnecessary complication for users to work around.

For the fix that Eduard proposed (and checking
try_match_pkt_pointers), should we do a similar simplification as we
do for scalar register comparisons? Make sure that data_end is always
on the right by swapping, if that's not the case. And also use
corresponding rev_opcode() and flip_opcode() operations to minimize
the amount of logic and duplicated code?

And I mean not just for new JEQ/JNE cases, but let's also refactor and
simplify existing logic as well?

>
> Obviously I could rewrite the above as:
>   if (skb->data + 86 > skb->data_end) return;
>   if (skb->data + 86 < skb->data_end) return;
>
> though I guess a too smart compiler could optimize that back down to == ...
>
> > > check, that loads fine if I change the above != to (a you would think
> > > weaker) > check.
> > >
> > > It's not important, hit this while debugging, and I don't know if the
> > > cause is the verifier treating != differently than > or the compiler
> > > optimizing != somehow... but my gut feeling is on the former: some
> > > verifier logic special cases > without doing something similar for the
> > > stronger != comparison.
> > >
> > > ...
> > > 453: (85) call bpf_trace_printk#6     ; R0_w=scalar()
> > > ; if (data + 98 != data_end) return TC_ACT_SHOT;
> > > 454: (bf) r1 = r6                     ; R1_w=pkt(off=0,r=42,imm=0)
> > > R6=pkt(off=0,r=42,imm=0)
> > > 455: (07) r1 += 98                    ; R1_w=pkt(off=98,r=42,imm=0)
> > > ; if (data + 98 != data_end) return TC_ACT_SHOT;
> > > 456: (5d) if r1 != r9 goto pc-23      ; R1_w=pkt(off=98,r=42,imm=0)
> > > R9=pkt_end(off=0,imm=0)
> > > *** IMHO here r=42 should be bumped to 98 ***
> > > 457: (bf) r3 = r6                     ; R3_w=pkt(off=0,r=42,imm=0)
> > > R6=pkt(off=0,r=42,imm=0)
> > > 458: (07) r3 += 34                    ; R3_w=pkt(off=34,r=42,imm=0)
> > > ; uint64_t cs = bpf_csum_diff(NULL, 0, data + 14 + 20, 98 - 14 - 20, 0xFFFF);
> > > 459: (b7) r1 = 0                      ; R1_w=0
> > > 460: (b7) r2 = 0                      ; R2_w=0
> > > 461: (b7) r4 = 64                     ; R4_w=64
> > > 462: (b7) r5 = 65535                  ; R5_w=65535
> > > 463: (85) call bpf_csum_diff#28
> > > invalid access to packet, off=34 size=64, R3(id=0,off=34,r=42)
> > > R3 offset is outside of the packet
> > >
> > > Side note: bpf_csum_diff() is super non user-friendly, but that's for
> > > another thread...
> > >
> > > Happy New Year,
> > > Maciej
> > >

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Funky verifier packet range error (> check works, != does not).
  2024-01-03  0:06     ` Andrii Nakryiko
@ 2024-01-03  0:29       ` Eduard Zingerman
  0 siblings, 0 replies; 13+ messages in thread
From: Eduard Zingerman @ 2024-01-03  0:29 UTC (permalink / raw)
  To: Andrii Nakryiko, Maciej Żenczykowski
  Cc: BPF Mailing List, Alexei Starovoitov

On Tue, 2024-01-02 at 16:06 -0800, Andrii Nakryiko wrote:
[...]
> For the fix that Eduard proposed (and checking
> try_match_pkt_pointers), should we do a similar simplification as we
> do for scalar register comparisons? Make sure that data_end is always
> on the right by swapping, if that's not the case. And also use
> corresponding rev_opcode() and flip_opcode() operations to minimize
> the amount of logic and duplicated code?
> 
> And I mean not just for new JEQ/JNE cases, but let's also refactor and
> simplify existing logic as well?

Yes, this should simplify the function significantly.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Funky verifier packet range error (> check works, != does not).
  2024-01-02 23:13         ` Eduard Zingerman
@ 2024-01-04 16:41           ` Eduard Zingerman
  2024-01-05  0:33             ` Maciej Żenczykowski
  0 siblings, 1 reply; 13+ messages in thread
From: Eduard Zingerman @ 2024-01-04 16:41 UTC (permalink / raw)
  To: Maciej Żenczykowski; +Cc: BPF Mailing List, Alexei Starovoitov

On Wed, 2024-01-03 at 01:13 +0200, Eduard Zingerman wrote:
> On Tue, 2024-01-02 at 12:36 -0800, Maciej Żenczykowski wrote:
[...]
> Suppose Andrii accepts this change, would you want to submit the patch?
> (or I can wrap-up what I have).

Hi Maciej,

If you don't mind I'll wrap up and submit my local changes today.

Thanks,
Eduard


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Funky verifier packet range error (> check works, != does not).
  2024-01-04 16:41           ` Eduard Zingerman
@ 2024-01-05  0:33             ` Maciej Żenczykowski
  0 siblings, 0 replies; 13+ messages in thread
From: Maciej Żenczykowski @ 2024-01-05  0:33 UTC (permalink / raw)
  To: Eduard Zingerman; +Cc: BPF Mailing List, Alexei Starovoitov

On Thu, Jan 4, 2024 at 8:41 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2024-01-03 at 01:13 +0200, Eduard Zingerman wrote:
> > On Tue, 2024-01-02 at 12:36 -0800, Maciej Żenczykowski wrote:
> [...]
> > Suppose Andrii accepts this change, would you want to submit the patch?
> > (or I can wrap-up what I have).
>
> Hi Maciej,
>
> If you don't mind I'll wrap up and submit my local changes today.

Looking forward to seeing the patches.

>
> Thanks,
> Eduard
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-01-05  0:34 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-30  1:31 Funky verifier packet range error (> check works, != does not) Maciej Żenczykowski
2024-01-02 16:39 ` Eduard Zingerman
2024-01-02 18:30   ` Maciej Żenczykowski
2024-01-02 19:23     ` Eduard Zingerman
2024-01-02 20:36       ` Maciej Żenczykowski
2024-01-02 23:13         ` Eduard Zingerman
2024-01-04 16:41           ` Eduard Zingerman
2024-01-05  0:33             ` Maciej Żenczykowski
2024-01-02 21:45 ` Andrii Nakryiko
2024-01-02 22:45   ` Maciej Żenczykowski
2024-01-02 23:56     ` Yonghong Song
2024-01-03  0:06     ` Andrii Nakryiko
2024-01-03  0:29       ` Eduard Zingerman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.