From: John Fastabend <john.fastabend@gmail.com>
To: Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>, bpf <bpf@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>
Subject: Re: [bpf PATCH v3] bpf: verifier, do_refine_retval_range may clamp umin to 0 incorrectly
Date: Thu, 30 Jan 2020 16:44:09 -0800	[thread overview]
Message-ID: <5e337859a4287_43212ac192b965bc51@john-XPS-13-9370.notmuch> (raw)
In-Reply-To: <740b2df9-af35-cc9e-d4f9-50978ef2dbc6@fb.com>

Yonghong Song wrote:
> 
> 
> On 1/30/20 3:34 PM, John Fastabend wrote:
> > Alexei Starovoitov wrote:
> >> On Thu, Jan 30, 2020 at 09:38:07AM -0800, John Fastabend wrote:
> >>> Alexei Starovoitov wrote:
> >>>> On Wed, Jan 29, 2020 at 02:52:10PM -0800, John Fastabend wrote:
> >>>>> Daniel Borkmann wrote:
> >>>>>> On 1/29/20 8:28 PM, Alexei Starovoitov wrote:
> >>>>>>> On Wed, Jan 29, 2020 at 8:25 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>>>>>>>
> >>>>>>>>> Fixes: 849fa50662fbc ("bpf/verifier: refine retval R0 state for bpf_get_stack helper")
> >>>>>>>>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> >>>>>>>>
> >>>>>>>> Applied, thanks!
> >>>>>>>
> >>>>>>> Daniel,
> >>>>>>> did you run the selftests before applying?
> >>>>>>> This patch breaks two.
> >>>>>>> We have to find a different fix.
> >>>>>>>
> >>>>>>> ./test_progs -t get_stack
> >>>>>>> 68: (85) call bpf_get_stack#67
> >>>>>>>    R0=inv(id=0,smax_value=800) R1_w=ctx(id=0,off=0,imm=0)
> >>>>>>> R2_w=map_value(id=0,off=0,ks=4,vs=1600,umax_value=4294967295,var_off=(0x0;
> >>>>>>> 0xffffffff)) R3_w=inv(id=0,umax_value=4294967295,var_off=(0x0;
> >>>>>>> 0xffffffff)) R4_w=inv0 R6=ctx(id=0,off=0,im?
> >>>>>>> R2 unbounded memory access, make sure to bounds check any array access
> >>>>>>> into a map
> >>>>>>
> >>>>>> Sigh, I had it in my wip pre-rebase tree when running tests. I've
> >>>>>> reverted it from the tree since this needs to be addressed. Sorry for
> >>>>>> the trouble.
> >>>>>
> >>>>> Thanks, I'm looking into it now. Not sure how I missed it on
> >>>>> selftests; either an older branch, or I missed the test somehow. I've
> >>>>> updated my toolchain and kernel now, so it shouldn't happen again.
> >>>>
> >>>> Looks like smax_value was nuked by <<32 >>32 shifts.
> >>>> 53: (bf) r8 = r0   // R0=inv(id=0,smax_value=800)
> >>>> 54: (67) r8 <<= 32  // R8->smax_value = S64_MAX; in adjust_scalar_min_max_vals()
> >>>> 55: (c7) r8 s>>= 32
> >>>> ; if (usize < 0)
> >>>> 56: (c5) if r8 s< 0x0 goto pc+28
> >>>> // and here the "less than zero" check doesn't help anymore.
> >>>>
> >>>> Not sure how to fix it yet, but the code pattern used in
> >>>> progs/test_get_stack_rawtp.c
> >>>> is real. Plenty of bpf progs rely on this.
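> >>>>
> >>>> Roughly, the pattern is (paraphrased, not the exact source; names as
> >>>> in the trace below, with usize a signed 32-bit int, so the signed
> >>>> check and the 64-bit pointer arithmetic force sign/zero extensions):
> >>>>
> >>>>   int usize, ksize;
> >>>>
> >>>>   usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
> >>>>   if (usize < 0)
> >>>>           return 0;
> >>>>   /* raw_data + usize promotes usize to i64: zext -> <<32, >>32 */
> >>>>   ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> >>>>   if (ksize < 0)
> >>>>           return 0;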
> >>>
> >>> OK, I see what happened: I have some patches on my llvm tree and
> >>> forgot to pop them off before running selftests :/ This <<=32 s>>=32
> >>> pattern pops up in a few places for us and causes verifier trouble
> >>> whenever it is hit.
> >>>
> >>> I think I have a fix for this in llvm, if that is OK. And we can make
> >>> the BPF_RSH and BPF_LSH verifier bounds tighter if we also define the
> >>> architecture expectation on the jit side. For example, in the x86 jit
> >>> code here,
> >>>
> >>> 146:   shl    $0x20,%rdi
> >>> 14a:   shr    $0x20,%rdi
> >>>
> >>> the shr clears the most significant bit, so we can say something about
> >>> the minimum signed value. I'll generate a couple of patches today and
> >>> send them out to discuss. It's probably easier to explain with code
> >>> and examples.
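> >>>
> >>> A minimal illustration of that claim in plain C (not verifier code,
> >>> just showing a logical right shift by 32 leaves bit 63 clear, so the
> >>> result is always non-negative as an s64):
> >>>
> >>>   #include <assert.h>
> >>>   #include <stdint.h>
> >>>
> >>>   int main(void)
> >>>   {
> >>>           uint64_t x = 0xdeadbeefcafef00dULL;
> >>>
> >>>           x <<= 32;  /* only the low 32 bits of the input survive */
> >>>           x >>= 32;  /* logical shift: bits 32..63 are now zero */
> >>>           assert((int64_t)x >= 0 && x <= UINT32_MAX);
> >>>           return 0;
> >>>   }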
> >>
> >> How about we detect this pattern on the verifier side and replace it
> >> with a pseudo insn that does a 32-bit sign extend. Most archs have a
> >> special cpu instruction to do this much more efficiently than two
> >> shifts. If a JIT doesn't implement that pseudo yet, the verifier can
> >> convert it back to two shifts. Then during verification it will
> >> process the pseudo_sign_extend op easily. So the idea:
> >> 1. pattern match the sequence of two shifts in a pass similar to
> >>     replace_map_fd_with_map_ptr() before the main do_check()
> >>     (rough sketch below)
> >> 2. pseudo_sign_extend gets processed in do_check(), doing the right
> >>     thing with bpf_reg_state
> >> 3. JIT this pseudo insn, or convert it back
> >>
> >> Long term we can upgrade this pseudo insn into uapi and let llvm emit
> >> it.
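> >>
> >> A sketch of step 1 (hypothetical code, names made up; the real pass
> >> would live next to replace_map_fd_with_map_ptr() in verifier.c):
> >>
> >>   /* Find  rX <<= 32; rX s>>= 32  pairs so they can be rewritten
> >>    * as a single sign-extend pseudo insn.
> >>    */
> >>   static void find_sext_pattern(struct bpf_verifier_env *env)
> >>   {
> >>           struct bpf_insn *insns = env->prog->insnsi;
> >>           int i;
> >>
> >>           for (i = 0; i + 1 < env->prog->len; i++) {
> >>                   struct bpf_insn *shl = &insns[i], *sra = &insns[i + 1];
> >>
> >>                   if (shl->code == (BPF_ALU64 | BPF_LSH | BPF_K) &&
> >>                       sra->code == (BPF_ALU64 | BPF_ARSH | BPF_K) &&
> >>                       shl->dst_reg == sra->dst_reg &&
> >>                       shl->imm == 32 && sra->imm == 32) {
> >>                           /* rewrite the pair as pseudo_sign_extend */
> >>                   }
> >>           }
> >>   }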
> > 
> > I'm not sure pattern matching in the verifier is best. This particular
> > case of lsh/rsh games is the result of the BPF backend generating it
> > from an LLVM IR zext.
> > 
> > Here is the LLVM IR generated from test_get_stack_rawtp that produces
> > the zext.
> > 
> > 
> >   %26 = call i32 inttoptr (i64 67 to i32 (i8*, i8*, i32, i64)*)(i8* %0, i8* nonnull %23, i32 800, i64 256) #3, !dbg !166
> >    call void @llvm.dbg.value(metadata i32 %26, metadata !124, metadata !DIExpression()), !dbg !130
> >    %27 = icmp slt i32 %26, 0, !dbg !167
> >    br i1 %27, label %41, label %28, !dbg !169
> > 
> > 28:                                               ; preds = %25
> >    %29 = zext i32 %26 to i64, !dbg !170
> >    %30 = getelementptr i8, i8* %23, i64 %29, !dbg !170
> > 
> > 
> > Clang wants to do a zext because we are promoting an i32 to an i64. In
> > the BPF backend code we pattern match this as follows,
> > 
> >   def : Pat<(i64 (zext GPR32:$src)),
> >             (SRL_ri (SLL_ri (MOV_32_64 GPR32:$src), 32), 32)>;
> > 
> > Which generates the object code (again from test_get_stack_rawtp),
> > 
> >        56:       bc 81 00 00 00 00 00 00 w1 = w8
> >        57:       67 01 00 00 20 00 00 00 r1 <<= 32
> >        58:       77 01 00 00 20 00 00 00 r1 >>= 32
> > 
> > Unfortunately, this is a pretty poor form of zext from the verifier's
> > point of view; it completely nukes the bounds, as you observed. So how
> > about doing something a bit simpler on the backend side? Noting that a
> > 32-bit to 32-bit register move zero-extends on x86, and that we already
> > make that assumption elsewhere, it should be safe to implement the zext
> > from the object dump above as just the mov
> > 
> >    w1 = w8
> > 
> > Which we can implement in the backend with this patch,
> > 
> > diff --git a/llvm/lib/Target/BPF/BPFInstrInfo.td b/llvm/lib/Target/BPF/BPFInstrInfo.td
> > index 0f39294..a187103 100644
> > --- a/llvm/lib/Target/BPF/BPFInstrInfo.td
> > +++ b/llvm/lib/Target/BPF/BPFInstrInfo.td
> > @@ -733,7 +733,7 @@ def : Pat<(i64 (sext GPR32:$src)),
> >             (SRA_ri (SLL_ri (MOV_32_64 GPR32:$src), 32), 32)>;
> >   
> >   def : Pat<(i64 (zext GPR32:$src)),
> > -          (SRL_ri (SLL_ri (MOV_32_64 GPR32:$src), 32), 32)>;
> > +          (MOV_32_64 GPR32:$src)>;
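> >
> > FWIW the interpreter already treats the 32-bit mov as a zero-extend as
> > well (from kernel/bpf/core.c, so the assumption above holds there too):
> >
> >         ALU_MOV_X:
> >                 DST = (u32) SRC;
> >                 CONT;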
> 
> Thanks. This should work.
> 
> Also, in the BPF backend, we have a pass in BPFMIPeephole.cpp that
> removes such <<= and >>= pairs in general. It may miss some cases.
> 
> But the verifier seems to handle <<= and >>= correctly, right? Even if
> we keep the shifts, the verifier should reach the same conclusion as
> without them, right?
> 

No, the verifier marks it unknown and doesn't track it fully. We can
perhaps improve the verifier, but the above is a nice fix/improvement
for the backend imo regardless.

	case BPF_LSH:
		if (umax_val >= insn_bitness) {
			/* Shifts greater than 31 or 63 are undefined.
			 * This includes shifts by a negative number.
			 */
			mark_reg_unknown(env, regs, insn->dst_reg);
			break;
		}
		/* We lose all sign bit information (except what we can pick
		 * up from var_off)
		 */
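
For context, that case then continues (abridged from the v5.5-era
adjust_scalar_min_max_vals()) by throwing the signed bounds away
entirely, which is exactly where smax_value gets lost:

	dst_reg->smin_value = S64_MIN;
	dst_reg->smax_value = S64_MAX;
	/* If we might shift our top bit out, then we know nothing */
	if (dst_reg->umax_value > 1ULL << (63 - umax_val)) {
		dst_reg->umin_value = 0;
		dst_reg->umax_value = U64_MAX;
	} else {
		dst_reg->umin_value <<= umin_val;
		dst_reg->umax_value <<= umax_val;
	}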

> >   
> > Now the new object code is simply,
> > 
> >        54:       c6 08 14 00 00 00 00 00 if w8 s< 0 goto +20 <LBB0_6>
> >        55:       1c 89 00 00 00 00 00 00 w9 -= w8
> >        56:       bc 81 00 00 00 00 00 00 w1 = w8
> >        57:       bf 72 00 00 00 00 00 00 r2 = r7
> >        58:       0f 12 00 00 00 00 00 00 r2 += r1
> >        59:       bf 61 00 00 00 00 00 00 r1 = r6
> >        60:       bc 93 00 00 00 00 00 00 w3 = w9
> >        61:       b7 04 00 00 00 00 00 00 r4 = 0
> >        62:       85 00 00 00 43 00 00 00 call 67
> > ;       if (ksize < 0)
> > 
> > That is the block from your original trace. But one issue still
> > remains: the above llvm backend update alone doesn't fix the verifier
> > problem created by my patch, because in the false branch after line 54
> > above we don't have the right bounds.
> > 
> >   53: (bc) w8 = w0
> >   ; if (usize < 0)
> >   54: (c6) if w8 s< 0x0 goto pc+20
> >    R0=inv(id=0,smax_value=800) R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R8_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R9=inv800 R10=fp0 fp-8=mmmm????
> >   ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> >   55: (1c) w9 -= w8
> >   ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> >   56: (bc) w1 = w8
> >   57: (bf) r2 = r7
> >   58: (0f) r2 += r1
> >    R0_rw=invP(id=0,smax_value=800) R6=ctx(id=0,off=0,imm=0) R7_rw=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R9_rw=inv800 R10=fp0 fp-8=mmmm????
> >   parent didn't have regs=1 stack=0 marks
> >   last_idx 52 first_idx 40
> >   regs=1 stack=0 before 52: (85) call bpf_get_stack#67
> >   ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> >   59: (bf) r1 = r6
> >   60: (bc) w3 = w9
> >   61: (b7) r4 = 0
> >   62: (85) call bpf_get_stack#67
> > 
> > At line 54, R0 has bounds [SMIN, 800], which by line 53 are also the
> > bounds in w8, remembering that the 32-bit move zero-extends there. So
> > we should expect the false branch to have bounds [0, 800], exactly as
> > we want. But we instead get only a umax_value? Digging deeper, we are
> > failing here,
> > 
> >   /* Return true if VAL is compared with a s64 sign extended from s32, and they
> >    * are with the same signedness.
> >    */
> >   static bool cmp_val_with_extended_s64(s64 sval, struct bpf_reg_state *reg)
> >   {
> >           return ((s32)sval >= 0 &&
> >                   reg->smin_value >= 0 && reg->smax_value <= S32_MAX) ||
> >                  ((s32)sval < 0 &&
> >                   reg->smax_value <= 0 && reg->smin_value >= S32_MIN);
> >   }
> > 
> > This appears too conservative: in the trace above, R8's smax_value is
> > 4294967295 > S32_MAX, so the first clause fails even though R8 holds a
> > zero-extended 32-bit value and sval is 0. I'll need to analyze that a
> > bit, but it should be safe to relax it to catch the above <0 case.
> > After that I expect we should be passing again.
> > 
> > Sorry for the long thread, but those are the details. What do you
> > think? In the meantime I'll generate the relaxed bounds in
> > cmp_val_with_extended_s64 and see what we can cook up with Daniel. This
> > avoids pseudo instructions and pattern matching, which I think makes it
> > a bit more general.
> > 
> > Thanks,
> > John
> > 


