From mboxrd@z Thu Jan 1 00:00:00 1970 From: Edward Cree Subject: Re: [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications Date: Tue, 3 Apr 2018 14:39:11 +0100 Message-ID: <554dc916-2eed-dabc-1776-eca6d8bf019b@solarflare.com> References: <20180403010802.jkqffxw4m75oioj7@ast-mbp> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit Cc: Daniel Borkmann , To: Alexei Starovoitov Return-path: Received: from dispatch1-us1.ppe-hosted.com ([148.163.129.52]:36340 "EHLO dispatch1-us1.ppe-hosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750722AbeDCNkE (ORCPT ); Tue, 3 Apr 2018 09:40:04 -0400 In-Reply-To: <20180403010802.jkqffxw4m75oioj7@ast-mbp> Content-Language: en-GB Sender: netdev-owner@vger.kernel.org List-ID: On 03/04/18 02:08, Alexei Starovoitov wrote: > I like patch 3 and going to play with it. > How did you test it ? Just test_verifier and test_progs (the latter has a failure  "test_bpf_obj_id:FAIL:get-prog-info(fd) err 0 errno 2 i 0 type 1(1) info_len 80(40) jit_enabled 0 jited_prog_len 0 xlated_prog_len 72"  but that was already there before my patch). > Do you have processed_insn numbers for > cilium or selftests programs before and after? > There should be no difference, right? That's a good check, I'll do that. > As far as patch 1 it was very difficult to review, since several logically > different things clamped together. So breaking it apart: > - converting two arrays of subprog_starts and subprog_stack_depth into > single array of struct bpf_subprog_info is a good thing > - tsort is interesting, but not sure it's correct. more below > - but main change of combining subprog logic with do_check is no good. > There will be no 'do_check() across whole program' walk. > Combining subprog pass with do_check is going into opposite direction > of this long term work. Divide and conquer. Combining more things into > do_check is the opposite of this programming principle. The main object of my change here was to change the data structures, to  store a subprogno in each insn aux rather than using bsearch() on the  subprog_starts.  I have now figured out the algorithm to do this in its  own pass (previously I thought this needed a recursive walk which is why  I wanted to roll it into do_check() - doing more than one whole-program  recursive walk seems like a bad idea.) > My short term plan is to split basic instruction correctness checks > out of do_check() loop into separate pass and run it early on. I agree with that short term plan, sounds like a good idea. I'm still not sure I understand the long-term plan, though; since most  insns' input registers will still need to be checked (I'm assuming  majority of most real ebpf programs consists of computing and  dereferencing pointers), the data flow analysis will have to end up  doing all the same register updates current do_check() does (though  potentially in a different order), e.g. if a function is called three  times it will have to analyse it with three sets of input registers. Unless you have some way of specifying function preconditions, I don't  see how it works.  In particular something like     char *f(char *p)     {         *p++ = 0;         return p;     }     int main(void)     {         char *p = "abc"; /* represents getting a ptr from ctx or map */         p = f(p);         p = f(p);         p = f(p);         return 0;     }  seems as though it would be difficult to analyse in any way more  scalable than the current full recursive walk.  Unless you somehow  propagate register state _backwards_ as constraints when analysing a  function...?  In any case it seems like there are likely to be things  which current verifier accepts which require 'whole-program' analysis  to determine the dataflow (e.g. what if there were some conditional  branches in f(), and the preconditions on p depended on the value of  some other arg in r2?) > As far as tsort approach for determining max stack depth... > It's an interesting idea, but implementation is suffering from the same > 'combine everything into one loop' coding issue. > I think updating total_stack_depth math should be separate from sorting, > since the same function can be part of different stacks with different max. > I don't see how updating global subprog_info[i].total_stack_depth can > be correct. It has to be different for every stack and the longest > stack is not necessarily the deepest. May be I'm missing something, > but combined sort and stack_depth math didn't make it easy to review. > I find existing check_max_stack_depth() algo much easier to understand. The sort _is_ the way to compute total stack depth.  The sort order isn't  being stored anywhere; it's being done just so that each subprog gets  looked at after all its callers have been considered.  So when it gets  selected as a 'frontier node', its maximum stack depth is known, and can  thus be used to update its callees (note that we do a max_t() with each  callee's existing total_stack_depth, thus getting the deepest stack of  all call chains to the function). It may help to imagine drawing the call graph and labelling each node with  a stack depth as it is visited; sadly that's difficult to show in an email  (or a code comment).  But I can try to explain it a bit better than  "/* Update callee total stack depth */". I will also try to split up patch #1 into more pieces.  I mistakenly thought  that existing check_max_stack_depth() depended on some invariants that I was  removing, but I guess that was only true while I had non-contiguous subprogs. -Ed