From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755519AbdKCVKG (ORCPT <rfc822;w@1wt.eu>);
        Fri, 3 Nov 2017 17:10:06 -0400
Received: from mail-pf0-f195.google.com ([209.85.192.195]:45290 "EHLO
        mail-pf0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752114AbdKCVKE (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Fri, 3 Nov 2017 17:10:04 -0400
X-Google-Smtp-Source: ABhQp+SLn1ad7Gk1OmnKNGhC5O22lTpIaZzyg+xh3TYsWogo+Pwwz8gdKzsBz/b1MjFJ11/bsWFCLA==
Date: Sat, 4 Nov 2017 06:07:56 +0900
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: Josef Bacik <josef@toxicpanda.com>, rostedt@goodmis.org,
        mingo@redhat.com, davem@davemloft.net, netdev@vger.kernel.org,
        linux-kernel@vger.kernel.org, ast@kernel.org, kernel-team@fb.com,
        Josef Bacik <jbacik@fb.com>
Subject: Re: [PATCH 1/2] bpf: add a bpf_override_function helper
Message-ID: <20171103210753.odvacnyh56krj7zn@ast-mbp>
References: <1509633431-2184-1-git-send-email-josef@toxicpanda.com>
 <1509633431-2184-2-git-send-email-josef@toxicpanda.com>
 <59FBA64D.1050400@iogearbox.net>
 <20171103143135.bnlwu7hmtgmgjdri@destiny>
 <59FC9EC6.3060900@iogearbox.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <59FC9EC6.3060900@iogearbox.net>
User-Agent: NeoMutt/20170421 (1.8.2)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Nov 03, 2017 at 05:52:22PM +0100, Daniel Borkmann wrote:
> On 11/03/2017 03:31 PM, Josef Bacik wrote:
> > On Fri, Nov 03, 2017 at 12:12:13AM +0100, Daniel Borkmann wrote:
> > > Hi Josef,
> > > 
> > > one more issue I just noticed, see comment below:
> > > 
> > > On 11/02/2017 03:37 PM, Josef Bacik wrote:
> > > [...]
> > > > diff --git a/include/linux/filter.h b/include/linux/filter.h
> > > > index cdd78a7beaae..dfa44fd74bae 100644
> > > > --- a/include/linux/filter.h
> > > > +++ b/include/linux/filter.h
> > > > @@ -458,7 +458,8 @@ struct bpf_prog {
> > > >    				locked:1,	/* Program image locked? */
> > > >    				gpl_compatible:1, /* Is filter GPL compatible? */
> > > >    				cb_access:1,	/* Is control block accessed? */
> > > > -				dst_needed:1;	/* Do we need dst entry? */
> > > > +				dst_needed:1,	/* Do we need dst entry? */
> > > > +				kprobe_override:1; /* Do we override a kprobe? */
> > > >    	kmemcheck_bitfield_end(meta);
> > > >    	enum bpf_prog_type	type;		/* Type of BPF program */
> > > >    	u32			len;		/* Number of filter blocks */
> > > [...]
> > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > index d906775e12c1..f8f7927a9152 100644
> > > > --- a/kernel/bpf/verifier.c
> > > > +++ b/kernel/bpf/verifier.c
> > > > @@ -4189,6 +4189,8 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
> > > >    			prog->dst_needed = 1;
> > > >    		if (insn->imm == BPF_FUNC_get_prandom_u32)
> > > >    			bpf_user_rnd_init_once();
> > > > +		if (insn->imm == BPF_FUNC_override_return)
> > > > +			prog->kprobe_override = 1;
> > > >    		if (insn->imm == BPF_FUNC_tail_call) {
> > > >    			/* If we tail call into other programs, we
> > > >    			 * cannot make any assumptions since they can
> > > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > > index 9660ee65fbef..0d7fce52391d 100644
> > > > --- a/kernel/events/core.c
> > > > +++ b/kernel/events/core.c
> > > > @@ -8169,6 +8169,13 @@ static int perf_event_set_bpf_prog(struct perf_event *event, u32 prog_fd)
> > > >    		return -EINVAL;
> > > >    	}
> > > > 
> > > > +	/* Kprobe override only works for kprobes, not uprobes. */
> > > > +	if (prog->kprobe_override &&
> > > > +	    !(event->tp_event->flags & TRACE_EVENT_FL_KPROBE)) {
> > > > +		bpf_prog_put(prog);
> > > > +		return -EINVAL;
> > > > +	}
> > > 
> > > Can we somehow avoid the prog->kprobe_override flag here completely
> > > and also same in the perf_event_attach_bpf_prog() handler?
> > > 
> > > Reason is that it's not reliable for bailing out this way: Think of
> > > the main program you're attaching doesn't use bpf_override_return()
> > > helper, but it tail-calls into other BPF progs that make use of it
> > > instead. So above check would be useless and will fail and we continue
> > > to attach the prog for probes where it's not intended to be used.
> > > 
> > > We've had similar issues in the past e.g. c2002f983767 ("bpf: fix
> > > checking xdp_adjust_head on tail calls") is just one of those. Thus,
> > > can we avoid the flag altogether and handle such error case differently?
> > 
> > So if I'm reading this right there's no way to know what we'll tail call at any
> > given point, so I need to go back to my previous iteration of this patch and
> > always save the state of the kprobe in the per-cpu variable to make sure we
> > don't use bpf_override_return in the wrong case?
> 
> Yeah.
> 
> > The tail call functions won't be in the BPF_PROG_ARRAY right?  It'll be just
> > some other arbitrary function?  If that's the case then we really need something
> > like this
> 
> With BPF_PROG_ARRAY you mean BPF_MAP_TYPE_PROG_ARRAY or the prog array
> for the tracing/multiprog attach point? The program you're calling into
> is inside the BPF_MAP_TYPE_PROG_ARRAY map, but can change at any time
> and can have nesting as well.
> 
> > https://patchwork.kernel.org/patch/10034815/
> > 
> > and I need to bring that back right?  Thanks,
> 
> I'm afraid so. The thing with skb cb_access which was brought up there is
> that once you have a tail call in the prog you cannot make any assumptions
> anymore, therefore the cb_access flag is set to 1 so we save/restore for
> those cases precautionary since it could be accessed or not later on. In
> your case I think this wouldn't work since legitimate bpf kprobes progs could
> use tail calls today, so setting prog->kprobe_override there would prevent
> attaching for non-kprobes due to subsequent flags & TRACE_EVENT_FL_KPROBE
> check.

how about preventing programs that use bpf_override_return from being
stored into prog_array? imo that would be a cleaner solution.
doing tail_call into them is kinda useless anyway.
Then we can keep the bit and fast path.
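
To illustrate the idea, here is a rough userspace sketch, not kernel code.
The mock_prog and mock_prog_array types and mock_prog_array_update() are
hypothetical stand-ins made up for this example; only the kprobe_override
bit comes from the patch. The assumption is that the prog_array update path
can see the bit and refuse the program, so a tail call can never reach a
program that uses the helper:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bpf_prog; only the bit we care about. */
struct mock_prog {
	unsigned int kprobe_override:1;	/* set when the prog uses the override helper */
};

#define MOCK_PROG_ARRAY_SIZE 4

/* Hypothetical stand-in for a BPF_MAP_TYPE_PROG_ARRAY map. */
struct mock_prog_array {
	struct mock_prog *slots[MOCK_PROG_ARRAY_SIZE];
};

/*
 * Store prog at index idx, refusing programs that use the override helper.
 * Returns 0 on success, -E2BIG for an out-of-range index, and -EINVAL for
 * a NULL program or one with kprobe_override set.
 */
static int mock_prog_array_update(struct mock_prog_array *arr, size_t idx,
				  struct mock_prog *prog)
{
	if (idx >= MOCK_PROG_ARRAY_SIZE)
		return -E2BIG;
	if (!prog || prog->kprobe_override)
		return -EINVAL;

	arr->slots[idx] = prog;
	return 0;
}
```

With a check like this at insertion time, the attach-time test on
prog->kprobe_override only has to look at the main program, since nothing
reachable through a tail call can have the bit set.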