bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Ido Schimmel <idosch@idosch.org>
Cc: davem@davemloft.net, daniel@iogearbox.net, x86@kernel.org,
	netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com,
	edumazet@google.com
Subject: Re: [PATCH v2 bpf-next 12/12] selftests/bpf: add kfree_skb raw_tp test
Date: Thu, 10 Oct 2019 12:07:20 -0700	[thread overview]
Message-ID: <20191010190634.lpxo5n5qpef3nwdj@ast-mbp.dhcp.thefacebook.com> (raw)
In-Reply-To: <20191010110729.GA21703@splinter>

On Thu, Oct 10, 2019 at 02:07:29PM +0300, Ido Schimmel wrote:
> On Wed, Oct 09, 2019 at 09:15:03PM -0700, Alexei Starovoitov wrote:
> > +SEC("raw_tracepoint/kfree_skb")
> > +int trace_kfree_skb(struct trace_kfree_skb *ctx)
> > +{
> > +	struct sk_buff *skb = ctx->skb;
> > +	struct net_device *dev;
> > +	int ifindex;
> > +	struct callback_head *ptr;
> > +	void *func;
> > +
> > +	__builtin_preserve_access_index(({
> > +		dev = skb->dev;
> > +		ifindex = dev->ifindex;
> 
> Hi Alexei,
> 
> The patchset looks very useful. One question: Is it always safe to
> access 'skb->dev->ifindex' here? I'm asking because 'dev' is inside a
> union with 'dev_scratch' which is 'unsigned long' and therefore might
> not always be a valid memory address. Consider for example the following
> code path:
> 
> ...
> __udp_queue_rcv_skb(sk, skb)
> 	__udp_enqueue_schedule_skb(sk, skb)
> 		udp_set_dev_scratch(skb)
> 		// returns error
> 	...
> 	kfree_skb(skb) // ebpf program is invoked
> 
> How is this handled by eBPF?

Excellent question. There are cases like this where the verifier cannot possibly
track semantics of the kernel code and union of pointer with scratch area
like this. That's why every access through btf pointer is a hidden probe_read.
Comparing to old school tracing all memory accesses were probe_read
and bpf prog was free to read anything and page fault everywhere.
Now bpf prog will almost always access correct data. Yet corner cases like
this are inevitable. I'm working on few ideas how to improve it further
with btf-tagged slab allocations and kasan-like memory shadowing.

Your question made me thinking whether we have a long standing issue
with dev_scratch, since even classic bpf has SKF_AD_IFINDEX hack
which is implemented as:
    *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, dev),
                          BPF_REG_TMP, BPF_REG_CTX,
                          offsetof(struct sk_buff, dev));
    /* if (tmp != 0) goto pc + 1 */
    *insn++ = BPF_JMP_IMM(BPF_JNE, BPF_REG_TMP, 0, 1);
    *insn++ = BPF_EXIT_INSN();
    if (fp->k == SKF_AD_OFF + SKF_AD_IFINDEX)
            *insn = BPF_LDX_MEM(BPF_W, BPF_REG_A, BPF_REG_TMP,
                                offsetof(struct net_device, ifindex));

That means for long time [c|e]BPF code was checking skb->dev for NULL only.
I've analyzed the code where socket filters can be called and I think it's good
there. dev_scratch is used after sk_filter has run.
But there are other hooks: lwt, various cgroups.
I've checked lwt and cgroup inet/egress. I think dev_scratch should not be
used in these paths. So should be good there as well.
But I think the whole idea of aliasing scratch into 'dev' pointer is dangerous.
There are plenty of tracepoints that do skb->dev->foo. Hard to track where
everything is called.
I think udp code need to move this dev_scratch into some other place in skb.


  reply	other threads:[~2019-10-10 19:07 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-10  4:14 [PATCH v2 bpf-next 00/12] bpf: revolutionize bpf tracing Alexei Starovoitov
2019-10-10  4:14 ` [PATCH v2 bpf-next 01/12] bpf: add typecast to raw_tracepoints to help BTF generation Alexei Starovoitov
2019-10-10  4:14 ` [PATCH v2 bpf-next 02/12] bpf: add typecast to bpf helpers " Alexei Starovoitov
2019-10-10  4:14 ` [PATCH v2 bpf-next 03/12] bpf: process in-kernel BTF Alexei Starovoitov
2019-10-11 17:56   ` Andrii Nakryiko
2019-10-10  4:14 ` [PATCH v2 bpf-next 04/12] bpf: add attach_btf_id attribute to program load Alexei Starovoitov
2019-10-11 17:58   ` Andrii Nakryiko
2019-10-10  4:14 ` [PATCH v2 bpf-next 05/12] libbpf: auto-detect btf_id of raw_tracepoint Alexei Starovoitov
2019-10-11 18:02   ` Andrii Nakryiko
2019-10-11 18:07   ` Andrii Nakryiko
2019-10-12  0:40     ` Alexei Starovoitov
2019-10-12  1:29       ` Alexei Starovoitov
2019-10-12  4:38         ` Andrii Nakryiko
2019-10-12  4:53           ` Alexei Starovoitov
2019-10-12  4:39       ` Andrii Nakryiko
2019-10-10  4:14 ` [PATCH v2 bpf-next 06/12] bpf: implement accurate raw_tp context access via BTF Alexei Starovoitov
2019-10-11 18:31   ` Andrii Nakryiko
2019-10-11 23:13     ` Andrii Nakryiko
2019-10-10  4:14 ` [PATCH v2 bpf-next 07/12] bpf: attach raw_tp program with BTF via type name Alexei Starovoitov
2019-10-11 18:44   ` Andrii Nakryiko
2019-10-10  4:14 ` [PATCH v2 bpf-next 08/12] bpf: add support for BTF pointers to interpreter Alexei Starovoitov
2019-10-10  4:15 ` [PATCH v2 bpf-next 09/12] bpf: add support for BTF pointers to x86 JIT Alexei Starovoitov
2019-10-11 18:48   ` Andrii Nakryiko
2019-10-10  4:15 ` [PATCH v2 bpf-next 10/12] bpf: check types of arguments passed into helpers Alexei Starovoitov
2019-10-11 19:02   ` Andrii Nakryiko
2019-10-12  1:39     ` Alexei Starovoitov
2019-10-12  4:25       ` Andrii Nakryiko
2019-10-10  4:15 ` [PATCH v2 bpf-next 11/12] bpf: disallow bpf_probe_read[_str] helpers Alexei Starovoitov
2019-10-11 19:03   ` Andrii Nakryiko
2019-10-10  4:15 ` [PATCH v2 bpf-next 12/12] selftests/bpf: add kfree_skb raw_tp test Alexei Starovoitov
2019-10-10 11:07   ` Ido Schimmel
2019-10-10 19:07     ` Alexei Starovoitov [this message]
2019-10-11 19:05   ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191010190634.lpxo5n5qpef3nwdj@ast-mbp.dhcp.thefacebook.com \
    --to=alexei.starovoitov@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=idosch@idosch.org \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).