From: Alexei Starovoitov <ast@plumgrid.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Andi Kleen <andi@firstfloor.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
	Tom Zanussi <tom.zanussi@linux.intel.com>,
	Jovi Zhangwei <jovi.zhangwei@gmail.com>,
	Eric Dumazet <edumazet@google.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH tip 0/5] tracing filters with BPF
Date: Tue, 10 Dec 2013 18:32:47 -0800
Message-ID: <CAMEtUuzaCMa6gSgoKvuzXPwo5d+2oFg9URiakKECh2nYSD8o9g@mail.gmail.com>
In-Reply-To: <20131210154748.GA1950@gmail.com>

On Tue, Dec 10, 2013 at 7:47 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Alexei Starovoitov <ast@plumgrid.com> wrote:
>
>> > I'm fine if it becomes a requirement to have a vmlinux built with
>> > DEBUG_INFO to use BPF and have a tool like perf to translate the
>> > filters. But that must not replace what the current filters do
>> > now. That is, it can be an add-on, but not a replacement.
>>
>> Of course. tracing filters via bpf is an additional tool for kernel
>> debugging. bpf by itself has use cases beyond tracing.
>
> Well, Steve has a point: forcing DEBUG_INFO is a big showstopper for
> most people.

there is a misunderstanding here.
I was saying 'of course' to 'not replace current filter infra'.

bpf does not depend on debug info.
That's the key difference between 'perf probe' approach and bpf filters.

Masami is right that what I was trying to achieve with bpf filters
is similar to 'perf probe': insert a dynamic probe anywhere
in the kernel, walk pointers, data structures, print interesting stuff.

'perf probe' does it by scanning vmlinux with debug info.
bpf filters don't need it.
The tools/bpf/trace/*_orig.c examples only depend on linux headers
in /lib/modules/../build/include/
Today the bpf compiler's struct layout is the same as x86_64's.

Tomorrow the bpf compiler will have flags to adjust the endianness,
pointer size, etc. of the front-end, similar to the -m32/-m64 and
-m*-endian flags.
The neat part is that I don't need to do any extra work, just enable it
properly in the bpf backend. From the gcc/llvm point of view, bpf is yet
another 'hw' architecture that the compiler is emitting code for.
So when the C code of filter_ex1_orig.c does 'skb->dev', the compiler
determines the field offset by looking at /lib/modules/.../include/skbuff.h
whereas for 'perf probe' 'skb->dev' means walking debug info.
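
To illustrate the point about header-derived offsets, here is a minimal
sketch in plain C. The struct names (sk_buff_ex, net_device_ex) and the
helpers load_dev() and layout_demo() are hypothetical stand-ins, not the
real kernel definitions or anything from the patch set. It only shows
that once the struct definition is visible to the compiler, a field
access reduces to a load at a constant offset, with no debug info
involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structs; the real layouts come
 * from the installed kernel headers and are much larger. */
struct net_device_ex {
    char name[16];
};

struct sk_buff_ex {
    struct sk_buff_ex *next;
    struct sk_buff_ex *prev;
    struct net_device_ex *dev;
};

/* What 'skb->dev' boils down to after compilation: read a pointer at a
 * constant byte offset that the compiler computed from the header. */
static struct net_device_ex *load_dev(const void *skb)
{
    const char *base = (const char *)skb;

    return *(struct net_device_ex *const *)
            (base + offsetof(struct sk_buff_ex, dev));
}

/* Small self-check: the raw offset load agrees with the field access. */
int layout_demo(void)
{
    struct net_device_ex dev = { "eth0" };
    struct sk_buff_ex skb = { NULL, NULL, &dev };

    assert(load_dev(&skb) == skb.dev);
    return load_dev(&skb) == &dev;
}
```

The offset is fixed at compile time for whichever layout the compiler
targets, which is why the choice of target layout matters for a filter.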

Something like: cc1 -mlayout_x86_64 filter.c will produce bpf code that
walks all data structures in the same way x86_64 does it.
Even if the user makes a mistake and uses -mlayout_aarch64, it won't crash.
Note that all -m* flags will be in one compiler. It won't grow any bigger
because of that. All of it is already supported by the C front-ends.
It may sound complex, but it is really very little code for the bpf backend.

I haven't looked inside systemtap/ktap closely enough to say how much
they rely on the presence of debug info, so I can't make a comparison.

I see two main use cases for bpf tracing filters: debugging a live kernel
and collecting stats. Same tricks that [sk]tap do with their maps.
Or maybe some of the stats that 'perf record' collects in userspace
can be collected by a bpf filter in the kernel and stored into a generic
bpf table?

> Would it be possible to make BPF filters recognize exposed details
> like the current filters do, without depending on the vmlinux?

Well, if you say that the presence of linux headers is also too much to
ask, I can hook bpf in after the probes have stored all the args.

This way the current simple filter syntax can move to userspace.
'arg1==x || arg2!=y' can be parsed by userspace, bpf code
generated and fed into kernel. It will be faster than walk_pred_tree(),
but if we cannot remove 2k lines from trace_events_filter.c
because of backward compatibility, extra performance becomes
the only reason to have two different implementations.
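
As a rough sketch of what "parsed by userspace, bpf code generated"
could look like for an OR-chain such as 'arg1==x || arg2!=y', consider
the toy example below. Everything in it is an assumption made for
illustration: the toy_* names, the instruction layout and the tiny
interpreter are hypothetical and are NOT the real BPF instruction
format or the interface proposed in this series. Parsing of the filter
string is left out; the terms are given directly, with arg1 mapped to
index 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy encoding, NOT the real BPF instruction format. */
enum toy_op { TOY_JEQ, TOY_JNE, TOY_RET };

struct toy_insn {
    enum toy_op op;
    unsigned arg;   /* index into the stored probe arguments */
    uint64_t imm;   /* compare constant, or return value for TOY_RET */
    unsigned jt;    /* extra insns to skip when the comparison holds */
};

/* One term of the filter: args[arg] == value, or != when negate is set. */
struct toy_term {
    unsigned arg;
    int negate;
    uint64_t value;
};

/* Emit one conditional jump per term. A term that holds jumps to the
 * accept insn; otherwise control falls through to the next term and
 * finally to the reject insn. 'out' needs room for n + 2 insns. */
static size_t toy_compile_or(const struct toy_term *terms, size_t n,
                             struct toy_insn *out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        out[i].op = terms[i].negate ? TOY_JNE : TOY_JEQ;
        out[i].arg = terms[i].arg;
        out[i].imm = terms[i].value;
        out[i].jt = (unsigned)(n - i);      /* lands on accept */
    }
    out[n] = (struct toy_insn){ TOY_RET, 0, 0, 0 };      /* reject */
    out[n + 1] = (struct toy_insn){ TOY_RET, 0, 1, 0 };  /* accept */
    return n + 2;
}

/* Minimal interpreter for the toy program; returns 1 to accept. */
static uint64_t toy_run(const struct toy_insn *prog, size_t len,
                        const uint64_t *args)
{
    size_t pc = 0;

    while (pc < len) {
        const struct toy_insn *in = &prog[pc];
        int hit = 0;

        if (in->op == TOY_RET)
            return in->imm;
        if (in->op == TOY_JEQ)
            hit = args[in->arg] == in->imm;
        else
            hit = args[in->arg] != in->imm;
        pc += 1 + (hit ? in->jt : 0);
    }
    return 0;
}

/* Usage: 'arg1==5 || arg2!=7' evaluated against two argument sets. */
int toy_demo(void)
{
    const struct toy_term terms[2] = { { 0, 0, 5 }, { 1, 1, 7 } };
    const uint64_t match[2] = { 5, 7 };
    const uint64_t miss[2] = { 4, 7 };
    struct toy_insn prog[4];
    size_t len = toy_compile_or(terms, 2, prog);

    assert(toy_run(prog, len, match) == 1);
    assert(toy_run(prog, len, miss) == 0);
    return 1;
}
```

The flat jump sequence stands in for what a predicate tree walk would
otherwise evaluate node by node; whether that is actually faster in
practice is not something this sketch measures.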

Another use case is to optimize fetch sequences of dynamic probes
as Masami suggested, but the backward compatibility requirement
would preserve two ways of doing it as well.

imo the current hook of bpf into tracing is more compelling, but let me
think more about reusing data stored in the ring buffer.

Thanks
Alexei
