From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754517AbbBNXDE (ORCPT ); Sat, 14 Feb 2015 18:03:04 -0500
Received: from mail-qg0-f41.google.com ([209.85.192.41]:47976 "EHLO
	mail-qg0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754382AbbBNXDC (ORCPT ); Sat, 14 Feb 2015 18:03:02 -0500
MIME-Version: 1.0
From: Alexei Starovoitov
Date: Sat, 14 Feb 2015 18:02:41 -0500
Message-ID:
Subject: Re: [PATCH v3 linux-trace 1/8] tracing: attach eBPF programs to tracepoints and syscalls
To: Hekuang
Cc: Steven Rostedt, Ingo Molnar, Namhyung Kim, Arnaldo Carvalho de Melo,
	Jiri Olsa, Masami Hiramatsu, Linux API, Network Development, LKML,
	Linus Torvalds, Peter Zijlstra, "Eric W. Biederman", wangnan0@huawei.com
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Feb 11, 2015 at 11:58 PM, Hekuang wrote:
>
>>> eBPF is very flexible, which means it is bound to have someone use it
>>> in a way you never dreamed of, and that will be what bites you in the
>>> end (pun intended).
>>
>> understood :)
>> let's start slow then with bpf+syscall and bpf+kprobe only.
>
>
> I think BPF + system calls/kprobes can meet our use case
> (https://lkml.org/lkml/2015/2/6/44), but there're some issues to be
> improved.
>
> I suggest that you can improve bpf+kprobes when attached to function
> headers(or TRACE_MARKERS), make it converts pt-regs to bpf_ctx->arg1,
> arg2.., then top models and architectures can be separated by bpf.
>
> BPF bytecode is cross-platform, but what we can get by using bpf+kprobes
> is a 'regs->rdx' kind of information, such information is both
> architecture and kernel version related.

For kprobes in the middle of a function, the kernel cannot convert
pt_regs into argN. Placement was decided by the compiler and can only
be found in debug info.
I think bpf+kprobe will be using it when it is available.
When there is no debug info, kprobes will be limited to function
entry, and the mapping of regs/stack into argN can be done by user
space depending on the architecture. So user tracing scripts in some
higher level language can be kernel/arch independent when
'perf probe+bpf' is loading them on the fly on the given machine.

> We hope to establish some models for describing kernel procedures such
> as IO and network, which requires that it does not rely on architecture
> and does not rely to a specific kernel version as much as possible.

That's obviously a goal, but it requires a new approach to tracepoints.
I think a lot of great ideas were discussed in this thread, so I'm
hopeful that we'll come up with a solution that will satisfy even
Peter's strictest requirements :)
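
To make the user-space mapping of regs into argN at function entry a
bit more concrete, here is a minimal sketch in Python. It is only an
illustration under stated assumptions: the names ARG_REGS and
arg_to_reg are hypothetical and not part of perf or the kernel, the
register orders are the standard integer/pointer argument registers of
the System V AMD64 and AArch64 procedure call conventions, and
arguments passed on the stack or in floating point registers are not
handled.

```python
# Hypothetical sketch: resolve "argN" to a register name per architecture.
# Valid only at function entry, before the compiler moves values around.
# ARG_REGS and arg_to_reg are made-up names for illustration.

ARG_REGS = {
    # System V AMD64 calling convention: first six integer/pointer args
    "x86_64": ["rdi", "rsi", "rdx", "rcx", "r8", "r9"],
    # AArch64 procedure call standard: first eight integer/pointer args
    "arm64": ["x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7"],
}


def arg_to_reg(arch, n):
    """Return the register holding integer/pointer argument n (1-based)."""
    regs = ARG_REGS.get(arch)
    if regs is None:
        raise ValueError("unsupported architecture: %s" % arch)
    if not 1 <= n <= len(regs):
        # Further arguments live on the stack; not covered by this sketch.
        raise ValueError("arg%d is not passed in a register on %s" % (n, arch))
    return regs[n - 1]


if __name__ == "__main__":
    # The same script-level "arg3" resolves differently on each machine.
    print(arg_to_reg("x86_64", 3))  # rdx
    print(arg_to_reg("arm64", 3))   # x2
```

A tracing script written against arg1, arg2, .. could then stay
arch independent, with a loader doing this lookup on the target
machine before the program is attached.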