All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tom Zanussi <tom.zanussi@linux.intel.com>
To: Alexei Starovoitov <ast@plumgrid.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
	Namhyung Kim <namhyung@kernel.org>,
	Andi Kleen <andi@firstfloor.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Arnaldo Carvalho de Melo <acme@infradead.org>
Subject: Re: [PATCH v2 00/15] tracing: 'hist' triggers
Date: Tue, 03 Mar 2015 23:01:38 -0600	[thread overview]
Message-ID: <1425445298.20819.222.camel@picadillo> (raw)
In-Reply-To: <54F66C7C.90301@plumgrid.com>

On Tue, 2015-03-03 at 18:22 -0800, Alexei Starovoitov wrote:
> On 3/3/15 7:47 AM, Tom Zanussi wrote:
> > On Mon, 2015-03-02 at 18:31 -0800, Alexei Starovoitov wrote:
> >> On Mon, Mar 2, 2015 at 5:18 PM, Tom Zanussi <tom.zanussi@linux.intel.com> wrote:
> 
> >> I'm not proposing to use asm or C for this 'hist->bpf' tool.
> >> Keep proposed 'hist:keys=...:vals=...' syntax and generate
> >> bpf program on the fly based on this string.
> >> So this user tool will take string, generate program, load
> >> and attach it. Everything based on this single string input.
> >> With the examples you mentioned in docs, it's not hard.
> >> It will be more involved than patch 12, but way more generic
> >> and easily extensible when 'hist:keys=...' string would need
> >> to be extended.
> >>
> >
> > Hmm, this seems silly to me - doing all that work just to get back to
> > where we already are.
> 
> your 'hist->bpf' tool could have been first to solve 'bpf hard to use'
> problem and overtime it could have evolved into full dtrace alternative.
> Whereas by adding 'hist:keys=..' parsing to kernel you'll be stuck
> with it and somebody else's dtrace-like tool will supersede it.
> 

I think there's some misunderstanding there - it was never my intent to
create a full dtrace alternative, really the idea was (and has been,
even before there was any such thing as ebpf in the kernel) only to
provide access to some low-hanging fruit via the standard read/write
interfaces people are used to with ftrace.  The dtrace-like tools were
what I thought ebpf was all about, where users could go after exhausting
the possibilities of a simple table-driven first approach to a problem. 

> > I don't think you can start requiring all new tracing functionality to
> > embed a code generator/compiler, or as a result force any particular
> > interface on users.
> 
> force interface on users?!
> I think I've explained enough that bpf would not be a user visible
> interface. It's kernel to user land interface. In my proposal of
> 'hist->bpf' tool users won't even see that bpf is being generated
> and run underneath. Same with dtrace-like scripting. Users will
> see scripting language and won't care how it's compiled and
> executed inside kernel. Multiple different languages and interfaces
> are possible. Including hist-like text that's not a language.
> 

What about the kernel to user land interface that users have already
been asking for, pseudo-files in debugfs/tracefs?  If you can't do that,
then you're forcing interface on users.  

>  > If it's possible to factor out a common
> > infrastructure that can accommodate different user interfaces and
> > preferences, don't you think it makes sense to do that?
> >
> > In any case, outside of the boilerplate tracing infrastructure code, it
> > seems to me that a lot of the code in the main hist trigger patches is
> > code that you'd also need for a tracepoint/bpf interface.  I think we've
> > already agreed that it would be good to work towards being able to share
> > that, so for the next version, I'll see what I can come up with.
> 
> yes. let's see what pieces can be shared. Though I don't want to
> sacrifice long term goals for short term solution.
> In case of bpf_maps, the objective is to share data between
> kernel and user space by single abstraction with different
> implementations underneath. In some use cases kernel programs only
> do lookups into maps, while user space concurrently adding/deleting
> entries (so no allocations from programs)
> In your patch 4 you're adding kernel only access to maps but planning
> to use hash type only and also not exposing these maps to user space
> at all... that defeats bpf_map purpose and you only reusing ~200
> lines of hashtab.c code, but also need to hack it for pre-allocate
> and make config_bpf_syscall a strong dependency for 'hist'. why?
> 'hist' patch set would have been much shorter if you just used
> hashtable.h or rhashtable.h or even rbtree.h (so you don't need
> to do sorting in patch 13). The actual hash insert/walk code
> is tiny and trivial.

Well, I kind of hinted in the cover letter that ultimately I thought the
tracing_map common code would move out of bpf/syscall.c, and thus break
the dependency on bpf_syscall.  I didn't want to spend time actually
doing that before knowing how the general idea would fly, and am glad
now that I didn't.

But the reason I thought of using it in the first place was that I saw a
map implementation specially designed for tracing and thought it would
make sense to reuse it if possible for what I thought was a
complementary tracing tool.  I did find it useful, and using it did
simplify my overall code, but it seems it's more trouble than it's worth
and not particularly welcome, so yeah, I probably will just look at
something else.

Thanks for your input,

Tom

> I've looked through the patches again and they seem to have serious
> bugs when it comes to object life time:
> - patch 5 that adds clear_map is completely broken, calling
> delete_all_elements() by wrapping it in rcu_read_lock() ?!
> - patch 6 adds 'free_elem' callback that is called right from
> htab_map_delete_elem() while we're still under rcu.. ?!
> - patch 12 does
> struct hist_trigger_entry *entry;
> tracing_map_lookup_elem(hist_data->map, next_key, &entry);
> hist_trigger_entry_print(m, hist_data, next_key, entry);
> so when you're storing a pointer inside the map the rcu protection
> of map_lookup protects only the value of 'entry' pointer.
> The actual contents of *entry are not protected by anything,
> so even if you fix 5 and 6, you'll still have severe memory corruptions,
> since nothing guarantees that contents of *entry are valid when
> hist_trigger_entry_print() is trying to access it.
> 
> I think you'd be better off using vanilla hashtable.h
> 
> Thanks
> 



  reply	other threads:[~2015-03-04  5:01 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-03  2:31 [PATCH v2 00/15] tracing: 'hist' triggers Alexei Starovoitov
2015-03-03 15:47 ` Tom Zanussi
2015-03-04  2:22   ` Alexei Starovoitov
2015-03-04  5:01     ` Tom Zanussi [this message]
  -- strict thread matches above, loose matches on Subject: below --
2015-03-03  0:01 Alexei Starovoitov
2015-03-03  1:18 ` Tom Zanussi
2015-03-02 19:36 Alexei Starovoitov
2015-03-02 19:45 ` Steven Rostedt
2015-03-02 19:49   ` Karim Yaghmour
2015-03-02 20:33     ` Alexei Starovoitov
2015-03-02 20:37       ` Karim Yaghmour
2015-03-02 20:39       ` Steven Rostedt
2015-03-02 20:43         ` Steven Rostedt
2015-03-02 20:48           ` Alexei Starovoitov
2015-03-02 20:52             ` Karim Yaghmour
2015-03-02 20:54             ` Karim Yaghmour
2015-03-02 19:14 Alexei Starovoitov
2015-03-02 19:31 ` Steven Rostedt
2015-03-02 19:55 ` Tom Zanussi
2015-03-09 11:38   ` He Kuang
2015-03-02 16:00 Tom Zanussi
2015-03-03  2:25 ` Masami Hiramatsu
2015-03-03 14:47   ` Tom Zanussi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1425445298.20819.222.camel@picadillo \
    --to=tom.zanussi@linux.intel.com \
    --cc=acme@infradead.org \
    --cc=andi@firstfloor.org \
    --cc=ast@plumgrid.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=masami.hiramatsu.pt@hitachi.com \
    --cc=mingo@kernel.org \
    --cc=namhyung@kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.