From: Masami Hiramatsu
Organization: Hitachi, Ltd., Japan
Date: Tue, 03 Mar 2015 11:25:14 +0900
To: Tom Zanussi
Cc: rostedt@goodmis.org, namhyung@kernel.org, andi@firstfloor.org, ast@plumgrid.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 00/15] tracing: 'hist' triggers
Message-ID: <54F51B8A.9010904@hitachi.com>

(2015/03/03 1:00), Tom Zanussi wrote:
> This is v2 of my previously posted 'hashtriggers' patchset [1], but
> renamed to 'hist triggers' following feedback from v1.

This is what I need :)
The trigger interface gives us better flexibility across environments.
With this series, I believe 80% of "scripting tracing" use cases can be
replaced with just "echo'ing tracing" via tracefs :)

>
> Since then, the kernel has gained a tracing map implementation in the
> form of bpf_map, which this patchset makes a bit more generic, exports
> and uses (as tracing_map_*, still in the bpf syscall file however).
>
> A large part of the initial hash triggers implementation was devoted
> to a map implementation and general-purpose hashing functions, which
> have now been subsumed by the bpf maps.  I've completely redone the
> trigger patches themselves to work on top of tracing_map.
> The result is a much simpler and easier-to-review patchset that's
> able to focus more directly on the problem at hand.
>
> The new version addresses all the comments from the previous review,
> including changing the name from hash->hist, adding separate 'hist'
> files for the output, and moving the examples into Documentation.
>
> This patchset also includes a couple other new and related triggers,
> enable_hist and disable_hist, very similar to the existing
> enable_event/disable_event triggers used to automatically enable and
> disable events based on a triggering condition, but in this case
> allowing hist triggers to be enabled and disabled in the same way.
>
> The only problem with using the bpf_map implementation for this is
> that it uses kmalloc internally, which causes problems when trying to
> trace kmalloc itself.  I'm guessing the ebpf tracing code would also
> share this problem e.g. when using bpf_maps from probes on kmalloc().
> This patchset attempts a solution to that problem (by adding a
> gfp_flag and changing the kmem memory allocation tracepoints to
> conditional variants that check for it), but I'm not sure it's
> the best way to address it.

That is not a solution for kprobe-based events, nor for events in
interrupt context. Can we reserve some amount of memory for bpf_map?
If usage exceeds the reserved memory, we can choose to
(A) disable hist or (B) continue with kmalloc.

>
> There are a couple of important bits of functionality that were
> present in v1 but dropped in v2 mainly because I'm still trying to
> figure out the best way to accomplish those things using the bpf_map
> implementation.
>
> The first is support for compound keys.  Currently, maps can only be
> keyed on a single event field, whereas in v1 they could be keyed on
> multiple keys.
> With support for compound keys, you can create much more interesting
> output, such as for example per-pid lists of syscalls or read counts
> e.g.:
>
>   # echo 'hist:keys=common_pid.execname,id.syscall:vals=hitcount' > \
>         /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
>
>   # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
>
>   key: common_pid:bash[3112], id:sys_write  vals: count:69
>   key: common_pid:bash[3112], id:sys_rt_sigprocmask  vals: count:218
>
>   key: common_pid:update-notifier[3164], id:sys_poll  vals: count:37
>   key: common_pid:update-notifier[3164], id:sys_recvfrom  vals: count:118
>
>   key: common_pid:deja-dup-monito[3194], id:sys_sendto  vals: count:1
>   key: common_pid:deja-dup-monito[3194], id:sys_read  vals: count:4
>   key: common_pid:deja-dup-monito[3194], id:sys_poll  vals: count:8
>   key: common_pid:deja-dup-monito[3194], id:sys_recvmsg  vals: count:8
>   key: common_pid:deja-dup-monito[3194], id:sys_getegid  vals: count:8
>
>   key: common_pid:emacs[3275], id:sys_fsync  vals: count:1
>   key: common_pid:emacs[3275], id:sys_open  vals: count:1
>   key: common_pid:emacs[3275], id:sys_symlink  vals: count:2
>   key: common_pid:emacs[3275], id:sys_poll  vals: count:23
>   key: common_pid:emacs[3275], id:sys_select  vals: count:23
>   key: common_pid:emacs[3275], id:unknown_syscall  vals: count:34
>   key: common_pid:emacs[3275], id:sys_ioctl  vals: count:60
>   key: common_pid:emacs[3275], id:sys_rt_sigprocmask  vals: count:116
>
>   key: common_pid:cat[3323], id:sys_munmap  vals: count:1
>   key: common_pid:cat[3323], id:sys_fadvise64  vals: count:1

Very impressive! :)

Thank you,

>
> Related to that is support for sorting on multiple fields.  Currently,
> you can sort using only a primary key.  Being able to sort on multiple
> or at least a secondary key is indispensable for seeing trends when
> displaying multiple values.
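
For illustration only (this is not code from the patchset): a minimal
userspace sketch, in Python, of what compound keys and a secondary sort
key amount to. The event records, the field names in them, and the
helpers build_hist() and sort_rows() are all hypothetical; the real
implementation is in-kernel C on top of tracing_map. The sketch counts
hits per (common_pid, id) pair, then orders rows by pid first and by
hitcount second, which gives the grouped layout shown in the quoted
output above.

```python
from collections import Counter

# Hypothetical event records standing in for raw_syscalls/sys_enter events.
events = (
    [{"common_pid": 3275, "id": "sys_poll"}] * 3
    + [{"common_pid": 3112, "id": "sys_write"}] * 2
    + [{"common_pid": 3275, "id": "sys_open"}] * 1
    + [{"common_pid": 3112, "id": "sys_rt_sigprocmask"}] * 3
)


def build_hist(events, keys):
    """Count events per compound key (a tuple of the selected fields)."""
    hist = Counter()
    for ev in events:
        hist[tuple(ev[k] for k in keys)] += 1
    return hist


def sort_rows(hist):
    """Sort by the first key field (primary), then by hitcount (secondary)."""
    return sorted(hist.items(), key=lambda row: (row[0][0], row[1]))


hist = build_hist(events, keys=("common_pid", "id"))
rows = sort_rows(hist)

for (pid, syscall), count in rows:
    print(f"key: common_pid:{pid}, id:{syscall}  vals: count:{count}")
```

With only a primary sort key, rows belonging to the same pid could come
out in any order relative to each other; the secondary key is what makes
the per-pid trend readable.
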
>
> [1] http://thread.gmane.org/gmane.linux.kernel/1673551
>
> Changes from v1:
>  - completely rewritten on top of tracing_map (renamed and exported bpf_map)
>  - added map clearing and client ops to tracing_map
>  - changed the name from 'hash' triggers to 'hist' triggers
>  - added new trigger 'pause' feature
>  - added new enable_hist and disable_hist triggers
>  - added usage for hist/enable_hist/disable_hist to tracing/README
>  - moved examples into Documentation/trace/events.txt
>  - added ___GFP_NOTRACE, kmalloc/kfree macros, and conditional kmem tracepoints
>
> The following changes since commit 49058038a12cfd9044146a1bf4b286781268d5c9:
>
>   ring-buffer: Do not wake up a splice waiter when page is not full (2015-02-24 14:00:41 -0600)
>
> are available in the git repository at:
>
>   git://git.yoctoproject.org/linux-yocto-contrib.git tzanussi/hist-triggers-v2
>   http://git.yoctoproject.org/cgit/cgit.cgi/linux-yocto-contrib/log/?h=tzanussi/hist-triggers-v2
>
> Tom Zanussi (15):
>   tracing: Make ftrace_event_field checking functions available
>   tracing: Add event record param to trigger_ops.func()
>   tracing: Add get_syscall_name()
>   bpf: Export bpf map functionality as trace_map_*
>   bpf: Export a map-clearing function
>   bpf: Add tracing_map client ops
>   mm: Add ___GFP_NOTRACE
>   tracing: Make kmem memory allocation tracepoints conditional
>   tracing: Add kmalloc/kfree macros
>   bpf: Make tracing_map use kmalloc/kfree_notrace()
>   tracing: Add a per-event-trigger 'paused' field
>   tracing: Add 'hist' event trigger command
>   tracing: Add sorting to hist triggers
>   tracing: Add enable_hist/disable_hist triggers
>   tracing: Add 'hist' trigger Documentation
>
>  Documentation/trace/events.txt      |  870 +++++++++++++++++++++
>  include/linux/bpf.h                 |   15 +
>  include/linux/ftrace_event.h        |    9 +-
>  include/linux/gfp.h                 |    3 +-
>  include/linux/slab.h                |   61 +-
>  include/trace/events/kmem.h         |   28 +-
>  kernel/bpf/arraymap.c               |   16 +
>  kernel/bpf/hashtab.c                |   39 +-
>  kernel/bpf/syscall.c                |  193 ++++-
>  kernel/trace/trace.c                |   48 ++
>  kernel/trace/trace.h                |   25 +-
>  kernel/trace/trace_events.c         |    3 +
>  kernel/trace/trace_events_filter.c  |   15 +-
>  kernel/trace/trace_events_trigger.c | 1466 ++++++++++++++++++++++++++++++++++-
>  kernel/trace/trace_syscalls.c       |   11 +
>  mm/slab.c                           |   45 +-
>  mm/slob.c                           |   45 +-
>  mm/slub.c                           |   47 +-
>  18 files changed, 2795 insertions(+), 144 deletions(-)
>

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com