From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754734AbbCBTPR (ORCPT <rfc822;w@1wt.eu>);
	Mon, 2 Mar 2015 14:15:17 -0500
Received: from mail-qg0-f48.google.com ([209.85.192.48]:33128 "EHLO
	mail-qg0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753675AbbCBTPP (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Mon, 2 Mar 2015 14:15:15 -0500
MIME-Version: 1.0
From: Alexei Starovoitov <ast@plumgrid.com>
Date: Mon, 2 Mar 2015 11:14:54 -0800
Message-ID: <CAMEtUuyskC1bZKz=vtohd=KpDOjSqS-RJocjjWTNWZyDe+xSDA@mail.gmail.com>
Subject: Re: [PATCH v2 00/15] tracing: 'hist' triggers
To: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
        Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
        Namhyung Kim <namhyung@kernel.org>, Andi Kleen <andi@firstfloor.org>,
        LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@kernel.org>,
        Arnaldo Carvalho de Melo <acme@infradead.org>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 2, 2015 at 8:00 AM, Tom Zanussi <tom.zanussi@linux.intel.com> wrote:
>
>   # echo 'hist:keys=common_pid.execname,id.syscall:vals=hitcount' > \
>         /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
>
>   # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
>
>   key: common_pid:bash[3112], id:sys_write                     vals: count:69
>   key: common_pid:bash[3112], id:sys_rt_sigprocmask            vals: count:218

Hi Tom,

I think we both want to see in-kernel aggregation.
This 'hist' stuff is trying to do counting and even map sorting
in the kernel, whereas with bpf programs I'm moving
all of these decisions to user space.
I understand your desire to avoid any user level scripts
and do everything via 'cat' and debugfs, but imo that's
very limiting. I think it's better to have a slim user space
scripting language that translates to bpf, even in
embedded setups. Then users will be able to aggregate
whatever they like, whereas with the 'hist' approach
they're limited to simple counters.
trace_events_trigger.c - 1466 lines - that's quite a bit
of code that will be rarely used. That kinda goes counter
to the embedded argument. Why add this to the kernel
when bpf programs can do the same on demand?
The arguments about stable ABI apply as well.
The format of the 'hist' file would need to be stable, so it
will be hard to extend. With bpf programs doing aggregation
the kernel ABI exposure is much smaller.
So would you consider working together on adding
clean bpf+tracepoints infra and the corresponding
user space bits?
We can have a small user space parser/compiler for
'hist:keys=common_pid.execname,id.syscall:vals=hitcount'
strings that converts them into bpf programs, and you'll
be able to use it in embedded setups.
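
To make that last point a bit more concrete, here is a rough sketch of
what the parsing half of such a tool might look like. It is illustration
only: HistSpec and parse_hist_trigger() are hypothetical names, not part
of any existing tool, it only understands the keys= and vals= fields
from your example, and the step that would emit the bpf program is left
out entirely.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class HistSpec:
    """Parsed form of a 'hist:...' trigger string (hypothetical type)."""
    # Each key is (event field, optional modifier), e.g. ("id", "syscall").
    keys: List[Tuple[str, Optional[str]]] = field(default_factory=list)
    vals: List[str] = field(default_factory=list)


def parse_hist_trigger(trigger: str) -> HistSpec:
    """Split a 'hist:keys=...:vals=...' string into keys and values.

    Assumption: only the 'keys' and 'vals' attributes are supported;
    anything else is rejected rather than guessed at.
    """
    parts = trigger.strip().split(":")
    if parts[0] != "hist":
        raise ValueError("not a hist trigger: %r" % trigger)

    spec = HistSpec()
    for part in parts[1:]:
        name, sep, value = part.partition("=")
        if not sep or not value:
            raise ValueError("malformed attribute: %r" % part)
        items = value.split(",")
        if name == "keys":
            for item in items:
                # "common_pid.execname" -> ("common_pid", "execname")
                fld, _, modifier = item.partition(".")
                spec.keys.append((fld, modifier or None))
        elif name == "vals":
            spec.vals.extend(items)
        else:
            raise ValueError("unsupported attribute: %r" % name)
    return spec


if __name__ == "__main__":
    print(parse_hist_trigger(
        "hist:keys=common_pid.execname,id.syscall:vals=hitcount"))
```

A compiler back end would then walk spec.keys and spec.vals to build the
map key layout and the update logic of the bpf program.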

Thanks