All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/18] [ANNOUNCE] Dynamically created function based events
@ 2018-02-02 23:04 Steven Rostedt
  2018-02-02 23:04 ` [PATCH 01/18] tracing: Add " Steven Rostedt
                   ` (21 more replies)
  0 siblings, 22 replies; 87+ messages in thread
From: Steven Rostedt @ 2018-02-02 23:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Masami Hiramatsu, Tom Zanussi, linux-rt-users,
	linux-trace-users, Arnaldo Carvalho de Melo, Clark Williams,
	Jiri Olsa, Daniel Bristot de Oliveira, Juri Lelli,
	Jonathan Corbet, Mathieu Desnoyers, Namhyung Kim,
	Alexei Starovoitov


At Kernel Summit back in October, we tried to bring up trace markers, which
would be nops within the kernel proper, that would allow modules to hook
arbitrary trace events to them. The reaction to this proposal was less than
favorable. We were told that we were trying to make a work around for a
problem, and not solving it. The problem in our minds is the notion of a
"stable trace event".

There are maintainers that do not want trace events, or more trace events in
their subsystems. This is due to the fact that trace events post an
interface to user space, and this interface could become required by some
tool. This may cause the trace event to become stable where it must not
break the tool, and thus prevent the code from changing.

Or, the trace event may just have to add padding for fields that tools
may require. The "success" field of the sched_wakeup trace event is one such
instance. There is no more "success" variable, but tools may fail if it were
to go away, so a "1" is simply added to the trace event wasting ring buffer
real estate.

I talked with Linus about this, and he told me that we already have these
markers in the kernel. They are from the mcount/__fentry__ used by function
tracing. Have the trace events be created by these, and see if this will
satisfy most areas that want trace events.

I decided to implement this idea, and here's the patch set.

Introducing "function based events". These are created dynamically by a
tracefs file called "function_events". By writing a pseudo prototype into
this file, you create an event.

 # mount -t tracefs nodev /sys/kernel/tracing
 # cd /sys/kernel/tracing
 # echo 'do_IRQ(symbol ip[16] | x64[6] irq_stack[16])' > function_events
 # cat events/functions/do_IRQ/format
name: do_IRQ
ID: 1399
format:
	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
	field:int common_pid;	offset:4;	size:4;	signed:1;

	field:unsigned long __parent_ip;	offset:8;	size:8;	signed:0;
	field:unsigned long __ip;	offset:16;	size:8;	signed:0;
	field:symbol ip;	offset:24;	size:8;	signed:0;
	field:x64 irq_stack[6];	offset:32;	size:48;	signed:0;

print fmt: "%pS->%pS(ip=%pS, irq_stack=%llx:%llx:%llx:%llx:%llx:%llx)", REC->__ip, REC->__parent_ip,
REC->ip, REC->irq_stack[0], REC->irq_stack[1], REC->irq_stack[2], REC->irq_stack[3], REC->irq_stack[4],
REC->irq_stack[5]

 # echo 1 > events/functions/do_IRQ/enable
 # cat trace
          <idle>-0     [003] d..3  3647.049344: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
          <idle>-0     [003] d..3  3647.049433: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
          <idle>-0     [003] d..3  3647.049672: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
          <idle>-0     [003] d..3  3647.325709: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
          <idle>-0     [003] d..3  3647.325929: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
          <idle>-0     [003] d..3  3647.325993: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
          <idle>-0     [003] d..3  3647.387571: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
          <idle>-0     [003] d..3  3647.387791: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
          <idle>-0     [003] d..3  3647.387874: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)

And this is much more powerful than just this. We can show strings, and
index off of structures into other structures.

  # echo '__vfs_read(symbol read+40[0]+16)' > function_events

  # echo 1 > events/functions/__vfs_read/enable
  # cat trace
         sshd-1343  [005] ...2   199.734752: vfs_read->__vfs_read(read=tty_read+0x0/0xf0)
         bash-1344  [003] ...2   199.734822: vfs_read->__vfs_read(read=tty_read+0x0/0xf0)
         sshd-1343  [005] ...2   199.734835: vfs_read->__vfs_read(read=tty_read+0x0/0xf0)
 avahi-daemon-910   [003] ...2   200.136740: vfs_read->__vfs_read(read=          (null))
 avahi-daemon-910   [003] ...2   200.136750: vfs_read->__vfs_read(read=          (null))

And even read user space:

  # echo 'SyS_openat(int dfd, string path, x32 flags, x16 mode)' > function_events
  # echo 1 > events/functions/enable
  # grep task_fork /proc/kallsyms 
ffffffff810d5a60 t task_fork_fair
ffffffff810dfc30 t task_fork_dl
  # cat trace
            grep-1820  [000] ...2  3926.107603: entry_SYSCALL_64_fastpath->SyS_openat(dfd=-100, path=/proc/kallsyms, flags=100, mode=0)

These are fully functional events. That is, they work with ftrace,
trace-cmd, perf, histograms, triggers, and eBPF.

What's next? I need to rewrite the function graph tracer, and be able to add
dynamic events on function return.

I made this work with x86 with a simple function that only returns
6 function parameters for x86_64 and 3 for x86_32. But this could easily
be extended.

Cheers!

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
ftrace/dynamic-ftrace-events

Head SHA1: 30fbdffd5d38bd27b04fb911f7158f10a99be8c4


Steven Rostedt (VMware) (18):
      tracing: Add function based events
      tracing: Add documentation for function based events
      tracing: Add simple arguments to function based events
      tracing/x86: Add arch_get_func_args() function
      tracing: Add hex print for dynamic ftrace based events
      tracing: Add indirect offset to args of ftrace based events
      tracing: Add dereferencing multiple fields per arg
      tracing: Add "unsigned" to function based events
      tracing: Add indexing of arguments for function based events
      tracing: Make func_type enums for easier comparing of arg types
      tracing: Add symbol type to function based events
      tracing: Add accessing direct address from function based events
      tracing: Add array type to function based events
      tracing: Have char arrays be strings for function based events
      tracing: Add string type for dynamic strings in function based events
      tracing: Add NULL to skip args for function based events
      tracing: Add indirect to indirect access for function based events
      tracing/perf: Allow perf to use function based events

----
 Documentation/trace/function-based-events.rst |  426 ++++++++
 arch/x86/kernel/ftrace.c                      |   28 +
 include/linux/trace_events.h                  |    2 +
 kernel/trace/Kconfig                          |   12 +
 kernel/trace/Makefile                         |    1 +
 kernel/trace/trace.h                          |   11 +
 kernel/trace/trace_event_ftrace.c             | 1440 +++++++++++++++++++++++++
 kernel/trace/trace_probe.h                    |   11 -
 8 files changed, 1920 insertions(+), 11 deletions(-)
 create mode 100644 Documentation/trace/function-based-events.rst
 create mode 100644 kernel/trace/trace_event_ftrace.c

^ permalink raw reply	[flat|nested] 87+ messages in thread

end of thread, other threads:[~2018-02-13 15:28 UTC | newest]

Thread overview: 87+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-02 23:04 [PATCH 00/18] [ANNOUNCE] Dynamically created function based events Steven Rostedt
2018-02-02 23:04 ` [PATCH 01/18] tracing: Add " Steven Rostedt
2018-02-05  8:24   ` Jiri Olsa
2018-02-05 15:00     ` Steven Rostedt
2018-02-07  3:09       ` Steven Rostedt
2018-02-07 12:06         ` Jiri Olsa
2018-02-02 23:05 ` [PATCH 02/18] tracing: Add documentation for " Steven Rostedt
2018-02-02 23:05 ` [PATCH 03/18] tracing: Add simple arguments to " Steven Rostedt
2018-02-08 10:18   ` Namhyung Kim
2018-02-08 15:37     ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 04/18] tracing/x86: Add arch_get_func_args() function Steven Rostedt
2018-02-05 16:33   ` Masami Hiramatsu
2018-02-05 17:06     ` Steven Rostedt
2018-02-08  5:28   ` Namhyung Kim
2018-02-08 15:29     ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 05/18] tracing: Add hex print for dynamic ftrace based events Steven Rostedt
2018-02-02 23:05 ` [PATCH 06/18] tracing: Add indirect offset to args of " Steven Rostedt
2018-02-02 23:05 ` [PATCH 07/18] tracing: Add dereferencing multiple fields per arg Steven Rostedt
2018-02-02 23:05 ` [PATCH 08/18] tracing: Add "unsigned" to function based events Steven Rostedt
2018-02-02 23:05 ` [PATCH 09/18] tracing: Add indexing of arguments for " Steven Rostedt
2018-02-08 10:59   ` Namhyung Kim
2018-02-08 15:43     ` Steven Rostedt
2018-02-08 23:56       ` Namhyung Kim
2018-02-09  0:19         ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 10/18] tracing: Make func_type enums for easier comparing of arg types Steven Rostedt
2018-02-02 23:05 ` [PATCH 11/18] tracing: Add symbol type to function based events Steven Rostedt
2018-02-08 11:03   ` Namhyung Kim
2018-02-08 15:48     ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 12/18] tracing: Add accessing direct address from " Steven Rostedt
2018-02-09  0:34   ` Namhyung Kim
2018-02-09  1:10     ` Steven Rostedt
2018-02-09 22:07     ` Steven Rostedt
2018-02-12  2:06       ` Namhyung Kim
2018-02-12 15:47         ` Masami Hiramatsu
2018-02-12 15:47           ` Masami Hiramatsu
2018-02-12 16:47           ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 13/18] tracing: Add array type to " Steven Rostedt
2018-02-03 13:56   ` Masami Hiramatsu
2018-02-03 15:29     ` Steven Rostedt
2018-02-04  3:50       ` Masami Hiramatsu
2018-02-09  1:17   ` Namhyung Kim
2018-02-09  1:54     ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 14/18] tracing: Have char arrays be strings for " Steven Rostedt
2018-02-02 23:05 ` [PATCH 15/18] tracing: Add string type for dynamic strings in " Steven Rostedt
2018-02-09  3:15   ` Namhyung Kim
2018-02-09  3:31     ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 16/18] tracing: Add NULL to skip args for " Steven Rostedt
2018-02-02 23:05 ` [PATCH 17/18] tracing: Add indirect to indirect access " Steven Rostedt
2018-02-09  5:13   ` Namhyung Kim
2018-02-09 15:47     ` Steven Rostedt
2018-02-09 17:18       ` Steven Rostedt
2018-02-12  2:15       ` Namhyung Kim
2018-02-12 17:23         ` Steven Rostedt
2018-02-13  9:27           ` Namhyung Kim
2018-02-13 15:28             ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 18/18] tracing/perf: Allow perf to use " Steven Rostedt
2018-02-03 13:38 ` [PATCH 00/18] [ANNOUNCE] Dynamically created " Masami Hiramatsu
2018-02-03 15:27   ` Steven Rostedt
2018-02-04  3:57     ` Masami Hiramatsu
2018-02-04 17:21       ` Alexei Starovoitov
2018-02-05 14:39         ` Masami Hiramatsu
2018-02-03 17:04 ` Mathieu Desnoyers
2018-02-03 19:02   ` Steven Rostedt
2018-02-03 20:52     ` Alexei Starovoitov
2018-02-03 21:08       ` Steven Rostedt
2018-02-03 21:30         ` Alexei Starovoitov
2018-02-04  2:37           ` Namhyung Kim
2018-02-04 15:50         ` Mathieu Desnoyers
2018-02-03 21:17       ` Steven Rostedt
2018-02-03 21:38         ` Alexei Starovoitov
2018-02-04  2:25         ` Namhyung Kim
2018-02-05 15:02           ` Steven Rostedt
2018-02-05 13:53         ` Juri Lelli
2018-02-05 13:53           ` Juri Lelli
2018-02-05 15:07           ` Steven Rostedt
2018-02-05 15:07             ` Steven Rostedt
2018-02-03 21:43   ` Linus Torvalds
2018-02-04 15:30     ` Mathieu Desnoyers
2018-02-04 15:47       ` Steven Rostedt
2018-02-04 19:39       ` Linus Torvalds
2018-02-05 10:09         ` Peter Zijlstra
2018-02-05 15:10           ` Steven Rostedt
2018-02-05 15:14         ` Masami Hiramatsu
2018-02-03 18:52 ` Steven Rostedt
2018-02-05 10:23 ` Juri Lelli
2018-02-05 10:49   ` Daniel Bristot de Oliveira
2018-02-05 15:11     ` Steven Rostedt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.