* Benchmarks of kernel tracing options (ftrace, ktrace, lttng and perf)

From: David Sharp @ 2010-10-28 18:20
To: linux-kernel, Mathieu Desnoyers, Steven Rostedt, Ingo Molnar, Andrew Morton, Michael Rubin

Hi again,

I now have some benchmark results for two more tracers, lttng and perf.
Thanks to Mathieu for helping me with the lttng test.

To restate our objectives:

> Google uses kernel tracing aggressively in its data centers. We wrote
> our own kernel tracer, ktrace. However, ftrace, perf and LTTng all
> have a better feature set than ktrace, so we are abandoning that code.
>
> We see several implementations of tracing aimed at the mainline kernel
> and wanted a fair comparison of each of them to make sure they will
> not significantly impact performance. A tracing toolkit that is too
> expensive is not usable in our environment.
>
> So we are sending out some benchmark results to compare the three
> tracing subsystems we know of. In addition, we included numbers for
> ktrace, and a git tree for the source.
>
> The benchmark used is rather synthetic, and aims to measure the
> overhead of entering the kernel with a tracepoint enabled. It measures
> the time to call getuid() 100,000 times, and reports the average time
> per call. To infer the overhead of tracing, the benchmark is run with
> tracing off and with tracing on. The tracer is given a
> generously-sized ring buffer so that overflows do not occur (no reader
> is present). The "off" run essentially measures the cost of a ring
> switch, as sys_getuid() is practically a no-op syscall. Subtracting
> the "off" result from the "on" result indicates the average amount of
> time it takes to add a small item to the tracing ring buffer.
I hope this produces a fair comparison. I've detailed my steps below for
all of the tracers in case there are improvements to be made. This is
the code used for the getuid microbenchmark:

http://autotest.kernel.org/browser/trunk/client/tests/tracing_microbenchmark/src/getuid_microbench.c

The numbers below are the "on" minus "off" average results over at
least 10 runs each.

Results for the time to execute a tracepoint (includes previous results):
  ktrace:  200ns (old)
  ftrace:  224ns (old, w/ handcoded tracepoint, not syscall tracing)
  lttng:   449ns (new)
  perf:   1047ns (new)

Also interesting:
  ftrace:  587ns (old, w/ syscall tracing)
This just shows that syscall tracing is much slower than a normal
tracepoint.

The rest of this email is a lot of detail about the benchmark runs for
each tracer. I look forward to your feedback.

Thanks,
David Sharp

Here are the details of the steps for the two new results (lttng, perf)
first, and then some repeated details of the previous ftrace and ktrace
results.

LTTng
-----

I used lttng in flight-recorder mode to avoid the overhead of a reader
while tracing. The benchmark was run with these versions:
 - lttng kernel 0.233, based on v2.6.36 + the sys_getuid tracepoint patch [0]
 - ltt-modules v0.19.1 + a probe for sys_getuid [1]
 - ltt-control v0.88

I ran the benchmark like this:

  #setup:
  modprobe ltt-trace-control
  modprobe syscall-trace
  echo 1 > /sys/kernel/debug/ltt/markers/kernel/sys_getuid/enable

  #run:
  for n in $(seq 10) ; do \
    lttctl -C -w $PWD/$n \
      -o channel.all.bufsize=$((8*1024*1024)) \
      -o channel.all.overwrite=1 trace1 && \
    getuid_microbench 100000 >> results.traced && \
    lttctl -D -w $PWD/$n trace1 ; \
  done
  for n in $(seq 10) ; do \
    getuid_microbench 100000 >> results.notrace ; \
  done

Perf
----

This was run with a vanilla v2.6.36 kernel plus the sys_getuid
tracepoint patch, and perf 0.0.2. I used a large number of pages and
observed that perf reported it woke up only once to write data.
I'm assuming that it woke up at the end, so it was out of the way during
the tracing.

  for n in $(seq 10) ; do \
    perf record -e timer:sys_getuid -a -m 4096 -- \
      getuid_microbench 100000 >> results.perf ; \
  done
  for n in $(seq 10) ; do \
    getuid_microbench 100000 >> results.notrace ; \
  done

ftrace
------

The ftrace and ktrace benchmarks were run using Autotest. The commands
below approximate what the Autotest test does. The test itself is here:

http://autotest.kernel.org/browser/trunk/client/tests/tracing_microbenchmark

  #setup:
  tracedir="/sys/kernel/debug/tracing"
  echo $((8*1024)) > $tracedir/buffer_size_kb
  echo 1 > $tracedir/events/timer/sys_getuid/enable

  #run:
  for n in $(seq 10) ; do \
    echo 1 > $tracedir/tracing_enabled && \
    getuid_microbench 100000 >> results.ftrace && \
    echo 0 > $tracedir/tracing_enabled && \
    echo > $tracedir/trace ; \
  done
  for n in $(seq 10) ; do \
    getuid_microbench 100000 >> results.notrace ; \
  done

ktrace
------

> Some background on ktrace: ktrace uses the same ring buffer as ftrace,
> but has far fewer frontend features (filtering, formatted output
> files). Trace events are always packed. ktrace has an mmap interface
> to the per-cpu ring buffers instead of splice() (the merits of these
> two options have yet to be evaluated).
>
> In order to test ktrace fairly, I forward-ported it to v2.6.36-r7. I
> also made some minor changes to ftrace that don't significantly affect
> the benchmark.
>
> The tree with ktrace and the changes mentioned above is available here:
>
> git://google3-2.osuosl.org/kernel/tracing/ktrace.git ktrace-forwardport

I didn't push the ktrace tracer to the public kernel.org Autotest, but
it is quite similar to the ftrace tracer.
Once again, I'll approximate it with shell commands:

  #setup:
  tracedir="/sys/kernel/debug/ktrace"
  echo $((8*1024*1024)) > $tracedir/bufsize
  echo 1 > $tracedir/events/kernel_getuid

  #run:
  for n in $(seq 10) ; do \
    echo 1 > $tracedir/enabled && \
    getuid_microbench 100000 >> results.ktrace && \
    echo 0 > $tracedir/enabled && \
    echo 1 > $tracedir/reset ; \
  done
  for n in $(seq 10) ; do \
    getuid_microbench 100000 >> results.notrace ; \
  done

Footnotes
---------

[0] sys_getuid tracepoint:
http://google3-2.osuosl.org/?p=kernel/tracing/ktrace.git;a=commitdiff;h=872088a6e482ce0288d8fe9f02726df82173de0e

[1] sys_getuid ltt probe, diff against ltt-modules v0.19.1 (forgive
gmail's patch mangling):

diff --git a/probes/syscall-trace.c b/probes/syscall-trace.c
index 9ae419f..1409a74 100644
--- a/probes/syscall-trace.c
+++ b/probes/syscall-trace.c
@@ -49,6 +49,20 @@ notrace void probe_syscall_exit(void *_data, long ret)
 			&ret, sizeof(ret), sizeof(ret));
 }
 
+void probe_getuid(void *_data, uid_t uid);
+
+DEFINE_MARKER_TP(kernel, sys_getuid, sys_getuid,
+		 probe_getuid, "uid %d");
+
+notrace void probe_getuid(void *_data, uid_t uid)
+{
+	struct marker *marker;
+
+	marker = &GET_MARKER(kernel, sys_getuid);
+	ltt_specialized_trace(marker, marker->single.probe_private,
+			      &uid, sizeof(uid), sizeof(uid));
+}
+
 MODULE_LICENSE("GPL and additional rights");
 MODULE_AUTHOR("Mathieu Desnoyers");
 MODULE_DESCRIPTION("syscall Tracepoint Probes");
-- 
1.7.1
* Re: Benchmarks of kernel tracing options (ftrace, ktrace, lttng and perf)

From: Mathieu Desnoyers @ 2010-10-28 19:27
To: David Sharp
Cc: linux-kernel, Steven Rostedt, Ingo Molnar, Andrew Morton, Michael Rubin

* David Sharp (dhsharp@google.com) wrote:
[...]
> I hope this produces a fair comparison. I've detailed my steps below
> for all of the tracers in case there are improvements to be made.
[...]
> Results for the time to execute a tracepoint (includes previous results):
> ktrace:  200ns (old)
> ftrace:  224ns (old, w/ handcoded tracepoint, not syscall tracing)
> lttng:   449ns (new)
> perf:   1047ns (new)
>
> Also interesting:
> ftrace:  587ns (old, w/ syscall tracing)
> This just shows that syscall tracing is much slower than a normal tracepoint.

Hrm, not quite fair, actually. LTTng enables a thread flag that causes
syscall_trace to be called on syscall entry/exit by default. Please try
with the following patch to remove the extra syscall overhead.

Thanks,

Mathieu

for google benchmarks: remove syscall tracing from LTTng modules

Syscall tracing is enabled by default when lttng is tracing, which
causes the full registers to be saved and a syscall_trace call upon
syscall entry/exit. This patch is for testing only.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 ltt-tracer.c | 2 --
 1 file changed, 2 deletions(-)

Index: lttng-modules/ltt-tracer.c
===================================================================
--- lttng-modules.orig/ltt-tracer.c
+++ lttng-modules/ltt-tracer.c
@@ -839,7 +839,6 @@ int ltt_trace_alloc(const char *trace_na
 	if (list_empty(&ltt_traces.head)) {
 		mod_timer(&ltt_async_wakeup_timer,
 			  jiffies + LTT_PERCPU_TIMER_INTERVAL);
-		set_kernel_trace_flag_all_tasks();
 	}
 	list_add_rcu(&trace->list, &ltt_traces.head);
 	synchronize_trace();
@@ -917,7 +916,6 @@ static int _ltt_trace_destroy(struct ltt
 	list_del_rcu(&trace->list);
 	synchronize_trace();
 	if (list_empty(&ltt_traces.head)) {
-		clear_kernel_trace_flag_all_tasks();
 		/*
 		 * We stop the asynchronous delivery of reader wakeup, but
 		 * we must make one last check for reader wakeups pending

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
* Re: Benchmarks of kernel tracing options (ftrace, ktrace, lttng and perf)

From: Michael Rubin @ 2010-10-28 23:24
To: Mathieu Desnoyers
Cc: David Sharp, linux-kernel, Steven Rostedt, Ingo Molnar, Andrew Morton

On Thu, Oct 28, 2010 at 12:27 PM, Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
> * David Sharp (dhsharp@google.com) wrote:
> Hrm, not quite fair, actually. LTTng enables a thread flag that causes
> syscall_trace to be called on syscall entry/exit by default. Please
> try with the following patch to remove the extra syscall overhead.

We started this thread in an attempt to get the ball rolling on keeping
track of these metrics. They seem important for both Google and Linux
as a whole. We published all our work and benchmarks in the hope that
the owners could comment and also run the tests themselves.

Mathieu, it might make more sense for you to find the "fair" setup and
publish that than to take the extra time to go back and forth with
David. Once you have the config that best represents what users will
see, and numbers to go with it, we can run it in our environment. I
worry you will otherwise be forced to send us patch after patch, and we
won't have the resources to do a good job of responding to them. Which
is not fair to you.

mrubin
* Re: Benchmarks of kernel tracing options (ftrace, ktrace, lttng and perf)

From: Mathieu Desnoyers @ 2010-10-29 4:37
To: Michael Rubin
Cc: David Sharp, linux-kernel, Steven Rostedt, Ingo Molnar, Andrew Morton

* Michael Rubin (mrubin@google.com) wrote:
> On Thu, Oct 28, 2010 at 12:27 PM, Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
> > Hrm, not quite fair, actually. LTTng enables a thread flag that
> > causes syscall_trace to be called on syscall entry/exit by default.
> > Please try with the following patch to remove the extra syscall
> > overhead.
>
> We started this thread in an attempt to get the ball rolling on
> keeping track of these metrics. They seem important for both Google
> and Linux as a whole. We published all our work and benchmarks in the
> hope that the owners could comment and also run the tests themselves.
>
> Mathieu, it might make more sense for you to find the "fair" setup
> and publish that than to take the extra time to go back and forth
> with David. Once you have the config that best represents what users
> will see, and numbers to go with it, we can run it in our
> environment. I worry you will otherwise be forced to send us patch
> after patch, and we won't have the resources to do a good job of
> responding to them. Which is not fair to you.

Nah, it's really just a matter of fine-tuning. I already performed
these benchmarks, have published the results, and know what to expect,
so I figured out quickly that something was wrong with your setup.
Disabling the extra syscall tracing is the only detail I see missing
for a fair comparison of LTTng with your setup.
For the "fair" setup to compare ring buffers, I would personally tend
to say that going through a system call is not how I would benchmark
the ring buffer per se, but I must admit that the benchmark you are
doing is very useful for evaluating system call entry/exit tracing. On
this particular aspect, I have always admitted that lttng is not
super-fast, but it carries the same overhead as the Linux kernel's own
syscall tracing does today. Because it is always enabled when lttng
tracing runs, however, you ended up attributing this overhead to the
lttng ring buffer inappropriately. I'm willing to help David with the
setup, and I'm sure it won't take long.

By the way, I'm currently porting lttng to the new Generic Ring Buffer
Library, which might be even slightly faster than the current lttng
code (and faster than ftrace with the ring-buffer benchmark Steven
wrote). So when I'm done porting lttng to it, it might be worthwhile to
benchmark the Generic Ring Buffer too.

Thank you,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
* Re: Benchmarks of kernel tracing options (ftrace, ktrace, lttng and perf)

From: Mathieu Desnoyers @ 2010-11-16 20:28
To: David Sharp
Cc: linux-kernel, Steven Rostedt, Ingo Molnar, Andrew Morton, Michael Rubin

* David Sharp (dhsharp@google.com) wrote:
[...]
> Results for the time to execute a tracepoint (includes previous results):
> ktrace:  200ns (old)
> ftrace:  224ns (old, w/ handcoded tracepoint, not syscall tracing)
> lttng:   449ns (new)
> perf:   1047ns (new)
>
> Also interesting:
> ftrace:  587ns (old, w/ syscall tracing)
> This just shows that syscall tracing is much slower than a normal tracepoint.

As I pointed out in my email a few weeks ago, the LTTng comparison is
simply bogus, because the "syscall tracing" thread flag is active,
which calls into syscall tracing, after saving all registers, from
entry_*.S at both syscall entry and exit.

I did benchmarks using Steven's ring_buffer_benchmark kernel module,
which calls tracing in a loop, for both ftrace and the Generic Ring
Buffer (which is derived from LTTng). The results are:

Intel Xeon 2.0GHz
  ftrace: 103 ns/entry (no reader)
  lttng:   83 ns/entry (no reader) (with the generic ring buffer library)

So, given that people took the numbers above for granted even after I
pointed out that they were bogus, and given that David seems busy with
other things, I thought I should set the record straight.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com