linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Benchmarks of kernel tracing options (ftrace and ktrace)
       [not found] ` <AANLkTikKwx6okpX4pxVzTvrVNm=KhUvLQGs0_ziwo6fX@mail.gmail.com>
@ 2010-10-13 23:19   ` David Sharp
  2010-10-13 23:50     ` Steven Rostedt
  0 siblings, 1 reply; 9+ messages in thread
From: David Sharp @ 2010-10-13 23:19 UTC (permalink / raw)
  To: linux-kernel, Steven Rostedt, Mathieu Desnoyers, Ingo Molnar,
	Andrew Morton, Michael Rubin

Google uses kernel tracing aggressively in the its data centers. We
wrote our own kernel tracer, ktrace. However ftrace, perf and LTTng
all have a better feature set than ktrace, so we are abandoning that
code.

We see several implementations of tracing aimed at the mainline kernel
and wanted a fair comparison of each of them to make sure they will
not significantly impact performance. A tracing toolkit that is too
expensive is not usable in our environment.

So we are sending out some benchmark results to compare the three
tracing subsystems we know of. In addition we included numbers for
ktrace, and a git tree for the source.

Some background on Ktrace: Ktrace uses the same ringbuffer as ftrace,
but has much fewer frontend features (filtering, formatted output
file). Trace events are always packed. ktrace has an mmap interface to
the per-cpu ring buffers instead of splice()  (the merits of these two
options have yet to be evaluated).

In order to test ktrace fairly, I forward ported it v2.6.36.r7. I also
made some minor changes to ftrace that don't significantly affect the
benchmark.

The tree with ktrace and the changes mentioned above is available here:

       git://google3-2.osuosl.org/kernel/tracing/ktrace.git  ktrace-forwardport

The benchmark used is rather synthetic, and aims to measure the
overhead of entering the kernel with a tracepoint enabled. It measures
the time to call getuid() 100,000 times, and returns the average time
per call. To infer the overhead of tracing, the benchmark is run with
tracing off, and with tracing on. The tracer is given a
generously-sized ring buffer so that overflows do not occur (no reader
is present). The "off" run essentially measures the cost of a ring
switch, as sys_getuid() is practically a nop syscall. Subtracting the
"off" result from the "on" result indicates the average amount of time
it takes to add a small item to the tracing ring buffer.

I wrote the benchmark as an Autotest test, so you can see the actual
code for yourself:
http://autotest.kernel.org/browser/trunk/client/tests/tracing_microbenchmark


This first set of benchmark results compares ftrace to ktrace. The
numbers below are the "on" result minus the "off" result for each
configuration.

ktrace: 200ns  (tracepoint: kernel_getuid)
ftrace: 224ns   (tracepoint: timer:sys_getuid)
ftrace: 587ns   (tracepoint: syscalls:sys_enter_getuid)

The "off" result for all configurations was about 231ns. I believe
that to be a meaningless number wrt this benchmark, however.

The "kernel_getuid" (ktrace) and "timer:sys_getuid" (ftrace)
tracepoints are normal tracepoints added to the body of sys_getuid.
syscalls:sys_enter_getuid is the existing syscall tracing event.

The first two results show that ftrace is about 12% slower than ktrace
at adding events to the ring buffer.

The last result shows that the syscall tracing is about twice as
expensive as a normal tracepoint, which is interesting.


Similar benchmarks for lttng and perf_events are coming. Expect at
least one of those by next week.

Thanks,
David Sharp

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Benchmarks of kernel tracing options (ftrace and ktrace)
  2010-10-13 23:19   ` Benchmarks of kernel tracing options (ftrace and ktrace) David Sharp
@ 2010-10-13 23:50     ` Steven Rostedt
  2010-10-14  0:00       ` Mathieu Desnoyers
  2010-11-16 23:27       ` David Sharp
  0 siblings, 2 replies; 9+ messages in thread
From: Steven Rostedt @ 2010-10-13 23:50 UTC (permalink / raw)
  To: David Sharp
  Cc: linux-kernel, Mathieu Desnoyers, Ingo Molnar, Andrew Morton,
	Michael Rubin, Frederic Weisbecker

On Wed, 2010-10-13 at 16:19 -0700, David Sharp wrote:
> Google uses kernel tracing aggressively in the its data centers. We

Thanks!

> wrote our own kernel tracer, ktrace. However ftrace, perf and LTTng
> all have a better feature set than ktrace, so we are abandoning that
> code.

Cool!

> 
> We see several implementations of tracing aimed at the mainline kernel
> and wanted a fair comparison of each of them to make sure they will
> not significantly impact performance. A tracing toolkit that is too
> expensive is not usable in our environment.
> 

[ snip for now (I'm traveling) ]

> This first set of benchmark results compares ftrace to ktrace. The
> numbers below are the "on" result minus the "off" result for each
> configuration.
> 
> ktrace: 200ns  (tracepoint: kernel_getuid)
> ftrace: 224ns   (tracepoint: timer:sys_getuid)
> ftrace: 587ns   (tracepoint: syscalls:sys_enter_getuid)


> The last result shows that the syscall tracing is about twice as
> expensive as a normal tracepoint, which is interesting.

Argh, the syscall tracing has a lot of overhead. There is only one
tracepoint that is hooked into the ptrace code, and will save all
registers before calling the functions. It enables tracing on all
syscalls and there's a table that decides whether or not to trace the
syscall.

So I'm not surprised with the result that the syscall trace point is so
slow (note, perf uses the same infrastructure).

-- Steve



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Benchmarks of kernel tracing options (ftrace and ktrace)
  2010-10-13 23:50     ` Steven Rostedt
@ 2010-10-14  0:00       ` Mathieu Desnoyers
  2010-10-14 11:27         ` Frederic Weisbecker
  2010-10-14 18:04         ` Jason Baron
  2010-11-16 23:27       ` David Sharp
  1 sibling, 2 replies; 9+ messages in thread
From: Mathieu Desnoyers @ 2010-10-14  0:00 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: David Sharp, linux-kernel, Ingo Molnar, Andrew Morton,
	Michael Rubin, Frederic Weisbecker

* Steven Rostedt (rostedt@goodmis.org) wrote:
> On Wed, 2010-10-13 at 16:19 -0700, David Sharp wrote:
> > Google uses kernel tracing aggressively in the its data centers. We
> 
> Thanks!
> 
> > wrote our own kernel tracer, ktrace. However ftrace, perf and LTTng
> > all have a better feature set than ktrace, so we are abandoning that
> > code.
> 
> Cool!
> 
> > 
> > We see several implementations of tracing aimed at the mainline kernel
> > and wanted a fair comparison of each of them to make sure they will
> > not significantly impact performance. A tracing toolkit that is too
> > expensive is not usable in our environment.
> > 
> 
> [ snip for now (I'm traveling) ]
> 
> > This first set of benchmark results compares ftrace to ktrace. The
> > numbers below are the "on" result minus the "off" result for each
> > configuration.
> > 
> > ktrace: 200ns  (tracepoint: kernel_getuid)
> > ftrace: 224ns   (tracepoint: timer:sys_getuid)
> > ftrace: 587ns   (tracepoint: syscalls:sys_enter_getuid)
> 
> 
> > The last result shows that the syscall tracing is about twice as
> > expensive as a normal tracepoint, which is interesting.
> 
> Argh, the syscall tracing has a lot of overhead. There is only one
> tracepoint that is hooked into the ptrace code, and will save all
> registers before calling the functions. It enables tracing on all
> syscalls and there's a table that decides whether or not to trace the
> syscall.
> 
> So I'm not surprised with the result that the syscall trace point is so
> slow (note, perf uses the same infrastructure).

Yes, the interesting result in this first set of benchmarks is that syscall
tracing is quite slow. We could do better though. I think a different scheme
for syscall tracing that would not rely of saving all registers is needed. We
could do this automatically by adding tracepoints in the actual syscall
functions by modifying the DEFINE_SYSCALL*() macros. I would leave the current
syscall tracing mode as the default though, especially until gcc 4.5 and asm
gotos are more broadly adopted.

So the modified DEFINE_SYSCALL*() macros would generate code that looks like:
(approximately)

static int _syscall_name(type1, name1);

int syscall_name(type1 name1)
{
        int ret;

        trace_syscall_entry_name(name1);
        ret = _syscall_name(name1);
        trace_syscall_exit_name(name1);
        return ret;
}

static int _syscall_name(typê1, name1)


So when we expand:

DEFINE_SYSCALL1(name, type1, name1)
{
  .. actual body ...
}

We have the tracepoints automatically added.

Mathieu


-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Benchmarks of kernel tracing options (ftrace and ktrace)
  2010-10-14  0:00       ` Mathieu Desnoyers
@ 2010-10-14 11:27         ` Frederic Weisbecker
  2010-10-14 18:04         ` Jason Baron
  1 sibling, 0 replies; 9+ messages in thread
From: Frederic Weisbecker @ 2010-10-14 11:27 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Steven Rostedt, David Sharp, linux-kernel, Ingo Molnar,
	Andrew Morton, Michael Rubin

On Wed, Oct 13, 2010 at 08:00:27PM -0400, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > On Wed, 2010-10-13 at 16:19 -0700, David Sharp wrote:
> > > Google uses kernel tracing aggressively in the its data centers. We
> > 
> > Thanks!
> > 
> > > wrote our own kernel tracer, ktrace. However ftrace, perf and LTTng
> > > all have a better feature set than ktrace, so we are abandoning that
> > > code.
> > 
> > Cool!
> > 
> > > 
> > > We see several implementations of tracing aimed at the mainline kernel
> > > and wanted a fair comparison of each of them to make sure they will
> > > not significantly impact performance. A tracing toolkit that is too
> > > expensive is not usable in our environment.
> > > 
> > 
> > [ snip for now (I'm traveling) ]
> > 
> > > This first set of benchmark results compares ftrace to ktrace. The
> > > numbers below are the "on" result minus the "off" result for each
> > > configuration.
> > > 
> > > ktrace: 200ns  (tracepoint: kernel_getuid)
> > > ftrace: 224ns   (tracepoint: timer:sys_getuid)
> > > ftrace: 587ns   (tracepoint: syscalls:sys_enter_getuid)
> > 
> > 
> > > The last result shows that the syscall tracing is about twice as
> > > expensive as a normal tracepoint, which is interesting.
> > 
> > Argh, the syscall tracing has a lot of overhead. There is only one
> > tracepoint that is hooked into the ptrace code, and will save all
> > registers before calling the functions. It enables tracing on all
> > syscalls and there's a table that decides whether or not to trace the
> > syscall.
> > 
> > So I'm not surprised with the result that the syscall trace point is so
> > slow (note, perf uses the same infrastructure).
> 
> Yes, the interesting result in this first set of benchmarks is that syscall
> tracing is quite slow. We could do better though. I think a different scheme
> for syscall tracing that would not rely of saving all registers is needed. We
> could do this automatically by adding tracepoints in the actual syscall
> functions by modifying the DEFINE_SYSCALL*() macros. I would leave the current
> syscall tracing mode as the default though, especially until gcc 4.5 and asm
> gotos are more broadly adopted.
> 
> So the modified DEFINE_SYSCALL*() macros would generate code that looks like:
> (approximately)
> 
> static int _syscall_name(type1, name1);
> 
> int syscall_name(type1 name1)
> {
>         int ret;
> 
>         trace_syscall_entry_name(name1);
>         ret = _syscall_name(name1);
>         trace_syscall_exit_name(name1);
>         return ret;
> }
> 
> static int _syscall_name(typê1, name1)
> 
> 
> So when we expand:
> 
> DEFINE_SYSCALL1(name, type1, name1)
> {
>   .. actual body ...
> }
> 
> We have the tracepoints automatically added.
> 
> Mathieu



Looks like that would be a good improvement, it would also
simplify some tricky code parts I think.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Benchmarks of kernel tracing options (ftrace and ktrace)
  2010-10-14  0:00       ` Mathieu Desnoyers
  2010-10-14 11:27         ` Frederic Weisbecker
@ 2010-10-14 18:04         ` Jason Baron
  1 sibling, 0 replies; 9+ messages in thread
From: Jason Baron @ 2010-10-14 18:04 UTC (permalink / raw)
  To: Mathieu Desnoyers, dhsharp
  Cc: Steven Rostedt, linux-kernel, Ingo Molnar, Andrew Morton,
	Michael Rubin, Frederic Weisbecker

On Wed, Oct 13, 2010 at 08:00:27PM -0400, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > On Wed, 2010-10-13 at 16:19 -0700, David Sharp wrote:
> > > Google uses kernel tracing aggressively in the its data centers. We
> > 
> > Thanks!
> > 
> > > wrote our own kernel tracer, ktrace. However ftrace, perf and LTTng
> > > all have a better feature set than ktrace, so we are abandoning that
> > > code.
> > 
> > Cool!
> > 
> > > 
> > > We see several implementations of tracing aimed at the mainline kernel
> > > and wanted a fair comparison of each of them to make sure they will
> > > not significantly impact performance. A tracing toolkit that is too
> > > expensive is not usable in our environment.
> > > 
> > 
> > [ snip for now (I'm traveling) ]
> > 
> > > This first set of benchmark results compares ftrace to ktrace. The
> > > numbers below are the "on" result minus the "off" result for each
> > > configuration.
> > > 
> > > ktrace: 200ns  (tracepoint: kernel_getuid)
> > > ftrace: 224ns   (tracepoint: timer:sys_getuid)
> > > ftrace: 587ns   (tracepoint: syscalls:sys_enter_getuid)
> > 
> > 
> > > The last result shows that the syscall tracing is about twice as
> > > expensive as a normal tracepoint, which is interesting.
> > 
> > Argh, the syscall tracing has a lot of overhead. There is only one
> > tracepoint that is hooked into the ptrace code, and will save all
> > registers before calling the functions. It enables tracing on all
> > syscalls and there's a table that decides whether or not to trace the
> > syscall.
> > 
> > So I'm not surprised with the result that the syscall trace point is so
> > slow (note, perf uses the same infrastructure).
> 
> Yes, the interesting result in this first set of benchmarks is that syscall
> tracing is quite slow. We could do better though. I think a different scheme
> for syscall tracing that would not rely of saving all registers is needed. We
> could do this automatically by adding tracepoints in the actual syscall
> functions by modifying the DEFINE_SYSCALL*() macros. I would leave the current
> syscall tracing mode as the default though, especially until gcc 4.5 and asm
> gotos are more broadly adopted.
> 
> So the modified DEFINE_SYSCALL*() macros would generate code that looks like:
> (approximately)
> 
> static int _syscall_name(type1, name1);
> 
> int syscall_name(type1 name1)
> {
>         int ret;
> 
>         trace_syscall_entry_name(name1);
>         ret = _syscall_name(name1);
>         trace_syscall_exit_name(name1);
>         return ret;
> }
> 
> static int _syscall_name(typê1, name1)
> 
> 
> So when we expand:
> 
> DEFINE_SYSCALL1(name, type1, name1)
> {
>   .. actual body ...
> }
> 
> We have the tracepoints automatically added.
> 
> Mathieu
> 

right, the current syscall tracepoint code, is optimized for the 'off'
case, not the 'on' case. That is, when off, we add no additional
overhead to the syscall path. If we add 2 tracepoints per syscall we
are adding some cost even when disabled - to every syscall.

However, with the addition of the asm goto code, those concerns will
hopefully go away as Mathieu points out.

thanks,

-Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Benchmarks of kernel tracing options (ftrace and ktrace)
  2010-10-13 23:50     ` Steven Rostedt
  2010-10-14  0:00       ` Mathieu Desnoyers
@ 2010-11-16 23:27       ` David Sharp
  2010-11-16 23:32         ` Steven Rostedt
  1 sibling, 1 reply; 9+ messages in thread
From: David Sharp @ 2010-11-16 23:27 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Mathieu Desnoyers, Ingo Molnar, Andrew Morton,
	Michael Rubin, Frederic Weisbecker

On Wed, Oct 13, 2010 at 4:50 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Wed, 2010-10-13 at 16:19 -0700, David Sharp wrote:
>> Google uses kernel tracing aggressively in the its data centers. We
>
> Thanks!
>
>> wrote our own kernel tracer, ktrace. However ftrace, perf and LTTng
>> all have a better feature set than ktrace, so we are abandoning that
>> code.
>
> Cool!
>
>>
>> We see several implementations of tracing aimed at the mainline kernel
>> and wanted a fair comparison of each of them to make sure they will
>> not significantly impact performance. A tracing toolkit that is too
>> expensive is not usable in our environment.
>>
>
> [ snip for now (I'm traveling) ]
>
>> This first set of benchmark results compares ftrace to ktrace. The
>> numbers below are the "on" result minus the "off" result for each
>> configuration.
>>
>> ktrace: 200ns  (tracepoint: kernel_getuid)
>> ftrace: 224ns   (tracepoint: timer:sys_getuid)
>> ftrace: 587ns   (tracepoint: syscalls:sys_enter_getuid)
>
>
>> The last result shows that the syscall tracing is about twice as
>> expensive as a normal tracepoint, which is interesting.
>
> Argh, the syscall tracing has a lot of overhead. There is only one
> tracepoint that is hooked into the ptrace code, and will save all
> registers before calling the functions. It enables tracing on all
> syscalls and there's a table that decides whether or not to trace the
> syscall.
>
> So I'm not surprised with the result that the syscall trace point is so
> slow (note, perf uses the same infrastructure).
>
> -- Steve

Steve, Mike told me that you may have some patches to improve overhead
of syscall tracing. I'd be interested in testing them. Can you send
them my way?

Thanks,
d#

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Benchmarks of kernel tracing options (ftrace and ktrace)
  2010-11-16 23:27       ` David Sharp
@ 2010-11-16 23:32         ` Steven Rostedt
  2010-11-19  1:11           ` Michael Rubin
  0 siblings, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2010-11-16 23:32 UTC (permalink / raw)
  To: David Sharp
  Cc: linux-kernel, Mathieu Desnoyers, Ingo Molnar, Andrew Morton,
	Michael Rubin, Frederic Weisbecker

On Tue, 2010-11-16 at 15:27 -0800, David Sharp wrote:

> Steve, Mike told me that you may have some patches to improve overhead
> of syscall tracing. I'd be interested in testing them. Can you send
> them my way?

Mike may have been confused ;-)

I did tell him that some ring buffer improvements did make it into the
kernel. You might want to try the latest mainline again.

I even have other patches that might help. I'll have to go and dig them
up.

-- Steve



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Benchmarks of kernel tracing options (ftrace and ktrace)
  2010-11-16 23:32         ` Steven Rostedt
@ 2010-11-19  1:11           ` Michael Rubin
  2010-11-19  1:41             ` Steven Rostedt
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Rubin @ 2010-11-19  1:11 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: David Sharp, linux-kernel, Mathieu Desnoyers, Ingo Molnar,
	Andrew Morton, Frederic Weisbecker

On Tue, Nov 16, 2010 at 3:32 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> Mike may have been confused ;-)

I am always confused.

> I did tell him that some ring buffer improvements did make it into the
> kernel. You might want to try the latest mainline again.

You told me that you had patches (in 2.6.37) I think that would allow
us to specifically improve the syscall tracing performance.

Can you point us at em?

mrubin

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Benchmarks of kernel tracing options (ftrace and ktrace)
  2010-11-19  1:11           ` Michael Rubin
@ 2010-11-19  1:41             ` Steven Rostedt
  0 siblings, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2010-11-19  1:41 UTC (permalink / raw)
  To: Michael Rubin
  Cc: David Sharp, linux-kernel, Mathieu Desnoyers, Ingo Molnar,
	Andrew Morton, Frederic Weisbecker

On Thu, 2010-11-18 at 17:11 -0800, Michael Rubin wrote:

> > I did tell him that some ring buffer improvements did make it into the
> > kernel. You might want to try the latest mainline again.
> 
> You told me that you had patches (in 2.6.37) I think that would allow
> us to specifically improve the syscall tracing performance.
> 
> Can you point us at em?

Look at commits:

afcc5c6872f0215d515a637041bb51f8691a8ea7
69d1b839f7eee347e357b3f6cce7f630cc6ff93d
140ff89127c74b1b1c1b0152a36ea3720ccf6bc3
d9abde2138e0a00a0d7e44676928efa0ef629d48
b8b2663bd7c9da04ac804659b9f617c199d0252c

-- Steve



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2010-11-19  1:41 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <AANLkTikfYy-kYb1=KbsYwHd0_vcf20d2nPSfFynugz8z@mail.gmail.com>
     [not found] ` <AANLkTikKwx6okpX4pxVzTvrVNm=KhUvLQGs0_ziwo6fX@mail.gmail.com>
2010-10-13 23:19   ` Benchmarks of kernel tracing options (ftrace and ktrace) David Sharp
2010-10-13 23:50     ` Steven Rostedt
2010-10-14  0:00       ` Mathieu Desnoyers
2010-10-14 11:27         ` Frederic Weisbecker
2010-10-14 18:04         ` Jason Baron
2010-11-16 23:27       ` David Sharp
2010-11-16 23:32         ` Steven Rostedt
2010-11-19  1:11           ` Michael Rubin
2010-11-19  1:41             ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).