linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [ANNOUNCE] New tools: lttngtrace and lttngreport
@ 2010-11-15 13:42 Mathieu Desnoyers
  2010-11-15 16:07 ` Mathieu Desnoyers
  0 siblings, 1 reply; 10+ messages in thread
From: Mathieu Desnoyers @ 2010-11-15 13:42 UTC (permalink / raw)
  To: Linus Torvalds, Ingo Molnar, Thomas Gleixner, Arjan van de Ven,
	Andrew Morton, Steven Rostedt
  Cc: ltt-dev, linux-kernel

Hi everyone,

Recently, Linus came up with a request: a "super-strace" based on kernel
tracing. So I just came up with two bash scripts that perform exactly this within
LTTng: lttngtrace starts tracing, runs a program, stops tracing. lttngreport
generates a wakeup dependency report. I'm appending the script below (depends on
LTTng kernel tree, LTTng modules package, ltt-control and lttv).

It will report which interrupt/softirq and wakeup dependency chain caused each
wakeup, along with the duration within each step of the wakeup chain. This
includes I/O activity, traps, and syscalls.

Usage, e.g. to trace a git clone (must be run as root):

# lttngtrace git clone git://git.lttng.org/lttv.git
[...]

% lttngreport
[...]
Woken up in context of 29856 [/usr/lib/git-core/git] in high-level state RUNNING
Blocked in RUNNING, SYSCALL 7 [sys_waitpid+0x0/0x30], (times: 139642.854813709-139642.855068608, dur: 0.000255)
Woken up in context of 29856 [/usr/lib/git-core/git] in high-level state RUNNING
Blocked in RUNNING, SYSCALL 118 [sys_fsync+0x0/0x10], (times: 139642.855714820-139642.855795631, dur: 0.000081)
Woken up by a SoftIRQ: SoftIRQ 4 [blk_done_softirq+0x0/0x70]
Blocked in RUNNING, SYSCALL 118 [sys_fsync+0x0/0x10], (times: 139642.855870453-139642.858538485, dur: 0.002668)
         Blocked in UNKNOWN, (times: 139642.855869565-139642.858434668, dur: 0.002565)
         Woken up by a SoftIRQ: SoftIRQ 4 [blk_done_softirq+0x0/0x70]
         Blocked in UNKNOWN, (times: 139642.858483534-139642.858528693, dur: 0.000045)
         Woken up by a SoftIRQ: SoftIRQ 4 [blk_done_softirq+0x0/0x70]
Woken up in context of 1140 [kjournald]UNKNOWN,  in high-level state UNKNOWN
Blocked in RUNNING, SYSCALL 168 [sys_poll+0x0/0xc0], (times: 139642.859478216-139642.860636322, dur: 0.001158)
Woken up by a SoftIRQ: SoftIRQ 3 [net_rx_action+0x0/0x1e0]
Blocked in RUNNING, SYSCALL 38 [sys_rename+0x0/0x30], (times: 139642.861159211-139642.861240877, dur: 0.000082)
Woken up by a SoftIRQ: SoftIRQ 4 [blk_done_softirq+0x0/0x70]

I'll be happy to tweak the report as I receive feedback.

(bash scripts below, depends on current lttv git tree)

Thanks,

Mathieu

-------
lttngtrace:

#!/bin/sh

rm -fr /tmp/autotrace1
ltt-armall

lttctl -C -w /tmp/autotrace1 autotrace1

${*} &
CHILDPID=$!
wait

lttctl -D autotrace1
ltt-disarmall

echo ${CHILDPID} > /tmp/autotrace1-pid

------
lttngreport:

#!/bin/sh

CHILDPID=$(cat /tmp/autotrace1-pid)

lttv -m depanalysis -t /tmp/autotrace1 --dep-pid ${CHILDPID}


-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [ANNOUNCE] New tools: lttngtrace and lttngreport
  2010-11-15 13:42 [ANNOUNCE] New tools: lttngtrace and lttngreport Mathieu Desnoyers
@ 2010-11-15 16:07 ` Mathieu Desnoyers
  2010-11-17 11:37   ` Andi Kleen
  0 siblings, 1 reply; 10+ messages in thread
From: Mathieu Desnoyers @ 2010-11-15 16:07 UTC (permalink / raw)
  To: Linus Torvalds, Ingo Molnar, Thomas Gleixner, Arjan van de Ven,
	Andrew Morton, Steven Rostedt
  Cc: ltt-dev, linux-kernel

* Mathieu Desnoyers (mathieu.desnoyers@efficios.com) wrote:
> Hi everyone,
> 
> Recently, Linus came up with a request: a "super-strace" based on kernel
> tracing. So I just came up with two bash scripts that perform exactly this within
> LTTng: lttngtrace starts tracing, runs a program, stops tracing. lttngreport
> generates a wakeup dependency report. I'm appending the script below (depends on
> LTTng kernel tree, LTTng modules package, ltt-control and lttv).
> 
> It will report which interrupt/softirq and wakeup dependency chain caused each
> wakeup, along with the duration within each step of the wakeup chain. This
> includes I/O activity, traps, and syscalls.

Hi again,

I just enhanced the wakeup dependency chain output to make it more
understandable by humans. It's astonishing what we can do with a bit of ASCII
art. For instance, here I am showing piece of firefox start trace:

        --> Blocked in RUNNING, SYSCALL 142 [sys_select+0x0/0xc0], ([...], dur: 0.029567)
        |    --> Blocked in RUNNING, SYSCALL 168 [sys_poll+0x0/0xc0], ([...], dur: 1.187935)
        |    --- Woken up by an IRQ: IRQ 0 [timer]
        --- Woken up in context of 7401 [gnome-power-man] in high-level state RUNNING

As you see, firefox blocked for 29.5 microseconds on select() waiting for a file
descriptor, the wake up came from "gnome-power-manager" after the poll() timed
out (1 second timeout). (This specific scenario cannot be reproduced at every
firefox run, but it is still a good example of how these spurious delays can be
identified with the tool.)

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [ANNOUNCE] New tools: lttngtrace and lttngreport
  2010-11-15 16:07 ` Mathieu Desnoyers
@ 2010-11-17 11:37   ` Andi Kleen
  2010-11-17 11:47     ` [ltt-dev] " Benjamin Poirier
  2010-11-17 17:31     ` Mathieu Desnoyers
  0 siblings, 2 replies; 10+ messages in thread
From: Andi Kleen @ 2010-11-17 11:37 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Linus Torvalds, Ingo Molnar, Thomas Gleixner, Arjan van de Ven,
	Andre, ltt-dev, linux-kernel

Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:
>
>         --> Blocked in RUNNING, SYSCALL 142 [sys_select+0x0/0xc0], ([...], dur: 0.029567)
>         |    --> Blocked in RUNNING, SYSCALL 168 [sys_poll+0x0/0xc0], ([...], dur: 1.187935)
>         |    --- Woken up by an IRQ: IRQ 0 [timer]
>         --- Woken up in context of 7401 [gnome-power-man] in high-level state RUNNING

Very nice! Now how can we get that with an unpatched kernel tree?

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [ltt-dev] [ANNOUNCE] New tools: lttngtrace and lttngreport
  2010-11-17 11:37   ` Andi Kleen
@ 2010-11-17 11:47     ` Benjamin Poirier
  2010-11-17 17:31     ` Mathieu Desnoyers
  1 sibling, 0 replies; 10+ messages in thread
From: Benjamin Poirier @ 2010-11-17 11:47 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Mathieu Desnoyers, Linus Torvalds, linux-kernel, Andre, ltt-dev,
	Ingo Molnar, Arjan van de Ven, Thomas Gleixner

On 17/11/10 06:37 AM, Andi Kleen wrote:
> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:
>>
>>         --> Blocked in RUNNING, SYSCALL 142 [sys_select+0x0/0xc0], ([...], dur: 0.029567)
>>         |    --> Blocked in RUNNING, SYSCALL 168 [sys_poll+0x0/0xc0], ([...], dur: 1.187935)
>>         |    --- Woken up by an IRQ: IRQ 0 [timer]
>>         --- Woken up in context of 7401 [gnome-power-man] in high-level state RUNNING
> 
> Very nice! Now how can we get that with an unpatched kernel tree?

Oh, that's easy: integrate LTTng!

> 
> -Andi
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [ANNOUNCE] New tools: lttngtrace and lttngreport
  2010-11-17 11:37   ` Andi Kleen
  2010-11-17 11:47     ` [ltt-dev] " Benjamin Poirier
@ 2010-11-17 17:31     ` Mathieu Desnoyers
  2010-11-17 21:44       ` Thomas Gleixner
  2010-11-17 22:01       ` Ingo Molnar
  1 sibling, 2 replies; 10+ messages in thread
From: Mathieu Desnoyers @ 2010-11-17 17:31 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Linus Torvalds, Ingo Molnar, Thomas Gleixner, Arjan van de Ven,
	Andre, ltt-dev, linux-kernel, pierre-marc.fournier

* Andi Kleen (andi@firstfloor.org) wrote:
> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:
> >
> >         --> Blocked in RUNNING, SYSCALL 142 [sys_select+0x0/0xc0], ([...], dur: 0.029567)
> >         |    --> Blocked in RUNNING, SYSCALL 168 [sys_poll+0x0/0xc0], ([...], dur: 1.187935)
> >         |    --- Woken up by an IRQ: IRQ 0 [timer]
> >         --- Woken up in context of 7401 [gnome-power-man] in high-level state RUNNING
> 
> Very nice! Now how can we get that with an unpatched kernel tree?

Well, I'm afraid the collection approach "trace" is currently taking won't allow
this kind of dependency wakeup chain tracking, because they focus on tracing
operations happening on a thread and its children, but the reality is that the
wakeup chains often spread outside of this scope.

This is why lttngtrace gathers a system-wide trace even though we're mostly
intested in the wait/wakeups of a specific PID.

Wakeup dependency analysis depends on a few key events to track these chains.
It's all described in Pierre-Marc Fournier's master thesis and implemented as
a LTTV plugin. Don't worry about the French cover page, the whole chapter 2
(p. 40-74) is in English:

http://publications.polymtl.ca/117/1/2009_Pierre-MarcFournier.pdf

The instrumentation needed for it, excerpt from the thesis:

"We instrumented scheduling changes (and whether they are caused by preemption
or blocking), process wakeups, IRQ entry/exit, softIRQ entry/exit, trap
entry/exit and process forking. Additionally, special state dump events are
generated at the beginning of the trace that indicate in what control flow state
(see section 2.2.3.1) each process is initially. All this instrumentation is
kernel-based, no application instrumentation is necessary.

In order to display detailed information in reports, the Delay Analyzer also
consumes events that give the following information.
– Mapping between IRQ number and name of hardware using it
– Mapping between softIRQ number and the function handling it
– Mapping between a process ID and its name
– Arguments of the open() system call, in order to save the mapping between file
  descriptor and file name
– Arguments of the read() system call, in order to know what file caused the
  system call to block
– What file descriptors caused a poll() system call to wake up
Work is under way to instrument yet more specific system calls."

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [ANNOUNCE] New tools: lttngtrace and lttngreport
  2010-11-17 17:31     ` Mathieu Desnoyers
@ 2010-11-17 21:44       ` Thomas Gleixner
  2010-11-17 22:33         ` Steven Rostedt
  2010-11-18 12:57         ` Mathieu Desnoyers
  2010-11-17 22:01       ` Ingo Molnar
  1 sibling, 2 replies; 10+ messages in thread
From: Thomas Gleixner @ 2010-11-17 21:44 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Andi Kleen, Linus Torvalds, Ingo Molnar, Arjan van de Ven, Andre,
	ltt-dev, linux-kernel, pierre-marc.fournier

On Wed, 17 Nov 2010, Mathieu Desnoyers wrote:

> * Andi Kleen (andi@firstfloor.org) wrote:
> > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:
> > >
> > >         --> Blocked in RUNNING, SYSCALL 142 [sys_select+0x0/0xc0], ([...], dur: 0.029567)
> > >         |    --> Blocked in RUNNING, SYSCALL 168 [sys_poll+0x0/0xc0], ([...], dur: 1.187935)
> > >         |    --- Woken up by an IRQ: IRQ 0 [timer]
> > >         --- Woken up in context of 7401 [gnome-power-man] in high-level state RUNNING
> > 
> > Very nice! Now how can we get that with an unpatched kernel tree?
> 
> Well, I'm afraid the collection approach "trace" is currently taking won't allow
> this kind of dependency wakeup chain tracking, because they focus on tracing
> operations happening on a thread and its children, but the reality is that the
> wakeup chains often spread outside of this scope.

You are completely missing the point. There is no need to point out
that 'trace' does not do that. It is tracking a process (group)
context (though it can do system wide with the right permissions as
well).

And it's completely irrelevant for a user space programmer where the
kernel spends its time. What's not irrelevant is the information what
caused the kernel to spend time, i.e. which access to which data
resulted in a page fault or IO and how long did it take to come back.

It's completely irrelevant for him whether the kernel ran in circles
or not. If he sees a repeating pattern that a PF recovery or a
read/write to some file takes ages, he'll poke the sysadmin or the
kernel dude, which then will drill down into the gory details.

http://lwn.net/Articles/415760/

 "... Indeed I've been wishing for a tool which would easily tell me
 what pages I'm faulting in (and in what order) out of a 5GB mmaped
 file, in order to help debug performance issues with disk seeks when
 the file is totally paged out."

That's what relevant to user space developers, not the gory details
why the kernel took time X to retrieve that data from disk. Simply
because you can improve performance when you have such information by
rearranging your code and access patterns. The same applies for other
things like cache misses, which can be easily integrated into 'trace'.

> This is why lttngtrace gathers a system-wide trace even though we're mostly
> intested in the wait/wakeups of a specific PID.

Which results in a permission problem which you are completely
ignoring. Not a surprise though - I'm used to the academic way of
defining preliminaries and ignoring side effects just to get the paper
written.

> Wakeup dependency analysis depends on a few key events to track these chains.
> It's all described in Pierre-Marc Fournier's master thesis and implemented as

You seem to believe that kernel developers need to read a thesis to
understand wakeup chains and what it takes to trace them?

Dammit, we do that on a daily base and we did it even before we had
the ftrace/perf infrastructure in place without reading a thesis.

Stop this bullshit once and forever. You can do that in the lecture
room of your university and in the seminar for big corporate engineers
who are made to believe that this is important to improve their
productivity.

Your "DrTracing knows it better attitude" starts to be really annoying.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [ANNOUNCE] New tools: lttngtrace and lttngreport
  2010-11-17 17:31     ` Mathieu Desnoyers
  2010-11-17 21:44       ` Thomas Gleixner
@ 2010-11-17 22:01       ` Ingo Molnar
  2010-11-18  2:38         ` Mathieu Desnoyers
  1 sibling, 1 reply; 10+ messages in thread
From: Ingo Molnar @ 2010-11-17 22:01 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Andi Kleen, Linus Torvalds, Thomas Gleixner, Arjan van de Ven,
	Andre, ltt-dev, linux-kernel, pierre-marc.fournier


* Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> Well, I'm afraid the collection approach "trace" is currently taking won't allow 
> this kind of dependency wakeup chain tracking, because they focus on tracing 
> operations happening on a thread and its children, but the reality is that the 
> wakeup chains often spread outside of this scope.
>
> This is why lttngtrace gathers a system-wide trace even though we're mostly 
> intested in the wait/wakeups of a specific PID.

Mathieu, what you say is factually wrong - of course perf events allows system-wide 
tracing, it always did.

This is available in the 'trace' tool as well, try:

  $ trace record --all sleep 1          # do system-wide tracing
  
  # trace recorded [1.928 MB] - try 'trace summary' to get an overview

  $ trace summary

   .-----------------------------------.
 __)    Summary of 'sleep 1' events    (__

              [ task - pid ]     [ events ] [ ratio ]  [ runtime ]
 _____________________________________________________________________

             swapper - 0     :         64   [  0.4% ]      0.000 ms
         ksoftirqd/0 - 3     :          6   [  0.0% ]      0.063 ms
         migration/1 - 7     :          2   [  0.0% ]      0.000 ms
         kworker/0:1 - 10    :          6   [  0.0% ]      0.139 ms
         kworker/6:1 - 53    :          3   [  0.0% ]      0.014 ms
         kworker/5:1 - 54    :          3   [  0.0% ]      0.014 ms
         kworker/4:1 - 55    :          3   [  0.0% ]      0.014 ms
         kworker/3:1 - 56    :          3   [  0.0% ]      0.015 ms
         kworker/2:1 - 57    :          3   [  0.0% ]      0.015 ms
         kworker/1:1 - 58    :          3   [  0.0% ]      0.015 ms
        kworker/15:1 - 318   :          6   [  0.0% ]      0.045 ms
        kworker/14:1 - 319   :          6   [  0.0% ]      0.033 ms
        kworker/13:1 - 320   :          6   [  0.0% ]      0.036 ms
        kworker/12:1 - 321   :          6   [  0.0% ]      0.035 ms
        kworker/11:1 - 322   :          6   [  0.0% ]      0.041 ms
        kworker/10:1 - 323   :          6   [  0.0% ]      0.037 ms
         kworker/9:1 - 324   :          6   [  0.0% ]      0.042 ms
         kworker/7:1 - 326   :          6   [  0.0% ]      0.032 ms
         kworker/8:2 - 1047  :          6   [  0.0% ]      0.039 ms
            sendmail - 2327  :         18   [  0.1% ]      0.111 ms
               trace - 32281 :      15728   [ 96.2% ]     86.478 ms
               sleep - 32282 :        460   [  2.8% ]      1.305 ms
 _____________________________________________________________________

As you can see all those kworker events - it's a system-wide trace.

This is in addition to per process and per workflow (child hierarchy) tracing - 
which we expect to be more commonly used.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [ANNOUNCE] New tools: lttngtrace and lttngreport
  2010-11-17 21:44       ` Thomas Gleixner
@ 2010-11-17 22:33         ` Steven Rostedt
  2010-11-18 12:57         ` Mathieu Desnoyers
  1 sibling, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2010-11-17 22:33 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Mathieu Desnoyers, Andi Kleen, Linus Torvalds, Ingo Molnar,
	Arjan van de Ven, Andre, ltt-dev, linux-kernel,
	pierre-marc.fournier

Hmm, I was somehow cut from the Cc list?

On Wed, Nov 17, 2010 at 10:44:23PM +0100, Thomas Gleixner wrote:
> On Wed, 17 Nov 2010, Mathieu Desnoyers wrote:
> 
> > * Andi Kleen (andi@firstfloor.org) wrote:
> > > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:
> > > >
> > > >         --> Blocked in RUNNING, SYSCALL 142 [sys_select+0x0/0xc0], ([...], dur: 0.029567)
> > > >         |    --> Blocked in RUNNING, SYSCALL 168 [sys_poll+0x0/0xc0], ([...], dur: 1.187935)
> > > >         |    --- Woken up by an IRQ: IRQ 0 [timer]
> > > >         --- Woken up in context of 7401 [gnome-power-man] in high-level state RUNNING
> > > 
> > > Very nice! Now how can we get that with an unpatched kernel tree?
> > 
> > Well, I'm afraid the collection approach "trace" is currently taking won't allow
> > this kind of dependency wakeup chain tracking, because they focus on tracing
> > operations happening on a thread and its children, but the reality is that the
> > wakeup chains often spread outside of this scope.
> 
> You are completely missing the point. There is no need to point out
> that 'trace' does not do that. It is tracking a process (group)
> context (though it can do system wide with the right permissions as
> well).
> 
> And it's completely irrelevant for a user space programmer where the
> kernel spends its time. What's not irrelevant is the information what
> caused the kernel to spend time, i.e. which access to which data
> resulted in a page fault or IO and how long did it take to come back.
> 
> It's completely irrelevant for him whether the kernel ran in circles
> or not. If he sees a repeating pattern that a PF recovery or a
> read/write to some file takes ages, he'll poke the sysadmin or the
> kernel dude, which then will drill down into the gory details.
> 
> http://lwn.net/Articles/415760/
> 
>  "... Indeed I've been wishing for a tool which would easily tell me
>  what pages I'm faulting in (and in what order) out of a 5GB mmaped
>  file, in order to help debug performance issues with disk seeks when
>  the file is totally paged out."
> 
> That's what relevant to user space developers, not the gory details
> why the kernel took time X to retrieve that data from disk. Simply
> because you can improve performance when you have such information by
> rearranging your code and access patterns. The same applies for other
> things like cache misses, which can be easily integrated into 'trace'.

Yes, I think the point that is missing here is the focus of the user.

As Linus grinded into my head, he does not care about what the kernel is
doing when he is working on git. He wants to know if he can make his app
work better. Perhaps, localize data or such, which can help. Cache
missies would also be good feedback.

If the user sees that a syscall or pagefault is totally wacked, they can
then (as Thomas said) poke a kernel developer or admin to fix it. They
wont understand the internals of the kernel to get anything useful out
of it.


> 
> > This is why lttngtrace gathers a system-wide trace even though we're mostly
> > intested in the wait/wakeups of a specific PID.
> 
> Which results in a permission problem which you are completely
> ignoring. Not a surprise though - I'm used to the academic way of
> defining preliminaries and ignoring side effects just to get the paper
> written.

Yes, we do not want average Joe to be able to look at anything that they
can't already see (to a point). Sure, let them see when their pages are
being mapped in, or when they take a cache hit, or they block on IO, and
for how long. But I would not want them to be able to see how other
tasks that they do not own are affecting the system. Allowing that would
be a rootkit developer's dream. If they know of some vulnerablility that
takes some timing to trigger, they can then monitor that sshd or
whatever, and see inject an attack at just the right moment to steal a
password, or gain access to something else that they should not have.

But for admins, this is fine, since they have that ability anyway.

Thus, LTTng is a great tool for admins who want to analyze how their
system is running. But it is not something for the average Joe on a
general system. They wont know what all this information means anyway.

-- Steve


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [ANNOUNCE] New tools: lttngtrace and lttngreport
  2010-11-17 22:01       ` Ingo Molnar
@ 2010-11-18  2:38         ` Mathieu Desnoyers
  0 siblings, 0 replies; 10+ messages in thread
From: Mathieu Desnoyers @ 2010-11-18  2:38 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andi Kleen, Linus Torvalds, Thomas Gleixner, Arjan van de Ven,
	ltt-dev, linux-kernel, pierre-marc.fournier, Steven Rostedt

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> 
> > Well, I'm afraid the collection approach "trace" is currently taking won't allow 
> > this kind of dependency wakeup chain tracking, because they focus on tracing 
> > operations happening on a thread and its children, but the reality is that the 
> > wakeup chains often spread outside of this scope.
> >
> > This is why lttngtrace gathers a system-wide trace even though we're mostly 
> > intested in the wait/wakeups of a specific PID.
> 
> Mathieu, what you say is factually wrong - of course perf events allows
> system-wide tracing, it always did.

Ah, good to know. Reading through your announcement, I thought that the "trace"
tool was only following processes and their children (I know that perf allows
system-wide tracing). However, there seems to be some incompatibilities between
the will to make the "trace" tool usable from non-root users and the will to
gather a system-wide trace.

I'm actually struggling with the same issue with lttngtrace. So far, I've taken
the approach to only allow root to perform this kind of operation, which is much
more limiting than what strace currently does.

I was wondering if you had thought about this problem.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [ANNOUNCE] New tools: lttngtrace and lttngreport
  2010-11-17 21:44       ` Thomas Gleixner
  2010-11-17 22:33         ` Steven Rostedt
@ 2010-11-18 12:57         ` Mathieu Desnoyers
  1 sibling, 0 replies; 10+ messages in thread
From: Mathieu Desnoyers @ 2010-11-18 12:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Andi Kleen, Linus Torvalds, Ingo Molnar, Arjan van de Ven, Andre,
	ltt-dev, linux-kernel, pierre-marc.fournier

* Thomas Gleixner (tglx@linutronix.de) wrote:
> On Wed, 17 Nov 2010, Mathieu Desnoyers wrote:
> 
> > * Andi Kleen (andi@firstfloor.org) wrote:
> > > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:
> > > >
> > > >         --> Blocked in RUNNING, SYSCALL 142 [sys_select+0x0/0xc0], ([...], dur: 0.029567)
> > > >         |    --> Blocked in RUNNING, SYSCALL 168 [sys_poll+0x0/0xc0], ([...], dur: 1.187935)
> > > >         |    --- Woken up by an IRQ: IRQ 0 [timer]
> > > >         --- Woken up in context of 7401 [gnome-power-man] in high-level state RUNNING
> > > 
> > > Very nice! Now how can we get that with an unpatched kernel tree?
> > 
> > Well, I'm afraid the collection approach "trace" is currently taking won't allow
> > this kind of dependency wakeup chain tracking, because they focus on tracing
> > operations happening on a thread and its children, but the reality is that the
> > wakeup chains often spread outside of this scope.
> 
> You are completely missing the point. There is no need to point out
> that 'trace' does not do that. It is tracking a process (group)
> context (though it can do system wide with the right permissions as
> well).

As I pointed out to Ingo, running-as-non-root vs system-wide tracing can be
a bit of a puzzle. I've now realized that the "--all" option does gather
system-wide traces, that's good.

> And it's completely irrelevant for a user space programmer where the
> kernel spends its time. What's not irrelevant is the information what
> caused the kernel to spend time, i.e. which access to which data
> resulted in a page fault or IO and how long did it take to come back.

Following wakeup chains across processes (e.g. firefox to Xorg to whatnot) is
also relevant to user space programmers, but not provided without system-wide
tracing. And if I/O has been slowed down because of an interaction with
something completely unrelated (e.g. a concurrent process doing fsync), the user
might want to know it, no ?

> It's completely irrelevant for him whether the kernel ran in circles
> or not. If he sees a repeating pattern that a PF recovery or a
> read/write to some file takes ages, he'll poke the sysadmin or the
> kernel dude, which then will drill down into the gory details.
> 
> http://lwn.net/Articles/415760/
> 
>  "... Indeed I've been wishing for a tool which would easily tell me
>  what pages I'm faulting in (and in what order) out of a 5GB mmaped
>  file, in order to help debug performance issues with disk seeks when
>  the file is totally paged out."
> 
> That's what relevant to user space developers, not the gory details
> why the kernel took time X to retrieve that data from disk. Simply
> because you can improve performance when you have such information by
> rearranging your code and access patterns. The same applies for other
> things like cache misses, which can be easily integrated into 'trace'.

If the said user is facing throughput issues, yes, indeed. When the user is
facing latency issues, then I have to disagree with you. So what you are saying
above is true, but only for one class of users. Agreed, this might be a very
large group. So possibly that we just target different users here ?


> > This is why lttngtrace gathers a system-wide trace even though we're mostly
> > intested in the wait/wakeups of a specific PID.
> 
> Which results in a permission problem which you are completely
> ignoring. Not a surprise though - I'm used to the academic way of
> defining preliminaries and ignoring side effects just to get the paper
> written.

I'm well aware of this problem, as I pointed out in this response and my reply
to Ingo.

> > Wakeup dependency analysis depends on a few key events to track these chains.
> > It's all described in Pierre-Marc Fournier's master thesis and implemented as
> You seem to believe that kernel developers need to read a thesis to
> understand wakeup chains and what it takes to trace them?
> 
> Dammit, we do that on a daily base and we did it even before we had
> the ftrace/perf infrastructure in place without reading a thesis.
> 
> Stop this bullshit once and forever. You can do that in the lecture
> room of your university and in the seminar for big corporate engineers
> who are made to believe that this is important to improve their
> productivity.
> 
> Your "DrTracing knows it better attitude" starts to be really annoying.

I might have miscommunicated my intend, sorry about that. It's just that I
worked with Pierre-Marc on this, and we identified part of the instrumentation
needed to perform this analysis. I thought it would be appropriate to share it
with you guys, but my tone my have been slightly more assertive than it should.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2010-11-18 12:57 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-11-15 13:42 [ANNOUNCE] New tools: lttngtrace and lttngreport Mathieu Desnoyers
2010-11-15 16:07 ` Mathieu Desnoyers
2010-11-17 11:37   ` Andi Kleen
2010-11-17 11:47     ` [ltt-dev] " Benjamin Poirier
2010-11-17 17:31     ` Mathieu Desnoyers
2010-11-17 21:44       ` Thomas Gleixner
2010-11-17 22:33         ` Steven Rostedt
2010-11-18 12:57         ` Mathieu Desnoyers
2010-11-17 22:01       ` Ingo Molnar
2010-11-18  2:38         ` Mathieu Desnoyers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).