* [PATCH RFC -tip 0/6] perf: IRQ-bound performance events
@ 2013-06-03  9:41 Alexander Gordeev
  2013-06-03 12:22 ` Ingo Molnar
  2013-06-03 13:36 ` Alexander Gordeev
  0 siblings, 2 replies; 7+ messages in thread
From: Alexander Gordeev @ 2013-06-03  9:41 UTC
  To: linux-kernel
  Cc: x86, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Jiri Olsa, Frederic Weisbecker

This patchset is against perf/core branch.

While Linux is able to measure task-bound and CPU-bound performance
events, there is no convenient means to monitor the performance of
the execution context that probably requires control and tuning the
most: interrupt service routines.

This series is an attempt to introduce IRQ-bound performance
events - ones that only count in the context of a hardware interrupt
handler.

The implementation is pretty straightforward: an IRQ-bound event
is registered with the IRQ descriptor and gets enabled/disabled
using new PMU callbacks: pmu_enable_irq() and pmu_disable_irq().
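
A minimal user-space sketch of the mechanism described above. It assumes
nothing about the actual patches beyond the two callback names: the types
irq_event and irq_desc_model and the helpers model_pmu_enable_irq(),
model_pmu_disable_irq(), hw_event_tick() and handle_irq_model() are
hypothetical stand-ins, not kernel code. It only illustrates the idea that
an event attached to an IRQ descriptor counts while that IRQ's handler runs
and stays idle otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a perf event bound to one IRQ. */
struct irq_event {
	bool counting;           /* true only while the handler runs */
	unsigned long count;     /* hardware events seen while counting */
	struct irq_event *next;  /* next event on the same descriptor */
};

/* Hypothetical stand-in for an IRQ descriptor that carries its events. */
struct irq_desc_model {
	int irq;
	struct irq_event *events;
};

/* Register an event with the descriptor by pushing it on the list. */
void irq_desc_add_event(struct irq_desc_model *desc, struct irq_event *ev)
{
	assert(desc != NULL && ev != NULL);
	ev->counting = false;
	ev->count = 0;
	ev->next = desc->events;
	desc->events = ev;
}

/* Models pmu_enable_irq(): start all events bound to this IRQ. */
void model_pmu_enable_irq(struct irq_desc_model *desc)
{
	for (struct irq_event *ev = desc->events; ev != NULL; ev = ev->next)
		ev->counting = true;
}

/* Models pmu_disable_irq(): stop all events bound to this IRQ. */
void model_pmu_disable_irq(struct irq_desc_model *desc)
{
	for (struct irq_event *ev = desc->events; ev != NULL; ev = ev->next)
		ev->counting = false;
}

/* Models one hardware event (e.g. a cache miss) occurring on the CPU. */
void hw_event_tick(struct irq_desc_model *desc)
{
	for (struct irq_event *ev = desc->events; ev != NULL; ev = ev->next)
		if (ev->counting)
			ev->count++;
}

typedef void (*irq_handler_model_t)(struct irq_desc_model *desc);

/* Models the IRQ entry path: counting is bracketed around the handler. */
void handle_irq_model(struct irq_desc_model *desc, irq_handler_model_t handler)
{
	model_pmu_enable_irq(desc);
	handler(desc);
	model_pmu_disable_irq(desc);
}

/* Example handler that causes three hardware events. */
void sample_handler(struct irq_desc_model *desc)
{
	for (int i = 0; i < 3; i++)
		hw_event_tick(desc);
}
```

In this model, a call to hw_event_tick() outside handle_irq_model() leaves
the count untouched, while the three ticks raised by sample_handler() are
all counted.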

The series has not been tested thoroughly and is a proof of concept
rather than a decent implementation: no group events can be loaded,
inappropriate (i.e. software) events are not rejected, only Intel
and AMD PMUs were tried for 'perf stat', and only the Intel PMU
works with precise events. The perf tool changes are just a hack.

Still, I would first like to make sure that the approach taken is
not screwed up and that I did not miss anything vital, not to
mention whether the change is wanted at all.

Below is a sample session on a machine with x2apic in cluster mode.
The IRQ number is passed using the new argument -I <irq> (please
never mind '...process id '8'...' in the output for now):

# cat /proc/interrupts | grep ' 8:'
   8:         23          0          0          0         21          0          0          0         23          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0         27          0          0          0         23          0          0          0         17          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-IO-APIC-edge      rtc0
# ./tools/perf/perf stat -a -e L1-dcache-load-misses:k sleep 1
 Performance counter stats for 'sleep 1':

           124,849 L1-dcache-load-misses                                       

       1.001359403 seconds time elapsed

# ./tools/perf/perf stat -I 8 -a -e L1-dcache-load-misses:k sleep 1

 Performance counter stats for process id '8':

                 0 L1-dcache-load-misses                                       

       1.001235781 seconds time elapsed

# ./tools/perf/perf stat -I 8 -a -e L1-dcache-load-misses:k hwclock --test
Mon 03 Jun 2013 04:42:59 AM EDT  -0.891274 seconds

 Performance counter stats for process id '8':


Alexander Gordeev (6):
  perf/core: IRQ-bound performance events
  perf/x86: IRQ-bound performance events
  perf/x86/AMD PMU: IRQ-bound performance events
  perf/x86/Core PMU: IRQ-bound performance events
  perf/x86/Intel PMU: IRQ-bound performance events
  perf/tool: Hack 'pid' as 'irq' for sys_perf_event_open()

 arch/x86/kernel/cpu/perf_event.c          |   71 ++++++++++++++++++---
 arch/x86/kernel/cpu/perf_event.h          |   19 ++++++
 arch/x86/kernel/cpu/perf_event_amd.c      |    2 +
 arch/x86/kernel/cpu/perf_event_intel.c    |   93 +++++++++++++++++++++++++--
 arch/x86/kernel/cpu/perf_event_intel_ds.c |    5 +-
 arch/x86/kernel/cpu/perf_event_knc.c      |    2 +
 arch/x86/kernel/cpu/perf_event_p4.c       |    2 +
 arch/x86/kernel/cpu/perf_event_p6.c       |    2 +
 include/linux/irq.h                       |    8 ++
 include/linux/irqdesc.h                   |    3 +
 include/linux/perf_event.h                |   16 +++++
 include/uapi/linux/perf_event.h           |    1 +
 kernel/events/core.c                      |   69 +++++++++++++++----
 kernel/irq/Makefile                       |    1 +
 kernel/irq/handle.c                       |    4 +
 kernel/irq/irqdesc.c                      |   14 ++++
 kernel/irq/perf_event.c                   |  100 +++++++++++++++++++++++++++++
 tools/perf/builtin-record.c               |    8 ++
 tools/perf/builtin-stat.c                 |    8 ++
 tools/perf/util/evlist.c                  |    4 +-
 tools/perf/util/evsel.c                   |    3 +
 tools/perf/util/evsel.h                   |    1 +
 tools/perf/util/target.c                  |    4 +
 tools/perf/util/thread_map.c              |   16 +++++
 24 files changed, 422 insertions(+), 34 deletions(-)
 create mode 100644 kernel/irq/perf_event.c

-- 
1.7.7.6


-- 
Regards,
Alexander Gordeev
agordeev@redhat.com


* Re: [PATCH RFC -tip 0/6] perf: IRQ-bound performance events
  2013-06-03  9:41 [PATCH RFC -tip 0/6] perf: IRQ-bound performance events Alexander Gordeev
@ 2013-06-03 12:22 ` Ingo Molnar
  2013-06-03 19:41   ` Frederic Weisbecker
  2013-06-03 13:36 ` Alexander Gordeev
  1 sibling, 1 reply; 7+ messages in thread
From: Ingo Molnar @ 2013-06-03 12:22 UTC
  To: Alexander Gordeev
  Cc: linux-kernel, x86, Thomas Gleixner, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Jiri Olsa, Frederic Weisbecker


* Alexander Gordeev <agordeev@redhat.com> wrote:

> This patchset is against perf/core branch.
> 
> While Linux is able to measure task-bound and CPU-bound performance
> events, there is no convenient means to monitor the performance of
> the execution context that probably requires control and tuning the
> most: interrupt service routines.
> 
> This series is an attempt to introduce IRQ-bound performance
> events - ones that only count in the context of a hardware interrupt
> handler.
> 
> The implementation is pretty straightforward: an IRQ-bound event
> is registered with the IRQ descriptor and gets enabled/disabled
> using new PMU callbacks: pmu_enable_irq() and pmu_disable_irq().
> 
> The series has not been tested thoroughly and is a proof of concept
> rather than a decent implementation: no group events can be loaded,
> inappropriate (i.e. software) events are not rejected, only Intel
> and AMD PMUs were tried for 'perf stat', and only the Intel PMU
> works with precise events. The perf tool changes are just a hack.
> 
> Still, I would first like to make sure that the approach taken is
> not screwed up and that I did not miss anything vital, not to
> mention whether the change is wanted at all.
> 
> Below is a sample session on a machine with x2apic in cluster mode.
> The IRQ number is passed using the new argument -I <irq> (please
> never mind '...process id '8'...' in the output for now):

Looks useful.

I think the main challenges are:

 - Creating a proper ABI for all this:

   - IRQ numbers alone are probably not specific enough: we'd also want to 
     be more specific to match on handler names - or handler numbers if
     the handler name is not unique.

   - another useful variant would be where IRQ numbers are too specific:
     something like 'perf top irq' would be a natural thing to do, to see 
     only overhead in hardirq execution - without limiting it to a
     specific handler. An 'all irq contexts' wildcard concept?

 - Covering softirqs as well. If we handle both hardirqs and softirqs,
   then we are pretty much feature complete: all major context types that 
   the Linux kernel cares about are covered in instrumentation. For things
   like networking the softirq overhead is obviously very important, and 
   for example on routers it will do most of the execution.

 - Covering threaded IRQs as well, in a similar model. So if someone types
   'perf top irq', and some IRQ handlers are running threaded, those
   should probably be included as well.

 - Making the tooling friendlier: 'perf top irq' would be useful, and
   accepting handler names would be useful as well.
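
The matching variants listed above (a specific IRQ number, a handler name
on a shared IRQ, and an 'all irq contexts' wildcard) could be modelled
roughly as follows. This is only an illustrative sketch, not a proposed
ABI: irq_match_kind, irq_target and irq_target_matches() are hypothetical
names, and requiring both the IRQ number and the handler name for a
handler match is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ways of selecting which IRQ contexts an event covers. */
enum irq_match_kind {
	IRQ_MATCH_ALL,      /* wildcard: any hardirq context */
	IRQ_MATCH_NUMBER,   /* every handler on one IRQ number */
	IRQ_MATCH_HANDLER,  /* one named handler on one IRQ number */
};

/* Hypothetical target description; not an existing kernel structure. */
struct irq_target {
	enum irq_match_kind kind;
	int irq;              /* used by IRQ_MATCH_NUMBER and IRQ_MATCH_HANDLER */
	const char *handler;  /* used by IRQ_MATCH_HANDLER only */
};

/* Return true if the running handler is covered by the target. */
bool irq_target_matches(const struct irq_target *t, int irq,
			const char *handler)
{
	assert(t != NULL);

	switch (t->kind) {
	case IRQ_MATCH_ALL:
		return true;
	case IRQ_MATCH_NUMBER:
		return t->irq == irq;
	case IRQ_MATCH_HANDLER:
		return t->irq == irq && t->handler != NULL &&
		       handler != NULL && strcmp(t->handler, handler) == 0;
	}
	return false;
}
```

With the rtc0 handler on IRQ 8 from the sample session, a target of kind
IRQ_MATCH_HANDLER with irq 8 and handler "rtc0" would match, while any
other handler sharing IRQ 8 would not.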

The runtime overhead of your patches seems to be pretty low: when no IRQ 
contexts are instrumented then it's a single 'is the list empty' check at 
context scheduling time. That looks acceptable.
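
The 'is the list empty' fast path can be pictured with a small user-space
model. The names cpu_ctx_model, irq_event_node and ctx_sched_model() are
hypothetical and are not taken from the patches; the sketch only shows that
the uninstrumented case costs one pointer comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list node for an IRQ-bound event. */
struct irq_event_node {
	struct irq_event_node *next;
};

/* Hypothetical per-CPU context holding the IRQ-bound events. */
struct cpu_ctx_model {
	struct irq_event_node *irq_events;  /* NULL when nothing is instrumented */
	unsigned long slow_path_runs;       /* how often the expensive path ran */
};

/* The single check paid when no IRQ context is instrumented. */
bool irq_events_empty(const struct cpu_ctx_model *ctx)
{
	return ctx->irq_events == NULL;
}

/* Models the hook that runs at context scheduling time. */
void ctx_sched_model(struct cpu_ctx_model *ctx)
{
	assert(ctx != NULL);

	if (irq_events_empty(ctx))
		return;  /* fast path: nothing else to do */

	/* Slow path: stands in for reprogramming the IRQ-bound events. */
	ctx->slow_path_runs++;
}
```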

Regarding the ABI and IRQ/softirq context enumeration you are breaking 
lots of new ground here, because unlike tasks, cgroups and CPUs the IRQ 
execution contexts do not have a good programmatically accessible 
namespace (yet). So it has to be thought out pretty well I think, but once 
we have it, it will be a lovely feature IMO.

Thanks,

	Ingo


* Re: [PATCH RFC -tip 0/6] perf: IRQ-bound performance events
  2013-06-03  9:41 [PATCH RFC -tip 0/6] perf: IRQ-bound performance events Alexander Gordeev
  2013-06-03 12:22 ` Ingo Molnar
@ 2013-06-03 13:36 ` Alexander Gordeev
  2013-06-04  9:38   ` Peter Zijlstra
  1 sibling, 1 reply; 7+ messages in thread
From: Alexander Gordeev @ 2013-06-03 13:36 UTC
  To: linux-kernel
  Cc: x86, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Jiri Olsa, Frederic Weisbecker

On Mon, Jun 03, 2013 at 11:41:32AM +0200, Alexander Gordeev wrote:
> # ./tools/perf/perf stat -I 8 -a -e L1-dcache-load-misses:k sleep 1
> 
>  Performance counter stats for process id '8':
> 
>                  0 L1-dcache-load-misses                                       
> 
>        1.001235781 seconds time elapsed
> 
> # ./tools/perf/perf stat -I 8 -a -e L1-dcache-load-misses:k hwclock --test
> Mon 03 Jun 2013 04:42:59 AM EDT  -0.891274 seconds
> 
>  Performance counter stats for process id '8':

Oops, the most interesting part did not make it in. Very sorry :) Here:

# ./tools/perf/perf stat -I 8 -a -e L1-dcache-load-misses:k hwclock --test
Mon 03 Jun 2013 09:32:49 AM EDT  -0.719514 seconds

 Performance counter stats for process id '8':

               447 L1-dcache-load-misses                                       

       0.720874208 seconds time elapsed


-- 
Regards,
Alexander Gordeev
agordeev@redhat.com


* Re: [PATCH RFC -tip 0/6] perf: IRQ-bound performance events
  2013-06-03 12:22 ` Ingo Molnar
@ 2013-06-03 19:41   ` Frederic Weisbecker
  2013-06-04  8:51     ` Jiri Olsa
  0 siblings, 1 reply; 7+ messages in thread
From: Frederic Weisbecker @ 2013-06-03 19:41 UTC
  To: Ingo Molnar
  Cc: Alexander Gordeev, linux-kernel, x86, Thomas Gleixner,
	Peter Zijlstra, Arnaldo Carvalho de Melo, Jiri Olsa

On Mon, Jun 03, 2013 at 02:22:23PM +0200, Ingo Molnar wrote:
> 
> * Alexander Gordeev <agordeev@redhat.com> wrote:
> 
> > This patchset is against perf/core branch.
> > 
> > While Linux is able to measure task-bound and CPU-bound performance
> > events, there is no convenient means to monitor the performance of
> > the execution context that probably requires control and tuning the
> > most: interrupt service routines.
> > 
> > This series is an attempt to introduce IRQ-bound performance
> > events - ones that only count in the context of a hardware interrupt
> > handler.
> > 
> > The implementation is pretty straightforward: an IRQ-bound event
> > is registered with the IRQ descriptor and gets enabled/disabled
> > using new PMU callbacks: pmu_enable_irq() and pmu_disable_irq().
> > 
> > The series has not been tested thoroughly and is a proof of concept
> > rather than a decent implementation: no group events can be loaded,
> > inappropriate (i.e. software) events are not rejected, only Intel
> > and AMD PMUs were tried for 'perf stat', and only the Intel PMU
> > works with precise events. The perf tool changes are just a hack.
> > 
> > Still, I would first like to make sure that the approach taken is
> > not screwed up and that I did not miss anything vital, not to
> > mention whether the change is wanted at all.
> > 
> > Below is a sample session on a machine with x2apic in cluster mode.
> > The IRQ number is passed using the new argument -I <irq> (please
> > never mind '...process id '8'...' in the output for now):
> 
> Looks useful.
> 
> I think the main challenges are:
> 
>  - Creating a proper ABI for all this:
> 
>    - IRQ numbers alone are probably not specific enough: we'd also want to 
>      be more specific to match on handler names - or handler numbers if
>      the handler name is not unique.
> 
>    - another useful variant would be where IRQ numbers are too specific:
>      something like 'perf top irq' would be a natural thing to do, to see 
>      only overhead in hardirq execution - without limiting it to a
>      specific handler. An 'all irq contexts' wildcard concept?
> 
>  - Covering softirqs as well. If we handle both hardirqs and softirqs,
>    then we are pretty much feature complete: all major context types that 
>    the Linux kernel cares about are covered in instrumentation. For things
>    like networking the softirq overhead is obviously very important, and 
>    for example on routers it will do most of the execution.
> 
>  - Covering threaded IRQs as well, in a similar model. So if someone types
>    'perf top irq', and some IRQ handlers are running threaded, those
>    should probably be included as well.
> 
>  - Making the tooling friendlier: 'perf top irq' would be useful, and
>    accepting handler names would be useful as well.

How about we define fine-grained context on top of perf events themselves?
For example, we could tell perf to count a task's instructions only after
tracepoint:irq_entry is hit and to stop counting when tracepoint:irq_exit
is hit.

This way we can define any kind of fine-grained context, not just IRQs. We
are not short on tracepoints, software events, breakpoints, kprobes and
uprobes to play Legos there.
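
To make the starter/stopper idea concrete, here is a rough user-space
model. It is not code from the branch mentioned below: model_event,
gated_counter and the two helper functions are hypothetical names, and the
event stream is simply fed in by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event kinds standing in for tracepoints and HW events. */
enum model_event {
	MODEL_IRQ_ENTRY,
	MODEL_IRQ_EXIT,
	MODEL_INSTRUCTION,
};

/* A counter that only counts between a starter and a stopper event. */
struct gated_counter {
	enum model_event starter;  /* event that enables counting */
	enum model_event stopper;  /* event that disables counting */
	enum model_event target;   /* event being counted */
	bool enabled;
	unsigned long count;
};

/* Set up a counter that starts out disabled. */
void gated_counter_init(struct gated_counter *gc, enum model_event starter,
			enum model_event stopper, enum model_event target)
{
	assert(gc != NULL);
	gc->starter = starter;
	gc->stopper = stopper;
	gc->target = target;
	gc->enabled = false;
	gc->count = 0;
}

/* Feed one event from the stream into the counter. */
void gated_counter_feed(struct gated_counter *gc, enum model_event ev)
{
	assert(gc != NULL);

	if (ev == gc->starter)
		gc->enabled = true;
	else if (ev == gc->stopper)
		gc->enabled = false;
	else if (ev == gc->target && gc->enabled)
		gc->count++;
}
```

Feeding the stream instruction, irq entry, instruction, instruction,
irq exit, instruction into a counter set up with irq entry as starter and
irq exit as stopper counts only the two instructions in between.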

I had a branch with a working draft of that:

  git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
  	perf/custom-ctx-v2-pre

Frederic Weisbecker (5):
      perf: Starter and stopper events
      perf: New enable_on_starter attribute
      perf: Support for starter and stopper in tools
      perf: New --enable-on-starter option
      perf: Add TODOs for event defined context

It needs quite a few improvements (some are listed in the TODO in the last
commit), especially in both the kernel and user interfaces.

Jiri had some nice ideas about it.

Also, Peter was unhappy about how starter/stopper events were inherited by
child events, because it complicated the inheritance code. I totally agree
with that, although I couldn't think of a better way at the time. Then I got
distracted by other things, so this was the last iteration.

But it can be an interesting starting point. I'm convinced this can be a great feature!
Think about all the user contexts we can define with uprobes for example.

Thanks.


* Re: [PATCH RFC -tip 0/6] perf: IRQ-bound performance events
  2013-06-03 19:41   ` Frederic Weisbecker
@ 2013-06-04  8:51     ` Jiri Olsa
  0 siblings, 0 replies; 7+ messages in thread
From: Jiri Olsa @ 2013-06-04  8:51 UTC
  To: Frederic Weisbecker
  Cc: Ingo Molnar, Alexander Gordeev, linux-kernel, x86,
	Thomas Gleixner, Peter Zijlstra, Arnaldo Carvalho de Melo

On Mon, Jun 03, 2013 at 09:41:21PM +0200, Frederic Weisbecker wrote:
> On Mon, Jun 03, 2013 at 02:22:23PM +0200, Ingo Molnar wrote:
> > 
> > * Alexander Gordeev <agordeev@redhat.com> wrote:

SNIP

> How about we define fine-grained context on top of perf events themselves?
> For example, we could tell perf to count a task's instructions only after
> tracepoint:irq_entry is hit and to stop counting when tracepoint:irq_exit
> is hit.
> 
> This way we can define any kind of fine-grained context, not just IRQs. We
> are not short on tracepoints, software events, breakpoints, kprobes and
> uprobes to play Legos there.

agreed, we could do the same as Alex did, plus we'd have
a generic interface to measure any place

> 
> I had a branch with a working draft of that:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
>   	perf/custom-ctx-v2-pre
> 
> Frederic Weisbecker (5):
>       perf: Starter and stopper events
>       perf: New enable_on_starter attribute
>       perf: Support for starter and stopper in tools
>       perf: New --enable-on-starter option
>       perf: Add TODOs for event defined context
> 
> It needs quite a few improvements (some are listed in the TODO in the last
> commit), especially in both the kernel and user interfaces.
> 
> Jiri had some nice ideas about it.

yep, one of them is to get back to this soon ;-)

jirka


* Re: [PATCH RFC -tip 0/6] perf: IRQ-bound performance events
  2013-06-03 13:36 ` Alexander Gordeev
@ 2013-06-04  9:38   ` Peter Zijlstra
  2013-06-04 10:14     ` Alexander Gordeev
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2013-06-04  9:38 UTC
  To: Alexander Gordeev
  Cc: linux-kernel, x86, Thomas Gleixner, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa, Frederic Weisbecker

On Mon, Jun 03, 2013 at 03:36:19PM +0200, Alexander Gordeev wrote:
> On Mon, Jun 03, 2013 at 11:41:32AM +0200, Alexander Gordeev wrote:
> > # ./tools/perf/perf stat -I 8 -a -e L1-dcache-load-misses:k sleep 1
> > 
> >  Performance counter stats for process id '8':
> > 
> >                  0 L1-dcache-load-misses                                       
> > 
> >        1.001235781 seconds time elapsed
> > 
> > # ./tools/perf/perf stat -I 8 -a -e L1-dcache-load-misses:k hwclock --test
> > Mon 03 Jun 2013 04:42:59 AM EDT  -0.891274 seconds
> > 
> >  Performance counter stats for process id '8':
> 
> Oops, the most interesting part did not make it in. Very sorry :) Here:
> 
> # ./tools/perf/perf stat -I 8 -a -e L1-dcache-load-misses:k hwclock --test
> Mon 03 Jun 2013 09:32:49 AM EDT  -0.719514 seconds
> 
>  Performance counter stats for process id '8':
> 
>                447 L1-dcache-load-misses                                       

I think that is very much expected; except in the case where you spend
_all_ your time in IRQ handlers, they'll pretty much always miss the L1
cache.


* Re: [PATCH RFC -tip 0/6] perf: IRQ-bound performance events
  2013-06-04  9:38   ` Peter Zijlstra
@ 2013-06-04 10:14     ` Alexander Gordeev
  0 siblings, 0 replies; 7+ messages in thread
From: Alexander Gordeev @ 2013-06-04 10:14 UTC
  To: Peter Zijlstra
  Cc: linux-kernel, x86, Thomas Gleixner, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa, Frederic Weisbecker

On Tue, Jun 04, 2013 at 11:38:02AM +0200, Peter Zijlstra wrote:
> > Oops, the most interesting part did not make it in. Very sorry :) Here:
> > 
> > # ./tools/perf/perf stat -I 8 -a -e L1-dcache-load-misses:k hwclock --test
> > Mon 03 Jun 2013 09:32:49 AM EDT  -0.719514 seconds
> > 
> >  Performance counter stats for process id '8':
> > 
> >                447 L1-dcache-load-misses                                       
> 
> I think that is very much expected; except in the case where you spend
> _all_ your time in IRQ handlers, they'll pretty much always miss the L1
> cache.

The emphasis was on the fact that it could indeed be measured for a particular ISR ;)

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com


Thread overview: 7+ messages
2013-06-03  9:41 [PATCH RFC -tip 0/6] perf: IRQ-bound performance events Alexander Gordeev
2013-06-03 12:22 ` Ingo Molnar
2013-06-03 19:41   ` Frederic Weisbecker
2013-06-04  8:51     ` Jiri Olsa
2013-06-03 13:36 ` Alexander Gordeev
2013-06-04  9:38   ` Peter Zijlstra
2013-06-04 10:14     ` Alexander Gordeev
