From: Stephane Eranian <eranian@google.com>
To: linux-kernel@vger.kernel.org
Cc: peterz@infradead.org, mingo@elte.hu, ak@linux.intel.com,
	zheng.z.yan@intel.com, robert.richter@amd.com
Subject: [PATCH v2 0/3] perf: use hrtimer for event multiplexing
Date: Wed, 12 Sep 2012 16:13:12 +0200
Message-ID: <1347459195-5491-1-git-send-email-eranian@google.com>

The current scheme of using the timer tick was fine
for per-thread events. However, it was causing
bias issues in system-wide mode (including for
uncore PMUs). Event groups would not get their
fair share of runtime on the PMU. With tickless
kernels, if a core is idle there is no timer tick,
and thus no event rotation (multiplexing). However,
there are events (especially uncore events) which do
count even though cores are asleep.

This patch changes the timer source for multiplexing.
It introduces a per-cpu hrtimer. The advantage is that
even when the core goes idle, it will wake up to
service the hrtimer, so multiplexing of system-wide
events works much better.
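
To illustrate the bias described above, here is a toy
user-space model, not code from the patch. It assumes two
event groups sharing one PMU and a rotation that happens
once per time slot whenever the timer source fires. The
function name and the busy/idle pattern are made up for
the illustration.

```python
def share_of_runtime(busy_pattern, rotate_when_idle):
    """Return the fraction of time slots each of two groups spent on the PMU.

    busy_pattern: iterable of booleans, one per time slot (True = CPU busy).
    rotate_when_idle: False models the timer tick (no rotation while idle),
                      True models a timer that fires regardless of idleness.
    """
    slots = [0, 0]
    current = 0  # index of the group currently scheduled on the PMU
    for busy in busy_pattern:
        slots[current] += 1
        if busy or rotate_when_idle:
            current = 1 - current  # rotate to the other group
    total = sum(slots)
    return [s / total for s in slots]


# Hypothetical workload: one busy slot followed by nine idle slots.
pattern = [True] + [False] * 9
print(share_of_runtime(pattern, rotate_when_idle=False))  # [0.1, 0.9]
print(share_of_runtime(pattern, rotate_when_idle=True))   # [0.5, 0.5]
```

In this model, the group that happens to be scheduled when
the CPU goes idle keeps the PMU until the next tick, which
is the kind of unfairness the hrtimer is meant to remove.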

In order to minimize the impact of the hrtimer, it
is turned on and off on demand. When the PMU on
a CPU is overcommitted, the hrtimer is activated.
It is stopped when the PMU is not overcommitted.
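
The on-demand behavior can be pictured with the following
sketch. It is a simplified model, not the patch's code: it
assumes "overcommitted" simply means more events than
hardware counters, whereas real scheduling constraints are
more involved. The class and attribute names are
hypothetical.

```python
class MuxTimerModel:
    """Toy model: the timer is armed only while the PMU is overcommitted."""

    def __init__(self, num_counters):
        self.num_counters = num_counters
        self.num_events = 0
        self.timer_active = False

    def _update_timer(self):
        # Assumption: overcommitted means more events than counters.
        self.timer_active = self.num_events > self.num_counters

    def add_event(self):
        self.num_events += 1
        self._update_timer()

    def remove_event(self):
        if self.num_events > 0:
            self.num_events -= 1
        self._update_timer()


pmu = MuxTimerModel(num_counters=2)
pmu.add_event()
pmu.add_event()
print(pmu.timer_active)  # False: both events fit on the counters
pmu.add_event()
print(pmu.timer_active)  # True: a third event forces multiplexing
```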

In order for this to work properly with HOTPLUG_CPU,
we had to change the order of initialization in
start_kernel() such that hrtimers_init() is run
before perf_event_init().

The third patch provides a sysfs control to
adjust the multiplexing interval. The unit is
milliseconds.

Here is a simple before/after example with
two event groups which do require multiplexing.
This is done in system-wide mode on an idle
system. What matters here is the scaling factor
in [], not the total counts.

Before:

# perf stat -a -e ref-cycles,ref-cycles sleep 10
 Performance counter stats for 'sleep 10':
 34,319,545 ref-cycles  [56.51%]
 31,917,229 ref-cycles  [43.50%]

 10.000827569 seconds time elapsed

After:

# perf stat -a -e ref-cycles,ref-cycles sleep 10
 Performance counter stats for 'sleep 10':
 11,144,822,193 ref-cycles [50.00%]
 11,103,760,513 ref-cycles [50.00%]

 10.000672946 seconds time elapsed
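
For readers less familiar with the bracketed figure: perf
stat reports, for each event, the share of its enabled
time during which it was actually counting, and scales the
raw count by enabled/running. A minimal sketch of that
arithmetic follows. The function names and the sample
numbers are illustrative only and are not taken from the
runs above.

```python
def running_pct(time_enabled_ns, time_running_ns):
    """Percentage shown in []: share of enabled time spent counting."""
    if time_enabled_ns == 0:
        return 0.0
    return 100.0 * time_running_ns / time_enabled_ns


def scaled_count(raw_count, time_enabled_ns, time_running_ns):
    """Extrapolate a multiplexed count to the full enabled window."""
    if time_running_ns == 0:
        return 0  # event was never scheduled; nothing to extrapolate from
    return raw_count * time_enabled_ns // time_running_ns


# Hypothetical sample: an event on the PMU for half of a 10 s window.
enabled_ns = 10_000_000_000
running_ns = 5_000_000_000
print(running_pct(enabled_ns, running_ns))         # 50.0
print(scaled_count(1_000, enabled_ns, running_ns)) # 2000
```

With fair rotation between two groups, each one should
report close to 50% here, as in the "After" run.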

In this second version of the patchset, we now
have the hrtimer_interval per PMU instance. The
tunable is in /sys/devices/XXX/mux_interval_ms,
where XXX is the name of the PMU instance. Due
to initialization changes of each hrtimer, we
had to introduce hrtimer_init_cpu() to initialize
a hrtimer from another CPU.
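
As a usage sketch, the tunable could be read and written
from user space as shown below. This assumes the file
holds a plain decimal number of milliseconds and that
values below 1 ms are not meaningful; both are assumptions,
not statements about the patch. The helper names and the
PMU name "cpu" in the usage note are hypothetical.

```python
from pathlib import Path

SYSFS_DEVICES = Path("/sys/devices")  # base directory named in this cover letter


def mux_interval_path(pmu_name, base=SYSFS_DEVICES):
    """Build the path to the per-PMU multiplexing interval file."""
    return Path(base) / pmu_name / "mux_interval_ms"


def read_mux_interval_ms(pmu_name, base=SYSFS_DEVICES):
    """Read the interval; assumes a plain decimal value in milliseconds."""
    return int(mux_interval_path(pmu_name, base).read_text().strip())


def write_mux_interval_ms(pmu_name, interval_ms, base=SYSFS_DEVICES):
    """Write a new interval; the lower bound is an assumption of this sketch."""
    if interval_ms < 1:
        raise ValueError("interval must be at least 1 ms")
    mux_interval_path(pmu_name, base).write_text(f"{interval_ms}\n")
```

For instance, write_mux_interval_ms("cpu", 4) would request
a 4 ms interval for a PMU instance named "cpu", given
sufficient privileges and a kernel carrying this series.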

Signed-off-by: Stephane Eranian <eranian@google.com>
---

Stephane Eranian (3):
  hrtimer: add hrtimer_init_cpu()
  perf: use hrtimer for event multiplexing
  perf: add sysfs entry to adjust multiplexing interval per PMU

 include/linux/hrtimer.h    |    2 +
 include/linux/perf_event.h |    5 +-
 init/main.c                |    2 +-
 kernel/events/core.c       |  166 +++++++++++++++++++++++++++++++++++++++++---
 kernel/hrtimer.c           |   17 +++--
 5 files changed, 176 insertions(+), 16 deletions(-)

-- 
1.7.5.4

