linux-kernel.vger.kernel.org archive mirror
From: Arnaldo Carvalho de Melo <acme@infradead.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org, Andi Kleen <ak@linux.intel.com>,
	Jiri Olsa <jolsa@redhat.com>,
	Stephane Eranian <eranian@google.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: [PATCH 22/47] perf stat: Add support for --initial-delay option
Date: Wed,  7 Aug 2013 18:10:49 -0300
Message-ID: <1375909874-22073-23-git-send-email-acme@infradead.org>
In-Reply-To: <1375909874-22073-1-git-send-email-acme@infradead.org>

From: Andi Kleen <ak@linux.intel.com>

When measuring workloads the startup phase -- doing page faults, dynamic
linking, opening files -- is often very different from the rest of the
workload.  Especially with smaller kernels and counter multiplexing,
this can cause significant measurement errors.

Multiplexing assumes that the workload is mostly the same over longer
periods. But at startup there is typically some spike of activity which
is relatively short.  If many groups are multiplexed, the one group that
sees the spike has its count scaled up over the time needed to run all
groups, and so may show a significant error.
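
To illustrate the scaling step: a multiplexed count is typically
extrapolated as raw * time_enabled / time_running.  A minimal sketch,
where scale_count() and the numbers in the note below are hypothetical
and not taken from perf itself:

```c
#include <stdint.h>

/*
 * Extrapolate a multiplexed counter to the full enabled period:
 *   estimate = raw * time_enabled / time_running
 * Returns 0 if the counter was never scheduled (time_running == 0).
 */
double scale_count(uint64_t raw, uint64_t time_enabled, uint64_t time_running)
{
	if (time_running == 0)
		return 0.0;
	return (double)raw * (double)time_enabled / (double)time_running;
}
```

For example, with four groups each scheduled a quarter of the time,
scale_count(3000, 1000, 250) gives 12000.  If 2000 of those 3000 raw
events came from a short startup spike and the steady rate is 1000 events
per slice, the true total would be about 6000, so the estimate is off by
roughly a factor of two.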

Also in general it's often not useful to measure the startup, because it
is so different from the rest.

One way around this is to use interval mode and discard the first
sample, but this can be awkward because interval mode doesn't support
intervals of less than 100ms, and also a useful interval is not
necessarily the same as a useful startup delay.

This patch adds a new --initial-delay / -D option to skip measuring for
the startup phase. The time is specified in milliseconds.

Here's a simple example:

perf stat -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
...
             3,721 page-faults
...

If we just wait 20 ms, the number of page faults drops by about a quarter:

perf stat -D 20 -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
...
             2,823 page-faults
...

So we filtered out most of the startup noise from bash.
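
As a quick check on the two runs above, the drop from 3,721 to 2,823 page
faults works out to roughly 24%.  A small sketch of that arithmetic
(relative_reduction() is a hypothetical helper, not part of perf):

```c
/*
 * Fraction by which a count dropped, e.g. 0.25 for a 25% reduction.
 * Returns 0 when there is no baseline to compare against.
 */
double relative_reduction(unsigned long before, unsigned long after)
{
	if (before == 0)
		return 0.0;
	return ((double)before - (double)after) / (double)before;
}
```

relative_reduction(3721, 2823) evaluates to (3721 - 2823) / 3721, about
0.241.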

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1375490473-1503-4-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-stat.txt |  5 +++++
 tools/perf/builtin-stat.c              | 22 +++++++++++++++++++++-
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 2fe87fb..73c9759 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -132,6 +132,11 @@ is a useful mode to detect imbalance between physical cores.  To enable this mod
 use --per-core in addition to -a. (system-wide).  The output includes the
 core number and the number of online logical processors on that physical processor.
 
+-D msecs::
+--initial-delay msecs::
+After starting the program, wait msecs before measuring. This is useful to
+filter out the startup phase of the program, which is often very different.
+
 EXAMPLES
 --------
 
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 352fbd7..2e637e4 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -100,6 +100,7 @@ static const char		*pre_cmd			= NULL;
 static const char		*post_cmd			= NULL;
 static bool			sync_run			= false;
 static unsigned int		interval			= 0;
+static unsigned int		initial_delay			= 0;
 static bool			forever				= false;
 static struct timespec		ref_time;
 static struct cpu_map		*aggr_map;
@@ -254,7 +255,8 @@ static int create_perf_stat_counter(struct perf_evsel *evsel)
 	if (!perf_target__has_task(&target) &&
 	    perf_evsel__is_group_leader(evsel)) {
 		attr->disabled = 1;
-		attr->enable_on_exec = 1;
+		if (!initial_delay)
+			attr->enable_on_exec = 1;
 	}
 
 	return perf_evsel__open_per_thread(evsel, evsel_list->threads);
@@ -416,6 +418,20 @@ static void print_interval(void)
 	}
 }
 
+static void handle_initial_delay(void)
+{
+	struct perf_evsel *counter;
+
+	if (initial_delay) {
+		const int ncpus = cpu_map__nr(evsel_list->cpus),
+			nthreads = thread_map__nr(evsel_list->threads);
+
+		usleep(initial_delay * 1000);
+		list_for_each_entry(counter, &evsel_list->entries, node)
+			perf_evsel__enable(counter, ncpus, nthreads);
+	}
+}
+
 static int __run_perf_stat(int argc, const char **argv)
 {
 	char msg[512];
@@ -486,6 +502,7 @@ static int __run_perf_stat(int argc, const char **argv)
 
 	if (forks) {
 		perf_evlist__start_workload(evsel_list);
+		handle_initial_delay();
 
 		if (interval) {
 			while (!waitpid(child_pid, &status, WNOHANG)) {
@@ -497,6 +514,7 @@ static int __run_perf_stat(int argc, const char **argv)
 		if (WIFSIGNALED(status))
 			psignal(WTERMSIG(status), argv[0]);
 	} else {
+		handle_initial_delay();
 		while (!done) {
 			nanosleep(&ts, NULL);
 			if (interval)
@@ -1419,6 +1437,8 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
 		     "aggregate counts per processor socket", AGGR_SOCKET),
 	OPT_SET_UINT(0, "per-core", &aggr_mode,
 		     "aggregate counts per physical processor core", AGGR_CORE),
+	OPT_UINTEGER('D', "delay", &initial_delay,
+		     "ms to wait before starting measurement after program start"),
 	OPT_END()
 	};
 	const char * const stat_usage[] = {
-- 
1.8.1.4
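
For readers outside the perf tree, the delayed-enable idea in this patch
can be sketched with the raw perf_event_open(2) interface.  This is a
minimal sketch under assumptions, not the code perf uses:
init_delayed_attr(), open_counter() and enable_after_delay() are
hypothetical helper names, the event is hard-wired to software page
faults, and nanosleep() is used for the wait where the patch uses
usleep().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for syscall() under strict -std modes */
#endif
#include <linux/perf_event.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Set up a page-fault counter that starts disabled.  With no delay the
 * kernel enables it at exec time; with a delay it stays off until we
 * enable it explicitly (mirrors the enable_on_exec change in the patch).
 */
void init_delayed_attr(struct perf_event_attr *attr, unsigned int delay_ms)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_PAGE_FAULTS;
	attr->disabled = 1;
	attr->enable_on_exec = delay_ms ? 0 : 1;
}

/* Open the counter on one task, any CPU, no group. Returns fd or -1. */
int open_counter(struct perf_event_attr *attr, pid_t pid)
{
	return (int)syscall(SYS_perf_event_open, attr, pid, -1, -1, 0UL);
}

/* Sleep for delay_ms, then switch the counter on. Returns ioctl() result. */
int enable_after_delay(int fd, unsigned int delay_ms)
{
	struct timespec ts = {
		.tv_sec = delay_ms / 1000,
		.tv_nsec = (long)(delay_ms % 1000) * 1000000L,
	};

	nanosleep(&ts, NULL);
	return ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
}
```

A caller would fill the attr with init_delayed_attr(&attr, 20), open it
on the child's pid with open_counter(), start the workload, and then call
enable_after_delay(fd, 20).  Opening the counter may fail depending on
the system's perf_event_paranoid setting, so the return values need
checking.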

