From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Jiri Olsa <jolsa@redhat.com>
Cc: Milian Wolff <milian.wolff@kdab.com>,
linux-perf-users@vger.kernel.org, namhyung@kernel.org,
Ingo Molnar <mingo@redhat.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: measuring system wide CPU usage ignoring idle process
Date: Thu, 23 Nov 2017 11:42:20 -0300
Message-ID: <20171123144220.GB8789@kernel.org>
In-Reply-To: <20171123142100.GA7066@krava>
On Thu, Nov 23, 2017 at 03:21:00PM +0100, Jiri Olsa wrote:
> On Thu, Nov 23, 2017 at 03:09:31PM +0100, Jiri Olsa wrote:
> > On Thu, Nov 23, 2017 at 02:40:36PM +0100, Milian Wolff wrote:
> > > On Tuesday, November 21, 2017 12:44:38 AM CET Jiri Olsa wrote:
> > > > On Mon, Nov 20, 2017 at 09:24:42PM +0100, Milian Wolff wrote:
> > > > > > On Monday, 20 November 2017 15:29:08 CET Jiri Olsa wrote:
> > > > > > On Mon, Nov 20, 2017 at 03:00:46PM +0100, Milian Wolff wrote:
> > > > > > > Hey all,
> > > > > > >
> > > > > > > > colleagues of mine just brought this inconvenient perf stat behavior
> > > > > > > > to my attention:
> > > > > > >
> > > > > > > $ perf stat -a -e cpu-clock,task-clock,cycles,instructions sleep 1
> > > > > > >
> > > > > > > Performance counter stats for 'system wide':
> > > > > > > 4004.501439 cpu-clock (msec) # 4.000 CPUs
> > > > > > > utilized
> > > > > > > 4004.526474 task-clock (msec) # 4.000 CPUs
> > > > > > > utilized
> > > > > > > 945,906,029 cycles # 0.236 GHz
> > > > > > > 461,861,241 instructions # 0.49 insn per
> > > > > > > cycle
> > > > > > >
> > > > > > > 1.001247082 seconds time elapsed
> > > > > > >
> > > > > > > > This shows that cpu-clock and task-clock are incremented also for the idle
> > > > > > > > processes. Is there some trick to exclude that time, such that the CPU
> > > > > > > > utilization drops below 100% when doing `perf stat -a`?
> > > > > >
> > > > > > > I don't think it's the idle process you see, I think it's the managing
> > > > > > > overhead before the 'sleep 1' task actually goes to sleep
> > > > > >
> > > > > > there's some user space code before it gets into the sleep syscall,
> > > > > > and there's some possible kernel scheduling/syscall/irq code with
> > > > > > events already enabled and counting
> > > > >
> > > > > Sorry for being unclear: I was talking about the task-clock and cpu-clock
> > > > > values which you omitted from your measurements below. My example also
> > > > > shows that the counts for cycles and instructions are fine. But the
> > > > > cpu-clock and task-clock are useless as they always sum up to essentially
> > > > > `$nproc*$runtime`. What I'm hoping for are fractional values for the "N
> > > > > CPUs utilized".
> > > > ugh my bad.. anyway by using -a you create cpu counters
> > > > which never unschedule, so those times will be same
> > > > as the 'sleep 1' run length
Hmm, what role does perf_event_attr.exclude_idle play here?
> > > >
> > > > but not sure now how to get the real utilization.. will check
> > >
> > > did you have a chance to check the above? I'd be really interested in knowing
> > > whether there is an existing workaround. If not, would it be feasible to patch
> > > perf to get the desired behavior? I'd be willing to look into this. This would
> > > probably require changes on the kernel side though, or how could this be
> > > fixed?
> >
> > hi,
> > I haven't found any good way yet.. I ended up with following
> > patch to allow attach counters to idle process, which got
> > me the count/behaviour you need (with few tools changes in
> > my perf/idle branch)
> >
> > but I'm not sure it's the best idea ;-) there might
> > be better way.. CC-ing Ingo, Peter and Alexander
>
> also I was thinking we might add 'idle' line into perf top ;-)
> shouldn't be that hard once we have the counter
Humm...
What is wrong with perf_event_attr.exclude_idle? :-)
From include/uapi/linux/perf_event.h:
exclude_idle : 1, /* don't count when idle */
But it is not being set:
[root@jouet ~]# perf stat -vv -a -e cpu-clock,task-clock,cycles,instructions sleep 1
Using CPUID GenuineIntel-6-3D
intel_pt default config: tsc,pt,branch
------------------------------------------------------------
perf_event_attr:
type 1
size 112
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 7
------------------------------------------------------------
perf_event_attr:
type 1
size 112
config 0x1
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 8
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 11
------------------------------------------------------------
perf_event_attr:
size 112
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 15
------------------------------------------------------------
perf_event_attr:
size 112
config 0x1
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 16
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 17
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 18
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 19
cpu-clock: 0: 1001547771 1001547617 1001547617
cpu-clock: 1: 1001552938 1001552742 1001552742
cpu-clock: 2: 1001555120 1001554407 1001554407
cpu-clock: 3: 1001563889 1001563570 1001563570
cpu-clock: 4006219718 4006218336 4006218336
task-clock: 0: 1001603894 1001603894 1001603894
task-clock: 1: 1001616140 1001616140 1001616140
task-clock: 2: 1001617338 1001617338 1001617338
task-clock: 3: 1001621998 1001621998 1001621998
task-clock: 4006459370 4006459370 4006459370
cycles: 0: 71757776 1001642926 1001642926
cycles: 1: 23188411 1001651335 1001651335
cycles: 2: 24665622 1001654878 1001654878
cycles: 3: 79907293 1001659590 1001659590
cycles: 199519102 4006608729 4006608729
instructions: 0: 40314068 1001677791 1001677791
instructions: 1: 13525409 1001682314 1001682314
instructions: 2: 14247277 1001682655 1001682655
instructions: 3: 23286057 1001685112 1001685112
instructions: 91372811 4006727872 4006727872
Performance counter stats for 'system wide':
4006.219718 cpu-clock (msec) # 3.999 CPUs utilized
4006.459370 task-clock (msec) # 3.999 CPUs utilized
199,519,102 cycles # 0.050 GHz
91,372,811 instructions # 0.46 insn per cycle
1.001749823 seconds time elapsed
[root@jouet ~]#
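To make the problem concrete, here is a rough sketch of how the derived columns in the summary above appear to be computed. The formulas are inferred from the printed numbers, not taken from the perf sources, and the function names are made up for illustration:

```c
#include <stdint.h>

/*
 * Hypothetical helpers; the formulas are an assumption inferred from
 * the perf stat output, not copied from tools/perf.
 */

/* "CPUs utilized": clock time summed over all CPUs / wall-clock time. */
double cpus_utilized(double clock_msec, double elapsed_sec)
{
	return clock_msec / (elapsed_sec * 1000.0);
}

/* "GHz": cycles per nanosecond of summed clock time. */
double avg_ghz(uint64_t cycles, double clock_msec)
{
	return (double)cycles / (clock_msec * 1e6);
}

/* "insn per cycle": plain ratio of the two hardware counts. */
double insn_per_cycle(uint64_t instructions, uint64_t cycles)
{
	return (double)instructions / (double)cycles;
}
```

Feeding in the totals above (4006.219718 msec of cpu-clock over 1.001749823 s elapsed, 199519102 cycles, 91372811 instructions) gives roughly 3.999 CPUs, 0.050 GHz and 0.46 insn per cycle. So the hardware counters clearly show a mostly idle machine, while the clock-based ratio stays pinned at the number of CPUs.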
So I tried the patch at the end of this message, but it doesn't
seem to affect software counters such as cpu-clock and task-clock:
[root@jouet ~]# perf stat --no-idle -a -e cpu-clock,task-clock,cycles,instructions sleep 1m
Performance counter stats for 'system wide':
240005.027025 cpu-clock (msec) # 4.000 CPUs utilized
240005.150119 task-clock (msec) # 4.000 CPUs utilized
2,658,680,286 cycles # 0.011 GHz
1,109,111,339 instructions # 0.42 insn per cycle
60.001361214 seconds time elapsed
[root@jouet ~]# perf stat --idle -a -e cpu-clock,task-clock,cycles,instructions sleep 1m
Performance counter stats for 'system wide':
240006.825047 cpu-clock (msec) # 4.000 CPUs utilized
240006.964995 task-clock (msec) # 4.000 CPUs utilized
2,784,702,480 cycles # 0.012 GHz
1,210,285,863 instructions # 0.43 insn per cycle
60.001806963 seconds time elapsed
[root@jouet ~]#
[root@jouet ~]# perf stat -vv --no-idle -a -e cpu-clock,task-clock,cycles,instructions sleep 1 |& grep exclude_idle
exclude_idle 1
exclude_idle 1
exclude_idle 1
exclude_idle 1
[root@jouet ~]# perf stat -vv -a -e cpu-clock,task-clock,cycles,instructions sleep 1 |& grep exclude_idle
[root@jouet ~]# perf stat --idle -vv -a -e cpu-clock,task-clock,cycles,instructions sleep 1 |& grep exclude_idle
[root@jouet ~]#
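For reference, a minimal sketch of the attr that the --no-idle case ends up asking the kernel for, mirroring the fields shown in the -vv dump. The helper name make_clock_attr() and its count_idle parameter are hypothetical; only the struct fields and constants come from include/uapi/linux/perf_event.h:

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Hypothetical helper: build a software clock attr like the ones in
 * the -vv dump (type 1, config 0 for cpu-clock, config 1 for
 * task-clock), optionally asking the kernel not to count when idle.
 */
struct perf_event_attr make_clock_attr(unsigned long long config, int count_idle)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_SOFTWARE;
	attr.size = sizeof(attr);
	attr.config = config;	/* PERF_COUNT_SW_CPU_CLOCK or PERF_COUNT_SW_TASK_CLOCK */
	attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
			   PERF_FORMAT_TOTAL_TIME_RUNNING;
	attr.disabled = 1;
	attr.inherit = 1;
	attr.exclude_guest = 1;
	attr.exclude_idle = !count_idle;	/* the bit the patch toggles */
	return attr;
}
```

One such attr would then be handed to perf_event_open() once per CPU with pid -1, group_fd -1 and PERF_FLAG_FD_CLOEXEC (the 0x8 in the trace). Going by the runs above, the bit does get set, yet the cpu-clock and task-clock totals do not change.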
Time to look at the kernel...
- Arnaldo
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 59af5a8419e2..32860537e114 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -144,6 +144,7 @@ typedef int (*aggr_get_id_t)(struct cpu_map *m, int cpu);
static int run_count = 1;
static bool no_inherit = false;
+static bool idle = true;
static volatile pid_t child_pid = -1;
static bool null_run = false;
static int detailed_run = 0;
@@ -237,6 +238,7 @@ static int create_perf_stat_counter(struct perf_evsel *evsel)
attr->read_format |= PERF_FORMAT_ID|PERF_FORMAT_GROUP;
attr->inherit = !no_inherit;
+ attr->exclude_idle = !idle;
/*
* Some events get initialized with sample_(period/type) set,
@@ -1890,6 +1892,7 @@ static const struct option stat_options[] = {
OPT_CALLBACK('M', "metrics", &evsel_list, "metric/metric group list",
"monitor specified metrics or metric groups (separated by ,)",
parse_metric_groups),
+ OPT_BOOLEAN(0, "idle", &idle, "Measure when idle"),
OPT_END()
};