* [PATCHv2 0/3] perf: Add cputime events/metrics
@ 2018-11-11 21:04 Jiri Olsa
  2018-11-11 21:04 ` [PATCH 1/3] perf/cputime: Add cputime pmu Jiri Olsa
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Jiri Olsa @ 2018-11-11 21:04 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Alexander Shishkin,
	Stephane Eranian, Milian Wolff, Andi Kleen, Frederic Weisbecker

hi,
so.. I failed to make the exclude_idle bit work reliably for
the cpu-clock event using the idle process's sum_exec_runtime,
as Peter outlined in his patch [1]. The time jumped up and down
and I couldn't make it stable.

But I noticed we actually have IDLE stats (and many more)
available for each CPU (enum cpu_usage_stat); we just can't
reach them from perf yet.

This patchset adds a 'cputime' perf software PMU that provides
the CPUTIME_* stats via events that mirror their names:

  # perf list | grep cputime
    cputime/guest/                                     [Kernel PMU event]
    cputime/guest_nice/                                [Kernel PMU event]
    cputime/idle/                                      [Kernel PMU event]
    cputime/iowait/                                    [Kernel PMU event]
    cputime/irq/                                       [Kernel PMU event]
    cputime/nice/                                      [Kernel PMU event]
    cputime/softirq/                                   [Kernel PMU event]
    cputime/steal/                                     [Kernel PMU event]
    cputime/system/                                    [Kernel PMU event]
    cputime/user/                                      [Kernel PMU event]


v2 changes:
  - all of the support patches are already in
  - new way of 'fixing' idle counts when the tick is disabled (patch 2)


Examples:
  # perf stat --top -I 1000
  #           time       Idle     System       User        Irq    Softirq    IO wait
       1.001692690     100.0%       0.0%       0.0%       0.7%       0.2%       0.0%
       2.002994039      98.9%       0.0%       0.0%       0.9%       0.2%       0.0%
       3.004164038      98.5%       0.2%       0.2%       0.9%       0.2%       0.0%
       4.005312773      98.9%       0.0%       0.0%       0.9%       0.2%       0.0%


  # perf stat --top-full -I 1000
  #           time       Idle     System       User        Irq    Softirq    IO wait      Guest Guest nice       Nice      Steal
       1.001750803     100.0%       0.0%       0.0%       0.7%       0.2%       0.0%       0.0%       0.0%       0.0%       0.0%
       2.003159490      99.0%       0.0%       0.0%       0.9%       0.2%       0.0%       0.0%       0.0%       0.0%       0.0%
       3.004358366      99.0%       0.0%       0.0%       0.9%       0.2%       0.0%       0.0%       0.0%       0.0%       0.0%
       4.005592436      98.9%       0.0%       0.0%       0.9%       0.2%       0.0%       0.0%       0.0%       0.0%       0.0%


  # perf stat -e cpu-clock,cputime/system/,cputime/user/,cputime/idle/ -a sleep 10

   Performance counter stats for 'system wide':

       240070.828221      cpu-clock (msec)          #   23.999 CPUs utilized          
     208,910,979,120 ns   cputime/system/           #     87.0% System                
      20,589,603,359 ns   cputime/user/             #      8.6% User                  
       8,813,416,821 ns   cputime/idle/             #      3.7% Idle                  

        10.003261054 seconds time elapsed


  # perf stat -e cpu-clock,cputime/system/,cputime/user/ yes > /dev/null
  ^Cyes: Interrupt

   Performance counter stats for 'yes':

         3483.824364      cpu-clock (msec)          #    1.000 CPUs utilized          
       2,460,117,205 ns   cputime/system/           #     70.6% System                
       1,018,360,669 ns   cputime/user/             #     29.2% User                  

         3.484554149 seconds time elapsed

         1.018525000 seconds user
         2.460515000 seconds sys

  # perf stat --top -I 1000 --interval-clear
  # perf stat --top -I 1000 --interval-clear --per-core
  # perf stat --top -I 1000 --interval-clear --per-socket
  # perf stat --top -I 1000 --interval-clear -A

It's also available in here:
  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  perf/fixes

thanks,
jirka


[1] https://marc.info/?l=linux-kernel&m=152397251027433&w=2
---
Jiri Olsa (3):
      perf/cputime: Add cputime pmu
      perf/cputime: Fix idle time on NO_HZ config
      perf stat: Add cputime metric support

 include/linux/perf_event.h             |   2 ++
 include/linux/tick.h                   |   1 +
 kernel/events/Makefile                 |   2 +-
 kernel/events/core.c                   |   1 +
 kernel/events/cputime.c                | 221 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/time/tick-sched.c               |  11 ++++++
 tools/perf/Documentation/perf-stat.txt |  65 +++++++++++++++++++++++++++++++++
 tools/perf/builtin-stat.c              |  47 ++++++++++++++++++++++++
 tools/perf/util/stat-shadow.c          |  72 +++++++++++++++++++++++++++++++++++++
 tools/perf/util/stat.c                 |  10 ++++++
 tools/perf/util/stat.h                 |  10 ++++++
 11 files changed, 441 insertions(+), 1 deletion(-)
 create mode 100644 kernel/events/cputime.c


* [PATCH 1/3] perf/cputime: Add cputime pmu
  2018-11-11 21:04 [PATCHv2 0/3] perf: Add cputime events/metrics Jiri Olsa
@ 2018-11-11 21:04 ` Jiri Olsa
  2018-11-11 21:04 ` [PATCH 2/3] perf/cputime: Fix idle time on NO_HZ config Jiri Olsa
  2018-11-11 21:04 ` [PATCH 3/3] perf stat: Add cputime metric support Jiri Olsa
  2 siblings, 0 replies; 4+ messages in thread
From: Jiri Olsa @ 2018-11-11 21:04 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Alexander Shishkin,
	Stephane Eranian, Milian Wolff, Andi Kleen, Frederic Weisbecker

The CPUTIME_* counters account the time CPUs spend in the
various execution contexts. Add a 'cputime' PMU that provides
a perf interface to those counters.

The 'cputime' interface is a standard software PMU that provides
the following events, measuring their CPUTIME counterparts:

  PERF_CPUTIME_USER       - CPUTIME_USER
  PERF_CPUTIME_NICE       - CPUTIME_NICE
  PERF_CPUTIME_SYSTEM     - CPUTIME_SYSTEM
  PERF_CPUTIME_SOFTIRQ    - CPUTIME_SOFTIRQ
  PERF_CPUTIME_IRQ        - CPUTIME_IRQ
  PERF_CPUTIME_IDLE       - CPUTIME_IDLE
  PERF_CPUTIME_IOWAIT     - CPUTIME_IOWAIT
  PERF_CPUTIME_STEAL      - CPUTIME_STEAL
  PERF_CPUTIME_GUEST      - CPUTIME_GUEST
  PERF_CPUTIME_GUEST_NICE - CPUTIME_GUEST_NICE

The 'cputime' PMU adds 'events' and 'format' directories with
the specifics of the above events.

It can be used via the perf tool like this:

  # perf stat -e cputime/system/,cputime/user/ yes > /dev/null
  ^Cyes: Interrupt

   Performance counter stats for 'yes':

       2,177,550,368 ns   cputime/system/
         567,029,895 ns   cputime/user/

         2.749451438 seconds time elapsed

         0.567127000 seconds user
         2.177924000 seconds sys

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 include/linux/perf_event.h |   2 +
 kernel/events/Makefile     |   2 +-
 kernel/events/core.c       |   1 +
 kernel/events/cputime.c    | 198 +++++++++++++++++++++++++++++++++++++
 4 files changed, 202 insertions(+), 1 deletion(-)
 create mode 100644 kernel/events/cputime.c

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 53c500f0ca79..47a31d01df5a 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1162,6 +1162,8 @@ static inline int perf_callchain_store(struct perf_callchain_entry_ctx *ctx, u64
 	}
 }
 
+extern int __init perf_cputime_register(void);
+
 extern int sysctl_perf_event_paranoid;
 extern int sysctl_perf_event_mlock;
 extern int sysctl_perf_event_sample_rate;
diff --git a/kernel/events/Makefile b/kernel/events/Makefile
index 3c022e33c109..02271b8433a7 100644
--- a/kernel/events/Makefile
+++ b/kernel/events/Makefile
@@ -3,7 +3,7 @@ ifdef CONFIG_FUNCTION_TRACER
 CFLAGS_REMOVE_core.o = $(CC_FLAGS_FTRACE)
 endif
 
-obj-y := core.o ring_buffer.o callchain.o
+obj-y := core.o ring_buffer.o callchain.o cputime.o
 
 obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
 obj-$(CONFIG_UPROBES) += uprobes.o
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 84530ab358c3..7403a27363f8 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11723,6 +11723,7 @@ void __init perf_event_init(void)
 	perf_pmu_register(&perf_cpu_clock, NULL, -1);
 	perf_pmu_register(&perf_task_clock, NULL, -1);
 	perf_tp_register();
+	perf_cputime_register();
 	perf_event_init_cpu(smp_processor_id());
 	register_reboot_notifier(&perf_reboot_notifier);
 
diff --git a/kernel/events/cputime.c b/kernel/events/cputime.c
new file mode 100644
index 000000000000..efad24543f13
--- /dev/null
+++ b/kernel/events/cputime.c
@@ -0,0 +1,198 @@
+#include <linux/kernel_stat.h>
+#include <linux/sched.h>
+#include <linux/perf_event.h>
+
+enum perf_cputime_id {
+	PERF_CPUTIME_USER,
+	PERF_CPUTIME_NICE,
+	PERF_CPUTIME_SYSTEM,
+	PERF_CPUTIME_SOFTIRQ,
+	PERF_CPUTIME_IRQ,
+	PERF_CPUTIME_IDLE,
+	PERF_CPUTIME_IOWAIT,
+	PERF_CPUTIME_STEAL,
+	PERF_CPUTIME_GUEST,
+	PERF_CPUTIME_GUEST_NICE,
+	PERF_CPUTIME_MAX,
+};
+
+static enum cpu_usage_stat map[PERF_CPUTIME_MAX] = {
+	[PERF_CPUTIME_USER]		= CPUTIME_USER,
+	[PERF_CPUTIME_NICE]		= CPUTIME_NICE,
+	[PERF_CPUTIME_SYSTEM]		= CPUTIME_SYSTEM,
+	[PERF_CPUTIME_SOFTIRQ]		= CPUTIME_SOFTIRQ,
+	[PERF_CPUTIME_IRQ]		= CPUTIME_IRQ,
+	[PERF_CPUTIME_IDLE]		= CPUTIME_IDLE,
+	[PERF_CPUTIME_IOWAIT]		= CPUTIME_IOWAIT,
+	[PERF_CPUTIME_STEAL]		= CPUTIME_STEAL,
+	[PERF_CPUTIME_GUEST]		= CPUTIME_GUEST,
+	[PERF_CPUTIME_GUEST_NICE]	= CPUTIME_GUEST_NICE,
+};
+
+PMU_FORMAT_ATTR(event, "config:0-63");
+
+static struct attribute *cputime_format_attrs[] = {
+	&format_attr_event.attr,
+	NULL,
+};
+
+static struct attribute_group cputime_format_attr_group = {
+	.name = "format",
+	.attrs = cputime_format_attrs,
+};
+
+static ssize_t
+cputime_event_attr_show(struct device *dev, struct device_attribute *attr,
+		    char *page)
+{
+	struct perf_pmu_events_attr *pmu_attr =
+		container_of(attr, struct perf_pmu_events_attr, attr);
+
+	return sprintf(page, "event=%llu\n", pmu_attr->id);
+}
+
+#define __A(__n, __e)					\
+	PMU_EVENT_ATTR(__n, cputime_attr_##__n,		\
+		       __e, cputime_event_attr_show);	\
+	PMU_EVENT_ATTR_STRING(__n.unit,			\
+			cputime_attr_##__n##_unit, "ns");
+
+__A(user,	PERF_CPUTIME_USER)
+__A(nice,	PERF_CPUTIME_NICE)
+__A(system,	PERF_CPUTIME_SYSTEM)
+__A(softirq,	PERF_CPUTIME_SOFTIRQ)
+__A(irq,	PERF_CPUTIME_IRQ)
+__A(idle,	PERF_CPUTIME_IDLE)
+__A(iowait,	PERF_CPUTIME_IOWAIT)
+__A(steal,	PERF_CPUTIME_STEAL)
+__A(guest,	PERF_CPUTIME_GUEST)
+__A(guest_nice,	PERF_CPUTIME_GUEST_NICE)
+
+#undef __A
+
+static struct attribute *cputime_events_attrs[] = {
+#define __A(__n)				\
+	&cputime_attr_##__n.attr.attr,		\
+	&cputime_attr_##__n##_unit.attr.attr,
+
+	__A(user)
+	__A(nice)
+	__A(system)
+	__A(softirq)
+	__A(irq)
+	__A(idle)
+	__A(iowait)
+	__A(steal)
+	__A(guest)
+	__A(guest_nice)
+
+	NULL,
+
+#undef __A
+};
+
+static struct attribute_group cputime_events_attr_group = {
+	.name = "events",
+	.attrs = cputime_events_attrs,
+};
+
+static const struct attribute_group *cputime_attr_groups[] = {
+	&cputime_format_attr_group,
+	&cputime_events_attr_group,
+	NULL,
+};
+
+static u64 cputime_read_counter(struct perf_event *event)
+{
+	int cpu = event->oncpu;
+
+	return kcpustat_cpu(cpu).cpustat[event->hw.config];
+}
+
+static void perf_cputime_update(struct perf_event *event)
+{
+	u64 prev, now;
+	s64 delta;
+
+	/* Careful, an NMI might modify the previous event value: */
+again:
+	prev = local64_read(&event->hw.prev_count);
+	now = cputime_read_counter(event);
+
+	if (local64_cmpxchg(&event->hw.prev_count, prev, now) != prev)
+		goto again;
+
+	delta = now - prev;
+	local64_add(delta, &event->count);
+}
+
+static void cputime_event_start(struct perf_event *event, int flags)
+{
+	u64 now = cputime_read_counter(event);
+
+	local64_set(&event->hw.prev_count, now);
+}
+
+static void cputime_event_stop(struct perf_event *event, int flags)
+{
+	perf_cputime_update(event);
+}
+
+static int cputime_event_add(struct perf_event *event, int flags)
+{
+	if (flags & PERF_EF_START)
+		cputime_event_start(event, flags);
+
+	return 0;
+}
+
+static void cputime_event_del(struct perf_event *event, int flags)
+{
+	cputime_event_stop(event, PERF_EF_UPDATE);
+}
+
+static void perf_cputime_read(struct perf_event *event)
+{
+	perf_cputime_update(event);
+}
+
+static int cputime_event_init(struct perf_event *event)
+{
+	u64 cfg = event->attr.config;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/* unsupported modes and filters */
+	if (event->attr.exclude_user   ||
+	    event->attr.exclude_kernel ||
+	    event->attr.exclude_hv     ||
+	    event->attr.exclude_idle   ||
+	    event->attr.exclude_host   ||
+	    event->attr.exclude_guest  ||
+	    event->attr.sample_period) /* no sampling */
+		return -EINVAL;
+
+	if (cfg >= PERF_CPUTIME_MAX)
+		return -EINVAL;
+
+	event->hw.config = map[cfg];
+	return 0;
+}
+
+static struct pmu perf_cputime = {
+	.task_ctx_nr	= perf_sw_context,
+	.attr_groups	= cputime_attr_groups,
+	.capabilities	= PERF_PMU_CAP_NO_INTERRUPT,
+	.event_init	= cputime_event_init,
+	.add		= cputime_event_add,
+	.del		= cputime_event_del,
+	.start		= cputime_event_start,
+	.stop		= cputime_event_stop,
+	.read		= perf_cputime_read,
+};
+
+int __init perf_cputime_register(void)
+{
+	return perf_pmu_register(&perf_cputime, "cputime", -1);
+}
-- 
2.17.2



* [PATCH 2/3] perf/cputime: Fix idle time on NO_HZ config
  2018-11-11 21:04 [PATCHv2 0/3] perf: Add cputime events/metrics Jiri Olsa
  2018-11-11 21:04 ` [PATCH 1/3] perf/cputime: Add cputime pmu Jiri Olsa
@ 2018-11-11 21:04 ` Jiri Olsa
  2018-11-11 21:04 ` [PATCH 3/3] perf stat: Add cputime metric support Jiri Olsa
  2 siblings, 0 replies; 4+ messages in thread
From: Jiri Olsa @ 2018-11-11 21:04 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Alexander Shishkin,
	Stephane Eranian, Milian Wolff, Andi Kleen, Frederic Weisbecker

With NO_HZ enabled, we won't get proper numbers for the idle
counter if the tick is disabled on the CPU.

Make up for it by computing the idle counter value while the
tick is stopped, which keeps the counter updated at the time
it is read.

Link: http://lkml.kernel.org/n/tip-rw8kylf86mkfv60blwu5iyqr@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 include/linux/tick.h     |  1 +
 kernel/events/cputime.c  | 25 ++++++++++++++++++++++++-
 kernel/time/tick-sched.c | 11 +++++++++++
 3 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 55388ab45fd4..17aaaae18a3c 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -125,6 +125,7 @@ extern bool tick_nohz_idle_got_tick(void);
 extern ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next);
 extern unsigned long tick_nohz_get_idle_calls(void);
 extern unsigned long tick_nohz_get_idle_calls_cpu(int cpu);
+extern unsigned long tick_nohz_get_idle_jiffies_cpu(int cpu);
 extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
 extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time);
 
diff --git a/kernel/events/cputime.c b/kernel/events/cputime.c
index efad24543f13..c49e73713263 100644
--- a/kernel/events/cputime.c
+++ b/kernel/events/cputime.c
@@ -1,6 +1,7 @@
 #include <linux/kernel_stat.h>
 #include <linux/sched.h>
 #include <linux/perf_event.h>
+#include <linux/tick.h>
 
 enum perf_cputime_id {
 	PERF_CPUTIME_USER,
@@ -102,11 +103,33 @@ static const struct attribute_group *cputime_attr_groups[] = {
 	NULL,
 };
 
+#ifdef CONFIG_NO_HZ_COMMON
+static u64 idle_fix(int cpu)
+{
+	u64 ticks;
+
+	if (!tick_nohz_tick_stopped_cpu(cpu))
+		return 0;
+
+	ticks = jiffies - tick_nohz_get_idle_jiffies_cpu(cpu);
+	return ticks * TICK_NSEC;
+}
+#else
+static u64 idle_fix(int cpu)
+{
+	return 0;
+}
+#endif
+
 static u64 cputime_read_counter(struct perf_event *event)
 {
 	int cpu = event->oncpu;
+	u64 val = kcpustat_cpu(cpu).cpustat[event->hw.config];
+
+	if (event->hw.config == PERF_CPUTIME_IDLE)
+		val += idle_fix(cpu);
 
-	return kcpustat_cpu(cpu).cpustat[event->hw.config];
+	return val;
 }
 
 static void perf_cputime_update(struct perf_event *event)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 69e673b88474..c5df46e3c9f5 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1077,6 +1077,17 @@ unsigned long tick_nohz_get_idle_calls_cpu(int cpu)
 	return ts->idle_calls;
 }
 
+/**
+ * tick_nohz_get_idle_jiffies_cpu - return the current idle jiffies counter value
+ * for a particular CPU.
+ */
+unsigned long tick_nohz_get_idle_jiffies_cpu(int cpu)
+{
+	struct tick_sched *ts = tick_get_tick_sched(cpu);
+
+	return ts->idle_jiffies;
+}
+
 /**
  * tick_nohz_get_idle_calls - return the current idle calls counter value
  *
-- 
2.17.2



* [PATCH 3/3] perf stat: Add cputime metric support
  2018-11-11 21:04 [PATCHv2 0/3] perf: Add cputime events/metrics Jiri Olsa
  2018-11-11 21:04 ` [PATCH 1/3] perf/cputime: Add cputime pmu Jiri Olsa
  2018-11-11 21:04 ` [PATCH 2/3] perf/cputime: Fix idle time on NO_HZ config Jiri Olsa
@ 2018-11-11 21:04 ` Jiri Olsa
  2 siblings, 0 replies; 4+ messages in thread
From: Jiri Olsa @ 2018-11-11 21:04 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Alexander Shishkin,
	Stephane Eranian, Milian Wolff, Andi Kleen, Frederic Weisbecker

Add --top/--top-full options to provide metrics based on the
cputime PMU events. All the metrics are simple ratios of the
events to STAT_NSECS time, giving their % value.

The --top option provides a basic subset of cputime metrics:

  # perf stat --top -I 1000
  #           time       Idle     System       User        Irq    Softirq    IO wait
       1.001692690     100.0%       0.0%       0.0%       0.7%       0.2%       0.0%
       2.002994039      98.9%       0.0%       0.0%       0.9%       0.2%       0.0%
       3.004164038      98.5%       0.2%       0.2%       0.9%       0.2%       0.0%
       4.005312773      98.9%       0.0%       0.0%       0.9%       0.2%       0.0%

The --top-full option provides all cputime metrics:

  # perf stat --top-full -I 1000
  #           time       Idle     System       User        Irq    Softirq    IO wait      Guest Guest nice       Nice      Steal
       1.001750803     100.0%       0.0%       0.0%       0.7%       0.2%       0.0%       0.0%       0.0%       0.0%       0.0%
       2.003159490      99.0%       0.0%       0.0%       0.9%       0.2%       0.0%       0.0%       0.0%       0.0%       0.0%
       3.004358366      99.0%       0.0%       0.0%       0.9%       0.2%       0.0%       0.0%       0.0%       0.0%       0.0%
       4.005592436      98.9%       0.0%       0.0%       0.9%       0.2%       0.0%       0.0%       0.0%       0.0%       0.0%

Link: http://lkml.kernel.org/n/tip-zue4s78pxc1cybb954t52ks4@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/Documentation/perf-stat.txt | 65 +++++++++++++++++++++++
 tools/perf/builtin-stat.c              | 47 +++++++++++++++++
 tools/perf/util/stat-shadow.c          | 72 ++++++++++++++++++++++++++
 tools/perf/util/stat.c                 | 10 ++++
 tools/perf/util/stat.h                 | 10 ++++
 5 files changed, 204 insertions(+)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index b10a90b6a718..9330765b7225 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -310,6 +310,71 @@ The output is SMI cycles%, equals to (aperf - unhalted core cycles) / aperf
 
 Users who wants to get the actual value can apply --no-metric-only.
 
+--top::
+--top-full::
+Measure cputime PMU events and display CPU utilization rates as percentages.
+
+The --top option displays rates for the following events:
+  idle system user irq softirq iowait
+
+The --top-full option displays additional rates:
+  guest guest_nice nice steal
+
+Examples:
+  # perf stat --top
+  ^C
+   Performance counter stats for 'system wide':
+
+        Idle     System       User        Irq    Softirq    IO wait
+        1.3%      89.5%       7.4%       1.8%       0.1%       0.0%
+
+         7.282332605 seconds time elapsed
+
+  # perf stat --top-full
+  ^C
+   Performance counter stats for 'system wide':
+
+        Idle     System       User        Irq    Softirq    IO wait      Guest Guest nice       Nice      Steal
+        5.4%      85.4%       8.6%       0.5%       0.1%       0.0%       0.0%       0.0%       0.0%       0.0%
+
+         7.618359683 seconds time elapsed
+
+  # perf stat --top -I 1000
+  #           time       Idle     System       User        Irq    Softirq    IO wait
+       1.000525839       5.4%      85.3%       8.8%       0.4%       0.1%       0.0%
+       2.001032632       5.1%      85.7%       8.7%       0.4%       0.1%       0.0%
+       3.001388414       5.2%      85.7%       8.6%       0.4%       0.1%       0.0%
+       4.001758697       5.7%      85.2%       8.6%       0.5%       0.1%       0.0%
+
+  # perf stat --top -I 1000 -A
+  #           time CPU           Idle     System       User        Irq    Softirq    IO wait
+       1.000485174 CPU0          6.9%      84.0%       8.6%       0.5%       0.1%       0.0%
+       1.000485174 CPU1          5.5%      84.8%       9.1%       0.5%       0.1%       0.0%
+       1.000485174 CPU2          5.5%      86.6%       7.4%       0.5%       0.1%       0.0%
+       ...
+
+  # perf stat --top -I 1000 --per-core
+  #           time core         cpus       Idle     System       User        Irq    Softirq    IO wait
+       1.000450719 S0-C0           2       4.6%      87.0%       7.9%       0.4%       0.1%       0.0%
+       1.000450719 S0-C1           2       4.8%      86.3%       8.3%       0.4%       0.1%       0.0%
+       1.000450719 S0-C2           2       5.3%      86.3%       7.8%       0.4%       0.1%       0.0%
+       1.000450719 S0-C3           2       5.2%      85.5%       8.7%       0.4%       0.1%       0.0%
+       1.000450719 S0-C4           2       4.5%      86.7%       8.3%       0.4%       0.1%       0.0%
+
+  # perf stat --top ./perf bench sched messaging -l 10000
+  ...
+     Total time: 7.089 [sec]
+
+   Performance counter stats for './perf bench sched messaging -l 10000':
+
+        Idle     System       User        Irq    Softirq    IO wait
+        0.0%      90.1%       8.9%       0.5%       0.1%       0.0%
+
+         7.186366800 seconds time elapsed
+
+        14.527066000 seconds user
+       146.254278000 seconds sys
+
 EXAMPLES
 --------
 
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index a635abfa77b6..23e4e1b76ebb 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -133,6 +133,34 @@ static const char *smi_cost_attrs = {
 	"}"
 };
 
+static const char *top_attrs = {
+	"{"
+	"cpu-clock,"
+	"cputime/idle/,"
+	"cputime/system/,"
+	"cputime/user/,"
+	"cputime/irq/,"
+	"cputime/softirq/,"
+	"cputime/iowait/"
+	"}"
+};
+
+static const char *top_full_attrs = {
+	"{"
+	"cpu-clock,"
+	"cputime/idle/,"
+	"cputime/system/,"
+	"cputime/user/,"
+	"cputime/irq/,"
+	"cputime/softirq/,"
+	"cputime/iowait/,"
+	"cputime/guest/,"
+	"cputime/guest_nice/,"
+	"cputime/nice/,"
+	"cputime/steal/"
+	"}"
+};
+
 static struct perf_evlist	*evsel_list;
 
 static struct target target = {
@@ -145,6 +173,8 @@ static volatile pid_t		child_pid			= -1;
 static int			detailed_run			=  0;
 static bool			transaction_run;
 static bool			topdown_run			= false;
+static bool			top_run				= false;
+static bool			top_run_full			= false;
 static bool			smi_cost			= false;
 static bool			smi_reset			= false;
 static int			big_num_opt			=  -1;
@@ -786,6 +816,8 @@ static const struct option stat_options[] = {
 	OPT_CALLBACK('M', "metrics", &evsel_list, "metric/metric group list",
 		     "monitor specified metrics or metric groups (separated by ,)",
 		     parse_metric_groups),
+	OPT_BOOLEAN(0, "top", &top_run, "show CPU utilization"),
+	OPT_BOOLEAN(0, "top-full", &top_run_full, "show extended CPU utilization"),
 	OPT_END()
 };
 
@@ -1186,6 +1218,21 @@ static int add_default_attributes(void)
 		return 0;
 	}
 
+	if (top_run || top_run_full) {
+		const char *attrs = top_run ? top_attrs : top_full_attrs;
+
+		err = parse_events(evsel_list, attrs, &errinfo);
+		if (err) {
+			fprintf(stderr, "Cannot set up cputime events\n");
+			parse_events_print_error(&errinfo, attrs);
+			return -1;
+		}
+		if (!force_metric_only)
+			stat_config.metric_only = true;
+		stat_config.metric_only_len = 10;
+		return 0;
+	}
+
 	if (smi_cost) {
 		int smi;
 
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 8ad32763cfff..7e24b042d0d2 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -760,6 +760,46 @@ static void generic_metric(struct perf_stat_config *config,
 		print_metric(config, ctxp, NULL, NULL, "", 0);
 }
 
+static void cputime_color_name(struct perf_evsel *evsel,
+			       const char **color, const char **name,
+			       double ratio)
+{
+	if (perf_stat_evsel__is(evsel, CPUTIME_IDLE)) {
+		if (ratio < 0.8)
+			*color = PERF_COLOR_GREEN;
+		if (ratio < 0.5)
+			*color = PERF_COLOR_RED;
+		*name = "Idle";
+		return;
+	}
+
+	if (ratio > (MIN_GREEN / 100))
+		*color = PERF_COLOR_GREEN;
+	if (ratio > (MIN_RED / 100))
+		*color = PERF_COLOR_RED;
+
+	if (perf_stat_evsel__is(evsel, CPUTIME_GUEST))
+		*name = "Guest";
+	else if (perf_stat_evsel__is(evsel, CPUTIME_GUEST_NICE))
+		*name = "Guest nice";
+	else if (perf_stat_evsel__is(evsel, CPUTIME_IOWAIT))
+		*name = "IO wait";
+	else if (perf_stat_evsel__is(evsel, CPUTIME_IRQ))
+		*name = "Irq";
+	else if (perf_stat_evsel__is(evsel, CPUTIME_NICE))
+		*name = "Nice";
+	else if (perf_stat_evsel__is(evsel, CPUTIME_SOFTIRQ))
+		*name = "Softirq";
+	else if (perf_stat_evsel__is(evsel, CPUTIME_STEAL))
+		*name = "Steal";
+	else if (perf_stat_evsel__is(evsel, CPUTIME_SYSTEM))
+		*name = "System";
+	else if (perf_stat_evsel__is(evsel, CPUTIME_USER))
+		*name = "User";
+	else
+		*name = "unknown";
+}
+
 void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 				   struct perf_evsel *evsel,
 				   double avg, int cpu,
@@ -970,6 +1010,38 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 					be_bound * 100.);
 		else
 			print_metric(config, ctxp, NULL, NULL, name, 0);
+	} else if (perf_stat_evsel__is(evsel, CPUTIME_GUEST)      ||
+		   perf_stat_evsel__is(evsel, CPUTIME_GUEST_NICE) ||
+		   perf_stat_evsel__is(evsel, CPUTIME_IDLE)       ||
+		   perf_stat_evsel__is(evsel, CPUTIME_IOWAIT)     ||
+		   perf_stat_evsel__is(evsel, CPUTIME_IRQ)        ||
+		   perf_stat_evsel__is(evsel, CPUTIME_NICE)       ||
+		   perf_stat_evsel__is(evsel, CPUTIME_SOFTIRQ)    ||
+		   perf_stat_evsel__is(evsel, CPUTIME_STEAL)      ||
+		   perf_stat_evsel__is(evsel, CPUTIME_SYSTEM)     ||
+		   perf_stat_evsel__is(evsel, CPUTIME_USER)) {
+
+		const char *name = NULL;
+
+		total = runtime_stat_avg(st, STAT_NSECS, ctx, cpu);
+		/* STAT_NSECS is usec, cputime in nsec, converting */
+		total *= 1e6;
+
+		if (total)
+			ratio = avg / total;
+
+		cputime_color_name(evsel, &color, &name, ratio);
+
+		/*
+		 * The cputime measures are tricky, we can easily get some noise
+		 * over 100% ... so let's be proactive and not confuse users ;-)
+		 */
+		ratio = min(1., ratio);
+
+		if (total)
+			print_metric(config, ctxp, color, "%8.1f%%", name, ratio * 100.);
+		else
+			print_metric(config, ctxp, NULL, NULL, name, 0);
 	} else if (evsel->metric_expr) {
 		generic_metric(config, evsel->metric_expr, evsel->metric_events, evsel->name,
 				evsel->metric_name, avg, cpu, out, st);
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 4d40515307b8..c07d97083333 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -89,6 +89,16 @@ static const char *id_str[PERF_STAT_EVSEL_ID__MAX] = {
 	ID(TOPDOWN_RECOVERY_BUBBLES, topdown-recovery-bubbles),
 	ID(SMI_NUM, msr/smi/),
 	ID(APERF, msr/aperf/),
+	ID(CPUTIME_GUEST,	cputime/guest/),
+	ID(CPUTIME_GUEST_NICE,	cputime/guest_nice/),
+	ID(CPUTIME_IDLE,	cputime/idle/),
+	ID(CPUTIME_IOWAIT,	cputime/iowait/),
+	ID(CPUTIME_IRQ,		cputime/irq/),
+	ID(CPUTIME_NICE,	cputime/nice/),
+	ID(CPUTIME_SOFTIRQ,	cputime/softirq/),
+	ID(CPUTIME_STEAL,	cputime/steal/),
+	ID(CPUTIME_SYSTEM,	cputime/system/),
+	ID(CPUTIME_USER,	cputime/user/),
 };
 #undef ID
 
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 2f9c9159a364..f2582d89ef35 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -31,6 +31,16 @@ enum perf_stat_evsel_id {
 	PERF_STAT_EVSEL_ID__TOPDOWN_RECOVERY_BUBBLES,
 	PERF_STAT_EVSEL_ID__SMI_NUM,
 	PERF_STAT_EVSEL_ID__APERF,
+	PERF_STAT_EVSEL_ID__CPUTIME_GUEST,
+	PERF_STAT_EVSEL_ID__CPUTIME_GUEST_NICE,
+	PERF_STAT_EVSEL_ID__CPUTIME_IDLE,
+	PERF_STAT_EVSEL_ID__CPUTIME_IOWAIT,
+	PERF_STAT_EVSEL_ID__CPUTIME_IRQ,
+	PERF_STAT_EVSEL_ID__CPUTIME_NICE,
+	PERF_STAT_EVSEL_ID__CPUTIME_SOFTIRQ,
+	PERF_STAT_EVSEL_ID__CPUTIME_STEAL,
+	PERF_STAT_EVSEL_ID__CPUTIME_SYSTEM,
+	PERF_STAT_EVSEL_ID__CPUTIME_USER,
 	PERF_STAT_EVSEL_ID__MAX,
 };
 
-- 
2.17.2


