linux-kernel.vger.kernel.org archive mirror
* [PATCH v1] perf stat: avoid 10ms limit for printing event counts
@ 2018-03-27  8:09 Alexey Budankov
  2018-03-27  9:06 ` Andi Kleen
  0 siblings, 1 reply; 6+ messages in thread
From: Alexey Budankov @ 2018-03-27  8:09 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo
  Cc: Alexander Shishkin, Jiri Olsa, Namhyung Kim, Andi Kleen, linux-kernel


Currently the interval for printing performance counter values is
limited to a minimum of 10ms, so the tool cannot read the values at
frequencies higher than 100Hz.

This change removes that limitation, making perf stat -I usable at
frequencies up to 1kHz and, to some extent, bringing perf stat -I
on par with perf record sampling profiling.

When running perf stat -I to monitor e.g. PCIe uncore counters while
at the same time profiling some I/O workload with perf record, e.g. for
cpu-cycles and context switches, it becomes possible to build and
observe a good-enough consolidated CPU/OS/IO(Uncore) performance
picture for that workload.

The warning about possible runtime overhead is still preserved, but
it is only visible when the -v option is specified.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/builtin-stat.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f5c454855908..316607edd238 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1943,7 +1943,7 @@ static const struct option stat_options[] = {
 	OPT_STRING(0, "post", &post_cmd, "command",
 			"command to run after to the measured command"),
 	OPT_UINTEGER('I', "interval-print", &stat_config.interval,
-		    "print counts at regular interval in ms (>= 10)"),
+		    "print counts at regular interval in ms."),
 	OPT_INTEGER(0, "interval-count", &stat_config.times,
 		    "print counts for fixed number of times"),
 	OPT_UINTEGER(0, "timeout", &stat_config.timeout,
@@ -2924,14 +2924,9 @@ int cmd_stat(int argc, const char **argv)
 	}
 
 	if (interval && interval < 100) {
-		if (interval < 10) {
-			pr_err("print interval must be >= 10ms\n");
-			parse_options_usage(stat_usage, stat_options, "I", 1);
-			goto out;
-		} else
-			pr_warning("print interval < 100ms. "
-				   "The overhead percentage could be high in some cases. "
-				   "Please proceed with caution.\n");
+		pr_warning("print interval < 100ms. "
+			   "The overhead percentage could be high in some cases. "
+			   "Please proceed with caution.\n");
 	}
 
 	if (stat_config.times && interval)


* Re: [PATCH v1] perf stat: avoid 10ms limit for printing event counts
  2018-03-27  8:09 [PATCH v1] perf stat: avoid 10ms limit for printing event counts Alexey Budankov
@ 2018-03-27  9:06 ` Andi Kleen
  2018-03-27 10:55   ` Alexey Budankov
  2018-03-27 11:40   ` Alexey Budankov
  0 siblings, 2 replies; 6+ messages in thread
From: Andi Kleen @ 2018-03-27  9:06 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-kernel

> When running perf stat -I to monitor e.g. PCIe uncore counters while
> at the same time profiling some I/O workload with perf record, e.g. for
> cpu-cycles and context switches, it becomes possible to build and
> observe a good-enough consolidated CPU/OS/IO(Uncore) performance
> picture for that workload.

At some point I still hope we can make uncore measurements in 
perf record work. Kan tried at some point to allow multiple
PMUs in a group, but was not successful. But perhaps we
can sample them from a software event instead.
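
The software-event idea above can be sketched in user space with the
perf_event_open() syscall: a sampling software event leads a group, and
PERF_SAMPLE_READ | PERF_FORMAT_GROUP makes the kernel record the whole
group's counts at each sample. This is only an illustration: a core
hardware event stands in for the uncore event, since grouping across
PMUs is exactly the part that does not work yet, and the helper name
open_sampling_group() is made up for this sketch.

```c
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <string.h>

/* Thin wrapper around the perf_event_open syscall (no glibc stub). */
static long perf_open(struct perf_event_attr *attr, int group_fd)
{
	return syscall(__NR_perf_event_open, attr, 0, -1, group_fd, 0);
}

/*
 * Open a sampling software event (task-clock) as group leader, with a
 * hardware event as a group member standing in for an uncore counter.
 * Returns the leader's fd, or -1 on failure (e.g. perf_event_paranoid
 * restrictions).
 */
int open_sampling_group(void)
{
	struct perf_event_attr leader, member;
	int lfd;

	memset(&leader, 0, sizeof(leader));
	leader.type = PERF_TYPE_SOFTWARE;
	leader.size = sizeof(leader);
	leader.config = PERF_COUNT_SW_TASK_CLOCK;
	leader.sample_period = 1000000;		/* ~1ms of task time */
	leader.sample_type = PERF_SAMPLE_READ;	/* record group counts */
	leader.read_format = PERF_FORMAT_GROUP;
	leader.disabled = 1;

	lfd = (int)perf_open(&leader, -1);
	if (lfd < 0)
		return -1;

	memset(&member, 0, sizeof(member));
	member.type = PERF_TYPE_HARDWARE;	/* stand-in for an uncore PMU */
	member.size = sizeof(member);
	member.config = PERF_COUNT_HW_CPU_CYCLES;

	if ((int)perf_open(&member, lfd) < 0) {
		close(lfd);
		return -1;
	}

	/* A real tool would now mmap() lfd's ring buffer and parse
	 * PERF_RECORD_SAMPLE records carrying the group reads. */
	return lfd;
}
```

The missing piece, as noted, is that an uncore event cannot currently
join such a group, so the kernel-side sampling hook is what a real
solution would need.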

> 
> The warning about possible runtime overhead is still preserved, but
> it is only visible when the -v option is specified.

I would print it unconditionally. Very few people use -v.

BTW better of course would be to occasionally measure the perf stat 
cpu time and only print the warning if it's above some percentage
of a CPU. But that would be much more work.

Rest looks ok.


-Andi


* Re: [PATCH v1] perf stat: avoid 10ms limit for printing event counts
  2018-03-27  9:06 ` Andi Kleen
@ 2018-03-27 10:55   ` Alexey Budankov
  2018-03-27 11:40   ` Alexey Budankov
  1 sibling, 0 replies; 6+ messages in thread
From: Alexey Budankov @ 2018-03-27 10:55 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-kernel

On 27.03.2018 12:06, Andi Kleen wrote:
>> When running perf stat -I to monitor e.g. PCIe uncore counters while
>> at the same time profiling some I/O workload with perf record, e.g. for
>> cpu-cycles and context switches, it becomes possible to build and
>> observe a good-enough consolidated CPU/OS/IO(Uncore) performance
>> picture for that workload.
> 
> At some point I still hope we can make uncore measurements in 
> perf record work. Kan tried at some point to allow multiple
> PMUs in a group, but was not successful. But perhaps we
> can sample them from a software event instead.
> 
>>
>> The warning about possible runtime overhead is still preserved, but
>> it is only visible when the -v option is specified.
> 
> I would print it unconditionally. Very few people use -v.

If there are no objections I will resend the updated version.

Thanks,
Alexey

> 
> BTW better of course would be to occasionally measure the perf stat 
> cpu time and only print the warning if it's above some percentage
> of a CPU. But that would be much more work.
> 
> Rest looks ok.
> 
> 
> -Andi
> 


* Re: [PATCH v1] perf stat: avoid 10ms limit for printing event counts
  2018-03-27  9:06 ` Andi Kleen
  2018-03-27 10:55   ` Alexey Budankov
@ 2018-03-27 11:40   ` Alexey Budankov
  2018-03-27 11:59     ` Andi Kleen
  1 sibling, 1 reply; 6+ messages in thread
From: Alexey Budankov @ 2018-03-27 11:40 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-kernel

On 27.03.2018 12:06, Andi Kleen wrote:
>> When running perf stat -I to monitor e.g. PCIe uncore counters while
>> at the same time profiling some I/O workload with perf record, e.g. for
>> cpu-cycles and context switches, it becomes possible to build and
>> observe a good-enough consolidated CPU/OS/IO(Uncore) performance
>> picture for that workload.
> 
> At some point I still hope we can make uncore measurements in 
> perf record work. Kan tried at some point to allow multiple
> PMUs in a group, but was not successful. But perhaps we
> can sample them from a software event instead.
> 
>>
>> The warning about possible runtime overhead is still preserved, but
>> it is only visible when the -v option is specified.
> 
> I would print it unconditionally. Very few people use -v.
> 
> BTW better of course would be to occasionally measure the perf stat 
> cpu time and only print the warning if it's above some percentage
> of a CPU. But that would be much more work.

Would you please elaborate more on that?

Thanks,
Alexey

> 
> Rest looks ok.
> 
> 
> -Andi
> 


* Re: [PATCH v1] perf stat: avoid 10ms limit for printing event counts
  2018-03-27 11:40   ` Alexey Budankov
@ 2018-03-27 11:59     ` Andi Kleen
  2018-03-27 16:27       ` Alexey Budankov
  0 siblings, 1 reply; 6+ messages in thread
From: Andi Kleen @ 2018-03-27 11:59 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-kernel

On Tue, Mar 27, 2018 at 02:40:29PM +0300, Alexey Budankov wrote:
> On 27.03.2018 12:06, Andi Kleen wrote:
> >> When running perf stat -I to monitor e.g. PCIe uncore counters while
> >> at the same time profiling some I/O workload with perf record, e.g. for
> >> cpu-cycles and context switches, it becomes possible to build and
> >> observe a good-enough consolidated CPU/OS/IO(Uncore) performance
> >> picture for that workload.
> > 
> > At some point I still hope we can make uncore measurements in 
> > perf record work. Kan tried at some point to allow multiple
> > PMUs in a group, but was not successful. But perhaps we
> > can sample them from a software event instead.
> > 
> >>
> >> The warning about possible runtime overhead is still preserved, but
> >> it is only visible when the -v option is specified.
> > 
> > I would print it unconditionally. Very few people use -v.
> > 
> > BTW better of course would be to occasionally measure the perf stat 
> > cpu time and only print the warning if it's above some percentage
> > of a CPU. But that would be much more work.
> 
> Would you please elaborate more on that?

getrusage() can give you the system+user time of the current process.
If you compare that to wall time you know the percentage.

Could measure those occasionally (not every interval, but perhaps
once per second or so). If the overhead reaches a reasonable percentage (5%
perhaps?) print the warning once.

One problem is that the measurement doesn't include time in the remote
IPIs for reading performance counters on other CPUs.  So if the system
is very large it may be less and less accurate. But maybe it's a good
enough proxy.

Or in theory could fix the kernel to charge this somehow to the process
that triggered the IPIs, but that would be another project.

Another problem is that it doesn't account for burstiness. Maybe
the problem is not the smoothed average of CPU time, but bursts
competing with the original workload. There's probably no easy
solution for that.

Also if the CPU perf stat runs on is idle it of course doesn't matter.
Getting that would require reading /proc, which would be much more 
expensive so probably not a good idea. As a proxy you could check
the involuntary context switches (also reported by getrusage),
and if they don't cross some threshold then don't warn.

-Andi


* Re: [PATCH v1] perf stat: avoid 10ms limit for printing event counts
  2018-03-27 11:59     ` Andi Kleen
@ 2018-03-27 16:27       ` Alexey Budankov
  0 siblings, 0 replies; 6+ messages in thread
From: Alexey Budankov @ 2018-03-27 16:27 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-kernel

On 27.03.2018 14:59, Andi Kleen wrote:
> On Tue, Mar 27, 2018 at 02:40:29PM +0300, Alexey Budankov wrote:
>> On 27.03.2018 12:06, Andi Kleen wrote:
>>>> When running perf stat -I to monitor e.g. PCIe uncore counters while
>>>> at the same time profiling some I/O workload with perf record, e.g. for
>>>> cpu-cycles and context switches, it becomes possible to build and
>>>> observe a good-enough consolidated CPU/OS/IO(Uncore) performance
>>>> picture for that workload.
>>>
>>> At some point I still hope we can make uncore measurements in 
>>> perf record work. Kan tried at some point to allow multiple
>>> PMUs in a group, but was not successful. But perhaps we
>>> can sample them from a software event instead.
>>>
>>>>
>>>> The warning about possible runtime overhead is still preserved, but
>>>> it is only visible when the -v option is specified.
>>>
>>> I would print it unconditionally. Very few people use -v.

I thought it through some more. Printing the warning doesn't make sense
when output goes to the console, because the screen quickly scrolls it
away. If the interval is small you may even miss it entirely, regardless
of the -v option.

It turns out that the right place to mention the possible overhead is
the help message generated by perf stat -h.

Thanks,
Alexey

>>>
>>> BTW better of course would be to occasionally measure the perf stat 
>>> cpu time and only print the warning if it's above some percentage
>>> of a CPU. But that would be much more work.
>>
>> Would you please elaborate more on that?
> 
> getrusage() can give you the system+user time of the current process.
> If you compare that to wall time you know the percentage.
> 
> Could measure those occasionally (not every interval, but perhaps
> once per second or so). If the overhead reaches a reasonable percentage (5%
> perhaps?) print the warning once.
> 
> One problem is that the measurement doesn't include time in the remote
> IPIs for reading performance counters on other CPUs.  So if the system
> is very large it may be less and less accurate. But maybe it's a good
> enough proxy.
> 
> Or in theory could fix the kernel to charge this somehow to the process
> that triggered the IPIs, but that would be another project.
> 
> Another problem is that it doesn't account for burstiness. Maybe
> the problem is not the smoothed average of CPU time, but bursts
> competing with the original workload. There's probably no easy
> solution for that.
> 
> Also if the CPU perf stat runs on is idle it of course doesn't matter.
> Getting that would require reading /proc, which would be much more 
> expensive so probably not a good idea. As a proxy you could check
> the involuntary context switches (also reported by getrusage),
> and if they don't cross some threshold then don't warn.
> 
> -Andi
> 

