linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2] perf stat: refactor __run_perf_stat common code
@ 2022-07-29 16:12 Adrián Herrera Arcila
  2022-07-29 16:12 ` [PATCH 2/2] perf stat: fix unexpected delay behaviour Adrián Herrera Arcila
  2022-08-01 11:27 ` [PATCH 1/2] perf stat: refactor __run_perf_stat common code Leo Yan
  0 siblings, 2 replies; 12+ messages in thread
From: Adrián Herrera Arcila @ 2022-07-29 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, acme
  Cc: peterz, mingo, mark.rutland, alexander.shishkin, jolsa, namhyung,
	leo.yan, songliubraving, james.clark, Adrián Herrera Arcila

This extracts common code from the branches of the forks if-then-else.
enable_counters(), which was at the beginning of both branches of the
conditional, is now unconditional; evlist__start_workload is extracted
to a different if, which enables making the common clocking code
unconditional.

Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com>
---
 tools/perf/builtin-stat.c | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index d2ecd4d29624..318ffd119dad 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -966,18 +966,18 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 			return err;
 	}
 
-	/*
-	 * Enable counters and exec the command:
-	 */
-	if (forks) {
-		err = enable_counters();
-		if (err)
-			return -1;
+	err = enable_counters();
+	if (err)
+		return -1;
+
+	/* Exec the command, if any */
+	if (forks)
 		evlist__start_workload(evsel_list);
 
-		t0 = rdclock();
-		clock_gettime(CLOCK_MONOTONIC, &ref_time);
+	t0 = rdclock();
+	clock_gettime(CLOCK_MONOTONIC, &ref_time);
 
+	if (forks) {
 		if (interval || timeout || evlist__ctlfd_initialized(evsel_list))
 			status = dispatch_events(forks, timeout, interval, &times);
 		if (child_pid != -1) {
@@ -995,13 +995,6 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 		if (WIFSIGNALED(status))
 			psignal(WTERMSIG(status), argv[0]);
 	} else {
-		err = enable_counters();
-		if (err)
-			return -1;
-
-		t0 = rdclock();
-		clock_gettime(CLOCK_MONOTONIC, &ref_time);
-
 		status = dispatch_events(forks, timeout, interval, &times);
 	}
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] perf stat: fix unexpected delay behaviour
  2022-07-29 16:12 [PATCH 1/2] perf stat: refactor __run_perf_stat common code Adrián Herrera Arcila
@ 2022-07-29 16:12 ` Adrián Herrera Arcila
  2022-08-01  8:20   ` James Clark
  2022-08-01 11:45   ` Leo Yan
  2022-08-01 11:27 ` [PATCH 1/2] perf stat: refactor __run_perf_stat common code Leo Yan
  1 sibling, 2 replies; 12+ messages in thread
From: Adrián Herrera Arcila @ 2022-07-29 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, acme
  Cc: peterz, mingo, mark.rutland, alexander.shishkin, jolsa, namhyung,
	leo.yan, songliubraving, james.clark, Adrián Herrera Arcila

The described --delay behaviour is to delay the enablement of events, but
not the execution of the command, if one is passed, which is incorrectly
the current behaviour.

This patch decouples the enablement from the delay, and enables events
before or after launching the workload dependent on the options passed
by the user. This code structure is inspired by that in perf-record, and
tries to be consistent with it.

Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com
Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters")
Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com>
---
 tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++-----------------
 1 file changed, 32 insertions(+), 24 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 318ffd119dad..f98c823b16dd 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -559,7 +559,7 @@ static bool handle_interval(unsigned int interval, int *times)
 	return false;
 }
 
-static int enable_counters(void)
+static int enable_bpf_counters(void)
 {
 	struct evsel *evsel;
 	int err;
@@ -572,28 +572,6 @@ static int enable_counters(void)
 		if (err)
 			return err;
 	}
-
-	if (stat_config.initial_delay < 0) {
-		pr_info(EVLIST_DISABLED_MSG);
-		return 0;
-	}
-
-	if (stat_config.initial_delay > 0) {
-		pr_info(EVLIST_DISABLED_MSG);
-		usleep(stat_config.initial_delay * USEC_PER_MSEC);
-	}
-
-	/*
-	 * We need to enable counters only if:
-	 * - we don't have tracee (attaching to task or cpu)
-	 * - we have initial delay configured
-	 */
-	if (!target__none(&target) || stat_config.initial_delay) {
-		if (!all_counters_use_bpf)
-			evlist__enable(evsel_list);
-		if (stat_config.initial_delay > 0)
-			pr_info(EVLIST_ENABLED_MSG);
-	}
 	return 0;
 }
 
@@ -966,10 +944,24 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 			return err;
 	}
 
-	err = enable_counters();
+	err = enable_bpf_counters();
 	if (err)
 		return -1;
 
+	/*
+	 * Enable events manually here if perf-stat is run:
+	 * 1. with a target (any of --all-cpus, --cpu, --pid or --tid)
+	 * 2. without measurement delay (no --delay)
+	 * 3. without all events associated to BPF
+	 *
+	 * This is because if run with a target, events are not enabled
+	 * on exec if a workload is passed, and because there is no delay
+	 * we ensure to enable them before the workload starts
+	 */
+	if (!target__none(&target) && !stat_config.initial_delay &&
+	    !all_counters_use_bpf)
+		evlist__enable(evsel_list);
+
 	/* Exec the command, if any */
 	if (forks)
 		evlist__start_workload(evsel_list);
@@ -977,6 +969,22 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	t0 = rdclock();
 	clock_gettime(CLOCK_MONOTONIC, &ref_time);
 
+	/*
+	 * If a measurement delay was specified, start it, and if positive,
+	 * enable events manually after. We respect the delay even if all
+	 * events are associated to BPF
+	 */
+	if (stat_config.initial_delay) {
+		/* At this point, events are guaranteed disabled */
+		pr_info(EVLIST_DISABLED_MSG);
+		if (stat_config.initial_delay > 0) {
+			usleep(stat_config.initial_delay * USEC_PER_MSEC);
+			if (!all_counters_use_bpf)
+				evlist__enable(evsel_list);
+			pr_info(EVLIST_ENABLED_MSG);
+		}
+	}
+
 	if (forks) {
 		if (interval || timeout || evlist__ctlfd_initialized(evsel_list))
 			status = dispatch_events(forks, timeout, interval, &times);
-- 
2.36.1



* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour
  2022-07-29 16:12 ` [PATCH 2/2] perf stat: fix unexpected delay behaviour Adrián Herrera Arcila
@ 2022-08-01  8:20   ` James Clark
  2022-12-13 14:44     ` Arnaldo Carvalho de Melo
  2022-08-01 11:45   ` Leo Yan
  1 sibling, 1 reply; 12+ messages in thread
From: James Clark @ 2022-08-01  8:20 UTC (permalink / raw)
  To: Adrián Herrera Arcila, linux-kernel, linux-perf-users, acme,
	leo.yan, songliubraving
  Cc: peterz, mingo, mark.rutland, alexander.shishkin, jolsa, namhyung



On 29/07/2022 17:12, Adrián Herrera Arcila wrote:
> The described --delay behaviour is to delay the enablement of events, but
> not the execution of the command, if one is passed, which is incorrectly
> the current behaviour.
> 
> This patch decouples the enablement from the delay, and enables events
> before or after launching the workload dependent on the options passed
> by the user. This code structure is inspired by that in perf-record, and
> tries to be consistent with it.
> 
> Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com
> Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters")
> Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com>
> ---
>  tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++-----------------
>  1 file changed, 32 insertions(+), 24 deletions(-)

Looks good to me. Fixes the counter delay issue and the code is pretty
similar to perf record now. Although I would wait for Leo's or Song's
comment as well because they were involved.

Reviewed-by: James Clark <james.clark@arm.com>

> 
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 318ffd119dad..f98c823b16dd 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -559,7 +559,7 @@ static bool handle_interval(unsigned int interval, int *times)
>  	return false;
>  }
>  
> -static int enable_counters(void)
> +static int enable_bpf_counters(void)
>  {
>  	struct evsel *evsel;
>  	int err;
> @@ -572,28 +572,6 @@ static int enable_counters(void)
>  		if (err)
>  			return err;
>  	}
> -
> -	if (stat_config.initial_delay < 0) {
> -		pr_info(EVLIST_DISABLED_MSG);
> -		return 0;
> -	}
> -
> -	if (stat_config.initial_delay > 0) {
> -		pr_info(EVLIST_DISABLED_MSG);
> -		usleep(stat_config.initial_delay * USEC_PER_MSEC);
> -	}
> -
> -	/*
> -	 * We need to enable counters only if:
> -	 * - we don't have tracee (attaching to task or cpu)
> -	 * - we have initial delay configured
> -	 */
> -	if (!target__none(&target) || stat_config.initial_delay) {
> -		if (!all_counters_use_bpf)
> -			evlist__enable(evsel_list);
> -		if (stat_config.initial_delay > 0)
> -			pr_info(EVLIST_ENABLED_MSG);
> -	}
>  	return 0;
>  }
>  
> @@ -966,10 +944,24 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>  			return err;
>  	}
>  
> -	err = enable_counters();
> +	err = enable_bpf_counters();
>  	if (err)
>  		return -1;
>  
> +	/*
> +	 * Enable events manually here if perf-stat is run:
> +	 * 1. with a target (any of --all-cpus, --cpu, --pid or --tid)
> +	 * 2. without measurement delay (no --delay)
> +	 * 3. without all events associated to BPF
> +	 *
> +	 * This is because if run with a target, events are not enabled
> +	 * on exec if a workload is passed, and because there is no delay
> +	 * we ensure to enable them before the workload starts
> +	 */
> +	if (!target__none(&target) && !stat_config.initial_delay &&
> +	    !all_counters_use_bpf)
> +		evlist__enable(evsel_list);
> +
>  	/* Exec the command, if any */
>  	if (forks)
>  		evlist__start_workload(evsel_list);
> @@ -977,6 +969,22 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>  	t0 = rdclock();
>  	clock_gettime(CLOCK_MONOTONIC, &ref_time);
>  
> +	/*
> +	 * If a measurement delay was specified, start it, and if positive,
> +	 * enable events manually after. We respect the delay even if all
> +	 * events are associated to BPF
> +	 */
> +	if (stat_config.initial_delay) {
> +		/* At this point, events are guaranteed disabled */
> +		pr_info(EVLIST_DISABLED_MSG);
> +		if (stat_config.initial_delay > 0) {
> +			usleep(stat_config.initial_delay * USEC_PER_MSEC);
> +			if (!all_counters_use_bpf)
> +				evlist__enable(evsel_list);
> +			pr_info(EVLIST_ENABLED_MSG);
> +		}
> +	}
> +
>  	if (forks) {
>  		if (interval || timeout || evlist__ctlfd_initialized(evsel_list))
>  			status = dispatch_events(forks, timeout, interval, &times);


* Re: [PATCH 1/2] perf stat: refactor __run_perf_stat common code
  2022-07-29 16:12 [PATCH 1/2] perf stat: refactor __run_perf_stat common code Adrián Herrera Arcila
  2022-07-29 16:12 ` [PATCH 2/2] perf stat: fix unexpected delay behaviour Adrián Herrera Arcila
@ 2022-08-01 11:27 ` Leo Yan
  1 sibling, 0 replies; 12+ messages in thread
From: Leo Yan @ 2022-08-01 11:27 UTC (permalink / raw)
  To: Adrián Herrera Arcila
  Cc: linux-kernel, linux-perf-users, acme, peterz, mingo,
	mark.rutland, alexander.shishkin, jolsa, namhyung,
	songliubraving, james.clark

On Fri, Jul 29, 2022 at 04:12:43PM +0000, Adrián Herrera Arcila wrote:
> This extracts common code from the branches of the forks if-then-else.
> enable_counters(), which was at the beginning of both branches of the
> conditional, is now unconditional; evlist__start_workload is extracted
> to a different if, which enables making the common clocking code
> unconditional.
> 
> Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com>

Reviewed-by: Leo Yan <leo.yan@linaro.org>


* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour
  2022-07-29 16:12 ` [PATCH 2/2] perf stat: fix unexpected delay behaviour Adrián Herrera Arcila
  2022-08-01  8:20   ` James Clark
@ 2022-08-01 11:45   ` Leo Yan
  1 sibling, 0 replies; 12+ messages in thread
From: Leo Yan @ 2022-08-01 11:45 UTC (permalink / raw)
  To: Adrián Herrera Arcila
  Cc: linux-kernel, linux-perf-users, acme, peterz, mingo,
	mark.rutland, alexander.shishkin, jolsa, namhyung,
	songliubraving, james.clark

On Fri, Jul 29, 2022 at 04:12:44PM +0000, Adrián Herrera Arcila wrote:
> The described --delay behaviour is to delay the enablement of events, but
> not the execution of the command, if one is passed, which is incorrectly
> the current behaviour.
> 
> This patch decouples the enablement from the delay, and enables events
> before or after launching the workload dependent on the options passed
> by the user. This code structure is inspired by that in perf-record, and
> tries to be consistent with it.
> 
> Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com
> Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters")
> Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com>

bpf_counter() does not respect the delay in this patch, but I think this
is intended to keep the same behaviour as before.  I would leave the
call to Song.

The patch LGTM and I tested with commands:

$ time ./perf stat --delay 2000 --quiet sleep 2
Events disabled
Events enabled

real	0m2.039s
user	0m0.007s
sys	0m0.016s

$ ./perf stat --delay 2000 --quiet echo "marking"
Events disabled
marking
Events enabled

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>

P.s. It took me a bit of time to work out how 'perf stat' sets the
enable_on_exec flag: it is set in create_perf_stat_counter(), which
enables the PMU events on exec when no delay is configured, so no PMU
tracing of the workload is lost.


* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour
  2022-08-01  8:20   ` James Clark
@ 2022-12-13 14:44     ` Arnaldo Carvalho de Melo
  2022-12-13 16:40       ` Namhyung Kim
  0 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-12-13 14:44 UTC (permalink / raw)
  To: James Clark
  Cc: Adrián Herrera Arcila, linux-kernel, linux-perf-users,
	leo.yan, songliubraving, peterz, mingo, mark.rutland,
	alexander.shishkin, jolsa, namhyung

Em Mon, Aug 01, 2022 at 09:20:37AM +0100, James Clark escreveu:
> 
> 
> On 29/07/2022 17:12, Adrián Herrera Arcila wrote:
> > The described --delay behaviour is to delay the enablement of events, but
> > not the execution of the command, if one is passed, which is incorrectly
> > the current behaviour.
> > 
> > This patch decouples the enablement from the delay, and enables events
> > before or after launching the workload dependent on the options passed
> > by the user. This code structure is inspired by that in perf-record, and
> > tries to be consistent with it.
> > 
> > Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com
> > Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters")
> > Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com>
> > ---
> >  tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++-----------------
> >  1 file changed, 32 insertions(+), 24 deletions(-)
> 
> Looks good to me. Fixes the counter delay issue and the code is pretty
> similar to perf record now. Although I would wait for Leo's or Song's
> comment as well because they were involved.

I think I didn't notice Leo's ack, it still applies, so I'm doing it
now.

- Arnaldo
 
> Reviewed-by: James Clark <james.clark@arm.com>
> 
> > 
> > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > index 318ffd119dad..f98c823b16dd 100644
> > --- a/tools/perf/builtin-stat.c
> > +++ b/tools/perf/builtin-stat.c
> > @@ -559,7 +559,7 @@ static bool handle_interval(unsigned int interval, int *times)
> >  	return false;
> >  }
> >  
> > -static int enable_counters(void)
> > +static int enable_bpf_counters(void)
> >  {
> >  	struct evsel *evsel;
> >  	int err;
> > @@ -572,28 +572,6 @@ static int enable_counters(void)
> >  		if (err)
> >  			return err;
> >  	}
> > -
> > -	if (stat_config.initial_delay < 0) {
> > -		pr_info(EVLIST_DISABLED_MSG);
> > -		return 0;
> > -	}
> > -
> > -	if (stat_config.initial_delay > 0) {
> > -		pr_info(EVLIST_DISABLED_MSG);
> > -		usleep(stat_config.initial_delay * USEC_PER_MSEC);
> > -	}
> > -
> > -	/*
> > -	 * We need to enable counters only if:
> > -	 * - we don't have tracee (attaching to task or cpu)
> > -	 * - we have initial delay configured
> > -	 */
> > -	if (!target__none(&target) || stat_config.initial_delay) {
> > -		if (!all_counters_use_bpf)
> > -			evlist__enable(evsel_list);
> > -		if (stat_config.initial_delay > 0)
> > -			pr_info(EVLIST_ENABLED_MSG);
> > -	}
> >  	return 0;
> >  }
> >  
> > @@ -966,10 +944,24 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
> >  			return err;
> >  	}
> >  
> > -	err = enable_counters();
> > +	err = enable_bpf_counters();
> >  	if (err)
> >  		return -1;
> >  
> > +	/*
> > +	 * Enable events manually here if perf-stat is run:
> > +	 * 1. with a target (any of --all-cpus, --cpu, --pid or --tid)
> > +	 * 2. without measurement delay (no --delay)
> > +	 * 3. without all events associated to BPF
> > +	 *
> > +	 * This is because if run with a target, events are not enabled
> > +	 * on exec if a workload is passed, and because there is no delay
> > +	 * we ensure to enable them before the workload starts
> > +	 */
> > +	if (!target__none(&target) && !stat_config.initial_delay &&
> > +	    !all_counters_use_bpf)
> > +		evlist__enable(evsel_list);
> > +
> >  	/* Exec the command, if any */
> >  	if (forks)
> >  		evlist__start_workload(evsel_list);
> > @@ -977,6 +969,22 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
> >  	t0 = rdclock();
> >  	clock_gettime(CLOCK_MONOTONIC, &ref_time);
> >  
> > +	/*
> > +	 * If a measurement delay was specified, start it, and if positive,
> > +	 * enable events manually after. We respect the delay even if all
> > +	 * events are associated to BPF
> > +	 */
> > +	if (stat_config.initial_delay) {
> > +		/* At this point, events are guaranteed disabled */
> > +		pr_info(EVLIST_DISABLED_MSG);
> > +		if (stat_config.initial_delay > 0) {
> > +			usleep(stat_config.initial_delay * USEC_PER_MSEC);
> > +			if (!all_counters_use_bpf)
> > +				evlist__enable(evsel_list);
> > +			pr_info(EVLIST_ENABLED_MSG);
> > +		}
> > +	}
> > +
> >  	if (forks) {
> >  		if (interval || timeout || evlist__ctlfd_initialized(evsel_list))
> >  			status = dispatch_events(forks, timeout, interval, &times);

-- 

- Arnaldo


* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour
  2022-12-13 14:44     ` Arnaldo Carvalho de Melo
@ 2022-12-13 16:40       ` Namhyung Kim
  2022-12-14 14:26         ` James Clark
  2022-12-14 14:41         ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 12+ messages in thread
From: Namhyung Kim @ 2022-12-13 16:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: James Clark, Adrián Herrera Arcila, linux-kernel,
	linux-perf-users, leo.yan, songliubraving, peterz, mingo,
	mark.rutland, alexander.shishkin, jolsa

Hi,

On Tue, Dec 13, 2022 at 6:44 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Mon, Aug 01, 2022 at 09:20:37AM +0100, James Clark escreveu:
> >
> >
> > On 29/07/2022 17:12, Adrián Herrera Arcila wrote:
> > > The described --delay behaviour is to delay the enablement of events, but
> > > not the execution of the command, if one is passed, which is incorrectly
> > > the current behaviour.
> > >
> > > This patch decouples the enablement from the delay, and enables events
> > > before or after launching the workload dependent on the options passed
> > > by the user. This code structure is inspired by that in perf-record, and
> > > tries to be consistent with it.
> > >
> > > Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com
> > > Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters")
> > > Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com>
> > > ---
> > >  tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++-----------------
> > >  1 file changed, 32 insertions(+), 24 deletions(-)
> >
> > Looks good to me. Fixes the counter delay issue and the code is pretty
> > similar to perf record now. Although I would wait for Leo's or Song's
> > comment as well because they were involved.
>
> I think I didn't notice Leo's ack, it still applies, so I'm doing it
> now.

I think the BPF counters should be enabled/disabled together.

Thanks,
Namhyung


* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour
  2022-12-13 16:40       ` Namhyung Kim
@ 2022-12-14 14:26         ` James Clark
  2022-12-14 14:41         ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 12+ messages in thread
From: James Clark @ 2022-12-14 14:26 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo
  Cc: Adrián Herrera Arcila, linux-kernel, linux-perf-users,
	leo.yan, songliubraving, peterz, mingo, mark.rutland,
	alexander.shishkin, jolsa



On 13/12/2022 16:40, Namhyung Kim wrote:
> Hi,
> 
> On Tue, Dec 13, 2022 at 6:44 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
>>
>> Em Mon, Aug 01, 2022 at 09:20:37AM +0100, James Clark escreveu:
>>>
>>>
>>> On 29/07/2022 17:12, Adrián Herrera Arcila wrote:
>>>> The described --delay behaviour is to delay the enablement of events, but
>>>> not the execution of the command, if one is passed, which is incorrectly
>>>> the current behaviour.
>>>>
>>>> This patch decouples the enablement from the delay, and enables events
>>>> before or after launching the workload dependent on the options passed
>>>> by the user. This code structure is inspired by that in perf-record, and
>>>> tries to be consistent with it.
>>>>
>>>> Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com
>>>> Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters")
>>>> Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com>
>>>> ---
>>>>  tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++-----------------
>>>>  1 file changed, 32 insertions(+), 24 deletions(-)
>>>
>>> Looks good to me. Fixes the counter delay issue and the code is pretty
>>> similar to perf record now. Although I would wait for Leo's or Song's
>>> comment as well because they were involved.
>>
>> I think I didn't notice Leo's ack, it still applies, so I'm doing it
>> now.
> 
> I think the BPF counters should be enabled/disabled together.

I did notice that difference between the two, but I wasn't sure of the
exact reason it was done that way in Adrián's version. It seems it's
not separated in perf record, so maybe you are right.

> 
> Thanks,
> Namhyung


* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour
  2022-12-13 16:40       ` Namhyung Kim
  2022-12-14 14:26         ` James Clark
@ 2022-12-14 14:41         ` Arnaldo Carvalho de Melo
  2022-12-14 15:57           ` Leo Yan
  1 sibling, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-12-14 14:41 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: James Clark, Adrián Herrera Arcila, linux-kernel,
	linux-perf-users, leo.yan, songliubraving, peterz, mingo,
	mark.rutland, alexander.shishkin, jolsa

Em Tue, Dec 13, 2022 at 08:40:31AM -0800, Namhyung Kim escreveu:
> Hi,
> 
> On Tue, Dec 13, 2022 at 6:44 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > Em Mon, Aug 01, 2022 at 09:20:37AM +0100, James Clark escreveu:
> > >
> > >
> > > On 29/07/2022 17:12, Adrián Herrera Arcila wrote:
> > > > The described --delay behaviour is to delay the enablement of events, but
> > > > not the execution of the command, if one is passed, which is incorrectly
> > > > the current behaviour.
> > > >
> > > > This patch decouples the enablement from the delay, and enables events
> > > > before or after launching the workload dependent on the options passed
> > > > by the user. This code structure is inspired by that in perf-record, and
> > > > tries to be consistent with it.
> > > >
> > > > Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com
> > > > Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters")
> > > > Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com>
> > > > ---
> > > >  tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++-----------------
> > > >  1 file changed, 32 insertions(+), 24 deletions(-)
> > >
> > > Looks good to me. Fixes the counter delay issue and the code is pretty
> > > similar to perf record now. Although I would wait for Leo's or Song's
> > > comment as well because they were involved.
> >
> > I think I didn't notice Leo's ack, it still applies, so I'm doing it
> > now.
> 
> I think the BPF counters should be enabled/disabled together.

Ok, so I removed this one and applied Namhyung's.

- Arnaldo


* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour
  2022-12-14 14:41         ` Arnaldo Carvalho de Melo
@ 2022-12-14 15:57           ` Leo Yan
  2022-12-14 17:35             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 12+ messages in thread
From: Leo Yan @ 2022-12-14 15:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, James Clark, Adrián Herrera Arcila,
	linux-kernel, linux-perf-users, songliubraving, peterz, mingo,
	mark.rutland, alexander.shishkin, jolsa

On Wed, Dec 14, 2022 at 11:41:55AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Dec 13, 2022 at 08:40:31AM -0800, Namhyung Kim escreveu:
> > Hi,
> > 
> > On Tue, Dec 13, 2022 at 6:44 AM Arnaldo Carvalho de Melo
> > <acme@kernel.org> wrote:
> > >
> > > Em Mon, Aug 01, 2022 at 09:20:37AM +0100, James Clark escreveu:
> > > >
> > > >
> > > > On 29/07/2022 17:12, Adrián Herrera Arcila wrote:
> > > > > The described --delay behaviour is to delay the enablement of events, but
> > > > > not the execution of the command, if one is passed, which is incorrectly
> > > > > the current behaviour.
> > > > >
> > > > > This patch decouples the enablement from the delay, and enables events
> > > > > before or after launching the workload dependent on the options passed
> > > > > by the user. This code structure is inspired by that in perf-record, and
> > > > > tries to be consistent with it.
> > > > >
> > > > > Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com
> > > > > Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters")
> > > > > Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com>
> > > > > ---
> > > > >  tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++-----------------
> > > > >  1 file changed, 32 insertions(+), 24 deletions(-)
> > > >
> > > > Looks good to me. Fixes the counter delay issue and the code is pretty
> > > > similar to perf record now. Although I would wait for Leo's or Song's
> > > > comment as well because they were involved.
> > >
> > > I think I didn't notice Leo's ack, it still applies, so I'm doing it
> > > now.
> > 
> > I think the BPF counters should be enabled/disabled together.
> 
> Ok, so I removed this one and applied Namhyung's.

I can guess why Adrián doesn't enable/disable BPF counters together :)

Since 'perf stat' doesn't enable BPF counters together with the other
normal PMU events in the first place, I believe this is deliberate,
going back to Song's patch fa853c4b839e ("perf stat: Enable counting
events for BPF programs"), which says:

"'perf stat -b' creates per-cpu perf_event and loads fentry/fexit BPF
programs (monitor-progs) to the target BPF program (target-prog). The
monitor-progs read perf_event before and after the target-prog, and
aggregate the difference in a BPF map. Then the user space reads data
from these maps".

IIUC, when loading the eBPF (counter) program, the perf tool needs to
handle the eBPF program's map specially (so that perf can see the
latest contents of the program's map in the kernel).

I don't know much about eBPF counters, which is why I am still a bit
puzzled about which way is right (binding vs. separating eBPF
counters).  But I personally prefer to let eBPF counters respect the
delay, so applying Namhyung's patch is fine with me.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour
  2022-12-14 15:57           ` Leo Yan
@ 2022-12-14 17:35             ` Arnaldo Carvalho de Melo
  2022-12-15  1:45               ` Leo Yan
  0 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-12-14 17:35 UTC (permalink / raw)
  To: Leo Yan
  Cc: Namhyung Kim, James Clark, Adrián Herrera Arcila,
	linux-kernel, linux-perf-users, songliubraving, peterz, mingo,
	mark.rutland, alexander.shishkin, jolsa

Em Wed, Dec 14, 2022 at 11:57:21PM +0800, Leo Yan escreveu:
> On Wed, Dec 14, 2022 at 11:41:55AM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Tue, Dec 13, 2022 at 08:40:31AM -0800, Namhyung Kim escreveu:
> > > Hi,
> > > 
> > > On Tue, Dec 13, 2022 at 6:44 AM Arnaldo Carvalho de Melo
> > > <acme@kernel.org> wrote:
> > > >
> > > > Em Mon, Aug 01, 2022 at 09:20:37AM +0100, James Clark escreveu:
> > > > >
> > > > >
> > > > > On 29/07/2022 17:12, Adrián Herrera Arcila wrote:
> > > > > > The described --delay behaviour is to delay the enablement of events, but
> > > > > > not the execution of the command, if one is passed, which is incorrectly
> > > > > > the current behaviour.
> > > > > >
> > > > > > This patch decouples the enablement from the delay, and enables events
> > > > > > before or after launching the workload dependent on the options passed
> > > > > > by the user. This code structure is inspired by that in perf-record, and
> > > > > > tries to be consistent with it.
> > > > > >
> > > > > > Link: https://lore.kernel.org/linux-perf-users/7BFD066E-B0A8-49D4-B635-379328F0CF4C@fb.com
> > > > > > Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters")
> > > > > > Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com>
> > > > > > ---
> > > > > >  tools/perf/builtin-stat.c | 56 ++++++++++++++++++++++-----------------
> > > > > >  1 file changed, 32 insertions(+), 24 deletions(-)
> > > > >
> > > > > Looks good to me. Fixes the counter delay issue and the code is pretty
> > > > > similar to perf record now. Although I would wait for Leo's or Song's
> > > > > comment as well because they were involved.
> > > >
> > > > I think I didn't notice Leo's ack, it still applies, so I'm doing it
> > > > now.
> > > 
> > > I think the BPF counters should be enabled/disabled together.
> > 
> > Ok, so I removed this one and applied Namhyung's.
> 
> I can guess why Adrián doesn't enable/disable BPF counters together :)
> 
> Since 'perf stat' doesn't enable BPF counters with other normal PMU
> events in the first place, I believe this is deliberately by Song's
> patch fa853c4b839e ("perf stat: Enable counting events for BPF
> programs"), it says:
> 
> "'perf stat -b' creates per-cpu perf_event and loads fentry/fexit BPF
> programs (monitor-progs) to the target BPF program (target-prog). The
> monitor-progs read perf_event before and after the target-prog, and
> aggregate the difference in a BPF map. Then the user space reads data
> from these maps".
> 
> IIUC, when loading eBPF (counter) program, perf tool needs to handle
> eBPF program map specially (so that perf tool can know the latest eBPF
> program's map in kernel).
> 
> I don't know anything for eBPF counter, so this is why I am still a bit
> puzzle which way is right to do (bind vs separate eBPF counters).  But
> I personally prefer to let eBPF counter to respect delay, so it's fine
> for me to apply Namhyung's patch.

"I'm fine" can be read as an Acked-by, right? :-)

- Arnaldo


* Re: [PATCH 2/2] perf stat: fix unexpected delay behaviour
  2022-12-14 17:35             ` Arnaldo Carvalho de Melo
@ 2022-12-15  1:45               ` Leo Yan
  0 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2022-12-15  1:45 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, James Clark, Adrián Herrera Arcila,
	linux-kernel, linux-perf-users, songliubraving, peterz, mingo,
	mark.rutland, alexander.shishkin, jolsa

On Wed, Dec 14, 2022 at 02:35:01PM -0300, Arnaldo Carvalho de Melo wrote:

[...]

> > I don't know anything for eBPF counter, so this is why I am still a bit
> > puzzle which way is right to do (bind vs separate eBPF counters).  But
> > I personally prefer to let eBPF counter to respect delay, so it's fine
> > for me to apply Namhyung's patch.
> 
> "I'm fine" can be read as an Acked-by, right? :-)

Yes, have sent my Reviewed tag on Namhyung's patch.

Thanks,
Leo
