All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] perf sched: kill time stamp discrepancy between script and latency
@ 2016-09-29  2:25 Joonwoo Park
  2016-09-29  2:29 ` Joonwoo Park
  2016-10-01  5:15 ` Namhyung Kim
  0 siblings, 2 replies; 6+ messages in thread
From: Joonwoo Park @ 2016-09-29  2:25 UTC (permalink / raw)
  To: Peter Zijlstra, Arnaldo Carvalho de Melo
  Cc: Joonwoo Park, Ingo Molnar, Alexander Shishkin, Steven Rostedt,
	Namhyung Kim, linux-kernel

Perf sched latency is handy to find out the maximum sched latency and
the time stamp of the event.  After running sched latency, if a found
latency looks suspicious it's quite reasonable to run perf script
subsequently and search with the time stamp given by perf sched latency
to continue further debugging.  However, at present, it's possible the
time stamp given by perf sched latency cannot be found in the trace
output by perf script because perf sched latency converts the event
time from ns to ms as double float and prints it with printf which
does banker's rounding as opposed to perf script doesn't.

  For example:

   0x750ff0 [0x80]: event: 9
   <snip>
   2 1858303049520 0x750ff0 [0x80]: PERF_RECORD_SAMPLE(IP, 0x1): 15281/15281: 0xffffffff8162a63a period: 1 addr: 0
    ... thread: hackbench:15281
   <snip>

    $ perf sched -i perf.data latency | grep hackbench
      hackbench:(401)       +   3539.283 ms |    23347 | avg:    7.286 ms | max:  829.998 ms | max at:   1858.303050 s

    $ perf script -i perf.data | grep "1858\.303050"

    $ perf script -i perf.data | grep "1858\.303049"
      hackbench 15281 [002]  1858.303049:       sched:sched_switch: prev_comm=hackbench prev_pid=15281 prev_prio=120 prev_state=D ==> next_comm=hackbench next_pid=15603 next_prio=120

Fix perf latency to print out time stamp without rounding to avoid such
discrepancy.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
---

I was tempted to get rid of all u64 to double casting in the function
output_lat_thread but didn't because there is no data loss as of
today.  Double float gives at least 15 significant decimal digits
precision while the function requires only 14 significant digits precision.

$ python -c "print(len(str(int(0xffffffffffffffff / 1e6))))"
14

 tools/lib/traceevent/event-parse.h |  1 +
 tools/perf/builtin-sched.c         | 12 ++++++++++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 9ffde37..f42703c 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -174,6 +174,7 @@ struct pevent_plugin_option {
 
 #define NSECS_PER_SEC		1000000000ULL
 #define NSECS_PER_USEC		1000ULL
+#define MSECS_PER_SEC		1000ULL
 
 enum format_flags {
 	FIELD_IS_ARRAY		= 1,
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index afa0576..e5cf51a 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -1190,6 +1190,7 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
 	int i;
 	int ret;
 	u64 avg;
+	u64 max_lat_at_sec, max_lat_at_msec;
 
 	if (!work_list->nb_atoms)
 		return;
@@ -1212,11 +1213,18 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
 
 	avg = work_list->total_lat / work_list->nb_atoms;
 
-	printf("|%11.3f ms |%9" PRIu64 " | avg:%9.3f ms | max:%9.3f ms | max at: %13.6f s\n",
+	/*
+	 * Avoid round up with printf to prevent event time discrepency
+	 * between sched script and latency.
+	 */
+	max_lat_at_sec = work_list->max_lat_at / NSECS_PER_SEC;
+	max_lat_at_msec = (work_list->max_lat_at -
+			   max_lat_at_sec * NSECS_PER_SEC) / MSECS_PER_SEC;
+	printf("+%11.3f ms |%9" PRIu64 " | avg:%9.3f ms | max:%9.3f ms | max at: %6lu.%06lu s\n",
 	      (double)work_list->total_runtime / 1e6,
 		 work_list->nb_atoms, (double)avg / 1e6,
 		 (double)work_list->max_lat / 1e6,
-		 (double)work_list->max_lat_at / 1e9);
+		 max_lat_at_sec, max_lat_at_msec);
 }
 
 static int pid_cmp(struct work_atoms *l, struct work_atoms *r)
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] perf sched: kill time stamp discrepancy between script and latency
  2016-09-29  2:25 [PATCH] perf sched: kill time stamp discrepancy between script and latency Joonwoo Park
@ 2016-09-29  2:29 ` Joonwoo Park
  2016-10-01  5:15 ` Namhyung Kim
  1 sibling, 0 replies; 6+ messages in thread
From: Joonwoo Park @ 2016-09-29  2:29 UTC (permalink / raw)
  To: Peter Zijlstra, Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Alexander Shishkin, Steven Rostedt, Namhyung Kim,
	linux-kernel



On 09/28/2016 07:25 PM, Joonwoo Park wrote:
> Perf sched latency is handy to find out the maximum sched latency and
> the time stamp of the event.  After running sched latency, if a found
> latency looks suspicious it's quite reasonable to run perf script
> subsequently and search with the time stamp given by perf sched latency
> to continue further debugging.  However, at present, it's possible the
> time stamp given by perf sched latency cannot be found in the trace
> output by perf script because perf sched latency converts the event
> time from ns to ms as double float and prints it with printf which
> does banker's rounding as opposed to perf script doesn't.
>
>   For example:
>
>    0x750ff0 [0x80]: event: 9
>    <snip>
>    2 1858303049520 0x750ff0 [0x80]: PERF_RECORD_SAMPLE(IP, 0x1): 15281/15281: 0xffffffff8162a63a period: 1 addr: 0
>     ... thread: hackbench:15281
>    <snip>
>
>     $ perf sched -i perf.data latency | grep hackbench
>       hackbench:(401)       +   3539.283 ms |    23347 | avg:    7.286 ms | max:  829.998 ms | max at:   1858.303050 s
>
>     $ perf script -i perf.data | grep "1858\.303050"
>
>     $ perf script -i perf.data | grep "1858\.303049"
>       hackbench 15281 [002]  1858.303049:       sched:sched_switch: prev_comm=hackbench prev_pid=15281 prev_prio=120 prev_state=D ==> next_comm=hackbench next_pid=15603 next_prio=120
>
> Fix perf latency to print out time stamp without rounding to avoid such
> discrepancy.
>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
> ---
>
> I was tempted to get rid of all u64 to double casting in the function
> output_lat_thread but didn't because there is no data loss as of
> today.  Double float gives at least 15 significant decimal digits
> precision while the function requires only 14 significant digits precision.
>
> $ python -c "print(len(str(int(0xffffffffffffffff / 1e6))))"
> 14
>
>  tools/lib/traceevent/event-parse.h |  1 +
>  tools/perf/builtin-sched.c         | 12 ++++++++++--
>  2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
> index 9ffde37..f42703c 100644
> --- a/tools/lib/traceevent/event-parse.h
> +++ b/tools/lib/traceevent/event-parse.h
> @@ -174,6 +174,7 @@ struct pevent_plugin_option {
>
>  #define NSECS_PER_SEC		1000000000ULL
>  #define NSECS_PER_USEC		1000ULL
> +#define MSECS_PER_SEC		1000ULL
>
>  enum format_flags {
>  	FIELD_IS_ARRAY		= 1,
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index afa0576..e5cf51a 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -1190,6 +1190,7 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
>  	int i;
>  	int ret;
>  	u64 avg;
> +	u64 max_lat_at_sec, max_lat_at_msec;
>
>  	if (!work_list->nb_atoms)
>  		return;
> @@ -1212,11 +1213,18 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
>
>  	avg = work_list->total_lat / work_list->nb_atoms;
>
> -	printf("|%11.3f ms |%9" PRIu64 " | avg:%9.3f ms | max:%9.3f ms | max at: %13.6f s\n",
> +	/*
> +	 * Avoid round up with printf to prevent event time discrepency

s/discrepency/discrepancy/
I will fix this typo after gathering review comments.

Thanks,
Joonwoo

> +	 * between sched script and latency.
> +	 */
> +	max_lat_at_sec = work_list->max_lat_at / NSECS_PER_SEC;
> +	max_lat_at_msec = (work_list->max_lat_at -
> +			   max_lat_at_sec * NSECS_PER_SEC) / MSECS_PER_SEC;
> +	printf("+%11.3f ms |%9" PRIu64 " | avg:%9.3f ms | max:%9.3f ms | max at: %6lu.%06lu s\n",
>  	      (double)work_list->total_runtime / 1e6,
>  		 work_list->nb_atoms, (double)avg / 1e6,
>  		 (double)work_list->max_lat / 1e6,
> -		 (double)work_list->max_lat_at / 1e9);
> +		 max_lat_at_sec, max_lat_at_msec);
>  }
>
>  static int pid_cmp(struct work_atoms *l, struct work_atoms *r)
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] perf sched: kill time stamp discrepancy between script and latency
  2016-09-29  2:25 [PATCH] perf sched: kill time stamp discrepancy between script and latency Joonwoo Park
  2016-09-29  2:29 ` Joonwoo Park
@ 2016-10-01  5:15 ` Namhyung Kim
  2016-10-03 22:04   ` Joonwoo Park
  1 sibling, 1 reply; 6+ messages in thread
From: Namhyung Kim @ 2016-10-01  5:15 UTC (permalink / raw)
  To: Joonwoo Park
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	Alexander Shishkin, Steven Rostedt, linux-kernel

Hi Joonwoo,

On Wed, Sep 28, 2016 at 07:25:26PM -0700, Joonwoo Park wrote:
> Perf sched latency is handy to find out the maximum sched latency and
> the time stamp of the event.  After running sched latency, if a found
> latency looks suspicious it's quite reasonable to run perf script
> subsequently and search with the time stamp given by perf sched latency
> to continue further debugging.  However, at present, it's possible the
> time stamp given by perf sched latency cannot be found in the trace
> output by perf script because perf sched latency converts the event
> time from ns to ms as double float and prints it with printf which
> does banker's rounding as opposed to perf script doesn't.
> 
>   For example:
> 
>    0x750ff0 [0x80]: event: 9
>    <snip>
>    2 1858303049520 0x750ff0 [0x80]: PERF_RECORD_SAMPLE(IP, 0x1): 15281/15281: 0xffffffff8162a63a period: 1 addr: 0
>     ... thread: hackbench:15281
>    <snip>
> 
>     $ perf sched -i perf.data latency | grep hackbench
>       hackbench:(401)       +   3539.283 ms |    23347 | avg:    7.286 ms | max:  829.998 ms | max at:   1858.303050 s
> 
>     $ perf script -i perf.data | grep "1858\.303050"
> 
>     $ perf script -i perf.data | grep "1858\.303049"
>       hackbench 15281 [002]  1858.303049:       sched:sched_switch: prev_comm=hackbench prev_pid=15281 prev_prio=120 prev_state=D ==> next_comm=hackbench next_pid=15603 next_prio=120
> 
> Fix perf latency to print out time stamp without rounding to avoid such
> discrepancy.
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
> ---
> 
> I was tempted to get rid of all u64 to double casting in the function
> output_lat_thread but didn't because there is no data loss as of
> today.  Double float gives at least 15 significant decimal digits
> precision while the function requires only 14 significant digits precision.
> 
> $ python -c "print(len(str(int(0xffffffffffffffff / 1e6))))"
> 14
> 
>  tools/lib/traceevent/event-parse.h |  1 +
>  tools/perf/builtin-sched.c         | 12 ++++++++++--
>  2 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
> index 9ffde37..f42703c 100644
> --- a/tools/lib/traceevent/event-parse.h
> +++ b/tools/lib/traceevent/event-parse.h
> @@ -174,6 +174,7 @@ struct pevent_plugin_option {
>  
>  #define NSECS_PER_SEC		1000000000ULL
>  #define NSECS_PER_USEC		1000ULL
> +#define MSECS_PER_SEC		1000ULL
>  
>  enum format_flags {
>  	FIELD_IS_ARRAY		= 1,
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index afa0576..e5cf51a 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -1190,6 +1190,7 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
>  	int i;
>  	int ret;
>  	u64 avg;
> +	u64 max_lat_at_sec, max_lat_at_msec;

Isn't it usec rathen than msec? :)

>  
>  	if (!work_list->nb_atoms)
>  		return;
> @@ -1212,11 +1213,18 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
>  
>  	avg = work_list->total_lat / work_list->nb_atoms;
>  
> -	printf("|%11.3f ms |%9" PRIu64 " | avg:%9.3f ms | max:%9.3f ms | max at: %13.6f s\n",
> +	/*
> +	 * Avoid round up with printf to prevent event time discrepency
> +	 * between sched script and latency.
> +	 */
> +	max_lat_at_sec = work_list->max_lat_at / NSECS_PER_SEC;
> +	max_lat_at_msec = (work_list->max_lat_at -
> +			   max_lat_at_sec * NSECS_PER_SEC) / MSECS_PER_SEC;
> +	printf("+%11.3f ms |%9" PRIu64 " | avg:%9.3f ms | max:%9.3f ms | max at: %6lu.%06lu s\n",

Maybe you'd better to be in sync with the script code:

	if (PRINT_FIELD(TIME)) {
		nsecs = sample->time;
		secs = nsecs / NSECS_PER_SEC;
		nsecs -= secs * NSECS_PER_SEC;
		usecs = nsecs / NSECS_PER_USEC;
		if (nanosecs)
			printf("%5lu.%09llu: ", secs, nsecs);
		else
			printf("%5lu.%06lu: ", secs, usecs);
	}

Thanks,
Namhyung


>  	      (double)work_list->total_runtime / 1e6,
>  		 work_list->nb_atoms, (double)avg / 1e6,
>  		 (double)work_list->max_lat / 1e6,
> -		 (double)work_list->max_lat_at / 1e9);
> +		 max_lat_at_sec, max_lat_at_msec);
>  }
>  
>  static int pid_cmp(struct work_atoms *l, struct work_atoms *r)
> -- 
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by The Linux Foundation
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] perf sched: kill time stamp discrepancy between script and latency
  2016-10-01  5:15 ` Namhyung Kim
@ 2016-10-03 22:04   ` Joonwoo Park
  2016-10-04 12:04     ` Namhyung Kim
  0 siblings, 1 reply; 6+ messages in thread
From: Joonwoo Park @ 2016-10-03 22:04 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	Alexander Shishkin, Steven Rostedt, linux-kernel



On 09/30/2016 10:15 PM, Namhyung Kim wrote:
> Hi Joonwoo,
>
> On Wed, Sep 28, 2016 at 07:25:26PM -0700, Joonwoo Park wrote:
>> Perf sched latency is handy to find out the maximum sched latency and
>> the time stamp of the event.  After running sched latency, if a found
>> latency looks suspicious it's quite reasonable to run perf script
>> subsequently and search with the time stamp given by perf sched latency
>> to continue further debugging.  However, at present, it's possible the
>> time stamp given by perf sched latency cannot be found in the trace
>> output by perf script because perf sched latency converts the event
>> time from ns to ms as double float and prints it with printf which
>> does banker's rounding as opposed to perf script doesn't.
>>
>>   For example:
>>
>>    0x750ff0 [0x80]: event: 9
>>    <snip>
>>    2 1858303049520 0x750ff0 [0x80]: PERF_RECORD_SAMPLE(IP, 0x1): 15281/15281: 0xffffffff8162a63a period: 1 addr: 0
>>     ... thread: hackbench:15281
>>    <snip>
>>
>>     $ perf sched -i perf.data latency | grep hackbench
>>       hackbench:(401)       +   3539.283 ms |    23347 | avg:    7.286 ms | max:  829.998 ms | max at:   1858.303050 s
>>
>>     $ perf script -i perf.data | grep "1858\.303050"
>>
>>     $ perf script -i perf.data | grep "1858\.303049"
>>       hackbench 15281 [002]  1858.303049:       sched:sched_switch: prev_comm=hackbench prev_pid=15281 prev_prio=120 prev_state=D ==> next_comm=hackbench next_pid=15603 next_prio=120
>>
>> Fix perf latency to print out time stamp without rounding to avoid such
>> discrepancy.
>>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>> Cc: Steven Rostedt <rostedt@goodmis.org>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
>> ---
>>
>> I was tempted to get rid of all u64 to double casting in the function
>> output_lat_thread but didn't because there is no data loss as of
>> today.  Double float gives at least 15 significant decimal digits
>> precision while the function requires only 14 significant digits precision.
>>
>> $ python -c "print(len(str(int(0xffffffffffffffff / 1e6))))"
>> 14
>>
>>  tools/lib/traceevent/event-parse.h |  1 +
>>  tools/perf/builtin-sched.c         | 12 ++++++++++--
>>  2 files changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
>> index 9ffde37..f42703c 100644
>> --- a/tools/lib/traceevent/event-parse.h
>> +++ b/tools/lib/traceevent/event-parse.h
>> @@ -174,6 +174,7 @@ struct pevent_plugin_option {
>>
>>  #define NSECS_PER_SEC		1000000000ULL
>>  #define NSECS_PER_USEC		1000ULL
>> +#define MSECS_PER_SEC		1000ULL
>>
>>  enum format_flags {
>>  	FIELD_IS_ARRAY		= 1,
>> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
>> index afa0576..e5cf51a 100644
>> --- a/tools/perf/builtin-sched.c
>> +++ b/tools/perf/builtin-sched.c
>> @@ -1190,6 +1190,7 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
>>  	int i;
>>  	int ret;
>>  	u64 avg;
>> +	u64 max_lat_at_sec, max_lat_at_msec;
>
> Isn't it usec rathen than msec? :)

It's to contain three decimal digits which are msecs when 'max_lat_at' 
is expressed in sec.
For example when max_lat_at = 1858303049520 which is 1858.3030495199998 
sec, max_lat_at_msec is meant to be 303.

I must admit variable's name is bit misleading.  Maybe just secs, msecs 
are better?
Also just noticed u64 isn't needed for msecs.  Will size down.

>
>>
>>  	if (!work_list->nb_atoms)
>>  		return;
>> @@ -1212,11 +1213,18 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
>>
>>  	avg = work_list->total_lat / work_list->nb_atoms;
>>
>> -	printf("|%11.3f ms |%9" PRIu64 " | avg:%9.3f ms | max:%9.3f ms | max at: %13.6f s\n",
>> +	/*
>> +	 * Avoid round up with printf to prevent event time discrepency
>> +	 * between sched script and latency.
>> +	 */
>> +	max_lat_at_sec = work_list->max_lat_at / NSECS_PER_SEC;
>> +	max_lat_at_msec = (work_list->max_lat_at -
>> +			   max_lat_at_sec * NSECS_PER_SEC) / MSECS_PER_SEC;
>> +	printf("+%11.3f ms |%9" PRIu64 " | avg:%9.3f ms | max:%9.3f ms | max at: %6lu.%06lu s\n",
>
> Maybe you'd better to be in sync with the script code:
>
> 	if (PRINT_FIELD(TIME)) {
> 		nsecs = sample->time;
> 		secs = nsecs / NSECS_PER_SEC;
> 		nsecs -= secs * NSECS_PER_SEC;
> 		usecs = nsecs / NSECS_PER_USEC;
> 		if (nanosecs)
> 			printf("%5lu.%09llu: ", secs, nsecs);
> 		else
> 			printf("%5lu.%06lu: ", secs, usecs);
> 	}

Apart from variable name, I'm not quite sure what to sync because sched 
doesn't print in nsecs.
Maybe you just wanted for variable names in sync rather than logic?

Thanks!
Joonwoo

>
> Thanks,
> Namhyung
>
>
>>  	      (double)work_list->total_runtime / 1e6,
>>  		 work_list->nb_atoms, (double)avg / 1e6,
>>  		 (double)work_list->max_lat / 1e6,
>> -		 (double)work_list->max_lat_at / 1e9);
>> +		 max_lat_at_sec, max_lat_at_msec);
>>  }
>>
>>  static int pid_cmp(struct work_atoms *l, struct work_atoms *r)
>> --
>> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
>> hosted by The Linux Foundation
>>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] perf sched: kill time stamp discrepancy between script and latency
  2016-10-03 22:04   ` Joonwoo Park
@ 2016-10-04 12:04     ` Namhyung Kim
  2016-10-04 17:11       ` Joonwoo Park
  0 siblings, 1 reply; 6+ messages in thread
From: Namhyung Kim @ 2016-10-04 12:04 UTC (permalink / raw)
  To: Joonwoo Park
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	Alexander Shishkin, Steven Rostedt, linux-kernel

On Mon, Oct 03, 2016 at 03:04:48PM -0700, Joonwoo Park wrote:
> 
> 
> On 09/30/2016 10:15 PM, Namhyung Kim wrote:
> > Hi Joonwoo,
> > 
> > On Wed, Sep 28, 2016 at 07:25:26PM -0700, Joonwoo Park wrote:
> > > Perf sched latency is handy to find out the maximum sched latency and
> > > the time stamp of the event.  After running sched latency, if a found
> > > latency looks suspicious it's quite reasonable to run perf script
> > > subsequently and search with the time stamp given by perf sched latency
> > > to continue further debugging.  However, at present, it's possible the
> > > time stamp given by perf sched latency cannot be found in the trace
> > > output by perf script because perf sched latency converts the event
> > > time from ns to ms as double float and prints it with printf which
> > > does banker's rounding as opposed to perf script doesn't.
> > > 
> > >   For example:
> > > 
> > >    0x750ff0 [0x80]: event: 9
> > >    <snip>
> > >    2 1858303049520 0x750ff0 [0x80]: PERF_RECORD_SAMPLE(IP, 0x1): 15281/15281: 0xffffffff8162a63a period: 1 addr: 0
> > >     ... thread: hackbench:15281
> > >    <snip>
> > > 
> > >     $ perf sched -i perf.data latency | grep hackbench
> > >       hackbench:(401)       +   3539.283 ms |    23347 | avg:    7.286 ms | max:  829.998 ms | max at:   1858.303050 s
> > > 
> > >     $ perf script -i perf.data | grep "1858\.303050"
> > > 
> > >     $ perf script -i perf.data | grep "1858\.303049"
> > >       hackbench 15281 [002]  1858.303049:       sched:sched_switch: prev_comm=hackbench prev_pid=15281 prev_prio=120 prev_state=D ==> next_comm=hackbench next_pid=15603 next_prio=120
> > > 
> > > Fix perf latency to print out time stamp without rounding to avoid such
> > > discrepancy.
> > > 
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Cc: Ingo Molnar <mingo@redhat.com>
> > > Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> > > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> > > Cc: Steven Rostedt <rostedt@goodmis.org>
> > > Cc: Namhyung Kim <namhyung@kernel.org>
> > > Cc: linux-kernel@vger.kernel.org
> > > Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
> > > ---
> > > 
> > > I was tempted to get rid of all u64 to double casting in the function
> > > output_lat_thread but didn't because there is no data loss as of
> > > today.  Double float gives at least 15 significant decimal digits
> > > precision while the function requires only 14 significant digits precision.
> > > 
> > > $ python -c "print(len(str(int(0xffffffffffffffff / 1e6))))"
> > > 14
> > > 
> > >  tools/lib/traceevent/event-parse.h |  1 +
> > >  tools/perf/builtin-sched.c         | 12 ++++++++++--
> > >  2 files changed, 11 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
> > > index 9ffde37..f42703c 100644
> > > --- a/tools/lib/traceevent/event-parse.h
> > > +++ b/tools/lib/traceevent/event-parse.h
> > > @@ -174,6 +174,7 @@ struct pevent_plugin_option {
> > > 
> > >  #define NSECS_PER_SEC		1000000000ULL
> > >  #define NSECS_PER_USEC		1000ULL
> > > +#define MSECS_PER_SEC		1000ULL
> > > 
> > >  enum format_flags {
> > >  	FIELD_IS_ARRAY		= 1,
> > > diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> > > index afa0576..e5cf51a 100644
> > > --- a/tools/perf/builtin-sched.c
> > > +++ b/tools/perf/builtin-sched.c
> > > @@ -1190,6 +1190,7 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
> > >  	int i;
> > >  	int ret;
> > >  	u64 avg;
> > > +	u64 max_lat_at_sec, max_lat_at_msec;
> > 
> > Isn't it usec rathen than msec? :)
> 
> It's to contain three decimal digits which are msecs when 'max_lat_at' is
> expressed in sec.
> For example when max_lat_at = 1858303049520 which is 1858.3030495199998 sec,
> max_lat_at_msec is meant to be 303.

???  But didn't you want to print it as 303049?

Looking at the code, the 'max_lat' is a latency which was printed in
msec (%9.3f ms) but the 'max_lat_at' is a timestamp which was printed
in sec (%13.6f s).  This is confusing but they use effectively same
unit (usec) by using different number of digit after the period.

> 
> I must admit variable's name is bit misleading.  Maybe just secs, msecs are
> better?
> Also just noticed u64 isn't needed for msecs.  Will size down.

IIUC you wanted usec for timestamp rather than msec, aren't you?

> 
> > 
> > > 
> > >  	if (!work_list->nb_atoms)
> > >  		return;
> > > @@ -1212,11 +1213,18 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
> > > 
> > >  	avg = work_list->total_lat / work_list->nb_atoms;
> > > 
> > > -	printf("|%11.3f ms |%9" PRIu64 " | avg:%9.3f ms | max:%9.3f ms | max at: %13.6f s\n",
> > > +	/*
> > > +	 * Avoid round up with printf to prevent event time discrepency
> > > +	 * between sched script and latency.
> > > +	 */
> > > +	max_lat_at_sec = work_list->max_lat_at / NSECS_PER_SEC;
> > > +	max_lat_at_msec = (work_list->max_lat_at -
> > > +			   max_lat_at_sec * NSECS_PER_SEC) / MSECS_PER_SEC;
> > > +	printf("+%11.3f ms |%9" PRIu64 " | avg:%9.3f ms | max:%9.3f ms | max at: %6lu.%06lu s\n",
> > 
> > Maybe you'd better to be in sync with the script code:
> > 
> > 	if (PRINT_FIELD(TIME)) {
> > 		nsecs = sample->time;
> > 		secs = nsecs / NSECS_PER_SEC;
> > 		nsecs -= secs * NSECS_PER_SEC;
> > 		usecs = nsecs / NSECS_PER_USEC;
> > 		if (nanosecs)
> > 			printf("%5lu.%09llu: ", secs, nsecs);
> > 		else
> > 			printf("%5lu.%06lu: ", secs, usecs);
> > 	}
> 
> Apart from variable name, I'm not quite sure what to sync because sched
> doesn't print in nsecs.
> Maybe you just wanted for variable names in sync rather than logic?

You found inconsistency between the two, and wanted to show same time.
So wouldn't it be better factor out the code and reuse it from the
both callsite?  If you don't need nsec, you could make it an option.

But this is just a general suggestion and maybe it's not worth doing
here since the code is small..  So I won't insist on it strongly.

Thanks
Namhyung

> > 
> > 
> > >  	      (double)work_list->total_runtime / 1e6,
> > >  		 work_list->nb_atoms, (double)avg / 1e6,
> > >  		 (double)work_list->max_lat / 1e6,
> > > -		 (double)work_list->max_lat_at / 1e9);
> > > +		 max_lat_at_sec, max_lat_at_msec);
> > >  }
> > > 
> > >  static int pid_cmp(struct work_atoms *l, struct work_atoms *r)
> > > --
> > > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> > > hosted by The Linux Foundation
> > > 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] perf sched: kill time stamp discrepancy between script and latency
  2016-10-04 12:04     ` Namhyung Kim
@ 2016-10-04 17:11       ` Joonwoo Park
  0 siblings, 0 replies; 6+ messages in thread
From: Joonwoo Park @ 2016-10-04 17:11 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	Alexander Shishkin, Steven Rostedt, linux-kernel



On 10/04/2016 05:04 AM, Namhyung Kim wrote:
> On Mon, Oct 03, 2016 at 03:04:48PM -0700, Joonwoo Park wrote:
>>
>>
>> On 09/30/2016 10:15 PM, Namhyung Kim wrote:
>>> Hi Joonwoo,
>>>
>>> On Wed, Sep 28, 2016 at 07:25:26PM -0700, Joonwoo Park wrote:
>>>> Perf sched latency is handy to find out the maximum sched latency and
>>>> the time stamp of the event.  After running sched latency, if a found
>>>> latency looks suspicious it's quite reasonable to run perf script
>>>> subsequently and search with the time stamp given by perf sched latency
>>>> to continue further debugging.  However, at present, it's possible the
>>>> time stamp given by perf sched latency cannot be found in the trace
>>>> output by perf script because perf sched latency converts the event
>>>> time from ns to ms as double float and prints it with printf which
>>>> does banker's rounding as opposed to perf script doesn't.
>>>>
>>>>   For example:
>>>>
>>>>    0x750ff0 [0x80]: event: 9
>>>>    <snip>
>>>>    2 1858303049520 0x750ff0 [0x80]: PERF_RECORD_SAMPLE(IP, 0x1): 15281/15281: 0xffffffff8162a63a period: 1 addr: 0
>>>>     ... thread: hackbench:15281
>>>>    <snip>
>>>>
>>>>     $ perf sched -i perf.data latency | grep hackbench
>>>>       hackbench:(401)       +   3539.283 ms |    23347 | avg:    7.286 ms | max:  829.998 ms | max at:   1858.303050 s
>>>>
>>>>     $ perf script -i perf.data | grep "1858\.303050"
>>>>
>>>>     $ perf script -i perf.data | grep "1858\.303049"
>>>>       hackbench 15281 [002]  1858.303049:       sched:sched_switch: prev_comm=hackbench prev_pid=15281 prev_prio=120 prev_state=D ==> next_comm=hackbench next_pid=15603 next_prio=120
>>>>
>>>> Fix perf latency to print out time stamp without rounding to avoid such
>>>> discrepancy.
>>>>
>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>> Cc: Ingo Molnar <mingo@redhat.com>
>>>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>>>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>>>> Cc: Steven Rostedt <rostedt@goodmis.org>
>>>> Cc: Namhyung Kim <namhyung@kernel.org>
>>>> Cc: linux-kernel@vger.kernel.org
>>>> Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
>>>> ---
>>>>
>>>> I was tempted to get rid of all u64 to double casting in the function
>>>> output_lat_thread but didn't because there is no data loss as of
>>>> today.  Double float gives at least 15 significant decimal digits
>>>> precision while the function requires only 14 significant digits precision.
>>>>
>>>> $ python -c "print(len(str(int(0xffffffffffffffff / 1e6))))"
>>>> 14
>>>>
>>>>  tools/lib/traceevent/event-parse.h |  1 +
>>>>  tools/perf/builtin-sched.c         | 12 ++++++++++--
>>>>  2 files changed, 11 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
>>>> index 9ffde37..f42703c 100644
>>>> --- a/tools/lib/traceevent/event-parse.h
>>>> +++ b/tools/lib/traceevent/event-parse.h
>>>> @@ -174,6 +174,7 @@ struct pevent_plugin_option {
>>>>
>>>>  #define NSECS_PER_SEC		1000000000ULL
>>>>  #define NSECS_PER_USEC		1000ULL
>>>> +#define MSECS_PER_SEC		1000ULL
>>>>
>>>>  enum format_flags {
>>>>  	FIELD_IS_ARRAY		= 1,
>>>> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
>>>> index afa0576..e5cf51a 100644
>>>> --- a/tools/perf/builtin-sched.c
>>>> +++ b/tools/perf/builtin-sched.c
>>>> @@ -1190,6 +1190,7 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
>>>>  	int i;
>>>>  	int ret;
>>>>  	u64 avg;
>>>> +	u64 max_lat_at_sec, max_lat_at_msec;
>>>
>>> Isn't it usec rathen than msec? :)
>>
>> It's to contain three decimal digits which are msecs when 'max_lat_at' is
>> expressed in sec.
>> For example when max_lat_at = 1858303049520 which is 1858.3030495199998 sec,
>> max_lat_at_msec is meant to be 303.
>
> ???  But didn't you want to print it as 303049?
>
> Looking at the code, the 'max_lat' is a latency which was printed in
> msec (%9.3f ms) but the 'max_lat_at' is a timestamp which was printed
> in sec (%13.6f s).  This is confusing but they use effectively same
> unit (usec) by using different number of digit after the period.
>
>>
>> I must admit variable's name is bit misleading.  Maybe just secs, msecs are
>> better?
>> Also just noticed u64 isn't needed for msecs.  Will size down.
>
> IIUC you wanted usec for timestamp rather than msec, aren't you?

Ugh.  I was a dummy.  My bad.  Will fix this.

>
>>
>>>
>>>>
>>>>  	if (!work_list->nb_atoms)
>>>>  		return;
>>>> @@ -1212,11 +1213,18 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
>>>>
>>>>  	avg = work_list->total_lat / work_list->nb_atoms;
>>>>
>>>> -	printf("|%11.3f ms |%9" PRIu64 " | avg:%9.3f ms | max:%9.3f ms | max at: %13.6f s\n",
>>>> +	/*
>>>> +	 * Avoid round up with printf to prevent event time discrepency
>>>> +	 * between sched script and latency.
>>>> +	 */
>>>> +	max_lat_at_sec = work_list->max_lat_at / NSECS_PER_SEC;
>>>> +	max_lat_at_msec = (work_list->max_lat_at -
>>>> +			   max_lat_at_sec * NSECS_PER_SEC) / MSECS_PER_SEC;
>>>> +	printf("+%11.3f ms |%9" PRIu64 " | avg:%9.3f ms | max:%9.3f ms | max at: %6lu.%06lu s\n",
>>>
>>> Maybe you'd better to be in sync with the script code:
>>>
>>> 	if (PRINT_FIELD(TIME)) {
>>> 		nsecs = sample->time;
>>> 		secs = nsecs / NSECS_PER_SEC;
>>> 		nsecs -= secs * NSECS_PER_SEC;
>>> 		usecs = nsecs / NSECS_PER_USEC;
>>> 		if (nanosecs)
>>> 			printf("%5lu.%09llu: ", secs, nsecs);
>>> 		else
>>> 			printf("%5lu.%06lu: ", secs, usecs);
>>> 	}
>>
>> Apart from variable name, I'm not quite sure what to sync because sched
>> doesn't print in nsecs.
>> Maybe you just wanted for variable names in sync rather than logic?
>
> You found inconsistency between the two, and wanted to show same time.
> So wouldn't it be better factor out the code and reuse it from the
> both callsite?  If you don't need nsec, you could make it an option.
>
> But this is just a general suggestion and maybe it's not worth doing
> here since the code is small..  So I won't insist on it strongly.

I see what you're suggesting.  Will try to factor out the common part in 
the next revision.

Thanks,
Joonwoo

>
> Thanks
> Namhyung
>
>>>
>>>
>>>>  	      (double)work_list->total_runtime / 1e6,
>>>>  		 work_list->nb_atoms, (double)avg / 1e6,
>>>>  		 (double)work_list->max_lat / 1e6,
>>>> -		 (double)work_list->max_lat_at / 1e9);
>>>> +		 max_lat_at_sec, max_lat_at_msec);
>>>>  }
>>>>
>>>>  static int pid_cmp(struct work_atoms *l, struct work_atoms *r)
>>>> --
>>>> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
>>>> hosted by The Linux Foundation
>>>>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-10-04 17:11 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-29  2:25 [PATCH] perf sched: kill time stamp discrepancy between script and latency Joonwoo Park
2016-09-29  2:29 ` Joonwoo Park
2016-10-01  5:15 ` Namhyung Kim
2016-10-03 22:04   ` Joonwoo Park
2016-10-04 12:04     ` Namhyung Kim
2016-10-04 17:11       ` Joonwoo Park

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.