All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] perf tools: fix callstack entries and nr print message
@ 2022-05-09 11:47 Chengdong Li
  2022-05-09 13:45 ` Jiri Olsa
  0 siblings, 1 reply; 7+ messages in thread
From: Chengdong Li @ 2022-05-09 11:47 UTC (permalink / raw)
  To: alexey.v.bayduraev, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, jolsa, namhyung, rickyman7, adrian.hunter,
	irogers, german.gomez, linux-perf-users, linux-kernel
  Cc: ak, likexu, chengdongli

From: Chengdong Li <chengdongli@tencent.com>

when generating callstack information from branch_stack(Intel LBR),
the actual number of callstack entry should be bigger than the number
of branch_stack, for example:

	branch_stack records:
		B() -> C()
		A() -> B()
	converted callstack records should be:
		C()
		B()
		A()
though, the number of callstack equals
to the number of branch stack plus 1.

This patch fix above issue in branch_stack__printf(). For example,

	# echo 'scale=2000; 4*a(1)' > cmd
	# perf record --call-graph lbr bc -l < cmd

Before applying this patch, `perf script -D` output:

	1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
	... LBR call chain: nr:8
	.....  0: fffffffffffffe00
	.....  1: 000000000040a410
	.....  2: 000000000040573c
	.....  3: 0000000000408650
	.....  4: 00000000004022f2
	.....  5: 00000000004015f5
	.....  6: 00007f5ed6dcb553
	.....  7: 0000000000401698
	... FP chain: nr:2
	.....  0: fffffffffffffe00
	.....  1: 000000000040a6d8
	... branch callstack: nr:6    # which is not consistent with LBR records.
	.....  0: 000000000040a410
	.....  1: 0000000000408650    # ditto
	.....  2: 00000000004022f2
	.....  3: 00000000004015f5
	.....  4: 00007f5ed6dcb553
	.....  5: 0000000000401698
	 ... thread: bc:17990
	 ...... dso: /usr/bin/bc
	bc 17990 1220022.677386:     894172 cycles:
			  40a410 [unknown] (/usr/bin/bc)
			  40573c [unknown] (/usr/bin/bc)
			  408650 [unknown] (/usr/bin/bc)
			  4022f2 [unknown] (/usr/bin/bc)
			  4015f5 [unknown] (/usr/bin/bc)
		    7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
			  401698 [unknown] (/usr/bin/bc)

After applied:

	1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
	... LBR call chain: nr:8
	.....  0: fffffffffffffe00
	.....  1: 000000000040a410
	.....  2: 000000000040573c
	.....  3: 0000000000408650
	.....  4: 00000000004022f2
	.....  5: 00000000004015f5
	.....  6: 00007f5ed6dcb553
	.....  7: 0000000000401698
	... FP chain: nr:2
	.....  0: fffffffffffffe00
	.....  1: 000000000040a6d8
	... branch callstack: nr:7
	.....  0: 000000000040a410
	.....  1: 000000000040573c
	.....  2: 0000000000408650
	.....  3: 00000000004022f2
	.....  4: 00000000004015f5
	.....  5: 00007f5ed6dcb553
	.....  6: 0000000000401698
	 ... thread: bc:17990
	 ...... dso: /usr/bin/bc
	bc 17990 1220022.677386:     894172 cycles:
			  40a410 [unknown] (/usr/bin/bc)
			  40573c [unknown] (/usr/bin/bc)
			  408650 [unknown] (/usr/bin/bc)
			  4022f2 [unknown] (/usr/bin/bc)
			  4015f5 [unknown] (/usr/bin/bc)
		    7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
			  401698 [unknown] (/usr/bin/bc)

Signed-off-by: Chengdong Li <chengdongli@tencent.com>
---
 tools/perf/util/session.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index f9a320694b85..568a1db98686 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1151,9 +1151,19 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
 	struct branch_entry *entries = perf_sample__branch_entries(sample);
 	uint64_t i;
 
-	printf("%s: nr:%" PRIu64 "\n",
-		!callstack ? "... branch stack" : "... branch callstack",
-		sample->branch_stack->nr);
+	if (!callstack)
+		printf("%s: nr:%" PRIu64 "\n", "... branch stack", sample->branch_stack->nr);
+	else
+		/* the reason of adding 1 to nr is because after expanding
+		 * branch stack it generates nr + 1 callstack records. e.g.,
+		 *         B()->C()
+		 *         A()->B()
+		 * the final callstack should be:
+		 *         C()
+		 *         B()
+		 *         A()
+		 */
+		printf("%s: nr:%" PRIu64 "\n", "... branch callstack", sample->branch_stack->nr+1);
 
 	for (i = 0; i < sample->branch_stack->nr; i++) {
 		struct branch_entry *e = &entries[i];
@@ -1169,8 +1179,12 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
 				(unsigned)e->flags.reserved,
 				e->flags.type ? branch_type_name(e->flags.type) : "");
 		} else {
-			printf("..... %2"PRIu64": %016" PRIx64 "\n",
-				i, i > 0 ? e->from : e->to);
+			if (i == 0)
+				printf("..... %2"PRIu64": %016" PRIx64 "\n"
+				       "..... %2"PRIu64": %016" PRIx64 "\n",
+						i, e->to, i+1, e->from);
+			else
+				printf("..... %2"PRIu64": %016" PRIx64 "\n", i+1, e->from);
 		}
 	}
 }
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] perf tools: fix callstack entries and nr print message
  2022-05-09 11:47 [PATCH] perf tools: fix callstack entries and nr print message Chengdong Li
@ 2022-05-09 13:45 ` Jiri Olsa
  2022-05-10  0:40   ` [PATCH v2] " Chengdong Li
  2022-05-17  1:57   ` [RESEND PATCH " Chengdong Li
  0 siblings, 2 replies; 7+ messages in thread
From: Jiri Olsa @ 2022-05-09 13:45 UTC (permalink / raw)
  To: Chengdong Li
  Cc: alexey.v.bayduraev, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, jolsa, namhyung, rickyman7, adrian.hunter,
	irogers, german.gomez, linux-perf-users, linux-kernel, ak,
	likexu, chengdongli

On Mon, May 09, 2022 at 07:47:43PM +0800, Chengdong Li wrote:
> From: Chengdong Li <chengdongli@tencent.com>
> 
> when generating callstack information from branch_stack(Intel LBR),
> the actual number of callstack entry should be bigger than the number
> of branch_stack, for example:
> 
> 	branch_stack records:
> 		B() -> C()
> 		A() -> B()
> 	converted callstack records should be:
> 		C()
> 		B()
> 		A()
> though, the number of callstack equals
> to the number of branch stack plus 1.
> 
> This patch fix above issue in branch_stack__printf(). For example,
> 
> 	# echo 'scale=2000; 4*a(1)' > cmd
> 	# perf record --call-graph lbr bc -l < cmd
> 
> Before applying this patch, `perf script -D` output:
> 
> 	1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
> 	... LBR call chain: nr:8
> 	.....  0: fffffffffffffe00
> 	.....  1: 000000000040a410
> 	.....  2: 000000000040573c
> 	.....  3: 0000000000408650
> 	.....  4: 00000000004022f2
> 	.....  5: 00000000004015f5
> 	.....  6: 00007f5ed6dcb553
> 	.....  7: 0000000000401698
> 	... FP chain: nr:2
> 	.....  0: fffffffffffffe00
> 	.....  1: 000000000040a6d8
> 	... branch callstack: nr:6    # which is not consistent with LBR records.
> 	.....  0: 000000000040a410
> 	.....  1: 0000000000408650    # ditto
> 	.....  2: 00000000004022f2
> 	.....  3: 00000000004015f5
> 	.....  4: 00007f5ed6dcb553
> 	.....  5: 0000000000401698
> 	 ... thread: bc:17990
> 	 ...... dso: /usr/bin/bc
> 	bc 17990 1220022.677386:     894172 cycles:
> 			  40a410 [unknown] (/usr/bin/bc)
> 			  40573c [unknown] (/usr/bin/bc)
> 			  408650 [unknown] (/usr/bin/bc)
> 			  4022f2 [unknown] (/usr/bin/bc)
> 			  4015f5 [unknown] (/usr/bin/bc)
> 		    7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
> 			  401698 [unknown] (/usr/bin/bc)
> 
> After applied:
> 
> 	1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
> 	... LBR call chain: nr:8
> 	.....  0: fffffffffffffe00
> 	.....  1: 000000000040a410
> 	.....  2: 000000000040573c
> 	.....  3: 0000000000408650
> 	.....  4: 00000000004022f2
> 	.....  5: 00000000004015f5
> 	.....  6: 00007f5ed6dcb553
> 	.....  7: 0000000000401698
> 	... FP chain: nr:2
> 	.....  0: fffffffffffffe00
> 	.....  1: 000000000040a6d8
> 	... branch callstack: nr:7
> 	.....  0: 000000000040a410
> 	.....  1: 000000000040573c
> 	.....  2: 0000000000408650
> 	.....  3: 00000000004022f2
> 	.....  4: 00000000004015f5
> 	.....  5: 00007f5ed6dcb553
> 	.....  6: 0000000000401698
> 	 ... thread: bc:17990
> 	 ...... dso: /usr/bin/bc
> 	bc 17990 1220022.677386:     894172 cycles:
> 			  40a410 [unknown] (/usr/bin/bc)
> 			  40573c [unknown] (/usr/bin/bc)
> 			  408650 [unknown] (/usr/bin/bc)
> 			  4022f2 [unknown] (/usr/bin/bc)
> 			  4015f5 [unknown] (/usr/bin/bc)
> 		    7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
> 			  401698 [unknown] (/usr/bin/bc)
> 
> Signed-off-by: Chengdong Li <chengdongli@tencent.com>
> ---
>  tools/perf/util/session.c | 24 +++++++++++++++++++-----
>  1 file changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index f9a320694b85..568a1db98686 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -1151,9 +1151,19 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
>  	struct branch_entry *entries = perf_sample__branch_entries(sample);
>  	uint64_t i;
>  
> -	printf("%s: nr:%" PRIu64 "\n",
> -		!callstack ? "... branch stack" : "... branch callstack",
> -		sample->branch_stack->nr);
> +	if (!callstack)
> +		printf("%s: nr:%" PRIu64 "\n", "... branch stack", sample->branch_stack->nr);
> +	else
> +		/* the reason of adding 1 to nr is because after expanding
> +		 * branch stack it generates nr + 1 callstack records. e.g.,
> +		 *         B()->C()
> +		 *         A()->B()
> +		 * the final callstack should be:
> +		 *         C()
> +		 *         B()
> +		 *         A()
> +		 */
> +		printf("%s: nr:%" PRIu64 "\n", "... branch callstack", sample->branch_stack->nr+1);

please use { }

>  
>  	for (i = 0; i < sample->branch_stack->nr; i++) {
>  		struct branch_entry *e = &entries[i];
> @@ -1169,8 +1179,12 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
>  				(unsigned)e->flags.reserved,
>  				e->flags.type ? branch_type_name(e->flags.type) : "");
>  		} else {
> -			printf("..... %2"PRIu64": %016" PRIx64 "\n",
> -				i, i > 0 ? e->from : e->to);
> +			if (i == 0)
> +				printf("..... %2"PRIu64": %016" PRIx64 "\n"
> +				       "..... %2"PRIu64": %016" PRIx64 "\n",
> +						i, e->to, i+1, e->from);
> +			else
> +				printf("..... %2"PRIu64": %016" PRIx64 "\n", i+1, e->from);

same here

thanks,
jirka

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2] perf tools: fix callstack entries and nr print message
  2022-05-09 13:45 ` Jiri Olsa
@ 2022-05-10  0:40   ` Chengdong Li
  2022-05-11  1:35     ` Bryton Lee
  2022-05-17  1:57   ` [RESEND PATCH " Chengdong Li
  1 sibling, 1 reply; 7+ messages in thread
From: Chengdong Li @ 2022-05-10  0:40 UTC (permalink / raw)
  To: alexey.v.bayduraev, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, jolsa, namhyung, rickyman7, adrian.hunter,
	irogers, german.gomez, linux-perf-users, linux-kernel
  Cc: ak, likexu, chengdongli

From: Chengdong Li <chengdongli@tencent.com>

when generating callstack information from branch_stack(Intel LBR),
the actual number of callstack entry should be bigger than the number
of branch_stack, for example:

	branch_stack records:
		B() -> C()
		A() -> B()
	converted callstack records should be:
		C()
		B()
		A()
though, the number of callstack equals
to the number of branch stack plus 1.

This patch fixes above issue in branch_stack__printf(). For example,

	# echo 'scale=2000; 4*a(1)' > cmd
	# perf record --call-graph lbr bc -l < cmd

Before applying this patch, `perf script -D` output:

	1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
	... LBR call chain: nr:8
	.....  0: fffffffffffffe00
	.....  1: 000000000040a410
	.....  2: 000000000040573c
	.....  3: 0000000000408650
	.....  4: 00000000004022f2
	.....  5: 00000000004015f5
	.....  6: 00007f5ed6dcb553
	.....  7: 0000000000401698
	... FP chain: nr:2
	.....  0: fffffffffffffe00
	.....  1: 000000000040a6d8
	... branch callstack: nr:6    # which is not consistent with LBR records.
	.....  0: 000000000040a410
	.....  1: 0000000000408650    # ditto
	.....  2: 00000000004022f2
	.....  3: 00000000004015f5
	.....  4: 00007f5ed6dcb553
	.....  5: 0000000000401698
	 ... thread: bc:17990
	 ...... dso: /usr/bin/bc
	bc 17990 1220022.677386:     894172 cycles:
			  40a410 [unknown] (/usr/bin/bc)
			  40573c [unknown] (/usr/bin/bc)
			  408650 [unknown] (/usr/bin/bc)
			  4022f2 [unknown] (/usr/bin/bc)
			  4015f5 [unknown] (/usr/bin/bc)
		    7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
			  401698 [unknown] (/usr/bin/bc)

After applied:

	1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
	... LBR call chain: nr:8
	.....  0: fffffffffffffe00
	.....  1: 000000000040a410
	.....  2: 000000000040573c
	.....  3: 0000000000408650
	.....  4: 00000000004022f2
	.....  5: 00000000004015f5
	.....  6: 00007f5ed6dcb553
	.....  7: 0000000000401698
	... FP chain: nr:2
	.....  0: fffffffffffffe00
	.....  1: 000000000040a6d8
	... branch callstack: nr:7
	.....  0: 000000000040a410
	.....  1: 000000000040573c
	.....  2: 0000000000408650
	.....  3: 00000000004022f2
	.....  4: 00000000004015f5
	.....  5: 00007f5ed6dcb553
	.....  6: 0000000000401698
	 ... thread: bc:17990
	 ...... dso: /usr/bin/bc
	bc 17990 1220022.677386:     894172 cycles:
			  40a410 [unknown] (/usr/bin/bc)
			  40573c [unknown] (/usr/bin/bc)
			  408650 [unknown] (/usr/bin/bc)
			  4022f2 [unknown] (/usr/bin/bc)
			  4015f5 [unknown] (/usr/bin/bc)
		    7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
			  401698 [unknown] (/usr/bin/bc)

Change from v1:
	- refined code style according to Jiri's review comments.

Signed-off-by: Chengdong Li <chengdongli@tencent.com>
---
 tools/perf/util/session.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index f9a320694b85..a7f93f5a1ac8 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1151,9 +1151,20 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
 	struct branch_entry *entries = perf_sample__branch_entries(sample);
 	uint64_t i;
 
-	printf("%s: nr:%" PRIu64 "\n",
-		!callstack ? "... branch stack" : "... branch callstack",
-		sample->branch_stack->nr);
+	if (!callstack) {
+		printf("%s: nr:%" PRIu64 "\n", "... branch stack", sample->branch_stack->nr);
+	} else {
+		/* the reason of adding 1 to nr is because after expanding
+		 * branch stack it generates nr + 1 callstack records. e.g.,
+		 *         B()->C()
+		 *         A()->B()
+		 * the final callstack should be:
+		 *         C()
+		 *         B()
+		 *         A()
+		 */
+		printf("%s: nr:%" PRIu64 "\n", "... branch callstack", sample->branch_stack->nr+1);
+	}
 
 	for (i = 0; i < sample->branch_stack->nr; i++) {
 		struct branch_entry *e = &entries[i];
@@ -1169,8 +1180,13 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
 				(unsigned)e->flags.reserved,
 				e->flags.type ? branch_type_name(e->flags.type) : "");
 		} else {
-			printf("..... %2"PRIu64": %016" PRIx64 "\n",
-				i, i > 0 ? e->from : e->to);
+			if (i == 0) {
+				printf("..... %2"PRIu64": %016" PRIx64 "\n"
+				       "..... %2"PRIu64": %016" PRIx64 "\n",
+						i, e->to, i+1, e->from);
+			} else {
+				printf("..... %2"PRIu64": %016" PRIx64 "\n", i+1, e->from);
+			}
 		}
 	}
 }
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] perf tools: fix callstack entries and nr print message
  2022-05-10  0:40   ` [PATCH v2] " Chengdong Li
@ 2022-05-11  1:35     ` Bryton Lee
  0 siblings, 0 replies; 7+ messages in thread
From: Bryton Lee @ 2022-05-11  1:35 UTC (permalink / raw)
  To: jolsa, linux-perf-users, linux-kernel
  Cc: ak, likexu, chengdongli, mark.rutland, mingo,
	Arnaldo Carvalho de Melo, Adrian Hunter, alexander.shishkin,
	irogers, german.gomez, rickyman7, Namhyung Kim, peterz,
	alexey.v.bayduraev

Hi Jiri,

Could you kindly review my patch again? I sent out my updates
yesterday. I am happy to know your guidance.

Thanks,
Chengdong

On Tue, May 10, 2022 at 8:41 AM Chengdong Li <brytonlee01@gmail.com> wrote:
>
> From: Chengdong Li <chengdongli@tencent.com>
>
> when generating callstack information from branch_stack(Intel LBR),
> the actual number of callstack entry should be bigger than the number
> of branch_stack, for example:
>
>         branch_stack records:
>                 B() -> C()
>                 A() -> B()
>         converted callstack records should be:
>                 C()
>                 B()
>                 A()
> though, the number of callstack equals
> to the number of branch stack plus 1.
>
> This patch fixes above issue in branch_stack__printf(). For example,
>
>         # echo 'scale=2000; 4*a(1)' > cmd
>         # perf record --call-graph lbr bc -l < cmd
>
> Before applying this patch, `perf script -D` output:
>
>         1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
>         ... LBR call chain: nr:8
>         .....  0: fffffffffffffe00
>         .....  1: 000000000040a410
>         .....  2: 000000000040573c
>         .....  3: 0000000000408650
>         .....  4: 00000000004022f2
>         .....  5: 00000000004015f5
>         .....  6: 00007f5ed6dcb553
>         .....  7: 0000000000401698
>         ... FP chain: nr:2
>         .....  0: fffffffffffffe00
>         .....  1: 000000000040a6d8
>         ... branch callstack: nr:6    # which is not consistent with LBR records.
>         .....  0: 000000000040a410
>         .....  1: 0000000000408650    # ditto
>         .....  2: 00000000004022f2
>         .....  3: 00000000004015f5
>         .....  4: 00007f5ed6dcb553
>         .....  5: 0000000000401698
>          ... thread: bc:17990
>          ...... dso: /usr/bin/bc
>         bc 17990 1220022.677386:     894172 cycles:
>                           40a410 [unknown] (/usr/bin/bc)
>                           40573c [unknown] (/usr/bin/bc)
>                           408650 [unknown] (/usr/bin/bc)
>                           4022f2 [unknown] (/usr/bin/bc)
>                           4015f5 [unknown] (/usr/bin/bc)
>                     7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
>                           401698 [unknown] (/usr/bin/bc)
>
> After applied:
>
>         1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
>         ... LBR call chain: nr:8
>         .....  0: fffffffffffffe00
>         .....  1: 000000000040a410
>         .....  2: 000000000040573c
>         .....  3: 0000000000408650
>         .....  4: 00000000004022f2
>         .....  5: 00000000004015f5
>         .....  6: 00007f5ed6dcb553
>         .....  7: 0000000000401698
>         ... FP chain: nr:2
>         .....  0: fffffffffffffe00
>         .....  1: 000000000040a6d8
>         ... branch callstack: nr:7
>         .....  0: 000000000040a410
>         .....  1: 000000000040573c
>         .....  2: 0000000000408650
>         .....  3: 00000000004022f2
>         .....  4: 00000000004015f5
>         .....  5: 00007f5ed6dcb553
>         .....  6: 0000000000401698
>          ... thread: bc:17990
>          ...... dso: /usr/bin/bc
>         bc 17990 1220022.677386:     894172 cycles:
>                           40a410 [unknown] (/usr/bin/bc)
>                           40573c [unknown] (/usr/bin/bc)
>                           408650 [unknown] (/usr/bin/bc)
>                           4022f2 [unknown] (/usr/bin/bc)
>                           4015f5 [unknown] (/usr/bin/bc)
>                     7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
>                           401698 [unknown] (/usr/bin/bc)
>
> Change from v1:
>         - refined code style according to Jiri's review comments.
>
> Signed-off-by: Chengdong Li <chengdongli@tencent.com>
> ---
>  tools/perf/util/session.c | 26 +++++++++++++++++++++-----
>  1 file changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index f9a320694b85..a7f93f5a1ac8 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -1151,9 +1151,20 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
>         struct branch_entry *entries = perf_sample__branch_entries(sample);
>         uint64_t i;
>
> -       printf("%s: nr:%" PRIu64 "\n",
> -               !callstack ? "... branch stack" : "... branch callstack",
> -               sample->branch_stack->nr);
> +       if (!callstack) {
> +               printf("%s: nr:%" PRIu64 "\n", "... branch stack", sample->branch_stack->nr);
> +       } else {
> +               /* the reason of adding 1 to nr is because after expanding
> +                * branch stack it generates nr + 1 callstack records. e.g.,
> +                *         B()->C()
> +                *         A()->B()
> +                * the final callstack should be:
> +                *         C()
> +                *         B()
> +                *         A()
> +                */
> +               printf("%s: nr:%" PRIu64 "\n", "... branch callstack", sample->branch_stack->nr+1);
> +       }
>
>         for (i = 0; i < sample->branch_stack->nr; i++) {
>                 struct branch_entry *e = &entries[i];
> @@ -1169,8 +1180,13 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
>                                 (unsigned)e->flags.reserved,
>                                 e->flags.type ? branch_type_name(e->flags.type) : "");
>                 } else {
> -                       printf("..... %2"PRIu64": %016" PRIx64 "\n",
> -                               i, i > 0 ? e->from : e->to);
> +                       if (i == 0) {
> +                               printf("..... %2"PRIu64": %016" PRIx64 "\n"
> +                                      "..... %2"PRIu64": %016" PRIx64 "\n",
> +                                               i, e->to, i+1, e->from);
> +                       } else {
> +                               printf("..... %2"PRIu64": %016" PRIx64 "\n", i+1, e->from);
> +                       }
>                 }
>         }
>  }
> --
> 2.27.0
>


-- 
Best Regards

Bryton.Lee

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RESEND PATCH v2] perf tools: fix callstack entries and nr print message
  2022-05-09 13:45 ` Jiri Olsa
  2022-05-10  0:40   ` [PATCH v2] " Chengdong Li
@ 2022-05-17  1:57   ` Chengdong Li
  2022-05-17 23:14     ` Namhyung Kim
  1 sibling, 1 reply; 7+ messages in thread
From: Chengdong Li @ 2022-05-17  1:57 UTC (permalink / raw)
  To: alexey.v.bayduraev, peterz, mingo, acme, mark.rutland,
	alexander.shishkin, jolsa, namhyung, rickyman7, adrian.hunter,
	irogers, german.gomez, linux-perf-users, linux-kernel
  Cc: ak, likexu, chengdongli

From: Chengdong Li <chengdongli@tencent.com>

when generating callstack information from branch_stack(Intel LBR),
the actual number of callstack entry should be bigger than the number
of branch_stack, for example:

	branch_stack records:
		B() -> C()
		A() -> B()
	converted callstack records should be:
		C()
		B()
		A()
though, the number of callstack equals
to the number of branch stack plus 1.

This patch fixes above issue in branch_stack__printf(). For example,

	# echo 'scale=2000; 4*a(1)' > cmd
	# perf record --call-graph lbr bc -l < cmd

Before applying this patch, `perf script -D` output:

	1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
	... LBR call chain: nr:8
	.....  0: fffffffffffffe00
	.....  1: 000000000040a410
	.....  2: 000000000040573c
	.....  3: 0000000000408650
	.....  4: 00000000004022f2
	.....  5: 00000000004015f5
	.....  6: 00007f5ed6dcb553
	.....  7: 0000000000401698
	... FP chain: nr:2
	.....  0: fffffffffffffe00
	.....  1: 000000000040a6d8
	... branch callstack: nr:6    # which is not consistent with LBR records.
	.....  0: 000000000040a410
	.....  1: 0000000000408650    # ditto
	.....  2: 00000000004022f2
	.....  3: 00000000004015f5
	.....  4: 00007f5ed6dcb553
	.....  5: 0000000000401698
	 ... thread: bc:17990
	 ...... dso: /usr/bin/bc
	bc 17990 1220022.677386:     894172 cycles:
			  40a410 [unknown] (/usr/bin/bc)
			  40573c [unknown] (/usr/bin/bc)
			  408650 [unknown] (/usr/bin/bc)
			  4022f2 [unknown] (/usr/bin/bc)
			  4015f5 [unknown] (/usr/bin/bc)
		    7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
			  401698 [unknown] (/usr/bin/bc)

After applied:

	1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
	... LBR call chain: nr:8
	.....  0: fffffffffffffe00
	.....  1: 000000000040a410
	.....  2: 000000000040573c
	.....  3: 0000000000408650
	.....  4: 00000000004022f2
	.....  5: 00000000004015f5
	.....  6: 00007f5ed6dcb553
	.....  7: 0000000000401698
	... FP chain: nr:2
	.....  0: fffffffffffffe00
	.....  1: 000000000040a6d8
	... branch callstack: nr:7
	.....  0: 000000000040a410
	.....  1: 000000000040573c
	.....  2: 0000000000408650
	.....  3: 00000000004022f2
	.....  4: 00000000004015f5
	.....  5: 00007f5ed6dcb553
	.....  6: 0000000000401698
	 ... thread: bc:17990
	 ...... dso: /usr/bin/bc
	bc 17990 1220022.677386:     894172 cycles:
			  40a410 [unknown] (/usr/bin/bc)
			  40573c [unknown] (/usr/bin/bc)
			  408650 [unknown] (/usr/bin/bc)
			  4022f2 [unknown] (/usr/bin/bc)
			  4015f5 [unknown] (/usr/bin/bc)
		    7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
			  401698 [unknown] (/usr/bin/bc)

Change from v1:
	- refined code style according to Jiri's review comments.

Signed-off-by: Chengdong Li <chengdongli@tencent.com>
---
 tools/perf/util/session.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index f9a320694b85..a7f93f5a1ac8 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1151,9 +1151,20 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
 	struct branch_entry *entries = perf_sample__branch_entries(sample);
 	uint64_t i;
 
-	printf("%s: nr:%" PRIu64 "\n",
-		!callstack ? "... branch stack" : "... branch callstack",
-		sample->branch_stack->nr);
+	if (!callstack) {
+		printf("%s: nr:%" PRIu64 "\n", "... branch stack", sample->branch_stack->nr);
+	} else {
+		/* the reason of adding 1 to nr is because after expanding
+		 * branch stack it generates nr + 1 callstack records. e.g.,
+		 *         B()->C()
+		 *         A()->B()
+		 * the final callstack should be:
+		 *         C()
+		 *         B()
+		 *         A()
+		 */
+		printf("%s: nr:%" PRIu64 "\n", "... branch callstack", sample->branch_stack->nr+1);
+	}
 
 	for (i = 0; i < sample->branch_stack->nr; i++) {
 		struct branch_entry *e = &entries[i];
@@ -1169,8 +1180,13 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
 				(unsigned)e->flags.reserved,
 				e->flags.type ? branch_type_name(e->flags.type) : "");
 		} else {
-			printf("..... %2"PRIu64": %016" PRIx64 "\n",
-				i, i > 0 ? e->from : e->to);
+			if (i == 0) {
+				printf("..... %2"PRIu64": %016" PRIx64 "\n"
+				       "..... %2"PRIu64": %016" PRIx64 "\n",
+						i, e->to, i+1, e->from);
+			} else {
+				printf("..... %2"PRIu64": %016" PRIx64 "\n", i+1, e->from);
+			}
 		}
 	}
 }
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RESEND PATCH v2] perf tools: fix callstack entries and nr print message
  2022-05-17  1:57   ` [RESEND PATCH " Chengdong Li
@ 2022-05-17 23:14     ` Namhyung Kim
  2022-05-21 17:57       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 7+ messages in thread
From: Namhyung Kim @ 2022-05-17 23:14 UTC (permalink / raw)
  To: Chengdong Li
  Cc: Alexey Bayduraev, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Riccardo Mancini, Adrian Hunter, Ian Rogers,
	German Gomez, linux-perf-users, linux-kernel, Andi Kleen, likexu,
	chengdongli

Hello,

On Mon, May 16, 2022 at 6:57 PM Chengdong Li <brytonlee01@gmail.com> wrote:
>
> From: Chengdong Li <chengdongli@tencent.com>
>
> when generating callstack information from branch_stack(Intel LBR),
> the actual number of callstack entry should be bigger than the number
> of branch_stack, for example:
>
>         branch_stack records:
>                 B() -> C()
>                 A() -> B()
>         converted callstack records should be:
>                 C()
>                 B()
>                 A()
> though, the number of callstack equals
> to the number of branch stack plus 1.
>
> This patch fixes above issue in branch_stack__printf(). For example,
>
>         # echo 'scale=2000; 4*a(1)' > cmd
>         # perf record --call-graph lbr bc -l < cmd
>
> Before applying this patch, `perf script -D` output:
>
>         1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
>         ... LBR call chain: nr:8
>         .....  0: fffffffffffffe00
>         .....  1: 000000000040a410
>         .....  2: 000000000040573c
>         .....  3: 0000000000408650
>         .....  4: 00000000004022f2
>         .....  5: 00000000004015f5
>         .....  6: 00007f5ed6dcb553
>         .....  7: 0000000000401698
>         ... FP chain: nr:2
>         .....  0: fffffffffffffe00
>         .....  1: 000000000040a6d8
>         ... branch callstack: nr:6    # which is not consistent with LBR records.
>         .....  0: 000000000040a410
>         .....  1: 0000000000408650    # ditto
>         .....  2: 00000000004022f2
>         .....  3: 00000000004015f5
>         .....  4: 00007f5ed6dcb553
>         .....  5: 0000000000401698
>          ... thread: bc:17990
>          ...... dso: /usr/bin/bc
>         bc 17990 1220022.677386:     894172 cycles:
>                           40a410 [unknown] (/usr/bin/bc)
>                           40573c [unknown] (/usr/bin/bc)
>                           408650 [unknown] (/usr/bin/bc)
>                           4022f2 [unknown] (/usr/bin/bc)
>                           4015f5 [unknown] (/usr/bin/bc)
>                     7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
>                           401698 [unknown] (/usr/bin/bc)
>
> After applied:
>
>         1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
>         ... LBR call chain: nr:8
>         .....  0: fffffffffffffe00
>         .....  1: 000000000040a410
>         .....  2: 000000000040573c
>         .....  3: 0000000000408650
>         .....  4: 00000000004022f2
>         .....  5: 00000000004015f5
>         .....  6: 00007f5ed6dcb553
>         .....  7: 0000000000401698
>         ... FP chain: nr:2
>         .....  0: fffffffffffffe00
>         .....  1: 000000000040a6d8
>         ... branch callstack: nr:7
>         .....  0: 000000000040a410
>         .....  1: 000000000040573c
>         .....  2: 0000000000408650
>         .....  3: 00000000004022f2
>         .....  4: 00000000004015f5
>         .....  5: 00007f5ed6dcb553
>         .....  6: 0000000000401698
>          ... thread: bc:17990
>          ...... dso: /usr/bin/bc
>         bc 17990 1220022.677386:     894172 cycles:
>                           40a410 [unknown] (/usr/bin/bc)
>                           40573c [unknown] (/usr/bin/bc)
>                           408650 [unknown] (/usr/bin/bc)
>                           4022f2 [unknown] (/usr/bin/bc)
>                           4015f5 [unknown] (/usr/bin/bc)
>                     7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
>                           401698 [unknown] (/usr/bin/bc)
>
> Change from v1:
>         - refined code style according to Jiri's review comments.
>
> Signed-off-by: Chengdong Li <chengdongli@tencent.com>

Acked-by: Namhyung Kim <namhyung@kernel.org>

Thanks,
Namhyung


> ---
>  tools/perf/util/session.c | 26 +++++++++++++++++++++-----
>  1 file changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index f9a320694b85..a7f93f5a1ac8 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -1151,9 +1151,20 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
>         struct branch_entry *entries = perf_sample__branch_entries(sample);
>         uint64_t i;
>
> -       printf("%s: nr:%" PRIu64 "\n",
> -               !callstack ? "... branch stack" : "... branch callstack",
> -               sample->branch_stack->nr);
> +       if (!callstack) {
> +               printf("%s: nr:%" PRIu64 "\n", "... branch stack", sample->branch_stack->nr);
> +       } else {
> +               /* the reason of adding 1 to nr is because after expanding
> +                * branch stack it generates nr + 1 callstack records. e.g.,
> +                *         B()->C()
> +                *         A()->B()
> +                * the final callstack should be:
> +                *         C()
> +                *         B()
> +                *         A()
> +                */
> +               printf("%s: nr:%" PRIu64 "\n", "... branch callstack", sample->branch_stack->nr+1);
> +       }
>
>         for (i = 0; i < sample->branch_stack->nr; i++) {
>                 struct branch_entry *e = &entries[i];
> @@ -1169,8 +1180,13 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
>                                 (unsigned)e->flags.reserved,
>                                 e->flags.type ? branch_type_name(e->flags.type) : "");
>                 } else {
> -                       printf("..... %2"PRIu64": %016" PRIx64 "\n",
> -                               i, i > 0 ? e->from : e->to);
> +                       if (i == 0) {
> +                               printf("..... %2"PRIu64": %016" PRIx64 "\n"
> +                                      "..... %2"PRIu64": %016" PRIx64 "\n",
> +                                               i, e->to, i+1, e->from);
> +                       } else {
> +                               printf("..... %2"PRIu64": %016" PRIx64 "\n", i+1, e->from);
> +                       }
>                 }
>         }
>  }
> --
> 2.27.0
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RESEND PATCH v2] perf tools: fix callstack entries and nr print message
  2022-05-17 23:14     ` Namhyung Kim
@ 2022-05-21 17:57       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-05-21 17:57 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Chengdong Li, Alexey Bayduraev, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Riccardo Mancini,
	Adrian Hunter, Ian Rogers, German Gomez, linux-perf-users,
	linux-kernel, Andi Kleen, likexu, chengdongli

Em Tue, May 17, 2022 at 04:14:49PM -0700, Namhyung Kim escreveu:
> On Mon, May 16, 2022 at 6:57 PM Chengdong Li <brytonlee01@gmail.com> wrote:
> > Signed-off-by: Chengdong Li <chengdongli@tencent.com>
> 
> Acked-by: Namhyung Kim <namhyung@kernel.org>

Thanks, applied.

- Arnaldo


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-05-21 17:58 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-09 11:47 [PATCH] perf tools: fix callstack entries and nr print message Chengdong Li
2022-05-09 13:45 ` Jiri Olsa
2022-05-10  0:40   ` [PATCH v2] " Chengdong Li
2022-05-11  1:35     ` Bryton Lee
2022-05-17  1:57   ` [RESEND PATCH " Chengdong Li
2022-05-17 23:14     ` Namhyung Kim
2022-05-21 17:57       ` Arnaldo Carvalho de Melo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.