All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/3] perf cs-etm: Add support for callchain
@ 2019-08-30  6:24 ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2019-08-30  6:24 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Mathieu Poirier, Suzuki K Poulose,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-arm-kernel,
	linux-kernel, Mike Leach, Robert Walker, Adrian Hunter
  Cc: Leo Yan

This patch seris adds support for instruction sample's callchain.

Patch 01 is to refactor the instruction size calculation; patch 02
is to add thread stack and callchain support; patch 03 is a minor fixing
for instruction sample generation thus the instruction sample can be
alignment with the callchain generation.

Before:

  # perf script --itrace=g16l64i100
            main  1579        100      instructions:  ffff0000102137f0 group_sched_in+0xb0 ([kernel.kallsyms])
            main  1579        100      instructions:  ffff000010213b78 flexible_sched_in+0xf0 ([kernel.kallsyms])
            main  1579        100      instructions:  ffff0000102135ac event_sched_in.isra.57+0x74 ([kernel.kallsyms])
            main  1579        100      instructions:  ffff000010219344 perf_swevent_add+0x6c ([kernel.kallsyms])
            main  1579        100      instructions:  ffff000010214854 perf_event_update_userpage+0x4c ([kernel.kallsyms])

  [...]

After:

  # perf script --itrace=g16l64i100

  main  1579        100      instructions: 
          ffff0000102137f0 group_sched_in+0xb0 ([kernel.kallsyms])

  main  1579        100      instructions: 
          ffff000010213b78 flexible_sched_in+0xf0 ([kernel.kallsyms])
          ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

  main  1579        100      instructions: 
          ffff0000102135ac event_sched_in.isra.57+0x74 ([kernel.kallsyms])
          ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
          ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
          ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

  main  1579        100      instructions: 
          ffff000010219344 perf_swevent_add+0x6c ([kernel.kallsyms])
          ffff0000102135f4 event_sched_in.isra.57+0xbc ([kernel.kallsyms])
          ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
          ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
          ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

  main  1579        100      instructions: 
          ffff000010214854 perf_event_update_userpage+0x4c ([kernel.kallsyms])
          ffff000010219360 perf_swevent_add+0x88 ([kernel.kallsyms])
          ffff0000102135f4 event_sched_in.isra.57+0xbc ([kernel.kallsyms])
          ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
          ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
          ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

  [...]

Leo Yan (3):
  perf cs-etm: Refactor instruction size handling
  perf cs-etm: Add callchain to instruction sample
  perf cs-etm: Correct the packet usage for instruction sample

 tools/perf/util/cs-etm.c | 122 +++++++++++++++++++++++++++++++++------
 1 file changed, 105 insertions(+), 17 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v1 0/3] perf cs-etm: Add support for callchain
@ 2019-08-30  6:24 ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2019-08-30  6:24 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Mathieu Poirier, Suzuki K Poulose,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-arm-kernel,
	linux-kernel, Mike Leach, Robert Walker, Adrian Hunter
  Cc: Leo Yan

This patch seris adds support for instruction sample's callchain.

Patch 01 is to refactor the instruction size calculation; patch 02
is to add thread stack and callchain support; patch 03 is a minor fixing
for instruction sample generation thus the instruction sample can be
alignment with the callchain generation.

Before:

  # perf script --itrace=g16l64i100
            main  1579        100      instructions:  ffff0000102137f0 group_sched_in+0xb0 ([kernel.kallsyms])
            main  1579        100      instructions:  ffff000010213b78 flexible_sched_in+0xf0 ([kernel.kallsyms])
            main  1579        100      instructions:  ffff0000102135ac event_sched_in.isra.57+0x74 ([kernel.kallsyms])
            main  1579        100      instructions:  ffff000010219344 perf_swevent_add+0x6c ([kernel.kallsyms])
            main  1579        100      instructions:  ffff000010214854 perf_event_update_userpage+0x4c ([kernel.kallsyms])

  [...]

After:

  # perf script --itrace=g16l64i100

  main  1579        100      instructions: 
          ffff0000102137f0 group_sched_in+0xb0 ([kernel.kallsyms])

  main  1579        100      instructions: 
          ffff000010213b78 flexible_sched_in+0xf0 ([kernel.kallsyms])
          ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

  main  1579        100      instructions: 
          ffff0000102135ac event_sched_in.isra.57+0x74 ([kernel.kallsyms])
          ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
          ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
          ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

  main  1579        100      instructions: 
          ffff000010219344 perf_swevent_add+0x6c ([kernel.kallsyms])
          ffff0000102135f4 event_sched_in.isra.57+0xbc ([kernel.kallsyms])
          ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
          ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
          ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

  main  1579        100      instructions: 
          ffff000010214854 perf_event_update_userpage+0x4c ([kernel.kallsyms])
          ffff000010219360 perf_swevent_add+0x88 ([kernel.kallsyms])
          ffff0000102135f4 event_sched_in.isra.57+0xbc ([kernel.kallsyms])
          ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
          ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
          ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

  [...]

Leo Yan (3):
  perf cs-etm: Refactor instruction size handling
  perf cs-etm: Add callchain to instruction sample
  perf cs-etm: Correct the packet usage for instruction sample

 tools/perf/util/cs-etm.c | 122 +++++++++++++++++++++++++++++++++------
 1 file changed, 105 insertions(+), 17 deletions(-)

-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v1 1/3] perf cs-etm: Refactor instruction size handling
  2019-08-30  6:24 ` Leo Yan
@ 2019-08-30  6:24   ` Leo Yan
  -1 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2019-08-30  6:24 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Mathieu Poirier, Suzuki K Poulose,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-arm-kernel,
	linux-kernel, Mike Leach, Robert Walker, Adrian Hunter
  Cc: Leo Yan

There has several code pieces need to know the instruction size, but
now every place calculates the instruction size separately.

This patch refactors to create a new function cs_etm__instr_size() as
a central place to analyze the instruction length based on ISA type
and instruction value.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm.c | 44 +++++++++++++++++++++++++++-------------
 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index b3a5daaf1a8f..882a0718033d 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -914,6 +914,26 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
 	return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2;
 }
 
+static inline int cs_etm__instr_size(struct cs_etm_queue *etmq,
+				     u8 trace_chan_id,
+				     enum cs_etm_isa isa,
+				     u64 addr)
+{
+	int insn_len;
+
+	/*
+	 * T32 instruction size might be 32-bit or 16-bit, decide by calling
+	 * cs_etm__t32_instr_size().
+	 */
+	if (isa == CS_ETM_ISA_T32)
+		insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id, addr);
+	/* Otherwise, A64 and A32 instruction size are always 32-bit. */
+	else
+		insn_len = 4;
+
+	return insn_len;
+}
+
 static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
 {
 	/* Returns 0 for the CS_ETM_DISCONTINUITY packet */
@@ -938,19 +958,23 @@ static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
 				     const struct cs_etm_packet *packet,
 				     u64 offset)
 {
+	int insn_len;
+
 	if (packet->isa == CS_ETM_ISA_T32) {
 		u64 addr = packet->start_addr;
 
 		while (offset > 0) {
-			addr += cs_etm__t32_instr_size(etmq,
-						       trace_chan_id, addr);
+			addr += cs_etm__instr_size(etmq, trace_chan_id,
+						   packet->isa, addr);
 			offset--;
 		}
 		return addr;
 	}
 
-	/* Assume a 4 byte instruction size (A32/A64) */
-	return packet->start_addr + offset * 4;
+	/* Return instruction size for A32/A64 */
+	insn_len = cs_etm__instr_size(etmq, trace_chan_id,
+				      packet->isa, packet->start_addr);
+	return packet->start_addr + offset * insn_len;
 }
 
 static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq,
@@ -1090,16 +1114,8 @@ static void cs_etm__copy_insn(struct cs_etm_queue *etmq,
 		return;
 	}
 
-	/*
-	 * T32 instruction size might be 32-bit or 16-bit, decide by calling
-	 * cs_etm__t32_instr_size().
-	 */
-	if (packet->isa == CS_ETM_ISA_T32)
-		sample->insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id,
-							  sample->ip);
-	/* Otherwise, A64 and A32 instruction size are always 32-bit. */
-	else
-		sample->insn_len = 4;
+	sample->insn_len = cs_etm__instr_size(etmq, trace_chan_id,
+					      packet->isa, sample->ip);
 
 	cs_etm__mem_access(etmq, trace_chan_id, sample->ip,
 			   sample->insn_len, (void *)sample->insn);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v1 1/3] perf cs-etm: Refactor instruction size handling
@ 2019-08-30  6:24   ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2019-08-30  6:24 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Mathieu Poirier, Suzuki K Poulose,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-arm-kernel,
	linux-kernel, Mike Leach, Robert Walker, Adrian Hunter
  Cc: Leo Yan

There has several code pieces need to know the instruction size, but
now every place calculates the instruction size separately.

This patch refactors to create a new function cs_etm__instr_size() as
a central place to analyze the instruction length based on ISA type
and instruction value.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm.c | 44 +++++++++++++++++++++++++++-------------
 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index b3a5daaf1a8f..882a0718033d 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -914,6 +914,26 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
 	return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2;
 }
 
+static inline int cs_etm__instr_size(struct cs_etm_queue *etmq,
+				     u8 trace_chan_id,
+				     enum cs_etm_isa isa,
+				     u64 addr)
+{
+	int insn_len;
+
+	/*
+	 * T32 instruction size might be 32-bit or 16-bit, decide by calling
+	 * cs_etm__t32_instr_size().
+	 */
+	if (isa == CS_ETM_ISA_T32)
+		insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id, addr);
+	/* Otherwise, A64 and A32 instruction size are always 32-bit. */
+	else
+		insn_len = 4;
+
+	return insn_len;
+}
+
 static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
 {
 	/* Returns 0 for the CS_ETM_DISCONTINUITY packet */
@@ -938,19 +958,23 @@ static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
 				     const struct cs_etm_packet *packet,
 				     u64 offset)
 {
+	int insn_len;
+
 	if (packet->isa == CS_ETM_ISA_T32) {
 		u64 addr = packet->start_addr;
 
 		while (offset > 0) {
-			addr += cs_etm__t32_instr_size(etmq,
-						       trace_chan_id, addr);
+			addr += cs_etm__instr_size(etmq, trace_chan_id,
+						   packet->isa, addr);
 			offset--;
 		}
 		return addr;
 	}
 
-	/* Assume a 4 byte instruction size (A32/A64) */
-	return packet->start_addr + offset * 4;
+	/* Return instruction size for A32/A64 */
+	insn_len = cs_etm__instr_size(etmq, trace_chan_id,
+				      packet->isa, packet->start_addr);
+	return packet->start_addr + offset * insn_len;
 }
 
 static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq,
@@ -1090,16 +1114,8 @@ static void cs_etm__copy_insn(struct cs_etm_queue *etmq,
 		return;
 	}
 
-	/*
-	 * T32 instruction size might be 32-bit or 16-bit, decide by calling
-	 * cs_etm__t32_instr_size().
-	 */
-	if (packet->isa == CS_ETM_ISA_T32)
-		sample->insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id,
-							  sample->ip);
-	/* Otherwise, A64 and A32 instruction size are always 32-bit. */
-	else
-		sample->insn_len = 4;
+	sample->insn_len = cs_etm__instr_size(etmq, trace_chan_id,
+					      packet->isa, sample->ip);
 
 	cs_etm__mem_access(etmq, trace_chan_id, sample->ip,
 			   sample->insn_len, (void *)sample->insn);
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v1 2/3] perf cs-etm: Add callchain to instruction sample
  2019-08-30  6:24 ` Leo Yan
@ 2019-08-30  6:24   ` Leo Yan
  -1 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2019-08-30  6:24 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Mathieu Poirier, Suzuki K Poulose,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-arm-kernel,
	linux-kernel, Mike Leach, Robert Walker, Adrian Hunter
  Cc: Leo Yan

Firstly, this patch adds support for the thread stack; when every branch
packet is coming we will push or pop the stack based on the sample
flags.

Secondly, based on the thread stack we can synthesize call chain for the
instruction sample, this can be used by itrace option '--itrace=g'.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm.c | 74 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 73 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 882a0718033d..ad573d3bd305 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -17,6 +17,7 @@
 #include <stdlib.h>
 
 #include "auxtrace.h"
+#include "callchain.h"
 #include "color.h"
 #include "cs-etm.h"
 #include "cs-etm-decoder/cs-etm-decoder.h"
@@ -69,6 +70,7 @@ struct cs_etm_traceid_queue {
 	size_t last_branch_pos;
 	union perf_event *event_buf;
 	struct thread *thread;
+	struct ip_callchain *chain;
 	struct branch_stack *last_branch;
 	struct branch_stack *last_branch_rb;
 	struct cs_etm_packet *prev_packet;
@@ -246,6 +248,16 @@ static int cs_etm__init_traceid_queue(struct cs_etm_queue *etmq,
 	if (!tidq->prev_packet)
 		goto out_free;
 
+	if (etm->synth_opts.callchain) {
+		size_t sz = sizeof(struct ip_callchain);
+
+		/* Add 1 to callchain_sz for callchain context */
+		sz += (etm->synth_opts.callchain_sz + 1) * sizeof(u64);
+		tidq->chain = zalloc(sz);
+		if (!tidq->chain)
+			goto out_free;
+	}
+
 	if (etm->synth_opts.last_branch) {
 		size_t sz = sizeof(struct branch_stack);
 
@@ -270,6 +282,7 @@ static int cs_etm__init_traceid_queue(struct cs_etm_queue *etmq,
 	zfree(&tidq->last_branch);
 	zfree(&tidq->prev_packet);
 	zfree(&tidq->packet);
+	zfree(&tidq->chain);
 out:
 	return rc;
 }
@@ -541,6 +554,7 @@ static void cs_etm__free_traceid_queues(struct cs_etm_queue *etmq)
 		zfree(&tidq->last_branch_rb);
 		zfree(&tidq->prev_packet);
 		zfree(&tidq->packet);
+		zfree(&tidq->chain);
 		zfree(&tidq);
 
 		/*
@@ -1121,6 +1135,41 @@ static void cs_etm__copy_insn(struct cs_etm_queue *etmq,
 			   sample->insn_len, (void *)sample->insn);
 }
 
+static void cs_etm__add_stack_event(struct cs_etm_queue *etmq,
+				    struct cs_etm_traceid_queue *tidq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	u8 trace_chan_id = tidq->trace_chan_id;
+	int insn_len;
+	u64 from_ip, to_ip;
+
+	if (etm->synth_opts.callchain || etm->synth_opts.thread_stack) {
+
+		from_ip = cs_etm__last_executed_instr(tidq->prev_packet);
+		to_ip = cs_etm__first_executed_instr(tidq->packet);
+
+		/*
+		 * T32 instruction size might be 32-bit or 16-bit, decide by
+		 * calling cs_etm__t32_instr_size().
+		 */
+		if (tidq->prev_packet->isa == CS_ETM_ISA_T32)
+			insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id,
+							  from_ip);
+		/* Otherwise, A64 and A32 instruction size are always 32-bit. */
+		else
+			insn_len = 4;
+
+		thread_stack__event(tidq->thread, tidq->prev_packet->cpu,
+				    tidq->prev_packet->flags,
+				    from_ip, to_ip, insn_len,
+				    etmq->buffer->buffer_nr);
+	} else {
+		thread_stack__set_trace_nr(tidq->thread,
+					   tidq->prev_packet->cpu,
+					   etmq->buffer->buffer_nr);
+	}
+}
+
 static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
 					    struct cs_etm_traceid_queue *tidq,
 					    u64 addr, u64 period)
@@ -1146,6 +1195,14 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
 
 	cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->packet, &sample);
 
+	if (etm->synth_opts.callchain) {
+		thread_stack__sample(tidq->thread, tidq->packet->cpu,
+				     tidq->chain,
+				     etm->synth_opts.callchain_sz + 1,
+				     sample.ip, etm->kernel_start);
+		sample.callchain = tidq->chain;
+	}
+
 	if (etm->synth_opts.last_branch) {
 		cs_etm__copy_last_branch_rb(etmq, tidq);
 		sample.branch_stack = tidq->last_branch;
@@ -1329,6 +1386,8 @@ static int cs_etm__synth_events(struct cs_etm_auxtrace *etm,
 		attr.sample_type &= ~(u64)PERF_SAMPLE_ADDR;
 	}
 
+	if (etm->synth_opts.callchain)
+		attr.sample_type |= PERF_SAMPLE_CALLCHAIN;
 	if (etm->synth_opts.last_branch)
 		attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
 
@@ -1397,6 +1456,9 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
 		tidq->period_instructions = instrs_over;
 	}
 
+	if (tidq->prev_packet->last_instr_taken_branch)
+		cs_etm__add_stack_event(etmq, tidq);
+
 	if (etm->sample_branches) {
 		bool generate_sample = false;
 
@@ -2596,7 +2658,17 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
 	} else {
 		itrace_synth_opts__set_default(&etm->synth_opts,
 				session->itrace_synth_opts->default_no_sample);
-		etm->synth_opts.callchain = false;
+
+		etm->synth_opts.thread_stack =
+				session->itrace_synth_opts->thread_stack;
+	}
+
+	if (etm->synth_opts.callchain && !symbol_conf.use_callchain) {
+		symbol_conf.use_callchain = true;
+		if (callchain_register_param(&callchain_param) < 0) {
+			symbol_conf.use_callchain = false;
+			etm->synth_opts.callchain = false;
+		}
 	}
 
 	err = cs_etm__synth_events(etm, session);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v1 2/3] perf cs-etm: Add callchain to instruction sample
@ 2019-08-30  6:24   ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2019-08-30  6:24 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Mathieu Poirier, Suzuki K Poulose,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-arm-kernel,
	linux-kernel, Mike Leach, Robert Walker, Adrian Hunter
  Cc: Leo Yan

Firstly, this patch adds support for the thread stack; when every branch
packet is coming we will push or pop the stack based on the sample
flags.

Secondly, based on the thread stack we can synthesize call chain for the
instruction sample, this can be used by itrace option '--itrace=g'.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm.c | 74 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 73 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 882a0718033d..ad573d3bd305 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -17,6 +17,7 @@
 #include <stdlib.h>
 
 #include "auxtrace.h"
+#include "callchain.h"
 #include "color.h"
 #include "cs-etm.h"
 #include "cs-etm-decoder/cs-etm-decoder.h"
@@ -69,6 +70,7 @@ struct cs_etm_traceid_queue {
 	size_t last_branch_pos;
 	union perf_event *event_buf;
 	struct thread *thread;
+	struct ip_callchain *chain;
 	struct branch_stack *last_branch;
 	struct branch_stack *last_branch_rb;
 	struct cs_etm_packet *prev_packet;
@@ -246,6 +248,16 @@ static int cs_etm__init_traceid_queue(struct cs_etm_queue *etmq,
 	if (!tidq->prev_packet)
 		goto out_free;
 
+	if (etm->synth_opts.callchain) {
+		size_t sz = sizeof(struct ip_callchain);
+
+		/* Add 1 to callchain_sz for callchain context */
+		sz += (etm->synth_opts.callchain_sz + 1) * sizeof(u64);
+		tidq->chain = zalloc(sz);
+		if (!tidq->chain)
+			goto out_free;
+	}
+
 	if (etm->synth_opts.last_branch) {
 		size_t sz = sizeof(struct branch_stack);
 
@@ -270,6 +282,7 @@ static int cs_etm__init_traceid_queue(struct cs_etm_queue *etmq,
 	zfree(&tidq->last_branch);
 	zfree(&tidq->prev_packet);
 	zfree(&tidq->packet);
+	zfree(&tidq->chain);
 out:
 	return rc;
 }
@@ -541,6 +554,7 @@ static void cs_etm__free_traceid_queues(struct cs_etm_queue *etmq)
 		zfree(&tidq->last_branch_rb);
 		zfree(&tidq->prev_packet);
 		zfree(&tidq->packet);
+		zfree(&tidq->chain);
 		zfree(&tidq);
 
 		/*
@@ -1121,6 +1135,41 @@ static void cs_etm__copy_insn(struct cs_etm_queue *etmq,
 			   sample->insn_len, (void *)sample->insn);
 }
 
+static void cs_etm__add_stack_event(struct cs_etm_queue *etmq,
+				    struct cs_etm_traceid_queue *tidq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	u8 trace_chan_id = tidq->trace_chan_id;
+	int insn_len;
+	u64 from_ip, to_ip;
+
+	if (etm->synth_opts.callchain || etm->synth_opts.thread_stack) {
+
+		from_ip = cs_etm__last_executed_instr(tidq->prev_packet);
+		to_ip = cs_etm__first_executed_instr(tidq->packet);
+
+		/*
+		 * T32 instruction size might be 32-bit or 16-bit, decide by
+		 * calling cs_etm__t32_instr_size().
+		 */
+		if (tidq->prev_packet->isa == CS_ETM_ISA_T32)
+			insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id,
+							  from_ip);
+		/* Otherwise, A64 and A32 instruction size are always 32-bit. */
+		else
+			insn_len = 4;
+
+		thread_stack__event(tidq->thread, tidq->prev_packet->cpu,
+				    tidq->prev_packet->flags,
+				    from_ip, to_ip, insn_len,
+				    etmq->buffer->buffer_nr);
+	} else {
+		thread_stack__set_trace_nr(tidq->thread,
+					   tidq->prev_packet->cpu,
+					   etmq->buffer->buffer_nr);
+	}
+}
+
 static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
 					    struct cs_etm_traceid_queue *tidq,
 					    u64 addr, u64 period)
@@ -1146,6 +1195,14 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
 
 	cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->packet, &sample);
 
+	if (etm->synth_opts.callchain) {
+		thread_stack__sample(tidq->thread, tidq->packet->cpu,
+				     tidq->chain,
+				     etm->synth_opts.callchain_sz + 1,
+				     sample.ip, etm->kernel_start);
+		sample.callchain = tidq->chain;
+	}
+
 	if (etm->synth_opts.last_branch) {
 		cs_etm__copy_last_branch_rb(etmq, tidq);
 		sample.branch_stack = tidq->last_branch;
@@ -1329,6 +1386,8 @@ static int cs_etm__synth_events(struct cs_etm_auxtrace *etm,
 		attr.sample_type &= ~(u64)PERF_SAMPLE_ADDR;
 	}
 
+	if (etm->synth_opts.callchain)
+		attr.sample_type |= PERF_SAMPLE_CALLCHAIN;
 	if (etm->synth_opts.last_branch)
 		attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
 
@@ -1397,6 +1456,9 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
 		tidq->period_instructions = instrs_over;
 	}
 
+	if (tidq->prev_packet->last_instr_taken_branch)
+		cs_etm__add_stack_event(etmq, tidq);
+
 	if (etm->sample_branches) {
 		bool generate_sample = false;
 
@@ -2596,7 +2658,17 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
 	} else {
 		itrace_synth_opts__set_default(&etm->synth_opts,
 				session->itrace_synth_opts->default_no_sample);
-		etm->synth_opts.callchain = false;
+
+		etm->synth_opts.thread_stack =
+				session->itrace_synth_opts->thread_stack;
+	}
+
+	if (etm->synth_opts.callchain && !symbol_conf.use_callchain) {
+		symbol_conf.use_callchain = true;
+		if (callchain_register_param(&callchain_param) < 0) {
+			symbol_conf.use_callchain = false;
+			etm->synth_opts.callchain = false;
+		}
 	}
 
 	err = cs_etm__synth_events(etm, session);
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v1 3/3] perf cs-etm: Correct the packet usage for instruction sample
  2019-08-30  6:24 ` Leo Yan
@ 2019-08-30  6:24   ` Leo Yan
  -1 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2019-08-30  6:24 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Mathieu Poirier, Suzuki K Poulose,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-arm-kernel,
	linux-kernel, Mike Leach, Robert Walker, Adrian Hunter
  Cc: Leo Yan

It uses 'tidq->packet' rather than 'tidq->prev_packet' to generate
instruction sample, comparing against the thread stack and the branch
samples which are using the 'tidp->prev_packet', thus this leads the
instruction sample to use one ahead packet than thread stack and branch
samples.

As result, the instruction's call chain can be wrongly displayed as
below:

  main  1579        100      instructions:
  	ffff000010214854 perf_event_update_userpage+0x4c ([kernel.kallsyms])
  	ffff000010214850 perf_event_update_userpage+0x48 ([kernel.kallsyms])
  	ffff000010219360 perf_swevent_add+0x88 ([kernel.kallsyms])
  	ffff0000102135f4 event_sched_in.isra.57+0xbc ([kernel.kallsyms])
  	ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
  	ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
  	ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

In this log, the continuous two lines includes two functions, the up
line contains the child function info and the bottom line contains
its parent function, and so forth.  But if review the first two lines:

  perf_event_update_userpage+0x4c  => the sampled instruction
  perf_event_update_userpage+0x48  => the parent function's calling

The child function and parent function is the same function
perf_event_update_userpage(), but perf_event_update_userpage() isn't
a recursive function, thus this calling sequence shouldn't never happen.
This is caused by the instruction sample using the 'tidq->packet', but
this packet is not handled yet by thread stack, the thread stack is
delayed to handle one return packet for stack popping.

To fix this issue, we can simply to use 'tidq->prev_packet' to generate
the instruction sample, this allows the thread stack to push/pop
properly for instruction sample.  Finally, we can get below result:

  main  1579        100      instructions:
	ffff000010214854 perf_event_update_userpage+0x4c ([kernel.kallsyms])
	ffff000010219360 perf_swevent_add+0x88 ([kernel.kallsyms])
	ffff0000102135f4 event_sched_in.isra.57+0xbc ([kernel.kallsyms])
	ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
	ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
	ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index ad573d3bd305..0bd2b3c48394 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1414,7 +1414,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
 	struct cs_etm_packet *tmp;
 	int ret;
 	u8 trace_chan_id = tidq->trace_chan_id;
-	u64 instrs_executed = tidq->packet->instr_count;
+	u64 instrs_executed = tidq->prev_packet->instr_count;
 
 	tidq->period_instructions += instrs_executed;
 
@@ -1445,7 +1445,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
 		 */
 		u64 offset = (instrs_executed - instrs_over - 1);
 		u64 addr = cs_etm__instr_addr(etmq, trace_chan_id,
-					      tidq->packet, offset);
+					      tidq->prev_packet, offset);
 
 		ret = cs_etm__synth_instruction_sample(
 			etmq, tidq, addr, etm->instructions_sample_period);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v1 3/3] perf cs-etm: Correct the packet usage for instruction sample
@ 2019-08-30  6:24   ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2019-08-30  6:24 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Mathieu Poirier, Suzuki K Poulose,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-arm-kernel,
	linux-kernel, Mike Leach, Robert Walker, Adrian Hunter
  Cc: Leo Yan

It uses 'tidq->packet' rather than 'tidq->prev_packet' to generate
instruction sample, comparing against the thread stack and the branch
samples which are using the 'tidp->prev_packet', thus this leads the
instruction sample to use one ahead packet than thread stack and branch
samples.

As result, the instruction's call chain can be wrongly displayed as
below:

  main  1579        100      instructions:
  	ffff000010214854 perf_event_update_userpage+0x4c ([kernel.kallsyms])
  	ffff000010214850 perf_event_update_userpage+0x48 ([kernel.kallsyms])
  	ffff000010219360 perf_swevent_add+0x88 ([kernel.kallsyms])
  	ffff0000102135f4 event_sched_in.isra.57+0xbc ([kernel.kallsyms])
  	ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
  	ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
  	ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

In this log, the continuous two lines includes two functions, the up
line contains the child function info and the bottom line contains
its parent function, and so forth.  But if review the first two lines:

  perf_event_update_userpage+0x4c  => the sampled instruction
  perf_event_update_userpage+0x48  => the parent function's calling

The child function and parent function is the same function
perf_event_update_userpage(), but perf_event_update_userpage() isn't
a recursive function, thus this calling sequence shouldn't never happen.
This is caused by the instruction sample using the 'tidq->packet', but
this packet is not handled yet by thread stack, the thread stack is
delayed to handle one return packet for stack popping.

To fix this issue, we can simply to use 'tidq->prev_packet' to generate
the instruction sample, this allows the thread stack to push/pop
properly for instruction sample.  Finally, we can get below result:

  main  1579        100      instructions:
	ffff000010214854 perf_event_update_userpage+0x4c ([kernel.kallsyms])
	ffff000010219360 perf_swevent_add+0x88 ([kernel.kallsyms])
	ffff0000102135f4 event_sched_in.isra.57+0xbc ([kernel.kallsyms])
	ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
	ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
	ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index ad573d3bd305..0bd2b3c48394 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1414,7 +1414,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
 	struct cs_etm_packet *tmp;
 	int ret;
 	u8 trace_chan_id = tidq->trace_chan_id;
-	u64 instrs_executed = tidq->packet->instr_count;
+	u64 instrs_executed = tidq->prev_packet->instr_count;
 
 	tidq->period_instructions += instrs_executed;
 
@@ -1445,7 +1445,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
 		 */
 		u64 offset = (instrs_executed - instrs_over - 1);
 		u64 addr = cs_etm__instr_addr(etmq, trace_chan_id,
-					      tidq->packet, offset);
+					      tidq->prev_packet, offset);
 
 		ret = cs_etm__synth_instruction_sample(
 			etmq, tidq, addr, etm->instructions_sample_period);
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 2/3] perf cs-etm: Add callchain to instruction sample
  2019-08-30  6:24   ` Leo Yan
@ 2019-09-03 22:07     ` Mathieu Poirier
  -1 siblings, 0 replies; 18+ messages in thread
From: Mathieu Poirier @ 2019-09-03 22:07 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Suzuki K Poulose, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-arm-kernel, linux-kernel,
	Mike Leach, Robert Walker, Adrian Hunter

On Fri, Aug 30, 2019 at 02:24:20PM +0800, Leo Yan wrote:
> Firstly, this patch adds support for the thread stack; when every branch
> packet is coming we will push or pop the stack based on the sample
> flags.
> 
> Secondly, based on the thread stack we can synthesize call chain for the
> instruction sample, this can be used by itrace option '--itrace=g'.
>

In most cases using the word "secondly" is a good indication the patch should be
split.
 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/util/cs-etm.c | 74 +++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 73 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 882a0718033d..ad573d3bd305 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -17,6 +17,7 @@
>  #include <stdlib.h>
>  
>  #include "auxtrace.h"
> +#include "callchain.h"
>  #include "color.h"
>  #include "cs-etm.h"
>  #include "cs-etm-decoder/cs-etm-decoder.h"
> @@ -69,6 +70,7 @@ struct cs_etm_traceid_queue {
>  	size_t last_branch_pos;
>  	union perf_event *event_buf;
>  	struct thread *thread;
> +	struct ip_callchain *chain;
>  	struct branch_stack *last_branch;
>  	struct branch_stack *last_branch_rb;
>  	struct cs_etm_packet *prev_packet;
> @@ -246,6 +248,16 @@ static int cs_etm__init_traceid_queue(struct cs_etm_queue *etmq,
>  	if (!tidq->prev_packet)
>  		goto out_free;
>  
> +	if (etm->synth_opts.callchain) {
> +		size_t sz = sizeof(struct ip_callchain);
> +
> +		/* Add 1 to callchain_sz for callchain context */
> +		sz += (etm->synth_opts.callchain_sz + 1) * sizeof(u64);
> +		tidq->chain = zalloc(sz);
> +		if (!tidq->chain)
> +			goto out_free;
> +	}
> +
>  	if (etm->synth_opts.last_branch) {
>  		size_t sz = sizeof(struct branch_stack);
>  
> @@ -270,6 +282,7 @@ static int cs_etm__init_traceid_queue(struct cs_etm_queue *etmq,
>  	zfree(&tidq->last_branch);
>  	zfree(&tidq->prev_packet);
>  	zfree(&tidq->packet);
> +	zfree(&tidq->chain);
>  out:
>  	return rc;
>  }
> @@ -541,6 +554,7 @@ static void cs_etm__free_traceid_queues(struct cs_etm_queue *etmq)
>  		zfree(&tidq->last_branch_rb);
>  		zfree(&tidq->prev_packet);
>  		zfree(&tidq->packet);
> +		zfree(&tidq->chain);
>  		zfree(&tidq);
>  
>  		/*
> @@ -1121,6 +1135,41 @@ static void cs_etm__copy_insn(struct cs_etm_queue *etmq,
>  			   sample->insn_len, (void *)sample->insn);
>  }
>  
> +static void cs_etm__add_stack_event(struct cs_etm_queue *etmq,
> +				    struct cs_etm_traceid_queue *tidq)
> +{
> +	struct cs_etm_auxtrace *etm = etmq->etm;
> +	u8 trace_chan_id = tidq->trace_chan_id;
> +	int insn_len;
> +	u64 from_ip, to_ip;
> +
> +	if (etm->synth_opts.callchain || etm->synth_opts.thread_stack) {
> +
> +		from_ip = cs_etm__last_executed_instr(tidq->prev_packet);
> +		to_ip = cs_etm__first_executed_instr(tidq->packet);
> +
> +		/*
> +		 * T32 instruction size might be 32-bit or 16-bit, decide by
> +		 * calling cs_etm__t32_instr_size().
> +		 */
> +		if (tidq->prev_packet->isa == CS_ETM_ISA_T32)
> +			insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id,
> +							  from_ip);
> +		/* Otherwise, A64 and A32 instruction size are always 32-bit. */
> +		else
> +			insn_len = 4;
> +
> +		thread_stack__event(tidq->thread, tidq->prev_packet->cpu,
> +				    tidq->prev_packet->flags,
> +				    from_ip, to_ip, insn_len,
> +				    etmq->buffer->buffer_nr);
> +	} else {
> +		thread_stack__set_trace_nr(tidq->thread,
> +					   tidq->prev_packet->cpu,
> +					   etmq->buffer->buffer_nr);

Please add a comment on what the above does.  As a rule of thumb I add a comment
per addition in a patch in order to help people understand what is happening and
some of the reasonning behing the code.

> +	}
> +}
> +
>  static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
>  					    struct cs_etm_traceid_queue *tidq,
>  					    u64 addr, u64 period)
> @@ -1146,6 +1195,14 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
>  
>  	cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->packet, &sample);
>  
> +	if (etm->synth_opts.callchain) {
> +		thread_stack__sample(tidq->thread, tidq->packet->cpu,
> +				     tidq->chain,
> +				     etm->synth_opts.callchain_sz + 1,
> +				     sample.ip, etm->kernel_start);
> +		sample.callchain = tidq->chain;
> +	}
> +
>  	if (etm->synth_opts.last_branch) {
>  		cs_etm__copy_last_branch_rb(etmq, tidq);
>  		sample.branch_stack = tidq->last_branch;
> @@ -1329,6 +1386,8 @@ static int cs_etm__synth_events(struct cs_etm_auxtrace *etm,
>  		attr.sample_type &= ~(u64)PERF_SAMPLE_ADDR;
>  	}
>  
> +	if (etm->synth_opts.callchain)
> +		attr.sample_type |= PERF_SAMPLE_CALLCHAIN;
>  	if (etm->synth_opts.last_branch)
>  		attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
>  
> @@ -1397,6 +1456,9 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
>  		tidq->period_instructions = instrs_over;
>  	}
>  
> +	if (tidq->prev_packet->last_instr_taken_branch)
> +		cs_etm__add_stack_event(etmq, tidq);
> +
>  	if (etm->sample_branches) {
>  		bool generate_sample = false;
>  
> @@ -2596,7 +2658,17 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
>  	} else {
>  		itrace_synth_opts__set_default(&etm->synth_opts,
>  				session->itrace_synth_opts->default_no_sample);
> -		etm->synth_opts.callchain = false;
> +
> +		etm->synth_opts.thread_stack =
> +				session->itrace_synth_opts->thread_stack;
> +	}
> +
> +	if (etm->synth_opts.callchain && !symbol_conf.use_callchain) {
> +		symbol_conf.use_callchain = true;
> +		if (callchain_register_param(&callchain_param) < 0) {
> +			symbol_conf.use_callchain = false;
> +			etm->synth_opts.callchain = false;
> +		}
>  	}
>  
>  	err = cs_etm__synth_events(etm, session);
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 2/3] perf cs-etm: Add callchain to instruction sample
@ 2019-09-03 22:07     ` Mathieu Poirier
  0 siblings, 0 replies; 18+ messages in thread
From: Mathieu Poirier @ 2019-09-03 22:07 UTC (permalink / raw)
  To: Leo Yan
  Cc: Suzuki K Poulose, Alexander Shishkin, linux-kernel,
	Arnaldo Carvalho de Melo, Adrian Hunter, Namhyung Kim,
	Robert Walker, Jiri Olsa, linux-arm-kernel, Mike Leach

On Fri, Aug 30, 2019 at 02:24:20PM +0800, Leo Yan wrote:
> Firstly, this patch adds support for the thread stack; when every branch
> packet is coming we will push or pop the stack based on the sample
> flags.
> 
> Secondly, based on the thread stack we can synthesize call chain for the
> instruction sample, this can be used by itrace option '--itrace=g'.
>

In most cases using the word "secondly" is a good indication the patch should be
split.
 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/util/cs-etm.c | 74 +++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 73 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 882a0718033d..ad573d3bd305 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -17,6 +17,7 @@
>  #include <stdlib.h>
>  
>  #include "auxtrace.h"
> +#include "callchain.h"
>  #include "color.h"
>  #include "cs-etm.h"
>  #include "cs-etm-decoder/cs-etm-decoder.h"
> @@ -69,6 +70,7 @@ struct cs_etm_traceid_queue {
>  	size_t last_branch_pos;
>  	union perf_event *event_buf;
>  	struct thread *thread;
> +	struct ip_callchain *chain;
>  	struct branch_stack *last_branch;
>  	struct branch_stack *last_branch_rb;
>  	struct cs_etm_packet *prev_packet;
> @@ -246,6 +248,16 @@ static int cs_etm__init_traceid_queue(struct cs_etm_queue *etmq,
>  	if (!tidq->prev_packet)
>  		goto out_free;
>  
> +	if (etm->synth_opts.callchain) {
> +		size_t sz = sizeof(struct ip_callchain);
> +
> +		/* Add 1 to callchain_sz for callchain context */
> +		sz += (etm->synth_opts.callchain_sz + 1) * sizeof(u64);
> +		tidq->chain = zalloc(sz);
> +		if (!tidq->chain)
> +			goto out_free;
> +	}
> +
>  	if (etm->synth_opts.last_branch) {
>  		size_t sz = sizeof(struct branch_stack);
>  
> @@ -270,6 +282,7 @@ static int cs_etm__init_traceid_queue(struct cs_etm_queue *etmq,
>  	zfree(&tidq->last_branch);
>  	zfree(&tidq->prev_packet);
>  	zfree(&tidq->packet);
> +	zfree(&tidq->chain);
>  out:
>  	return rc;
>  }
> @@ -541,6 +554,7 @@ static void cs_etm__free_traceid_queues(struct cs_etm_queue *etmq)
>  		zfree(&tidq->last_branch_rb);
>  		zfree(&tidq->prev_packet);
>  		zfree(&tidq->packet);
> +		zfree(&tidq->chain);
>  		zfree(&tidq);
>  
>  		/*
> @@ -1121,6 +1135,41 @@ static void cs_etm__copy_insn(struct cs_etm_queue *etmq,
>  			   sample->insn_len, (void *)sample->insn);
>  }
>  
> +static void cs_etm__add_stack_event(struct cs_etm_queue *etmq,
> +				    struct cs_etm_traceid_queue *tidq)
> +{
> +	struct cs_etm_auxtrace *etm = etmq->etm;
> +	u8 trace_chan_id = tidq->trace_chan_id;
> +	int insn_len;
> +	u64 from_ip, to_ip;
> +
> +	if (etm->synth_opts.callchain || etm->synth_opts.thread_stack) {
> +
> +		from_ip = cs_etm__last_executed_instr(tidq->prev_packet);
> +		to_ip = cs_etm__first_executed_instr(tidq->packet);
> +
> +		/*
> +		 * T32 instruction size might be 32-bit or 16-bit, decide by
> +		 * calling cs_etm__t32_instr_size().
> +		 */
> +		if (tidq->prev_packet->isa == CS_ETM_ISA_T32)
> +			insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id,
> +							  from_ip);
> +		/* Otherwise, A64 and A32 instruction size are always 32-bit. */
> +		else
> +			insn_len = 4;
> +
> +		thread_stack__event(tidq->thread, tidq->prev_packet->cpu,
> +				    tidq->prev_packet->flags,
> +				    from_ip, to_ip, insn_len,
> +				    etmq->buffer->buffer_nr);
> +	} else {
> +		thread_stack__set_trace_nr(tidq->thread,
> +					   tidq->prev_packet->cpu,
> +					   etmq->buffer->buffer_nr);

Please add a comment on what the above does.  As a rule of thumb I add a comment
per addition in a patch in order to help people understand what is happening and
some of the reasonning behing the code.

> +	}
> +}
> +
>  static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
>  					    struct cs_etm_traceid_queue *tidq,
>  					    u64 addr, u64 period)
> @@ -1146,6 +1195,14 @@ static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
>  
>  	cs_etm__copy_insn(etmq, tidq->trace_chan_id, tidq->packet, &sample);
>  
> +	if (etm->synth_opts.callchain) {
> +		thread_stack__sample(tidq->thread, tidq->packet->cpu,
> +				     tidq->chain,
> +				     etm->synth_opts.callchain_sz + 1,
> +				     sample.ip, etm->kernel_start);
> +		sample.callchain = tidq->chain;
> +	}
> +
>  	if (etm->synth_opts.last_branch) {
>  		cs_etm__copy_last_branch_rb(etmq, tidq);
>  		sample.branch_stack = tidq->last_branch;
> @@ -1329,6 +1386,8 @@ static int cs_etm__synth_events(struct cs_etm_auxtrace *etm,
>  		attr.sample_type &= ~(u64)PERF_SAMPLE_ADDR;
>  	}
>  
> +	if (etm->synth_opts.callchain)
> +		attr.sample_type |= PERF_SAMPLE_CALLCHAIN;
>  	if (etm->synth_opts.last_branch)
>  		attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
>  
> @@ -1397,6 +1456,9 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
>  		tidq->period_instructions = instrs_over;
>  	}
>  
> +	if (tidq->prev_packet->last_instr_taken_branch)
> +		cs_etm__add_stack_event(etmq, tidq);
> +
>  	if (etm->sample_branches) {
>  		bool generate_sample = false;
>  
> @@ -2596,7 +2658,17 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
>  	} else {
>  		itrace_synth_opts__set_default(&etm->synth_opts,
>  				session->itrace_synth_opts->default_no_sample);
> -		etm->synth_opts.callchain = false;
> +
> +		etm->synth_opts.thread_stack =
> +				session->itrace_synth_opts->thread_stack;
> +	}
> +
> +	if (etm->synth_opts.callchain && !symbol_conf.use_callchain) {
> +		symbol_conf.use_callchain = true;
> +		if (callchain_register_param(&callchain_param) < 0) {
> +			symbol_conf.use_callchain = false;
> +			etm->synth_opts.callchain = false;
> +		}
>  	}
>  
>  	err = cs_etm__synth_events(etm, session);
> -- 
> 2.17.1
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 1/3] perf cs-etm: Refactor instruction size handling
  2019-08-30  6:24   ` Leo Yan
@ 2019-09-03 22:22     ` Mathieu Poirier
  -1 siblings, 0 replies; 18+ messages in thread
From: Mathieu Poirier @ 2019-09-03 22:22 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Suzuki K Poulose, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-arm-kernel, linux-kernel,
	Mike Leach, Robert Walker, Adrian Hunter

On Fri, Aug 30, 2019 at 02:24:19PM +0800, Leo Yan wrote:
> There has several code pieces need to know the instruction size, but
> now every place calculates the instruction size separately.
> 
> This patch refactors to create a new function cs_etm__instr_size() as
> a central place to analyze the instruction length based on ISA type
> and instruction value.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/util/cs-etm.c | 44 +++++++++++++++++++++++++++-------------
>  1 file changed, 30 insertions(+), 14 deletions(-)
> 
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index b3a5daaf1a8f..882a0718033d 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -914,6 +914,26 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
>  	return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2;
>  }
>  
> +static inline int cs_etm__instr_size(struct cs_etm_queue *etmq,
> +				     u8 trace_chan_id,
> +				     enum cs_etm_isa isa,
> +				     u64 addr)
> +{
> +	int insn_len;
> +
> +	/*
> +	 * T32 instruction size might be 32-bit or 16-bit, decide by calling
> +	 * cs_etm__t32_instr_size().
> +	 */
> +	if (isa == CS_ETM_ISA_T32)
> +		insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id, addr);
> +	/* Otherwise, A64 and A32 instruction size are always 32-bit. */
> +	else
> +		insn_len = 4;
> +
> +	return insn_len;
> +}
> +
>  static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
>  {
>  	/* Returns 0 for the CS_ETM_DISCONTINUITY packet */
> @@ -938,19 +958,23 @@ static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
>  				     const struct cs_etm_packet *packet,
>  				     u64 offset)
>  {
> +	int insn_len;
> +
>  	if (packet->isa == CS_ETM_ISA_T32) {
>  		u64 addr = packet->start_addr;
>  
>  		while (offset > 0) {
> -			addr += cs_etm__t32_instr_size(etmq,
> -						       trace_chan_id, addr);
> +			addr += cs_etm__instr_size(etmq, trace_chan_id,
> +						   packet->isa, addr);
>  			offset--;
>  		}
>  		return addr;
>  	}
>  
> -	/* Assume a 4 byte instruction size (A32/A64) */
> -	return packet->start_addr + offset * 4;
> +	/* Return instruction size for A32/A64 */
> +	insn_len = cs_etm__instr_size(etmq, trace_chan_id,
> +				      packet->isa, packet->start_addr);
> +	return packet->start_addr + offset * insn_len;

This patch will work but from where I stand it makes things difficult to
understand more than anything else.  It is also adding coupling between function
cs_etm__instr_addr() and cs_etm__instr_size(), meaning the code needs to be
carefully inspected in order to make changes to either one.

Last but not least function cs_etm__instr_size() isn't used in the upcoming
patches.  I really don't see what is gained here. 
 
Thanks,
Mathieu

>  }
>  
>  static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq,
> @@ -1090,16 +1114,8 @@ static void cs_etm__copy_insn(struct cs_etm_queue *etmq,
>  		return;
>  	}
>  
> -	/*
> -	 * T32 instruction size might be 32-bit or 16-bit, decide by calling
> -	 * cs_etm__t32_instr_size().
> -	 */
> -	if (packet->isa == CS_ETM_ISA_T32)
> -		sample->insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id,
> -							  sample->ip);
> -	/* Otherwise, A64 and A32 instruction size are always 32-bit. */
> -	else
> -		sample->insn_len = 4;
> +	sample->insn_len = cs_etm__instr_size(etmq, trace_chan_id,
> +					      packet->isa, sample->ip);
>  
>  	cs_etm__mem_access(etmq, trace_chan_id, sample->ip,
>  			   sample->insn_len, (void *)sample->insn);
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 1/3] perf cs-etm: Refactor instruction size handling
@ 2019-09-03 22:22     ` Mathieu Poirier
  0 siblings, 0 replies; 18+ messages in thread
From: Mathieu Poirier @ 2019-09-03 22:22 UTC (permalink / raw)
  To: Leo Yan
  Cc: Suzuki K Poulose, Alexander Shishkin, linux-kernel,
	Arnaldo Carvalho de Melo, Adrian Hunter, Namhyung Kim,
	Robert Walker, Jiri Olsa, linux-arm-kernel, Mike Leach

On Fri, Aug 30, 2019 at 02:24:19PM +0800, Leo Yan wrote:
> There has several code pieces need to know the instruction size, but
> now every place calculates the instruction size separately.
> 
> This patch refactors to create a new function cs_etm__instr_size() as
> a central place to analyze the instruction length based on ISA type
> and instruction value.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/util/cs-etm.c | 44 +++++++++++++++++++++++++++-------------
>  1 file changed, 30 insertions(+), 14 deletions(-)
> 
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index b3a5daaf1a8f..882a0718033d 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -914,6 +914,26 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
>  	return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2;
>  }
>  
> +static inline int cs_etm__instr_size(struct cs_etm_queue *etmq,
> +				     u8 trace_chan_id,
> +				     enum cs_etm_isa isa,
> +				     u64 addr)
> +{
> +	int insn_len;
> +
> +	/*
> +	 * T32 instruction size might be 32-bit or 16-bit, decide by calling
> +	 * cs_etm__t32_instr_size().
> +	 */
> +	if (isa == CS_ETM_ISA_T32)
> +		insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id, addr);
> +	/* Otherwise, A64 and A32 instruction size are always 32-bit. */
> +	else
> +		insn_len = 4;
> +
> +	return insn_len;
> +}
> +
>  static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
>  {
>  	/* Returns 0 for the CS_ETM_DISCONTINUITY packet */
> @@ -938,19 +958,23 @@ static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
>  				     const struct cs_etm_packet *packet,
>  				     u64 offset)
>  {
> +	int insn_len;
> +
>  	if (packet->isa == CS_ETM_ISA_T32) {
>  		u64 addr = packet->start_addr;
>  
>  		while (offset > 0) {
> -			addr += cs_etm__t32_instr_size(etmq,
> -						       trace_chan_id, addr);
> +			addr += cs_etm__instr_size(etmq, trace_chan_id,
> +						   packet->isa, addr);
>  			offset--;
>  		}
>  		return addr;
>  	}
>  
> -	/* Assume a 4 byte instruction size (A32/A64) */
> -	return packet->start_addr + offset * 4;
> +	/* Return instruction size for A32/A64 */
> +	insn_len = cs_etm__instr_size(etmq, trace_chan_id,
> +				      packet->isa, packet->start_addr);
> +	return packet->start_addr + offset * insn_len;

This patch will work but from where I stand it makes things difficult to
understand more than anything else.  It is also adding coupling between function
cs_etm__instr_addr() and cs_etm__instr_size(), meaning the code needs to be
carefully inspected in order to make changes to either one.

Last but not least function cs_etm__instr_size() isn't used in the upcoming
patches.  I really don't see what is gained here. 
 
Thanks,
Mathieu

>  }
>  
>  static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq,
> @@ -1090,16 +1114,8 @@ static void cs_etm__copy_insn(struct cs_etm_queue *etmq,
>  		return;
>  	}
>  
> -	/*
> -	 * T32 instruction size might be 32-bit or 16-bit, decide by calling
> -	 * cs_etm__t32_instr_size().
> -	 */
> -	if (packet->isa == CS_ETM_ISA_T32)
> -		sample->insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id,
> -							  sample->ip);
> -	/* Otherwise, A64 and A32 instruction size are always 32-bit. */
> -	else
> -		sample->insn_len = 4;
> +	sample->insn_len = cs_etm__instr_size(etmq, trace_chan_id,
> +					      packet->isa, sample->ip);
>  
>  	cs_etm__mem_access(etmq, trace_chan_id, sample->ip,
>  			   sample->insn_len, (void *)sample->insn);
> -- 
> 2.17.1
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 1/3] perf cs-etm: Refactor instruction size handling
  2019-09-03 22:22     ` Mathieu Poirier
@ 2019-09-04  9:19       ` Leo Yan
  -1 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2019-09-04  9:19 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: Arnaldo Carvalho de Melo, Suzuki K Poulose, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-arm-kernel, linux-kernel,
	Mike Leach, Robert Walker, Adrian Hunter

Hi Mathieu,

On Tue, Sep 03, 2019 at 04:22:15PM -0600, Mathieu Poirier wrote:
> On Fri, Aug 30, 2019 at 02:24:19PM +0800, Leo Yan wrote:
> > There has several code pieces need to know the instruction size, but
> > now every place calculates the instruction size separately.
> > 
> > This patch refactors to create a new function cs_etm__instr_size() as
> > a central place to analyze the instruction length based on ISA type
> > and instruction value.
> > 
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > ---
> >  tools/perf/util/cs-etm.c | 44 +++++++++++++++++++++++++++-------------
> >  1 file changed, 30 insertions(+), 14 deletions(-)
> > 
> > diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> > index b3a5daaf1a8f..882a0718033d 100644
> > --- a/tools/perf/util/cs-etm.c
> > +++ b/tools/perf/util/cs-etm.c
> > @@ -914,6 +914,26 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
> >  	return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2;
> >  }
> >  
> > +static inline int cs_etm__instr_size(struct cs_etm_queue *etmq,
> > +				     u8 trace_chan_id,
> > +				     enum cs_etm_isa isa,
> > +				     u64 addr)
> > +{
> > +	int insn_len;
> > +
> > +	/*
> > +	 * T32 instruction size might be 32-bit or 16-bit, decide by calling
> > +	 * cs_etm__t32_instr_size().
> > +	 */
> > +	if (isa == CS_ETM_ISA_T32)
> > +		insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id, addr);
> > +	/* Otherwise, A64 and A32 instruction size are always 32-bit. */
> > +	else
> > +		insn_len = 4;
> > +
> > +	return insn_len;
> > +}
> > +
> >  static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
> >  {
> >  	/* Returns 0 for the CS_ETM_DISCONTINUITY packet */
> > @@ -938,19 +958,23 @@ static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
> >  				     const struct cs_etm_packet *packet,
> >  				     u64 offset)
> >  {
> > +	int insn_len;
> > +
> >  	if (packet->isa == CS_ETM_ISA_T32) {
> >  		u64 addr = packet->start_addr;
> >  
> >  		while (offset > 0) {
> > -			addr += cs_etm__t32_instr_size(etmq,
> > -						       trace_chan_id, addr);
> > +			addr += cs_etm__instr_size(etmq, trace_chan_id,
> > +						   packet->isa, addr);
> >  			offset--;
> >  		}
> >  		return addr;
> >  	}
> >  
> > -	/* Assume a 4 byte instruction size (A32/A64) */
> > -	return packet->start_addr + offset * 4;
> > +	/* Return instruction size for A32/A64 */
> > +	insn_len = cs_etm__instr_size(etmq, trace_chan_id,
> > +				      packet->isa, packet->start_addr);
> > +	return packet->start_addr + offset * insn_len;
> 
> This patch will work but from where I stand it makes things difficult to
> understand more than anything else.  It is also adding coupling between function
> cs_etm__instr_addr() and cs_etm__instr_size(), meaning the code needs to be
> carefully inspected in order to make changes to either one.

My purpose is to use a same place to calculate the instruction
size, rather than to spread the duplicate codes in several different
functions.

> Last but not least function cs_etm__instr_size() isn't used in the upcoming
> patches.  I really don't see what is gained here. 

Sorry that I forgot to commit my final change into patch 02.

I planed to use cs_etm__instr_size() in patch 02; patch 02 has
function cs_etm__add_stack_event(), which also needs to get the
instruction size when it sends stack event.

After apply patch 02, tools/perf/util/cs-etm.c will have below three
functions to caculate instruction size; this is the main reason I want
to refactor the code for instruction size.

  cs_etm__instr_addr()
  cs_etm__copy_insn()
  cs_etm__add_stack_event()

If this lets code more difficult to understand, will drop it.

Thanks,
Leo Yan

> >  }
> >  
> >  static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq,
> > @@ -1090,16 +1114,8 @@ static void cs_etm__copy_insn(struct cs_etm_queue *etmq,
> >  		return;
> >  	}
> >  
> > -	/*
> > -	 * T32 instruction size might be 32-bit or 16-bit, decide by calling
> > -	 * cs_etm__t32_instr_size().
> > -	 */
> > -	if (packet->isa == CS_ETM_ISA_T32)
> > -		sample->insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id,
> > -							  sample->ip);
> > -	/* Otherwise, A64 and A32 instruction size are always 32-bit. */
> > -	else
> > -		sample->insn_len = 4;
> > +	sample->insn_len = cs_etm__instr_size(etmq, trace_chan_id,
> > +					      packet->isa, sample->ip);
> >  
> >  	cs_etm__mem_access(etmq, trace_chan_id, sample->ip,
> >  			   sample->insn_len, (void *)sample->insn);
> > -- 
> > 2.17.1
> > 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 1/3] perf cs-etm: Refactor instruction size handling
@ 2019-09-04  9:19       ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2019-09-04  9:19 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: Suzuki K Poulose, Alexander Shishkin, linux-kernel,
	Arnaldo Carvalho de Melo, Adrian Hunter, Namhyung Kim,
	Robert Walker, Jiri Olsa, linux-arm-kernel, Mike Leach

Hi Mathieu,

On Tue, Sep 03, 2019 at 04:22:15PM -0600, Mathieu Poirier wrote:
> On Fri, Aug 30, 2019 at 02:24:19PM +0800, Leo Yan wrote:
> > There has several code pieces need to know the instruction size, but
> > now every place calculates the instruction size separately.
> > 
> > This patch refactors to create a new function cs_etm__instr_size() as
> > a central place to analyze the instruction length based on ISA type
> > and instruction value.
> > 
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > ---
> >  tools/perf/util/cs-etm.c | 44 +++++++++++++++++++++++++++-------------
> >  1 file changed, 30 insertions(+), 14 deletions(-)
> > 
> > diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> > index b3a5daaf1a8f..882a0718033d 100644
> > --- a/tools/perf/util/cs-etm.c
> > +++ b/tools/perf/util/cs-etm.c
> > @@ -914,6 +914,26 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
> >  	return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2;
> >  }
> >  
> > +static inline int cs_etm__instr_size(struct cs_etm_queue *etmq,
> > +				     u8 trace_chan_id,
> > +				     enum cs_etm_isa isa,
> > +				     u64 addr)
> > +{
> > +	int insn_len;
> > +
> > +	/*
> > +	 * T32 instruction size might be 32-bit or 16-bit, decide by calling
> > +	 * cs_etm__t32_instr_size().
> > +	 */
> > +	if (isa == CS_ETM_ISA_T32)
> > +		insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id, addr);
> > +	/* Otherwise, A64 and A32 instruction size are always 32-bit. */
> > +	else
> > +		insn_len = 4;
> > +
> > +	return insn_len;
> > +}
> > +
> >  static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
> >  {
> >  	/* Returns 0 for the CS_ETM_DISCONTINUITY packet */
> > @@ -938,19 +958,23 @@ static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
> >  				     const struct cs_etm_packet *packet,
> >  				     u64 offset)
> >  {
> > +	int insn_len;
> > +
> >  	if (packet->isa == CS_ETM_ISA_T32) {
> >  		u64 addr = packet->start_addr;
> >  
> >  		while (offset > 0) {
> > -			addr += cs_etm__t32_instr_size(etmq,
> > -						       trace_chan_id, addr);
> > +			addr += cs_etm__instr_size(etmq, trace_chan_id,
> > +						   packet->isa, addr);
> >  			offset--;
> >  		}
> >  		return addr;
> >  	}
> >  
> > -	/* Assume a 4 byte instruction size (A32/A64) */
> > -	return packet->start_addr + offset * 4;
> > +	/* Return instruction size for A32/A64 */
> > +	insn_len = cs_etm__instr_size(etmq, trace_chan_id,
> > +				      packet->isa, packet->start_addr);
> > +	return packet->start_addr + offset * insn_len;
> 
> This patch will work but from where I stand it makes things difficult to
> understand more than anything else.  It is also adding coupling between function
> cs_etm__instr_addr() and cs_etm__instr_size(), meaning the code needs to be
> carefully inspected in order to make changes to either one.

My purpose is to use a same place to calculate the instruction
size, rather than to spread the duplicate codes in several different
functions.

> Last but not least function cs_etm__instr_size() isn't used in the upcoming
> patches.  I really don't see what is gained here. 

Sorry that I forgot to commit my final change into patch 02.

I planed to use cs_etm__instr_size() in patch 02; patch 02 has
function cs_etm__add_stack_event(), which also needs to get the
instruction size when it sends stack event.

After apply patch 02, tools/perf/util/cs-etm.c will have below three
functions to caculate instruction size; this is the main reason I want
to refactor the code for instruction size.

  cs_etm__instr_addr()
  cs_etm__copy_insn()
  cs_etm__add_stack_event()

If this lets code more difficult to understand, will drop it.

Thanks,
Leo Yan

> >  }
> >  
> >  static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq,
> > @@ -1090,16 +1114,8 @@ static void cs_etm__copy_insn(struct cs_etm_queue *etmq,
> >  		return;
> >  	}
> >  
> > -	/*
> > -	 * T32 instruction size might be 32-bit or 16-bit, decide by calling
> > -	 * cs_etm__t32_instr_size().
> > -	 */
> > -	if (packet->isa == CS_ETM_ISA_T32)
> > -		sample->insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id,
> > -							  sample->ip);
> > -	/* Otherwise, A64 and A32 instruction size are always 32-bit. */
> > -	else
> > -		sample->insn_len = 4;
> > +	sample->insn_len = cs_etm__instr_size(etmq, trace_chan_id,
> > +					      packet->isa, sample->ip);
> >  
> >  	cs_etm__mem_access(etmq, trace_chan_id, sample->ip,
> >  			   sample->insn_len, (void *)sample->insn);
> > -- 
> > 2.17.1
> > 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 1/3] perf cs-etm: Refactor instruction size handling
  2019-09-04  9:19       ` Leo Yan
@ 2019-09-04 17:06         ` Mathieu Poirier
  -1 siblings, 0 replies; 18+ messages in thread
From: Mathieu Poirier @ 2019-09-04 17:06 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Suzuki K Poulose, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-arm-kernel,
	Linux Kernel Mailing List, Mike Leach, Robert Walker,
	Adrian Hunter

On Wed, 4 Sep 2019 at 03:19, Leo Yan <leo.yan@linaro.org> wrote:
>
> Hi Mathieu,
>
> On Tue, Sep 03, 2019 at 04:22:15PM -0600, Mathieu Poirier wrote:
> > On Fri, Aug 30, 2019 at 02:24:19PM +0800, Leo Yan wrote:
> > > There has several code pieces need to know the instruction size, but
> > > now every place calculates the instruction size separately.
> > >
> > > This patch refactors to create a new function cs_etm__instr_size() as
> > > a central place to analyze the instruction length based on ISA type
> > > and instruction value.
> > >
> > > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > > ---
> > >  tools/perf/util/cs-etm.c | 44 +++++++++++++++++++++++++++-------------
> > >  1 file changed, 30 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> > > index b3a5daaf1a8f..882a0718033d 100644
> > > --- a/tools/perf/util/cs-etm.c
> > > +++ b/tools/perf/util/cs-etm.c
> > > @@ -914,6 +914,26 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
> > >     return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2;
> > >  }
> > >
> > > +static inline int cs_etm__instr_size(struct cs_etm_queue *etmq,
> > > +                                u8 trace_chan_id,
> > > +                                enum cs_etm_isa isa,
> > > +                                u64 addr)
> > > +{
> > > +   int insn_len;
> > > +
> > > +   /*
> > > +    * T32 instruction size might be 32-bit or 16-bit, decide by calling
> > > +    * cs_etm__t32_instr_size().
> > > +    */
> > > +   if (isa == CS_ETM_ISA_T32)
> > > +           insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id, addr);
> > > +   /* Otherwise, A64 and A32 instruction size are always 32-bit. */
> > > +   else
> > > +           insn_len = 4;
> > > +
> > > +   return insn_len;
> > > +}
> > > +
> > >  static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
> > >  {
> > >     /* Returns 0 for the CS_ETM_DISCONTINUITY packet */
> > > @@ -938,19 +958,23 @@ static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
> > >                                  const struct cs_etm_packet *packet,
> > >                                  u64 offset)
> > >  {
> > > +   int insn_len;
> > > +
> > >     if (packet->isa == CS_ETM_ISA_T32) {
> > >             u64 addr = packet->start_addr;
> > >
> > >             while (offset > 0) {
> > > -                   addr += cs_etm__t32_instr_size(etmq,
> > > -                                                  trace_chan_id, addr);
> > > +                   addr += cs_etm__instr_size(etmq, trace_chan_id,
> > > +                                              packet->isa, addr);
> > >                     offset--;
> > >             }
> > >             return addr;
> > >     }
> > >
> > > -   /* Assume a 4 byte instruction size (A32/A64) */
> > > -   return packet->start_addr + offset * 4;
> > > +   /* Return instruction size for A32/A64 */
> > > +   insn_len = cs_etm__instr_size(etmq, trace_chan_id,
> > > +                                 packet->isa, packet->start_addr);
> > > +   return packet->start_addr + offset * insn_len;
> >
> > This patch will work but from where I stand it makes things difficult to
> > understand more than anything else.  It is also adding coupling between function
> > cs_etm__instr_addr() and cs_etm__instr_size(), meaning the code needs to be
> > carefully inspected in order to make changes to either one.
>
> My purpose is to use a same place to calculate the instruction
> size, rather than to spread the duplicate codes in several different
> functions.
>
> > Last but not least function cs_etm__instr_size() isn't used in the upcoming
> > patches.  I really don't see what is gained here.
>
> Sorry that I forgot to commit my final change into patch 02.
>
> I planed to use cs_etm__instr_size() in patch 02; patch 02 has
> function cs_etm__add_stack_event(), which also needs to get the
> instruction size when it sends stack event.
>
> After apply patch 02, tools/perf/util/cs-etm.c will have below three
> functions to caculate instruction size; this is the main reason I want
> to refactor the code for instruction size.
>
>   cs_etm__instr_addr()
>   cs_etm__copy_insn()
>   cs_etm__add_stack_event()
>
> If this lets code more difficult to understand, will drop it.
>

I agree with the consolidation but for that to work function
cs_etm__instr_addr() needs to be refactored.  Since
cs_etm__instr_size() is already taking care of checking the ISA type
the while() loop in cs_etm__instr_addr() can be done regardless of the
operation mode.  That way cs_etm__instr_size() can be changed at will
without breaking anything.

The downside is that we are doing a few more iterations... Which isn't
that big a deal considering we are in user space.  We can think about
some optimisation if it is ever proven to be a bottleneck.

Let me know if you see a problem with that approach.

Regards,
Mathieu

> Thanks,
> Leo Yan
>
> > >  }
> > >
> > >  static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq,
> > > @@ -1090,16 +1114,8 @@ static void cs_etm__copy_insn(struct cs_etm_queue *etmq,
> > >             return;
> > >     }
> > >
> > > -   /*
> > > -    * T32 instruction size might be 32-bit or 16-bit, decide by calling
> > > -    * cs_etm__t32_instr_size().
> > > -    */
> > > -   if (packet->isa == CS_ETM_ISA_T32)
> > > -           sample->insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id,
> > > -                                                     sample->ip);
> > > -   /* Otherwise, A64 and A32 instruction size are always 32-bit. */
> > > -   else
> > > -           sample->insn_len = 4;
> > > +   sample->insn_len = cs_etm__instr_size(etmq, trace_chan_id,
> > > +                                         packet->isa, sample->ip);
> > >
> > >     cs_etm__mem_access(etmq, trace_chan_id, sample->ip,
> > >                        sample->insn_len, (void *)sample->insn);
> > > --
> > > 2.17.1
> > >

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 1/3] perf cs-etm: Refactor instruction size handling
@ 2019-09-04 17:06         ` Mathieu Poirier
  0 siblings, 0 replies; 18+ messages in thread
From: Mathieu Poirier @ 2019-09-04 17:06 UTC (permalink / raw)
  To: Leo Yan
  Cc: Suzuki K Poulose, Alexander Shishkin, Linux Kernel Mailing List,
	Arnaldo Carvalho de Melo, Adrian Hunter, Namhyung Kim,
	Robert Walker, Jiri Olsa, linux-arm-kernel, Mike Leach

On Wed, 4 Sep 2019 at 03:19, Leo Yan <leo.yan@linaro.org> wrote:
>
> Hi Mathieu,
>
> On Tue, Sep 03, 2019 at 04:22:15PM -0600, Mathieu Poirier wrote:
> > On Fri, Aug 30, 2019 at 02:24:19PM +0800, Leo Yan wrote:
> > > There has several code pieces need to know the instruction size, but
> > > now every place calculates the instruction size separately.
> > >
> > > This patch refactors to create a new function cs_etm__instr_size() as
> > > a central place to analyze the instruction length based on ISA type
> > > and instruction value.
> > >
> > > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > > ---
> > >  tools/perf/util/cs-etm.c | 44 +++++++++++++++++++++++++++-------------
> > >  1 file changed, 30 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> > > index b3a5daaf1a8f..882a0718033d 100644
> > > --- a/tools/perf/util/cs-etm.c
> > > +++ b/tools/perf/util/cs-etm.c
> > > @@ -914,6 +914,26 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
> > >     return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2;
> > >  }
> > >
> > > +static inline int cs_etm__instr_size(struct cs_etm_queue *etmq,
> > > +                                u8 trace_chan_id,
> > > +                                enum cs_etm_isa isa,
> > > +                                u64 addr)
> > > +{
> > > +   int insn_len;
> > > +
> > > +   /*
> > > +    * T32 instruction size might be 32-bit or 16-bit, decide by calling
> > > +    * cs_etm__t32_instr_size().
> > > +    */
> > > +   if (isa == CS_ETM_ISA_T32)
> > > +           insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id, addr);
> > > +   /* Otherwise, A64 and A32 instruction size are always 32-bit. */
> > > +   else
> > > +           insn_len = 4;
> > > +
> > > +   return insn_len;
> > > +}
> > > +
> > >  static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
> > >  {
> > >     /* Returns 0 for the CS_ETM_DISCONTINUITY packet */
> > > @@ -938,19 +958,23 @@ static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
> > >                                  const struct cs_etm_packet *packet,
> > >                                  u64 offset)
> > >  {
> > > +   int insn_len;
> > > +
> > >     if (packet->isa == CS_ETM_ISA_T32) {
> > >             u64 addr = packet->start_addr;
> > >
> > >             while (offset > 0) {
> > > -                   addr += cs_etm__t32_instr_size(etmq,
> > > -                                                  trace_chan_id, addr);
> > > +                   addr += cs_etm__instr_size(etmq, trace_chan_id,
> > > +                                              packet->isa, addr);
> > >                     offset--;
> > >             }
> > >             return addr;
> > >     }
> > >
> > > -   /* Assume a 4 byte instruction size (A32/A64) */
> > > -   return packet->start_addr + offset * 4;
> > > +   /* Return instruction size for A32/A64 */
> > > +   insn_len = cs_etm__instr_size(etmq, trace_chan_id,
> > > +                                 packet->isa, packet->start_addr);
> > > +   return packet->start_addr + offset * insn_len;
> >
> > This patch will work but from where I stand it makes things difficult to
> > understand more than anything else.  It is also adding coupling between function
> > cs_etm__instr_addr() and cs_etm__instr_size(), meaning the code needs to be
> > carefully inspected in order to make changes to either one.
>
> My purpose is to use a same place to calculate the instruction
> size, rather than to spread the duplicate codes in several different
> functions.
>
> > Last but not least function cs_etm__instr_size() isn't used in the upcoming
> > patches.  I really don't see what is gained here.
>
> Sorry that I forgot to commit my final change into patch 02.
>
> I planed to use cs_etm__instr_size() in patch 02; patch 02 has
> function cs_etm__add_stack_event(), which also needs to get the
> instruction size when it sends stack event.
>
> After apply patch 02, tools/perf/util/cs-etm.c will have below three
> functions to caculate instruction size; this is the main reason I want
> to refactor the code for instruction size.
>
>   cs_etm__instr_addr()
>   cs_etm__copy_insn()
>   cs_etm__add_stack_event()
>
> If this lets code more difficult to understand, will drop it.
>

I agree with the consolidation but for that to work function
cs_etm__instr_addr() needs to be refactored.  Since
cs_etm__instr_size() is already taking care of checking the ISA type
the while() loop in cs_etm__instr_addr() can be done regardless of the
operation mode.  That way cs_etm__instr_size() can be changed at will
without breaking anything.

The downside is that we are doing a few more iterations... Which isn't
that big a deal considering we are in user space.  We can think about
some optimisation if it is ever proven to be a bottleneck.

Let me know if you see a problem with that approach.

Regards,
Mathieu

> Thanks,
> Leo Yan
>
> > >  }
> > >
> > >  static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq,
> > > @@ -1090,16 +1114,8 @@ static void cs_etm__copy_insn(struct cs_etm_queue *etmq,
> > >             return;
> > >     }
> > >
> > > -   /*
> > > -    * T32 instruction size might be 32-bit or 16-bit, decide by calling
> > > -    * cs_etm__t32_instr_size().
> > > -    */
> > > -   if (packet->isa == CS_ETM_ISA_T32)
> > > -           sample->insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id,
> > > -                                                     sample->ip);
> > > -   /* Otherwise, A64 and A32 instruction size are always 32-bit. */
> > > -   else
> > > -           sample->insn_len = 4;
> > > +   sample->insn_len = cs_etm__instr_size(etmq, trace_chan_id,
> > > +                                         packet->isa, sample->ip);
> > >
> > >     cs_etm__mem_access(etmq, trace_chan_id, sample->ip,
> > >                        sample->insn_len, (void *)sample->insn);
> > > --
> > > 2.17.1
> > >

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 1/3] perf cs-etm: Refactor instruction size handling
  2019-09-04 17:06         ` Mathieu Poirier
@ 2019-09-05  1:50           ` Leo Yan
  -1 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2019-09-05  1:50 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: Arnaldo Carvalho de Melo, Suzuki K Poulose, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-arm-kernel,
	Linux Kernel Mailing List, Mike Leach, Robert Walker,
	Adrian Hunter

Hi Mathieu,

On Wed, Sep 04, 2019 at 11:06:10AM -0600, Mathieu Poirier wrote:
> On Wed, 4 Sep 2019 at 03:19, Leo Yan <leo.yan@linaro.org> wrote:
> >
> > Hi Mathieu,
> >
> > On Tue, Sep 03, 2019 at 04:22:15PM -0600, Mathieu Poirier wrote:
> > > On Fri, Aug 30, 2019 at 02:24:19PM +0800, Leo Yan wrote:
> > > > There has several code pieces need to know the instruction size, but
> > > > now every place calculates the instruction size separately.
> > > >
> > > > This patch refactors to create a new function cs_etm__instr_size() as
> > > > a central place to analyze the instruction length based on ISA type
> > > > and instruction value.
> > > >
> > > > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > > > ---
> > > >  tools/perf/util/cs-etm.c | 44 +++++++++++++++++++++++++++-------------
> > > >  1 file changed, 30 insertions(+), 14 deletions(-)
> > > >
> > > > diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> > > > index b3a5daaf1a8f..882a0718033d 100644
> > > > --- a/tools/perf/util/cs-etm.c
> > > > +++ b/tools/perf/util/cs-etm.c
> > > > @@ -914,6 +914,26 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
> > > >     return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2;
> > > >  }
> > > >
> > > > +static inline int cs_etm__instr_size(struct cs_etm_queue *etmq,
> > > > +                                u8 trace_chan_id,
> > > > +                                enum cs_etm_isa isa,
> > > > +                                u64 addr)
> > > > +{
> > > > +   int insn_len;
> > > > +
> > > > +   /*
> > > > +    * T32 instruction size might be 32-bit or 16-bit, decide by calling
> > > > +    * cs_etm__t32_instr_size().
> > > > +    */
> > > > +   if (isa == CS_ETM_ISA_T32)
> > > > +           insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id, addr);
> > > > +   /* Otherwise, A64 and A32 instruction size are always 32-bit. */
> > > > +   else
> > > > +           insn_len = 4;
> > > > +
> > > > +   return insn_len;
> > > > +}
> > > > +
> > > >  static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
> > > >  {
> > > >     /* Returns 0 for the CS_ETM_DISCONTINUITY packet */
> > > > @@ -938,19 +958,23 @@ static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
> > > >                                  const struct cs_etm_packet *packet,
> > > >                                  u64 offset)
> > > >  {
> > > > +   int insn_len;
> > > > +
> > > >     if (packet->isa == CS_ETM_ISA_T32) {
> > > >             u64 addr = packet->start_addr;
> > > >
> > > >             while (offset > 0) {
> > > > -                   addr += cs_etm__t32_instr_size(etmq,
> > > > -                                                  trace_chan_id, addr);
> > > > +                   addr += cs_etm__instr_size(etmq, trace_chan_id,
> > > > +                                              packet->isa, addr);
> > > >                     offset--;
> > > >             }
> > > >             return addr;
> > > >     }
> > > >
> > > > -   /* Assume a 4 byte instruction size (A32/A64) */
> > > > -   return packet->start_addr + offset * 4;
> > > > +   /* Return instruction size for A32/A64 */
> > > > +   insn_len = cs_etm__instr_size(etmq, trace_chan_id,
> > > > +                                 packet->isa, packet->start_addr);
> > > > +   return packet->start_addr + offset * insn_len;
> > >
> > > This patch will work but from where I stand it makes things difficult to
> > > understand more than anything else.  It is also adding coupling between function
> > > cs_etm__instr_addr() and cs_etm__instr_size(), meaning the code needs to be
> > > carefully inspected in order to make changes to either one.
> >
> > My purpose is to use a same place to calculate the instruction
> > size, rather than to spread the duplicate codes in several different
> > functions.
> >
> > > Last but not least function cs_etm__instr_size() isn't used in the upcoming
> > > patches.  I really don't see what is gained here.
> >
> > Sorry that I forgot to commit my final change into patch 02.
> >
> > I planed to use cs_etm__instr_size() in patch 02; patch 02 has
> > function cs_etm__add_stack_event(), which also needs to get the
> > instruction size when it sends stack event.
> >
> > After apply patch 02, tools/perf/util/cs-etm.c will have below three
> > functions to caculate instruction size; this is the main reason I want
> > to refactor the code for instruction size.
> >
> >   cs_etm__instr_addr()
> >   cs_etm__copy_insn()
> >   cs_etm__add_stack_event()
> >
> > If this lets code more difficult to understand, will drop it.
> >
> 
> I agree with the consolidation but for that to work function
> cs_etm__instr_addr() needs to be refactored.  Since
> cs_etm__instr_size() is already taking care of checking the ISA type
> the while() loop in cs_etm__instr_addr() can be done regardless of the
> operation mode.  That way cs_etm__instr_size() can be changed at will
> without breaking anything.
> 
> The downside is that we are doing a few more iterations... Which isn't
> that big a deal considering we are in user space.  We can think about
> some optimisation if it is ever proven to be a bottleneck.
> 
> Let me know if you see a problem with that approach.

Yes, your approach is neat; I will try it in next patch version.

Thanks a lot for suggestion!

[...]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 1/3] perf cs-etm: Refactor instruction size handling
@ 2019-09-05  1:50           ` Leo Yan
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Yan @ 2019-09-05  1:50 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: Suzuki K Poulose, Alexander Shishkin, Linux Kernel Mailing List,
	Arnaldo Carvalho de Melo, Adrian Hunter, Namhyung Kim,
	Robert Walker, Jiri Olsa, linux-arm-kernel, Mike Leach

Hi Mathieu,

On Wed, Sep 04, 2019 at 11:06:10AM -0600, Mathieu Poirier wrote:
> On Wed, 4 Sep 2019 at 03:19, Leo Yan <leo.yan@linaro.org> wrote:
> >
> > Hi Mathieu,
> >
> > On Tue, Sep 03, 2019 at 04:22:15PM -0600, Mathieu Poirier wrote:
> > > On Fri, Aug 30, 2019 at 02:24:19PM +0800, Leo Yan wrote:
> > > > There has several code pieces need to know the instruction size, but
> > > > now every place calculates the instruction size separately.
> > > >
> > > > This patch refactors to create a new function cs_etm__instr_size() as
> > > > a central place to analyze the instruction length based on ISA type
> > > > and instruction value.
> > > >
> > > > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > > > ---
> > > >  tools/perf/util/cs-etm.c | 44 +++++++++++++++++++++++++++-------------
> > > >  1 file changed, 30 insertions(+), 14 deletions(-)
> > > >
> > > > diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> > > > index b3a5daaf1a8f..882a0718033d 100644
> > > > --- a/tools/perf/util/cs-etm.c
> > > > +++ b/tools/perf/util/cs-etm.c
> > > > @@ -914,6 +914,26 @@ static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
> > > >     return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2;
> > > >  }
> > > >
> > > > +static inline int cs_etm__instr_size(struct cs_etm_queue *etmq,
> > > > +                                u8 trace_chan_id,
> > > > +                                enum cs_etm_isa isa,
> > > > +                                u64 addr)
> > > > +{
> > > > +   int insn_len;
> > > > +
> > > > +   /*
> > > > +    * T32 instruction size might be 32-bit or 16-bit, decide by calling
> > > > +    * cs_etm__t32_instr_size().
> > > > +    */
> > > > +   if (isa == CS_ETM_ISA_T32)
> > > > +           insn_len = cs_etm__t32_instr_size(etmq, trace_chan_id, addr);
> > > > +   /* Otherwise, A64 and A32 instruction size are always 32-bit. */
> > > > +   else
> > > > +           insn_len = 4;
> > > > +
> > > > +   return insn_len;
> > > > +}
> > > > +
> > > >  static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
> > > >  {
> > > >     /* Returns 0 for the CS_ETM_DISCONTINUITY packet */
> > > > @@ -938,19 +958,23 @@ static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
> > > >                                  const struct cs_etm_packet *packet,
> > > >                                  u64 offset)
> > > >  {
> > > > +   int insn_len;
> > > > +
> > > >     if (packet->isa == CS_ETM_ISA_T32) {
> > > >             u64 addr = packet->start_addr;
> > > >
> > > >             while (offset > 0) {
> > > > -                   addr += cs_etm__t32_instr_size(etmq,
> > > > -                                                  trace_chan_id, addr);
> > > > +                   addr += cs_etm__instr_size(etmq, trace_chan_id,
> > > > +                                              packet->isa, addr);
> > > >                     offset--;
> > > >             }
> > > >             return addr;
> > > >     }
> > > >
> > > > -   /* Assume a 4 byte instruction size (A32/A64) */
> > > > -   return packet->start_addr + offset * 4;
> > > > +   /* Return instruction size for A32/A64 */
> > > > +   insn_len = cs_etm__instr_size(etmq, trace_chan_id,
> > > > +                                 packet->isa, packet->start_addr);
> > > > +   return packet->start_addr + offset * insn_len;
> > >
> > > This patch will work but from where I stand it makes things difficult to
> > > understand more than anything else.  It is also adding coupling between function
> > > cs_etm__instr_addr() and cs_etm__instr_size(), meaning the code needs to be
> > > carefully inspected in order to make changes to either one.
> >
> > My purpose is to use a same place to calculate the instruction
> > size, rather than to spread the duplicate codes in several different
> > functions.
> >
> > > Last but not least function cs_etm__instr_size() isn't used in the upcoming
> > > patches.  I really don't see what is gained here.
> >
> > Sorry that I forgot to commit my final change into patch 02.
> >
> > I planed to use cs_etm__instr_size() in patch 02; patch 02 has
> > function cs_etm__add_stack_event(), which also needs to get the
> > instruction size when it sends stack event.
> >
> > After apply patch 02, tools/perf/util/cs-etm.c will have below three
> > functions to caculate instruction size; this is the main reason I want
> > to refactor the code for instruction size.
> >
> >   cs_etm__instr_addr()
> >   cs_etm__copy_insn()
> >   cs_etm__add_stack_event()
> >
> > If this lets code more difficult to understand, will drop it.
> >
> 
> I agree with the consolidation but for that to work function
> cs_etm__instr_addr() needs to be refactored.  Since
> cs_etm__instr_size() is already taking care of checking the ISA type
> the while() loop in cs_etm__instr_addr() can be done regardless of the
> operation mode.  That way cs_etm__instr_size() can be changed at will
> without breaking anything.
> 
> The downside is that we are doing a few more iterations... Which isn't
> that big a deal considering we are in user space.  We can think about
> some optimisation if it is ever proven to be a bottleneck.
> 
> Let me know if you see a problem with that approach.

Yes, your approach is neat; I will try it in next patch version.

Thanks a lot for suggestion!

[...]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2019-09-05  1:51 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-30  6:24 [PATCH v1 0/3] perf cs-etm: Add support for callchain Leo Yan
2019-08-30  6:24 ` Leo Yan
2019-08-30  6:24 ` [PATCH v1 1/3] perf cs-etm: Refactor instruction size handling Leo Yan
2019-08-30  6:24   ` Leo Yan
2019-09-03 22:22   ` Mathieu Poirier
2019-09-03 22:22     ` Mathieu Poirier
2019-09-04  9:19     ` Leo Yan
2019-09-04  9:19       ` Leo Yan
2019-09-04 17:06       ` Mathieu Poirier
2019-09-04 17:06         ` Mathieu Poirier
2019-09-05  1:50         ` Leo Yan
2019-09-05  1:50           ` Leo Yan
2019-08-30  6:24 ` [PATCH v1 2/3] perf cs-etm: Add callchain to instruction sample Leo Yan
2019-08-30  6:24   ` Leo Yan
2019-09-03 22:07   ` Mathieu Poirier
2019-09-03 22:07     ` Mathieu Poirier
2019-08-30  6:24 ` [PATCH v1 3/3] perf cs-etm: Correct the packet usage for " Leo Yan
2019-08-30  6:24   ` Leo Yan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.