From: Leo Yan <leo.yan@linaro.org>
To: Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mathieu Poirier <mathieu.poirier@linaro.org>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@redhat.com>,
	Namhyung Kim <namhyung@kernel.org>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org,
	Mike Leach <mike.leach@linaro.org>,
	Coresight ML <coresight@lists.linaro.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Subject: [PATCH v3 6/6] perf cs-etm: Synchronize instruction sample with the thread stack
Date: Sat, 5 Oct 2019 17:16:14 +0800
Message-ID: <20191005091614.11635-7-leo.yan@linaro.org>
In-Reply-To: <20191005091614.11635-1-leo.yan@linaro.org>

The sample synthesis flow uses 'tidq->packet' for instruction samples,
whereas 'tidq->prev_packet' is used to generate the thread stack and the
branch samples. As a result, the instruction samples run one packet
ahead of the thread stack and the branch samples ('tidq->packet' vs
'tidq->prev_packet').

This leads to a wrong callchain for instruction samples, as shown in the
example below:

  main  1579        100      instructions:
	ffff000010214854 perf_event_update_userpage+0x4c ([kernel.kallsyms])
	ffff000010214850 perf_event_update_userpage+0x48 ([kernel.kallsyms])
	ffff000010219360 perf_swevent_add+0x88 ([kernel.kallsyms])
	ffff0000102135f4 event_sched_in.isra.57+0xbc ([kernel.kallsyms])
	ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
	ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
	ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

In the callchain log, for any two consecutive lines, the upper line
holds the child function info and the line below it holds the caller
function info, and so forth. So the first two lines are:

  perf_event_update_userpage+0x4c => the sampled instruction
  perf_event_update_userpage+0x48 => the call site in the parent function

The child function and the parent function are both
perf_event_update_userpage(), but this function isn't recursive, so a
sequence in which perf_event_update_userpage() calls itself should never
happen. The callchain error occurs because the instruction sample uses a
packet that is one ahead of the thread stack: the thread stack defers
processing of the new packet, so it has not yet popped the stack when
that packet is just a return packet.

To fix this issue, simply use 'tidq->prev_packet' to generate the
instruction samples; this lets the thread stack push and pop in sync
with the instruction samples. The callchain is then displayed correctly,
as below:

  main  1579        100      instructions:
	ffff000010214854 perf_event_update_userpage+0x4c ([kernel.kallsyms])
	ffff000010219360 perf_swevent_add+0x88 ([kernel.kallsyms])
	ffff0000102135f4 event_sched_in.isra.57+0xbc ([kernel.kallsyms])
	ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
	ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
	ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/cs-etm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 56e501cd2f5f..fa969dcb45d2 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1419,7 +1419,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
 	struct cs_etm_packet *tmp;
 	int ret;
 	u8 trace_chan_id = tidq->trace_chan_id;
-	u64 instrs_executed = tidq->packet->instr_count;
+	u64 instrs_executed = tidq->prev_packet->instr_count;
 
 	tidq->period_instructions += instrs_executed;
 
@@ -1450,7 +1450,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq,
 		 */
 		s64 offset = (instrs_executed - instrs_over - 1);
 		u64 addr = cs_etm__instr_addr(etmq, trace_chan_id,
-					      tidq->packet, offset);
+					      tidq->prev_packet, offset);
 
 		ret = cs_etm__synth_instruction_sample(
 			etmq, tidq, addr, etm->instructions_sample_period);
-- 
2.17.1