linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: James Clark <james.clark@arm.com>
To: Douglas Anderson <dianders@chromium.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>,
	Seth LaForge <sethml@google.com>,
	Ricky Liang <jcliang@chromium.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Ingo Molnar <mingo@redhat.com>, Jiri Olsa <jolsa@redhat.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Namhyung Kim <namhyung@kernel.org>,
	Nathan Chancellor <nathan@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	clang-built-linux@googlegroups.com,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Alexandre Truong <alexandre.truong@arm.com>,
	Wilco Dijkstra <wilco.dijkstra@arm.com>,
	Al Grant <Al.Grant@arm.com>
Subject: Re: [PATCH 2/3] arm64: perf: Improve compat perf_callchain_user() for clang leaf functions
Date: Mon, 7 Jun 2021 12:14:20 +0300	[thread overview]
Message-ID: <47a95789-ca75-70a5-9d65-a2d3e9c651bc@arm.com> (raw)
In-Reply-To: <20210507135509.2.Ib54050e4091679cc31b04d52d7ef200f99faaae5@changeid>



On 07/05/2021 23:55, Douglas Anderson wrote:
> It turns out that even when you compile code with clang with
> "-fno-omit-frame-pointer" that it won't generate a frame pointer for
> leaf functions (those that don't call any sub-functions). Presumably
> clang does this to reduce the overhead of frame pointers. In a leaf
> function you don't really need frame pointers since the Link Register
> (LR) is guaranteed to always point to the caller> 
[...]
> 
>  arch/arm64/kernel/perf_callchain.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/arch/arm64/kernel/perf_callchain.c b/arch/arm64/kernel/perf_callchain.c
> index e5ce5f7965d1..b3cd9f371469 100644
> --- a/arch/arm64/kernel/perf_callchain.c
> +++ b/arch/arm64/kernel/perf_callchain.c
> @@ -326,6 +326,20 @@ static void compat_perf_callchain_user(struct perf_callchain_entry_ctx *entry,
>  	while ((entry->nr < entry->max_stack) && fp && !(fp & 0x3)) {
>  		err = compat_perf_trace_1(&fp, &pc, leaf_lr);
>  
> +		/*
> +		 * If this is the first trace and it didn't find the LR then
> +		 * let's throw it in the trace first. This isn't perfect but
> +		 * is the best we can do for handling clang leaf functions (or
> +		 * the case where we're right at the start of the function
> +		 * before the new frame has been pushed). In the worst case
> +		 * this can cause us to throw an extra entry that will be some
> +		 * location in the same function as the PC. That's not
> +		 * amazing but shouldn't really hurt. It seems better than
> +		 * throwing away the LR.
> +		 */

Hi Douglas,

I think the behaviour with GCC is also similar. We were working on this change
(https://lore.kernel.org/lkml/20210304163255.10363-4-alexandre.truong@arm.com/)
in userspace Perf which addresses the same issue.

The basic concept of our version is to record only the link register
(as in --user-regs=lr). Then use the existing dwarf based unwind
to determine if the link register is valid for that frame, and then if
it is and it doesn't already exist on the stack then insert it.

You mention that your version isn't perfect, do you think that saving the
LR and using something like libunwind in a post process could be better?

Thanks
James

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-06-07  9:34 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-07 20:55 [PATCH 0/3] arm64: perf: Make compat tracing better Douglas Anderson
2021-05-07 20:55 ` [PATCH 1/3] arm64: perf: perf_callchain_user() compat support for clang/non-APCS-gcc-arm Douglas Anderson
2021-05-07 20:55 ` [PATCH 2/3] arm64: perf: Improve compat perf_callchain_user() for clang leaf functions Douglas Anderson
2021-06-07  9:14   ` James Clark [this message]
2021-06-07 20:57     ` Doug Anderson
2021-05-07 20:55 ` [PATCH 3/3] arm64: perf: Add a config option saying 32-bit thumb code uses R11 for FP Douglas Anderson
2021-05-25 15:04 ` [PATCH 0/3] arm64: perf: Make compat tracing better Doug Anderson
2021-06-02 17:55 ` Will Deacon
2021-06-07 20:34   ` Doug Anderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47a95789-ca75-70a5-9d65-a2d3e9c651bc@arm.com \
    --to=james.clark@arm.com \
    --cc=Al.Grant@arm.com \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=alexandre.truong@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=clang-built-linux@googlegroups.com \
    --cc=dianders@chromium.org \
    --cc=jcliang@chromium.org \
    --cc=jolsa@redhat.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=nathan@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=peterz@infradead.org \
    --cc=sethml@google.com \
    --cc=wilco.dijkstra@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).