Subject: Re: [PATCH 2/3] arm64: perf: Improve compat perf_callchain_user() for clang leaf functions
To: Douglas Anderson, Catalin Marinas, Will Deacon
Cc: Nick Desaulniers, Seth LaForge, Ricky Liang, Alexander Shishkin,
 Arnaldo Carvalho de Melo, Ingo Molnar, Jiri Olsa, Mark Rutland,
 Namhyung Kim, Nathan Chancellor, Peter Zijlstra,
 clang-built-linux@googlegroups.com, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
 Alexandre Truong, Wilco Dijkstra, Al Grant
References: <20210507205513.640780-1-dianders@chromium.org>
 <20210507135509.2.Ib54050e4091679cc31b04d52d7ef200f99faaae5@changeid>
From: James Clark
Message-ID: <47a95789-ca75-70a5-9d65-a2d3e9c651bc@arm.com>
Date: Mon, 7 Jun 2021 12:14:20 +0300
In-Reply-To: <20210507135509.2.Ib54050e4091679cc31b04d52d7ef200f99faaae5@changeid>

On 07/05/2021 23:55, Douglas Anderson wrote:
> It turns out that even when you compile code with clang with
> "-fno-omit-frame-pointer" that it won't generate a frame pointer for
> leaf functions (those that don't call any sub-functions). Presumably
> clang does this to reduce the overhead of frame pointers. In a leaf
> function you don't really need frame pointers since the Link Register
> (LR) is guaranteed to always point to the caller
> [...]
>
>  arch/arm64/kernel/perf_callchain.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/arch/arm64/kernel/perf_callchain.c b/arch/arm64/kernel/perf_callchain.c
> index e5ce5f7965d1..b3cd9f371469 100644
> --- a/arch/arm64/kernel/perf_callchain.c
> +++ b/arch/arm64/kernel/perf_callchain.c
> @@ -326,6 +326,20 @@ static void compat_perf_callchain_user(struct perf_callchain_entry_ctx *entry,
>  	while ((entry->nr < entry->max_stack) && fp && !(fp & 0x3)) {
>  		err = compat_perf_trace_1(&fp, &pc, leaf_lr);
>
> +		/*
> +		 * If this is the first trace and it didn't find the LR then
> +		 * let's throw it in the trace first. This isn't perfect but
> +		 * is the best we can do for handling clang leaf functions (or
> +		 * the case where we're right at the start of the function
> +		 * before the new frame has been pushed). In the worst case
> +		 * this can cause us to throw an extra entry that will be some
> +		 * location in the same function as the PC. That's not
> +		 * amazing but shouldn't really hurt. It seems better than
> +		 * throwing away the LR.
> +		 */

Hi Douglas,

I think the behaviour with GCC is also similar.
We were working on this change
(https://lore.kernel.org/lkml/20210304163255.10363-4-alexandre.truong@arm.com/)
in userspace Perf which addresses the same issue.

The basic concept of our version is to record only the link register
(as in --user-regs=lr), then use the existing DWARF-based unwind to
determine whether the link register is valid for that frame. If it is,
and it doesn't already exist on the stack, then insert it.

You mention that your version isn't perfect. Do you think that saving the
LR and using something like libunwind in a post-processing step could be
better?

Thanks
James
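
To make the post-processing idea above concrete, here is a rough sketch in C. It is illustrative only and is not the code from the linked series: insert_leaf_lr() and its parameters are hypothetical names, lr_valid stands in for whatever the DWARF-based unwind decides about the leaf frame, and checking only the first caller entry for a duplicate is a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a perf API): patch a frame-pointer callchain
 * with the sampled link register.
 *
 * chain[0] is the sampled PC and chain[1..nr-1] are the callers found by
 * walking frame pointers. lr_valid is assumed to come from a DWARF-based
 * check that LR still holds the return address for the innermost frame.
 *
 * Returns the new number of entries in the chain.
 */
size_t insert_leaf_lr(uint64_t *chain, size_t nr, size_t cap,
		      uint64_t lr, bool lr_valid)
{
	assert(nr <= cap);

	/* Nothing to add, no PC entry to anchor on, or no room left. */
	if (!lr_valid || nr == 0 || nr == cap)
		return nr;

	/* Assumption: only the immediate caller slot is checked. */
	if (nr > 1 && chain[1] == lr)
		return nr;

	/* Shift the callers down by one and place LR right after the PC. */
	memmove(&chain[2], &chain[1], (nr - 1) * sizeof(chain[0]));
	chain[1] = lr;
	return nr + 1;
}
```

With a chain of {pc, grandparent} and a valid LR pointing into the parent, this yields {pc, parent, grandparent}. Calling it again with the same LR leaves the chain unchanged, which is the "doesn't already exist on the stack" case.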