From: Nick Desaulniers <ndesaulniers@google.com>
To: Fangrui Song <maskray@google.com>
Cc: Kees Cook <keescook@chromium.org>, "KE . LI" <like1@oppo.com>,
	Nathan Chancellor <nathan@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	Miroslav Benes <mbenes@suse.cz>,
	"Gustavo A. R. Silva" <gustavoars@kernel.org>,
	Stephen Boyd <swboyd@chromium.org>,
	Sami Tolvanen <samitolvanen@google.com>,
	Joe Perches <joe@perches.com>,
	linux-kernel@vger.kernel.org, clang-built-linux@googlegroups.com
Subject: Re: [PATCH] kallsyms: strip LTO suffixes from static functions
Date: Mon, 28 Jun 2021 10:54:24 -0700
Message-ID: <CAKwvOdki=HZh4TYwqwDSo4BWtbGHp6pM_2akA+D3K8JO+dMGoQ@mail.gmail.com>
In-Reply-To: <20210622201822.ayavok3d2fw3u2pl@google.com>

On Tue, Jun 22, 2021 at 1:18 PM Fangrui Song <maskray@google.com> wrote:
>
> On 2021-06-22, 'Nick Desaulniers' via Clang Built Linux wrote:
> >Similar to:
> >commit 8b8e6b5d3b01 ("kallsyms: strip ThinLTO hashes from static
> >functions")
> >
> >It's very common for compilers to modify the symbol name for static
> >functions as part of optimizing transformations. That makes hooking
> >static functions (that weren't inlined or DCE'd) with kprobes difficult.
> >
> >Full LTO uses a different mangling scheme than thin LTO; full LTO
> >imports all code into effectively one big translation unit. It must
> >rename static functions to prevent collisions. Strip off these suffixes
> >so that we can continue to hook such static functions.
>
> See below. The message needs a change.
>
> I can comment on the LTO side thing, but a maintainer needs to check
> about the kernel side logic.
>
> Reviewed-by: Fangrui Song <maskray@google.com>
>
> >Reported-by: KE.LI(Lieke) <like1@oppo.com>
> >Tested-by: KE.LI(Lieke) <like1@oppo.com>
> >Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> >---
> > kernel/kallsyms.c | 18 ++++++++++++++++++
> > 1 file changed, 18 insertions(+)
> >
> >diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> >index 4067564ec59f..14cf3a6474de 100644
> >--- a/kernel/kallsyms.c
> >+++ b/kernel/kallsyms.c
> >@@ -188,6 +188,24 @@ static inline bool cleanup_symbol_name(char *s)
> >
> >       return res != NULL;
> > }
> >+#elif defined(CONFIG_LTO_CLANG_FULL)
> >+/*
> >+ * LLVM mangles static functions for full LTO so that two static functions with
> >+ * the same identifier do not collide when all code is combined into one
> >+ * module. The scheme used converts references to foo into
> >+ * foo.llvm.974640843467629774, for example. This can break hooking of static
> >+ * functions with kprobes.
> >+ */
>
> The comment should say ThinLTO instead.
>
> The .llvm.123 suffix is for global scope promotion for local linkage
> symbols. The scheme is ThinLTO specific. This ensures that a local

Oh, boy. Indeed.  I had identified the mangling coming from
getGlobalNameForLocal(), but looking at the call chain now I see:

FunctionImportGlobalProcessing::processGlobalForThinLTO()
-> FunctionImportGlobalProcessing::getPromotedName()
  -> ModuleSummaryIndex::getGlobalNameForLocal()

I'm not sure then how I figured it was specific to full LTO.

Android recently switched from ThinLTO to full LTO, which I assumed
was the cause of the bug report. Rereading our internal bug report, I
see that it was tested against a prior version that did the symbol
truncation for ThinLTO. I then assumed this was full LTO specific for
whatever reason, and modified the patch to only apply to full LTO. I
see via the above call chain that this patch is not correct. Let me
send my original patch as a v2. b/189560201 if you're interested.

> linkage symbol, when imported into multiple translation units, then
> compiled into different object files, during linking, the copies can be
> deduplicated. This matters for code size and for correctness when the
> function address is taken.
>
> Regular LTO (sometimes called full LTO) uses the regular name.\d+
> scheme.
>
> >+static inline bool cleanup_symbol_name(char *s)
> >+{
> >+      char *res;
> >+
> >+      res = strstr(s, ".llvm.");
> >+      if (res)
> >+              *res = '\0';
> >+
> >+      return res != NULL;
> >+}
> > #else
> > static inline bool cleanup_symbol_name(char *s) { return false; }
> > #endif
> >--
> >2.32.0.288.g62a8d224e6-goog
>
> I wonder whether it makes sense to strip all `.something` suffixes.
> For example, the recent -funique-internal-linkage-names (which can
> improve sample profile accuracy) uses the `.__uniq.1234` scheme.
>
> Function specialization/clones can create arbitrary `.123` suffixes.

I definitely don't see hooking static functions via kprobes as being
scalable. There are numerous different mangling schemes different
compilers apply to different static functions.
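
To make the trade-off concrete, here is a minimal userspace sketch (not
kernel code) of the two approaches discussed above. strip_llvm_suffix()
mirrors the body of cleanup_symbol_name() in the quoted patch.
strip_any_suffix() is a hypothetical name for the broader "strip all
`.something` suffixes" idea; it is an assumption for illustration only
and is not part of any posted patch.

```c
#include <stdbool.h>
#include <string.h>

/* Truncate at the ".llvm." promotion suffix, as the quoted patch does. */
static bool strip_llvm_suffix(char *s)
{
	char *res = strstr(s, ".llvm.");

	if (res)
		*res = '\0';

	return res != NULL;
}

/*
 * Hypothetical broader variant: truncate at the first '.' of any kind,
 * which also covers suffixes such as ".__uniq.1234" or ".123".
 */
static bool strip_any_suffix(char *s)
{
	char *res = strchr(s, '.');

	/* Leave names that start with '.' alone so they never become empty. */
	if (!res || res == s)
		return false;

	*res = '\0';
	return true;
}
```

With these helpers, "foo.llvm.974640843467629774" is reduced to "foo" by
either function, while "foo.__uniq.1234" is only reduced by the broader
one. The cost of the broader variant is that distinct clones such as
"foo.1" and "foo.2" would all collapse to the same name.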

--
Thanks,
~Nick Desaulniers
