Date: Mon, 21 Mar 2022 11:28:05 -0400
From: Steven Rostedt
To: Peter Zijlstra
Cc: Stephen Rothwell, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin",
    Linux Kernel Mailing List, Linux Next Mailing List,
    mhiramat@kernel.org, ast@kernel.org, hjl.tools@gmail.com,
    rick.p.edgecombe@intel.com, rppt@kernel.org,
    linux-toolchains@vger.kernel.org, Andrew.Cooper3@citrix.com,
    ndesaulniers@google.com
Subject: Re: linux-next: build warnings after merge of the tip tree
Message-ID: <20220321112805.1393f9b9@gandalf.local.home>
References: <20220321140327.777f9554@canb.auug.org.au>
X-Mailing-List: linux-next@vger.kernel.org

On Mon, 21 Mar 2022 14:04:05 +0100
Peter Zijlstra wrote:

> Ahh, something tracing. I'll go do some patches on top of it.
>
> Also, folks, I'm thinking we should start to move to __fexit__, if CET
> SHSTK ever wants to come to kernel land return trampolines will
> insta-stop working.
>
> Hjl, do you think we could get -mfexit to go along with -mfentry ?

If we do ever add a -mfexit, we will need to add a __ftail__ call,
because the current function exit tracing works for functions even with
tail calls.

 int funcA () {
        [..]
        return funcB();
 }

Can turn into:

        [..]
        pop all stack from funcA
        load reg params to funcB
        jmp funcB

Then when funcB does its

        [..]
        ret

It will pop the call site of funcA (not the call site of funcB) and
return to wherever called funcA with the proper return values.

This currently works with function graph and kretprobe tracing because
of the shadow stack. Let's say we traced both funcA and funcB

 funcA:
        call __fentry__

                Replace caller address with graph_trampoline and store
                the return caller into the shadow stack.

        [..]
        jmp funcB

 funcB:
        call __fentry__

                Replace caller address with graph_trampoline and store
                the return caller (which is the graph_trampoline that
                was switched earlier) in the shadow stack.

        [..]
        ret

                Returns to the graph_trampoline and we trace the return
                of funcB.
                Then we pop off the shadow stack and jump to that. But
                the address saved on the shadow stack is the
                graph_trampoline, which gets called again.

                Returns to the graph_trampoline and we trace the return
                of funcA. Then we pop off the shadow stack and jump to
                that, which is the original caller of funcA.

That is, the current algorithm traces the end of both funcA and funcB
without issue, because of how the shadow stack works.

Now if we add a __fexit__, we will need a way to tell the tracers how to
record this scenario. That is why I'm thinking of a jmp to __ftail__.

Perhaps something like:

 funcA:
        call __fentry__
        [..]
        push address of funcB
        jmp __ftail__
        jmp funcB

Where __ftail__ would do at the end:

        ret

to jump to funcB, and we skip the jmp to funcB anyway. And to "nop" it
out, we would have to convert it to:

 funcA:
        call __fentry__
        [..]
        jmp 1
        jmp __ftail__
 1:     jmp funcB

This is one way I can think of if we include a __fexit__. But to
maintain backward compatibility with function graph tracing (which is a
requirement), we need to be able to handle such cases.

Perhaps this is a good topic to bring up at Plumbers? :-)

Do I need to submit a tracing MC, or can we have this conversation at a
compiler / toolchain MC?

-- Steve
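
For anyone who wants to poke at the funcA/funcB return sequence described
above, here is a rough user-space model of it in C. It is only a sketch
under loud assumptions: the real stack is reduced to a single return slot,
"addresses" are just an enum, and fentry_hook(), run_tail_call_demo() and
the log format are hypothetical names made up for illustration. None of
this is the kernel's actual ftrace code.

```c
#include <assert.h>
#include <string.h>

#define MAX_DEPTH 16

/* The only two return targets this model knows about (assumption). */
enum addr { CALLER, TRAMPOLINE };

struct shadow_entry {
	enum addr ret;     /* return address that was replaced */
	const char *func;  /* function whose exit should be traced */
};

static enum addr ret_slot;  /* return address sitting on the "real" stack */
static struct shadow_entry shadow[MAX_DEPTH];
static int shadow_top;
static char trace_log[128];

/* Stand-in for the __fentry__ hook: save the return address on the
 * shadow stack and replace it with the trampoline. */
static void fentry_hook(const char *func)
{
	assert(shadow_top < MAX_DEPTH);
	shadow[shadow_top].ret = ret_slot;
	shadow[shadow_top].func = func;
	shadow_top++;
	ret_slot = TRAMPOLINE;
}

/* Stand-in for graph_trampoline: log the exit, pop the shadow stack and
 * hand back the address to jump to next. */
static enum addr graph_trampoline(void)
{
	struct shadow_entry e;

	assert(shadow_top > 0);
	e = shadow[--shadow_top];
	strcat(trace_log, "exit:");
	strcat(trace_log, e.func);
	strcat(trace_log, " ");
	return e.ret;
}

/* Walk through: caller calls funcA, funcA tail-calls funcB, funcB rets. */
static const char *run_tail_call_demo(void)
{
	enum addr pc;

	trace_log[0] = '\0';
	shadow_top = 0;

	ret_slot = CALLER;      /* caller: call funcA */
	fentry_hook("funcA");   /* funcA: call __fentry__ */
	                        /* funcA: jmp funcB (nothing is pushed) */
	fentry_hook("funcB");   /* funcB: call __fentry__, saves TRAMPOLINE */

	pc = ret_slot;          /* funcB: ret */
	while (pc == TRAMPOLINE)
		pc = graph_trampoline();

	return pc == CALLER ? trace_log : "lost";
}
```

Calling run_tail_call_demo() from a main() and printing the result shows
the exit of funcB logged first and then the exit of funcA, with control
ending up back at the original caller. The second trampoline pass happens
only because funcB's hook saved the trampoline address that funcA's hook
had already planted.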