From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
    aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
    by smtp.lore.kernel.org (Postfix) with ESMTP id 7FF78C433F5
    for ; Mon, 27 Sep 2021 08:59:22 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
    by mail.kernel.org (Postfix) with ESMTP id 5E43E60F24
    for ; Mon, 27 Sep 2021 08:59:22 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S233589AbhI0JA6 (ORCPT ); Mon, 27 Sep 2021 05:00:58 -0400
Received: from foss.arm.com ([217.140.110.172]:34354 "EHLO foss.arm.com"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S233594AbhI0JAu (ORCPT ); Mon, 27 Sep 2021 05:00:50 -0400
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
    by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7D531D6E;
    Mon, 27 Sep 2021 01:59:12 -0700 (PDT)
Received: from C02TD0UTHF1T.local (unknown [172.31.20.19])
    by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id B8B903F70D;
    Mon, 27 Sep 2021 01:59:10 -0700 (PDT)
Date: Mon, 27 Sep 2021 09:58:57 +0100
From: Mark Rutland
To: David Laight
Cc: Ard Biesheuvel, Peter Zijlstra, Frederic Weisbecker, Catalin Marinas,
    Will Deacon, LKML, James Morse, Quentin Perret, Christophe Leroy
Subject: Re: [PATCH 2/4] arm64: implement support for static call trampolines
Message-ID: <20210927085837.GA1131@C02TD0UTHF1T.local>
References: <20210920233237.90463-1-frederic@kernel.org>
    <20210920233237.90463-3-frederic@kernel.org>
    <20210921153352.GC35846@C02TD0UTHF1T.local>
    <20210921162804.GD35846@C02TD0UTHF1T.local>
    <944ef479f1104c4a97d0e3f629a9b765@AcuMS.aculab.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <944ef479f1104c4a97d0e3f629a9b765@AcuMS.aculab.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Sep 25, 2021
at 05:46:23PM +0000, David Laight wrote:
> From: Mark Rutland
> > Sent: 21 September 2021 17:28
> >
> > On Tue, Sep 21, 2021 at 05:55:11PM +0200, Ard Biesheuvel wrote:
> > > On Tue, 21 Sept 2021 at 17:33, Mark Rutland wrote:
> > > >
> > > > On Tue, Sep 21, 2021 at 04:44:56PM +0200, Ard Biesheuvel wrote:
> > > > > On Tue, 21 Sept 2021 at 09:10, Peter Zijlstra wrote:
> > > ...
> ...
> > > >
> > > > I think so, yes. We can do slightly better with an inline literal pool
> > > > and a PC-relative LDR to fold the ADRP+LDR, e.g.
> > > >
> > > >         .align 3
> > > > tramp:
> > > >         BTI     C
> > > >         {B | RET | NOP}
> > > >         LDR     X16, 1f
> > > >         BR      X16
> > > > 1:      .quad
> > > >
> > > > Since that's in the .text, it's RO for regular accesses anyway.
> > > >
> > >
> > > I tried to keep the literal in .rodata to avoid inadvertent gadgets
> > > and/or anticipate exec-only mappings of .text, but that may be a bit
> > > overzealous.
> >
> > I think that in practice the risk of gadgetisation is minimal, and
> > having it inline means we only need to record a single address per
> > trampoline, so there's less risk that we get the patching wrong.
>
> But doesn't that mean that it is almost certainly a data cache miss?
> You really want an instruction that reads the constant from the I-cache.
> Or at least be able to 'bunch together' the constants so they
> stand a chance of sharing a D-cache line.

The idea is that in the common case we don't even use the literal, and
the `B` goes straight to the target. The literal is only there as a
fallback for when the target is too far away for a direct branch (more
than +/-128MiB from the `BR X16`).

By default we try to keep modules within 128MiB of the kernel image, so
the fallback should only be needed in uncommon configs (e.g. my debug
kernel configs, where the kernel can be 100s of MiBs).

With that in mind, I'd strongly prefer to optimize for simplicity rather
than making the uncommon case faster.

Thanks,
Mark.
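
To make the range condition concrete, here is a minimal sketch in C of
the check that would decide between the direct `B` and the literal
fallback. It assumes the architectural encoding of AArch64 `B` (a signed
26-bit word offset, i.e. [-128MiB, +128MiB) from the branch itself). The
names `B_RANGE` and `b_can_reach()` are hypothetical, chosen for
illustration only, and are not taken from the patch or from any kernel
API.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * AArch64 B encodes a signed 26-bit offset in units of 4 bytes, so it can
 * reach targets in [-128MiB, +128MiB) relative to the branch instruction.
 */
#define B_RANGE ((int64_t)128 * 1024 * 1024)

/*
 * Hypothetical helper (not a kernel API): returns true if a direct B placed
 * at branch_pc can reach target, false if the trampoline would have to fall
 * back to loading the address from the literal and using BR.
 */
static bool b_can_reach(uint64_t branch_pc, uint64_t target)
{
	int64_t offset = (int64_t)(target - branch_pc);

	/* Instructions are 4-byte aligned; a misaligned offset is unencodable. */
	if (offset & 3)
		return false;

	return offset >= -B_RANGE && offset < B_RANGE;
}
```

With a check of this shape, a target inside the window would get the
direct branch, and anything else would be written into the literal and
reached via the `LDR X16` / `BR X16` pair.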