From: Ard Biesheuvel
Date: Thu, 29 Oct 2020 11:58:52 +0100
Subject: Re: [PATCH v2] arm64: implement support for static call trampolines
To: Peter Zijlstra
Cc: Mark Rutland, Catalin Marinas, Will Deacon, James Morse, Linux ARM
References: <20201028184114.6834-1-ardb@kernel.org> <20201029104007.GK2628@hirez.programming.kicks-ass.net>
In-Reply-To: <20201029104007.GK2628@hirez.programming.kicks-ass.net>

On Thu, 29 Oct 2020 at 11:40, Peter Zijlstra wrote:
>
> On Wed, Oct 28, 2020 at 07:41:14PM +0100, Ard Biesheuvel wrote:
> > +/*
> > + * The static call trampoline consists of one of the following sequences:
> > + *
> > + *     (A)           (B)           (C)           (D)           (E)
> > + * 00: BTI  C        BTI  C        BTI  C        BTI  C        BTI  C
> > + * 04: B    fn       NOP           NOP           NOP           NOP
> > + * 08: RET           RET           ADRP X16, fn  ADRP X16, fn  ADRP X16, fn
> > + * 0c: NOP           NOP           ADD  X16, fn  ADD  X16, fn  ADD  X16, fn
> > + * 10:                             BR   X16      RET           NOP
> > + * 14:                                                         ADRP X16, &fn
> > + * 18:                                                         LDR  X16, [X16, &fn]
> > + * 1c:                                                         BR   X16
> > + *
> > + * The architecture permits us to patch B instructions into NOPs or vice versa
> > + * directly, but patching any other instruction sequence requires careful
> > + * synchronization.
> > + * Since branch targets may be out of range for ordinary
> > + * immediate branch instructions, we may have to fall back to ADRP/ADD/BR
> > + * sequences in some cases, which complicates things considerably; since any
> > + * sleeping tasks may have been preempted right in the middle of any of these
> > + * sequences, we have to carefully transform one into the other, and ensure
> > + * that it is safe to resume execution at any point in the sequence for tasks
> > + * that have already executed part of it.
> > + *
> > + * So the rules are:
> > + * - we start out with (A) or (B)
> > + * - a branch within immediate range can always be patched in at offset 0x4;
> > + * - sequence (A) can be turned into (B) for NULL branch targets;
> > + * - a branch outside of immediate range can be patched using (C), but only if
> > + *   . the sequence being updated is (A) or (B), or
> > + *   . the branch target address modulo 4k results in the same ADD opcode
> > + *     (which could occur when patching the same far target a second time)
> > + * - once we have patched in (C) we cannot go back to (A) or (B), so patching
> > + *   in a NULL target now requires sequence (D);
> > + * - if we cannot patch a far target using (C), we fall back to sequence (E),
> > + *   which loads the function pointer from memory.
> > + *
> > + * If we abide by these rules, then the following must hold for tasks that were
> > + * interrupted halfway through execution of the trampoline:
> > + * - when resuming at offset 0x8, we can only encounter a RET if (B) or (D)
> > + *   was patched in at any point, and therefore a NULL target is valid;
> > + * - when resuming at offset 0xc, we are executing the ADD opcode that is only
> > + *   reachable via the preceding ADRP, and which is patched in only a single
> > + *   time, and is therefore guaranteed to be consistent with the ADRP target;
> > + * - when resuming at offset 0x10, X16 must refer to a valid target, since it
> > + *   is only reachable via a ADRP/ADD pair that is guaranteed to be consistent.
> > + *
> > + * Note that sequence (E) is only used when switching between multiple far
> > + * targets, and that it is not a terminal degraded state.
> > + */
>
> Would it make things easier if your trampoline consisted of two complete
> slots, between which you can flip?
>
> Something like:
>
>	0x00	B 0x24 / NOP
>	0x04	< slot 1 >
>		....
>	0x20
>	0x24	< slot 2 >
>		....
>	0x40
>
> Then each (20 byte) slot can contain any of the variants above and you
> can write the unused slot without stop-machine. Then, when the unused
> slot is populated, flip the initial instruction (like a static-branch),
> issue synchronize_rcu_tasks() and flip to using the other slot for next
> time.
>

Once we've populated a slot and activated it, we have to assume that it
is live and we can no longer modify it freely.

> Alternatively, you can patch the call-sites to point to the alternative
> trampoline slot, but that might be pushing things a bit.
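
For reference, the patching rules in the quoted comment can be read as a
small decision function. The following is only a sketch of that reading,
not code from the patch: the names tramp_action, tramp_state,
in_branch_range() and choose_action() are hypothetical, and the model
assumes that all that needs to be remembered about a trampoline is
whether an ADRP/ADD pair has already been patched in and which low 12
bits its ADD encodes. The +/-128 MiB range is that of the AArch64 B
instruction (signed 26-bit immediate, scaled by 4), and the branch is
assumed to sit at offset 0x4 as in the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the letters refer to the sequences in the comment. */
enum tramp_action {
	PATCH_NEAR_BRANCH,	/* write "B fn" at offset 0x4 */
	USE_SEQ_B,		/* NULL target, no ADRP/ADD present yet */
	USE_SEQ_C,		/* far target via ADRP/ADD/BR */
	USE_SEQ_D,		/* NULL target after (C) was patched in */
	USE_SEQ_E,		/* far target loaded from memory */
};

/* Assumed minimal state that has to be tracked per trampoline. */
struct tramp_state {
	bool has_adrp_add;	/* (C), (D) or (E) was patched in before */
	uint32_t add_lo12;	/* low 12 bits encoded in the existing ADD */
};

/* AArch64 B reaches +/-128 MiB from the branch instruction itself. */
static bool in_branch_range(uint64_t pc, uint64_t target)
{
	int64_t off = (int64_t)(target - pc);

	return off >= -(INT64_C(1) << 27) && off < (INT64_C(1) << 27);
}

static enum tramp_action choose_action(const struct tramp_state *st,
				       uint64_t tramp, uint64_t target)
{
	assert(st);

	/* NULL target: (B) while still pristine, (D) once ADRP/ADD exists. */
	if (!target)
		return st->has_adrp_add ? USE_SEQ_D : USE_SEQ_B;

	/* A near branch can always be patched in at offset 0x4. */
	if (in_branch_range(tramp + 4, target))
		return PATCH_NEAR_BRANCH;

	/* (C) only from (A)/(B), or if the ADD opcode would not change. */
	if (!st->has_adrp_add || (target & 0xfff) == st->add_lo12)
		return USE_SEQ_C;

	/* Otherwise fall back to loading the pointer from memory. */
	return USE_SEQ_E;
}
```

With this model, a trampoline that already carries an ADRP/ADD pair only
gets (C) again when the new far target has the same offset within its 4k
page; any other far target ends up in (E).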