From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <264ecbe1-4514-d6c8-182b-3af4babb457e@huawei.com>
Date: Mon, 16 May 2022 14:55:46 +0800
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.9.0
Subject: Re: [PATCH bpf-next v3 4/7] bpf, arm64: Impelment bpf_arch_text_poke() for arm64
Content-Language: en-US
To: Mark Rutland
CC: Catalin Marinas, Will Deacon, Steven Rostedt, Ingo Molnar,
 Daniel Borkmann, Alexei Starovoitov, Zi Shen Lim, Andrii Nakryiko,
 Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend, KP Singh,
 "David S. Miller", Hideaki YOSHIFUJI, David Ahern, Thomas Gleixner,
 Borislav Petkov, Dave Hansen, Shuah Khan, Jakub Kicinski,
 Jesper Dangaard Brouer, Pasha Tatashin, Ard Biesheuvel, Daniel Kiss,
 Steven Price, Sudeep Holla, Marc Zyngier, Peter Collingbourne,
 Mark Brown, Delyan Kratunov, Kumar Kartikeya Dwivedi
References: <20220424154028.1698685-1-xukuohai@huawei.com> <20220424154028.1698685-5-xukuohai@huawei.com>
From: Xu Kuohai
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On 5/13/2022 10:59 PM, Mark Rutland wrote:
> On Sun, Apr 24, 2022 at 11:40:25AM -0400, Xu Kuohai wrote:
>> Implement bpf_arch_text_poke() for arm64, so bpf trampoline code can use
>> it to replace nop with jump, or replace jump with nop.
>>
>> Signed-off-by: Xu Kuohai
>> Acked-by: Song Liu
>> ---
>>  arch/arm64/net/bpf_jit_comp.c | 63 +++++++++++++++++++++++++++++++++++
>>  1 file changed, 63 insertions(+)
>>
>> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
>> index 8ab4035dea27..3f9bdfec54c4 100644
>> --- a/arch/arm64/net/bpf_jit_comp.c
>> +++ b/arch/arm64/net/bpf_jit_comp.c
>> @@ -9,6 +9,7 @@
>>
>>  #include
>>  #include
>> +#include
>>  #include
>>  #include
>>  #include
>> @@ -18,6 +19,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>  #include
>>
>>  #include "bpf_jit.h"
>> @@ -1529,3 +1531,64 @@ void bpf_jit_free_exec(void *addr)
>>  {
>>  	return vfree(addr);
>>  }
>> +
>> +static int gen_branch_or_nop(enum aarch64_insn_branch_type type, void *ip,
>> +			     void *addr, u32 *insn)
>> +{
>> +	if (!addr)
>> +		*insn = aarch64_insn_gen_nop();
>> +	else
>> +		*insn = aarch64_insn_gen_branch_imm((unsigned long)ip,
>> +						    (unsigned long)addr,
>> +						    type);
>> +
>> +	return *insn != AARCH64_BREAK_FAULT ? 0 : -EFAULT;
>> +}
>> +
>> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
>> +		       void *old_addr, void *new_addr)
>> +{
>> +	int ret;
>> +	u32 old_insn;
>> +	u32 new_insn;
>> +	u32 replaced;
>> +	enum aarch64_insn_branch_type branch_type;
>> +
>> +	if (!is_bpf_text_address((long)ip))
>> +		/* Only poking bpf text is supported. Since kernel function
>> +		 * entry is set up by ftrace, we rely on ftrace to poke kernel
>> +		 * functions. For kernel functions, bpf_arch_text_poke() is only
>> +		 * called after a failed poke with ftrace. In this case, there
>> +		 * is probably something wrong with fentry, so there is nothing
>> +		 * we can do here. See register_fentry, unregister_fentry and
>> +		 * modify_fentry for details.
>> +		 */
>> +		return -EINVAL;
>
> If you rely on ftrace to poke functions, why do you need to patch text
> at all? Why does the rest of this function exist?
>
> I really don't like having another piece of code outside of ftrace
> patching the ftrace patch-site; this needs a much better explanation.
>

Sorry for the incorrect explanation in the comment. I don't think it's
reasonable to patch the ftrace patch-site without ftrace code either.

The patching logic in register_fentry, unregister_fentry and
modify_fentry is as follows:

	if (tr->func.ftrace_managed)
		ret = register_ftrace_direct((long)ip, (long)new_addr);
	else
		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, NULL, new_addr);

The ftrace patch-site is patched by ftrace code. bpf_arch_text_poke() is
only used to patch bpf progs and bpf trampolines, which are not managed
by ftrace.

>> +
>> +	if (poke_type == BPF_MOD_CALL)
>> +		branch_type = AARCH64_INSN_BRANCH_LINK;
>> +	else
>> +		branch_type = AARCH64_INSN_BRANCH_NOLINK;
>> +
>> +	if (gen_branch_or_nop(branch_type, ip, old_addr, &old_insn) < 0)
>> +		return -EFAULT;
>> +
>> +	if (gen_branch_or_nop(branch_type, ip, new_addr, &new_insn) < 0)
>> +		return -EFAULT;
>> +
>> +	mutex_lock(&text_mutex);
>> +	if (aarch64_insn_read(ip, &replaced)) {
>> +		ret = -EFAULT;
>> +		goto out;
>> +	}
>> +
>> +	if (replaced != old_insn) {
>> +		ret = -EFAULT;
>> +		goto out;
>> +	}
>> +
>> +	ret = aarch64_insn_patch_text_nosync((void *)ip, new_insn);
>
> ... and where does the actual synchronization come from in this case?
>

aarch64_insn_patch_text_nosync() replaces an instruction atomically, so
no other CPU will fetch a half-new, half-old instruction.

The remaining concern is that another CPU may still fetch the old
instruction after bpf_arch_text_poke() finishes, that is, different CPUs
may execute different versions of the instruction at the same time.

1. When a new trampoline is attached, it doesn't seem to be an issue for
   different CPUs to jump to different trampolines temporarily.

2. When an old trampoline is freed, we should wait for all other CPUs to
   exit the trampoline and make sure the trampoline is no longer
   reachable. IIUC, bpf_tramp_image_put() already uses percpu_ref and
   rcu tasks to do this.

> Thanks,
> Mark.
>
>> +out:
>> +	mutex_unlock(&text_mutex);
>> +	return ret;
>> +}
>> --
>> 2.30.2
>>
> .
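
To make the check-then-replace sequence discussed in this thread easier
to follow, here is a minimal user-space sketch in C. It is not the kernel
code: every sketch_* and SKETCH_* name is hypothetical, the "could not
encode" marker is an arbitrary placeholder rather than the kernel's
AARCH64_BREAK_FAULT, and a single C11 compare-and-swap stands in for the
mutex-protected read, compare and write in the patch. Only the A64 NOP
and B/BL immediate encodings are taken from the architecture.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NOP      0xd503201fu /* A64 NOP */
#define SKETCH_B        0x14000000u /* A64 B  <imm26> opcode bits */
#define SKETCH_BL       0x94000000u /* A64 BL <imm26> opcode bits */
#define SKETCH_BAD_INSN 0u          /* hypothetical "could not encode" marker */

enum sketch_branch { SKETCH_BRANCH_NOLINK, SKETCH_BRANCH_LINK };

/*
 * Encode a NOP when target is 0, otherwise a B or BL from pc to target.
 * Returns SKETCH_BAD_INSN if the target cannot be reached.
 */
static uint32_t sketch_gen_branch_or_nop(enum sketch_branch type,
                                         uint64_t pc, uint64_t target)
{
    int64_t off;

    if (!target)
        return SKETCH_NOP;

    off = (int64_t)(target - pc);
    /* imm26 is a signed word offset: 4-byte aligned, within +/-128 MiB. */
    if ((off & 3) || off < -(INT64_C(1) << 27) || off >= (INT64_C(1) << 27))
        return SKETCH_BAD_INSN;

    return (type == SKETCH_BRANCH_LINK ? SKETCH_BL : SKETCH_B) |
           (uint32_t)(((uint64_t)off >> 2) & 0x03ffffffu);
}

/*
 * Replace the instruction at *site only if it currently matches the
 * instruction implied by old_target. Returns 0 on success and -1 if an
 * encoding fails or the site holds something unexpected.
 */
static int sketch_text_poke(_Atomic uint32_t *site, enum sketch_branch type,
                            uint64_t pc, uint64_t old_target,
                            uint64_t new_target)
{
    uint32_t old_insn = sketch_gen_branch_or_nop(type, pc, old_target);
    uint32_t new_insn = sketch_gen_branch_or_nop(type, pc, new_target);
    uint32_t expected = old_insn;

    assert(site != NULL);

    if (old_insn == SKETCH_BAD_INSN || new_insn == SKETCH_BAD_INSN)
        return -1;

    /* One 32-bit CAS: a reader sees either the old or the new word. */
    if (!atomic_compare_exchange_strong(site, &expected, new_insn))
        return -1;

    return 0;
}
```

Calling sketch_text_poke() with old_target == 0 turns a NOP into a
branch, and calling it with new_target == 0 turns the branch back into a
NOP; a mismatch between the site and the expected old instruction is
reported as an error instead of being overwritten. Like the patch, the
sketch says nothing about when other CPUs stop executing the old
instruction; that is the separate lifetime question covered by points 1
and 2 above.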