From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 16 May 2022 08:18:32 +0100
From: Mark Rutland
To: Xu Kuohai
Cc: bpf@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	linux-kselftest@vger.kernel.org, Catalin Marinas, Will Deacon,
	Steven Rostedt, Ingo Molnar, Daniel Borkmann, Alexei Starovoitov,
	Zi Shen Lim, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, "David S. Miller",
	Hideaki YOSHIFUJI, David Ahern, Thomas Gleixner, Borislav Petkov,
	Dave Hansen, x86@kernel.org, hpa@zytor.com, Shuah Khan,
	Jakub Kicinski, Jesper Dangaard Brouer, Pasha Tatashin,
	Ard Biesheuvel, Daniel Kiss, Steven Price, Sudeep Holla,
	Marc Zyngier, Peter Collingbourne, Mark Brown, Delyan Kratunov,
	Kumar Kartikeya Dwivedi
Subject: Re: [PATCH bpf-next v3 4/7] bpf, arm64: Implement bpf_arch_text_poke() for arm64
References: <20220424154028.1698685-1-xukuohai@huawei.com>
	<20220424154028.1698685-5-xukuohai@huawei.com>
	<264ecbe1-4514-d6c8-182b-3af4babb457e@huawei.com>
In-Reply-To: <264ecbe1-4514-d6c8-182b-3af4babb457e@huawei.com>

On Mon, May 16, 2022 at 02:55:46PM +0800, Xu Kuohai wrote:
> On 5/13/2022 10:59 PM, Mark Rutland wrote:
> > On Sun, Apr 24, 2022 at 11:40:25AM -0400, Xu Kuohai wrote:
> >> Implement bpf_arch_text_poke() for arm64, so bpf trampoline code can
> >> use it to replace nop with jump, or replace jump with nop.
> >>
> >> Signed-off-by: Xu Kuohai
> >> Acked-by: Song Liu
> >> ---
> >>  arch/arm64/net/bpf_jit_comp.c | 63 +++++++++++++++++++++++++++++++++++
> >>  1 file changed, 63 insertions(+)
> >>
> >> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
> >> index 8ab4035dea27..3f9bdfec54c4 100644
> >> --- a/arch/arm64/net/bpf_jit_comp.c
> >> +++ b/arch/arm64/net/bpf_jit_comp.c
> >> @@ -9,6 +9,7 @@
> >>
> >>  #include
> >>  #include
> >> +#include
> >>  #include
> >>  #include
> >>  #include
> >> @@ -18,6 +19,7 @@
> >>  #include
> >>  #include
> >>  #include
> >> +#include
> >>  #include
> >>
> >>  #include "bpf_jit.h"
> >> @@ -1529,3 +1531,64 @@ void bpf_jit_free_exec(void *addr)
> >>  {
> >>  	return vfree(addr);
> >>  }
> >> +
> >> +static int gen_branch_or_nop(enum aarch64_insn_branch_type type, void *ip,
> >> +			     void *addr, u32 *insn)
> >> +{
> >> +	if (!addr)
> >> +		*insn = aarch64_insn_gen_nop();
> >> +	else
> >> +		*insn = aarch64_insn_gen_branch_imm((unsigned long)ip,
> >> +						    (unsigned long)addr,
> >> +						    type);
> >> +
> >> +	return *insn != AARCH64_BREAK_FAULT ? 0 : -EFAULT;
> >> +}
> >> +
> >> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> >> +		       void *old_addr, void *new_addr)
> >> +{
> >> +	int ret;
> >> +	u32 old_insn;
> >> +	u32 new_insn;
> >> +	u32 replaced;
> >> +	enum aarch64_insn_branch_type branch_type;
> >> +
> >> +	if (!is_bpf_text_address((long)ip))
> >> +		/* Only poking bpf text is supported. Since kernel function
> >> +		 * entry is set up by ftrace, we rely on ftrace to poke kernel
> >> +		 * functions. For kernel functions, bpf_arch_text_poke() is
> >> +		 * only called after a failed poke with ftrace. In this case,
> >> +		 * there is probably something wrong with fentry, so there is
> >> +		 * nothing we can do here. See register_fentry,
> >> +		 * unregister_fentry and modify_fentry for details.
> >> +		 */
> >> +		return -EINVAL;
> >
> > If you rely on ftrace to poke functions, why do you need to patch text
> > at all? Why does the rest of this function exist?
> >
> > I really don't like having another piece of code outside of ftrace
> > patching the ftrace patch-site; this needs a much better explanation.
> >
>
> Sorry for the incorrect explanation in the comment. I don't think it's
> reasonable to patch the ftrace patch-site without ftrace code either.
>
> The patching logic in register_fentry, unregister_fentry and
> modify_fentry is as follows:
>
> 	if (tr->func.ftrace_managed)
> 		ret = register_ftrace_direct((long)ip, (long)new_addr);
> 	else
> 		ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, NULL, new_addr,
> 					 true);
>
> The ftrace patch-site is patched by ftrace code. bpf_arch_text_poke()
> is only used to patch bpf prog and bpf trampoline, which are not
> managed by ftrace.

Sorry, I had misunderstood. Thanks for the correction! I'll have another
look with that in mind.

> >> +
> >> +	if (poke_type == BPF_MOD_CALL)
> >> +		branch_type = AARCH64_INSN_BRANCH_LINK;
> >> +	else
> >> +		branch_type = AARCH64_INSN_BRANCH_NOLINK;
> >> +
> >> +	if (gen_branch_or_nop(branch_type, ip, old_addr, &old_insn) < 0)
> >> +		return -EFAULT;
> >> +
> >> +	if (gen_branch_or_nop(branch_type, ip, new_addr, &new_insn) < 0)
> >> +		return -EFAULT;
> >> +
> >> +	mutex_lock(&text_mutex);
> >> +	if (aarch64_insn_read(ip, &replaced)) {
> >> +		ret = -EFAULT;
> >> +		goto out;
> >> +	}
> >> +
> >> +	if (replaced != old_insn) {
> >> +		ret = -EFAULT;
> >> +		goto out;
> >> +	}
> >> +
> >> +	ret = aarch64_insn_patch_text_nosync((void *)ip, new_insn);
> >
> > ... and where does the actual synchronization come from in this case?
>
> aarch64_insn_patch_text_nosync() replaces an instruction atomically, so
> no other CPUs will fetch a half-new and half-old instruction.
>
> The scenario here is that there is a chance that another CPU fetches
> the old instruction after bpf_arch_text_poke() finishes, that is,
> different CPUs may execute different versions of instructions at the
> same time.
>
> 1.
> When a new trampoline is attached, it doesn't seem to be an issue for
> different CPUs to jump to different trampolines temporarily.
>
> 2. When an old trampoline is freed, we should wait for all other CPUs
> to exit the trampoline and make sure the trampoline is no longer
> reachable. IIUC, the bpf_tramp_image_put() function already uses
> percpu_ref and rcu tasks to do this.

It would be good to have a comment for these points.

Thanks,
Mark.