From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Puranjay Mohan <puranjay12@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	bpf <bpf@vger.kernel.org>, KP Singh <kpsingh@kernel.org>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH bpf-next v5 1/3] arm64: patching: Add aarch64_insn_copy()
Date: Thu, 2 Nov 2023 10:41:17 -0700	[thread overview]
Message-ID: <CAADnVQKNtMw1JBShJsf003ogfuCF+J7_NeQcKQjgVVAM26ZDDw@mail.gmail.com> (raw)
In-Reply-To: <ZUPL-TeBpl1WEN7M@FVFF77S0Q05N.cambridge.arm.com>

On Thu, Nov 2, 2023 at 9:19 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> Hi Puranjay,
>
> On Fri, Sep 08, 2023 at 02:43:18PM +0000, Puranjay Mohan wrote:
> > This will be used by the BPF JIT compiler to dump the JITed binary to an
> > RX huge page, and thus allow multiple BPF programs to share a huge (2MB)
> > page.
> >
> > The bpf_prog_pack allocator that implements the above feature allocates
> > an RX/RW buffer pair. The JITed code is written to the RW buffer, and this
> > function is then used to copy the code from the RW to the RX buffer.
> >
> > Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
> > Acked-by: Song Liu <song@kernel.org>
> > ---
> >  arch/arm64/include/asm/patching.h |  1 +
> >  arch/arm64/kernel/patching.c      | 41 +++++++++++++++++++++++++++++++
> >  2 files changed, 42 insertions(+)
> >
> > diff --git a/arch/arm64/include/asm/patching.h b/arch/arm64/include/asm/patching.h
> > index 68908b82b168..f78a0409cbdb 100644
> > --- a/arch/arm64/include/asm/patching.h
> > +++ b/arch/arm64/include/asm/patching.h
> > @@ -8,6 +8,7 @@ int aarch64_insn_read(void *addr, u32 *insnp);
> >  int aarch64_insn_write(void *addr, u32 insn);
> >
> >  int aarch64_insn_write_literal_u64(void *addr, u64 val);
> > +void *aarch64_insn_copy(void *dst, const void *src, size_t len);
> >
> >  int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
> >  int aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt);
> > diff --git a/arch/arm64/kernel/patching.c b/arch/arm64/kernel/patching.c
> > index b4835f6d594b..243d6ae8d2d8 100644
> > --- a/arch/arm64/kernel/patching.c
> > +++ b/arch/arm64/kernel/patching.c
> > @@ -105,6 +105,47 @@ noinstr int aarch64_insn_write_literal_u64(void *addr, u64 val)
> >       return ret;
> >  }
> >
> > +/**
> > + * aarch64_insn_copy - Copy instructions into (an unused part of) RX memory
> > + * @dst: address to modify
> > + * @src: source of the copy
> > + * @len: length to copy
> > + *
> > + * Useful for JITs to dump new code blocks into unused regions of RX memory.
> > + */
> > +noinstr void *aarch64_insn_copy(void *dst, const void *src, size_t len)
> > +{
> > +     unsigned long flags;
> > +     size_t patched = 0;
> > +     size_t size;
> > +     void *waddr;
> > +     void *ptr;
> > +     int ret;
> > +
> > +     raw_spin_lock_irqsave(&patch_lock, flags);
> > +
> > +     while (patched < len) {
> > +             ptr = dst + patched;
> > +             size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
> > +                          len - patched);
> > +
> > +             waddr = patch_map(ptr, FIX_TEXT_POKE0);
> > +             ret = copy_to_kernel_nofault(waddr, src + patched, size);
> > +             patch_unmap(FIX_TEXT_POKE0);
> > +
> > +             if (ret < 0) {
> > +                     raw_spin_unlock_irqrestore(&patch_lock, flags);
> > +                     return NULL;
> > +             }
> > +             patched += size;
> > +     }
> > +     raw_spin_unlock_irqrestore(&patch_lock, flags);
> > +
> > +     caches_clean_inval_pou((uintptr_t)dst, (uintptr_t)dst + len);
>
> As Xu mentioned, either this needs to use flush_icache_range() to IPI all CPUs
> in the system, or we need to make it the caller's responsibility to do that.
>
> Otherwise, I think this is functionally ok, but I'm not certain that it's good
> for BPF to be using the FIX_TEXT_POKE0 slot as that will serialize all BPF
> loading, ftrace, kprobes, etc against one another. Do we ever expect to load
> multiple BPF programs in parallel, or is that serialized at a higher level?

BPF program loading is pretty much serialized by the verifier;
it's a very slow operation.


  reply	other threads:[~2023-11-02 17:41 UTC|newest]

Thread overview: 26+ messages
2023-09-08 14:43 [PATCH bpf-next v5 0/3] bpf, arm64: use BPF prog pack allocator in BPF JIT Puranjay Mohan
2023-09-08 14:43 ` [PATCH bpf-next v5 1/3] arm64: patching: Add aarch64_insn_copy() Puranjay Mohan
2023-09-09  9:04   ` Xu Kuohai
2023-09-21 14:33     ` Puranjay Mohan
2023-11-02 16:19   ` Mark Rutland
2023-11-02 17:41     ` Alexei Starovoitov [this message]
2023-09-08 14:43 ` [PATCH bpf-next v5 2/3] arm64: patching: Add aarch64_insn_set() Puranjay Mohan
2023-09-09  9:13   ` Xu Kuohai
2023-09-21 14:50     ` Puranjay Mohan
2023-09-22  1:25       ` Xu Kuohai
2023-11-02 16:26   ` Mark Rutland
2023-09-08 14:43 ` [PATCH bpf-next v5 3/3] bpf, arm64: use bpf_jit_binary_pack_alloc Puranjay Mohan
2023-09-09  8:59   ` Xu Kuohai
