From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf <bpf@vger.kernel.org>, Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@kernel.org>,
Eddy Z <eddyz87@gmail.com>, Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next] bpf: Optimize emit_mov_imm64().
Date: Tue, 2 Apr 2024 19:34:52 -0700
Message-ID: <CAADnVQKFfpY-QZBrOU2CG8v2du8Lgyb7MNVmOZVK_yTyOdNbBA@mail.gmail.com>
In-Reply-To: <ec622bce-eb17-e774-db03-7541a31661f4@iogearbox.net>
On Tue, Apr 2, 2024 at 8:48 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 4/2/24 1:38 AM, Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > Turned out that bpf prog callback addresses, bpf prog addresses
> > used in bpf_trampoline, and in other cases the 64-bit address
> > can be represented as sign extended 32-bit value.
> > According to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82339
> > "Skylake has 0.64c throughput for mov r64, imm64, vs. 0.25 for mov r32, imm32."
> > So use shorter encoding and faster instruction when possible.
> >
> > Special care is needed in jit_subprogs(), since bpf_pseudo_func()
> > instruction cannot change its size during the last step of JIT.
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
> > arch/x86/net/bpf_jit_comp.c | 5 ++++-
> > kernel/bpf/verifier.c | 13 ++++++++++---
> > 2 files changed, 14 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index 3b639d6f2f54..47abddac6dc3 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -816,9 +816,10 @@ static void emit_mov_imm32(u8 **pprog, bool sign_propagate,
> > static void emit_mov_imm64(u8 **pprog, u32 dst_reg,
> > const u32 imm32_hi, const u32 imm32_lo)
> > {
> > + u64 imm64 = ((u64)imm32_hi << 32) | (u32)imm32_lo;
> > u8 *prog = *pprog;
> >
> > - if (is_uimm32(((u64)imm32_hi << 32) | (u32)imm32_lo)) {
> > + if (is_uimm32(imm64)) {
> > /*
> > * For emitting plain u32, where sign bit must not be
> > * propagated LLVM tends to load imm64 over mov32
> > @@ -826,6 +827,8 @@ static void emit_mov_imm64(u8 **pprog, u32 dst_reg,
> > * 'mov %eax, imm32' instead.
> > */
> > emit_mov_imm32(&prog, false, dst_reg, imm32_lo);
> > + } else if (is_simm32(imm64)) {
> > + emit_mov_imm32(&prog, true, dst_reg, imm32_lo);
> > } else {
> > /* movabsq rax, imm64 */
> > EMIT2(add_1mod(0x48, dst_reg), add_1reg(0xB8, dst_reg));
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index edb650667f44..d4a338e7b5e7 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -19145,12 +19145,19 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > env->insn_aux_data[i].call_imm = insn->imm;
> > /* point imm to __bpf_call_base+1 from JITs point of view */
> > insn->imm = 1;
> > - if (bpf_pseudo_func(insn))
> > + if (bpf_pseudo_func(insn)) {
> > +#if defined(MODULES_VADDR)
> > + u64 addr = MODULES_VADDR;
> > +#else
> > + u64 addr = VMALLOC_START;
> > +#endif
>
> Is this beneficial for all archs? It seems this patch is mainly targetting x86.
> Why not having a weak function like u64 bpf_jit_alloc_exec_start() which returns
> the MODULES_VADDR for x86, but leaves the rest as-is?
>
> For example, arm64 has MODULES_VADDR defined, but the allocator uses vmalloc
> range instead, see bpf_jit_alloc_exec() there, so this is a different pool and
> it's also not clear if this is better or worse wrt its imm encoding.
This part makes no difference to any JIT except x86.

Back when commit 3990ed4c4266 ("bpf: Stop caching subprog index in the
bpf_pseudo_func insn") added the comment below, "jit (e.g. x86_64) may
emit fewer instructions", pseudo_func-s were introduced for x86, and
only the x86 JIT has this behavior.
Since then other JITs have added support for pseudo_func-s, but none
of them rely on this part of the verifier.
So the comment still applies to x86 only (afaics).

s390, riscv and arm64 went with "if (bpf_pseudo_func(insn))" and process
ld_imm64 differently regardless of the values of
insn[0].imm and insn[1].imm.
I think that's a bit wrong.

I considered removing this if (bpf_pseudo_func(insn)) from verifier.c
and doing a similar hack in the x86 JIT, but decided against it.
The previous insn[1].imm = 1 was a hack targeted at x86.
It served its purpose for 3 years.
A hack, but imo cleaner than if (bpf_pseudo_func(insn)) in JITs.
Since I'm making emit_mov_imm64() smarter, this part of verifier.c
needs to be a bit more accurate about the value it represents.

Whether it's MODULES_VADDR or VMALLOC_START doesn't make a difference.
It's a kernel text address. It could be (long)&_text, fwiw.
I believe all JITs can potentially generalize the
if (bpf_pseudo_func(insn)) check into if (kernel_addr(imm64)),
but that's a follow-up for somebody.

A weak helper bpf_jit_alloc_exec_start() is certainly overkill.
A pseudo_func callback doesn't have to be a jit-ed bpf prog.
It's the address of a function.
If there is ever an arch where kernel and jit-ed code need different
insns to represent an address, we will tackle that issue then.
Notice that we have similar #if defined(MODULES_VADDR)
logic in bpf_jit_alloc_exec_limit() that was added 6 years ago
and it's still fine. No need to over-design this one either.
>
> > /* jit (e.g. x86_64) may emit fewer instructions
> > * if it learns a u32 imm is the same as a u64 imm.
> > - * Force a non zero here.
> > + * Set close enough to possible prog address.
> > */
> > - insn[1].imm = 1;
> > + insn[0].imm = (u32)addr;
> > + insn[1].imm = addr >> 32;
> > + }