Re: [PATCH bpf-next] bpf: x86: Explicitly zero-extend rax after 32-bit cmpxchg

From: Ilya Leoshkevich <iii@linux.ibm.com>
To: Daniel Borkmann <daniel@iogearbox.net>,
	Brendan Jackman <jackmanb@google.com>,
	bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	KP Singh <kpsingh@chromium.org>,
	Florent Revest <revest@chromium.org>
Subject: Re: [PATCH bpf-next] bpf: x86: Explicitly zero-extend rax after 32-bit cmpxchg
Date: Mon, 15 Feb 2021 23:42:32 +0100
Message-ID: <5f7b836cc07980352215a5ad9a959c7e7c47f1cf.camel@linux.ibm.com>
In-Reply-To: <725b73b5-be08-f253-165d-e027ec568691@iogearbox.net>

On Mon, 2021-02-15 at 23:35 +0100, Daniel Borkmann wrote:
> On 2/15/21 11:24 PM, Ilya Leoshkevich wrote:
> > On Mon, 2021-02-15 at 23:20 +0100, Daniel Borkmann wrote:
> > > On 2/15/21 6:12 PM, Brendan Jackman wrote:
> > > > As pointed out by Ilya and explained in the new comment, there's a
> > > > discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
> > > > the value from memory into r0, while x86 only does so when r0 and the
> > > > value in memory are different.
> > > > 
> > > > At first this might sound like pure semantics, but it makes a real
> > > > difference when the comparison is 32-bit, since the load will
> > > > zero-extend r0/rax.
> > > > 
> > > > The fix is to explicitly zero-extend rax after doing such a CMPXCHG.
> > > > 
> > > > Note that this doesn't generate totally optimal code: at one of
> > > > emit_atomic's callsites (where BPF_{AND,OR,XOR} | BPF_FETCH are
> > > > implemented), the new mov is superfluous because there's already a
> > > > mov generated afterwards that will zero-extend r0. We could avoid
> > > > this unnecessary mov by just moving the new logic outside of
> > > > emit_atomic. But I think it's simpler to keep emit_atomic as a unit
> > > > of correctness (it generates the correct x86 code for a certain set
> > > > of BPF instructions, no further knowledge is needed to use it
> > > > correctly).
> > > > 
> > > > Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > > > Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
> > > > Signed-off-by: Brendan Jackman <jackmanb@google.com>
> > > > ---
> > > >    arch/x86/net/bpf_jit_comp.c                   | 10 +++++++
> > > >    .../selftests/bpf/verifier/atomic_cmpxchg.c   | 25 ++++++++++++++++++
> > > >    .../selftests/bpf/verifier/atomic_or.c        | 26 +++++++++++++++++++
> > > >    3 files changed, 61 insertions(+)
> > > > 
> > > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > > > index 79e7a0ec1da5..7919d5c54164 100644
> > > > --- a/arch/x86/net/bpf_jit_comp.c
> > > > +++ b/arch/x86/net/bpf_jit_comp.c
> > > > @@ -834,6 +834,16 @@ static int emit_atomic(u8 **pprog, u8 atomic_op,
> > > >    
> > > >          emit_insn_suffix(&prog, dst_reg, src_reg, off);
> > > >    
> > > > +       if (atomic_op == BPF_CMPXCHG && bpf_size == BPF_W) {
> > > > +               /*
> > > > +                * BPF_CMPXCHG unconditionally loads into R0, which means it
> > > > +                * zero-extends 32-bit values. However x86 CMPXCHG doesn't do a
> > > > +                * load if the comparison is successful. Therefore zero-extend
> > > > +                * explicitly.
> > > > +                */
> > > > +               emit_mov_reg(&prog, false, BPF_REG_0, BPF_REG_0);
> > > 
> > > How does the situation look on other archs when they need to
> > > implement this in future? Mainly asking whether it would be better
> > > to move this logic into the verifier instead, so it'll be consistent
> > > across all archs.
> > 
> > I have exactly the same check in my s390 wip patch.
> > So having a common solution would be great.
> 
> We do rewrites for various cases like div/mod handling, perhaps it
> would be best to emit an explicit
> BPF_MOV32_REG(insn->dst_reg, insn->dst_reg) there, see
> fixup_bpf_calls().

How about BPF_ZEXT_REG? Then arches that don't need this (I think
aarch64's instruction always zero-extends) can detect this using
insn_is_zext() and skip such insns.
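
For illustration, here is a minimal userspace sketch of that idea. It is
not kernel code: the struct and opcode constants are local copies meant
to mirror the uapi headers, zext_reg() and insn_is_zext() are modelled on
BPF_ZEXT_REG() and insn_is_zext() in include/linux/filter.h, and
jit_should_skip() together with its cmpxchg_zero_extends flag is a
hypothetical helper standing in for whatever per-arch check a JIT would
actually use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copy of the BPF instruction layout (assumed to match uapi). */
struct bpf_insn {
	uint8_t code;		/* opcode */
	uint8_t dst_reg:4;	/* destination register */
	uint8_t src_reg:4;	/* source register */
	int16_t off;		/* signed offset */
	int32_t imm;		/* signed immediate */
};

/* Opcode pieces, copied here so the sketch builds outside the kernel. */
#define BPF_ALU		0x04	/* 32-bit ALU instruction class */
#define BPF_X		0x08	/* source operand is a register */
#define BPF_MOV		0xb0	/* move operation */
#define BPF_REG_0	0

/*
 * Modelled on BPF_ZEXT_REG(): a 32-bit mov of a register onto itself,
 * with imm == 1 marking it as a pure zero-extension.
 */
static struct bpf_insn zext_reg(uint8_t reg)
{
	struct bpf_insn insn = {
		.code = BPF_ALU | BPF_MOV | BPF_X,
		.dst_reg = reg,
		.src_reg = reg,
		.off = 0,
		.imm = 1,
	};

	return insn;
}

/* Modelled on insn_is_zext(): recognise the marker emitted above. */
static bool insn_is_zext(const struct bpf_insn *insn)
{
	return insn->code == (BPF_ALU | BPF_MOV | BPF_X) && insn->imm == 1;
}

/*
 * Hypothetical JIT-side check: an arch whose 32-bit cmpxchg already
 * zero-extends the result can drop the marker insn, while others emit
 * it as an ordinary 32-bit mov.
 */
static bool jit_should_skip(const struct bpf_insn *insn,
			    bool cmpxchg_zero_extends)
{
	return cmpxchg_zero_extends && insn_is_zext(insn);
}
```

With this shape, a plain 32-bit mov (imm == 0) is never skipped, so only
the insn inserted by the verifier rewrite is affected.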