All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yonghong Song <yhs@meta.com>
To: "Ilya Leoshkevich" <iii@linux.ibm.com>,
	"Björn Töpel" <bjorn@kernel.org>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"John Fastabend" <john.fastabend@gmail.com>,
	bpf@vger.kernel.org, netdev@vger.kernel.org
Cc: "Björn Töpel" <bjorn@rivosinc.com>,
	"Brendan Jackman" <jackmanb@google.com>
Subject: Re: [PATCH bpf] bpf: Proper R0 zero-extension for BPF_CALL instructions
Date: Tue, 6 Dec 2022 09:47:21 -0800	[thread overview]
Message-ID: <d26622c6-d51e-e280-6c8a-38c893c49446@meta.com> (raw)
In-Reply-To: <3b77aa12a864ab2db081e99aec1bfad78e3b9b51.camel@linux.ibm.com>



On 12/6/22 5:21 AM, Ilya Leoshkevich wrote:
> On Fri, 2022-12-02 at 11:36 +0100, Björn Töpel wrote:
>> From: Björn Töpel <bjorn@rivosinc.com>
>>
>> A BPF call instruction can be, correctly, marked with zext_dst set to
>> true. An example of this can be found in the BPF selftests
>> progs/bpf_cubic.c:
>>
>>    ...
>>    extern __u32 tcp_reno_undo_cwnd(struct sock *sk) __ksym;
>>
>>    __u32 BPF_STRUCT_OPS(bpf_cubic_undo_cwnd, struct sock *sk)
>>    {
>>            return tcp_reno_undo_cwnd(sk);
>>    }
>>    ...
>>
>> which compiles to:
>>    0:  r1 = *(u64 *)(r1 + 0x0)
>>    1:  call -0x1
>>    2:  exit
>>
>> The call will be marked as zext_dst set to true, and for some
>> backends
>> (bpf_jit_needs_zext() returns true) expanded to:
>>    0:  r1 = *(u64 *)(r1 + 0x0)
>>    1:  call -0x1
>>    2:  w0 = w0
>>    3:  exit
> 
> In the verifier, the marking is done by check_kfunc_call() (added in
> e6ac2450d6de), right? So the problem occurs only for kfuncs?
> 
>          /* Check return type */
>          t = btf_type_skip_modifiers(desc_btf, func_proto->type, NULL);
> 
>          ...
> 
>          if (btf_type_is_scalar(t)) {
>                  mark_reg_unknown(env, regs, BPF_REG_0);
>                  mark_btf_func_reg_size(env, BPF_REG_0, t->size);
> 
> I tried to find some official information whether the eBPF calling
> convention requires sign- or zero- extending return values and
> arguments, but unfortunately [1] doesn't mention this.
> 
> LLVM's lib/Target/BPF/BPFCallingConv.td mentions both R* and W*
> registers, but since assigning to W* leads to zero-extension, it seems
> to me that this is the case.

We actually follow the clang convention, the zero-extension is either
done in caller or callee, but not both. See 
https://reviews.llvm.org/D131598 how the convention could be changed.

The following is an example.

$ cat t.c
extern unsigned foo(void);
unsigned bar1(void) {
     return foo();
}
unsigned bar2(void) {
     if (foo()) return 10; else return 20;
}
$ clang -target bpf -mcpu=v3 -O2 -c t.c && llvm-objdump -d t.o

t.o:    file format elf64-bpf

Disassembly of section .text:

0000000000000000 <bar1>:
        0:       85 10 00 00 ff ff ff ff call -0x1
        1:       95 00 00 00 00 00 00 00 exit

0000000000000010 <bar2>:
        2:       85 10 00 00 ff ff ff ff call -0x1
        3:       bc 01 00 00 00 00 00 00 w1 = w0
        4:       b4 00 00 00 14 00 00 00 w0 = 0x14
        5:       16 01 01 00 00 00 00 00 if w1 == 0x0 goto +0x1 <LBB1_2>
        6:       b4 00 00 00 0a 00 00 00 w0 = 0xa

0000000000000038 <LBB1_2>:
        7:       95 00 00 00 00 00 00 00 exit
$

If the return value of 'foo()' is actually used in the bpf program, the
proper zero extension will be done. Otherwise, it is not done.

This is with latest llvm16. I guess we need to check llvm whether
we could enforce to add a w0 = w0 in bar1().

Otherwise, with this patch, it will add w0 = w0 in all cases which
is not necessary in most of practical cases.

> 
> If the above is correct, then shouldn't we rather use sizeof(void *) in
> the mark_btf_func_reg_size() call above?
> 
>> The opt_subreg_zext_lo32_rnd_hi32() function which is responsible for
>> the zext patching, relies on insn_def_regno() to fetch the register
>> to
>> zero-extend. However, this function does not handle call instructions
>> correctly, and opt_subreg_zext_lo32_rnd_hi32() fails the
>> verification.
>>
>> Make sure that R0 is correctly resolved for (BPF_JMP | BPF_CALL)
>> instructions.
>>
>> Fixes: 83a2881903f3 ("bpf: Account for BPF_FETCH in
>> insn_has_def32()")
>> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
>> ---
>> I'm not super happy about the additional special case -- first
>> cmpxchg, and now call. :-( A more elegant/generic solution is
>> welcome!
>> ---
>>   kernel/bpf/verifier.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 264b3dc714cc..4f9660eafc72 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -13386,6 +13386,9 @@ static int
>> opt_subreg_zext_lo32_rnd_hi32(struct bpf_verifier_env *env,
>>                  if (!bpf_jit_needs_zext() && !is_cmpxchg_insn(&insn))
>>                          continue;
>>   
>> +               if (insn.code == (BPF_JMP | BPF_CALL))
>> +                       load_reg = BPF_REG_0;
>> +
>>                  if (WARN_ON(load_reg == -1)) {
>>                          verbose(env, "verifier bug. zext_dst is set,
>> but no reg is defined\n");
>>                          return -EFAULT;
>>
>> base-commit: 01f856ae6d0ca5ad0505b79bf2d22d7ca439b2a1
> 
> [1]
> https://docs.kernel.org/bpf/instruction-set.html#registers-and-calling-convention

  parent reply	other threads:[~2022-12-06 17:47 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-02 10:36 [PATCH bpf] bpf: Proper R0 zero-extension for BPF_CALL instructions Björn Töpel
2022-12-06 13:21 ` Ilya Leoshkevich
2022-12-06 13:49   ` Björn Töpel
2022-12-06 13:51     ` Ilya Leoshkevich
2022-12-06 17:47   ` Yonghong Song [this message]
2022-12-06 18:02     ` Björn Töpel
2022-12-06 18:09     ` Yonghong Song
2022-12-06 18:38       ` Björn Töpel
2022-12-07  1:01         ` Yonghong Song
2022-12-07  7:10           ` Björn Töpel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d26622c6-d51e-e280-6c8a-38c893c49446@meta.com \
    --to=yhs@meta.com \
    --cc=ast@kernel.org \
    --cc=bjorn@kernel.org \
    --cc=bjorn@rivosinc.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=iii@linux.ibm.com \
    --cc=jackmanb@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.