Re: [RFC bpf-next 0/5] Support for BPF_ST instruction in LLVM C compiler

From: Yonghong Song <yhs@meta.com>
To: "Jose E. Marchesi" <jose.marchesi@oracle.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Eduard Zingerman <eddyz87@gmail.com>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	bpf@vger.kernel.org, ast@kernel.org, andrii@kernel.org,
	daniel@iogearbox.net, kernel-team@fb.com, yhs@fb.com,
	david.faust@oracle.com,
	James Hilliard <james.hilliard1@gmail.com>
Subject: Re: [RFC bpf-next 0/5] Support for BPF_ST instruction in LLVM C compiler
Date: Fri, 13 Jan 2023 11:23:51 -0800	[thread overview]
Message-ID: <8c8cc92a-7437-728f-4f38-f278d9f35f07@meta.com> (raw)
In-Reply-To: <87a62mhl3m.fsf@oracle.com>

On 1/13/23 12:53 AM, Jose E. Marchesi wrote:
> 
>> On 1/12/23 2:27 PM, Alexei Starovoitov wrote:
>>> On Thu, Jan 05, 2023 at 02:07:05PM +0200, Eduard Zingerman wrote:
>>>> On Thu, 2023-01-05 at 11:06 +0100, Jose E. Marchesi wrote:
>>>>>> On Sat, Dec 31, 2022 at 8:31 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>>>>>>>
>>>>>>> BPF has two documented (non-atomic) memory store instructions:
>>>>>>>
>>>>>>> BPF_STX: *(size *) (dst_reg + off) = src_reg
>>>>>>> BPF_ST : *(size *) (dst_reg + off) = imm32
>>>>>>>
>>>>>>> Currently LLVM BPF back-end does not emit BPF_ST instruction and does
>>>>>>> not allow one to be specified as inline assembly.
>>>>>>>
>>>>>>> Recently I've been exploring ways to port some of the verifier test
>>>>>>> cases from tools/testing/selftests/bpf/verifier/*.c to use inline assembly
>>>>>>> and machinery provided in tools/testing/selftests/bpf/test_loader.c
>>>>>>> (which should hopefully simplify tests maintenance).
>>>>>>> The BPF_ST instruction is popular in these tests: used in 52 of 94 files.
>>>>>>>
>>>>>>> While it is possible to adjust LLVM to only support BPF_ST for inline
>>>>>>> assembly blocks it seems a bit wasteful. This patch-set contains a set
>>>>>>> of changes to verifier necessary in case when LLVM is allowed to
>>>>>>> freely emit BPF_ST instructions (source code is available here [1]).
>>>>>>
>>>>>> Would we gate LLVM's emitting of BPF_ST for C code behind some new
>>>>>> cpu=v4? What is the benefit for compiler to start automatically emit
>>>>>> such instructions? Such thinking about logistics, if there isn't much
>>>>>> benefit, as BPF application owner I wouldn't bother enabling this
>>>>>> behavior risking regressions on old kernels that don't have these
>>>>>> changes.
>>>>>
>>>>> Hmm, GCC happily generates BPF_ST instructions:
>>>>>
>>>>>     $ echo 'int v; void foo () {  v = 666; }' | bpf-unknown-none-gcc -O2 -xc -S -o foo.s -
>>>>>     $ cat foo.s
>>>>>           .file	"<stdin>"
>>>>>           .text
>>>>>           .align	3
>>>>>           .global	foo
>>>>>           .type	foo, @function
>>>>>     foo:
>>>>>           lddw	%r0,v
>>>>>           stw	[%r0+0],666
>>>>>           exit
>>>>>           .size	foo, .-foo
>>>>>           .global	v
>>>>>           .type	v, @object
>>>>>           .lcomm	v,4,4
>>>>>           .ident	"GCC: (GNU) 12.0.0 20211206 (experimental)"
>>>>>
>>>>> Been doing that since October 2019, I think before the cpu versioning
>>>>> mechanism was got in place?
>>>>>
>>>>> We weren't aware this was problematic.  Does the verifier reject such
>>>>> instructions?
>>>>
>>>> Interesting, do BPF selftests generated by GCC pass the same way they
>>>> do if generated by clang?
>>>>
>>>> I had to do the following changes to the verifier to make the
>>>> selftests pass when BPF_ST instruction is allowed for selection:
>>>>
>>>> - patch #1 in this patchset: track values of constants written to
>>>>     stack using BPF_ST. Currently these are tracked imprecisely, unlike
>>>>     the writes using BPF_STX, e.g.:
>>>>          fp[-8] = 42;   currently verifier assumes that
>>>> fp[-8]=mmmmmmmm
>>>>                      after such instruction, where m stands for "misc",
>>>>                      just a note that something is written at fp[-8].
>>>>                           r1 = 42;       verifier tracks r1=42 after
>>>> this instruction.
>>>>       fp[-8] = r1;   verifier tracks fp[-8]=42 after this instruction.
>>>>
>>>>     So the patch makes both cases equivalent.
>>>>     - patch #3 in this patchset: adjusts
>>>> verifier.c:convert_ctx_access()
>>>>     to operate on BPF_ST alongside BPF_STX.
>>>>        Context parameters for some BPF programs types are "fake"
>>>> data
>>>>     structures. The verifier matches all BPF_STX and BPF_LDX
>>>>     instructions that operate on pointers to such contexts and rewrites
>>>>     these instructions. It might change an offset or add another layer
>>>>     of indirection, etc. E.g. see filter.c:bpf_convert_ctx_access().
>>>>     (This also implies that verifier forbids writes to non-constant
>>>>      offsets inside such structures).
>>>>         So the patch extends this logic to also handle BPF_ST.
>>> The patch 3 is necessary to land before llvm starts generating 'st'
>>> for ctx access.
>>> That's clear, but I'm missing why patch 1 is necessary.
>>> Sure, it's making the verifier understand scalar spills with 'st' and
>>> makes 'st' equivalent to 'stx', but I'm missing why it's necessary.
>>> What kind of programs fail to be verified when llvm starts generating 'st' ?
>>> Regarind -mcpu=v4.
>>> I think we need to add all of our upcoming instructions as a single flag.
>>> Otherwise we'll have -mcpu=v5,v6,v7 and full combinations of them.
>>> -mcpu=v4 could mean:
>>> - ST
>>> - sign extending loads
>>> - sign extend a register
>>> - 32-bit JA
>>> - proper bswap insns: bswap16, bswap32, bswap64
>>> The sign and 32-bit JA we've discussed earlier.
>>> The bswap was on my wish list forever.
>>> The existing TO_LE, TO_BE insns are really odd from compiler pov.
>>> The compiler should translate bswap IR op into proper bswap insn
>>> just like it does on all cpus.
>>> Maybe add SDIV to -mcpu=v4 as well?
>>
>> Right, we should add these insns in llvm17 with -mcpu=v4, so we
>> can keep the number of cpu generations minimum.
> 
> How do you plan to encode the sign-extend load instructions?
> 
> I guess a possibility would be to use one of the available op-mode for
> load instructions that are currently marked as reserved.  For example:
> 
>     IMM  = 0b000
>     ABS  = 0b001
>     IND  = 0b010
>     MEM  = 0b011
>     SEM = 0b100  <- new
> 
> Then we would have the following code fields for sign-extending LDX
> instructions:
> 
>     op-mode:SEM op-size:{W,H,B,DW} op-class:LDX

Right, this is what I plan to do as well. See my incomplete llvm
prototype here:
   https://reviews.llvm.org/D133464