* LSF/MM session: eBPF standardization
From: Christoph Hellwig @ 2022-05-03 14:04 UTC
To: ast, daniel; +Cc: bpf
Hi Alexei and Daniel,
if you have some time left I'd like to kick off a short discussion on
what to do about documenting and standardizing eBPF.
* Re: LSF/MM session: eBPF standardization
From: Christoph Hellwig @ 2022-05-10 8:16 UTC
To: ast, daniel; +Cc: bpf, Harris, James R
Thanks everyone who participated.
Here is my rough memory and action items from the meeting. As I
was on stage and did not take notes these might be a bit off and
may need correction.
The separate instruction set document wasn't known by everyone but
was seen as a good idea.
The content needs a little more work:
- document the version levels, based on the clang cpu levels
(I plan to do this ASAP)
- we need to decide what to do about the legacy BPF packet access
instructions. Alexei mentioned that the modern JIT doesn't
even use those internally any more.
- we need to document behavior for underflows / overflows and
other behavior not mentioned. The example in the session
was divide by zero behavior. Are there any notes on what
the consensus for a lot of this behavior is, or do we need
to reverse engineer it from the implementation? I'd happily
write the documentation, but I'd be really grateful for any
input into what needs to go into it
Discussion on where to host a definitive version of the document:
- I think the rough consensus is to just host regular (hopefully
low cadence) documents and maybe the latest greatest at an eBPF
foundation website. Whom do we need to work with at the foundation
to make this happen?
- On the technical side we need to figure out a way to build a
standalone document from the kerneldoc tree of documents. I
volunteer to look into that as well.
The verifier is not very well documented, and mixes up generic behavior
with that of specific implementations and program types.
- As an idea, it was brought up to write a document with the minimal
verification requirements for any eBPF implementation,
independent of the program type. Again I can volunteer to
draft the documentation, but I need input on what such a consensus
would be. In this case input from the non-Linux verifier
implementors (I only know the Microsoft research one) would
be very helpful as well.
* Re: LSF/MM session: eBPF standardization
From: Alexei Starovoitov @ 2022-05-12 2:39 UTC
To: Christoph Hellwig
Cc: Alexei Starovoitov, Daniel Borkmann, bpf, Harris, James R,
Andrii Nakryiko, Dave Thaler, KP Singh, Brendan Gregg,
Joe Stringer
On Tue, May 10, 2022 at 1:17 AM Christoph Hellwig <hch@lst.de> wrote:
>
> Thanks everyone who participated.
>
> Here is my rough memory and action items from the meeting. As I
> was on stage and did not take notes these might be a bit off and
> may need correction.
>
> The separate instruction set document wasn't known by everyone but
> was seen as a good idea.
>
> The content needs a little more work:
>
> - document the version levels, based on the clang cpu levels
> (I plan to do this ASAP)
Turns out that clang -mcpu=v1,v2,v3 are not exactly correct.
We've extended ISA more than three times.
For example when we added more atomics insns in
https://lore.kernel.org/bpf/20210114181751.768687-1-jackmanb@google.com/
The corresponding llvm diff didn't add a new -mcpu flavor.
There was no need to do it.
Also llvm flags can turn a subset of insns on and off.
Like llvm can turn on alu32, but without <,<= insns.
-mcpu=v3 includes both. -mcpu=v2 are only <,<=.
So we need a plan B.
How about using the commit sha where support was added to the verifier
as a 'version' of the ISA ?
We can try to use a kernel version, but backports
will ruin that picture.
Looks like upstream 'commit sha' is the only stable number.
Another approach would be to declare the current ISA as
1.0 (or bpf-isa-may-2022) and
in the future bump it with every new insn.
> - we need to decide what to do about the legacy BPF packet access
> instructions. Alexei mentioned that the modern JIT doesn't
> even use those internally any more.
I think we need to document them as supported in the linux kernel,
but deprecated in general.
The standard might say "implementation defined" meaning that
different run-times don't have to support them.
> - we need to document behavior for underflows / overflows and
> other behavior not mentioned. The example in the session
> was divide by zero behavior. Are there any notes on what
> the consensus for a lot of this behavior is, or do we need
> to reverse engineer it from the implementation? I'd happily
> write the documentation, but I'd be really grateful for any
> input into what needs to go into it
For div by zero see do_misc_fixups().
We patch it with: 'if src_reg == 0 -> xor dst_reg, dst_reg'.
Interpreter and JITs will execute div/0 as-is, but in practice
it cannot happen because the verifier patched it.
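To make the end result concrete, here is a small C model of what a 64-bit
BPF_DIV observes after that patching. It is only a sketch of the semantics
described above, not the code in do_misc_fixups(); the helper name is made up
for illustration, and it assumes BPF_DIV is an unsigned division.

```c
#include <stdint.h>

/* Hypothetical helper (not a kernel function): models the observable result
 * of a 64-bit BPF_DIV once the verifier has patched the instruction. */
static uint64_t bpf_div64_model(uint64_t dst, uint64_t src)
{
	if (src == 0)
		return 0;	/* patched path: 'xor dst_reg, dst_reg' */
	return dst / src;	/* assumption: unsigned division */
}
```

With this model, bpf_div64_model(10, 0) yields 0 instead of trapping.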
Other undefined/underflows/overflows are implementation defined.
Meaning after JIT-ing they may behave differently on
different architectures.
For example the interpreter for shifts does
DST = DST OP (SRC & 63);
where OP is <<, >>
to avoid undefined behavior in C.
The JITs won't be adding the masking insns, since CPU HW will
do a mask implicitly. Which could potentially change
from one CPU to another.
I don't think it's worth documenting all that.
I would group all undefined/underflow/overflow as implementation
defined and document only things that matter.
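As an illustration of the shift case above, a C model of the interpreter's
64-bit shifts could look like the sketch below. The function names are
hypothetical; only the 'SRC & 63' masking is taken from the description
above. A JIT that relies on the hardware's own masking may give a different
result for shift counts of 64 or more on some CPUs, which is the
implementation defined part.

```c
#include <stdint.h>

/* Hypothetical models of the interpreter's 64-bit shifts: the count is
 * masked to 0..63 so the C expression never shifts by the operand width
 * or more, which would be undefined behavior in C. */
static uint64_t bpf_lsh64_model(uint64_t dst, uint64_t src)
{
	return dst << (src & 63);
}

static uint64_t bpf_rsh64_model(uint64_t dst, uint64_t src)
{
	return dst >> (src & 63);	/* logical shift on an unsigned value */
}
```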
>
> Discussion on where to host a definitive version of the document:
>
> - I think the rough consensus is to just host regular (hopefully
> low cadence) documents and maybe the latest greatest at an eBPF
> foundation website. Whom do we need to work with at the foundation
> to make this happen?
foundation folks cc-ed.
> - On the technical side we need to figure out a way to build a
> standalone document from the kerneldoc tree of documents. I
> volunteer to look into that as well.
+1
> The verifier is not very well documented, and mixes up generic behavior
> with that of specific implementations and program types.
>
> - As an idea, it was brought up to write a document with the minimal
> verification requirements for any eBPF implementation,
> independent of the program type. Again I can volunteer to
> draft the documentation, but I need input on what such a consensus
> would be. In this case input from the non-Linux verifier
> implementors (I only know the Microsoft research one) would
> be very helpful as well.
The verifier is a moving target.
I'd say minimal verification is the one that checks that:
- instructions are formed correctly
- opcode is valid
- no reserved bits are used
- registers are within range (r11+ are not used)
- combination of opcode+regs+off+imm is valid
- simple things like that
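A rough sketch of what such checks could look like for the 64-bit ALU class
is below. This is not the kernel verifier; the struct, the function name and
the exact rules are assumptions made for illustration (in particular that
fields an instruction does not use must be zero, and that the valid
operations are the ones between BPF_ADD and BPF_ARSH).

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative instruction layout: 8-bit opcode, two 4-bit register
 * numbers (stored unpacked here), 16-bit offset, 32-bit immediate. */
struct insn {
	uint8_t opcode;
	uint8_t dst_reg;
	uint8_t src_reg;
	int16_t off;
	int32_t imm;
};

/* Hypothetical minimal well-formedness check for BPF_ALU64 instructions. */
static bool alu64_insn_well_formed(const struct insn *i)
{
	uint8_t op = i->opcode & 0xf0;

	if ((i->opcode & 0x07) != 0x07)		/* not BPF_ALU64 */
		return false;
	if (op > 0xc0)				/* beyond BPF_ARSH: invalid opcode */
		return false;
	if (i->dst_reg > 10 || i->src_reg > 10)	/* r11+ are not used */
		return false;
	if (i->off != 0)			/* assumption: unused field is zero */
		return false;
	if (i->opcode & 0x08)			/* BPF_X: register operand */
		return op != 0x80 && i->imm == 0;	/* no BPF_X form of NEG */
	return i->src_reg == 0;			/* BPF_K: src_reg is reserved */
}
```

A full checker would need one such rule set per instruction class, which is
where a machine readable opcode table would help.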
* Re: LSF/MM session: eBPF standardization
From: Christoph Hellwig @ 2022-05-17 9:10 UTC
To: Alexei Starovoitov
Cc: Christoph Hellwig, Alexei Starovoitov, Daniel Borkmann, bpf,
Harris, James R, Andrii Nakryiko, Dave Thaler, KP Singh,
Brendan Gregg, Joe Stringer
On Wed, May 11, 2022 at 07:39:34PM -0700, Alexei Starovoitov wrote:
> Turns out that clang -mcpu=v1,v2,v3 are not exactly correct.
> We've extended ISA more than three times.
> For example when we added more atomics insns in
> https://lore.kernel.org/bpf/20210114181751.768687-1-jackmanb@google.com/
>
> The corresponding llvm diff didn't add a new -mcpu flavor.
> There was no need to do it.
.. because all uses of these new instructions are through builtins
that wouldn't otherwise be available, yes.
> Also llvm flags can turn a subset of insns on and off.
> Like llvm can turn on alu32, but without <,<= insns.
> -mcpu=v3 includes both. -mcpu=v2 are only <,<=.
>
> So we need a plan B.
Yes.
> How about using the commit sha where support was added to the verifier
> as a 'version' of the ISA ?
>
> We can try to use a kernel version, but backports
> will ruin that picture.
> Looks like upstream 'commit sha' is the only stable number.
Using kernel release hashes is a pretty horrible interface, especially
for non-kernel users. I also think the compilers and other tools
would really like some vaguely meaningful identifiers.
> Another approach would be to declare the current ISA as
> 1.0 (or bpf-isa-may-2022) and
> in the future bump it with every new insn.
I think that is a much more reasonable starting position. However, are
we sure the ISA actually evolves linearly? As far as I can tell support
for full atomics only exists for a few JITs so far.
So maybe starting with a baseline, and then just having a name for
each meaningful extension (e.g. the full atomics as a start) might be
even better. For the Linux kernel case we can then also have a user
interface where userspace programs can query which extensions are
supported before loading eBPF programs that rely on them instead of
doing a roundtrip through the verifier.
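To sketch what I mean: the baseline would be implied, and each named
extension would be one bit that a loader can test up front. Everything below
is hypothetical, including the extension names; no such interface exists
today.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical extension identifiers, invented for this example. */
enum isa_ext {
	ISA_EXT_ATOMICS_FULL  = 1u << 0,	/* fetch/xchg/cmpxchg atomics */
	ISA_EXT_LEGACY_PACKET = 1u << 1,	/* legacy BPF packet access */
};

/* A program can be loaded if every extension it relies on is implemented.
 * The baseline ISA is always present and needs no bit. */
static bool isa_exts_satisfied(uint32_t required, uint32_t supported)
{
	return (required & ~supported) == 0;
}
```

A loader would compare the set a program needs against the set the runtime
reports, without a trip through the verifier.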
> > - we need to decide what to do about the legacy BPF packet access
> > instructions. Alexei mentioned that the modern JIT doesn't
> > even use those internally any more.
>
> I think we need to document them as supported in the linux kernel,
> but deprecated in general.
> The standard might say "implementation defined" meaning that
> different run-times don't have to support them.
Yeah. If we do the extensions proposal above we could make these
a specific extension as well.
> [...]
> I don't think it's worth documenting all that.
> I would group all undefined/underflow/overflow as implementation
> defined and document only things that matter.
Makes sense.
> > Discussion on where to host a definitive version of the document:
> >
> > - I think the rough consensus is to just host regular (hopefully
> > low cadence) documents and maybe the latest greatest at an eBPF
> > foundation website. Whom do we need to work with at the foundation
> > to make this happen?
>
> foundation folks cc-ed.
I'd be really glad if we could kick this off ASAP. Feel free to contact
me privately if we want to keep it off the list.
> > - As an idea, it was brought up to write a document with the minimal
> > verification requirements for any eBPF implementation,
> > independent of the program type. Again I can volunteer to
> > draft the documentation, but I need input on what such a consensus
> > would be. In this case input from the non-Linux verifier
> > implementors (I only know the Microsoft research one) would
> > be very helpful as well.
>
> The verifier is a moving target.
Absolutely.
> I'd say minimal verification is the one that checks that:
> - instructions are formed correctly
> - opcode is valid
> - no reserved bits are used
> - registers are within range (r11+ are not used)
> - combination of opcode+regs+off+imm is valid
> - simple things like that
Sounds good. One useful thing for this would be an opcode table
with all the optional field usage in machine readable format.
Jim, who is on CC, has already built a nice table of all opcodes based
on existing material that might be a good starting point.
* Re: LSF/MM session: eBPF standardization
From: Harris, James R @ 2022-05-17 22:29 UTC
To: Christoph Hellwig
Cc: Alexei Starovoitov, Alexei Starovoitov, Daniel Borkmann, bpf,
Andrii Nakryiko, Dave Thaler, KP Singh, Brendan Gregg,
Joe Stringer
> On May 17, 2022, at 2:10 AM, Christoph Hellwig <hch@lst.de> wrote:
> Sounds good. One useful thing for this would be an opcode table
> with all the optional field usage in machine readable format.
>
> Jim, who is on CC, has already built a nice table of all opcodes based
> on existing material that might be a good starting point.
The table is inline below. I used the tables in the iovisor project
documentation as a starting point:
https://github.com/iovisor/bpf-docs/blob/master/eBPF.md
Feedback welcome. The atomic sections at the bottom could especially use some
careful review for correctness.
BPF_ALU64 (opc & 0x07 == 0x07)
==============================
0x07  BPF_ALU64 | BPF_K | BPF_ADD   bpf_add dst, imm   dst += imm
0x0f  BPF_ALU64 | BPF_X | BPF_ADD   bpf_add dst, src   dst += src
0x17  BPF_ALU64 | BPF_K | BPF_SUB   bpf_sub dst, imm   dst -= imm
0x1f  BPF_ALU64 | BPF_X | BPF_SUB   bpf_sub dst, src   dst -= src
0x27  BPF_ALU64 | BPF_K | BPF_MUL   bpf_mul dst, imm   dst *= imm
0x2f  BPF_ALU64 | BPF_X | BPF_MUL   bpf_mul dst, src   dst *= src
0x37  BPF_ALU64 | BPF_K | BPF_DIV   bpf_div dst, imm   dst /= imm
0x3f  BPF_ALU64 | BPF_X | BPF_DIV   bpf_div dst, src   dst /= src
0x47  BPF_ALU64 | BPF_K | BPF_OR    bpf_or dst, imm    dst |= imm
0x4f  BPF_ALU64 | BPF_X | BPF_OR    bpf_or dst, src    dst |= src
0x57  BPF_ALU64 | BPF_K | BPF_AND   bpf_and dst, imm   dst &= imm
0x5f  BPF_ALU64 | BPF_X | BPF_AND   bpf_and dst, src   dst &= src
0x67  BPF_ALU64 | BPF_K | BPF_LSH   bpf_lsh dst, imm   dst <<= imm
0x6f  BPF_ALU64 | BPF_X | BPF_LSH   bpf_lsh dst, src   dst <<= src
0x77  BPF_ALU64 | BPF_K | BPF_RSH   bpf_rsh dst, imm   dst >>= imm (logical)
0x7f  BPF_ALU64 | BPF_X | BPF_RSH   bpf_rsh dst, src   dst >>= src (logical)
0x87  BPF_ALU64 | BPF_K | BPF_NEG   bpf_neg dst        dst = -dst
0x97  BPF_ALU64 | BPF_K | BPF_MOD   bpf_mod dst, imm   dst %= imm
0x9f  BPF_ALU64 | BPF_X | BPF_MOD   bpf_mod dst, src   dst %= src
0xa7  BPF_ALU64 | BPF_K | BPF_XOR   bpf_xor dst, imm   dst ^= imm
0xaf  BPF_ALU64 | BPF_X | BPF_XOR   bpf_xor dst, src   dst ^= src
0xb7  BPF_ALU64 | BPF_K | BPF_MOV   bpf_mov dst, imm   dst = imm
0xbf  BPF_ALU64 | BPF_X | BPF_MOV   bpf_mov dst, src   dst = src
0xc7  BPF_ALU64 | BPF_K | BPF_ARSH  bpf_arsh dst, imm  dst >>= imm (arithmetic)
0xcf  BPF_ALU64 | BPF_X | BPF_ARSH  bpf_arsh dst, src  dst >>= src (arithmetic)
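Each opcode in the table above is the bitwise OR of three fields: the
instruction class in the low three bits, the source bit (BPF_K or BPF_X), and
the operation in the high four bits. A minimal C sketch of that decoding
follows; the macro and function names are made up for this example and are
not the kernel's.

```c
#include <stdint.h>

/* Hypothetical field extractors for ALU/JMP-class opcodes. */
#define OPC_CLASS(opc)	((opc) & 0x07)	/* 0x07 = BPF_ALU64, 0x04 = BPF_ALU */
#define OPC_SRC(opc)	((opc) & 0x08)	/* 0x00 = BPF_K (imm), 0x08 = BPF_X (src) */
#define OPC_OP(opc)	((opc) & 0xf0)	/* 0x00 = BPF_ADD ... 0xc0 = BPF_ARSH */

/* Rebuild an opcode from its three fields. */
static uint8_t opc_make(uint8_t cls, uint8_t src, uint8_t op)
{
	return (uint8_t)(cls | src | op);
}
```

For example, 0xbf splits into class 0x07 (BPF_ALU64), source 0x08 (BPF_X)
and operation 0xb0 (BPF_MOV).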
BPF_ALU (opc & 0x07 == 0x04)
============================
0x04  BPF_ALU | BPF_K | BPF_ADD   bpf_add32 dst, imm   dst32 += imm
0x0c  BPF_ALU | BPF_X | BPF_ADD   bpf_add32 dst, src   dst32 += src32
0x14  BPF_ALU | BPF_K | BPF_SUB   bpf_sub32 dst, imm   dst32 -= imm
0x1c  BPF_ALU | BPF_X | BPF_SUB   bpf_sub32 dst, src   dst32 -= src32
0x24  BPF_ALU | BPF_K | BPF_MUL   bpf_mul32 dst, imm   dst32 *= imm
0x2c  BPF_ALU | BPF_X | BPF_MUL   bpf_mul32 dst, src   dst32 *= src32
0x34  BPF_ALU | BPF_K | BPF_DIV   bpf_div32 dst, imm   dst32 /= imm
0x3c  BPF_ALU | BPF_X | BPF_DIV   bpf_div32 dst, src   dst32 /= src32
0x44  BPF_ALU | BPF_K | BPF_OR    bpf_or32 dst, imm    dst32 |= imm
0x4c  BPF_ALU | BPF_X | BPF_OR    bpf_or32 dst, src    dst32 |= src32
0x54  BPF_ALU | BPF_K | BPF_AND   bpf_and32 dst, imm   dst32 &= imm
0x5c  BPF_ALU | BPF_X | BPF_AND   bpf_and32 dst, src   dst32 &= src32
0x64  BPF_ALU | BPF_K | BPF_LSH   bpf_lsh32 dst, imm   dst32 <<= imm
0x6c  BPF_ALU | BPF_X | BPF_LSH   bpf_lsh32 dst, src   dst32 <<= src32
0x74  BPF_ALU | BPF_K | BPF_RSH   bpf_rsh32 dst, imm   dst32 >>= imm (logical)
0x7c  BPF_ALU | BPF_X | BPF_RSH   bpf_rsh32 dst, src   dst32 >>= src32 (logical)
0x84  BPF_ALU | BPF_K | BPF_NEG   bpf_neg32 dst        dst32 = -dst32
0x94  BPF_ALU | BPF_K | BPF_MOD   bpf_mod32 dst, imm   dst32 %= imm
0x9c  BPF_ALU | BPF_X | BPF_MOD   bpf_mod32 dst, src   dst32 %= src32
0xa4  BPF_ALU | BPF_K | BPF_XOR   bpf_xor32 dst, imm   dst32 ^= imm
0xac  BPF_ALU | BPF_X | BPF_XOR   bpf_xor32 dst, src   dst32 ^= src32
0xb4  BPF_ALU | BPF_K | BPF_MOV   bpf_mov32 dst, imm   dst32 = imm
0xbc  BPF_ALU | BPF_X | BPF_MOV   bpf_mov32 dst, src   dst32 = src32
0xc4  BPF_ALU | BPF_K | BPF_ARSH  bpf_arsh32 dst, imm  dst32 >>= imm (arithmetic)
0xcc  BPF_ALU | BPF_X | BPF_ARSH  bpf_arsh32 dst, src  dst32 >>= src32 (arithmetic)
For BPF_END instructions, BPF_K == htole conversion, BPF_X == htobe conversion
Operation size (16, 32, 64 bit) determined by 'imm' value of instruction (upper 32 bits)
0xd4 (.imm = 16)  BPF_ALU | BPF_K | BPF_END  bpf_le16 dst  dst16 = htole16(dst16)
0xd4 (.imm = 32)  BPF_ALU | BPF_K | BPF_END  bpf_le32 dst  dst32 = htole32(dst32)
0xd4 (.imm = 64)  BPF_ALU | BPF_K | BPF_END  bpf_le64 dst  dst = htole64(dst)
0xdc (.imm = 16)  BPF_ALU | BPF_X | BPF_END  bpf_be16 dst  dst16 = htobe16(dst16)
0xdc (.imm = 32)  BPF_ALU | BPF_X | BPF_END  bpf_be32 dst  dst32 = htobe32(dst32)
0xdc (.imm = 64)  BPF_ALU | BPF_X | BPF_END  bpf_be64 dst  dst = htobe64(dst)
BPF_JMP (opc & 0x07 == 0x05)
============================
0x05  BPF_JMP | BPF_K | BPF_JA    bpf_ja +off              PC += off
0x15  BPF_JMP | BPF_K | BPF_JEQ   bpf_jeq dst, imm, +off   PC += off if dst == imm
0x1d  BPF_JMP | BPF_X | BPF_JEQ   bpf_jeq dst, src, +off   PC += off if dst == src
0x25  BPF_JMP | BPF_K | BPF_JGT   bpf_jgt dst, imm, +off   PC += off if dst > imm
0x2d  BPF_JMP | BPF_X | BPF_JGT   bpf_jgt dst, src, +off   PC += off if dst > src
0x35  BPF_JMP | BPF_K | BPF_JGE   bpf_jge dst, imm, +off   PC += off if dst >= imm
0x3d  BPF_JMP | BPF_X | BPF_JGE   bpf_jge dst, src, +off   PC += off if dst >= src
0x45  BPF_JMP | BPF_K | BPF_JSET  bpf_jset dst, imm, +off  PC += off if dst & imm
0x4d  BPF_JMP | BPF_X | BPF_JSET  bpf_jset dst, src, +off  PC += off if dst & src
0x55  BPF_JMP | BPF_K | BPF_JNE   bpf_jne dst, imm, +off   PC += off if dst != imm
0x5d  BPF_JMP | BPF_X | BPF_JNE   bpf_jne dst, src, +off   PC += off if dst != src
0x65  BPF_JMP | BPF_K | BPF_JSGT  bpf_jsgt dst, imm, +off  PC += off if dst > imm (signed)
0x6d  BPF_JMP | BPF_X | BPF_JSGT  bpf_jsgt dst, src, +off  PC += off if dst > src (signed)
0x75  BPF_JMP | BPF_K | BPF_JSGE  bpf_jsge dst, imm, +off  PC += off if dst >= imm (signed)
0x7d  BPF_JMP | BPF_X | BPF_JSGE  bpf_jsge dst, src, +off  PC += off if dst >= src (signed)
0x85  BPF_JMP | BPF_K | BPF_CALL  bpf_call imm             Function call
0x95  BPF_JMP | BPF_K | BPF_EXIT  bpf_exit                 return r0
0xa5  BPF_JMP | BPF_K | BPF_JLT   bpf_jlt dst, imm, +off   PC += off if dst < imm
0xad  BPF_JMP | BPF_X | BPF_JLT   bpf_jlt dst, src, +off   PC += off if dst < src
0xb5  BPF_JMP | BPF_K | BPF_JLE   bpf_jle dst, imm, +off   PC += off if dst <= imm
0xbd  BPF_JMP | BPF_X | BPF_JLE   bpf_jle dst, src, +off   PC += off if dst <= src
0xc5  BPF_JMP | BPF_K | BPF_JSLT  bpf_jslt dst, imm, +off  PC += off if dst < imm (signed)
0xcd  BPF_JMP | BPF_X | BPF_JSLT  bpf_jslt dst, src, +off  PC += off if dst < src (signed)
0xd5  BPF_JMP | BPF_K | BPF_JSLE  bpf_jsle dst, imm, +off  PC += off if dst <= imm (signed)
0xdd  BPF_JMP | BPF_X | BPF_JSLE  bpf_jsle dst, src, +off  PC += off if dst <= src (signed)
BPF_JMP32 (opc & 0x07 == 0x06)
==============================
0x16  BPF_JMP32 | BPF_K | BPF_JEQ   bpf_jeq32 dst, imm, +off   PC += off if dst32 == imm
0x1e  BPF_JMP32 | BPF_X | BPF_JEQ   bpf_jeq32 dst, src, +off   PC += off if dst32 == src32
0x26  BPF_JMP32 | BPF_K | BPF_JGT   bpf_jgt32 dst, imm, +off   PC += off if dst32 > imm
0x2e  BPF_JMP32 | BPF_X | BPF_JGT   bpf_jgt32 dst, src, +off   PC += off if dst32 > src32
0x36  BPF_JMP32 | BPF_K | BPF_JGE   bpf_jge32 dst, imm, +off   PC += off if dst32 >= imm
0x3e  BPF_JMP32 | BPF_X | BPF_JGE   bpf_jge32 dst, src, +off   PC += off if dst32 >= src32
0x46  BPF_JMP32 | BPF_K | BPF_JSET  bpf_jset32 dst, imm, +off  PC += off if dst32 & imm
0x4e  BPF_JMP32 | BPF_X | BPF_JSET  bpf_jset32 dst, src, +off  PC += off if dst32 & src32
0x56  BPF_JMP32 | BPF_K | BPF_JNE   bpf_jne32 dst, imm, +off   PC += off if dst32 != imm
0x5e  BPF_JMP32 | BPF_X | BPF_JNE   bpf_jne32 dst, src, +off   PC += off if dst32 != src32
0x66  BPF_JMP32 | BPF_K | BPF_JSGT  bpf_jsgt32 dst, imm, +off  PC += off if dst32 > imm (signed)
0x6e  BPF_JMP32 | BPF_X | BPF_JSGT  bpf_jsgt32 dst, src, +off  PC += off if dst32 > src32 (signed)
0x76  BPF_JMP32 | BPF_K | BPF_JSGE  bpf_jsge32 dst, imm, +off  PC += off if dst32 >= imm (signed)
0x7e  BPF_JMP32 | BPF_X | BPF_JSGE  bpf_jsge32 dst, src, +off  PC += off if dst32 >= src32 (signed)
0xa6  BPF_JMP32 | BPF_K | BPF_JLT   bpf_jlt32 dst, imm, +off   PC += off if dst32 < imm
0xae  BPF_JMP32 | BPF_X | BPF_JLT   bpf_jlt32 dst, src, +off   PC += off if dst32 < src32
0xb6  BPF_JMP32 | BPF_K | BPF_JLE   bpf_jle32 dst, imm, +off   PC += off if dst32 <= imm
0xbe  BPF_JMP32 | BPF_X | BPF_JLE   bpf_jle32 dst, src, +off   PC += off if dst32 <= src32
0xc6  BPF_JMP32 | BPF_K | BPF_JSLT  bpf_jslt32 dst, imm, +off  PC += off if dst32 < imm (signed)
0xce  BPF_JMP32 | BPF_X | BPF_JSLT  bpf_jslt32 dst, src, +off  PC += off if dst32 < src32 (signed)
0xd6  BPF_JMP32 | BPF_K | BPF_JSLE  bpf_jsle32 dst, imm, +off  PC += off if dst32 <= imm (signed)
0xde  BPF_JMP32 | BPF_X | BPF_JSLE  bpf_jsle32 dst, src, +off  PC += off if dst32 <= src32 (signed)
BPF_LD (opc & 0x07 == 0x00)
===========================
0x18  BPF_LD | BPF_DW | BPF_IMM  bpf_lddw dst, imm64  dst = imm64
Note: imm64 spans two 8-byte slots: the low 32 bits are the imm field of this
instruction, the high 32 bits are the imm field of the 8 bytes following it
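As a sketch of how a decoder would assemble that value, as far as I
understand the encoding: the low half comes from the imm field of the 0x18
instruction itself and the high half from the imm field of the 8-byte slot
that follows it. The function name is invented for this example.

```c
#include <stdint.h>

/* Hypothetical helper: combine the two 32-bit imm fields of a bpf_lddw.
 * imm_lo: imm of the 0x18 instruction, imm_hi: imm of the following slot.
 * The casts through uint32_t avoid sign extension of the low half. */
static uint64_t lddw_imm64(int32_t imm_lo, int32_t imm_hi)
{
	return ((uint64_t)(uint32_t)imm_hi << 32) | (uint32_t)imm_lo;
}
```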
BPF_LDX (opc & 0x07 == 0x01)
============================
0x61  BPF_LDX | BPF_W | BPF_MEM   bpf_ldxw dst, [src + off]   dst32 = *(u32 *)(src + off)
0x69  BPF_LDX | BPF_H | BPF_MEM   bpf_ldxh dst, [src + off]   dst16 = *(u16 *)(src + off)
0x71  BPF_LDX | BPF_B | BPF_MEM   bpf_ldxb dst, [src + off]   dst8 = *(u8 *)(src + off)
0x79  BPF_LDX | BPF_DW | BPF_MEM  bpf_ldxdw dst, [src + off]  dst = *(u64 *)(src + off)
BPF_ST (opc & 0x07 == 0x02)
===========================
0x62  BPF_ST | BPF_W | BPF_MEM   bpf_stw [dst + off], imm   *(u32 *)(dst + off) = imm
0x6a  BPF_ST | BPF_H | BPF_MEM   bpf_sth [dst + off], imm   *(u16 *)(dst + off) = imm
0x72  BPF_ST | BPF_B | BPF_MEM   bpf_stb [dst + off], imm   *(u8 *)(dst + off) = imm
0x7a  BPF_ST | BPF_DW | BPF_MEM  bpf_stdw [dst + off], imm  *(u64 *)(dst + off) = imm
BPF_STX (opc & 0x07 == 0x03)
============================
0x63  BPF_STX | BPF_W | BPF_MEM      bpf_stxw [dst + off], src   *(u32 *)(dst + off) = src32
0x6b  BPF_STX | BPF_H | BPF_MEM      bpf_stxh [dst + off], src   *(u16 *)(dst + off) = src16
0x73  BPF_STX | BPF_B | BPF_MEM      bpf_stxb [dst + off], src   *(u8 *)(dst + off) = src8
0x7b  BPF_STX | BPF_DW | BPF_MEM     bpf_stxdw [dst + off], src  *(u64 *)(dst + off) = src
0xdb  BPF_STX | BPF_DW | BPF_ATOMIC  64-bit atomic instructions (see below)
0xc3  BPF_STX | BPF_W | BPF_ATOMIC   32-bit atomic instructions (see below)
Note: what mnemonics should the atomic instructions use? eBPF originally had
only the XADD atomic instruction, which could be represented as bpf_xadd. With
the addition of atomic AND, OR and XOR operations that pattern no longer works,
so for now the tables below just show pseudocode for each atomic operation.
64-bit atomic instructions (opc == 0xdb)
========================================
The following table applies to opc 0xdb (BPF_STX | BPF_DW | BPF_ATOMIC) which are 64-bit atomic operations.
The imm (immediate) value in column 1 specifies the type of atomic operation.
0x00  BPF_ADD                  lock *(u64 *)(dst + off) += src
0x01  BPF_ADD | BPF_FETCH      src = atomic_fetch_add((u64 *)(dst + off), src)
0x40  BPF_OR                   lock *(u64 *)(dst + off) |= src
0x41  BPF_OR | BPF_FETCH       src = atomic_fetch_or((u64 *)(dst + off), src)
0x50  BPF_AND                  lock *(u64 *)(dst + off) &= src
0x51  BPF_AND | BPF_FETCH      src = atomic_fetch_and((u64 *)(dst + off), src)
0xa0  BPF_XOR                  lock *(u64 *)(dst + off) ^= src
0xa1  BPF_XOR | BPF_FETCH      src = atomic_fetch_xor((u64 *)(dst + off), src)
0xe1  BPF_XCHG | BPF_FETCH     src = xchg((u64 *)(dst + off), src)
0xf1  BPF_CMPXCHG | BPF_FETCH  r0 = cmpxchg((u64 *)(dst + off), r0, src)
32-bit atomic instructions (opc == 0xc3)
========================================
The following table applies to opc 0xc3 (BPF_STX | BPF_W | BPF_ATOMIC) which are 32-bit atomic operations.
The imm (immediate) value in column 1 specifies the type of atomic operation.
0x00  BPF_ADD                  lock *(u32 *)(dst + off) += src32
0x01  BPF_ADD | BPF_FETCH      src32 = atomic_fetch_add((u32 *)(dst + off), src32)
0x40  BPF_OR                   lock *(u32 *)(dst + off) |= src32
0x41  BPF_OR | BPF_FETCH       src32 = atomic_fetch_or((u32 *)(dst + off), src32)
0x50  BPF_AND                  lock *(u32 *)(dst + off) &= src32
0x51  BPF_AND | BPF_FETCH      src32 = atomic_fetch_and((u32 *)(dst + off), src32)
0xa0  BPF_XOR                  lock *(u32 *)(dst + off) ^= src32
0xa1  BPF_XOR | BPF_FETCH      src32 = atomic_fetch_xor((u32 *)(dst + off), src32)
0xe1  BPF_XCHG | BPF_FETCH     src32 = xchg((u32 *)(dst + off), src32)
0xf1  BPF_CMPXCHG | BPF_FETCH  r0 = cmpxchg((u32 *)(dst + off), r0, src32)
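To help with reviewing the pseudocode in the atomic tables, here is how I
read two of the rows, written with C11 atomics. This is a sketch of the
intended semantics only, and it does not say anything about memory ordering;
the function names are made up for the example.

```c
#include <stdatomic.h>
#include <stdint.h>

/* imm 0x01 (BPF_ADD | BPF_FETCH): add src to memory and return the old
 * value, which the instruction writes back into src. */
static uint64_t model_fetch_add64(_Atomic uint64_t *addr, uint64_t src)
{
	return atomic_fetch_add(addr, src);
}

/* imm 0xf1 (BPF_CMPXCHG): store src if *addr == r0. The old value of
 * *addr ends up in r0 whether or not the store happened. */
static uint64_t model_cmpxchg64(_Atomic uint64_t *addr, uint64_t r0,
				uint64_t src)
{
	/* On failure the call copies the current value into r0; on success
	 * r0 already equals the old value. */
	atomic_compare_exchange_strong(addr, &r0, src);
	return r0;
}
```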