bpf.vger.kernel.org archive mirror
* LSF/MM session: eBPF standardization
@ 2022-05-03 14:04 Christoph Hellwig
  2022-05-10  8:16 ` Christoph Hellwig
  0 siblings, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2022-05-03 14:04 UTC (permalink / raw)
  To: ast, daniel; +Cc: bpf

Hi Alexei and Daniel,

if you have some time left I'd like to kick off a short discussion on
what to do about documenting and standardizing eBPF.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: LSF/MM session: eBPF standardization
  2022-05-03 14:04 LSF/MM session: eBPF standardization Christoph Hellwig
@ 2022-05-10  8:16 ` Christoph Hellwig
  2022-05-12  2:39   ` Alexei Starovoitov
  0 siblings, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2022-05-10  8:16 UTC (permalink / raw)
  To: ast, daniel; +Cc: bpf, Harris, James R

Thanks everyone who participated.

Here is my rough memory and action items from the meeting.  As I
was on stage and did not take notes, these might be a bit off and
may need correction.

The separate instruction set document wasn't known to everyone, but
was seen as a good idea.

The content needs a little more work:

 - document the version levels, based on the clang cpu levels
   (I plan to do this ASAP)
 - we need to decide what to do about the legacy BPF packet access
   instructions.  Alexei mentioned that the modern JIT doesn't
   even use those internally any more.
 - we need to document behavior for underflows / overflows and
   other behavior not mentioned.  The example in the session
   was divide by zero behavior.  Are there any notes on what
   the consensus for a lot of this behavior is, or do we need
   to reverse engineer it from the implementation?  I'd happily
   write the documentation, but I'd be really grateful for any
   input into what needs to go into it.

Discussion on where to host a definitive version of the document:

 - I think the rough consensus is to just host regular (hopefully
   low cadence) documents, and maybe the latest and greatest, at an eBPF
   foundation website.  Whom do we need to work with at the foundation
   to make this happen?
 - On the technical side we need to figure out how to build a
   standalone document from the kerneldoc tree of documents.  I
   volunteer to look into that as well.

The verifier is not very well documented, and mixes up generic behavior
with that of specific implementations and program types.

 - the idea was brought up to write a document with the minimal
   verification requirements for any eBPF implementation,
   independent of the program type.  Again I can volunteer to
   draft the documentation, but I need input on what such a consensus
   would be.  In this case input from the non-Linux verifier
   implementors (I only know the Microsoft research one) would
   be very helpful as well.



* Re: LSF/MM session: eBPF standardization
  2022-05-10  8:16 ` Christoph Hellwig
@ 2022-05-12  2:39   ` Alexei Starovoitov
  2022-05-17  9:10     ` Christoph Hellwig
  0 siblings, 1 reply; 5+ messages in thread
From: Alexei Starovoitov @ 2022-05-12  2:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, Harris, James R,
	Andrii Nakryiko, Dave Thaler, KP Singh, Brendan Gregg,
	Joe Stringer

On Tue, May 10, 2022 at 1:17 AM Christoph Hellwig <hch@lst.de> wrote:
>
> Thanks everyone who participated.
>
> Here is my rough memory and action items from the meeting.  As I
> was on stage and did not take notes, these might be a bit off and
> may need correction.
>
> The separate instruction set document wasn't known to everyone, but
> was seen as a good idea.
>
> The content needs a little more work:
>
>  - document the version levels, based on the clang cpu levels
>    (I plan to do this ASAP)

Turns out that clang -mcpu=v1,v2,v3 are not exactly correct.
We've extended the ISA more than three times.
For example when we added more atomics insns in
https://lore.kernel.org/bpf/20210114181751.768687-1-jackmanb@google.com/

The corresponding llvm diff didn't add a new -mcpu flavor.
There was no need to do it.

Also llvm flags can turn a subset of insns on and off.
Like llvm can turn on alu32, but without <,<= insns.
-mcpu=v3 includes both; -mcpu=v2 has only <, <=.

So we need a plan B.

How about using the commit sha where support was added to the verifier
as a 'version' of the ISA ?

We can try to use a kernel version, but backports
will ruin that picture.
Looks like upstream 'commit sha' is the only stable number.

Another approach would be to declare the current ISA as
1.0 (or bpf-isa-may-2022) and
in the future bump it with every new insn.

>  - we need to decide what to do about the legacy BPF packet access
>    instructions.  Alexei mentioned that the modern JIT doesn't
>    even use those internally any more.

I think we need to document them as supported in the linux kernel,
but deprecated in general.
The standard might say "implementation defined" meaning that
different run-times don't have to support them.

>  - we need to document behavior for underflows / overflows and
>    other behavior not mentioned.  The example in the session
>    was divide by zero behavior.  Are there any notes on what
>    the consensus for a lot of this behavior is, or do we need
>    to reverse engineer it from the implementation?  I'd happily
>    write the documentation, but I'd be really grateful for any
>    input into what needs to go into it

For div by zero see do_misc_fixups().
We patch it with: 'if src_reg == 0 -> xor dst_reg, dst_reg'.
The interpreter and JITs would execute div/0 as-is, but in practice
it cannot happen because the verifier has patched it.
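In C terms, the net effect of that patch can be sketched as follows (a sketch only; the helper names are illustrative, not kernel code, and the mod-by-zero behavior shown assumes the analogous patch for BPF_MOD):

```c
#include <stdint.h>

/* Models the effect of the verifier's div/0 patch,
 * 'if src_reg == 0 -> xor dst_reg, dst_reg': division by zero
 * yields zero instead of trapping. */
static uint64_t bpf_div64(uint64_t dst, uint64_t src)
{
	return src == 0 ? 0 : dst / src;
}

/* Assumed: the analogous patch for BPF_MOD leaves dst unchanged
 * when src is zero. */
static uint64_t bpf_mod64(uint64_t dst, uint64_t src)
{
	return src == 0 ? dst : dst % src;
}
```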

Other undefined/underflows/overflows are implementation defined.
Meaning after JIT-ing they may behave differently on
different architectures.
For example, to avoid undefined behavior in C, the interpreter
implements shifts as
DST = DST OP (SRC & 63);
where OP is << or >>.

The JITs won't be adding the masking insns, since the CPU HW will
do an implicit mask, which could potentially differ
from one CPU to another.
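Written out in C, the interpreter's masking looks like this (a sketch; the function names are illustrative):

```c
#include <stdint.h>

/* The interpreter masks the shift amount so the C shift operator is
 * never invoked with a count >= 64, which would be undefined behavior
 * in C.  Hardware typically applies an equivalent implicit mask, so
 * JITs can omit it. */
static uint64_t bpf_lsh64(uint64_t dst, uint64_t src)
{
	return dst << (src & 63);
}

static uint64_t bpf_rsh64(uint64_t dst, uint64_t src)
{
	return dst >> (src & 63);	/* logical: operands are unsigned */
}
```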
I don't think it's worth documenting all that.
I would group all undefined/underflow/overflow as implementation
defined and document only things that matter.

>
> Discussion on where to host a definitive version of the document:
>
>  - I think the rough consensus is to just host regular (hopefully
>    low cadence) documents, and maybe the latest and greatest, at an eBPF
>    foundation website.  Whom do we need to work with at the foundation
>    to make this happen?

foundation folks cc-ed.

>  - On the technical side we need to figure out how to build a
>    standalone document from the kerneldoc tree of documents.  I
>    volunteer to look into that as well.

+1

> The verifier is not very well documented, and mixes up generic behavior
> with that of specific implementations and program types.
>
>  - the idea was brought up to write a document with the minimal
>    verification requirements for any eBPF implementation,
>    independent of the program type.  Again I can volunteer to
>    draft the documentation, but I need input on what such a consensus
>    would be.  In this case input from the non-Linux verifier
>    implementors (I only know the Microsoft research one) would
>    be very helpful as well.

The verifier is a moving target.
I'd say minimal verification is the one that checks that:
- instructions are formed correctly
- opcode is valid
- no reserved bits are used
- registers are within range (r11+ are not used)
- combination of opcode+regs+off+imm is valid
- simple things like that
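As a sketch, a minimal structural check along those lines might look like the following (the struct mirrors the kernel's struct bpf_insn; the opcode table here is abbreviated and illustrative, a real validator would table-drive the full opcode+regs+off+imm matrix):

```c
#include <stdbool.h>
#include <stdint.h>

/* Same layout as the kernel's struct bpf_insn. */
struct bpf_insn {
	uint8_t	code;		/* opcode */
	uint8_t	dst_reg:4;	/* destination register */
	uint8_t	src_reg:4;	/* source register */
	int16_t	off;		/* signed offset */
	int32_t	imm;		/* signed immediate */
};

#define BPF_REG_MAX 10	/* r0..r10; r11+ are invalid */

static bool bpf_insn_minimally_valid(const struct bpf_insn *insn)
{
	/* registers within range */
	if (insn->dst_reg > BPF_REG_MAX || insn->src_reg > BPF_REG_MAX)
		return false;

	switch (insn->code) {
	case 0x07: case 0x17:	/* bpf_add/bpf_sub dst, imm */
		/* BPF_K forms must leave the reserved src_reg clear */
		return insn->src_reg == 0;
	case 0x0f: case 0x1f:	/* bpf_add/bpf_sub dst, src */
		/* BPF_X forms must leave the reserved imm clear */
		return insn->imm == 0;
	case 0x95:		/* bpf_exit */
		return insn->dst_reg == 0 && insn->src_reg == 0 &&
		       insn->off == 0 && insn->imm == 0;
	default:
		return false;	/* abbreviated: unknown opcode */
	}
}
```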


* Re: LSF/MM session: eBPF standardization
  2022-05-12  2:39   ` Alexei Starovoitov
@ 2022-05-17  9:10     ` Christoph Hellwig
  2022-05-17 22:29       ` Harris, James R
  0 siblings, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2022-05-17  9:10 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Christoph Hellwig, Alexei Starovoitov, Daniel Borkmann, bpf,
	Harris, James R, Andrii Nakryiko, Dave Thaler, KP Singh,
	Brendan Gregg, Joe Stringer

On Wed, May 11, 2022 at 07:39:34PM -0700, Alexei Starovoitov wrote:
> Turns out that clang -mcpu=v1,v2,v3 are not exactly correct.
> We've extended the ISA more than three times.
> For example when we added more atomics insns in
> https://lore.kernel.org/bpf/20210114181751.768687-1-jackmanb@google.com/
> 
> The corresponding llvm diff didn't add a new -mcpu flavor.
> There was no need to do it.

.. because all uses of these new instructions are through builtins
that wouldn't otherwise be available, yes.

> Also llvm flags can turn a subset of insns on and off.
> Like llvm can turn on alu32, but without <,<= insns.
> -mcpu=v3 includes both; -mcpu=v2 has only <, <=.
> 
> So we need a plan B.

Yes.

> How about using the commit sha where support was added to the verifier
> as a 'version' of the ISA ?
> 
> We can try to use a kernel version, but backports
> will ruin that picture.
> Looks like upstream 'commit sha' is the only stable number.

Using kernel commit hashes is a pretty horrible interface, especially
for non-kernel users.  I also think the compilers and other tools
would really like some vaguely meaningful identifiers.

> Another approach would be to declare the current ISA as
> 1.0 (or bpf-isa-may-2022) and
> in the future bump it with every new insn.

I think that is a much more reasonable starting position.  However, are
we sure the ISA actually evolves linearly?  As far as I can tell support
for full atomics only exists for a few JITs so far.

So maybe starting with a baseline, and then just having a name for
each meaningful extension (e.g. the full atomics as a start) might be
even better.  For the Linux kernel case we can then also have a user
interface where userspace programs can query which extensions are
supported before loading eBPF programs that rely on them, instead of
doing a roundtrip through the verifier.

> >  - we need to decide what to do about the legacy BPF packet access
> >    instructions.  Alexei mentioned that the modern JIT doesn't
> >    even use those internally any more.
> 
> I think we need to document them as supported in the linux kernel,
> but deprecated in general.
> The standard might say "implementation defined" meaning that
> different run-times don't have to support them.

Yeah.  If we do the extensions proposal above we could make these
a specific extension as well.

> [...]
> I don't think it's worth documenting all that.
> I would group all undefined/underflow/overflow as implementation
> defined and document only things that matter.

Makes sense.

> > Discussion on where to host a definitive version of the document:
> >
> >  - I think the rough consensus is to just host regular (hopefully
> >    low cadence) documents, and maybe the latest and greatest, at an eBPF
> >    foundation website.  Whom do we need to work with at the foundation
> >    to make this happen?
> 
> foundation folks cc-ed.

I'd be really glad if we could kick this off ASAP.  Feel free to contact
me privately if we want to keep it off the list.

> >  - the idea was brought up to write a document with the minimal
> >    verification requirements for any eBPF implementation,
> >    independent of the program type.  Again I can volunteer to
> >    draft the documentation, but I need input on what such a consensus
> >    would be.  In this case input from the non-Linux verifier
> >    implementors (I only know the Microsoft research one) would
> >    be very helpful as well.
> 
> The verifier is a moving target.

Absolutely.

> I'd say minimal verification is the one that checks that:
> - instructions are formed correctly
> - opcode is valid
> - no reserved bits are used
> - registers are within range (r11+ are not used)
> - combination of opcode+regs+off+imm is valid
> - simple things like that

Sounds good.  One useful thing for this would be an opcode table
with all the optional field usage in machine-readable format.

Jim, who is on CC, has already built a nice table of all opcodes based
on existing material that might be a good starting point.


* Re: LSF/MM session: eBPF standardization
  2022-05-17  9:10     ` Christoph Hellwig
@ 2022-05-17 22:29       ` Harris, James R
  0 siblings, 0 replies; 5+ messages in thread
From: Harris, James R @ 2022-05-17 22:29 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alexei Starovoitov, Alexei Starovoitov, Daniel Borkmann, bpf,
	Andrii Nakryiko, Dave Thaler, KP Singh, Brendan Gregg,
	Joe Stringer



> On May 17, 2022, at 2:10 AM, Christoph Hellwig <hch@lst.de> wrote:
> 
> On Wed, May 11, 2022 at 07:39:34PM -0700, Alexei Starovoitov wrote:
>> Turns out that clang -mcpu=v1,v2,v3 are not exactly correct.
>> We've extended the ISA more than three times.
>> For example when we added more atomics insns in
>> https://lore.kernel.org/bpf/20210114181751.768687-1-jackmanb@google.com/
>> 
>> The corresponding llvm diff didn't add a new -mcpu flavor.
>> There was no need to do it.
> 
> .. because all uses of these new instructions are through builtins
> that wouldn't otherwise be available, yes.
> 
>> Also llvm flags can turn a subset of insns on and off.
>> Like llvm can turn on alu32, but without <,<= insns.
>> -mcpu=v3 includes both; -mcpu=v2 has only <, <=.
>> 
>> So we need a plan B.
> 
> Yes.
> 
>> How about using the commit sha where support was added to the verifier
>> as a 'version' of the ISA ?
>> 
>> We can try to use a kernel version, but backports
>> will ruin that picture.
>> Looks like upstream 'commit sha' is the only stable number.
> 
> Using kernel commit hashes is a pretty horrible interface, especially
> for non-kernel users.  I also think the compilers and other tools
> would really like some vaguely meaningful identifiers.
> 
>> Another approach would be to declare the current ISA as
>> 1.0 (or bpf-isa-may-2022) and
>> in the future bump it with every new insn.
> 
> I think that is a much more reasonable starting position.  However, are
> we sure the ISA actually evolves linearly?  As far as I can tell support
> for full atomics only exists for a few JITs so far.
> 
> So maybe starting with a baseline, and then just having a name for
> each meaningful extension (e.g. the full atomics as a start) might be
> even better.  For the Linux kernel case we can then also have a user
> interface where userspace programs can query which extensions are
> supported before loading eBPF programs that rely on them, instead of
> doing a roundtrip through the verifier.
> 
>>> - we need to decide what to do about the legacy BPF packet access
>>>   instructions.  Alexei mentioned that the modern JIT doesn't
>>>   even use those internally any more.
>> 
>> I think we need to document them as supported in the linux kernel,
>> but deprecated in general.
>> The standard might say "implementation defined" meaning that
>> different run-times don't have to support them.
> 
> Yeah.  If we do the extensions proposal above we could make these
> a specific extension as well.
> 
>> [...]
>> I don't think it's worth documenting all that.
>> I would group all undefined/underflow/overflow as implementation
>> defined and document only things that matter.
> 
> Makes sense.
> 
>>> Discussion on where to host a definitive version of the document:
>>> 
>>> - I think the rough consensus is to just host regular (hopefully
>>>   low cadence) documents, and maybe the latest and greatest, at an eBPF
>>>   foundation website.  Whom do we need to work with at the foundation
>>>   to make this happen?
>> 
>> foundation folks cc-ed.
> 
> I'd be really glad if we could kick this off ASAP.  Feel free to contact
> me privately if we want to keep it off the list.
> 
>>> - the idea was brought up to write a document with the minimal
>>>   verification requirements for any eBPF implementation,
>>>   independent of the program type.  Again I can volunteer to
>>>   draft the documentation, but I need input on what such a consensus
>>>   would be.  In this case input from the non-Linux verifier
>>>   implementors (I only know the Microsoft research one) would
>>>   be very helpful as well.
>> 
>> The verifier is a moving target.
> 
> Absolutely.
> 
>> I'd say minimal verification is the one that checks that:
>> - instructions are formed correctly
>> - opcode is valid
>> - no reserved bits are used
>> - registers are within range (r11+ are not used)
>> - combination of opcode+regs+off+imm is valid
>> - simple things like that
> 
> Sounds good.  One useful thing for this would be an opcode table
> with all the optional field usage in machine-readable format.
> 
> Jim, who is on CC, has already built a nice table of all opcodes based
> on existing material that might be a good starting point.

Table is inline below.  I used the tables in the iovisor project
documentation as a starting point:

https://github.com/iovisor/bpf-docs/blob/master/eBPF.md

Feedback welcome.  The atomic sections at the bottom could especially use some
careful review for correctness.

BPF_ALU64 (opc & 0x07 == 0x07)
==============================
0x07		BPF_ALU64 | BPF_K | BPF_ADD	bpf_add dst, imm	dst += imm
0x0f		BPF_ALU64 | BPF_X | BPF_ADD	bpf_add dst, src	dst += src
0x17		BPF_ALU64 | BPF_K | BPF_SUB	bpf_sub dst, imm	dst -= imm
0x1f		BPF_ALU64 | BPF_X | BPF_SUB	bpf_sub dst, src	dst -= src
0x27		BPF_ALU64 | BPF_K | BPF_MUL	bpf_mul dst, imm	dst *= imm
0x2f		BPF_ALU64 | BPF_X | BPF_MUL	bpf_mul dst, src	dst *= src
0x37		BPF_ALU64 | BPF_K | BPF_DIV	bpf_div dst, imm	dst /= imm
0x3f		BPF_ALU64 | BPF_X | BPF_DIV	bpf_div dst, src	dst /= src
0x47		BPF_ALU64 | BPF_K | BPF_OR	bpf_or dst, imm		dst |= imm
0x4f		BPF_ALU64 | BPF_X | BPF_OR	bpf_or dst, src		dst |= src
0x57		BPF_ALU64 | BPF_K | BPF_AND	bpf_and dst, imm	dst &= imm
0x5f		BPF_ALU64 | BPF_X | BPF_AND	bpf_and dst, src	dst &= src
0x67		BPF_ALU64 | BPF_K | BPF_LSH	bpf_lsh dst, imm	dst <<= imm
0x6f		BPF_ALU64 | BPF_X | BPF_LSH	bpf_lsh dst, src	dst <<= src
0x77		BPF_ALU64 | BPF_K | BPF_RSH	bpf_rsh dst, imm	dst >>= imm (logical)
0x7f		BPF_ALU64 | BPF_X | BPF_RSH	bpf_rsh dst, src	dst >>= src (logical)
0x87		BPF_ALU64 | BPF_K | BPF_NEG	bpf_neg dst		dst = -dst
0x97		BPF_ALU64 | BPF_K | BPF_MOD	bpf_mod dst, imm	dst %= imm
0x9f		BPF_ALU64 | BPF_X | BPF_MOD	bpf_mod dst, src	dst %= src
0xa7		BPF_ALU64 | BPF_K | BPF_XOR	bpf_xor dst, imm	dst ^= imm
0xaf		BPF_ALU64 | BPF_X | BPF_XOR	bpf_xor dst, src	dst ^= src
0xb7		BPF_ALU64 | BPF_K | BPF_MOV	bpf_mov dst, imm	dst = imm
0xbf		BPF_ALU64 | BPF_X | BPF_MOV	bpf_mov dst, src	dst = src
0xc7		BPF_ALU64 | BPF_K | BPF_ARSH	bpf_arsh dst, imm	dst >>= imm (arithmetic)
0xcf		BPF_ALU64 | BPF_X | BPF_ARSH	bpf_arsh dst, src	dst >>= src (arithmetic)

BPF_ALU (opc & 0x07 == 0x04)
============================
0x04		BPF_ALU | BPF_K | BPF_ADD	bpf_add32 dst, imm	dst32 += imm
0x0c		BPF_ALU | BPF_X | BPF_ADD	bpf_add32 dst, src	dst32 += src32
0x14		BPF_ALU | BPF_K | BPF_SUB	bpf_sub32 dst, imm	dst32 -= imm
0x1c		BPF_ALU | BPF_X | BPF_SUB	bpf_sub32 dst, src	dst32 -= src32
0x24		BPF_ALU | BPF_K | BPF_MUL	bpf_mul32 dst, imm	dst32 *= imm
0x2c		BPF_ALU | BPF_X | BPF_MUL	bpf_mul32 dst, src	dst32 *= src32
0x34		BPF_ALU | BPF_K | BPF_DIV	bpf_div32 dst, imm	dst32 /= imm
0x3c		BPF_ALU | BPF_X | BPF_DIV	bpf_div32 dst, src	dst32 /= src32
0x44		BPF_ALU | BPF_K | BPF_OR	bpf_or32 dst, imm	dst32 |= imm
0x4c		BPF_ALU | BPF_X | BPF_OR	bpf_or32 dst, src	dst32 |= src32
0x54		BPF_ALU | BPF_K | BPF_AND	bpf_and32 dst, imm	dst32 &= imm
0x5c		BPF_ALU | BPF_X | BPF_AND	bpf_and32 dst, src	dst32 &= src32
0x64		BPF_ALU | BPF_K | BPF_LSH	bpf_lsh32 dst, imm	dst32 <<= imm
0x6c		BPF_ALU | BPF_X | BPF_LSH	bpf_lsh32 dst, src	dst32 <<= src32
0x74		BPF_ALU | BPF_K | BPF_RSH	bpf_rsh32 dst, imm	dst32 >>= imm (logical)
0x7c		BPF_ALU | BPF_X | BPF_RSH	bpf_rsh32 dst, src	dst32 >>= src32 (logical)
0x84		BPF_ALU | BPF_K | BPF_NEG	bpf_neg32 dst		dst32 = -dst32
0x94		BPF_ALU | BPF_K | BPF_MOD	bpf_mod32 dst, imm	dst32 %= imm
0x9c		BPF_ALU | BPF_X | BPF_MOD	bpf_mod32 dst, src	dst32 %= src32
0xa4		BPF_ALU | BPF_K | BPF_XOR	bpf_xor32 dst, imm	dst32 ^= imm
0xac		BPF_ALU | BPF_X | BPF_XOR	bpf_xor32 dst, src	dst32 ^= src32
0xb4		BPF_ALU | BPF_K | BPF_MOV	bpf_mov32 dst, imm	dst32 = imm
0xbc		BPF_ALU | BPF_X | BPF_MOV	bpf_mov32 dst, src	dst32 = src32
0xc4		BPF_ALU | BPF_K | BPF_ARSH	bpf_arsh32 dst, imm	dst32 >>= imm (arithmetic)
0xcc		BPF_ALU | BPF_X | BPF_ARSH	bpf_arsh32 dst, src	dst32 >>= src32 (arithmetic)

For BPF_END instructions, BPF_K selects an htole conversion and BPF_X an htobe conversion.
The operation width (16, 32, or 64 bits) is given by the instruction's imm field.

0xd4 (.imm = 16)	BPF_ALU | BPF_K | BPF_END	bpf_le16 dst	dst16 = htole16(dst16)
0xd4 (.imm = 32)	BPF_ALU | BPF_K | BPF_END	bpf_le32 dst	dst32 = htole32(dst32)
0xd4 (.imm = 64)	BPF_ALU | BPF_K | BPF_END	bpf_le64 dst	dst = htole64(dst)
0xdc (.imm = 16)	BPF_ALU | BPF_X | BPF_END	bpf_be16 dst	dst16 = htobe16(dst16)
0xdc (.imm = 32)	BPF_ALU | BPF_X | BPF_END	bpf_be32 dst	dst32 = htobe32(dst32)
0xdc (.imm = 64)	BPF_ALU | BPF_X | BPF_END	bpf_be64 dst	dst = htobe64(dst)
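On a little-endian host the bpf_le* forms are no-ops and the bpf_be* forms amount to a byte swap; a sketch of the 16- and 32-bit swaps (illustrative helper names):

```c
#include <stdint.h>

/* Unconditional 16-bit byte swap: models bpf_be16 on a little-endian
 * host, where htobe16 is a swap and htole16 is a no-op. */
static uint16_t bswap16(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

/* Unconditional 32-bit byte swap: models bpf_be32 on the same host. */
static uint32_t bswap32(uint32_t x)
{
	return ((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) << 8) |
	       ((x & 0x00ff0000u) >> 8)  | ((x & 0xff000000u) >> 24);
}
```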

BPF_JMP (opc & 0x07 == 0x05)
============================
0x05		BPF_JMP | BPF_K | BPF_JA	bpf_ja +off			PC += off
0x15		BPF_JMP | BPF_K | BPF_JEQ	bpf_jeq dst, imm, +off		PC += off if dst == imm
0x1d		BPF_JMP | BPF_X | BPF_JEQ	bpf_jeq dst, src, +off		PC += off if dst == src
0x25		BPF_JMP | BPF_K | BPF_JGT	bpf_jgt dst, imm, +off		PC += off if dst > imm
0x2d		BPF_JMP | BPF_X | BPF_JGT	bpf_jgt dst, src, +off		PC += off if dst > src
0x35		BPF_JMP | BPF_K | BPF_JGE	bpf_jge dst, imm, +off		PC += off if dst >= imm
0x3d		BPF_JMP | BPF_X | BPF_JGE	bpf_jge dst, src, +off		PC += off if dst >= src
0x45		BPF_JMP | BPF_K | BPF_JSET	bpf_jset dst, imm, +off		PC += off if dst & imm
0x4d		BPF_JMP | BPF_X | BPF_JSET	bpf_jset dst, src, +off		PC += off if dst & src
0x55		BPF_JMP | BPF_K | BPF_JNE	bpf_jne dst, imm, +off		PC += off if dst != imm
0x5d		BPF_JMP | BPF_X | BPF_JNE	bpf_jne dst, src, +off		PC += off if dst != src
0x65		BPF_JMP | BPF_K | BPF_JSGT	bpf_jsgt dst, imm, +off		PC += off if dst > imm (signed)
0x6d		BPF_JMP | BPF_X | BPF_JSGT	bpf_jsgt dst, src, +off		PC += off if dst > src (signed)
0x75		BPF_JMP | BPF_K | BPF_JSGE	bpf_jsge dst, imm, +off		PC += off if dst >= imm (signed)
0x7d		BPF_JMP | BPF_X | BPF_JSGE	bpf_jsge dst, src, +off		PC += off if dst >= src (signed)
0x85		BPF_JMP | BPF_K | BPF_CALL	bpf_call imm			Function call
0x95		BPF_JMP | BPF_K | BPF_EXIT	bpf_exit			return r0
0xa5		BPF_JMP | BPF_K | BPF_JLT	bpf_jlt dst, imm, +off		PC += off if dst < imm
0xad		BPF_JMP | BPF_X | BPF_JLT	bpf_jlt dst, src, +off		PC += off if dst < src
0xb5		BPF_JMP | BPF_K | BPF_JLE	bpf_jle dst, imm, +off		PC += off if dst <= imm
0xbd		BPF_JMP | BPF_X | BPF_JLE	bpf_jle dst, src, +off		PC += off if dst <= src
0xc5		BPF_JMP | BPF_K | BPF_JSLT	bpf_jslt dst, imm, +off		PC += off if dst < imm (signed)
0xcd		BPF_JMP | BPF_X | BPF_JSLT	bpf_jslt dst, src, +off		PC += off if dst < src (signed)
0xd5		BPF_JMP | BPF_K | BPF_JSLE	bpf_jsle dst, imm, +off		PC += off if dst <= imm (signed)
0xdd		BPF_JMP | BPF_X | BPF_JSLE	bpf_jsle dst, src, +off		PC += off if dst <= src (signed)

BPF_JMP32 (opc & 0x07 == 0x06)
==============================
0x16		BPF_JMP32 | BPF_K | BPF_JEQ	bpf_jeq32 dst, imm, +off	PC += off if dst32 == imm
0x1e		BPF_JMP32 | BPF_X | BPF_JEQ	bpf_jeq32 dst, src, +off	PC += off if dst32 == src32
0x26		BPF_JMP32 | BPF_K | BPF_JGT	bpf_jgt32 dst, imm, +off	PC += off if dst32 > imm
0x2e		BPF_JMP32 | BPF_X | BPF_JGT	bpf_jgt32 dst, src, +off	PC += off if dst32 > src32
0x36		BPF_JMP32 | BPF_K | BPF_JGE	bpf_jge32 dst, imm, +off	PC += off if dst32 >= imm
0x3e		BPF_JMP32 | BPF_X | BPF_JGE	bpf_jge32 dst, src, +off	PC += off if dst32 >= src32
0x46		BPF_JMP32 | BPF_K | BPF_JSET	bpf_jset32 dst, imm, +off	PC += off if dst32 & imm
0x4e		BPF_JMP32 | BPF_X | BPF_JSET	bpf_jset32 dst, src, +off	PC += off if dst32 & src32
0x56		BPF_JMP32 | BPF_K | BPF_JNE	bpf_jne32 dst, imm, +off	PC += off if dst32 != imm
0x5e		BPF_JMP32 | BPF_X | BPF_JNE	bpf_jne32 dst, src, +off	PC += off if dst32 != src32
0x66		BPF_JMP32 | BPF_K | BPF_JSGT	bpf_jsgt32 dst, imm, +off	PC += off if dst32 > imm (signed)
0x6e		BPF_JMP32 | BPF_X | BPF_JSGT	bpf_jsgt32 dst, src, +off	PC += off if dst32 > src32 (signed)
0x76		BPF_JMP32 | BPF_K | BPF_JSGE	bpf_jsge32 dst, imm, +off	PC += off if dst32 >= imm (signed)
0x7e		BPF_JMP32 | BPF_X | BPF_JSGE	bpf_jsge32 dst, src, +off	PC += off if dst32 >= src32 (signed)
0xa6		BPF_JMP32 | BPF_K | BPF_JLT	bpf_jlt32 dst, imm, +off	PC += off if dst32 < imm
0xae		BPF_JMP32 | BPF_X | BPF_JLT	bpf_jlt32 dst, src, +off	PC += off if dst32 < src32
0xb6		BPF_JMP32 | BPF_K | BPF_JLE	bpf_jle32 dst, imm, +off	PC += off if dst32 <= imm
0xbe		BPF_JMP32 | BPF_X | BPF_JLE	bpf_jle32 dst, src, +off	PC += off if dst32 <= src32
0xc6		BPF_JMP32 | BPF_K | BPF_JSLT	bpf_jslt32 dst, imm, +off	PC += off if dst32 < imm (signed)
0xce		BPF_JMP32 | BPF_X | BPF_JSLT	bpf_jslt32 dst, src, +off	PC += off if dst32 < src32 (signed)
0xd6		BPF_JMP32 | BPF_K | BPF_JSLE	bpf_jsle32 dst, imm, +off	PC += off if dst32 <= imm (signed)
0xde		BPF_JMP32 | BPF_X | BPF_JSLE	bpf_jsle32 dst, src, +off	PC += off if dst32 <= src32 (signed)

BPF_LD (opc & 0x07 == 0x00)
===========================
0x18		BPF_LD | BPF_DW | BPF_IMM	bpf_lddw dst, imm64		dst = imm64
Note: the 64-bit immediate spans two consecutive 8-byte instruction slots: the low 32 bits
are in the first slot's imm field, the high 32 bits in the second slot's imm field
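A loader can reassemble the immediate like this (a sketch; the function name is illustrative, and the two arguments stand for the imm fields of the two slots):

```c
#include <stdint.h>

/* bpf_lddw occupies two 8-byte instruction slots: the low 32 bits of
 * the 64-bit immediate come from the first slot's imm field, the high
 * 32 bits from the second slot's imm field. */
static uint64_t bpf_lddw_imm64(int32_t imm_lo, int32_t imm_hi)
{
	/* cast via uint32_t so a negative low half is not sign-extended */
	return (uint64_t)(uint32_t)imm_lo |
	       ((uint64_t)(uint32_t)imm_hi << 32);
}
```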

BPF_LDX (opc & 0x07 == 0x01)
============================
0x61		BPF_LDX | BPF_W | BPF_MEM	bpf_ldxw dst, [src + off]	dst32 = *(u32 *)(src + off)
0x69		BPF_LDX | BPF_H | BPF_MEM	bpf_ldxh dst, [src + off]	dst16 = *(u16 *)(src + off)
0x71		BPF_LDX | BPF_B | BPF_MEM	bpf_ldxb dst, [src + off]	dst8 = *(u8 *)(src + off)
0x79		BPF_LDX | BPF_DW | BPF_MEM	bpf_ldxdw dst, [src + off]	dst = *(u64 *)(src + off)

BPF_ST (opc & 0x07 == 0x02)
===========================
0x62		BPF_ST | BPF_W | BPF_IMM	bpf_stw [dst + off], imm	*(u32 *)(dst + off) = imm
0x6a		BPF_ST | BPF_H | BPF_IMM	bpf_sth [dst + off], imm	*(u16 *)(dst + off) = imm
0x72		BPF_ST | BPF_B | BPF_IMM	bpf_stb [dst + off], imm	*(u8 *)(dst + off) = imm
0x7a		BPF_ST | BPF_DW | BPF_IMM	bpf_stdw [dst + off], imm	*(u64 *)(dst + off) = imm

BPF_STX (opc & 0x07 == 0x03)
============================
0x63		BPF_STX | BPF_W | BPF_MEM	bpf_stxw [dst + off], src	*(u32 *)(dst + off) = src32
0x6b		BPF_STX | BPF_H | BPF_MEM	bpf_stxh [dst + off], src	*(u16 *)(dst + off) = src16
0x73		BPF_STX | BPF_B | BPF_MEM	bpf_stxb [dst + off], src	*(u8 *)(dst + off) = src8
0x7b		BPF_STX | BPF_DW | BPF_MEM	bpf_stxdw [dst + off], src	*(u64 *)(dst + off) = src
0xdb		BPF_STX | BPF_DW | BPF_ATOMIC	64-bit atomic instructions (see below)
0xc3		BPF_STX | BPF_W | BPF_ATOMIC	32-bit atomic instructions (see below)

Note: mnemonic for atomic instructions?  For example, eBPF originally had only the XADD atomic instruction, which could be
represented as bpf_xadd.  But with the addition of atomic AND, OR and XOR operations, that same pattern no longer works.
So for now, just show pseudocode for each atomic operation.

64-bit atomic instructions (opc == 0xdb)
========================================
The following table applies to opc 0xdb (BPF_STX | BPF_DW | BPF_ATOMIC), the 64-bit atomic operations.
The imm (immediate) value in column 1 selects the atomic operation.

0x00	BPF_ADD			lock *(u64 *)(dst + off) += src
0x01	BPF_ADD | BPF_FETCH	src = atomic_fetch_add((u64 *)(dst + off), src)
0x40	BPF_OR			lock *(u64 *)(dst + off) |= src
0x41	BPF_OR | BPF_FETCH	src = atomic_fetch_or((u64 *)(dst + off), src)
0x50	BPF_AND			lock *(u64 *)(dst + off) &= src
0x51	BPF_AND | BPF_FETCH	src = atomic_fetch_and((u64 *)(dst + off), src)
0xa0	BPF_XOR			lock *(u64 *)(dst + off) ^= src
0xa1	BPF_XOR | BPF_FETCH	src = atomic_fetch_xor((u64 *)(dst + off), src)
0xe1	BPF_XCHG | BPF_FETCH	src = xchg((u64 *)(dst + off), src)
0xf1	BPF_CMPXCHG | BPF_FETCH	r0 = cmpxchg((u64 *)(dst + off), r0, src)

32-bit atomic instructions (opc == 0xc3)
========================================
The following table applies to opc 0xc3 (BPF_STX | BPF_W | BPF_ATOMIC), the 32-bit atomic operations.
The imm (immediate) value in column 1 selects the atomic operation.

0x00	BPF_ADD			lock *(u32 *)(dst + off) += src32
0x01	BPF_ADD | BPF_FETCH	src32 = atomic_fetch_add((u32 *)(dst + off), src32)
0x40	BPF_OR			lock *(u32 *)(dst + off) |= src32
0x41	BPF_OR | BPF_FETCH	src32 = atomic_fetch_or((u32 *)(dst + off), src32)
0x50	BPF_AND			lock *(u32 *)(dst + off) &= src32
0x51	BPF_AND | BPF_FETCH	src32 = atomic_fetch_and((u32 *)(dst + off), src32)
0xa0	BPF_XOR			lock *(u32 *)(dst + off) ^= src32
0xa1	BPF_XOR | BPF_FETCH	src32 = atomic_fetch_xor((u32 *)(dst + off), src32)
0xe1	BPF_XCHG | BPF_FETCH	src32 = xchg((u32 *)(dst + off), src32)
0xf1	BPF_CMPXCHG | BPF_FETCH	r0 = cmpxchg((u32 *)(dst + off), r0, src32)
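Two of the trickier entries above can be modeled in single-threaded C (a sketch only: helper names are illustrative and no actual atomicity is provided, which real implementations must guarantee):

```c
#include <stdint.h>

/* Models BPF_CMPXCHG (imm 0xf1): compare *ptr with r0; if equal, store
 * src; either way the old value of *ptr becomes the new r0. */
static uint64_t bpf_cmpxchg64(uint64_t *ptr, uint64_t r0, uint64_t src)
{
	uint64_t old = *ptr;

	if (old == r0)
		*ptr = src;
	return old;	/* new value of r0 */
}

/* Models BPF_ADD | BPF_FETCH (imm 0x01): the src register receives the
 * old value of *ptr. */
static uint64_t bpf_fetch_add64(uint64_t *ptr, uint64_t src)
{
	uint64_t old = *ptr;

	*ptr += src;
	return old;	/* written back to the src register */
}
```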



