* [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file
@ 2022-09-27 18:59 dthaler1968
2022-09-27 18:59 ` [PATCH 02/15] ebpf-docs: Linux byteswap note dthaler1968
` (14 more replies)
0 siblings, 15 replies; 50+ messages in thread
From: dthaler1968 @ 2022-09-27 18:59 UTC (permalink / raw)
To: bpf; +Cc: Dave Thaler
From: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
---
Documentation/bpf/instruction-set.rst | 38 ++--------------
Documentation/bpf/linux-notes.rst | 65 +++++++++++++++++++++++++++
2 files changed, 68 insertions(+), 35 deletions(-)
create mode 100644 Documentation/bpf/linux-notes.rst
diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index 1b0e6711d..352f25a1e 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -282,8 +282,6 @@ arithmetic operations in the imm field to encode the atomic operation:
*(u64 *)(dst_reg + off16) += src_reg
-``BPF_XADD`` is a deprecated name for ``BPF_ATOMIC | BPF_ADD``.
-
In addition to the simple atomic operations, there also is a modifier and
two complex atomic operations:
@@ -331,36 +329,6 @@ There is currently only one such instruction.
Legacy BPF Packet access instructions
-------------------------------------
-eBPF has special instructions for access to packet data that have been
-carried over from classic BPF to retain the performance of legacy socket
-filters running in the eBPF interpreter.
-
-The instructions come in two forms: ``BPF_ABS | <size> | BPF_LD`` and
-``BPF_IND | <size> | BPF_LD``.
-
-These instructions are used to access packet data and can only be used when
-the program context is a pointer to networking packet. ``BPF_ABS``
-accesses packet data at an absolute offset specified by the immediate data
-and ``BPF_IND`` access packet data at an offset that includes the value of
-a register in addition to the immediate data.
-
-These instructions have seven implicit operands:
-
- * Register R6 is an implicit input that must contain pointer to a
- struct sk_buff.
- * Register R0 is an implicit output which contains the data fetched from
- the packet.
- * Registers R1-R5 are scratch registers that are clobbered after a call to
- ``BPF_ABS | BPF_LD`` or ``BPF_IND | BPF_LD`` instructions.
-
-These instructions have an implicit program exit condition as well. When an
-eBPF program is trying to access the data beyond the packet boundary, the
-program execution will be aborted.
-
-``BPF_ABS | BPF_W | BPF_LD`` means::
-
- R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + imm32))
-
-``BPF_IND | BPF_W | BPF_LD`` means::
-
- R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32))
+eBPF previously introduced special instructions for access to packet data that were
+carried over from classic BPF. However, these instructions are
+deprecated and should no longer be used.
diff --git a/Documentation/bpf/linux-notes.rst b/Documentation/bpf/linux-notes.rst
new file mode 100644
index 000000000..93c01386d
--- /dev/null
+++ b/Documentation/bpf/linux-notes.rst
@@ -0,0 +1,65 @@
+.. contents::
+.. sectnum::
+
+==========================
+Linux implementation notes
+==========================
+
+This document provides more details specific to the Linux kernel implementation of the eBPF instruction set.
+
+Legacy BPF Packet access instructions
+=====================================
+
+As mentioned in the `ISA standard documentation <instruction-set.rst#legacy-bpf-packet-access-instructions>`_,
+Linux has special eBPF instructions for access to packet data that have been
+carried over from classic BPF to retain the performance of legacy socket
+filters running in the eBPF interpreter.
+
+The instructions come in two forms: ``BPF_ABS | <size> | BPF_LD`` and
+``BPF_IND | <size> | BPF_LD``.
+
+These instructions are used to access packet data and can only be used when
+the program context is a pointer to a networking packet. ``BPF_ABS``
+accesses packet data at an absolute offset specified by the immediate data
+and ``BPF_IND`` access packet data at an offset that includes the value of
+a register in addition to the immediate data.
+
+These instructions have seven implicit operands:
+
+* Register R6 is an implicit input that must contain a pointer to a
+ struct sk_buff.
+* Register R0 is an implicit output which contains the data fetched from
+ the packet.
+* Registers R1-R5 are scratch registers that are clobbered by the
+ instruction.
+
+These instructions have an implicit program exit condition as well. If an
+eBPF program attempts access data beyond the packet boundary, the
+program execution will be aborted.
+
+``BPF_ABS | BPF_W | BPF_LD`` (0x20) means::
+
+ R0 = ntohl(*(u32 *) ((struct sk_buff *) R6->data + imm))
+
+where ``ntohl()`` converts a 32-bit value from network byte order to host byte order.
+
+``BPF_IND | BPF_W | BPF_LD`` (0x40) means::
+
+ R0 = ntohl(*(u32 *) ((struct sk_buff *) R6->data + src + imm))
+
+Appendix
+========
+
+For reference, the following table lists legacy Linux-specific opcodes in order by value.
+
+====== ==== =================================================== =============
+opcode imm description reference
+====== ==== =================================================== =============
+0x20 any dst = ntohl(\*(uint32_t \*)(R6->data + imm)) `Legacy BPF Packet access instructions`_
+0x28 any dst = ntohs(\*(uint16_t \*)(R6->data + imm)) `Legacy BPF Packet access instructions`_
+0x30 any dst = (\*(uint8_t \*)(R6->data + imm)) `Legacy BPF Packet access instructions`_
+0x38 any dst = ntohll(\*(uint64_t \*)(R6->data + imm)) `Legacy BPF Packet access instructions`_
+0x40 any dst = ntohl(\*(uint32_t \*)(R6->data + src + imm)) `Legacy BPF Packet access instructions`_
+0x48 any dst = ntohs(\*(uint16_t \*)(R6->data + src + imm)) `Legacy BPF Packet access instructions`_
+0x50 any dst = \*(uint8_t \*)(R6->data + src + imm)) `Legacy BPF Packet access instructions`_
+0x58 any dst = ntohll(\*(uint64_t \*)(R6->data + src + imm)) `Legacy BPF Packet access instructions`_
--
2.33.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 02/15] ebpf-docs: Linux byteswap note
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
@ 2022-09-27 18:59 ` dthaler1968
2022-09-27 18:59 ` [PATCH 03/15] ebpf-docs: Move Clang notes to a separate file dthaler1968
` (13 subsequent siblings)
14 siblings, 0 replies; 50+ messages in thread
From: dthaler1968 @ 2022-09-27 18:59 UTC (permalink / raw)
To: bpf; +Cc: Dave Thaler
From: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
---
Documentation/bpf/instruction-set.rst | 4 ----
Documentation/bpf/linux-notes.rst | 5 +++++
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index 352f25a1e..1735b91ec 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -156,10 +156,6 @@ Examples:
dst_reg = htobe64(dst_reg)
-``BPF_FROM_LE`` and ``BPF_FROM_BE`` exist as aliases for ``BPF_TO_LE`` and
-``BPF_TO_BE`` respectively.
-
-
Jump instructions
-----------------
diff --git a/Documentation/bpf/linux-notes.rst b/Documentation/bpf/linux-notes.rst
index 93c01386d..1c31379b4 100644
--- a/Documentation/bpf/linux-notes.rst
+++ b/Documentation/bpf/linux-notes.rst
@@ -7,6 +7,11 @@ Linux implementation notes
This document provides more details specific to the Linux kernel implementation of the eBPF instruction set.
+Byte swap instructions
+======================
+
+``BPF_FROM_LE`` and ``BPF_FROM_BE`` exist as aliases for ``BPF_TO_LE`` and ``BPF_TO_BE`` respectively.
+
Legacy BPF Packet access instructions
=====================================
--
2.33.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 03/15] ebpf-docs: Move Clang notes to a separate file
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
2022-09-27 18:59 ` [PATCH 02/15] ebpf-docs: Linux byteswap note dthaler1968
@ 2022-09-27 18:59 ` dthaler1968
2022-09-27 18:59 ` [PATCH 04/15] ebpf-docs: Add Clang note about BPF_ALU dthaler1968
` (12 subsequent siblings)
14 siblings, 0 replies; 50+ messages in thread
From: dthaler1968 @ 2022-09-27 18:59 UTC (permalink / raw)
To: bpf; +Cc: Dave Thaler
From: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
---
Documentation/bpf/clang-notes.rst | 24 ++++++++++++++++++++++++
Documentation/bpf/instruction-set.rst | 6 ------
2 files changed, 24 insertions(+), 6 deletions(-)
create mode 100644 Documentation/bpf/clang-notes.rst
diff --git a/Documentation/bpf/clang-notes.rst b/Documentation/bpf/clang-notes.rst
new file mode 100644
index 000000000..b15179cb5
--- /dev/null
+++ b/Documentation/bpf/clang-notes.rst
@@ -0,0 +1,24 @@
+.. contents::
+.. sectnum::
+
+==========================
+Clang implementation notes
+==========================
+
+This document provides more details specific to the Clang/LLVM implementation of the eBPF instruction set.
+
+Versions
+========
+
+Clang defined "CPU" versions, where a CPU version of 3 corresponds to the current eBPF ISA.
+
+Clang can select the eBPF ISA version using ``-mcpu=v3`` for example to select version 3.
+
+Atomic operations
+=================
+
+Clang can generate atomic instructions by default when ``-mcpu=v3`` is
+enabled. If a lower version for ``-mcpu`` is set, the only atomic instruction
+Clang can generate is ``BPF_ADD`` *without* ``BPF_FETCH``. If you need to enable
+the atomics features, while keeping a lower ``-mcpu`` version, you can use
+``-Xclang -target-feature -Xclang +alu32``.
diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index 1735b91ec..541483118 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -303,12 +303,6 @@ The ``BPF_CMPXCHG`` operation atomically compares the value addressed by
value that was at ``dst_reg + off`` before the operation is zero-extended
and loaded back to ``R0``.
-Clang can generate atomic instructions by default when ``-mcpu=v3`` is
-enabled. If a lower version for ``-mcpu`` is set, the only atomic instruction
-Clang can generate is ``BPF_ADD`` *without* ``BPF_FETCH``. If you need to enable
-the atomics features, while keeping a lower ``-mcpu`` version, you can use
-``-Xclang -target-feature -Xclang +alu32``.
-
64-bit immediate instructions
-----------------------------
--
2.33.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 04/15] ebpf-docs: Add Clang note about BPF_ALU
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
2022-09-27 18:59 ` [PATCH 02/15] ebpf-docs: Linux byteswap note dthaler1968
2022-09-27 18:59 ` [PATCH 03/15] ebpf-docs: Move Clang notes to a separate file dthaler1968
@ 2022-09-27 18:59 ` dthaler1968
2022-09-27 18:59 ` [PATCH 05/15] ebpf-docs: Add TOC and fix formatting dthaler1968
` (11 subsequent siblings)
14 siblings, 0 replies; 50+ messages in thread
From: dthaler1968 @ 2022-09-27 18:59 UTC (permalink / raw)
To: bpf; +Cc: Dave Thaler
From: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
---
Documentation/bpf/clang-notes.rst | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/Documentation/bpf/clang-notes.rst b/Documentation/bpf/clang-notes.rst
index b15179cb5..528feddf2 100644
--- a/Documentation/bpf/clang-notes.rst
+++ b/Documentation/bpf/clang-notes.rst
@@ -14,6 +14,12 @@ Clang defined "CPU" versions, where a CPU version of 3 corresponds to the curren
Clang can select the eBPF ISA version using ``-mcpu=v3`` for example to select version 3.
+Arithmetic instructions
+=======================
+
+For CPU versions prior to 3, Clang v7.0 and later can enable ``BPF_ALU`` support with
+``-Xclang -target-feature -Xclang +alu32``. In CPU version 3, support is automatically included.
+
Atomic operations
=================
--
2.33.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 05/15] ebpf-docs: Add TOC and fix formatting
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
` (2 preceding siblings ...)
2022-09-27 18:59 ` [PATCH 04/15] ebpf-docs: Add Clang note about BPF_ALU dthaler1968
@ 2022-09-27 18:59 ` dthaler1968
2022-09-27 18:59 ` [PATCH 06/15] ebpf-docs: Use standard type convention in standard doc dthaler1968
` (10 subsequent siblings)
14 siblings, 0 replies; 50+ messages in thread
From: dthaler1968 @ 2022-09-27 18:59 UTC (permalink / raw)
To: bpf; +Cc: Dave Thaler
From: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
---
Documentation/bpf/instruction-set.rst | 268 +++++++++++++-------------
1 file changed, 136 insertions(+), 132 deletions(-)
diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index 541483118..4997d2088 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -1,7 +1,12 @@
+.. contents::
+.. sectnum::
+
+========================================
+eBPF Instruction Set Specification, v1.0
+========================================
+
+This document specifies version 1.0 of the eBPF instruction set.
-====================
-eBPF Instruction Set
-====================
Registers and calling convention
================================
@@ -11,10 +16,10 @@ all of which are 64-bits wide.
The eBPF calling convention is defined as:
- * R0: return value from function calls, and exit value for eBPF programs
- * R1 - R5: arguments for function calls
- * R6 - R9: callee saved registers that function calls will preserve
- * R10: read-only frame pointer to access stack
+* R0: return value from function calls, and exit value for eBPF programs
+* R1 - R5: arguments for function calls
+* R6 - R9: callee saved registers that function calls will preserve
+* R10: read-only frame pointer to access stack
R0 - R5 are scratch registers and eBPF programs needs to spill/fill them if
necessary across calls.
@@ -24,17 +29,17 @@ Instruction encoding
eBPF has two instruction encodings:
- * the basic instruction encoding, which uses 64 bits to encode an instruction
- * the wide instruction encoding, which appends a second 64-bit immediate value
- (imm64) after the basic instruction for a total of 128 bits.
+* the basic instruction encoding, which uses 64 bits to encode an instruction
+* the wide instruction encoding, which appends a second 64-bit immediate value
+ (imm64) after the basic instruction for a total of 128 bits.
The basic instruction encoding looks as follows:
- ============= ======= =============== ==================== ============
- 32 bits (MSB) 16 bits 4 bits 4 bits 8 bits (LSB)
- ============= ======= =============== ==================== ============
- immediate offset source register destination register opcode
- ============= ======= =============== ==================== ============
+============= ======= =============== ==================== ============
+32 bits (MSB) 16 bits 4 bits 4 bits 8 bits (LSB)
+============= ======= =============== ==================== ============
+immediate offset source register destination register opcode
+============= ======= =============== ==================== ============
Note that most instructions do not use all of the fields.
Unused fields shall be cleared to zero.
@@ -44,30 +49,30 @@ Instruction classes
The three LSB bits of the 'opcode' field store the instruction class:
- ========= ===== ===============================
- class value description
- ========= ===== ===============================
- BPF_LD 0x00 non-standard load operations
- BPF_LDX 0x01 load into register operations
- BPF_ST 0x02 store from immediate operations
- BPF_STX 0x03 store from register operations
- BPF_ALU 0x04 32-bit arithmetic operations
- BPF_JMP 0x05 64-bit jump operations
- BPF_JMP32 0x06 32-bit jump operations
- BPF_ALU64 0x07 64-bit arithmetic operations
- ========= ===== ===============================
+========= ===== =============================== ===================================
+class value description reference
+========= ===== =============================== ===================================
+BPF_LD 0x00 non-standard load operations `Load and store instructions`_
+BPF_LDX 0x01 load into register operations `Load and store instructions`_
+BPF_ST 0x02 store from immediate operations `Load and store instructions`_
+BPF_STX 0x03 store from register operations `Load and store instructions`_
+BPF_ALU 0x04 32-bit arithmetic operations `Arithmetic and jump instructions`_
+BPF_JMP 0x05 64-bit jump operations `Arithmetic and jump instructions`_
+BPF_JMP32 0x06 32-bit jump operations `Arithmetic and jump instructions`_
+BPF_ALU64 0x07 64-bit arithmetic operations `Arithmetic and jump instructions`_
+========= ===== =============================== ===================================
Arithmetic and jump instructions
================================
-For arithmetic and jump instructions (BPF_ALU, BPF_ALU64, BPF_JMP and
-BPF_JMP32), the 8-bit 'opcode' field is divided into three parts:
+For arithmetic and jump instructions (``BPF_ALU``, ``BPF_ALU64``, ``BPF_JMP`` and
+``BPF_JMP32``), the 8-bit 'opcode' field is divided into three parts:
- ============== ====== =================
- 4 bits (MSB) 1 bit 3 bits (LSB)
- ============== ====== =================
- operation code source instruction class
- ============== ====== =================
+============== ====== =================
+4 bits (MSB) 1 bit 3 bits (LSB)
+============== ====== =================
+operation code source instruction class
+============== ====== =================
The 4th bit encodes the source operand:
@@ -84,51 +89,51 @@ The four MSB bits store the operation code.
Arithmetic instructions
-----------------------
-BPF_ALU uses 32-bit wide operands while BPF_ALU64 uses 64-bit wide operands for
+``BPF_ALU`` uses 32-bit wide operands while ``BPF_ALU64`` uses 64-bit wide operands for
otherwise identical operations.
-The code field encodes the operation as below:
-
- ======== ===== =================================================
- code value description
- ======== ===== =================================================
- BPF_ADD 0x00 dst += src
- BPF_SUB 0x10 dst -= src
- BPF_MUL 0x20 dst \*= src
- BPF_DIV 0x30 dst /= src
- BPF_OR 0x40 dst \|= src
- BPF_AND 0x50 dst &= src
- BPF_LSH 0x60 dst <<= src
- BPF_RSH 0x70 dst >>= src
- BPF_NEG 0x80 dst = ~src
- BPF_MOD 0x90 dst %= src
- BPF_XOR 0xa0 dst ^= src
- BPF_MOV 0xb0 dst = src
- BPF_ARSH 0xc0 sign extending shift right
- BPF_END 0xd0 byte swap operations (see separate section below)
- ======== ===== =================================================
-
-BPF_ADD | BPF_X | BPF_ALU means::
+The 'code' field encodes the operation as below:
+
+======== ===== ==========================================================
+code value description
+======== ===== ==========================================================
+BPF_ADD 0x00 dst += src
+BPF_SUB 0x10 dst -= src
+BPF_MUL 0x20 dst \*= src
+BPF_DIV 0x30 dst /= src
+BPF_OR 0x40 dst \|= src
+BPF_AND 0x50 dst &= src
+BPF_LSH 0x60 dst <<= src
+BPF_RSH 0x70 dst >>= src
+BPF_NEG 0x80 dst = ~src
+BPF_MOD 0x90 dst %= src
+BPF_XOR 0xa0 dst ^= src
+BPF_MOV 0xb0 dst = src
+BPF_ARSH 0xc0 sign extending shift right
+BPF_END 0xd0 byte swap operations (see `Byte swap instructions`_ below)
+======== ===== ==========================================================
+
+``BPF_ADD | BPF_X | BPF_ALU`` means::
dst_reg = (u32) dst_reg + (u32) src_reg;
-BPF_ADD | BPF_X | BPF_ALU64 means::
+``BPF_ADD | BPF_X | BPF_ALU64`` means::
dst_reg = dst_reg + src_reg
-BPF_XOR | BPF_K | BPF_ALU means::
+``BPF_XOR | BPF_K | BPF_ALU`` means::
src_reg = (u32) src_reg ^ (u32) imm32
-BPF_XOR | BPF_K | BPF_ALU64 means::
+``BPF_XOR | BPF_K | BPF_ALU64`` means::
src_reg = src_reg ^ imm32
Byte swap instructions
-----------------------
+~~~~~~~~~~~~~~~~~~~~~~
The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit
-code field of ``BPF_END``.
+'code' field of ``BPF_END``.
The byte swap instructions operate on the destination register
only and do not use a separate source register or immediate value.
@@ -136,14 +141,14 @@ only and do not use a separate source register or immediate value.
The 1-bit source operand field in the opcode is used to to select what byte
order the operation convert from or to:
- ========= ===== =================================================
- source value description
- ========= ===== =================================================
- BPF_TO_LE 0x00 convert between host byte order and little endian
- BPF_TO_BE 0x08 convert between host byte order and big endian
- ========= ===== =================================================
+========= ===== =================================================
+source value description
+========= ===== =================================================
+BPF_TO_LE 0x00 convert between host byte order and little endian
+BPF_TO_BE 0x08 convert between host byte order and big endian
+========= ===== =================================================
-The imm field encodes the width of the swap operations. The following widths
+The 'imm' field encodes the width of the swap operations. The following widths
are supported: 16, 32 and 64.
Examples:
@@ -159,28 +164,28 @@ Examples:
Jump instructions
-----------------
-BPF_JMP32 uses 32-bit wide operands while BPF_JMP uses 64-bit wide operands for
+``BPF_JMP32`` uses 32-bit wide operands while ``BPF_JMP`` uses 64-bit wide operands for
otherwise identical operations.
-The code field encodes the operation as below:
-
- ======== ===== ========================= ============
- code value description notes
- ======== ===== ========================= ============
- BPF_JA 0x00 PC += off BPF_JMP only
- BPF_JEQ 0x10 PC += off if dst == src
- BPF_JGT 0x20 PC += off if dst > src unsigned
- BPF_JGE 0x30 PC += off if dst >= src unsigned
- BPF_JSET 0x40 PC += off if dst & src
- BPF_JNE 0x50 PC += off if dst != src
- BPF_JSGT 0x60 PC += off if dst > src signed
- BPF_JSGE 0x70 PC += off if dst >= src signed
- BPF_CALL 0x80 function call
- BPF_EXIT 0x90 function / program return BPF_JMP only
- BPF_JLT 0xa0 PC += off if dst < src unsigned
- BPF_JLE 0xb0 PC += off if dst <= src unsigned
- BPF_JSLT 0xc0 PC += off if dst < src signed
- BPF_JSLE 0xd0 PC += off if dst <= src signed
- ======== ===== ========================= ============
+The 'code' field encodes the operation as below:
+
+======== ===== ========================= ============
+code value description notes
+======== ===== ========================= ============
+BPF_JA 0x00 PC += off BPF_JMP only
+BPF_JEQ 0x10 PC += off if dst == src
+BPF_JGT 0x20 PC += off if dst > src unsigned
+BPF_JGE 0x30 PC += off if dst >= src unsigned
+BPF_JSET 0x40 PC += off if dst & src
+BPF_JNE 0x50 PC += off if dst != src
+BPF_JSGT 0x60 PC += off if dst > src signed
+BPF_JSGE 0x70 PC += off if dst >= src signed
+BPF_CALL 0x80 function call
+BPF_EXIT 0x90 function / program return BPF_JMP only
+BPF_JLT 0xa0 PC += off if dst < src unsigned
+BPF_JLE 0xb0 PC += off if dst <= src unsigned
+BPF_JSLT 0xc0 PC += off if dst < src signed
+BPF_JSLE 0xd0 PC += off if dst <= src signed
+======== ===== ========================= ============
The eBPF program needs to store the return value into register R0 before doing a
BPF_EXIT.
@@ -189,14 +194,26 @@ BPF_EXIT.
Load and store instructions
===========================
-For load and store instructions (BPF_LD, BPF_LDX, BPF_ST and BPF_STX), the
+For load and store instructions (``BPF_LD``, ``BPF_LDX``, ``BPF_ST``, and ``BPF_STX``), the
8-bit 'opcode' field is divided as:
- ============ ====== =================
- 3 bits (MSB) 2 bits 3 bits (LSB)
- ============ ====== =================
- mode size instruction class
- ============ ====== =================
+============ ====== =================
+3 bits (MSB) 2 bits 3 bits (LSB)
+============ ====== =================
+mode size instruction class
+============ ====== =================
+
+The mode modifier is one of:
+
+ ============= ===== ==================================== =============
+ mode modifier value description reference
+ ============= ===== ==================================== =============
+ BPF_IMM 0x00 64-bit immediate instructions `64-bit immediate instructions`_
+ BPF_ABS 0x20 legacy BPF packet access (absolute) `Legacy BPF Packet access instructions`_
+ BPF_IND 0x40 legacy BPF packet access (indirect) `Legacy BPF Packet access instructions`_
+ BPF_MEM 0x60 regular load and store operations `Regular load and store operations`_
+ BPF_ATOMIC 0xc0 atomic operations `Atomic operations`_
+ ============= ===== ==================================== =============
The size modifier is one of:
@@ -209,19 +226,6 @@ The size modifier is one of:
BPF_DW 0x18 double word (8 bytes)
============= ===== =====================
-The mode modifier is one of:
-
- ============= ===== ====================================
- mode modifier value description
- ============= ===== ====================================
- BPF_IMM 0x00 64-bit immediate instructions
- BPF_ABS 0x20 legacy BPF packet access (absolute)
- BPF_IND 0x40 legacy BPF packet access (indirect)
- BPF_MEM 0x60 regular load and store operations
- BPF_ATOMIC 0xc0 atomic operations
- ============= ===== ====================================
-
-
Regular load and store operations
---------------------------------
@@ -252,42 +256,42 @@ by other eBPF programs or means outside of this specification.
All atomic operations supported by eBPF are encoded as store operations
that use the ``BPF_ATOMIC`` mode modifier as follows:
- * ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations
- * ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations
- * 8-bit and 16-bit wide atomic operations are not supported.
+* ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations
+* ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations
+* 8-bit and 16-bit wide atomic operations are not supported.
-The imm field is used to encode the actual atomic operation.
+The 'imm' field is used to encode the actual atomic operation.
Simple atomic operation use a subset of the values defined to encode
-arithmetic operations in the imm field to encode the atomic operation:
+arithmetic operations in the 'imm' field to encode the atomic operation:
- ======== ===== ===========
- imm value description
- ======== ===== ===========
- BPF_ADD 0x00 atomic add
- BPF_OR 0x40 atomic or
- BPF_AND 0x50 atomic and
- BPF_XOR 0xa0 atomic xor
- ======== ===== ===========
+======== ===== ===========
+imm value description
+======== ===== ===========
+BPF_ADD 0x00 atomic add
+BPF_OR 0x40 atomic or
+BPF_AND 0x50 atomic and
+BPF_XOR 0xa0 atomic xor
+======== ===== ===========
-``BPF_ATOMIC | BPF_W | BPF_STX`` with imm = BPF_ADD means::
+``BPF_ATOMIC | BPF_W | BPF_STX`` with 'imm' = BPF_ADD means::
*(u32 *)(dst_reg + off16) += src_reg
-``BPF_ATOMIC | BPF_DW | BPF_STX`` with imm = BPF ADD means::
+``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF ADD means::
*(u64 *)(dst_reg + off16) += src_reg
In addition to the simple atomic operations, there also is a modifier and
two complex atomic operations:
- =========== ================ ===========================
- imm value description
- =========== ================ ===========================
- BPF_FETCH 0x01 modifier: return old value
- BPF_XCHG 0xe0 | BPF_FETCH atomic exchange
- BPF_CMPXCHG 0xf0 | BPF_FETCH atomic compare and exchange
- =========== ================ ===========================
+=========== ================ ===========================
+imm value description
+=========== ================ ===========================
+BPF_FETCH 0x01 modifier: return old value
+BPF_XCHG 0xe0 | BPF_FETCH atomic exchange
+BPF_CMPXCHG 0xf0 | BPF_FETCH atomic compare and exchange
+=========== ================ ===========================
The ``BPF_FETCH`` modifier is optional for simple atomic operations, and
always set for the complex atomic operations. If the ``BPF_FETCH`` flag
@@ -306,7 +310,7 @@ and loaded back to ``R0``.
64-bit immediate instructions
-----------------------------
-Instructions with the ``BPF_IMM`` mode modifier use the wide instruction
+Instructions with the ``BPF_IMM`` 'mode' modifier use the wide instruction
encoding for an extra imm64 value.
There is currently only one such instruction.
--
2.33.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 06/15] ebpf-docs: Use standard type convention in standard doc
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
` (3 preceding siblings ...)
2022-09-27 18:59 ` [PATCH 05/15] ebpf-docs: Add TOC and fix formatting dthaler1968
@ 2022-09-27 18:59 ` dthaler1968
2022-09-30 20:49 ` Alexei Starovoitov
2022-09-27 18:59 ` [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero, overflow, and underflow dthaler1968
` (9 subsequent siblings)
14 siblings, 1 reply; 50+ messages in thread
From: dthaler1968 @ 2022-09-27 18:59 UTC (permalink / raw)
To: bpf; +Cc: Dave Thaler
From: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
---
Documentation/bpf/instruction-set.rst | 14 ++++++++++----
Documentation/bpf/linux-notes.rst | 6 ++++++
2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index 4997d2088..a24bc5d53 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -7,6 +7,10 @@ eBPF Instruction Set Specification, v1.0
This document specifies version 1.0 of the eBPF instruction set.
+Documentation conventions
+=========================
+
+This specification uses the standard C types (uint32_t, etc.) in documentation.
Registers and calling convention
================================
@@ -114,7 +118,9 @@ BPF_END 0xd0 byte swap operations (see `Byte swap instructions`_ below)
``BPF_ADD | BPF_X | BPF_ALU`` means::
- dst_reg = (u32) dst_reg + (u32) src_reg;
+ dst_reg = (uint32_t) dst_reg + (uint32_t) src_reg;
+
+where '(uint32_t)' indicates truncation to 32 bits.
``BPF_ADD | BPF_X | BPF_ALU64`` means::
@@ -122,7 +128,7 @@ BPF_END 0xd0 byte swap operations (see `Byte swap instructions`_ below)
``BPF_XOR | BPF_K | BPF_ALU`` means::
- src_reg = (u32) src_reg ^ (u32) imm32
+ src_reg = (uint32_t) src_reg ^ (uint32_t) imm32
``BPF_XOR | BPF_K | BPF_ALU64`` means::
@@ -276,11 +282,11 @@ BPF_XOR 0xa0 atomic xor
``BPF_ATOMIC | BPF_W | BPF_STX`` with 'imm' = BPF_ADD means::
- *(u32 *)(dst_reg + off16) += src_reg
+ *(uint32_t *)(dst_reg + off16) += src_reg
``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF ADD means::
- *(u64 *)(dst_reg + off16) += src_reg
+ *(uint32_t *)(dst_reg + off16) += src_reg
In addition to the simple atomic operations, there also is a modifier and
two complex atomic operations:
diff --git a/Documentation/bpf/linux-notes.rst b/Documentation/bpf/linux-notes.rst
index 1c31379b4..522ebe27d 100644
--- a/Documentation/bpf/linux-notes.rst
+++ b/Documentation/bpf/linux-notes.rst
@@ -7,6 +7,12 @@ Linux implementation notes
This document provides more details specific to the Linux kernel implementation of the eBPF instruction set.
+Arithmetic instructions
+=======================
+
+While the eBPF instruction set document uses the standard C terminology as the cross-platform specification,
+in the Linux kernel, uint32_t is expressed as u32, uint64_t is expressed as u64, etc.
+
Byte swap instructions
======================
--
2.33.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero, overflow, and underflow
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
` (4 preceding siblings ...)
2022-09-27 18:59 ` [PATCH 06/15] ebpf-docs: Use standard type convention in standard doc dthaler1968
@ 2022-09-27 18:59 ` dthaler1968
2022-09-30 20:52 ` Alexei Starovoitov
2022-09-27 18:59 ` [PATCH 08/15] ebpf-docs: Use consistent names for the same field dthaler1968
` (8 subsequent siblings)
14 siblings, 1 reply; 50+ messages in thread
From: dthaler1968 @ 2022-09-27 18:59 UTC (permalink / raw)
To: bpf; +Cc: Dave Thaler
From: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
---
Documentation/bpf/instruction-set.rst | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index a24bc5d53..3c5a63612 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -103,19 +103,26 @@ code value description
BPF_ADD 0x00 dst += src
BPF_SUB 0x10 dst -= src
BPF_MUL 0x20 dst \*= src
-BPF_DIV 0x30 dst /= src
+BPF_DIV 0x30 dst = (src != 0) ? (dst / src) : 0
BPF_OR 0x40 dst \|= src
BPF_AND 0x50 dst &= src
BPF_LSH 0x60 dst <<= src
BPF_RSH 0x70 dst >>= src
BPF_NEG 0x80 dst = ~src
-BPF_MOD 0x90 dst %= src
+BPF_MOD 0x90 dst = (src != 0) ? (dst % src) : dst
BPF_XOR 0xa0 dst ^= src
BPF_MOV 0xb0 dst = src
BPF_ARSH 0xc0 sign extending shift right
BPF_END 0xd0 byte swap operations (see `Byte swap instructions`_ below)
======== ===== ==========================================================
+Underflow and overflow are allowed during arithmetic operations,
+meaning the 64-bit or 32-bit value will wrap. If
+eBPF program execution would result in division by zero,
+the destination register is instead set to zero.
+If execution would result in modulo by zero,
+the destination register is instead left unchanged.
+
``BPF_ADD | BPF_X | BPF_ALU`` means::
dst_reg = (uint32_t) dst_reg + (uint32_t) src_reg;
@@ -135,6 +142,14 @@ where '(uint32_t)' indicates truncation to 32 bits.
src_reg = src_reg ^ imm32
+Also note that the modulo operation often varies by language
+when the dividend or divisor are negative, where Python, Ruby, etc.
+differ from C, Go, Java, etc. This specification requires that
+modulo use truncated division (where -13 % 3 == -1) as implemented
+in C, Go, etc.:
+
+ a % n = a - n * trunc(a / n)
+
Byte swap instructions
~~~~~~~~~~~~~~~~~~~~~~
--
2.33.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 08/15] ebpf-docs: Use consistent names for the same field
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
` (5 preceding siblings ...)
2022-09-27 18:59 ` [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero, overflow, and underflow dthaler1968
@ 2022-09-27 18:59 ` dthaler1968
2022-09-30 20:57 ` Alexei Starovoitov
2022-09-27 18:59 ` [PATCH 09/15] ebpf-docs: Explain helper functions dthaler1968
` (7 subsequent siblings)
14 siblings, 1 reply; 50+ messages in thread
From: dthaler1968 @ 2022-09-27 18:59 UTC (permalink / raw)
To: bpf; +Cc: Dave Thaler
From: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
---
Documentation/bpf/instruction-set.rst | 107 ++++++++++++++++++--------
1 file changed, 76 insertions(+), 31 deletions(-)
diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index 3c5a63612..2987234eb 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -34,20 +34,59 @@ Instruction encoding
eBPF has two instruction encodings:
* the basic instruction encoding, which uses 64 bits to encode an instruction
-* the wide instruction encoding, which appends a second 64-bit immediate value
- (imm64) after the basic instruction for a total of 128 bits.
+* the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
+ constant) value after the basic instruction for a total of 128 bits.
-The basic instruction encoding looks as follows:
+The basic instruction encoding is as follows, where MSB and LSB mean the most significant
+bits and least significant bits, respectively:
============= ======= =============== ==================== ============
32 bits (MSB) 16 bits 4 bits 4 bits 8 bits (LSB)
============= ======= =============== ==================== ============
-immediate offset source register destination register opcode
+imm offset src dst opcode
============= ======= =============== ==================== ============
+imm
+ signed integer immediate value
+
+offset
+ signed integer offset used with pointer arithmetic
+
+src
+ the source register number (0-10), except where otherwise specified
+ (`64-bit immediate instructions`_ reuse this field for other purposes)
+
+dst
+ destination register number (0-10)
+
+opcode
+ operation to perform
+
Note that most instructions do not use all of the fields.
Unused fields shall be cleared to zero.
+As discussed below in `64-bit immediate instructions`_, some
+instructions use a 64-bit immediate value that is constructed as follows.
+The 64 bits following the basic instruction contain a pseudo instruction
+using the same format but with opcode, dst, src, and offset all set to zero,
+and imm containing the high 32 bits of the immediate value.
+
+================= ==================
+64 bits (MSB) 64 bits (LSB)
+================= ==================
+basic instruction pseudo instruction
+================= ==================
+
+Thus the 64-bit immediate value is constructed as follows:
+
+ imm64 = imm + (next_imm << 32)
+
+where 'next_imm' refers to the imm value of the pseudo instruction
+following the basic instruction.
+
+In the remainder of this document 'src' and 'dst' refer to the values of the source
+and destination registers, respectively, rather than the register number.
+
Instruction classes
-------------------
@@ -75,20 +114,24 @@ For arithmetic and jump instructions (``BPF_ALU``, ``BPF_ALU64``, ``BPF_JMP`` an
============== ====== =================
4 bits (MSB) 1 bit 3 bits (LSB)
============== ====== =================
-operation code source instruction class
+code source instruction class
============== ====== =================
-The 4th bit encodes the source operand:
+code
+ the operation code, whose meaning varies by instruction class
- ====== ===== ========================================
- source value description
- ====== ===== ========================================
- BPF_K 0x00 use 32-bit immediate as source operand
- BPF_X 0x08 use 'src_reg' register as source operand
- ====== ===== ========================================
+source
+ the source operand location, which unless otherwise specified is one of:
-The four MSB bits store the operation code.
+ ====== ===== ==========================================
+ source value description
+ ====== ===== ==========================================
+ BPF_K 0x00 use 32-bit 'imm' value as source operand
+ BPF_X 0x08 use 'src' register value as source operand
+ ====== ===== ==========================================
+instruction class
+ the instruction class (see `Instruction classes`_)
Arithmetic instructions
-----------------------
@@ -116,6 +159,8 @@ BPF_ARSH 0xc0 sign extending shift right
BPF_END 0xd0 byte swap operations (see `Byte swap instructions`_ below)
======== ===== ==========================================================
+where 'src' is the source operand value.
+
Underflow and overflow are allowed during arithmetic operations,
meaning the 64-bit or 32-bit value will wrap. If
eBPF program execution would result in division by zero,
@@ -125,21 +170,21 @@ the destination register is instead left unchanged.
``BPF_ADD | BPF_X | BPF_ALU`` means::
- dst_reg = (uint32_t) dst_reg + (uint32_t) src_reg;
+ dst = (uint32_t) (dst + src)
where '(uint32_t)' indicates truncation to 32 bits.
``BPF_ADD | BPF_X | BPF_ALU64`` means::
- dst_reg = dst_reg + src_reg
+ dst = dst + src
``BPF_XOR | BPF_K | BPF_ALU`` means::
- src_reg = (uint32_t) src_reg ^ (uint32_t) imm32
+ src = (uint32_t) src ^ (uint32_t) imm
``BPF_XOR | BPF_K | BPF_ALU64`` means::
- src_reg = src_reg ^ imm32
+ src = src ^ imm
Also note that the modulo operation often varies by language
@@ -176,11 +221,11 @@ Examples:
``BPF_ALU | BPF_TO_LE | BPF_END`` with imm = 16 means::
- dst_reg = htole16(dst_reg)
+ dst = htole16(dst)
``BPF_ALU | BPF_TO_BE | BPF_END`` with imm = 64 means::
- dst_reg = htobe64(dst_reg)
+ dst = htobe64(dst)
Jump instructions
-----------------
@@ -255,15 +300,15 @@ instructions that transfer data between a register and memory.
``BPF_MEM | <size> | BPF_STX`` means::
- *(size *) (dst_reg + off) = src_reg
+ *(size *) (dst + offset) = src_reg
``BPF_MEM | <size> | BPF_ST`` means::
- *(size *) (dst_reg + off) = imm32
+ *(size *) (dst + offset) = imm32
``BPF_MEM | <size> | BPF_LDX`` means::
- dst_reg = *(size *) (src_reg + off)
+ dst = *(size *) (src + offset)
Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.
@@ -297,11 +342,11 @@ BPF_XOR 0xa0 atomic xor
``BPF_ATOMIC | BPF_W | BPF_STX`` with 'imm' = BPF_ADD means::
- *(uint32_t *)(dst_reg + off16) += src_reg
+ *(uint32_t *)(dst + offset) += src
``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF ADD means::
- *(uint32_t *)(dst_reg + off16) += src_reg
+ *(uint64_t *)(dst + offset) += src
In addition to the simple atomic operations, there also is a modifier and
two complex atomic operations:
@@ -316,16 +361,16 @@ BPF_CMPXCHG 0xf0 | BPF_FETCH atomic compare and exchange
The ``BPF_FETCH`` modifier is optional for simple atomic operations, and
always set for the complex atomic operations. If the ``BPF_FETCH`` flag
-is set, then the operation also overwrites ``src_reg`` with the value that
+is set, then the operation also overwrites ``src`` with the value that
was in memory before it was modified.
-The ``BPF_XCHG`` operation atomically exchanges ``src_reg`` with the value
-addressed by ``dst_reg + off``.
+The ``BPF_XCHG`` operation atomically exchanges ``src`` with the value
+addressed by ``dst + offset``.
The ``BPF_CMPXCHG`` operation atomically compares the value addressed by
-``dst_reg + off`` with ``R0``. If they match, the value addressed by
-``dst_reg + off`` is replaced with ``src_reg``. In either case, the
-value that was at ``dst_reg + off`` before the operation is zero-extended
+``dst + offset`` with ``R0``. If they match, the value addressed by
+``dst + offset`` is replaced with ``src``. In either case, the
+value that was at ``dst + offset`` before the operation is zero-extended
and loaded back to ``R0``.
64-bit immediate instructions
@@ -338,7 +383,7 @@ There is currently only one such instruction.
``BPF_LD | BPF_DW | BPF_IMM`` means::
- dst_reg = imm64
+ dst = imm64
Legacy BPF Packet access instructions
--
2.33.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 09/15] ebpf-docs: Explain helper functions
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
` (6 preceding siblings ...)
2022-09-27 18:59 ` [PATCH 08/15] ebpf-docs: Use consistent names for the same field dthaler1968
@ 2022-09-27 18:59 ` dthaler1968
2022-09-30 22:01 ` Alexei Starovoitov
2022-09-27 18:59 ` [PATCH 10/15] ebpf-docs: Add appendix of all opcodes in order dthaler1968
` (6 subsequent siblings)
14 siblings, 1 reply; 50+ messages in thread
From: dthaler1968 @ 2022-09-27 18:59 UTC (permalink / raw)
To: bpf; +Cc: Dave Thaler
From: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
---
Documentation/bpf/instruction-set.rst | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index 2987234eb..926957830 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -245,7 +245,7 @@ BPF_JSET 0x40 PC += off if dst & src
BPF_JNE 0x50 PC += off if dst != src
BPF_JSGT 0x60 PC += off if dst > src signed
BPF_JSGE 0x70 PC += off if dst >= src signed
-BPF_CALL 0x80 function call
+BPF_CALL 0x80 function call see `Helper functions`_
BPF_EXIT 0x90 function / program return BPF_JMP only
BPF_JLT 0xa0 PC += off if dst < src unsigned
BPF_JLE 0xb0 PC += off if dst <= src unsigned
@@ -256,6 +256,22 @@ BPF_JSLE 0xd0 PC += off if dst <= src signed
The eBPF program needs to store the return value into register R0 before doing a
BPF_EXIT.
+Helper functions
+~~~~~~~~~~~~~~~~
+Helper functions are a concept whereby BPF programs can call into a
+set of function calls exposed by the eBPF runtime. Each helper
+function is identified by an integer used in a ``BPF_CALL`` instruction.
+The available helper functions may differ for each eBPF program type.
+
+Conceptually, each helper function is implemented with a commonly shared function
+signature defined as:
+
+ uint64_t function(uint64_t r1, uint64_t r2, uint64_t r3, uint64_t r4, uint64_t r5)
+
+In actuality, each helper function is defined as taking between 0 and 5 arguments,
+with the remaining registers being ignored. The definition of a helper function
+is responsible for specifying the type (e.g., integer, pointer, etc.) of the value returned,
+the number of arguments, and the type of each argument.
Load and store instructions
===========================
--
2.33.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 10/15] ebpf-docs: Add appendix of all opcodes in order
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
` (7 preceding siblings ...)
2022-09-27 18:59 ` [PATCH 09/15] ebpf-docs: Explain helper functions dthaler1968
@ 2022-09-27 18:59 ` dthaler1968
2022-09-30 22:02 ` Alexei Starovoitov
2022-09-27 18:59 ` [PATCH 11/15] ebpf-docs: Improve English readability dthaler1968
` (5 subsequent siblings)
14 siblings, 1 reply; 50+ messages in thread
From: dthaler1968 @ 2022-09-27 18:59 UTC (permalink / raw)
To: bpf; +Cc: Dave Thaler
From: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
---
Documentation/bpf/instruction-set.rst | 197 ++++++++++++++++++++++++++
1 file changed, 197 insertions(+)
diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index 926957830..b6f098104 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -408,3 +408,200 @@ Legacy BPF Packet access instructions
eBPF previously introduced special instructions for access to packet data that were
carried over from classic BPF. However, these instructions are
deprecated and should no longer be used.
+
+Appendix
+========
+
+For reference, the following table lists opcodes in order by value.
+
+====== === ==== =================================================== ========================================
+opcode src imm description reference
+====== === ==== =================================================== ========================================
+0x00 0x0 any (additional immediate value) `64-bit immediate instructions`_
+0x04 0x0 any dst = (uint32_t)(dst + imm) `Arithmetic instructions`_
+0x05 0x0 0x00 goto +offset `Jump instructions`_
+0x07 0x0 any dst += imm `Arithmetic instructions`_
+0x0c any 0x00 dst = (uint32_t)(dst + src) `Arithmetic instructions`_
+0x0f any 0x00 dst += src `Arithmetic instructions`_
+0x14 0x0 any dst = (uint32_t)(dst - imm) `Arithmetic instructions`_
+0x15 0x0 any if dst == imm goto +offset `Jump instructions`_
+0x16 0x0 any if (uint32_t)dst == imm goto +offset `Jump instructions`_
+0x17 0x0 any dst -= imm `Arithmetic instructions`_
+0x18 0x0 any dst = imm64 `64-bit immediate instructions`_
+0x1c any 0x00 dst = (uint32_t)(dst - src) `Arithmetic instructions`_
+0x1d any 0x00 if dst == src goto +offset `Jump instructions`_
+0x1e any 0x00 if (uint32_t)dst == (uint32_t)src goto +offset `Jump instructions`_
+0x1f any 0x00 dst -= src `Arithmetic instructions`_
+0x20 any any (deprecated, implementation-specific) `Legacy BPF Packet access instructions`_
+0x24 0x0 any dst = (uint32_t)(dst \* imm) `Arithmetic instructions`_
+0x25 0x0 any if dst > imm goto +offset `Jump instructions`_
+0x26 0x0 any if (uint32_t)dst > imm goto +offset `Jump instructions`_
+0x27 0x0 any dst \*= imm `Arithmetic instructions`_
+0x28 any any (deprecated, implementation-specific) `Legacy BPF Packet access instructions`_
+0x2c any 0x00 dst = (uint32_t)(dst \* src) `Arithmetic instructions`_
+0x2d any 0x00 if dst > src goto +offset `Jump instructions`_
+0x2e any 0x00 if (uint32_t)dst > (uint32_t)src goto +offset `Jump instructions`_
+0x2f any 0x00 dst \*= src `Arithmetic instructions`_
+0x30 any any (deprecated, implementation-specific) `Legacy BPF Packet access instructions`_
+0x34 0x0 any dst = (uint32_t)((imm != 0) ? (dst / imm) : 0) `Arithmetic instructions`_
+0x35 0x0 any if dst >= imm goto +offset `Jump instructions`_
+0x36 0x0 any if (uint32_t)dst >= imm goto +offset `Jump instructions`_
+0x37 0x0 any dst = (imm != 0) ? (dst / imm) : 0 `Arithmetic instructions`_
+0x38 any any (deprecated, implementation-specific) `Legacy BPF Packet access instructions`_
+0x3c any 0x00 dst = (uint32_t)((imm != 0) ? (dst / src) : 0) `Arithmetic instructions`_
+0x3d any 0x00 if dst >= src goto +offset `Jump instructions`_
+0x3e any 0x00 if (uint32_t)dst >= (uint32_t)src goto +offset `Jump instructions`_
+0x3f any 0x00 dst = (src !+ 0) ? (dst / src) : 0 `Arithmetic instructions`_
+0x40 any any (deprecated, implementation-specific) `Legacy BPF Packet access instructions`_
+0x44 0x0 any dst = (uint32_t)(dst \| imm) `Arithmetic instructions`_
+0x45 0x0 any if dst & imm goto +offset `Jump instructions`_
+0x46 0x0 any if (uint32_t)dst & imm goto +offset `Jump instructions`_
+0x47 0x0 any dst \|= imm `Arithmetic instructions`_
+0x48 any any (deprecated, implementation-specific) `Legacy BPF Packet access instructions`_
+0x4c any 0x00 dst = (uint32_t)(dst \| src) `Arithmetic instructions`_
+0x4d any 0x00 if dst & src goto +offset `Jump instructions`_
+0x4e any 0x00 if (uint32_t)dst & (uint32_t)src goto +offset `Jump instructions`_
+0x4f any 0x00 dst \|= src `Arithmetic instructions`_
+0x50 any any (deprecated, implementation-specific) `Legacy BPF Packet access instructions`_
+0x54 0x0 any dst = (uint32_t)(dst & imm) `Arithmetic instructions`_
+0x55 0x0 any if dst != imm goto +offset `Jump instructions`_
+0x56 0x0 any if (uint32_t)dst != imm goto +offset `Jump instructions`_
+0x57 0x0 any dst &= imm `Arithmetic instructions`_
+0x58 any any (deprecated, implementation-specific) `Legacy BPF Packet access instructions`_
+0x5c any 0x00 dst = (uint32_t)(dst & src) `Arithmetic instructions`_
+0x5d any 0x00 if dst != src goto +offset `Jump instructions`_
+0x5e any 0x00 if (uint32_t)dst != (uint32_t)src goto +offset `Jump instructions`_
+0x5f any 0x00 dst &= src `Arithmetic instructions`_
+0x61 any 0x00 dst = \*(uint32_t \*)(src + offset) `Load and store instructions`_
+0x62 0x0 any \*(uint32_t \*)(dst + offset) = imm `Load and store instructions`_
+0x63 any 0x00 \*(uint32_t \*)(dst + offset) = src `Load and store instructions`_
+0x64 0x0 any dst = (uint32_t)(dst << imm) `Arithmetic instructions`_
+0x65 0x0 any if dst s> imm goto +offset `Jump instructions`_
+0x66 0x0 any if (int32_t)dst s> (int32_t)imm goto +offset `Jump instructions`_
+0x67 0x0 any dst <<= imm `Arithmetic instructions`_
+0x69 any 0x00 dst = \*(uint16_t \*)(src + offset) `Load and store instructions`_
+0x6a 0x0 any \*(uint16_t \*)(dst + offset) = imm `Load and store instructions`_
+0x6b any 0x00 \*(uint16_t \*)(dst + offset) = src `Load and store instructions`_
+0x6c any 0x00 dst = (uint32_t)(dst << src) `Arithmetic instructions`_
+0x6d any 0x00 if dst s> src goto +offset `Jump instructions`_
+0x6e any 0x00 if (int32_t)dst s> (int32_t)src goto +offset `Jump instructions`_
+0x6f any 0x00 dst <<= src `Arithmetic instructions`_
+0x71 any 0x00 dst = \*(uint8_t \*)(src + offset) `Load and store instructions`_
+0x72 0x0 any \*(uint8_t \*)(dst + offset) = imm `Load and store instructions`_
+0x73 any 0x00 \*(uint8_t \*)(dst + offset) = src `Load and store instructions`_
+0x74 0x0 any dst = (uint32_t)(dst >> imm) `Arithmetic instructions`_
+0x75 0x0 any if dst s>= imm goto +offset `Jump instructions`_
+0x76 0x0 any if (int32_t)dst s>= (int32_t)imm goto +offset `Jump instructions`_
+0x77 0x0 any dst >>= imm `Arithmetic instructions`_
+0x79 any 0x00 dst = \*(uint64_t \*)(src + offset) `Load and store instructions`_
+0x7a 0x0 any \*(uint64_t \*)(dst + offset) = imm `Load and store instructions`_
+0x7b any 0x00 \*(uint64_t \*)(dst + offset) = src `Load and store instructions`_
+0x7c any 0x00 dst = (uint32_t)(dst >> src) `Arithmetic instructions`_
+0x7d any 0x00 if dst s>= src goto +offset `Jump instructions`_
+0x7e any 0x00 if (int32_t)dst s>= (int32_t)src goto +offset `Jump instructions`_
+0x7f any 0x00 dst >>= src `Arithmetic instructions`_
+0x84 0x0 0x00 dst = (uint32_t)-dst `Arithmetic instructions`_
+0x85 0x0 any call helper function imm `Helper functions`_
+0x87 0x0 0x00 dst = -dst `Arithmetic instructions`_
+0x94 0x0 any dst = (uint32_t)((imm != 0) ? (dst % imm) : dst) `Arithmetic instructions`_
+0x95 0x0 0x00 return `Jump instructions`_
+0x97 0x0 any dst = (imm != 0) ? (dst % imm) : dst `Arithmetic instructions`_
+0x9c any 0x00 dst = (uint32_t)((src != 0) ? (dst % src) : dst) `Arithmetic instructions`_
+0x9f any 0x00 dst = (src != 0) ? (dst % src) : dst `Arithmetic instructions`_
+0xa4 0x0 any dst = (uint32_t)(dst ^ imm) `Arithmetic instructions`_
+0xa5 0x0 any if dst < imm goto +offset `Jump instructions`_
+0xa6 0x0 any if (uint32_t)dst < imm goto +offset `Jump instructions`_
+0xa7 0x0 any dst ^= imm `Arithmetic instructions`_
+0xac any 0x00 dst = (uint32_t)(dst ^ src) `Arithmetic instructions`_
+0xad any 0x00 if dst < src goto +offset `Jump instructions`_
+0xae any 0x00 if (uint32_t)dst < (uint32_t)src goto +offset `Jump instructions`_
+0xaf any 0x00 dst ^= src `Arithmetic instructions`_
+0xb4 0x0 any dst = (uint32_t) imm `Arithmetic instructions`_
+0xb5 0x0 any if dst <= imm goto +offset `Jump instructions`_
+0xa6 0x0 any if (uint32_t)dst <= imm goto +offset `Jump instructions`_
+0xb7 0x0 any dst = imm `Arithmetic instructions`_
+0xbc any 0x00 dst = (uint32_t) src `Arithmetic instructions`_
+0xbd any 0x00 if dst <= src goto +offset `Jump instructions`_
+0xbe any 0x00 if (uint32_t)dst <= (uint32_t)src goto +offset `Jump instructions`_
+0xbf any 0x00 dst = src `Arithmetic instructions`_
+0xc3 any 0x00 lock \*(uint32_t \*)(dst + offset) += src `Atomic operations`_
+0xc3 any 0x01 lock:: `Atomic operations`_
+
+ *(uint32_t *)(dst + offset) += src
+ src = *(uint32_t *)(dst + offset)
+0xc3 any 0x40 \*(uint32_t \*)(dst + offset) \|= src `Atomic operations`_
+0xc3 any 0x41 lock:: `Atomic operations`_
+
+ *(uint32_t *)(dst + offset) |= src
+ src = *(uint32_t *)(dst + offset)
+0xc3 any 0x50 \*(uint32_t \*)(dst + offset) &= src `Atomic operations`_
+0xc3 any 0x51 lock:: `Atomic operations`_
+
+ *(uint32_t *)(dst + offset) &= src
+ src = *(uint32_t *)(dst + offset)
+0xc3 any 0xa0 \*(uint32_t \*)(dst + offset) ^= src `Atomic operations`_
+0xc3 any 0xa1 lock:: `Atomic operations`_
+
+ *(uint32_t *)(dst + offset) ^= src
+ src = *(uint32_t *)(dst + offset)
+0xc3 any 0xe1 lock:: `Atomic operations`_
+
+ temp = *(uint32_t *)(dst + offset)
+ *(uint32_t *)(dst + offset) = src
+ src = temp
+0xc3 any 0xf1 lock:: `Atomic operations`_
+
+ temp = *(uint32_t *)(dst + offset)
+ if *(uint32_t)(dst + offset) == R0
+ *(uint32_t)(dst + offset) = src
+ R0 = temp
+0xc4 0x0 any dst = (uint32_t)(dst s>> imm) `Arithmetic instructions`_
+0xc5 0x0 any if dst s< imm goto +offset `Jump instructions`_
+0xc6 0x0 any if (int32_t)dst s< (int32_t)imm goto +offset `Jump instructions`_
+0xc7 0x0 any dst s>>= imm `Arithmetic instructions`_
+0xcc any 0x00 dst = (uint32_t)(dst s>> src) `Arithmetic instructions`_
+0xcd any 0x00 if dst s< src goto +offset `Jump instructions`_
+0xce any 0x00 if (int32_t)dst s< (int32_t)src goto +offset `Jump instructions`_
+0xcf any 0x00 dst s>>= src `Arithmetic instructions`_
+0xd4 0x0 0x10 dst = htole16(dst) `Byte swap instructions`_
+0xd4 0x0 0x20 dst = htole32(dst) `Byte swap instructions`_
+0xd4 0x0 0x40 dst = htole64(dst) `Byte swap instructions`_
+0xd5 0x0 any if dst s<= imm goto +offset `Jump instructions`_
+0xd6 0x0 any if (int32_t)dst s<= (int32_t)imm goto +offset `Jump instructions`_
+0xdb any 0x00 lock \*(uint64_t \*)(dst + offset) += src `Atomic operations`_
+0xdb any 0x01 lock:: `Atomic operations`_
+
+ *(uint64_t *)(dst + offset) += src
+ src = *(uint64_t *)(dst + offset)
+0xdb any 0x40 \*(uint64_t \*)(dst + offset) \|= src `Atomic operations`_
+0xdb any 0x41 lock:: `Atomic operations`_
+
+ *(uint64_t *)(dst + offset) |= src
+ lock src = *(uint64_t *)(dst + offset)
+0xdb any 0x50 \*(uint64_t \*)(dst + offset) &= src `Atomic operations`_
+0xdb any 0x51 lock:: `Atomic operations`_
+
+ *(uint64_t *)(dst + offset) &= src
+ src = *(uint64_t *)(dst + offset)
+0xdb any 0xa0 \*(uint64_t \*)(dst + offset) ^= src `Atomic operations`_
+0xdb any 0xa1 lock:: `Atomic operations`_
+
+ *(uint64_t *)(dst + offset) ^= src
+ src = *(uint64_t *)(dst + offset)
+0xdb any 0xe1 lock:: `Atomic operations`_
+
+ temp = *(uint64_t *)(dst + offset)
+ *(uint64_t *)(dst + offset) = src
+ src = temp
+0xdb any 0xf1 lock:: `Atomic operations`_
+
+ temp = *(uint64_t *)(dst + offset)
+ if *(uint64_t)(dst + offset) == R0
+ *(uint64_t)(dst + offset) = src
+ R0 = temp
+0xdc 0x0 0x10 dst = htobe16(dst) `Byte swap instructions`_
+0xdc 0x0 0x20 dst = htobe32(dst) `Byte swap instructions`_
+0xdc 0x0 0x40 dst = htobe64(dst) `Byte swap instructions`_
+0xdd any 0x00 if dst s<= src goto +offset `Jump instructions`_
+0xde any 0x00 if (int32_t)dst s<= (int32_t)src goto +offset `Jump instructions`_
+====== === ==== =================================================== ========================================
--
2.33.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 11/15] ebpf-docs: Improve English readability
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
` (8 preceding siblings ...)
2022-09-27 18:59 ` [PATCH 10/15] ebpf-docs: Add appendix of all opcodes in order dthaler1968
@ 2022-09-27 18:59 ` dthaler1968
2022-09-30 22:16 ` Alexei Starovoitov
2022-09-27 18:59 ` [PATCH 12/15] ebpf-docs: Add Linux note about register calling convention dthaler1968
` (4 subsequent siblings)
14 siblings, 1 reply; 50+ messages in thread
From: dthaler1968 @ 2022-09-27 18:59 UTC (permalink / raw)
To: bpf; +Cc: Dave Thaler
From: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
---
Documentation/bpf/instruction-set.rst | 137 ++++++++++++++++----------
1 file changed, 87 insertions(+), 50 deletions(-)
diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index b6f098104..328207ff6 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -7,6 +7,9 @@ eBPF Instruction Set Specification, v1.0
This document specifies version 1.0 of the eBPF instruction set.
+The eBPF instruction set consists of eleven 64 bit registers, a program counter,
+and 512 bytes of stack space.
+
Documentation conventions
=========================
@@ -25,12 +28,24 @@ The eBPF calling convention is defined as:
* R6 - R9: callee saved registers that function calls will preserve
* R10: read-only frame pointer to access stack
-R0 - R5 are scratch registers and eBPF programs needs to spill/fill them if
-necessary across calls.
+Registers R0 - R5 are scratch registers, meaning the BPF program needs to either
+spill them to the BPF stack or move them to callee saved registers if these
+arguments are to be reused across multiple function calls. Spilling means
+that the value in the register is moved to the BPF stack. The reverse operation
+of moving the variable from the BPF stack to the register is called filling.
+The reason for spilling/filling is due to the limited number of registers.
+
+Upon entering execution of an eBPF program, registers R1 - R5 initially can contain
+the input arguments for the program (similar to the argc/argv pair for a typical C program).
+The actual number of registers used, and their meaning, is defined by the program type;
+for example, a networking program might have an argument that includes network packet data
+and/or metadata.
Instruction encoding
====================
+An eBPF program is a sequence of instructions.
+
eBPF has two instruction encodings:
* the basic instruction encoding, which uses 64 bits to encode an instruction
@@ -63,7 +78,7 @@ opcode
operation to perform
Note that most instructions do not use all of the fields.
-Unused fields shall be cleared to zero.
+Unused fields must be set to zero.
As discussed below in `64-bit immediate instructions`_, some
instructions use a 64-bit immediate value that is constructed as follows.
@@ -90,7 +105,9 @@ and destination registers, respectively, rather than the register number.
Instruction classes
-------------------
-The three LSB bits of the 'opcode' field store the instruction class:
+The encoding of the 'opcode' field varies and can be determined from
+the three least significant bits (LSB) of the 'opcode' field which holds
+the "instruction class", as follows:
========= ===== =============================== ===================================
class value description reference
@@ -136,9 +153,11 @@ instruction class
Arithmetic instructions
-----------------------
-``BPF_ALU`` uses 32-bit wide operands while ``BPF_ALU64`` uses 64-bit wide operands for
+Instruction class ``BPF_ALU`` uses 32-bit wide operands (zeroing the upper 32 bits
+of the destination register) while ``BPF_ALU64`` uses 64-bit wide operands for
otherwise identical operations.
-The 'code' field encodes the operation as below:
+
+The 4-bit 'code' field encodes the operation as follows:
======== ===== ==========================================================
code value description
@@ -168,21 +187,23 @@ the destination register is instead set to zero.
If execution would result in modulo by zero,
the destination register is instead left unchanged.
-``BPF_ADD | BPF_X | BPF_ALU`` means::
+Examples:
+
+``BPF_ADD | BPF_X | BPF_ALU`` (0x0c) means::
dst = (uint32_t) (dst + src)
where '(uint32_t)' indicates truncation to 32 bits.
-``BPF_ADD | BPF_X | BPF_ALU64`` means::
+``BPF_ADD | BPF_X | BPF_ALU64`` (0x0f) means::
dst = dst + src
-``BPF_XOR | BPF_K | BPF_ALU`` means::
+``BPF_XOR | BPF_K | BPF_ALU`` (0xa4) means::
src = (uint32_t) src ^ (uint32_t) imm
-``BPF_XOR | BPF_K | BPF_ALU64`` means::
+``BPF_XOR | BPF_K | BPF_ALU64`` (0xa7) means::
src = src ^ imm
@@ -204,8 +225,9 @@ The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit
The byte swap instructions operate on the destination register
only and do not use a separate source register or immediate value.
-The 1-bit source operand field in the opcode is used to to select what byte
-order the operation convert from or to:
+Byte swap instructions use non-default semantics of the 1-bit 'source' field in
+the 'opcode' field. Instead of indicating the source operator, it is instead
+used to select what byte order the operation converts from or to:
========= ===== =================================================
source value description
@@ -215,24 +237,33 @@ BPF_TO_BE 0x08 convert between host byte order and big endian
========= ===== =================================================
The 'imm' field encodes the width of the swap operations. The following widths
-are supported: 16, 32 and 64.
-
-Examples:
-
-``BPF_ALU | BPF_TO_LE | BPF_END`` with imm = 16 means::
-
- dst = htole16(dst)
-
-``BPF_ALU | BPF_TO_BE | BPF_END`` with imm = 64 means::
-
- dst = htobe64(dst)
+are supported: 16, 32 and 64. The following table summarizes the resulting
+possibilities:
+
+============================= ========= === ======== ==================
+opcode construction opcode imm mnemonic pseudocode
+============================= ========= === ======== ==================
+BPF_END | BPF_TO_LE | BPF_ALU 0xd4 16 le16 dst dst = htole16(dst)
+BPF_END | BPF_TO_LE | BPF_ALU 0xd4 32 le32 dst dst = htole32(dst)
+BPF_END | BPF_TO_LE | BPF_ALU 0xd4 64 le64 dst dst = htole64(dst)
+BPF_END | BPF_TO_BE | BPF_ALU 0xdc 16 be16 dst dst = htobe16(dst)
+BPF_END | BPF_TO_BE | BPF_ALU 0xdc 32 be32 dst dst = htobe32(dst)
+BPF_END | BPF_TO_BE | BPF_ALU 0xdc 64 be64 dst dst = htobe64(dst)
+============================= ========= === ======== ==================
+
+where
+
+* mnenomic indicates a short form that might be displayed by some tools such as disassemblers
+* 'htoleNN()' indicates converting a NN-bit value from host byte order to little-endian byte order
+* 'htobeNN()' indicates converting a NN-bit value from host byte order to big-endian byte order
Jump instructions
-----------------
-``BPF_JMP32`` uses 32-bit wide operands while ``BPF_JMP`` uses 64-bit wide operands for
+Instruction class ``BPF_JMP32`` uses 32-bit wide operands while ``BPF_JMP`` uses 64-bit wide operands for
otherwise identical operations.
-The 'code' field encodes the operation as below:
+
+The 4-bit 'code' field encodes the operation as below, where PC is the program counter:
======== ===== ========================= ============
code value description notes
@@ -253,9 +284,6 @@ BPF_JSLT 0xc0 PC += off if dst < src signed
BPF_JSLE 0xd0 PC += off if dst <= src signed
======== ===== ========================= ============
-The eBPF program needs to store the return value into register R0 before doing a
-BPF_EXIT.
-
Helper functions
~~~~~~~~~~~~~~~~
Helper functions are a concept whereby BPF programs can call into a
@@ -285,7 +313,8 @@ For load and store instructions (``BPF_LD``, ``BPF_LDX``, ``BPF_ST``, and ``BPF_
mode size instruction class
============ ====== =================
-The mode modifier is one of:
+mode
+ one of:
============= ===== ==================================== =============
mode modifier value description reference
@@ -297,7 +326,8 @@ The mode modifier is one of:
BPF_ATOMIC 0xc0 atomic operations `Atomic operations`_
============= ===== ==================================== =============
-The size modifier is one of:
+size
+ one of:
============= ===== =====================
size modifier value description
@@ -308,25 +338,31 @@ The size modifier is one of:
BPF_DW 0x18 double word (8 bytes)
============= ===== =====================
+instruction class
+ the instruction class (see `Instruction classes`_)
+
Regular load and store operations
---------------------------------
The ``BPF_MEM`` mode modifier is used to encode regular load and store
instructions that transfer data between a register and memory.
-``BPF_MEM | <size> | BPF_STX`` means::
-
- *(size *) (dst + offset) = src_reg
-
-``BPF_MEM | <size> | BPF_ST`` means::
-
- *(size *) (dst + offset) = imm32
-
-``BPF_MEM | <size> | BPF_LDX`` means::
-
- dst = *(size *) (src + offset)
-
-Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.
+============================= ========= ====================================
+opcode construction opcode pseudocode
+============================= ========= ====================================
+BPF_MEM | BPF_B | BPF_LDX 0x71 dst = \*(uint8_t \*) (src + offset)
+BPF_MEM | BPF_H | BPF_LDX 0x69 dst = \*(uint16_t \*) (src + offset)
+BPF_MEM | BPF_W | BPF_LDX 0x61 dst = \*(uint32_t \*) (src + offset)
+BPF_MEM | BPF_DW | BPF_LDX 0x79 dst = \*(uint64_t \*) (src + offset)
+BPF_MEM | BPF_B | BPF_ST 0x72 \*(uint8_t \*) (dst + offset) = imm
+BPF_MEM | BPF_H | BPF_ST 0x6a \*(uint16_t \*) (dst + offset) = imm
+BPF_MEM | BPF_W | BPF_ST 0x62 \*(uint32_t \*) (dst + offset) = imm
+BPF_MEM | BPF_DW | BPF_ST 0x7a \*(uint64_t \*) (dst + offset) = imm
+BPF_MEM | BPF_B | BPF_STX 0x73 \*(uint8_t \*) (dst + offset) = src
+BPF_MEM | BPF_H | BPF_STX 0x6b \*(uint16_t \*) (dst + offset) = src
+BPF_MEM | BPF_W | BPF_STX 0x63 \*(uint32_t \*) (dst + offset) = src
+BPF_MEM | BPF_DW | BPF_STX 0x7b \*(uint64_t \*) (dst + offset) = src
+============================= ========= ====================================
Atomic operations
-----------------
@@ -338,9 +374,11 @@ by other eBPF programs or means outside of this specification.
All atomic operations supported by eBPF are encoded as store operations
that use the ``BPF_ATOMIC`` mode modifier as follows:
-* ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations
-* ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations
-* 8-bit and 16-bit wide atomic operations are not supported.
+* ``BPF_ATOMIC | BPF_W | BPF_STX`` (0xc3) for 32-bit operations
+* ``BPF_ATOMIC | BPF_DW | BPF_STX`` (0xdb) for 64-bit operations
+
+Note that 8-bit (``BPF_B``) and 16-bit (``BPF_H``) wide atomic operations are not supported,
+nor is ``BPF_ATOMIC | <size> | BPF_ST``.
The 'imm' field is used to encode the actual atomic operation.
Simple atomic operation use a subset of the values defined to encode
@@ -355,16 +393,15 @@ BPF_AND 0x50 atomic and
BPF_XOR 0xa0 atomic xor
======== ===== ===========
-
-``BPF_ATOMIC | BPF_W | BPF_STX`` with 'imm' = BPF_ADD means::
+``BPF_ATOMIC | BPF_W | BPF_STX`` (0xc3) with 'imm' = BPF_ADD means::
*(uint32_t *)(dst + offset) += src
-``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF ADD means::
+``BPF_ATOMIC | BPF_DW | BPF_STX`` (0xdb) with 'imm' = BPF ADD means::
*(uint64_t *)(dst + offset) += src
-In addition to the simple atomic operations, there also is a modifier and
+In addition to the simple atomic operations above, there also is a modifier and
two complex atomic operations:
=========== ================ ===========================
--
2.33.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 12/15] ebpf-docs: Add Linux note about register calling convention
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
` (9 preceding siblings ...)
2022-09-27 18:59 ` [PATCH 11/15] ebpf-docs: Improve English readability dthaler1968
@ 2022-09-27 18:59 ` dthaler1968
2022-09-30 22:17 ` Alexei Starovoitov
2022-09-27 18:59 ` [PATCH 13/15] ebpf-docs: Add extended 64-bit immediate instructions dthaler1968
` (3 subsequent siblings)
14 siblings, 1 reply; 50+ messages in thread
From: dthaler1968 @ 2022-09-27 18:59 UTC (permalink / raw)
To: bpf; +Cc: Dave Thaler
From: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
---
Documentation/bpf/linux-notes.rst | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/Documentation/bpf/linux-notes.rst b/Documentation/bpf/linux-notes.rst
index 522ebe27d..0581ba326 100644
--- a/Documentation/bpf/linux-notes.rst
+++ b/Documentation/bpf/linux-notes.rst
@@ -7,6 +7,12 @@ Linux implementation notes
This document provides more details specific to the Linux kernel implementation of the eBPF instruction set.
+Registers and calling convention
+================================
+
+All program types only use R1 which contains the "context", which is typically a structure containing all
+the inputs needed, and the exit value for eBPF programs is passed as a 32 bit value.
+
Arithmetic instructions
=======================
--
2.33.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 13/15] ebpf-docs: Add extended 64-bit immediate instructions
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
` (10 preceding siblings ...)
2022-09-27 18:59 ` [PATCH 12/15] ebpf-docs: Add Linux note about register calling convention dthaler1968
@ 2022-09-27 18:59 ` dthaler1968
2022-09-27 18:59 ` [PATCH 14/15] ebpf-docs: Add extended call instructions dthaler1968
` (2 subsequent siblings)
14 siblings, 0 replies; 50+ messages in thread
From: dthaler1968 @ 2022-09-27 18:59 UTC (permalink / raw)
To: bpf; +Cc: Dave Thaler
From: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
---
Documentation/bpf/instruction-set.rst | 54 +++++++++++++++++++++++++--
Documentation/bpf/linux-notes.rst | 10 +++++
2 files changed, 60 insertions(+), 4 deletions(-)
diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index 328207ff6..667d97715 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -430,14 +430,54 @@ and loaded back to ``R0``.
-----------------------------
Instructions with the ``BPF_IMM`` 'mode' modifier use the wide instruction
-encoding for an extra imm64 value.
+encoding defined in `Instruction encoding`_, and use the 'src' field of the
+basic instruction to hold an opcode subtype.
+
+The following instructions are defined, and use additional concepts defined below:
+
+========================= ====== === ===================================== =========== ==============
+opcode construction opcode src pseudocode imm type dst type
+========================= ====== === ===================================== =========== ==============
+BPF_IMM | BPF_DW | BPF_LD 0x18 0x0 dst = imm64 integer integer
+BPF_IMM | BPF_DW | BPF_LD 0x18 0x1 dst = map_by_fd(imm) map fd map
+BPF_IMM | BPF_DW | BPF_LD 0x18 0x2 dst = mva(map_by_fd(imm)) + next_imm map fd data pointer
+BPF_IMM | BPF_DW | BPF_LD 0x18 0x3 dst = variable_addr(imm) variable id data pointer
+BPF_IMM | BPF_DW | BPF_LD 0x18 0x4 dst = code_addr(imm) integer code pointer
+BPF_IMM | BPF_DW | BPF_LD 0x18 0x5 dst = map_by_idx(imm) map index map
+BPF_IMM | BPF_DW | BPF_LD 0x18 0x6 dst = mva(map_by_idx(imm)) + next_imm map index data pointer
+========================= ====== === ===================================== =========== ==============
-There is currently only one such instruction.
+where
+
+* map_by_fd(fd) means to convert a 32-bit POSIX file descriptor into an address of a map object (see `Map objects`_)
+* map_by_index(index) means to convert a 32-bit index into an address of a map object
+* mva(map) gets the address of the first value in a given map object
+* variable_addr(id) gets the address of a variable (see `Variables`_) with a given id
+* code_addr(offset) gets the address of the instruction at a specified relative offset in units of 64-bit blocks
+* the 'imm type' can be used by disassemblers for display
+* the 'dst type' can be used for verification and JIT compilation purposes
+
+Map objects
+~~~~~~~~~~~
+
+Maps are shared memory regions accessible by eBPF programs on some platforms, where we use the term "map object"
+to refer to an object containing the data and metadata (e.g., size) about the memory region.
+A map can have various semantics as defined in a separate document, and may or may not have a single
+contiguous memory region, but the 'mva(map)' is currently only defined for maps that do have a single
+contiguous memory region. Support for maps is optional.
-``BPF_LD | BPF_DW | BPF_IMM`` means::
+Each map object can have a POSIX file descriptor (fd) if supported by the platform,
+where 'map_by_fd(fd)' means to get the map with the specified file descriptor.
+Each eBPF program can also be defined to use a set of maps associated with the program
+at load time, and 'map_by_index(index)' means to get the map with the given index in the set
+associated with the eBPF program containing the instruction.
- dst = imm64
+Variables
+~~~~~~~~~
+Variables are memory regions, identified by integer ids, accessible by eBPF programs on
+some platforms. The 'variable_addr(id)' operation means to get the address of the memory region
+identified by the given id. Support for such variables is optional.
Legacy BPF Packet access instructions
-------------------------------------
@@ -465,6 +505,12 @@ opcode src imm description referenc
0x16 0x0 any if (uint32_t)dst == imm goto +offset `Jump instructions`_
0x17 0x0 any dst -= imm `Arithmetic instructions`_
0x18 0x0 any dst = imm64 `64-bit immediate instructions`_
+0x18 0x1 any dst = map_by_fd(imm) `64-bit immediate instructions`_
+0x18 0x2 any dst = mva(map_by_fd(imm)) + next_imm `64-bit immediate instructions`_
+0x18 0x3 any dst = variable_addr(imm) `64-bit immediate instructions`_
+0x18 0x4 any dst = code_addr(imm) `64-bit immediate instructions`_
+0x18 0x5 any dst = map_by_idx(imm) `64-bit immediate instructions`_
+0x18 0x6 any dst = mva(map_by_idx(imm)) + next_imm `64-bit immediate instructions`_
0x1c any 0x00 dst = (uint32_t)(dst - src) `Arithmetic instructions`_
0x1d any 0x00 if dst == src goto +offset `Jump instructions`_
0x1e any 0x00 if (uint32_t)dst == (uint32_t)src goto +offset `Jump instructions`_
diff --git a/Documentation/bpf/linux-notes.rst b/Documentation/bpf/linux-notes.rst
index 0581ba326..e7f79242c 100644
--- a/Documentation/bpf/linux-notes.rst
+++ b/Documentation/bpf/linux-notes.rst
@@ -24,6 +24,16 @@ Byte swap instructions
``BPF_FROM_LE`` and ``BPF_FROM_BE`` exist as aliases for ``BPF_TO_LE`` and ``BPF_TO_BE`` respectively.
+Map objects
+===========
+
+Linux only supports the 'mva(map)' operation on array maps with a single element.
+
+Variables
+=========
+
+Linux uses BTF ids to identify variables.
+
Legacy BPF Packet access instructions
=====================================
--
2.33.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 14/15] ebpf-docs: Add extended call instructions
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
` (11 preceding siblings ...)
2022-09-27 18:59 ` [PATCH 13/15] ebpf-docs: Add extended 64-bit immediate instructions dthaler1968
@ 2022-09-27 18:59 ` dthaler1968
2022-09-27 18:59 ` [PATCH 15/15] ebpf-docs: Add note about invalid instruction dthaler1968
2022-09-30 20:50 ` [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file patchwork-bot+netdevbpf
14 siblings, 0 replies; 50+ messages in thread
From: dthaler1968 @ 2022-09-27 18:59 UTC (permalink / raw)
To: bpf; +Cc: Dave Thaler
From: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
---
Documentation/bpf/instruction-set.rst | 52 +++++++++++++++++----------
1 file changed, 34 insertions(+), 18 deletions(-)
diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index 667d97715..2ac8f0dae 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -265,24 +265,26 @@ otherwise identical operations.
The 4-bit 'code' field encodes the operation as below, where PC is the program counter:
-======== ===== ========================= ============
-code value description notes
-======== ===== ========================= ============
-BPF_JA 0x00 PC += off BPF_JMP only
-BPF_JEQ 0x10 PC += off if dst == src
-BPF_JGT 0x20 PC += off if dst > src unsigned
-BPF_JGE 0x30 PC += off if dst >= src unsigned
-BPF_JSET 0x40 PC += off if dst & src
-BPF_JNE 0x50 PC += off if dst != src
-BPF_JSGT 0x60 PC += off if dst > src signed
-BPF_JSGE 0x70 PC += off if dst >= src signed
-BPF_CALL 0x80 function call see `Helper functions`_
-BPF_EXIT 0x90 function / program return BPF_JMP only
-BPF_JLT 0xa0 PC += off if dst < src unsigned
-BPF_JLE 0xb0 PC += off if dst <= src unsigned
-BPF_JSLT 0xc0 PC += off if dst < src signed
-BPF_JSLE 0xd0 PC += off if dst <= src signed
-======== ===== ========================= ============
+======== ===== === ========================== ========================
+code value src description notes
+======== ===== === ========================== ========================
+BPF_JA 0x0 0x0 PC += offset BPF_JMP only
+BPF_JEQ 0x1 any PC += offset if dst == src
+BPF_JGT 0x2 any PC += offset if dst > src unsigned
+BPF_JGE 0x3 any PC += offset if dst >= src unsigned
+BPF_JSET 0x4 any PC += offset if dst & src
+BPF_JNE 0x5 any PC += offset if dst != src
+BPF_JSGT 0x6 any PC += offset if dst > src signed
+BPF_JSGE 0x7 any PC += offset if dst >= src signed
+BPF_CALL 0x8 0x0 call helper function imm see `Helper functions`_
+BPF_CALL 0x8 0x1 call PC += offset see `eBPF functions`_
+BPF_CALL 0x8 0x2 call runtime function imm see `Runtime functions`_
+BPF_EXIT 0x9 0x0 return BPF_JMP only
+BPF_JLT 0xa any PC += offset if dst < src unsigned
+BPF_JLE 0xb any PC += offset if dst <= src unsigned
+BPF_JSLT 0xc any PC += offset if dst < src signed
+BPF_JSLE 0xd any PC += offset if dst <= src signed
+======== ===== === ========================== ========================
Helper functions
~~~~~~~~~~~~~~~~
@@ -301,6 +303,18 @@ with the remaining registers being ignored. The definition of a helper function
is responsible for specifying the type (e.g., integer, pointer, etc.) of the value returned,
the number of arguments, and the type of each argument.
+Runtime functions
+~~~~~~~~~~~~~~~~~
+Runtime functions are like helper functions except that they are not specific
+to eBPF programs. They use a different numbering space from helper functions,
+but otherwise the same considerations apply.
+
+eBPF functions
+~~~~~~~~~~~~~~
+eBPF functions are functions exposed by the same eBPF program as the caller,
+and are referenced by offset from the call instruction, similar to ``BPF_JA``.
+A ``BPF_EXIT`` within the eBPF function will return to the caller.
+
Load and store instructions
===========================
@@ -585,6 +599,8 @@ opcode src imm description referenc
0x7f any 0x00 dst >>= src `Arithmetic instructions`_
0x84 0x0 0x00 dst = (uint32_t)-dst `Arithmetic instructions`_
0x85 0x0 any call helper function imm `Helper functions`_
+0x85 0x1 any call PC += offset `eBPF functions`_
+0x85 0x2 any call runtime function imm `Runtime functions`_
0x87 0x0 0x00 dst = -dst `Arithmetic instructions`_
0x94 0x0 any dst = (uint32_t)((imm != 0) ? (dst % imm) : dst) `Arithmetic instructions`_
0x95 0x0 0x00 return `Jump instructions`_
--
2.33.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 15/15] ebpf-docs: Add note about invalid instruction
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
` (12 preceding siblings ...)
2022-09-27 18:59 ` [PATCH 14/15] ebpf-docs: Add extended call instructions dthaler1968
@ 2022-09-27 18:59 ` dthaler1968
2022-09-30 22:21 ` Alexei Starovoitov
2022-09-30 20:50 ` [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file patchwork-bot+netdevbpf
14 siblings, 1 reply; 50+ messages in thread
From: dthaler1968 @ 2022-09-27 18:59 UTC (permalink / raw)
To: bpf; +Cc: Dave Thaler
From: Dave Thaler <dthaler@microsoft.com>
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
---
Documentation/bpf/clang-notes.rst | 5 +++++
Documentation/bpf/instruction-set.rst | 3 +++
2 files changed, 8 insertions(+)
diff --git a/Documentation/bpf/clang-notes.rst b/Documentation/bpf/clang-notes.rst
index 528feddf2..3c934421b 100644
--- a/Documentation/bpf/clang-notes.rst
+++ b/Documentation/bpf/clang-notes.rst
@@ -20,6 +20,11 @@ Arithmetic instructions
For CPU versions prior to 3, Clang v7.0 and later can enable ``BPF_ALU`` support with
``-Xclang -target-feature -Xclang +alu32``. In CPU version 3, support is automatically included.
+Invalid instructions
+====================
+
+Clang will generate the invalid ``BPF_CALL | BPF_X | BPF_JMP`` (0x8d) instruction if ``-O0`` is used.
+
Atomic operations
=================
diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index 2ac8f0dae..af9dc0cc6 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -303,6 +303,9 @@ with the remaining registers being ignored. The definition of a helper function
is responsible for specifying the type (e.g., integer, pointer, etc.) of the value returned,
the number of arguments, and the type of each argument.
+Note that ``BPF_CALL | BPF_X | BPF_JMP`` (0x8d), where the helper function integer
+would be read from a specified register, is not currently permitted.
+
Runtime functions
~~~~~~~~~~~~~~~~~
Runtime functions are like helper functions except that they are not specific
--
2.33.4
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCH 06/15] ebpf-docs: Use standard type convention in standard doc
2022-09-27 18:59 ` [PATCH 06/15] ebpf-docs: Use standard type convention in standard doc dthaler1968
@ 2022-09-30 20:49 ` Alexei Starovoitov
0 siblings, 0 replies; 50+ messages in thread
From: Alexei Starovoitov @ 2022-09-30 20:49 UTC (permalink / raw)
To: dthaler1968; +Cc: bpf, Dave Thaler
On Tue, Sep 27, 2022 at 06:59:49PM +0000, dthaler1968@googlemail.com wrote:
> From: Dave Thaler <dthaler@microsoft.com>
>
> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
Please always add commit log even it's just a copy paste of a subject line.
The patch subj should be 'bpf, docs:'.
I fixed it up while applying the first 5 patches.
> ---
> Documentation/bpf/instruction-set.rst | 14 ++++++++++----
> Documentation/bpf/linux-notes.rst | 6 ++++++
> 2 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
> index 4997d2088..a24bc5d53 100644
> --- a/Documentation/bpf/instruction-set.rst
> +++ b/Documentation/bpf/instruction-set.rst
> @@ -7,6 +7,10 @@ eBPF Instruction Set Specification, v1.0
>
> This document specifies version 1.0 of the eBPF instruction set.
>
> +Documentation conventions
> +=========================
> +
> +This specification uses the standard C types (uint32_t, etc.) in documentation.
>
> Registers and calling convention
> ================================
> @@ -114,7 +118,9 @@ BPF_END 0xd0 byte swap operations (see `Byte swap instructions`_ below)
>
> ``BPF_ADD | BPF_X | BPF_ALU`` means::
>
> - dst_reg = (u32) dst_reg + (u32) src_reg;
> + dst_reg = (uint32_t) dst_reg + (uint32_t) src_reg;
> +
> +where '(uint32_t)' indicates truncation to 32 bits.
This part I'm not excited about.
imo the "standard" types are too verbose. They make the doc harder to read.
I think it would be just fine for the standard to say in the conventions section
that u32 is a 32-bit unsigned integer and that's how it's used in the doc.
u32 would be a name that the doc uses. Name match with linux type is 'accidental'.
> ``BPF_ADD | BPF_X | BPF_ALU64`` means::
>
> @@ -122,7 +128,7 @@ BPF_END 0xd0 byte swap operations (see `Byte swap instructions`_ below)
>
> ``BPF_XOR | BPF_K | BPF_ALU`` means::
>
> - src_reg = (u32) src_reg ^ (u32) imm32
> + src_reg = (uint32_t) src_reg ^ (uint32_t) imm32
>
> ``BPF_XOR | BPF_K | BPF_ALU64`` means::
>
> @@ -276,11 +282,11 @@ BPF_XOR 0xa0 atomic xor
>
> ``BPF_ATOMIC | BPF_W | BPF_STX`` with 'imm' = BPF_ADD means::
>
> - *(u32 *)(dst_reg + off16) += src_reg
> + *(uint32_t *)(dst_reg + off16) += src_reg
>
> ``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF ADD means::
>
> - *(u64 *)(dst_reg + off16) += src_reg
> + *(uint32_t *)(dst_reg + off16) += src_reg
Bug.
And the later patch fixes it :(
Let's not add it in the first place.
> In addition to the simple atomic operations, there also is a modifier and
> two complex atomic operations:
> diff --git a/Documentation/bpf/linux-notes.rst b/Documentation/bpf/linux-notes.rst
> index 1c31379b4..522ebe27d 100644
> --- a/Documentation/bpf/linux-notes.rst
> +++ b/Documentation/bpf/linux-notes.rst
> @@ -7,6 +7,12 @@ Linux implementation notes
>
> This document provides more details specific to the Linux kernel implementation of the eBPF instruction set.
>
> +Arithmetic instructions
> +=======================
> +
> +While the eBPF instruction set document uses the standard C terminology as the cross-platform specification,
> +in the Linux kernel, uint32_t is expressed as u32, uint64_t is expressed as u64, etc.
> +
> Byte swap instructions
> ======================
>
> --
> 2.33.4
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
` (13 preceding siblings ...)
2022-09-27 18:59 ` [PATCH 15/15] ebpf-docs: Add note about invalid instruction dthaler1968
@ 2022-09-30 20:50 ` patchwork-bot+netdevbpf
14 siblings, 0 replies; 50+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-09-30 20:50 UTC (permalink / raw)
To: None; +Cc: bpf, dthaler
Hello:
This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:
On Tue, 27 Sep 2022 18:59:44 +0000 you wrote:
> From: Dave Thaler <dthaler@microsoft.com>
>
> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
> ---
> Documentation/bpf/instruction-set.rst | 38 ++--------------
> Documentation/bpf/linux-notes.rst | 65 +++++++++++++++++++++++++++
> 2 files changed, 68 insertions(+), 35 deletions(-)
> create mode 100644 Documentation/bpf/linux-notes.rst
Here is the summary with links:
- [01/15] ebpf-docs: Move legacy packet instructions to a separate file
https://git.kernel.org/bpf/bpf-next/c/6166da0a02cd
- [02/15] ebpf-docs: Linux byteswap note
https://git.kernel.org/bpf/bpf-next/c/9a0bf21337c6
- [03/15] ebpf-docs: Move Clang notes to a separate file
https://git.kernel.org/bpf/bpf-next/c/6c7aaffb24ef
- [04/15] ebpf-docs: Add Clang note about BPF_ALU
https://git.kernel.org/bpf/bpf-next/c/ee159bdbdbce
- [05/15] ebpf-docs: Add TOC and fix formatting
https://git.kernel.org/bpf/bpf-next/c/5a8921ba96ce
- [06/15] ebpf-docs: Use standard type convention in standard doc
(no matching commit)
- [07/15] ebpf-docs: Fix modulo zero, division by zero, overflow, and underflow
(no matching commit)
- [08/15] ebpf-docs: Use consistent names for the same field
(no matching commit)
- [09/15] ebpf-docs: Explain helper functions
(no matching commit)
- [10/15] ebpf-docs: Add appendix of all opcodes in order
(no matching commit)
- [11/15] ebpf-docs: Improve English readability
(no matching commit)
- [12/15] ebpf-docs: Add Linux note about register calling convention
(no matching commit)
- [13/15] ebpf-docs: Add extended 64-bit immediate instructions
(no matching commit)
- [14/15] ebpf-docs: Add extended call instructions
(no matching commit)
- [15/15] ebpf-docs: Add note about invalid instruction
(no matching commit)
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero, overflow, and underflow
2022-09-27 18:59 ` [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero, overflow, and underflow dthaler1968
@ 2022-09-30 20:52 ` Alexei Starovoitov
2022-09-30 21:54 ` Dave Thaler
0 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2022-09-30 20:52 UTC (permalink / raw)
To: dthaler1968; +Cc: bpf, Dave Thaler
On Tue, Sep 27, 2022 at 06:59:50PM +0000, dthaler1968@googlemail.com wrote:
> From: Dave Thaler <dthaler@microsoft.com>
>
> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
> ---
> Documentation/bpf/instruction-set.rst | 19 +++++++++++++++++--
> 1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
> index a24bc5d53..3c5a63612 100644
> --- a/Documentation/bpf/instruction-set.rst
> +++ b/Documentation/bpf/instruction-set.rst
> @@ -103,19 +103,26 @@ code value description
> BPF_ADD 0x00 dst += src
> BPF_SUB 0x10 dst -= src
> BPF_MUL 0x20 dst \*= src
> -BPF_DIV 0x30 dst /= src
> +BPF_DIV 0x30 dst = (src != 0) ? (dst / src) : 0
> BPF_OR 0x40 dst \|= src
> BPF_AND 0x50 dst &= src
> BPF_LSH 0x60 dst <<= src
> BPF_RSH 0x70 dst >>= src
> BPF_NEG 0x80 dst = ~src
> -BPF_MOD 0x90 dst %= src
> +BPF_MOD 0x90 dst = (src != 0) ? (dst % src) : dst
> BPF_XOR 0xa0 dst ^= src
> BPF_MOV 0xb0 dst = src
> BPF_ARSH 0xc0 sign extending shift right
> BPF_END 0xd0 byte swap operations (see `Byte swap instructions`_ below)
> ======== ===== ==========================================================
>
> +Underflow and overflow are allowed during arithmetic operations,
> +meaning the 64-bit or 32-bit value will wrap. If
> +eBPF program execution would result in division by zero,
> +the destination register is instead set to zero.
> +If execution would result in modulo by zero,
> +the destination register is instead left unchanged.
> +
> ``BPF_ADD | BPF_X | BPF_ALU`` means::
>
> dst_reg = (uint32_t) dst_reg + (uint32_t) src_reg;
> @@ -135,6 +142,14 @@ where '(uint32_t)' indicates truncation to 32 bits.
> src_reg = src_reg ^ imm32
>
>
> +Also note that the modulo operation often varies by language
> +when the dividend or divisor are negative, where Python, Ruby, etc.
> +differ from C, Go, Java, etc. This specification requires that
> +modulo use truncated division (where -13 % 3 == -1) as implemented
> +in C, Go, etc.:
> +
> + a % n = a - n * trunc(a / n)
> +
Interesting bit of info, but I'm not sure how it relates to the ISA doc.
I think it's more important to say that BPF ISA only supports unsigned div/mod.
There are no instructions for signed div/mod.
> Byte swap instructions
> ~~~~~~~~~~~~~~~~~~~~~~
>
> --
> 2.33.4
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 08/15] ebpf-docs: Use consistent names for the same field
2022-09-27 18:59 ` [PATCH 08/15] ebpf-docs: Use consistent names for the same field dthaler1968
@ 2022-09-30 20:57 ` Alexei Starovoitov
2022-10-04 14:44 ` Dave Thaler
0 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2022-09-30 20:57 UTC (permalink / raw)
To: dthaler1968; +Cc: bpf, Dave Thaler
On Tue, Sep 27, 2022 at 06:59:51PM +0000, dthaler1968@googlemail.com wrote:
> From: Dave Thaler <dthaler@microsoft.com>
>
> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
> ---
> Documentation/bpf/instruction-set.rst | 107 ++++++++++++++++++--------
> 1 file changed, 76 insertions(+), 31 deletions(-)
>
> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
> index 3c5a63612..2987234eb 100644
> --- a/Documentation/bpf/instruction-set.rst
> +++ b/Documentation/bpf/instruction-set.rst
> @@ -34,20 +34,59 @@ Instruction encoding
> eBPF has two instruction encodings:
>
> * the basic instruction encoding, which uses 64 bits to encode an instruction
> -* the wide instruction encoding, which appends a second 64-bit immediate value
> - (imm64) after the basic instruction for a total of 128 bits.
> +* the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
> + constant) value after the basic instruction for a total of 128 bits.
>
> -The basic instruction encoding looks as follows:
> +The basic instruction encoding is as follows, where MSB and LSB mean the most significant
> +bits and least significant bits, respectively:
>
> ============= ======= =============== ==================== ============
> 32 bits (MSB) 16 bits 4 bits 4 bits 8 bits (LSB)
> ============= ======= =============== ==================== ============
> -immediate offset source register destination register opcode
> +imm offset src dst opcode
> ============= ======= =============== ==================== ============
>
> +imm
> + signed integer immediate value
> +
> +offset
> + signed integer offset used with pointer arithmetic
> +
> +src
> + the source register number (0-10), except where otherwise specified
> + (`64-bit immediate instructions`_ reuse this field for other purposes)
There are more than one?
I guess we have such section now,
but in ISA it really is only one insn. LD_IMM64.
It's one insn for the interpreter and one insn for JITs.
> +
> +dst
> + destination register number (0-10)
> +
> +opcode
> + operation to perform
> +
> Note that most instructions do not use all of the fields.
> Unused fields shall be cleared to zero.
>
> +As discussed below in `64-bit immediate instructions`_, some
> +instructions use a 64-bit immediate value that is constructed as follows.
> +The 64 bits following the basic instruction contain a pseudo instruction
> +using the same format but with opcode, dst, src, and offset all set to zero,
> +and imm containing the high 32 bits of the immediate value.
'instructions' here and further reads a bit odd.
May be calling it one instruction where imm_lo/hi have different semantics
depending on src would be better?
> +
> +================= ==================
> +64 bits (MSB) 64 bits (LSB)
> +================= ==================
> +basic instruction pseudo instruction
> +================= ==================
> +
> +Thus the 64-bit immediate value is constructed as follows:
> +
> + imm64 = imm + (next_imm << 32)
> +
> +where 'next_imm' refers to the imm value of the pseudo instruction
> +following the basic instruction.
> +
> +In the remainder of this document 'src' and 'dst' refer to the values of the source
> +and destination registers, respectively, rather than the register number.
> +
> Instruction classes
> -------------------
>
> @@ -75,20 +114,24 @@ For arithmetic and jump instructions (``BPF_ALU``, ``BPF_ALU64``, ``BPF_JMP`` an
> ============== ====== =================
> 4 bits (MSB) 1 bit 3 bits (LSB)
> ============== ====== =================
> -operation code source instruction class
> +code source instruction class
> ============== ====== =================
>
> -The 4th bit encodes the source operand:
feels wrong to lose this part.
> +code
> + the operation code, whose meaning varies by instruction class
>
> - ====== ===== ========================================
> - source value description
> - ====== ===== ========================================
> - BPF_K 0x00 use 32-bit immediate as source operand
> - BPF_X 0x08 use 'src_reg' register as source operand
> - ====== ===== ========================================
> +source
> + the source operand location, which unless otherwise specified is one of:
>
> -The four MSB bits store the operation code.
same here.
> + ====== ===== ==========================================
> + source value description
> + ====== ===== ==========================================
> + BPF_K 0x00 use 32-bit 'imm' value as source operand
> + BPF_X 0x08 use 'src' register value as source operand
> + ====== ===== ==========================================
>
> +instruction class
> + the instruction class (see `Instruction classes`_)
>
> Arithmetic instructions
> -----------------------
> @@ -116,6 +159,8 @@ BPF_ARSH 0xc0 sign extending shift right
> BPF_END 0xd0 byte swap operations (see `Byte swap instructions`_ below)
> ======== ===== ==========================================================
>
> +where 'src' is the source operand value.
> +
> Underflow and overflow are allowed during arithmetic operations,
> meaning the 64-bit or 32-bit value will wrap. If
> eBPF program execution would result in division by zero,
> @@ -125,21 +170,21 @@ the destination register is instead left unchanged.
>
> ``BPF_ADD | BPF_X | BPF_ALU`` means::
>
> - dst_reg = (uint32_t) dst_reg + (uint32_t) src_reg;
> + dst = (uint32_t) (dst + src)
>
> where '(uint32_t)' indicates truncation to 32 bits.
>
> ``BPF_ADD | BPF_X | BPF_ALU64`` means::
>
> - dst_reg = dst_reg + src_reg
> + dst = dst + src
>
> ``BPF_XOR | BPF_K | BPF_ALU`` means::
>
> - src_reg = (uint32_t) src_reg ^ (uint32_t) imm32
> + src = (uint32_t) src ^ (uint32_t) imm
>
> ``BPF_XOR | BPF_K | BPF_ALU64`` means::
>
> - src_reg = src_reg ^ imm32
> + src = src ^ imm
>
>
> Also note that the modulo operation often varies by language
> @@ -176,11 +221,11 @@ Examples:
>
> ``BPF_ALU | BPF_TO_LE | BPF_END`` with imm = 16 means::
>
> - dst_reg = htole16(dst_reg)
> + dst = htole16(dst)
>
> ``BPF_ALU | BPF_TO_BE | BPF_END`` with imm = 64 means::
>
> - dst_reg = htobe64(dst_reg)
> + dst = htobe64(dst)
>
> Jump instructions
> -----------------
> @@ -255,15 +300,15 @@ instructions that transfer data between a register and memory.
>
> ``BPF_MEM | <size> | BPF_STX`` means::
>
> - *(size *) (dst_reg + off) = src_reg
> + *(size *) (dst + offset) = src_reg
>
> ``BPF_MEM | <size> | BPF_ST`` means::
>
> - *(size *) (dst_reg + off) = imm32
> + *(size *) (dst + offset) = imm32
>
> ``BPF_MEM | <size> | BPF_LDX`` means::
>
> - dst_reg = *(size *) (src_reg + off)
> + dst = *(size *) (src + offset)
>
> Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.
>
> @@ -297,11 +342,11 @@ BPF_XOR 0xa0 atomic xor
>
> ``BPF_ATOMIC | BPF_W | BPF_STX`` with 'imm' = BPF_ADD means::
>
> - *(uint32_t *)(dst_reg + off16) += src_reg
> + *(uint32_t *)(dst + offset) += src
>
> ``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF ADD means::
>
> - *(uint32_t *)(dst_reg + off16) += src_reg
> + *(uint64_t *)(dst + offset) += src
>
> In addition to the simple atomic operations, there also is a modifier and
> two complex atomic operations:
> @@ -316,16 +361,16 @@ BPF_CMPXCHG 0xf0 | BPF_FETCH atomic compare and exchange
>
> The ``BPF_FETCH`` modifier is optional for simple atomic operations, and
> always set for the complex atomic operations. If the ``BPF_FETCH`` flag
> -is set, then the operation also overwrites ``src_reg`` with the value that
> +is set, then the operation also overwrites ``src`` with the value that
> was in memory before it was modified.
>
> -The ``BPF_XCHG`` operation atomically exchanges ``src_reg`` with the value
> -addressed by ``dst_reg + off``.
> +The ``BPF_XCHG`` operation atomically exchanges ``src`` with the value
> +addressed by ``dst + offset``.
>
> The ``BPF_CMPXCHG`` operation atomically compares the value addressed by
> -``dst_reg + off`` with ``R0``. If they match, the value addressed by
> -``dst_reg + off`` is replaced with ``src_reg``. In either case, the
> -value that was at ``dst_reg + off`` before the operation is zero-extended
> +``dst + offset`` with ``R0``. If they match, the value addressed by
> +``dst + offset`` is replaced with ``src``. In either case, the
> +value that was at ``dst + offset`` before the operation is zero-extended
> and loaded back to ``R0``.
>
> 64-bit immediate instructions
> @@ -338,7 +383,7 @@ There is currently only one such instruction.
>
> ``BPF_LD | BPF_DW | BPF_IMM`` means::
>
> - dst_reg = imm64
> + dst = imm64
>
>
> Legacy BPF Packet access instructions
> --
> 2.33.4
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero, overflow, and underflow
2022-09-30 20:52 ` Alexei Starovoitov
@ 2022-09-30 21:54 ` Dave Thaler
2022-09-30 21:59 ` Alexei Starovoitov
0 siblings, 1 reply; 50+ messages in thread
From: Dave Thaler @ 2022-09-30 21:54 UTC (permalink / raw)
To: Alexei Starovoitov, dthaler1968; +Cc: bpf
[...]
> > +Also note that the modulo operation often varies by language when the
> > +dividend or divisor are negative, where Python, Ruby, etc.
> > +differ from C, Go, Java, etc. This specification requires that modulo
> > +use truncated division (where -13 % 3 == -1) as implemented in C, Go,
> > +etc.:
> > +
> > + a % n = a - n * trunc(a / n)
> > +
>
> Interesting bit of info, but I'm not sure how it relates to the ISA doc.
It's because there's multiple definitions of modulo out there as the paragraph notes,
which differ in what they do with negative numbers.
The ISA defines the modulo operation as being the specific version above.
If you tried to implement the ISA in say Python and didn't know that,
you'd have a non-compliant implementation.
Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero, overflow, and underflow
2022-09-30 21:54 ` Dave Thaler
@ 2022-09-30 21:59 ` Alexei Starovoitov
2022-09-30 22:41 ` Dave Thaler
0 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2022-09-30 21:59 UTC (permalink / raw)
To: Dave Thaler; +Cc: dthaler1968, bpf
On Fri, Sep 30, 2022 at 09:54:17PM +0000, Dave Thaler wrote:
> [...]
> > > +Also note that the modulo operation often varies by language when the
> > > +dividend or divisor are negative, where Python, Ruby, etc.
> > > +differ from C, Go, Java, etc. This specification requires that modulo
> > > +use truncated division (where -13 % 3 == -1) as implemented in C, Go,
> > > +etc.:
> > > +
> > > + a % n = a - n * trunc(a / n)
> > > +
> >
> > Interesting bit of info, but I'm not sure how it relates to the ISA doc.
>
> It's because there's multiple definitions of modulo out there as the paragraph notes,
> which differ in what they do with negative numbers.
> The ISA defines the modulo operation as being the specific version above.
> If you tried to implement the ISA in say Python and didn't know that,
> you'd have a non-compliant implementation.
Is it because the languages have weird rules to pick between signed vs unsigned mod?
At least from llvm pov the smod and umod have fixed behavior.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 09/15] ebpf-docs: Explain helper functions
2022-09-27 18:59 ` [PATCH 09/15] ebpf-docs: Explain helper functions dthaler1968
@ 2022-09-30 22:01 ` Alexei Starovoitov
2022-09-30 23:01 ` Dave Thaler
0 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2022-09-30 22:01 UTC (permalink / raw)
To: dthaler1968; +Cc: bpf, Dave Thaler
On Tue, Sep 27, 2022 at 06:59:52PM +0000, dthaler1968@googlemail.com wrote:
> From: Dave Thaler <dthaler@microsoft.com>
>
> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
> ---
> Documentation/bpf/instruction-set.rst | 18 +++++++++++++++++-
> 1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
> index 2987234eb..926957830 100644
> --- a/Documentation/bpf/instruction-set.rst
> +++ b/Documentation/bpf/instruction-set.rst
> @@ -245,7 +245,7 @@ BPF_JSET 0x40 PC += off if dst & src
> BPF_JNE 0x50 PC += off if dst != src
> BPF_JSGT 0x60 PC += off if dst > src signed
> BPF_JSGE 0x70 PC += off if dst >= src signed
> -BPF_CALL 0x80 function call
> +BPF_CALL 0x80 function call see `Helper functions`_
> BPF_EXIT 0x90 function / program return BPF_JMP only
> BPF_JLT 0xa0 PC += off if dst < src unsigned
> BPF_JLE 0xb0 PC += off if dst <= src unsigned
> @@ -256,6 +256,22 @@ BPF_JSLE 0xd0 PC += off if dst <= src signed
> The eBPF program needs to store the return value into register R0 before doing a
> BPF_EXIT.
>
> +Helper functions
> +~~~~~~~~~~~~~~~~
> +Helper functions are a concept whereby BPF programs can call into a
> +set of function calls exposed by the eBPF runtime. Each helper
> +function is identified by an integer used in a ``BPF_CALL`` instruction.
> +The available helper functions may differ for each eBPF program type.
> +
> +Conceptually, each helper function is implemented with a commonly shared function
> +signature defined as:
> +
> + uint64_t function(uint64_t r1, uint64_t r2, uint64_t r3, uint64_t r4, uint64_t r5)
> +
> +In actuality, each helper function is defined as taking between 0 and 5 arguments,
> +with the remaining registers being ignored. The definition of a helper function
> +is responsible for specifying the type (e.g., integer, pointer, etc.) of the value returned,
> +the number of arguments, and the type of each argument.
If we explain helpers in the doc then we should explain kfuncs and bpf-to-bpf calls as well.
Otherwise it looks incomplete and eventually will suffer the same issue as '64-bit instructionS'.
Here it's only one CALL insn.
Though 'imm' value can be interpreted differently bpf2bpf vs helper vs kfunc.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 10/15] ebpf-docs: Add appendix of all opcodes in order
2022-09-27 18:59 ` [PATCH 10/15] ebpf-docs: Add appendix of all opcodes in order dthaler1968
@ 2022-09-30 22:02 ` Alexei Starovoitov
2022-09-30 22:43 ` Dave Thaler
0 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2022-09-30 22:02 UTC (permalink / raw)
To: dthaler1968; +Cc: bpf, Dave Thaler
On Tue, Sep 27, 2022 at 06:59:53PM +0000, dthaler1968@googlemail.com wrote:
> From: Dave Thaler <dthaler@microsoft.com>
>
> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
> ---
> Documentation/bpf/instruction-set.rst | 197 ++++++++++++++++++++++++++
> 1 file changed, 197 insertions(+)
>
> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
> index 926957830..b6f098104 100644
> --- a/Documentation/bpf/instruction-set.rst
> +++ b/Documentation/bpf/instruction-set.rst
> @@ -408,3 +408,200 @@ Legacy BPF Packet access instructions
> eBPF previously introduced special instructions for access to packet data that were
> carried over from classic BPF. However, these instructions are
> deprecated and should no longer be used.
> +
> +Appendix
> +========
> +
> +For reference, the following table lists opcodes in order by value.
> +
> +====== === ==== =================================================== ========================================
> +opcode src imm description reference
> +====== === ==== =================================================== ========================================
> +0x00 0x0 any (additional immediate value) `64-bit immediate instructions`_
> +0x04 0x0 any dst = (uint32_t)(dst + imm) `Arithmetic instructions`_
> +0x05 0x0 0x00 goto +offset `Jump instructions`_
> +0x07 0x0 any dst += imm `Arithmetic instructions`_
> +0x0c any 0x00 dst = (uint32_t)(dst + src) `Arithmetic instructions`_
I guess it's useful, but how did you generate this table?
Looks very error prone if all hex numbers were done by human.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 11/15] ebpf-docs: Improve English readability
2022-09-27 18:59 ` [PATCH 11/15] ebpf-docs: Improve English readability dthaler1968
@ 2022-09-30 22:16 ` Alexei Starovoitov
2022-10-04 14:32 ` Dave Thaler
0 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2022-09-30 22:16 UTC (permalink / raw)
To: dthaler1968; +Cc: bpf, Dave Thaler
On Tue, Sep 27, 2022 at 06:59:54PM +0000, dthaler1968@googlemail.com wrote:
> From: Dave Thaler <dthaler@microsoft.com>
>
> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
> ---
> Documentation/bpf/instruction-set.rst | 137 ++++++++++++++++----------
> 1 file changed, 87 insertions(+), 50 deletions(-)
>
> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
> index b6f098104..328207ff6 100644
> --- a/Documentation/bpf/instruction-set.rst
> +++ b/Documentation/bpf/instruction-set.rst
> @@ -7,6 +7,9 @@ eBPF Instruction Set Specification, v1.0
>
> This document specifies version 1.0 of the eBPF instruction set.
>
> +The eBPF instruction set consists of eleven 64 bit registers, a program counter,
> +and 512 bytes of stack space.
I would not add 512 to a doc.
That's what we have now, but we might relax that in the future.
> +
> Documentation conventions
> =========================
>
> @@ -25,12 +28,24 @@ The eBPF calling convention is defined as:
> * R6 - R9: callee saved registers that function calls will preserve
> * R10: read-only frame pointer to access stack
>
> -R0 - R5 are scratch registers and eBPF programs needs to spill/fill them if
> -necessary across calls.
> +Registers R0 - R5 are scratch registers, meaning the BPF program needs to either
> +spill them to the BPF stack or move them to callee saved registers if these
> +arguments are to be reused across multiple function calls. Spilling means
> +that the value in the register is moved to the BPF stack. The reverse operation
> +of moving the variable from the BPF stack to the register is called filling.
> +The reason for spilling/filling is due to the limited number of registers.
More canonical way would be to say that r0-r5 are caller saved.
I like above clarification though.
> +
> +Upon entering execution of an eBPF program, registers R1 - R5 initially can contain
> +the input arguments for the program (similar to the argc/argv pair for a typical C program).
> +The actual number of registers used, and their meaning, is defined by the program type;
> +for example, a networking program might have an argument that includes network packet data
> +and/or metadata.
>
> Instruction encoding
> ====================
>
> +An eBPF program is a sequence of instructions.
> +
> eBPF has two instruction encodings:
>
> * the basic instruction encoding, which uses 64 bits to encode an instruction
> @@ -63,7 +78,7 @@ opcode
> operation to perform
>
> Note that most instructions do not use all of the fields.
> -Unused fields shall be cleared to zero.
> +Unused fields must be set to zero.
>
> As discussed below in `64-bit immediate instructions`_, some
> instructions use a 64-bit immediate value that is constructed as follows.
> @@ -90,7 +105,9 @@ and destination registers, respectively, rather than the register number.
> Instruction classes
> -------------------
>
> -The three LSB bits of the 'opcode' field store the instruction class:
> +The encoding of the 'opcode' field varies and can be determined from
> +the three least significant bits (LSB) of the 'opcode' field which holds
> +the "instruction class", as follows:
>
> ========= ===== =============================== ===================================
> class value description reference
> @@ -136,9 +153,11 @@ instruction class
> Arithmetic instructions
> -----------------------
>
> -``BPF_ALU`` uses 32-bit wide operands while ``BPF_ALU64`` uses 64-bit wide operands for
> +Instruction class ``BPF_ALU`` uses 32-bit wide operands (zeroing the upper 32 bits
> +of the destination register) while ``BPF_ALU64`` uses 64-bit wide operands for
> otherwise identical operations.
> -The 'code' field encodes the operation as below:
> +
> +The 4-bit 'code' field encodes the operation as follows:
>
> ======== ===== ==========================================================
> code value description
> @@ -168,21 +187,23 @@ the destination register is instead set to zero.
> If execution would result in modulo by zero,
> the destination register is instead left unchanged.
>
> -``BPF_ADD | BPF_X | BPF_ALU`` means::
> +Examples:
> +
> +``BPF_ADD | BPF_X | BPF_ALU`` (0x0c) means::
>
> dst = (uint32_t) (dst + src)
>
> where '(uint32_t)' indicates truncation to 32 bits.
>
> -``BPF_ADD | BPF_X | BPF_ALU64`` means::
> +``BPF_ADD | BPF_X | BPF_ALU64`` (0x0f) means::
I don't think adding hex values help here.
>
> dst = dst + src
>
> -``BPF_XOR | BPF_K | BPF_ALU`` means::
> +``BPF_XOR | BPF_K | BPF_ALU`` (0xa4) means::
>
> src = (uint32_t) src ^ (uint32_t) imm
>
> -``BPF_XOR | BPF_K | BPF_ALU64`` means::
> +``BPF_XOR | BPF_K | BPF_ALU64`` (0xa7) means::
>
> src = src ^ imm
>
> @@ -204,8 +225,9 @@ The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit
> The byte swap instructions operate on the destination register
> only and do not use a separate source register or immediate value.
>
> -The 1-bit source operand field in the opcode is used to to select what byte
> -order the operation convert from or to:
> +Byte swap instructions use non-default semantics of the 1-bit 'source' field in
I would drop 'non-default'. Many fields have different meanings depending on opcode.
BPF_SRC() macro just reads that bit.
> +the 'opcode' field. Instead of indicating the source operator, it is instead
> +used to select what byte order the operation converts from or to:
>
> ========= ===== =================================================
> source value description
> @@ -215,24 +237,33 @@ BPF_TO_BE 0x08 convert between host byte order and big endian
> ========= ===== =================================================
>
> The 'imm' field encodes the width of the swap operations. The following widths
> -are supported: 16, 32 and 64.
> -
> -Examples:
> -
> -``BPF_ALU | BPF_TO_LE | BPF_END`` with imm = 16 means::
> -
> - dst = htole16(dst)
> -
> -``BPF_ALU | BPF_TO_BE | BPF_END`` with imm = 64 means::
> -
> - dst = htobe64(dst)
> +are supported: 16, 32 and 64. The following table summarizes the resulting
> +possibilities:
> +
> +============================= ========= === ======== ==================
> +opcode construction opcode imm mnemonic pseudocode
> +============================= ========= === ======== ==================
> +BPF_END | BPF_TO_LE | BPF_ALU 0xd4 16 le16 dst dst = htole16(dst)
> +BPF_END | BPF_TO_LE | BPF_ALU 0xd4 32 le32 dst dst = htole32(dst)
> +BPF_END | BPF_TO_LE | BPF_ALU 0xd4 64 le64 dst dst = htole64(dst)
> +BPF_END | BPF_TO_BE | BPF_ALU 0xdc 16 be16 dst dst = htobe16(dst)
> +BPF_END | BPF_TO_BE | BPF_ALU 0xdc 32 be32 dst dst = htobe32(dst)
> +BPF_END | BPF_TO_BE | BPF_ALU 0xdc 64 be64 dst dst = htobe64(dst)
> +============================= ========= === ======== ==================
> +
> +where
> +
> +* mnenomic indicates a short form that might be displayed by some tools such as disassemblers
> +* 'htoleNN()' indicates converting a NN-bit value from host byte order to little-endian byte order
> +* 'htobeNN()' indicates converting a NN-bit value from host byte order to big-endian byte order
I think we need to add normal bswap insn.
These to_le and to_be are very awkward to use.
As soon as we have new insn the compiler will be using it.
There is no equivalent of to_be and to_le in C. It wasn't good ISA design.
> Jump instructions
> -----------------
>
> -``BPF_JMP32`` uses 32-bit wide operands while ``BPF_JMP`` uses 64-bit wide operands for
> +Instruction class ``BPF_JMP32`` uses 32-bit wide operands while ``BPF_JMP`` uses 64-bit wide operands for
> otherwise identical operations.
> -The 'code' field encodes the operation as below:
> +
> +The 4-bit 'code' field encodes the operation as below, where PC is the program counter:
>
> ======== ===== ========================= ============
> code value description notes
> @@ -253,9 +284,6 @@ BPF_JSLT 0xc0 PC += off if dst < src signed
> BPF_JSLE 0xd0 PC += off if dst <= src signed
> ======== ===== ========================= ============
>
> -The eBPF program needs to store the return value into register R0 before doing a
> -BPF_EXIT.
> -
> Helper functions
> ~~~~~~~~~~~~~~~~
> Helper functions are a concept whereby BPF programs can call into a
> @@ -285,7 +313,8 @@ For load and store instructions (``BPF_LD``, ``BPF_LDX``, ``BPF_ST``, and ``BPF_
> mode size instruction class
> ============ ====== =================
>
> -The mode modifier is one of:
> +mode
> + one of:
>
> ============= ===== ==================================== =============
> mode modifier value description reference
> @@ -297,7 +326,8 @@ The mode modifier is one of:
> BPF_ATOMIC 0xc0 atomic operations `Atomic operations`_
> ============= ===== ==================================== =============
>
> -The size modifier is one of:
> +size
> + one of:
>
> ============= ===== =====================
> size modifier value description
> @@ -308,25 +338,31 @@ The size modifier is one of:
> BPF_DW 0x18 double word (8 bytes)
> ============= ===== =====================
>
> +instruction class
> + the instruction class (see `Instruction classes`_)
> +
> Regular load and store operations
> ---------------------------------
>
> The ``BPF_MEM`` mode modifier is used to encode regular load and store
> instructions that transfer data between a register and memory.
>
> -``BPF_MEM | <size> | BPF_STX`` means::
> -
> - *(size *) (dst + offset) = src_reg
> -
> -``BPF_MEM | <size> | BPF_ST`` means::
> -
> - *(size *) (dst + offset) = imm32
> -
> -``BPF_MEM | <size> | BPF_LDX`` means::
> -
> - dst = *(size *) (src + offset)
> -
> -Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.
> +============================= ========= ====================================
> +opcode construction opcode pseudocode
> +============================= ========= ====================================
> +BPF_MEM | BPF_B | BPF_LDX 0x71 dst = \*(uint8_t \*) (src + offset)
> +BPF_MEM | BPF_H | BPF_LDX 0x69 dst = \*(uint16_t \*) (src + offset)
> +BPF_MEM | BPF_W | BPF_LDX 0x61 dst = \*(uint32_t \*) (src + offset)
> +BPF_MEM | BPF_DW | BPF_LDX 0x79 dst = \*(uint64_t \*) (src + offset)
> +BPF_MEM | BPF_B | BPF_ST 0x72 \*(uint8_t \*) (dst + offset) = imm
> +BPF_MEM | BPF_H | BPF_ST 0x6a \*(uint16_t \*) (dst + offset) = imm
> +BPF_MEM | BPF_W | BPF_ST 0x62 \*(uint32_t \*) (dst + offset) = imm
> +BPF_MEM | BPF_DW | BPF_ST 0x7a \*(uint64_t \*) (dst + offset) = imm
> +BPF_MEM | BPF_B | BPF_STX 0x73 \*(uint8_t \*) (dst + offset) = src
> +BPF_MEM | BPF_H | BPF_STX 0x6b \*(uint16_t \*) (dst + offset) = src
> +BPF_MEM | BPF_W | BPF_STX 0x63 \*(uint32_t \*) (dst + offset) = src
> +BPF_MEM | BPF_DW | BPF_STX 0x7b \*(uint64_t \*) (dst + offset) = src
> +============================= ========= ====================================
I think the table is more verbose and less readable than the original text.
>
> Atomic operations
> -----------------
> @@ -338,9 +374,11 @@ by other eBPF programs or means outside of this specification.
> All atomic operations supported by eBPF are encoded as store operations
> that use the ``BPF_ATOMIC`` mode modifier as follows:
>
> -* ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations
> -* ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations
> -* 8-bit and 16-bit wide atomic operations are not supported.
> +* ``BPF_ATOMIC | BPF_W | BPF_STX`` (0xc3) for 32-bit operations
> +* ``BPF_ATOMIC | BPF_DW | BPF_STX`` (0xdb) for 64-bit operations
> +
> +Note that 8-bit (``BPF_B``) and 16-bit (``BPF_H``) wide atomic operations are not supported,
> +nor is ``BPF_ATOMIC | <size> | BPF_ST``.
>
> The 'imm' field is used to encode the actual atomic operation.
> Simple atomic operation use a subset of the values defined to encode
> @@ -355,16 +393,15 @@ BPF_AND 0x50 atomic and
> BPF_XOR 0xa0 atomic xor
> ======== ===== ===========
>
> -
> -``BPF_ATOMIC | BPF_W | BPF_STX`` with 'imm' = BPF_ADD means::
> +``BPF_ATOMIC | BPF_W | BPF_STX`` (0xc3) with 'imm' = BPF_ADD means::
>
> *(uint32_t *)(dst + offset) += src
>
> -``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF ADD means::
> +``BPF_ATOMIC | BPF_DW | BPF_STX`` (0xdb) with 'imm' = BPF ADD means::
>
> *(uint64_t *)(dst + offset) += src
>
> -In addition to the simple atomic operations, there also is a modifier and
> +In addition to the simple atomic operations above, there also is a modifier and
> two complex atomic operations:
>
> =========== ================ ===========================
> --
> 2.33.4
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 12/15] ebpf-docs: Add Linux note about register calling convention
2022-09-27 18:59 ` [PATCH 12/15] ebpf-docs: Add Linux note about register calling convention dthaler1968
@ 2022-09-30 22:17 ` Alexei Starovoitov
0 siblings, 0 replies; 50+ messages in thread
From: Alexei Starovoitov @ 2022-09-30 22:17 UTC (permalink / raw)
To: dthaler1968; +Cc: bpf, Dave Thaler
On Tue, Sep 27, 2022 at 06:59:55PM +0000, dthaler1968@googlemail.com wrote:
> From: Dave Thaler <dthaler@microsoft.com>
>
> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
> ---
> Documentation/bpf/linux-notes.rst | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/Documentation/bpf/linux-notes.rst b/Documentation/bpf/linux-notes.rst
> index 522ebe27d..0581ba326 100644
> --- a/Documentation/bpf/linux-notes.rst
> +++ b/Documentation/bpf/linux-notes.rst
> @@ -7,6 +7,12 @@ Linux implementation notes
>
> This document provides more details specific to the Linux kernel implementation of the eBPF instruction set.
>
> +Registers and calling convention
> +================================
> +
> +All program types only use R1 which contains the "context", which is typically a structure containing all
> +the inputs needed, and the exit value for eBPF programs is passed as a 32 bit value.
There is a patch pending that makes return values 64-bit.
Also bpf progs that replace other bpf progs have all 5 input args.
I think this paragraph is unnecessary.
> +
> Arithmetic instructions
> =======================
>
> --
> 2.33.4
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 15/15] ebpf-docs: Add note about invalid instruction
2022-09-27 18:59 ` [PATCH 15/15] ebpf-docs: Add note about invalid instruction dthaler1968
@ 2022-09-30 22:21 ` Alexei Starovoitov
0 siblings, 0 replies; 50+ messages in thread
From: Alexei Starovoitov @ 2022-09-30 22:21 UTC (permalink / raw)
To: dthaler1968; +Cc: bpf, Dave Thaler
On Tue, Sep 27, 2022 at 06:59:58PM +0000, dthaler1968@googlemail.com wrote:
> From: Dave Thaler <dthaler@microsoft.com>
>
> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
> ---
> Documentation/bpf/clang-notes.rst | 5 +++++
> Documentation/bpf/instruction-set.rst | 3 +++
> 2 files changed, 8 insertions(+)
>
> diff --git a/Documentation/bpf/clang-notes.rst b/Documentation/bpf/clang-notes.rst
> index 528feddf2..3c934421b 100644
> --- a/Documentation/bpf/clang-notes.rst
> +++ b/Documentation/bpf/clang-notes.rst
> @@ -20,6 +20,11 @@ Arithmetic instructions
> For CPU versions prior to 3, Clang v7.0 and later can enable ``BPF_ALU`` support with
> ``-Xclang -target-feature -Xclang +alu32``. In CPU version 3, support is automatically included.
>
> +Invalid instructions
> +====================
> +
> +Clang will generate the invalid ``BPF_CALL | BPF_X | BPF_JMP`` (0x8d) instruction if ``-O0`` is used.
I wouldn't call it invalid and it's not related to -O0.
It's a "reserved" instruction.
When we support indirect jumps that's what it would be.
> +
> Atomic operations
> =================
>
> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
> index 2ac8f0dae..af9dc0cc6 100644
> --- a/Documentation/bpf/instruction-set.rst
> +++ b/Documentation/bpf/instruction-set.rst
> @@ -303,6 +303,9 @@ with the remaining registers being ignored. The definition of a helper function
> is responsible for specifying the type (e.g., integer, pointer, etc.) of the value returned,
> the number of arguments, and the type of each argument.
>
> +Note that ``BPF_CALL | BPF_X | BPF_JMP`` (0x8d), where the helper function integer
> +would be read from a specified register, is not currently permitted.
I would say:
Note that ``BPF_CALL | BPF_X | BPF_JMP`` instruction is reserved and currently not permitted.
> +
> Runtime functions
> ~~~~~~~~~~~~~~~~~
> Runtime functions are like helper functions except that they are not specific
> --
> 2.33.4
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero, overflow, and underflow
2022-09-30 21:59 ` Alexei Starovoitov
@ 2022-09-30 22:41 ` Dave Thaler
2022-09-30 23:41 ` Alexei Starovoitov
2023-09-29 21:03 ` Signed modulo operations Dave Thaler
0 siblings, 2 replies; 50+ messages in thread
From: Dave Thaler @ 2022-09-30 22:41 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: dthaler1968, bpf
> -----Original Message-----
> From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Sent: Friday, September 30, 2022 2:59 PM
> To: Dave Thaler <dthaler@microsoft.com>
> Cc: dthaler1968@googlemail.com; bpf@vger.kernel.org
> Subject: Re: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero,
> overflow, and underflow
>
> On Fri, Sep 30, 2022 at 09:54:17PM +0000, Dave Thaler wrote:
> > [...]
> > > > +Also note that the modulo operation often varies by language when
> > > > +the dividend or divisor are negative, where Python, Ruby, etc.
> > > > +differ from C, Go, Java, etc. This specification requires that
> > > > +modulo use truncated division (where -13 % 3 == -1) as
> > > > +implemented in C, Go,
> > > > +etc.:
> > > > +
> > > > + a % n = a - n * trunc(a / n)
> > > > +
> > >
> > > Interesting bit of info, but I'm not sure how it relates to the ISA doc.
> >
> > It's because there's multiple definitions of modulo out there as the
> > paragraph notes, which differ in what they do with negative numbers.
> > The ISA defines the modulo operation as being the specific version above.
> > If you tried to implement the ISA in say Python and didn't know that,
> > you'd have a non-compliant implementation.
>
> Is it because the languages have weird rules to pick between signed vs
> unsigned mod?
> At least from llvm pov the smod and umod have fixed behavior.
It's because there's different mathematical definitions and different languages have chosen different definitions. E.g., languages/libraries that follow Knuth use a
different mathematical definition than C uses. For details see:
https://en.wikipedia.org/wiki/Modulo_operation#Variants_of_the_definition
https://torstencurdt.com/tech/posts/modulo-of-negative-numbers/
Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [PATCH 10/15] ebpf-docs: Add appendix of all opcodes in order
2022-09-30 22:02 ` Alexei Starovoitov
@ 2022-09-30 22:43 ` Dave Thaler
0 siblings, 0 replies; 50+ messages in thread
From: Dave Thaler @ 2022-09-30 22:43 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: bpf
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
[...]
> I guess it's useful, but how did you generate this table?
> Looks very error prone if all hex numbers were done by human.
Yes it was human generated and checked by other reviewers such as Quentin.
I wish there was an automated way. If someone else can come up with
an automated process, I welcome the patch contribution :)
Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [PATCH 09/15] ebpf-docs: Explain helper functions
2022-09-30 22:01 ` Alexei Starovoitov
@ 2022-09-30 23:01 ` Dave Thaler
0 siblings, 0 replies; 50+ messages in thread
From: Dave Thaler @ 2022-09-30 23:01 UTC (permalink / raw)
To: Alexei Starovoitov, dthaler1968; +Cc: bpf
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
[...]
> > +Helper functions
> > +~~~~~~~~~~~~~~~~
> > +Helper functions are a concept whereby BPF programs can call into a
> > +set of function calls exposed by the eBPF runtime. Each helper
> > +function is identified by an integer used in a ``BPF_CALL`` instruction.
> > +The available helper functions may differ for each eBPF program type.
> > +
> > +Conceptually, each helper function is implemented with a commonly
> > +shared function signature defined as:
> > +
> > + uint64_t function(uint64_t r1, uint64_t r2, uint64_t r3, uint64_t
> > + r4, uint64_t r5)
> > +
> > +In actuality, each helper function is defined as taking between 0 and
> > +5 arguments, with the remaining registers being ignored. The
> > +definition of a helper function is responsible for specifying the
> > +type (e.g., integer, pointer, etc.) of the value returned, the number of
> arguments, and the type of each argument.
>
> If we explain helpers in the doc then we should explain kfuncs and bpf-to-bpf
> calls as well.
> Otherwise it looks incomplete and eventually will suffer the same issue as '64-
> bit instructionS'.
Agree, and blurbs were indeed added in
[PATCH 14/15] ebpf-docs: Add extended call instructions
Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero, overflow, and underflow
2022-09-30 22:41 ` Dave Thaler
@ 2022-09-30 23:41 ` Alexei Starovoitov
2022-10-04 16:36 ` Dave Thaler
2023-09-29 21:03 ` Signed modulo operations Dave Thaler
1 sibling, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2022-09-30 23:41 UTC (permalink / raw)
To: Dave Thaler; +Cc: dthaler1968, bpf
On Fri, Sep 30, 2022 at 3:41 PM Dave Thaler <dthaler@microsoft.com> wrote:
>
> > -----Original Message-----
> > From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> > Sent: Friday, September 30, 2022 2:59 PM
> > To: Dave Thaler <dthaler@microsoft.com>
> > Cc: dthaler1968@googlemail.com; bpf@vger.kernel.org
> > Subject: Re: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero,
> > overflow, and underflow
> >
> > On Fri, Sep 30, 2022 at 09:54:17PM +0000, Dave Thaler wrote:
> > > [...]
> > > > > +Also note that the modulo operation often varies by language when
> > > > > +the dividend or divisor are negative, where Python, Ruby, etc.
> > > > > +differ from C, Go, Java, etc. This specification requires that
> > > > > +modulo use truncated division (where -13 % 3 == -1) as
> > > > > +implemented in C, Go,
> > > > > +etc.:
> > > > > +
> > > > > + a % n = a - n * trunc(a / n)
> > > > > +
> > > >
> > > > Interesting bit of info, but I'm not sure how it relates to the ISA doc.
> > >
> > > It's because there's multiple definitions of modulo out there as the
> > > paragraph notes, which differ in what they do with negative numbers.
> > > The ISA defines the modulo operation as being the specific version above.
> > > If you tried to implement the ISA in say Python and didn't know that,
> > > you'd have a non-compliant implementation.
> >
> > Is it because the languages have weird rules to pick between signed vs
> > unsigned mod?
> > At least from llvm pov the smod and umod have fixed behavior.
>
> It's because there's different mathematical definitions and different languages have chosen different definitions. E.g., languages/libraries that follow Knuth use a
> different mathematical definition than C uses. For details see:
>
> https://en.wikipedia.org/wiki/Modulo_operation#Variants_of_the_definition
>
> https://torstencurdt.com/tech/posts/modulo-of-negative-numbers/
Those differences are in signed div/mod only, right?
Unsigned div/mod doesn't have it, right?
bpf has only unsigned div/mod.
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [PATCH 11/15] ebpf-docs: Improve English readability
2022-09-30 22:16 ` Alexei Starovoitov
@ 2022-10-04 14:32 ` Dave Thaler
2022-10-04 15:38 ` Alexei Starovoitov
0 siblings, 1 reply; 50+ messages in thread
From: Dave Thaler @ 2022-10-04 14:32 UTC (permalink / raw)
To: Alexei Starovoitov, dthaler1968; +Cc: bpf
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > +The eBPF instruction set consists of eleven 64 bit registers, a
> > +program counter, and 512 bytes of stack space.
>
> I would not add 512 to a doc.
> That's what we have now, but we might relax that in the future.
I think it's important to at least give a minimum. Will change to
"at least 512".
[...]
> > +Registers R0 - R5 are scratch registers, meaning the BPF program
> > +needs to either spill them to the BPF stack or move them to callee
> > +saved registers if these arguments are to be reused across multiple
> > +function calls. Spilling means that the value in the register is
> > +moved to the BPF stack. The reverse operation of moving the variable
> from the BPF stack to the register is called filling.
> > +The reason for spilling/filling is due to the limited number of registers.
>
> More canonical way would be to say that r0-r5 are caller saved.
Will change "scratch" to "caller-saved"
[...]
> > -``BPF_ADD | BPF_X | BPF_ALU64`` means::
> > +``BPF_ADD | BPF_X | BPF_ALU64`` (0x0f) means::
>
> I don't think adding hex values help here.
I found it very helpful in verifying that the Appendix table
was correct, and providing a correlation to the text here
that shows the construction of the value. So I'd like to keep them.
[...]
> > -The 1-bit source operand field in the opcode is used to to select
> > what byte -order the operation convert from or to:
> > +Byte swap instructions use non-default semantics of the 1-bit
> > +'source' field in
>
> I would drop 'non-default'. Many fields have different meanings depending
> on opcode.
> BPF_SRC() macro just reads that bit.
Will reword to say:
Byte swap instructions use the 1-bit 'source' field in the 'opcode' field
as follows. Instead of indicating the source operator, it is instead
used to select what byte order the operation converts from or to:
[...]
> > +* mnenomic indicates a short form that might be displayed by some
> > +tools such as disassemblers
> > +* 'htoleNN()' indicates converting a NN-bit value from host byte
> > +order to little-endian byte order
> > +* 'htobeNN()' indicates converting a NN-bit value from host byte
> > +order to big-endian byte order
>
> I think we need to add normal bswap insn.
> These to_le and to_be are very awkward to use.
> As soon as we have new insn the compiler will be using it.
> There is no equivalent of to_be and to_le in C. It wasn't good ISA design.
I will interpret this as a request for someone to do code work, rather
than any request for immediately changes to the doc :)
[...]
> > Regular load and store operations
> > ---------------------------------
> >
> > The ``BPF_MEM`` mode modifier is used to encode regular load and
> > store instructions that transfer data between a register and memory.
> >
> > -``BPF_MEM | <size> | BPF_STX`` means::
> > -
> > - *(size *) (dst + offset) = src_reg
> > -
> > -``BPF_MEM | <size> | BPF_ST`` means::
> > -
> > - *(size *) (dst + offset) = imm32
> > -
> > -``BPF_MEM | <size> | BPF_LDX`` means::
> > -
> > - dst = *(size *) (src + offset)
> > -
> > -Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.
> > +============================= =========
> ====================================
> > +opcode construction opcode pseudocode
> > +============================= =========
> ====================================
> > +BPF_MEM | BPF_B | BPF_LDX 0x71 dst = \*(uint8_t \*) (src + offset)
> > +BPF_MEM | BPF_H | BPF_LDX 0x69 dst = \*(uint16_t \*) (src +
> offset)
> > +BPF_MEM | BPF_W | BPF_LDX 0x61 dst = \*(uint32_t \*) (src +
> offset)
> > +BPF_MEM | BPF_DW | BPF_LDX 0x79 dst = \*(uint64_t \*) (src +
> offset)
> > +BPF_MEM | BPF_B | BPF_ST 0x72 \*(uint8_t \*) (dst + offset) = imm
> > +BPF_MEM | BPF_H | BPF_ST 0x6a \*(uint16_t \*) (dst + offset) =
> imm
> > +BPF_MEM | BPF_W | BPF_ST 0x62 \*(uint32_t \*) (dst + offset) =
> imm
> > +BPF_MEM | BPF_DW | BPF_ST 0x7a \*(uint64_t \*) (dst + offset) =
> imm
> > +BPF_MEM | BPF_B | BPF_STX 0x73 \*(uint8_t \*) (dst + offset) = src
> > +BPF_MEM | BPF_H | BPF_STX 0x6b \*(uint16_t \*) (dst + offset) =
> src
> > +BPF_MEM | BPF_W | BPF_STX 0x63 \*(uint32_t \*) (dst + offset) =
> src
> > +BPF_MEM | BPF_DW | BPF_STX 0x7b \*(uint64_t \*) (dst + offset) =
> src
> > +============================= =========
> > +====================================
>
> I think the table is more verbose and less readable than the original text.
Will change back to original text.
Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [PATCH 08/15] ebpf-docs: Use consistent names for the same field
2022-09-30 20:57 ` Alexei Starovoitov
@ 2022-10-04 14:44 ` Dave Thaler
0 siblings, 0 replies; 50+ messages in thread
From: Dave Thaler @ 2022-10-04 14:44 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: bpf
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
[...]
> > +src
> > + the source register number (0-10), except where otherwise specified
> > + (`64-bit immediate instructions`_ reuse this field for other
> > +purposes)
>
> There are more than one?
> I guess we have such section now,
> but in ISA it really is only one insn. LD_IMM64.
> It's one insn for the interpreter and one insn for JITs.
Here the plural is really referring to occurrences in programs,
rather than implying multiple opcodes, so I don't think the grammar
is incorrect.
[...]
> > +As discussed below in `64-bit immediate instructions`_, some
> > +instructions use a 64-bit immediate value that is constructed as follows.
> > +The 64 bits following the basic instruction contain a pseudo
> > +instruction using the same format but with opcode, dst, src, and
> > +offset all set to zero, and imm containing the high 32 bits of the immediate
> value.
>
> 'instructions' here and further reads a bit odd.
> May be calling it one instruction where imm_lo/hi have different semantics
> depending on src would be better?
I will reword the first sentence above to:
As discussed below in `64-bit immediate instructions`_, a 64-bit immediate
instruction uses a 64-bit immediate value that is constructed as follows.
[...]
>
> > +
> > +================= ==================
> > +64 bits (MSB) 64 bits (LSB)
> > +================= ================== basic instruction pseudo
> > +instruction ================= ==================
> > +
> > +Thus the 64-bit immediate value is constructed as follows:
> > +
> > + imm64 = imm + (next_imm << 32)
> > +
> > +where 'next_imm' refers to the imm value of the pseudo instruction
> > +following the basic instruction.
> > +
> > +In the remainder of this document 'src' and 'dst' refer to the values
> > +of the source and destination registers, respectively, rather than the
> register number.
> > +
> > Instruction classes
> > -------------------
> >
> > @@ -75,20 +114,24 @@ For arithmetic and jump instructions
> > (``BPF_ALU``, ``BPF_ALU64``, ``BPF_JMP`` an ============== ======
> =================
> > 4 bits (MSB) 1 bit 3 bits (LSB)
> > ============== ====== ================= -operation code source
> > instruction class
> > +code source instruction class
> > ============== ====== =================
> >
> > -The 4th bit encodes the source operand:
>
> feels wrong to lose this part.
The concept is still in the document, just in the "Arithmetic and jump instructions"
section since it seems specific to those (i.e., not the "Load and store instructions").
> > +code
> > + the operation code, whose meaning varies by instruction class
> >
> > - ====== ===== ========================================
> > - source value description
> > - ====== ===== ========================================
> > - BPF_K 0x00 use 32-bit immediate as source operand
> > - BPF_X 0x08 use 'src_reg' register as source operand
> > - ====== ===== ========================================
> > +source
> > + the source operand location, which unless otherwise specified is one of:
> >
> > -The four MSB bits store the operation code.
>
> same here.
Still there in the supersection.
These are easier to see in the rendered document.
Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 11/15] ebpf-docs: Improve English readability
2022-10-04 14:32 ` Dave Thaler
@ 2022-10-04 15:38 ` Alexei Starovoitov
2022-10-04 15:55 ` Dave Thaler
0 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2022-10-04 15:38 UTC (permalink / raw)
To: Dave Thaler; +Cc: dthaler1968, bpf
On Tue, Oct 4, 2022 at 7:32 AM Dave Thaler <dthaler@microsoft.com> wrote:
>
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > > +The eBPF instruction set consists of eleven 64 bit registers, a
> > > +program counter, and 512 bytes of stack space.
> >
> > I would not add 512 to a doc.
> > That's what we have now, but we might relax that in the future.
>
> I think it's important to at least give a minimum. Will change to
> "at least 512".
I'm not sure 512 stands as a minimum either.
When we have the gatekeeper prog it will be able to enforce
any stack size.
> [...]
> > > +Registers R0 - R5 are scratch registers, meaning the BPF program
> > > +needs to either spill them to the BPF stack or move them to callee
> > > +saved registers if these arguments are to be reused across multiple
> > > +function calls. Spilling means that the value in the register is
> > > +moved to the BPF stack. The reverse operation of moving the variable
> > from the BPF stack to the register is called filling.
> > > +The reason for spilling/filling is due to the limited number of registers.
> >
> > More canonical way would be to say that r0-r5 are caller saved.
>
> Will change "scratch" to "caller-saved"
>
> [...]
> > > -``BPF_ADD | BPF_X | BPF_ALU64`` means::
> > > +``BPF_ADD | BPF_X | BPF_ALU64`` (0x0f) means::
> >
> > I don't think adding hex values help here.
>
> I found it very helpful in verifying that the Appendix table
> was correct, and providing a correlation to the text here
> that shows the construction of the value. So I'd like to keep them.
I think that means that the appendix table shouldn't be there either.
I'd like to avoid both.
> [...]
> > > -The 1-bit source operand field in the opcode is used to to select
> > > what byte -order the operation convert from or to:
> > > +Byte swap instructions use non-default semantics of the 1-bit
> > > +'source' field in
> >
> > I would drop 'non-default'. Many fields have different meanings depending
> > on opcode.
> > BPF_SRC() macro just reads that bit.
>
> Will reword to say:
> Byte swap instructions use the 1-bit 'source' field in the 'opcode' field
> as follows. Instead of indicating the source operator, it is instead
> used to select what byte order the operation converts from or to:
makes sense.
> [...]
> > > +* mnenomic indicates a short form that might be displayed by some
> > > +tools such as disassemblers
> > > +* 'htoleNN()' indicates converting a NN-bit value from host byte
> > > +order to little-endian byte order
> > > +* 'htobeNN()' indicates converting a NN-bit value from host byte
> > > +order to big-endian byte order
> >
> > I think we need to add normal bswap insn.
> > These to_le and to_be are very awkward to use.
> > As soon as we have new insn the compiler will be using it.
> > There is no equivalent of to_be and to_le in C. It wasn't good ISA design.
>
> I will interpret this as a request for someone to do code work, rather
> than any request for immediately changes to the doc :)
>
> [...]
> > > Regular load and store operations
> > > ---------------------------------
> > >
> > > The ``BPF_MEM`` mode modifier is used to encode regular load and
> > > store instructions that transfer data between a register and memory.
> > >
> > > -``BPF_MEM | <size> | BPF_STX`` means::
> > > -
> > > - *(size *) (dst + offset) = src_reg
> > > -
> > > -``BPF_MEM | <size> | BPF_ST`` means::
> > > -
> > > - *(size *) (dst + offset) = imm32
> > > -
> > > -``BPF_MEM | <size> | BPF_LDX`` means::
> > > -
> > > - dst = *(size *) (src + offset)
> > > -
> > > -Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.
> > > +============================= =========
> > ====================================
> > > +opcode construction opcode pseudocode
> > > +============================= =========
> > ====================================
> > > +BPF_MEM | BPF_B | BPF_LDX 0x71 dst = \*(uint8_t \*) (src + offset)
> > > +BPF_MEM | BPF_H | BPF_LDX 0x69 dst = \*(uint16_t \*) (src +
> > offset)
> > > +BPF_MEM | BPF_W | BPF_LDX 0x61 dst = \*(uint32_t \*) (src +
> > offset)
> > > +BPF_MEM | BPF_DW | BPF_LDX 0x79 dst = \*(uint64_t \*) (src +
> > offset)
> > > +BPF_MEM | BPF_B | BPF_ST 0x72 \*(uint8_t \*) (dst + offset) = imm
> > > +BPF_MEM | BPF_H | BPF_ST 0x6a \*(uint16_t \*) (dst + offset) =
> > imm
> > > +BPF_MEM | BPF_W | BPF_ST 0x62 \*(uint32_t \*) (dst + offset) =
> > imm
> > > +BPF_MEM | BPF_DW | BPF_ST 0x7a \*(uint64_t \*) (dst + offset) =
> > imm
> > > +BPF_MEM | BPF_B | BPF_STX 0x73 \*(uint8_t \*) (dst + offset) = src
> > > +BPF_MEM | BPF_H | BPF_STX 0x6b \*(uint16_t \*) (dst + offset) =
> > src
> > > +BPF_MEM | BPF_W | BPF_STX 0x63 \*(uint32_t \*) (dst + offset) =
> > src
> > > +BPF_MEM | BPF_DW | BPF_STX 0x7b \*(uint64_t \*) (dst + offset) =
> > src
> > > +============================= =========
> > > +====================================
> >
> > I think the table is more verbose and less readable than the original text.
>
> Will change back to original text.
Please see git. I've removed that table. Please don't add it back.
I see no value in such tables other than more things to get wrong.
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [PATCH 11/15] ebpf-docs: Improve English readability
2022-10-04 15:38 ` Alexei Starovoitov
@ 2022-10-04 15:55 ` Dave Thaler
2022-10-04 15:56 ` Dave Thaler
0 siblings, 1 reply; 50+ messages in thread
From: Dave Thaler @ 2022-10-04 15:55 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: bpf
> > I found it very helpful in verifying that the Appendix table was
> > correct, and providing a correlation to the text here that shows the
> > construction of the value. So I'd like to keep them.
>
> I think that means that the appendix table shouldn't be there either.
> I'd like to avoid both.
I've heard from multiple people with different affiliations that the appendix
is the most useful part of the document, and what they wanted to see
added. So I added by popular request.
Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [PATCH 11/15] ebpf-docs: Improve English readability
2022-10-04 15:55 ` Dave Thaler
@ 2022-10-04 15:56 ` Dave Thaler
2022-10-04 16:19 ` Alexei Starovoitov
0 siblings, 1 reply; 50+ messages in thread
From: Dave Thaler @ 2022-10-04 15:56 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: bpf
Also worth noting that Quentin has a script that I believe parses the appendix
and uses it to generate a validator for ebpf programs. (Which also
helps validate the appendix).
Dave
> -----Original Message-----
> From: Dave Thaler
> Sent: Tuesday, October 4, 2022 8:55 AM
> To: 'Alexei Starovoitov' <alexei.starovoitov@gmail.com>
> Cc: bpf@vger.kernel.org
> Subject: RE: [PATCH 11/15] ebpf-docs: Improve English readability
>
> > > I found it very helpful in verifying that the Appendix table was
> > > correct, and providing a correlation to the text here that shows the
> > > construction of the value. So I'd like to keep them.
> >
> > I think that means that the appendix table shouldn't be there either.
> > I'd like to avoid both.
>
> I've heard from multiple people with different affiliations that the appendix
> is the most useful part of the document, and what they wanted to see
> added. So I added by popular request.
>
> Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 11/15] ebpf-docs: Improve English readability
2022-10-04 15:56 ` Dave Thaler
@ 2022-10-04 16:19 ` Alexei Starovoitov
2022-10-04 16:41 ` Dave Thaler
0 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2022-10-04 16:19 UTC (permalink / raw)
To: Dave Thaler; +Cc: bpf
On Tue, Oct 4, 2022 at 8:56 AM Dave Thaler <dthaler@microsoft.com> wrote:
>
> Also worth noting that Quentin has a script that I believe parses the appendix
> and uses it to generate a validator for ebpf programs. (Which also
> helps validate the appendix).
The last thing I want to see is a document becoming a description
of the code.
We've always been doing it the other way around.
The documentation can live next to the code and docs automatically
generated from .h or .c files.
Doing the other way around sooner or later will be a disaster.
Imagine a typo in instruction-set.rst.
What should we do next? Fix a typo and say, look, the code
behaves differently, so we're fixing the doc.
If so, there is close to zero reason to add hex to the doc,
since it's not an authoritative answer.
On the other hand if instruction-set.rst is the source of
the truth then the code would have to change, which we obviously
cannot do. So let's not get us into the corner with
such tables.
> Dave
>
> > -----Original Message-----
> > From: Dave Thaler
> > Sent: Tuesday, October 4, 2022 8:55 AM
> > To: 'Alexei Starovoitov' <alexei.starovoitov@gmail.com>
> > Cc: bpf@vger.kernel.org
> > Subject: RE: [PATCH 11/15] ebpf-docs: Improve English readability
> >
> > > > I found it very helpful in verifying that the Appendix table was
> > > > correct, and providing a correlation to the text here that shows the
> > > > construction of the value. So I'd like to keep them.
> > >
> > > I think that means that the appendix table shouldn't be there either.
> > > I'd like to avoid both.
> >
> > I've heard from multiple people with different affiliations that the appendix
> > is the most useful part of the document, and what they wanted to see
> > added. So I added by popular request.
These people should speak up then.
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero, overflow, and underflow
2022-09-30 23:41 ` Alexei Starovoitov
@ 2022-10-04 16:36 ` Dave Thaler
2022-10-04 17:24 ` div_k. Was: " Alexei Starovoitov
0 siblings, 1 reply; 50+ messages in thread
From: Dave Thaler @ 2022-10-04 16:36 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: bpf
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
[...]
> > > > > > +Also note that the modulo operation often varies by language
> > > > > > +when the dividend or divisor are negative, where Python, Ruby,
> etc.
> > > > > > +differ from C, Go, Java, etc. This specification requires
> > > > > > +that modulo use truncated division (where -13 % 3 == -1) as
> > > > > > +implemented in C, Go,
> > > > > > +etc.:
> > > > > > +
> > > > > > + a % n = a - n * trunc(a / n)
> > > > > > +
> > > > >
> > > > > Interesting bit of info, but I'm not sure how it relates to the ISA doc.
[...]
> Those differences are in signed div/mod only, right?
> Unsigned div/mod doesn't have it, right?
> bpf has only unsigned div/mod.
Ah right, will replace. However since imm is a signed integer, that leaves
an ambiguity that is important to clarify.
What is the expected value for the following 64-bit BPF_DIV operation:
r0 = 0xFFFFFFFFFFFFFFFF
r0 /= -10
Is it 0x1 or 0x10000000a? i.e., is the -10 sign extended to
0xFFFFFFFFFFFFFFF6 or treated as 0xFFFFFFF6 when doing the unsigned
division?
Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [PATCH 11/15] ebpf-docs: Improve English readability
2022-10-04 16:19 ` Alexei Starovoitov
@ 2022-10-04 16:41 ` Dave Thaler
2022-10-04 16:54 ` Dave Thaler
0 siblings, 1 reply; 50+ messages in thread
From: Dave Thaler @ 2022-10-04 16:41 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: bpf
> -----Original Message-----
> From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Sent: Tuesday, October 4, 2022 9:20 AM
> To: Dave Thaler <dthaler@microsoft.com>
> Cc: bpf@vger.kernel.org
> Subject: Re: [PATCH 11/15] ebpf-docs: Improve English readability
>
> On Tue, Oct 4, 2022 at 8:56 AM Dave Thaler <dthaler@microsoft.com>
> wrote:
> >
> > Also worth noting that Quentin has a script that I believe parses the
> > appendix and uses it to generate a validator for ebpf programs.
> > (Which also helps validate the appendix).
>
> The last thing I want to see is a document becoming a description of the
> code.
> We've always been doing it the other way around.
> The documentation can live next to the code and docs automatically
> generated from .h or .c files.
> Doing the other way around sooner or later will be a disaster.
> Imagine a typo in instruction-set.rst.
> What should we do next? Fix a typo and say, look, the code behaves
> differently, so we're fixing the doc.
> If so, there is close to zero reason to add hex to the doc, since it's not an
> authoritative answer.
> On the other hand if instruction-set.rst is the source of the truth then the
> code would have to change, which we obviously cannot do. So let's not get us
> into the corner with such tables.
The point of a standard is to be a source of truth. If another implementation
(ubpf, hbpf, etc.) doesn't match the spec, then the code would have to change.
Such tables are seen as invaluable for determining correctness of other
implementations. So the feedback is that it's important to have such if we
want everyone else to do the right thing.
> These people should speak up then.
I agree.
Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [PATCH 11/15] ebpf-docs: Improve English readability
2022-10-04 16:41 ` Dave Thaler
@ 2022-10-04 16:54 ` Dave Thaler
2022-10-06 20:44 ` Jim Harris
0 siblings, 1 reply; 50+ messages in thread
From: Dave Thaler @ 2022-10-04 16:54 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: bpf
Regarding the tables:
> Such tables are seen as invaluable for determining correctness of other
> implementations. So the feedback is that it's important to have such if we
> want everyone else to do the right thing.
>
> > These people should speak up then.
>
> I agree.
Here's two public examples...
Christoph Hellwig, said on May 17 at https://lore.kernel.org/bpf/20220517091011.GA18723@lst.de/:
> One useful thing for this would be an opcode table with all the
> optional field usage in machine readable format.
>
> Jim who is on CC has already built a nice table off all opcodes based
> on existing material that might be a good starting point.
Jim Harris responded on that thread with a strawman which was
used as the basis for the table in the appendix.
Jim then commented in the github version on August 30:
> In my opinion, this table is the biggest thing that has been missing,
> and will be most essential for a more "formal" specification.
I will encourage them and others to comment on this thread.
Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* div_k. Was: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero, overflow, and underflow
2022-10-04 16:36 ` Dave Thaler
@ 2022-10-04 17:24 ` Alexei Starovoitov
2022-10-04 18:23 ` Dave Thaler
0 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2022-10-04 17:24 UTC (permalink / raw)
To: Dave Thaler, Daniel Borkmann; +Cc: bpf, Johan Almbladh
On Tue, Oct 4, 2022 at 9:36 AM Dave Thaler <dthaler@microsoft.com> wrote:
> > Those differences are in signed div/mod only, right?
> > Unsigned div/mod doesn't have it, right?
> > bpf has only unsigned div/mod.
>
> Ah right, will replace. However since imm is a signed integer, that leaves
> an ambiguity that is important to clarify.
>
> What is the expected value for the following 64-bit BPF_DIV operation:
> r0 = 0xFFFFFFFFFFFFFFFF
> r0 /= -10
> Is it 0x1 or 0x10000000a? i.e., is the -10 sign extended to
> 0xFFFFFFFFFFFFFFF6 or treated as 0xFFFFFFF6 when doing the unsigned
> division?
x86 and arm64 JITs treat it as imm32 is zero extended.
But looking at the interpreter:
ALU64_DIV_K:
DST = div64_u64(DST, IMM);
it looks like we have a bug there.
But we have a bunch of div_k tests in lib/test_bpf.c
including negative imm32. Hmm.
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: div_k. Was: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero, overflow, and underflow
2022-10-04 17:24 ` div_k. Was: " Alexei Starovoitov
@ 2022-10-04 18:23 ` Dave Thaler
2022-10-04 18:34 ` Alexei Starovoitov
0 siblings, 1 reply; 50+ messages in thread
From: Dave Thaler @ 2022-10-04 18:23 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann; +Cc: bpf, Johan Almbladh
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
[...]
> > What is the expected value for the following 64-bit BPF_DIV operation:
> > r0 = 0xFFFFFFFFFFFFFFFF
> > r0 /= -10
> > Is it 0x1 or 0x10000000a? i.e., is the -10 sign extended to
> > 0xFFFFFFFFFFFFFFF6 or treated as 0xFFFFFFF6 when doing the unsigned
> > division?
>
> x86 and arm64 JITs treat it as imm32 is zero extended.
Alan Jowett just pointed out to me that the question is not limited to DIV.
r0 = 1
r0 += -1
Is the answer 0 or 0x0000000100000000?
Assuming the answer is to zero extend imm32, it contains the latter, which
would be counter-intuitive enough to make it important to point out explicitly.
> But looking at the interpreter:
> ALU64_DIV_K:
> DST = div64_u64(DST, IMM); it looks like we have a bug there.
> But we have a bunch of div_k tests in lib/test_bpf.c including negative
> imm32. Hmm.
Yeah.
"ALU64_DIV_K: 0xffffffffffffffff / (-1) = 0x0000000000000001",
"ALU64_ADD_K: 2147483646 + -2147483647 = -1",
"ALU64_ADD_K: 0 + (-1) = 0xffffffffffffffff",
"ALU64_MUL_K: 1 * -2147483647 = -2147483647",
"ALU64_MUL_K: 1 * (-1) = 0xffffffffffffffff",
"ALU64_AND_K: 0x0000ffffffff0000 & -1 = 0x0000ffffffff0000",
"ALU64_AND_K: 0xffffffffffffffff & -1 = 0xffffffffffffffff",
"ALU64_OR_K: 0x000000000000000 | -1 = 0xffffffffffffffff",
The above assume sign extension not zero extension is the correct behavior
for these operations, if I understand correctly.
Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: div_k. Was: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero, overflow, and underflow
2022-10-04 18:23 ` Dave Thaler
@ 2022-10-04 18:34 ` Alexei Starovoitov
0 siblings, 0 replies; 50+ messages in thread
From: Alexei Starovoitov @ 2022-10-04 18:34 UTC (permalink / raw)
To: Dave Thaler; +Cc: Daniel Borkmann, bpf, Johan Almbladh
On Tue, Oct 4, 2022 at 11:23 AM Dave Thaler <dthaler@microsoft.com> wrote:
>
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> [...]
> > > What is the expected value for the following 64-bit BPF_DIV operation:
> > > r0 = 0xFFFFFFFFFFFFFFFF
> > > r0 /= -10
> > > Is it 0x1 or 0x10000000a? i.e., is the -10 sign extended to
> > > 0xFFFFFFFFFFFFFFF6 or treated as 0xFFFFFFF6 when doing the unsigned
> > > division?
> >
> > x86 and arm64 JITs treat it as imm32 is zero extended.
>
> Alan Jowett just pointed out to me that the question is not limited to DIV.
>
> r0 = 1
> r0 += -1
>
> Is the answer 0 or 0x0000000100000000?
> Assuming the answer is to zero extend imm32, it contains the latter, which
> would be counter-intuitive enough to make it important to point out explicitly.
This is an obvious one. imm32 is _sign_ extended everywhere.
>
> > But looking at the interpreter:
> > ALU64_DIV_K:
> > DST = div64_u64(DST, IMM); it looks like we have a bug there.
> > But we have a bunch of div_k tests in lib/test_bpf.c including negative
> > imm32. Hmm.
Actually I misread the JITs.
/* mov r11, imm32 */
EMIT3_off32(0x49, 0xC7, 0xC3, imm32);
It is sign extending. There is no bug in the interpreter or JIT.
> Yeah.
>
> "ALU64_DIV_K: 0xffffffffffffffff / (-1) = 0x0000000000000001",
> "ALU64_ADD_K: 2147483646 + -2147483647 = -1",
> "ALU64_ADD_K: 0 + (-1) = 0xffffffffffffffff",
> "ALU64_MUL_K: 1 * -2147483647 = -2147483647",
> "ALU64_MUL_K: 1 * (-1) = 0xffffffffffffffff",
> "ALU64_AND_K: 0x0000ffffffff0000 & -1 = 0x0000ffffffff0000",
> "ALU64_AND_K: 0xffffffffffffffff & -1 = 0xffffffffffffffff",
> "ALU64_OR_K: 0x000000000000000 | -1 = 0xffffffffffffffff",
>
> The above assume sign extension not zero extension is the correct behavior
> for these operations, if I understand correctly.
>
> Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 11/15] ebpf-docs: Improve English readability
2022-10-04 16:54 ` Dave Thaler
@ 2022-10-06 20:44 ` Jim Harris
0 siblings, 0 replies; 50+ messages in thread
From: Jim Harris @ 2022-10-06 20:44 UTC (permalink / raw)
To: Dave Thaler; +Cc: Alexei Starovoitov, bpf
On Tue, Oct 4, 2022 at 10:09 AM Dave Thaler <dthaler@microsoft.com> wrote:
>
> Regarding the tables:
> > Such tables are seen as invaluable for determining correctness of other
> > implementations. So the feedback is that it's important to have such if we
> > want everyone else to do the right thing.
> >
> > > These people should speak up then.
> >
> > I agree.
>
> Here's two public examples...
>
> Christoph Hellwig, said on May 17 at https://lore.kernel.org/bpf/20220517091011.GA18723@lst.de/:
> > One useful thing for this would be an opcode table with all the
> > optional field usage in machine readable format.
> >
> > Jim who is on CC has already built a nice table off all opcodes based
> > on existing material that might be a good starting point.
>
> Jim Harris responded on that thread with a strawman which was
> used as the basis for the table in the appendix.
>
> Jim then commented in the github version on August 30:
> > In my opinion, this table is the biggest thing that has been missing,
> > and will be most essential for a more "formal" specification.
I think an opcode table is a huge help for developing alternate eBPF
implementations - anything that makes it more explicit which opcodes
are valid (and which ones are not).
For example, the section for BPF_ALU and BPF_ALU64 classes lists the
operation codes. BPF_END is in that list. Description says "see
separate section below". The "Byte swap instructions" section then
says that byte swap instructions always use BPF_ALU, even for 64-bit
widths. So an implementer can synthesize all of this to determine
that opcodes 0xD7 and 0xDF (which have BPF_ALU64 | BPF_END) are not
valid. It would be more clear if somewhere there was a list that
explicitly showed that 0xD4 and 0xDC (BPF_ALU | BPF_END) were valid,
but 0xD7 and 0xDF were not.
If there's concern that an appendix gets out of sync with the code and
the primary sections of the instruction set doc, maybe these opcodes
can be added to the existing per-class-code tables in the instruction
set doc. A full opcode table could probably be generated from those
tables instead of the hand-written one that Dave and I worked on in
his patch.
-Jim
> I will encourage them and others to comment on this thread.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Signed modulo operations
2022-09-30 22:41 ` Dave Thaler
2022-09-30 23:41 ` Alexei Starovoitov
@ 2023-09-29 21:03 ` Dave Thaler
2023-09-29 21:03 ` [Bpf] " Dave Thaler
` (2 more replies)
1 sibling, 3 replies; 50+ messages in thread
From: Dave Thaler @ 2023-09-29 21:03 UTC (permalink / raw)
To: bpf; +Cc: bpf
In the email discussion below, we concluded it wasn't relevant at the time because
there were no signed modulo instructions. However, now there is and I believe the
ambiguity in the current spec needs to be addressed.
> -----Original Message-----
> From: Dave Thaler
> Sent: Friday, September 30, 2022 3:42 PM
> To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: dthaler1968@googlemail.com; bpf@vger.kernel.org
> Subject: RE: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero,
> overflow, and underflow
>
> > -----Original Message-----
> > From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> > Sent: Friday, September 30, 2022 2:59 PM
> > To: Dave Thaler <dthaler@microsoft.com>
> > Cc: dthaler1968@googlemail.com; bpf@vger.kernel.org
> > Subject: Re: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by
> > zero, overflow, and underflow
> >
> > On Fri, Sep 30, 2022 at 09:54:17PM +0000, Dave Thaler wrote:
> > > [...]
> > > > > +Also note that the modulo operation often varies by language
> > > > > +when the dividend or divisor are negative, where Python, Ruby, etc.
> > > > > +differ from C, Go, Java, etc. This specification requires that
> > > > > +modulo use truncated division (where -13 % 3 == -1) as
> > > > > +implemented in C, Go,
> > > > > +etc.:
> > > > > +
> > > > > + a % n = a - n * trunc(a / n)
> > > > > +
> > > >
> > > > Interesting bit of info, but I'm not sure how it relates to the ISA doc.
> > >
> > > It's because there's multiple definitions of modulo out there as the
> > > paragraph notes, which differ in what they do with negative numbers.
> > > The ISA defines the modulo operation as being the specific version above.
> > > If you tried to implement the ISA in say Python and didn't know
> > > that, you'd have a non-compliant implementation.
> >
> > Is it because the languages have weird rules to pick between signed vs
> > unsigned mod?
> > At least from llvm pov the smod and umod have fixed behavior.
>
> It's because there's different mathematical definitions and different languages
> have chosen different definitions. E.g., languages/libraries that follow Knuth
> use a different mathematical definition than C uses. For details see:
>
> https://en.wikipedia.org/wiki/Modulo_operation#Variants_of_the_definition
>
> https://torstencurdt.com/tech/posts/modulo-of-negative-numbers/
>
> Dave
Perhaps text like the proposed snippet quoted in the exchange above should be
added around the new text that now appears in the doc, i.e. the ambiguous text
is currently:
> For signed operations (``BPF_SDIV`` and ``BPF_SMOD``), for ``BPF_ALU``,
> 'imm' is interpreted as a 32-bit signed value. For ``BPF_ALU64``, 'imm'
> is first :term:`sign extended<Sign Extend>` from 32 to 64 bits, and then
> interpreted as a 64-bit signed value.
Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Bpf] Signed modulo operations
2023-09-29 21:03 ` Signed modulo operations Dave Thaler
@ 2023-09-29 21:03 ` Dave Thaler
2023-09-30 6:07 ` Carsten Bormann
2023-09-30 14:48 ` Alexei Starovoitov
2 siblings, 0 replies; 50+ messages in thread
From: Dave Thaler @ 2023-09-29 21:03 UTC (permalink / raw)
To: bpf; +Cc: bpf
In the email discussion below, we concluded it wasn't relevant at the time because
there were no signed modulo instructions. However, now there is and I believe the
ambiguity in the current spec needs to be addressed.
> -----Original Message-----
> From: Dave Thaler
> Sent: Friday, September 30, 2022 3:42 PM
> To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: dthaler1968@googlemail.com; bpf@vger.kernel.org
> Subject: RE: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero,
> overflow, and underflow
>
> > -----Original Message-----
> > From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> > Sent: Friday, September 30, 2022 2:59 PM
> > To: Dave Thaler <dthaler@microsoft.com>
> > Cc: dthaler1968@googlemail.com; bpf@vger.kernel.org
> > Subject: Re: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by
> > zero, overflow, and underflow
> >
> > On Fri, Sep 30, 2022 at 09:54:17PM +0000, Dave Thaler wrote:
> > > [...]
> > > > > +Also note that the modulo operation often varies by language
> > > > > +when the dividend or divisor are negative, where Python, Ruby, etc.
> > > > > +differ from C, Go, Java, etc. This specification requires that
> > > > > +modulo use truncated division (where -13 % 3 == -1) as
> > > > > +implemented in C, Go,
> > > > > +etc.:
> > > > > +
> > > > > + a % n = a - n * trunc(a / n)
> > > > > +
> > > >
> > > > Interesting bit of info, but I'm not sure how it relates to the ISA doc.
> > >
> > > It's because there's multiple definitions of modulo out there as the
> > > paragraph notes, which differ in what they do with negative numbers.
> > > The ISA defines the modulo operation as being the specific version above.
> > > If you tried to implement the ISA in say Python and didn't know
> > > that, you'd have a non-compliant implementation.
> >
> > Is it because the languages have weird rules to pick between signed vs
> > unsigned mod?
> > At least from llvm pov the smod and umod have fixed behavior.
>
> It's because there's different mathematical definitions and different languages
> have chosen different definitions. E.g., languages/libraries that follow Knuth
> use a different mathematical definition than C uses. For details see:
>
> https://en.wikipedia.org/wiki/Modulo_operation#Variants_of_the_definition
>
> https://torstencurdt.com/tech/posts/modulo-of-negative-numbers/
>
> Dave
Perhaps text like the proposed snippet quoted in the exchange above should be
added around the new text that now appears in the doc, i.e. the ambiguous text
is currently:
> For signed operations (``BPF_SDIV`` and ``BPF_SMOD``), for ``BPF_ALU``,
> 'imm' is interpreted as a 32-bit signed value. For ``BPF_ALU64``, 'imm'
> is first :term:`sign extended<Sign Extend>` from 32 to 64 bits, and then
> interpreted as a 64-bit signed value.
Dave
--
Bpf mailing list
Bpf@ietf.org
https://www.ietf.org/mailman/listinfo/bpf
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [Bpf] Signed modulo operations
2023-09-29 21:03 ` Signed modulo operations Dave Thaler
2023-09-29 21:03 ` [Bpf] " Dave Thaler
@ 2023-09-30 6:07 ` Carsten Bormann
2023-09-30 6:07 ` Carsten Bormann
2023-09-30 14:48 ` Alexei Starovoitov
2 siblings, 1 reply; 50+ messages in thread
From: Carsten Bormann @ 2023-09-30 6:07 UTC (permalink / raw)
To: Dave Thaler; +Cc: bpf, bpf
I didn’t follow the whole discussion, but it maybe it's worth pointing out that C’s % is not a modulo operator, but a remainder operator.
Grüße, Carsten
> On 29. Sep 2023, at 23:03, Dave Thaler <dthaler=40microsoft.com@dmarc.ietf.org> wrote:
>
> In the email discussion below, we concluded it wasn't relevant at the time because
> there were no signed modulo instructions. However, now there is and I believe the
> ambiguity in the current spec needs to be addressed.
>
>> -----Original Message-----
>> From: Dave Thaler
>> Sent: Friday, September 30, 2022 3:42 PM
>> To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
>> Cc: dthaler1968@googlemail.com; bpf@vger.kernel.org
>> Subject: RE: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero,
>> overflow, and underflow
>>
>>> -----Original Message-----
>>> From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
>>> Sent: Friday, September 30, 2022 2:59 PM
>>> To: Dave Thaler <dthaler@microsoft.com>
>>> Cc: dthaler1968@googlemail.com; bpf@vger.kernel.org
>>> Subject: Re: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by
>>> zero, overflow, and underflow
>>>
>>> On Fri, Sep 30, 2022 at 09:54:17PM +0000, Dave Thaler wrote:
>>>> [...]
>>>>>> +Also note that the modulo operation often varies by language
>>>>>> +when the dividend or divisor are negative, where Python, Ruby, etc.
>>>>>> +differ from C, Go, Java, etc. This specification requires that
>>>>>> +modulo use truncated division (where -13 % 3 == -1) as
>>>>>> +implemented in C, Go,
>>>>>> +etc.:
>>>>>> +
>>>>>> + a % n = a - n * trunc(a / n)
>>>>>> +
>>>>>
>>>>> Interesting bit of info, but I'm not sure how it relates to the ISA doc.
>>>>
>>>> It's because there's multiple definitions of modulo out there as the
>>>> paragraph notes, which differ in what they do with negative numbers.
>>>> The ISA defines the modulo operation as being the specific version above.
>>>> If you tried to implement the ISA in say Python and didn't know
>>>> that, you'd have a non-compliant implementation.
>>>
>>> Is it because the languages have weird rules to pick between signed vs
>>> unsigned mod?
>>> At least from llvm pov the smod and umod have fixed behavior.
>>
>> It's because there's different mathematical definitions and different languages
>> have chosen different definitions. E.g., languages/libraries that follow Knuth
>> use a different mathematical definition than C uses. For details see:
>>
>> https://en.wikipedia.org/wiki/Modulo_operation#Variants_of_the_definition
>>
>> https://torstencurdt.com/tech/posts/modulo-of-negative-numbers/
>>
>> Dave
>
> Perhaps text like the proposed snippet quoted in the exchange above should be
> added around the new text that now appears in the doc, i.e. the ambiguous text
> is currently:
>> For signed operations (``BPF_SDIV`` and ``BPF_SMOD``), for ``BPF_ALU``,
>> 'imm' is interpreted as a 32-bit signed value. For ``BPF_ALU64``, 'imm'
>> is first :term:`sign extended<Sign Extend>` from 32 to 64 bits, and then
>> interpreted as a 64-bit signed value.
>
> Dave
>
> --
> Bpf mailing list
> Bpf@ietf.org
> https://www.ietf.org/mailman/listinfo/bpf
--
Bpf mailing list
Bpf@ietf.org
https://www.ietf.org/mailman/listinfo/bpf
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [Bpf] Signed modulo operations
2023-09-30 6:07 ` Carsten Bormann
@ 2023-09-30 6:07 ` Carsten Bormann
0 siblings, 0 replies; 50+ messages in thread
From: Carsten Bormann @ 2023-09-30 6:07 UTC (permalink / raw)
To: Dave Thaler; +Cc: bpf, bpf
I didn’t follow the whole discussion, but it maybe it's worth pointing out that C’s % is not a modulo operator, but a remainder operator.
Grüße, Carsten
> On 29. Sep 2023, at 23:03, Dave Thaler <dthaler=40microsoft.com@dmarc.ietf.org> wrote:
>
> In the email discussion below, we concluded it wasn't relevant at the time because
> there were no signed modulo instructions. However, now there is and I believe the
> ambiguity in the current spec needs to be addressed.
>
>> -----Original Message-----
>> From: Dave Thaler
>> Sent: Friday, September 30, 2022 3:42 PM
>> To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
>> Cc: dthaler1968@googlemail.com; bpf@vger.kernel.org
>> Subject: RE: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero,
>> overflow, and underflow
>>
>>> -----Original Message-----
>>> From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
>>> Sent: Friday, September 30, 2022 2:59 PM
>>> To: Dave Thaler <dthaler@microsoft.com>
>>> Cc: dthaler1968@googlemail.com; bpf@vger.kernel.org
>>> Subject: Re: [PATCH 07/15] ebpf-docs: Fix modulo zero, division by
>>> zero, overflow, and underflow
>>>
>>> On Fri, Sep 30, 2022 at 09:54:17PM +0000, Dave Thaler wrote:
>>>> [...]
>>>>>> +Also note that the modulo operation often varies by language
>>>>>> +when the dividend or divisor are negative, where Python, Ruby, etc.
>>>>>> +differ from C, Go, Java, etc. This specification requires that
>>>>>> +modulo use truncated division (where -13 % 3 == -1) as
>>>>>> +implemented in C, Go,
>>>>>> +etc.:
>>>>>> +
>>>>>> + a % n = a - n * trunc(a / n)
>>>>>> +
>>>>>
>>>>> Interesting bit of info, but I'm not sure how it relates to the ISA doc.
>>>>
>>>> It's because there's multiple definitions of modulo out there as the
>>>> paragraph notes, which differ in what they do with negative numbers.
>>>> The ISA defines the modulo operation as being the specific version above.
>>>> If you tried to implement the ISA in say Python and didn't know
>>>> that, you'd have a non-compliant implementation.
>>>
>>> Is it because the languages have weird rules to pick between signed vs
>>> unsigned mod?
>>> At least from llvm pov the smod and umod have fixed behavior.
>>
>> It's because there's different mathematical definitions and different languages
>> have chosen different definitions. E.g., languages/libraries that follow Knuth
>> use a different mathematical definition than C uses. For details see:
>>
>> https://en.wikipedia.org/wiki/Modulo_operation#Variants_of_the_definition
>>
>> https://torstencurdt.com/tech/posts/modulo-of-negative-numbers/
>>
>> Dave
>
> Perhaps text like the proposed snippet quoted in the exchange above should be
> added around the new text that now appears in the doc, i.e. the ambiguous text
> is currently:
>> For signed operations (``BPF_SDIV`` and ``BPF_SMOD``), for ``BPF_ALU``,
>> 'imm' is interpreted as a 32-bit signed value. For ``BPF_ALU64``, 'imm'
>> is first :term:`sign extended<Sign Extend>` from 32 to 64 bits, and then
>> interpreted as a 64-bit signed value.
>
> Dave
>
> --
> Bpf mailing list
> Bpf@ietf.org
> https://www.ietf.org/mailman/listinfo/bpf
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [Bpf] Signed modulo operations
2023-09-29 21:03 ` Signed modulo operations Dave Thaler
2023-09-29 21:03 ` [Bpf] " Dave Thaler
2023-09-30 6:07 ` Carsten Bormann
@ 2023-09-30 14:48 ` Alexei Starovoitov
2023-09-30 14:48 ` Alexei Starovoitov
2023-10-02 13:19 ` Eduard Zingerman
2 siblings, 2 replies; 50+ messages in thread
From: Alexei Starovoitov @ 2023-09-30 14:48 UTC (permalink / raw)
To: Dave Thaler; +Cc: bpf, bpf
On Fri, Sep 29, 2023 at 2:03 PM Dave Thaler
<dthaler=40microsoft.com@dmarc.ietf.org> wrote:
>
> Perhaps text like the proposed snippet quoted in the exchange above should be
> added around the new text that now appears in the doc, i.e. the ambiguous text
> is currently:
> > For signed operations (``BPF_SDIV`` and ``BPF_SMOD``), for ``BPF_ALU``,
> > 'imm' is interpreted as a 32-bit signed value. For ``BPF_ALU64``, 'imm'
> > is first :term:`sign extended<Sign Extend>` from 32 to 64 bits, and then
> > interpreted as a 64-bit signed value.
That's what we have in the doc and it's a correct description.
Which part is ambiguous?
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [Bpf] Signed modulo operations
2023-09-30 14:48 ` Alexei Starovoitov
@ 2023-09-30 14:48 ` Alexei Starovoitov
2023-10-02 13:19 ` Eduard Zingerman
1 sibling, 0 replies; 50+ messages in thread
From: Alexei Starovoitov @ 2023-09-30 14:48 UTC (permalink / raw)
To: Dave Thaler; +Cc: bpf, bpf
On Fri, Sep 29, 2023 at 2:03 PM Dave Thaler
<dthaler=40microsoft.com@dmarc.ietf.org> wrote:
>
> Perhaps text like the proposed snippet quoted in the exchange above should be
> added around the new text that now appears in the doc, i.e. the ambiguous text
> is currently:
> > For signed operations (``BPF_SDIV`` and ``BPF_SMOD``), for ``BPF_ALU``,
> > 'imm' is interpreted as a 32-bit signed value. For ``BPF_ALU64``, 'imm'
> > is first :term:`sign extended<Sign Extend>` from 32 to 64 bits, and then
> > interpreted as a 64-bit signed value.
That's what we have in the doc and it's a correct description.
Which part is ambiguous?
--
Bpf mailing list
Bpf@ietf.org
https://www.ietf.org/mailman/listinfo/bpf
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [Bpf] Signed modulo operations
2023-09-30 14:48 ` Alexei Starovoitov
2023-09-30 14:48 ` Alexei Starovoitov
@ 2023-10-02 13:19 ` Eduard Zingerman
1 sibling, 0 replies; 50+ messages in thread
From: Eduard Zingerman @ 2023-10-02 13:19 UTC (permalink / raw)
To: Alexei Starovoitov, Dave Thaler; +Cc: bpf, bpf
On Sat, 2023-09-30 at 07:48 -0700, Alexei Starovoitov wrote:
> On Fri, Sep 29, 2023 at 2:03 PM Dave Thaler
> <dthaler=40microsoft.com@dmarc.ietf.org> wrote:
> >
> > Perhaps text like the proposed snippet quoted in the exchange above should be
> > added around the new text that now appears in the doc, i.e. the ambiguous text
> > is currently:
> > > For signed operations (``BPF_SDIV`` and ``BPF_SMOD``), for ``BPF_ALU``,
> > > 'imm' is interpreted as a 32-bit signed value. For ``BPF_ALU64``, 'imm'
> > > is first :term:`sign extended<Sign Extend>` from 32 to 64 bits, and then
> > > interpreted as a 64-bit signed value.
>
> That's what we have in the doc and it's a correct description.
> Which part is ambiguous?
>
As far as I understand Dave suggests to add exact specification for
the SMOD instruction as "signed modulo" might have different definitions [1].
I double checked and current clang implementation generates SMOD for
LLVM's 'srem' operation [2], which follows C semantics and defines
remainder as:
remainder = a - n * trunc(divident / divisor)
> This instruction returns the remainder of a division (where the
> result is either zero or has the same sign as the dividend, op1)
And this is consistent with interpreter logic in core.c, e.g.:
ALU64_MOD_K:
switch (OFF) {
case 0: ... break;
case 1:
AX = div64_s64(DST, IMM); /* implemented as '/' */
DST = DST - AX * IMM;
break;
}
CONT;
[1] https://en.wikipedia.org/wiki/Modulo
[2] https://llvm.org/docs/LangRef.html#srem-instruction
^ permalink raw reply [flat|nested] 50+ messages in thread
end of thread, other threads:[~2023-10-02 13:19 UTC | newest]
Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-27 18:59 [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file dthaler1968
2022-09-27 18:59 ` [PATCH 02/15] ebpf-docs: Linux byteswap note dthaler1968
2022-09-27 18:59 ` [PATCH 03/15] ebpf-docs: Move Clang notes to a separate file dthaler1968
2022-09-27 18:59 ` [PATCH 04/15] ebpf-docs: Add Clang note about BPF_ALU dthaler1968
2022-09-27 18:59 ` [PATCH 05/15] ebpf-docs: Add TOC and fix formatting dthaler1968
2022-09-27 18:59 ` [PATCH 06/15] ebpf-docs: Use standard type convention in standard doc dthaler1968
2022-09-30 20:49 ` Alexei Starovoitov
2022-09-27 18:59 ` [PATCH 07/15] ebpf-docs: Fix modulo zero, division by zero, overflow, and underflow dthaler1968
2022-09-30 20:52 ` Alexei Starovoitov
2022-09-30 21:54 ` Dave Thaler
2022-09-30 21:59 ` Alexei Starovoitov
2022-09-30 22:41 ` Dave Thaler
2022-09-30 23:41 ` Alexei Starovoitov
2022-10-04 16:36 ` Dave Thaler
2022-10-04 17:24 ` div_k. Was: " Alexei Starovoitov
2022-10-04 18:23 ` Dave Thaler
2022-10-04 18:34 ` Alexei Starovoitov
2023-09-29 21:03 ` Signed modulo operations Dave Thaler
2023-09-29 21:03 ` [Bpf] " Dave Thaler
2023-09-30 6:07 ` Carsten Bormann
2023-09-30 6:07 ` Carsten Bormann
2023-09-30 14:48 ` Alexei Starovoitov
2023-09-30 14:48 ` Alexei Starovoitov
2023-10-02 13:19 ` Eduard Zingerman
2022-09-27 18:59 ` [PATCH 08/15] ebpf-docs: Use consistent names for the same field dthaler1968
2022-09-30 20:57 ` Alexei Starovoitov
2022-10-04 14:44 ` Dave Thaler
2022-09-27 18:59 ` [PATCH 09/15] ebpf-docs: Explain helper functions dthaler1968
2022-09-30 22:01 ` Alexei Starovoitov
2022-09-30 23:01 ` Dave Thaler
2022-09-27 18:59 ` [PATCH 10/15] ebpf-docs: Add appendix of all opcodes in order dthaler1968
2022-09-30 22:02 ` Alexei Starovoitov
2022-09-30 22:43 ` Dave Thaler
2022-09-27 18:59 ` [PATCH 11/15] ebpf-docs: Improve English readability dthaler1968
2022-09-30 22:16 ` Alexei Starovoitov
2022-10-04 14:32 ` Dave Thaler
2022-10-04 15:38 ` Alexei Starovoitov
2022-10-04 15:55 ` Dave Thaler
2022-10-04 15:56 ` Dave Thaler
2022-10-04 16:19 ` Alexei Starovoitov
2022-10-04 16:41 ` Dave Thaler
2022-10-04 16:54 ` Dave Thaler
2022-10-06 20:44 ` Jim Harris
2022-09-27 18:59 ` [PATCH 12/15] ebpf-docs: Add Linux note about register calling convention dthaler1968
2022-09-30 22:17 ` Alexei Starovoitov
2022-09-27 18:59 ` [PATCH 13/15] ebpf-docs: Add extended 64-bit immediate instructions dthaler1968
2022-09-27 18:59 ` [PATCH 14/15] ebpf-docs: Add extended call instructions dthaler1968
2022-09-27 18:59 ` [PATCH 15/15] ebpf-docs: Add note about invalid instruction dthaler1968
2022-09-30 22:21 ` Alexei Starovoitov
2022-09-30 20:50 ` [PATCH 01/15] ebpf-docs: Move legacy packet instructions to a separate file patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).