* [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
@ 2022-12-07 20:55 Joanne Koong
  2022-12-07 20:55 ` [PATCH v2 bpf-next 1/6] bpf: Add bpf_dynptr_trim and bpf_dynptr_advance Joanne Koong
                   ` (6 more replies)
  0 siblings, 7 replies; 57+ messages in thread
From: Joanne Koong @ 2022-12-07 20:55 UTC (permalink / raw)
  To: bpf; +Cc: andrii, kernel-team, ast, daniel, martin.lau, song, Joanne Koong

This patchset is the 3rd in the dynptr series. The 1st can be found here [0]
and the 2nd can be found here [1].

In this patchset, the following convenience helpers are added for interacting
with bpf dynamic pointers:

    * bpf_dynptr_trim
    * bpf_dynptr_advance
    * bpf_dynptr_is_null
    * bpf_dynptr_is_rdonly
    * bpf_dynptr_get_size
    * bpf_dynptr_get_offset
    * bpf_dynptr_clone
    * bpf_dynptr_iterator

Dynptr information is queried through helper functions rather than by exposing
bpf_dynptr internals directly. This avoids constraining the layout of the dynptr
struct if it is modified in the future, and it consolidates the logic for
parsing the fields in one place.

In the future, some of these convenience helper calls will be inlined.

Please note that this patchset will be rebased on top of the dynptr
refactoring/fixes once those land upstream.

[0] https://lore.kernel.org/bpf/20220523210712.3641569-1-joannelkoong@gmail.com/
[1] https://lore.kernel.org/bpf/20221021011510.1890852-1-joannelkoong@gmail.com/


v1 -> v2:
v1: https://lore.kernel.org/bpf/20220908000254.3079129-1-joannelkoong@gmail.com/
* Drop patch adding "bpf_dynptr_data_rdonly"
* Add offset arg for bpf_dynptr_clone, to advance offset for cloned dynptr
* bpf_dynptr_iterator operates on a cloned dynptr instead of the original (Kumar, Andrii)

Joanne Koong (6):
  bpf: Add bpf_dynptr_trim and bpf_dynptr_advance
  bpf: Add bpf_dynptr_is_null and bpf_dynptr_is_rdonly
  bpf: Add bpf_dynptr_get_size and bpf_dynptr_get_offset
  bpf: Add bpf_dynptr_clone
  bpf: Add bpf_dynptr_iterator
  selftests/bpf: Tests for dynptr convenience helpers

 include/linux/bpf.h                           |   2 +-
 include/uapi/linux/bpf.h                      | 114 ++++
 kernel/bpf/helpers.c                          | 218 ++++++-
 kernel/bpf/verifier.c                         | 205 +++++--
 kernel/trace/bpf_trace.c                      |   4 +-
 scripts/bpf_doc.py                            |   3 +
 tools/include/uapi/linux/bpf.h                | 114 ++++
 .../testing/selftests/bpf/prog_tests/dynptr.c |  31 +
 .../testing/selftests/bpf/progs/dynptr_fail.c | 439 ++++++++++++++
 .../selftests/bpf/progs/dynptr_success.c      | 534 +++++++++++++++++-
 10 files changed, 1601 insertions(+), 63 deletions(-)

-- 
2.30.2



* [PATCH v2 bpf-next 1/6] bpf: Add bpf_dynptr_trim and bpf_dynptr_advance
  2022-12-07 20:55 [PATCH v2 bpf-next 0/6] Dynptr convenience helpers Joanne Koong
@ 2022-12-07 20:55 ` Joanne Koong
  2022-12-07 20:55 ` [PATCH v2 bpf-next 2/6] bpf: Add bpf_dynptr_is_null and bpf_dynptr_is_rdonly Joanne Koong
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 57+ messages in thread
From: Joanne Koong @ 2022-12-07 20:55 UTC (permalink / raw)
  To: bpf; +Cc: andrii, kernel-team, ast, daniel, martin.lau, song, Joanne Koong

Add two new helper functions: bpf_dynptr_trim and bpf_dynptr_advance.

bpf_dynptr_trim decreases the size of a dynptr by the specified
number of bytes (offset remains the same). bpf_dynptr_advance advances
the offset of the dynptr by the specified number of bytes (size
decreases correspondingly).

One example where trimming / advancing the dynptr may be useful is for
hashing. If the dynptr points to a larger struct, it is possible to hash
an individual field within the struct through dynptrs by using
bpf_dynptr_advance+trim.
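
The advance+trim pattern described above can be sketched in plain userspace
C. This is only a model under stated assumptions: struct model_dynptr, the
model_* functions, struct record and view_key_only are hypothetical names
invented for illustration, not the kernel's struct bpf_dynptr_kern or the BPF
helpers themselves. The bounds checks mirror the ones in bpf_dynptr_adjust
from this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for a dynptr (hypothetical, not the kernel struct). */
struct model_dynptr {
	const uint8_t *data;	/* NULL means the dynptr is invalid */
	uint32_t offset;	/* start of the view within data */
	uint32_t size;		/* usable bytes from offset onward */
};

/* Offset may only grow and size may only shrink, as in bpf_dynptr_adjust. */
static int model_adjust(struct model_dynptr *p, uint32_t off_inc, uint32_t sz_dec)
{
	assert(p != NULL);

	if (!p->data)
		return -EINVAL;
	if (sz_dec > p->size || off_inc > p->size)
		return -ERANGE;
	if (off_inc > UINT32_MAX - p->offset)
		return -ERANGE;

	p->offset += off_inc;
	p->size -= sz_dec;
	return 0;
}

/* Advance moves the start forward and shrinks the size by the same amount. */
static int model_advance(struct model_dynptr *p, uint32_t len)
{
	return model_adjust(p, len, len);
}

/* Trim shrinks the size from the end; the offset is untouched. */
static int model_trim(struct model_dynptr *p, uint32_t len)
{
	return model_adjust(p, 0, len);
}

enum { KEY_LEN = 8 };

/* Hypothetical record; only "key" should be visible to a hash function. */
struct record {
	uint32_t id;
	uint8_t key[KEY_LEN];
	uint32_t flags;
};

/* Narrow a view that covers a whole record down to just its key field. */
static int view_key_only(struct model_dynptr *p)
{
	int err = model_advance(p, offsetof(struct record, key));

	if (err)
		return err;

	/* Drop everything that follows the key. */
	return model_trim(p, p->size - KEY_LEN);
}
```

After view_key_only() succeeds on a view spanning one struct record, the
view starts at the key field and is KEY_LEN bytes long, so a hash over the
view would cover only that field.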

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 include/uapi/linux/bpf.h       | 18 +++++++++
 kernel/bpf/helpers.c           | 67 ++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h | 18 +++++++++
 3 files changed, 103 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index a9bb98365031..c2d915601484 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5537,6 +5537,22 @@ union bpf_attr {
  *		*flags* is currently unused, it must be 0 for now.
  *	Return
  *		0 on success, -EINVAL if flags is not 0.
+ *
+ * long bpf_dynptr_advance(struct bpf_dynptr *ptr, u32 len)
+ *	Description
+ *		Advance a dynptr's internal offset by *len* bytes.
+ *	Return
+ *		0 on success, -EINVAL if the dynptr is invalid, -ERANGE if *len*
+ *		exceeds the bounds of the dynptr.
+ *
+ * long bpf_dynptr_trim(struct bpf_dynptr *ptr, u32 len)
+ *	Description
+ *		Trim the size of memory pointed to by the dynptr by *len* bytes.
+ *
+ *		The offset is unmodified.
+ *	Return
+ *		0 on success, -EINVAL if the dynptr is invalid, -ERANGE if
+ *		trying to trim more bytes than the size of the dynptr.
  */
 #define ___BPF_FUNC_MAPPER(FN, ctx...)			\
 	FN(unspec, 0, ##ctx)				\
@@ -5753,6 +5769,8 @@ union bpf_attr {
 	FN(cgrp_storage_delete, 211, ##ctx)		\
 	FN(dynptr_from_skb, 212, ##ctx)			\
 	FN(dynptr_from_xdp, 213, ##ctx)			\
+	FN(dynptr_advance, 214, ##ctx)			\
+	FN(dynptr_trim, 215, ##ctx)			\
 	/* */
 
 /* backwards-compatibility macros for users of __BPF_FUNC_MAPPER that don't
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 3a9c8814aaf6..fa3989047ff6 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1429,6 +1429,13 @@ u32 bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr)
 	return ptr->size & DYNPTR_SIZE_MASK;
 }
 
+static void bpf_dynptr_set_size(struct bpf_dynptr_kern *ptr, u32 new_size)
+{
+	u32 metadata = ptr->size & ~DYNPTR_SIZE_MASK;
+
+	ptr->size = new_size | metadata;
+}
+
 int bpf_dynptr_check_size(u32 size)
 {
 	return size > DYNPTR_MAX_SIZE ? -E2BIG : 0;
@@ -1640,6 +1647,62 @@ static const struct bpf_func_proto bpf_dynptr_data_proto = {
 	.arg3_type	= ARG_CONST_ALLOC_SIZE_OR_ZERO,
 };
 
+/* For dynptrs, the offset may only be advanced and the size may only be decremented */
+static int bpf_dynptr_adjust(struct bpf_dynptr_kern *ptr, u32 off_inc, u32 sz_dec)
+{
+	u32 size;
+
+	if (!ptr->data)
+		return -EINVAL;
+
+	size = bpf_dynptr_get_size(ptr);
+
+	if (sz_dec > size)
+		return -ERANGE;
+
+	if (off_inc) {
+		u32 new_off;
+
+		if (off_inc > size)
+			return -ERANGE;
+
+		if (check_add_overflow(ptr->offset, off_inc, &new_off))
+			return -ERANGE;
+
+		ptr->offset = new_off;
+	}
+
+	bpf_dynptr_set_size(ptr, size - sz_dec);
+
+	return 0;
+}
+
+BPF_CALL_2(bpf_dynptr_advance, struct bpf_dynptr_kern *, ptr, u32, len)
+{
+	return bpf_dynptr_adjust(ptr, len, len);
+}
+
+static const struct bpf_func_proto bpf_dynptr_advance_proto = {
+	.func		= bpf_dynptr_advance,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_DYNPTR,
+	.arg2_type	= ARG_ANYTHING,
+};
+
+BPF_CALL_2(bpf_dynptr_trim, struct bpf_dynptr_kern *, ptr, u32, len)
+{
+	return bpf_dynptr_adjust(ptr, 0, len);
+}
+
+static const struct bpf_func_proto bpf_dynptr_trim_proto = {
+	.func		= bpf_dynptr_trim,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_DYNPTR,
+	.arg2_type	= ARG_ANYTHING,
+};
+
 const struct bpf_func_proto bpf_get_current_task_proto __weak;
 const struct bpf_func_proto bpf_get_current_task_btf_proto __weak;
 const struct bpf_func_proto bpf_probe_read_user_proto __weak;
@@ -1744,6 +1807,10 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 		return &bpf_dynptr_write_proto;
 	case BPF_FUNC_dynptr_data:
 		return &bpf_dynptr_data_proto;
+	case BPF_FUNC_dynptr_advance:
+		return &bpf_dynptr_advance_proto;
+	case BPF_FUNC_dynptr_trim:
+		return &bpf_dynptr_trim_proto;
 #ifdef CONFIG_CGROUPS
 	case BPF_FUNC_cgrp_storage_get:
 		return &bpf_cgrp_storage_get_proto;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index a9bb98365031..c2d915601484 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5537,6 +5537,22 @@ union bpf_attr {
  *		*flags* is currently unused, it must be 0 for now.
  *	Return
  *		0 on success, -EINVAL if flags is not 0.
+ *
+ * long bpf_dynptr_advance(struct bpf_dynptr *ptr, u32 len)
+ *	Description
+ *		Advance a dynptr's internal offset by *len* bytes.
+ *	Return
+ *		0 on success, -EINVAL if the dynptr is invalid, -ERANGE if *len*
+ *		exceeds the bounds of the dynptr.
+ *
+ * long bpf_dynptr_trim(struct bpf_dynptr *ptr, u32 len)
+ *	Description
+ *		Trim the size of memory pointed to by the dynptr by *len* bytes.
+ *
+ *		The offset is unmodified.
+ *	Return
+ *		0 on success, -EINVAL if the dynptr is invalid, -ERANGE if
+ *		trying to trim more bytes than the size of the dynptr.
  */
 #define ___BPF_FUNC_MAPPER(FN, ctx...)			\
 	FN(unspec, 0, ##ctx)				\
@@ -5753,6 +5769,8 @@ union bpf_attr {
 	FN(cgrp_storage_delete, 211, ##ctx)		\
 	FN(dynptr_from_skb, 212, ##ctx)			\
 	FN(dynptr_from_xdp, 213, ##ctx)			\
+	FN(dynptr_advance, 214, ##ctx)			\
+	FN(dynptr_trim, 215, ##ctx)			\
 	/* */
 
 /* backwards-compatibility macros for users of __BPF_FUNC_MAPPER that don't
-- 
2.30.2



* [PATCH v2 bpf-next 2/6] bpf: Add bpf_dynptr_is_null and bpf_dynptr_is_rdonly
  2022-12-07 20:55 [PATCH v2 bpf-next 0/6] Dynptr convenience helpers Joanne Koong
  2022-12-07 20:55 ` [PATCH v2 bpf-next 1/6] bpf: Add bpf_dynptr_trim and bpf_dynptr_advance Joanne Koong
@ 2022-12-07 20:55 ` Joanne Koong
  2022-12-07 20:55 ` [PATCH v2 bpf-next 3/6] bpf: Add bpf_dynptr_get_size and bpf_dynptr_get_offset Joanne Koong
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 57+ messages in thread
From: Joanne Koong @ 2022-12-07 20:55 UTC (permalink / raw)
  To: bpf; +Cc: andrii, kernel-team, ast, daniel, martin.lau, song, Joanne Koong

Add two new helper functions: bpf_dynptr_is_null and
bpf_dynptr_is_rdonly.

bpf_dynptr_is_null returns true if the dynptr is null / invalid
(determined by whether ptr->data is NULL), and false if the dynptr
is valid.

bpf_dynptr_is_rdonly returns true if the dynptr is read-only,
else false if the dynptr is read-writable.
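
The two predicates can be modelled in userspace C as below. This is a sketch
under assumptions: struct model_dynptr and the model_* names are hypothetical,
and only the read-only bit position (bit 31, matching DYNPTR_RDONLY_BIT in
helpers.c) is taken from the kernel source. It shows that a null dynptr is
never reported as read-only, whatever its flag bits say.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_RDONLY_BIT (1u << 31)	/* same bit as DYNPTR_RDONLY_BIT */

/* Userspace stand-in for a dynptr (hypothetical, not the kernel struct). */
struct model_dynptr {
	const void *data;	/* NULL means null / invalid */
	uint32_t size;		/* size in the low bits, flags in the high bits */
};

/* A dynptr is null exactly when it has no data pointer. */
static bool model_is_null(const struct model_dynptr *p)
{
	assert(p != NULL);
	return !p->data;
}

/* A null dynptr is reported as not read-only, regardless of its flag bits. */
static bool model_is_rdonly(const struct model_dynptr *p)
{
	if (model_is_null(p))
		return false;

	return (p->size & MODEL_RDONLY_BIT) != 0;
}
```

A caller that needs to tell "null" apart from "valid and writable" therefore
has to check model_is_null() first, since both cases make model_is_rdonly()
return false.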

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 include/uapi/linux/bpf.h       | 20 ++++++++++++++++++
 kernel/bpf/helpers.c           | 37 +++++++++++++++++++++++++++++++---
 scripts/bpf_doc.py             |  3 +++
 tools/include/uapi/linux/bpf.h | 20 ++++++++++++++++++
 4 files changed, 77 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c2d915601484..80582bc00bf4 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5553,6 +5553,24 @@ union bpf_attr {
  *	Return
  *		0 on success, -EINVAL if the dynptr is invalid, -ERANGE if
  *		trying to trim more bytes than the size of the dynptr.
+ *
+ * bool bpf_dynptr_is_null(struct bpf_dynptr *ptr)
+ *	Description
+ *		Determine whether a dynptr is null / invalid.
+ *
+ *		*ptr* must be an initialized dynptr.
+ *	Return
+ *		True if the dynptr is null, else false.
+ *
+ * bool bpf_dynptr_is_rdonly(struct bpf_dynptr *ptr)
+ *	Description
+ *		Determine whether a dynptr is read-only.
+ *
+ *		*ptr* must be an initialized dynptr. If *ptr*
+ *		is a null dynptr, this will return false.
+ *	Return
+ *		True if the dynptr is read-only and a valid dynptr,
+ *		else false.
  */
 #define ___BPF_FUNC_MAPPER(FN, ctx...)			\
 	FN(unspec, 0, ##ctx)				\
@@ -5771,6 +5789,8 @@ union bpf_attr {
 	FN(dynptr_from_xdp, 213, ##ctx)			\
 	FN(dynptr_advance, 214, ##ctx)			\
 	FN(dynptr_trim, 215, ##ctx)			\
+	FN(dynptr_is_null, 216, ##ctx)			\
+	FN(dynptr_is_rdonly, 217, ##ctx)		\
 	/* */
 
 /* backwards-compatibility macros for users of __BPF_FUNC_MAPPER that don't
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index fa3989047ff6..cd9e1a2972fe 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1404,7 +1404,7 @@ static const struct bpf_func_proto bpf_kptr_xchg_proto = {
 #define DYNPTR_SIZE_MASK	0xFFFFFF
 #define DYNPTR_RDONLY_BIT	BIT(31)
 
-static bool bpf_dynptr_is_rdonly(struct bpf_dynptr_kern *ptr)
+static bool __bpf_dynptr_is_rdonly(struct bpf_dynptr_kern *ptr)
 {
 	return ptr->size & DYNPTR_RDONLY_BIT;
 }
@@ -1547,7 +1547,7 @@ BPF_CALL_5(bpf_dynptr_write, struct bpf_dynptr_kern *, dst, u32, offset, void *,
 	enum bpf_dynptr_type type;
 	int err;
 
-	if (!dst->data || bpf_dynptr_is_rdonly(dst))
+	if (!dst->data || __bpf_dynptr_is_rdonly(dst))
 		return -EINVAL;
 
 	err = bpf_dynptr_check_off_len(dst, offset, len);
@@ -1605,7 +1605,7 @@ BPF_CALL_3(bpf_dynptr_data, struct bpf_dynptr_kern *, ptr, u32, offset, u32, len
 	switch (type) {
 	case BPF_DYNPTR_TYPE_LOCAL:
 	case BPF_DYNPTR_TYPE_RINGBUF:
-		if (bpf_dynptr_is_rdonly(ptr))
+		if (__bpf_dynptr_is_rdonly(ptr))
 			return 0;
 
 		data = ptr->data;
@@ -1703,6 +1703,33 @@ static const struct bpf_func_proto bpf_dynptr_trim_proto = {
 	.arg2_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_1(bpf_dynptr_is_null, struct bpf_dynptr_kern *, ptr)
+{
+	return !ptr->data;
+}
+
+static const struct bpf_func_proto bpf_dynptr_is_null_proto = {
+	.func		= bpf_dynptr_is_null,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_DYNPTR,
+};
+
+BPF_CALL_1(bpf_dynptr_is_rdonly, struct bpf_dynptr_kern *, ptr)
+{
+	if (!ptr->data)
+		return false;
+
+	return __bpf_dynptr_is_rdonly(ptr);
+}
+
+static const struct bpf_func_proto bpf_dynptr_is_rdonly_proto = {
+	.func		= bpf_dynptr_is_rdonly,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_DYNPTR,
+};
+
 const struct bpf_func_proto bpf_get_current_task_proto __weak;
 const struct bpf_func_proto bpf_get_current_task_btf_proto __weak;
 const struct bpf_func_proto bpf_probe_read_user_proto __weak;
@@ -1811,6 +1838,10 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 		return &bpf_dynptr_advance_proto;
 	case BPF_FUNC_dynptr_trim:
 		return &bpf_dynptr_trim_proto;
+	case BPF_FUNC_dynptr_is_null:
+		return &bpf_dynptr_is_null_proto;
+	case BPF_FUNC_dynptr_is_rdonly:
+		return &bpf_dynptr_is_rdonly_proto;
 #ifdef CONFIG_CGROUPS
 	case BPF_FUNC_cgrp_storage_get:
 		return &bpf_cgrp_storage_get_proto;
diff --git a/scripts/bpf_doc.py b/scripts/bpf_doc.py
index fdb0aff8cb5a..c20cf141e787 100755
--- a/scripts/bpf_doc.py
+++ b/scripts/bpf_doc.py
@@ -710,6 +710,7 @@ class PrinterHelpers(Printer):
             'int',
             'long',
             'unsigned long',
+            'bool',
 
             '__be16',
             '__be32',
@@ -781,6 +782,8 @@ class PrinterHelpers(Printer):
         header = '''\
 /* This is auto-generated file. See bpf_doc.py for details. */
 
+#include <stdbool.h>
+
 /* Forward declarations of BPF structs */'''
 
         print(header)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index c2d915601484..80582bc00bf4 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5553,6 +5553,24 @@ union bpf_attr {
  *	Return
  *		0 on success, -EINVAL if the dynptr is invalid, -ERANGE if
  *		trying to trim more bytes than the size of the dynptr.
+ *
+ * bool bpf_dynptr_is_null(struct bpf_dynptr *ptr)
+ *	Description
+ *		Determine whether a dynptr is null / invalid.
+ *
+ *		*ptr* must be an initialized dynptr.
+ *	Return
+ *		True if the dynptr is null, else false.
+ *
+ * bool bpf_dynptr_is_rdonly(struct bpf_dynptr *ptr)
+ *	Description
+ *		Determine whether a dynptr is read-only.
+ *
+ *		*ptr* must be an initialized dynptr. If *ptr*
+ *		is a null dynptr, this will return false.
+ *	Return
+ *		True if the dynptr is read-only and a valid dynptr,
+ *		else false.
  */
 #define ___BPF_FUNC_MAPPER(FN, ctx...)			\
 	FN(unspec, 0, ##ctx)				\
@@ -5771,6 +5789,8 @@ union bpf_attr {
 	FN(dynptr_from_xdp, 213, ##ctx)			\
 	FN(dynptr_advance, 214, ##ctx)			\
 	FN(dynptr_trim, 215, ##ctx)			\
+	FN(dynptr_is_null, 216, ##ctx)			\
+	FN(dynptr_is_rdonly, 217, ##ctx)		\
 	/* */
 
 /* backwards-compatibility macros for users of __BPF_FUNC_MAPPER that don't
-- 
2.30.2



* [PATCH v2 bpf-next 3/6] bpf: Add bpf_dynptr_get_size and bpf_dynptr_get_offset
  2022-12-07 20:55 [PATCH v2 bpf-next 0/6] Dynptr convenience helpers Joanne Koong
  2022-12-07 20:55 ` [PATCH v2 bpf-next 1/6] bpf: Add bpf_dynptr_trim and bpf_dynptr_advance Joanne Koong
  2022-12-07 20:55 ` [PATCH v2 bpf-next 2/6] bpf: Add bpf_dynptr_is_null and bpf_dynptr_is_rdonly Joanne Koong
@ 2022-12-07 20:55 ` Joanne Koong
  2022-12-07 20:55 ` [PATCH v2 bpf-next 4/6] bpf: Add bpf_dynptr_clone Joanne Koong
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 57+ messages in thread
From: Joanne Koong @ 2022-12-07 20:55 UTC (permalink / raw)
  To: bpf; +Cc: andrii, kernel-team, ast, daniel, martin.lau, song, Joanne Koong

Add two new helper functions: bpf_dynptr_get_size and
bpf_dynptr_get_offset.

bpf_dynptr_get_size returns the number of usable bytes in a dynptr and
bpf_dynptr_get_offset returns the current offset into the dynptr.
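
As a rough illustration of how size and offset relate, the userspace C model
below packs the size into the low 24 bits of a 32-bit word (matching
DYNPTR_SIZE_MASK in helpers.c) and keeps the remaining bits as metadata. All
model_* names and struct model_dynptr are hypothetical. The advance function
here is simplified and omits the offset overflow check that the kernel
version performs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_SIZE_MASK 0xFFFFFFu	/* same mask as DYNPTR_SIZE_MASK */

/* Userspace stand-in for a dynptr (hypothetical, not the kernel struct). */
struct model_dynptr {
	const void *data;	/* NULL means null / invalid */
	uint32_t offset;
	uint32_t size;		/* low 24 bits: size; upper bits: metadata */
};

/* Replace the size bits while preserving the metadata bits. */
static void model_set_size(struct model_dynptr *p, uint32_t new_size)
{
	uint32_t metadata = p->size & ~MODEL_SIZE_MASK;

	p->size = new_size | metadata;
}

/* Number of usable bytes, or -EINVAL for a null dynptr. */
static long model_get_size(const struct model_dynptr *p)
{
	assert(p != NULL);
	if (!p->data)
		return -EINVAL;

	return (long)(p->size & MODEL_SIZE_MASK);
}

/* Current offset, or -EINVAL for a null dynptr. */
static long model_get_offset(const struct model_dynptr *p)
{
	assert(p != NULL);
	if (!p->data)
		return -EINVAL;

	return (long)p->offset;
}

/* Simplified advance: no offset overflow check, unlike the kernel helper. */
static int model_advance(struct model_dynptr *p, uint32_t len)
{
	uint32_t size;

	assert(p != NULL);
	if (!p->data)
		return -EINVAL;

	size = p->size & MODEL_SIZE_MASK;
	if (len > size)
		return -ERANGE;

	p->offset += len;
	model_set_size(p, size - len);
	return 0;
}
```

With a model dynptr initialized to 100 bytes, advancing by 40 leaves a size
of 60 and an offset of 40, which is the example given in the helper
documentation in this patch.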

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 include/linux/bpf.h            |  2 +-
 include/uapi/linux/bpf.h       | 25 +++++++++++++++++++++
 kernel/bpf/helpers.c           | 40 +++++++++++++++++++++++++++++++---
 kernel/trace/bpf_trace.c       |  4 ++--
 tools/include/uapi/linux/bpf.h | 25 +++++++++++++++++++++
 5 files changed, 90 insertions(+), 6 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 5628256de3e5..753444e1478c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1118,7 +1118,7 @@ enum bpf_dynptr_type {
 };
 
 int bpf_dynptr_check_size(u32 size);
-u32 bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr);
+u32 __bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr);
 
 #ifdef CONFIG_BPF_JIT
 int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 80582bc00bf4..5ad52d481cde 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5571,6 +5571,29 @@ union bpf_attr {
  *	Return
  *		True if the dynptr is read-only and a valid dynptr,
  *		else false.
+ *
+ * long bpf_dynptr_get_size(struct bpf_dynptr *ptr)
+ *	Description
+ *		Get the size of *ptr*.
+ *
+ *		Size refers to the number of usable bytes. For example,
+ *		if *ptr* was initialized with 100 bytes and its
+ *		offset was advanced by 40 bytes, then the size will be
+ *		60 bytes.
+ *
+ *		*ptr* must be an initialized dynptr.
+ *	Return
+ *		The size of the dynptr on success, -EINVAL if the dynptr is
+ *		invalid.
+ *
+ * long bpf_dynptr_get_offset(struct bpf_dynptr *ptr)
+ *	Description
+ *		Get the offset of the dynptr.
+ *
+ *		*ptr* must be an initialized dynptr.
+ *	Return
+ *		The offset of the dynptr on success, -EINVAL if the dynptr is
+ *		invalid.
  */
 #define ___BPF_FUNC_MAPPER(FN, ctx...)			\
 	FN(unspec, 0, ##ctx)				\
@@ -5791,6 +5814,8 @@ union bpf_attr {
 	FN(dynptr_trim, 215, ##ctx)			\
 	FN(dynptr_is_null, 216, ##ctx)			\
 	FN(dynptr_is_rdonly, 217, ##ctx)		\
+	FN(dynptr_get_size, 218, ##ctx)		\
+	FN(dynptr_get_offset, 219, ##ctx)		\
 	/* */
 
 /* backwards-compatibility macros for users of __BPF_FUNC_MAPPER that don't
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index cd9e1a2972fe..0164d7e4b5a6 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1424,7 +1424,7 @@ static enum bpf_dynptr_type bpf_dynptr_get_type(const struct bpf_dynptr_kern *pt
 	return (ptr->size & ~(DYNPTR_RDONLY_BIT)) >> DYNPTR_TYPE_SHIFT;
 }
 
-u32 bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr)
+u32 __bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr)
 {
 	return ptr->size & DYNPTR_SIZE_MASK;
 }
@@ -1457,7 +1457,7 @@ void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr)
 
 static int bpf_dynptr_check_off_len(struct bpf_dynptr_kern *ptr, u32 offset, u32 len)
 {
-	u32 size = bpf_dynptr_get_size(ptr);
+	u32 size = __bpf_dynptr_get_size(ptr);
 
 	if (len > size || offset > size - len)
 		return -E2BIG;
@@ -1655,7 +1655,7 @@ static int bpf_dynptr_adjust(struct bpf_dynptr_kern *ptr, u32 off_inc, u32 sz_de
 	if (!ptr->data)
 		return -EINVAL;
 
-	size = bpf_dynptr_get_size(ptr);
+	size = __bpf_dynptr_get_size(ptr);
 
 	if (sz_dec > size)
 		return -ERANGE;
@@ -1730,6 +1730,36 @@ static const struct bpf_func_proto bpf_dynptr_is_rdonly_proto = {
 	.arg1_type	= ARG_PTR_TO_DYNPTR,
 };
 
+BPF_CALL_1(bpf_dynptr_get_size, struct bpf_dynptr_kern *, ptr)
+{
+	if (!ptr->data)
+		return -EINVAL;
+
+	return __bpf_dynptr_get_size(ptr);
+}
+
+static const struct bpf_func_proto bpf_dynptr_get_size_proto = {
+	.func		= bpf_dynptr_get_size,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_DYNPTR,
+};
+
+BPF_CALL_1(bpf_dynptr_get_offset, struct bpf_dynptr_kern *, ptr)
+{
+	if (!ptr->data)
+		return -EINVAL;
+
+	return ptr->offset;
+}
+
+static const struct bpf_func_proto bpf_dynptr_get_offset_proto = {
+	.func		= bpf_dynptr_get_offset,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_DYNPTR,
+};
+
 const struct bpf_func_proto bpf_get_current_task_proto __weak;
 const struct bpf_func_proto bpf_get_current_task_btf_proto __weak;
 const struct bpf_func_proto bpf_probe_read_user_proto __weak;
@@ -1842,6 +1872,10 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 		return &bpf_dynptr_is_null_proto;
 	case BPF_FUNC_dynptr_is_rdonly:
 		return &bpf_dynptr_is_rdonly_proto;
+	case BPF_FUNC_dynptr_get_size:
+		return &bpf_dynptr_get_size_proto;
+	case BPF_FUNC_dynptr_get_offset:
+		return &bpf_dynptr_get_offset_proto;
 #ifdef CONFIG_CGROUPS
 	case BPF_FUNC_cgrp_storage_get:
 		return &bpf_cgrp_storage_get_proto;
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 3bbd3f0c810c..e057570b4e2c 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1349,9 +1349,9 @@ int bpf_verify_pkcs7_signature(struct bpf_dynptr_kern *data_ptr,
 	}
 
 	return verify_pkcs7_signature(data_ptr->data,
-				      bpf_dynptr_get_size(data_ptr),
+				      __bpf_dynptr_get_size(data_ptr),
 				      sig_ptr->data,
-				      bpf_dynptr_get_size(sig_ptr),
+				      __bpf_dynptr_get_size(sig_ptr),
 				      trusted_keyring->key,
 				      VERIFYING_UNSPECIFIED_SIGNATURE, NULL,
 				      NULL);
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 80582bc00bf4..5ad52d481cde 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5571,6 +5571,29 @@ union bpf_attr {
  *	Return
  *		True if the dynptr is read-only and a valid dynptr,
  *		else false.
+ *
+ * long bpf_dynptr_get_size(struct bpf_dynptr *ptr)
+ *	Description
+ *		Get the size of *ptr*.
+ *
+ *		Size refers to the number of usable bytes. For example,
+ *		if *ptr* was initialized with 100 bytes and its
+ *		offset was advanced by 40 bytes, then the size will be
+ *		60 bytes.
+ *
+ *		*ptr* must be an initialized dynptr.
+ *	Return
+ *		The size of the dynptr on success, -EINVAL if the dynptr is
+ *		invalid.
+ *
+ * long bpf_dynptr_get_offset(struct bpf_dynptr *ptr)
+ *	Description
+ *		Get the offset of the dynptr.
+ *
+ *		*ptr* must be an initialized dynptr.
+ *	Return
+ *		The offset of the dynptr on success, -EINVAL if the dynptr is
+ *		invalid.
  */
 #define ___BPF_FUNC_MAPPER(FN, ctx...)			\
 	FN(unspec, 0, ##ctx)				\
@@ -5791,6 +5814,8 @@ union bpf_attr {
 	FN(dynptr_trim, 215, ##ctx)			\
 	FN(dynptr_is_null, 216, ##ctx)			\
 	FN(dynptr_is_rdonly, 217, ##ctx)		\
+	FN(dynptr_get_size, 218, ##ctx)		\
+	FN(dynptr_get_offset, 219, ##ctx)		\
 	/* */
 
 /* backwards-compatibility macros for users of __BPF_FUNC_MAPPER that don't
-- 
2.30.2



* [PATCH v2 bpf-next 4/6] bpf: Add bpf_dynptr_clone
  2022-12-07 20:55 [PATCH v2 bpf-next 0/6] Dynptr convenience helpers Joanne Koong
                   ` (2 preceding siblings ...)
  2022-12-07 20:55 ` [PATCH v2 bpf-next 3/6] bpf: Add bpf_dynptr_get_size and bpf_dynptr_get_offset Joanne Koong
@ 2022-12-07 20:55 ` Joanne Koong
  2022-12-07 20:55 ` [PATCH v2 bpf-next 5/6] bpf: Add bpf_dynptr_iterator Joanne Koong
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 57+ messages in thread
From: Joanne Koong @ 2022-12-07 20:55 UTC (permalink / raw)
  To: bpf; +Cc: andrii, kernel-team, ast, daniel, martin.lau, song, Joanne Koong

Add a new helper, bpf_dynptr_clone, which clones a dynptr.

The cloned dynptr will point to the same data as its parent dynptr,
with the same type, offset, size and read-only properties.

Any writes to a dynptr will be reflected across all instances
(by 'instance', this means any dynptrs that point to the same
underlying data).

Please note that data slice and dynptr invalidations affect all
instances as well. For example, if bpf_dynptr_write() is called on an
skb-type dynptr, the data slices of every dynptr instance for that skb
are invalidated (e.g. data slices of any clones, parents,
grandparents, ...). Similarly, if a ringbuf dynptr is submitted,
every instance of that dynptr is invalidated.

Changing the view of the dynptr (e.g. advancing the offset or
trimming the size) only affects that dynptr and does not affect any
other instances.

One example use case where cloning may be helpful is for hashing or
iterating through dynptr data. Cloning will allow the user to maintain
the original view of the dynptr for future use, while also allowing
views to smaller subsets of the data after the offset is advanced or the
size is trimmed.
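
The "shared data, independent views" behaviour can be sketched in userspace
C as follows. The names struct model_dynptr, model_adjust, model_clone and
model_first_byte are hypothetical and exist only for this illustration.
model_clone follows the control flow of bpf_dynptr_clone in this patch (copy,
optionally advance, null the clone on error), but the verifier-side tracking
of invalidation is not modelled at all.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for a dynptr (hypothetical, not the kernel struct). */
struct model_dynptr {
	uint8_t *data;		/* shared underlying buffer; NULL if invalid */
	uint32_t offset;	/* per-instance view start */
	uint32_t size;		/* per-instance view length */
};

/* Offset may only grow and size may only shrink. */
static int model_adjust(struct model_dynptr *p, uint32_t off_inc, uint32_t sz_dec)
{
	if (!p->data)
		return -EINVAL;
	if (sz_dec > p->size || off_inc > p->size)
		return -ERANGE;
	if (off_inc > UINT32_MAX - p->offset)
		return -ERANGE;

	p->offset += off_inc;
	p->size -= sz_dec;
	return 0;
}

/* Copy the parent's view into clone, then advance the clone by offset. */
static int model_clone(const struct model_dynptr *ptr,
		       struct model_dynptr *clone, uint32_t offset)
{
	int err = -EINVAL;

	assert(ptr != NULL && clone != NULL);

	if (!ptr->data)
		goto error;

	memcpy(clone, ptr, sizeof(*clone));

	if (offset) {
		err = model_adjust(clone, offset, offset);
		if (err)
			goto error;
	}

	return 0;

error:
	/* On failure the clone is left null, never half-initialized. */
	memset(clone, 0, sizeof(*clone));
	return err;
}

/* Pointer to the first byte of the view (caller ensures size > 0). */
static uint8_t *model_first_byte(const struct model_dynptr *p)
{
	return p->data + p->offset;
}
```

Cloning a 16-byte view at offset 4 yields a clone with offset 4 and size 12
while the parent still covers all 16 bytes. A byte written through the clone
is visible through the parent, because both instances reference the same
buffer.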

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 include/uapi/linux/bpf.h       |  26 +++++++
 kernel/bpf/helpers.c           |  34 ++++++++
 kernel/bpf/verifier.c          | 138 +++++++++++++++++++++------------
 tools/include/uapi/linux/bpf.h |  26 +++++++
 4 files changed, 173 insertions(+), 51 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 5ad52d481cde..f9387c5aba2b 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5594,6 +5594,31 @@ union bpf_attr {
  *	Return
  *		The offset of the dynptr on success, -EINVAL if the dynptr is
  *		invalid.
+ *
+ * long bpf_dynptr_clone(struct bpf_dynptr *ptr, struct bpf_dynptr *clone, u32 offset)
+ *	Description
+ *		Clone an initialized dynptr *ptr*. After this call, both *ptr*
+ *		and *clone* will point to the same underlying data. If non-zero,
+ *		*offset* specifies how many bytes to advance the cloned dynptr by.
+ *
+ *		*clone* must be an uninitialized dynptr.
+ *
+ *		Any data slice or dynptr invalidations will apply equally for
+ *		both dynptrs after this call. For example, if ptr1 is a
+ *		ringbuf-type dynptr with multiple data slices that is cloned to
+ *		ptr2, if ptr2 discards the ringbuf sample, then ptr2, ptr2's
+ *		data slices, ptr1, and ptr1's data slices will all be
+ *		invalidated.
+ *
+ *		This is convenient for getting different "views" to the same
+ *		data. For instance, if one wishes to hash only a particular
+ *		section of data, one can clone the dynptr, advance it to a
+ *		specified offset and trim it to a specified size, pass it
+ *		to the hash function, and discard it after hashing, without
+ *		losing access to the original view of the dynptr.
+ *	Return
+ *		0 on success, -EINVAL if the dynptr to clone is invalid, -ERANGE
+ *		if attempting to clone the dynptr at an out of range offset.
  */
 #define ___BPF_FUNC_MAPPER(FN, ctx...)			\
 	FN(unspec, 0, ##ctx)				\
@@ -5816,6 +5841,7 @@ union bpf_attr {
 	FN(dynptr_is_rdonly, 217, ##ctx)		\
 	FN(dynptr_get_size, 218, ##ctx)		\
 	FN(dynptr_get_offset, 219, ##ctx)		\
+	FN(dynptr_clone, 220, ##ctx)			\
 	/* */
 
 /* backwards-compatibility macros for users of __BPF_FUNC_MAPPER that don't
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 0164d7e4b5a6..0c2cfb4ed33c 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1760,6 +1760,38 @@ static const struct bpf_func_proto bpf_dynptr_get_offset_proto = {
 	.arg1_type	= ARG_PTR_TO_DYNPTR,
 };
 
+BPF_CALL_3(bpf_dynptr_clone, struct bpf_dynptr_kern *, ptr,
+	   struct bpf_dynptr_kern *, clone, u32, offset)
+{
+	int err = -EINVAL;
+
+	if (!ptr->data)
+		goto error;
+
+	memcpy(clone, ptr, sizeof(*clone));
+
+	if (offset) {
+		err = bpf_dynptr_adjust(clone, offset, offset);
+		if (err)
+			goto error;
+	}
+
+	return 0;
+
+error:
+	bpf_dynptr_set_null(clone);
+	return err;
+}
+
+static const struct bpf_func_proto bpf_dynptr_clone_proto = {
+	.func		= bpf_dynptr_clone,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_DYNPTR,
+	.arg2_type	= ARG_PTR_TO_DYNPTR | MEM_UNINIT,
+	.arg3_type	= ARG_ANYTHING,
+};
+
 const struct bpf_func_proto bpf_get_current_task_proto __weak;
 const struct bpf_func_proto bpf_get_current_task_btf_proto __weak;
 const struct bpf_func_proto bpf_probe_read_user_proto __weak;
@@ -1876,6 +1908,8 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 		return &bpf_dynptr_get_size_proto;
 	case BPF_FUNC_dynptr_get_offset:
 		return &bpf_dynptr_get_offset_proto;
+	case BPF_FUNC_dynptr_clone:
+		return &bpf_dynptr_clone_proto;
 #ifdef CONFIG_CGROUPS
 	case BPF_FUNC_cgrp_storage_get:
 		return &bpf_cgrp_storage_get_proto;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 4d81d159254b..3f617f7040b7 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -719,17 +719,53 @@ static enum bpf_dynptr_type arg_to_dynptr_type(enum bpf_arg_type arg_type)
 	}
 }
 
+static bool arg_type_is_dynptr(enum bpf_arg_type type)
+{
+	return base_type(type) == ARG_PTR_TO_DYNPTR;
+}
+
 static bool dynptr_type_refcounted(enum bpf_dynptr_type type)
 {
 	return type == BPF_DYNPTR_TYPE_RINGBUF;
 }
 
-static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
-				   enum bpf_arg_type arg_type, int insn_idx)
+static struct bpf_reg_state *get_dynptr_arg_reg(const struct bpf_func_proto *fn,
+						struct bpf_reg_state *regs)
+{
+	enum bpf_arg_type t;
+	int i;
+
+	for (i = 0; i < MAX_BPF_FUNC_REG_ARGS; i++) {
+		t = fn->arg_type[i];
+		if (arg_type_is_dynptr(t) && !(t & MEM_UNINIT))
+			return &regs[BPF_REG_1 + i];
+	}
+
+	return NULL;
+}
+
+static enum bpf_dynptr_type stack_slot_get_dynptr_info(struct bpf_verifier_env *env,
+						       struct bpf_reg_state *reg,
+						       int *ref_obj_id)
+{
+	struct bpf_func_state *state = func(env, reg);
+	int spi = get_spi(reg->off);
+
+	if (ref_obj_id)
+		*ref_obj_id = state->stack[spi].spilled_ptr.id;
+
+	return state->stack[spi].spilled_ptr.dynptr.type;
+}
+
+static int mark_stack_slots_dynptr(struct bpf_verifier_env *env,
+				   const struct bpf_func_proto *fn,
+				   struct bpf_reg_state *reg,
+				   enum bpf_arg_type arg_type,
+				   int insn_idx, int func_id)
 {
 	struct bpf_func_state *state = func(env, reg);
 	enum bpf_dynptr_type type;
-	int spi, i, id;
+	int spi, i, id = 0;
 
 	spi = get_spi(reg->off);
 
@@ -741,7 +777,21 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
 		state->stack[spi - 1].slot_type[i] = STACK_DYNPTR;
 	}
 
-	type = arg_to_dynptr_type(arg_type);
+	if (func_id == BPF_FUNC_dynptr_clone) {
+		/* find the type and id of the dynptr we're cloning and
+		 * assign it to the clone
+		 */
+		struct bpf_reg_state *parent_state = get_dynptr_arg_reg(fn, state->regs);
+
+		if (!parent_state) {
+			verbose(env, "verifier internal error: no parent dynptr in bpf_dynptr_clone()\n");
+			return -EFAULT;
+		}
+		type = stack_slot_get_dynptr_info(env, parent_state, &id);
+	} else {
+		type = arg_to_dynptr_type(arg_type);
+	}
+
 	if (type == BPF_DYNPTR_TYPE_INVALID)
 		return -EINVAL;
 
@@ -751,9 +801,11 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
 
 	if (dynptr_type_refcounted(type)) {
 		/* The id is used to track proper releasing */
-		id = acquire_reference_state(env, insn_idx);
-		if (id < 0)
-			return id;
+		if (!id) {
+			id = acquire_reference_state(env, insn_idx);
+			if (id < 0)
+				return id;
+		}
 
 		state->stack[spi].spilled_ptr.id = id;
 		state->stack[spi - 1].spilled_ptr.id = id;
@@ -762,6 +814,17 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
 	return 0;
 }
 
+static void invalidate_dynptr(struct bpf_func_state *state, int spi)
+{
+	int i;
+
+	state->stack[spi].spilled_ptr.id = 0;
+	for (i = 0; i < BPF_REG_SIZE; i++)
+		state->stack[spi].slot_type[i] = STACK_INVALID;
+	state->stack[spi].spilled_ptr.dynptr.first_slot = false;
+	state->stack[spi].spilled_ptr.dynptr.type = 0;
+}
+
 static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
 {
 	struct bpf_func_state *state = func(env, reg);
@@ -772,22 +835,25 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
 	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
 		return -EINVAL;
 
-	for (i = 0; i < BPF_REG_SIZE; i++) {
-		state->stack[spi].slot_type[i] = STACK_INVALID;
-		state->stack[spi - 1].slot_type[i] = STACK_INVALID;
-	}
-
-	/* Invalidate any slices associated with this dynptr */
 	if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type)) {
+		int id = state->stack[spi].spilled_ptr.id;
+
+		/* If the dynptr is refcounted, we need to invalidate two things:
+		 * 1) any dynptrs with a matching id
+		 * 2) any slices associated with the dynptr id
+		 */
+
 		release_reference(env, state->stack[spi].spilled_ptr.id);
-		state->stack[spi].spilled_ptr.id = 0;
-		state->stack[spi - 1].spilled_ptr.id = 0;
+		for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) {
+			if (state->stack[i].slot_type[0] == STACK_DYNPTR &&
+			    state->stack[i].spilled_ptr.id == id)
+				invalidate_dynptr(state, i);
+		}
+	} else {
+		invalidate_dynptr(state, spi);
+		invalidate_dynptr(state, spi - 1);
 	}
 
-	state->stack[spi].spilled_ptr.dynptr.first_slot = false;
-	state->stack[spi].spilled_ptr.dynptr.type = 0;
-	state->stack[spi - 1].spilled_ptr.dynptr.type = 0;
-
 	return 0;
 }
 
@@ -5862,11 +5928,6 @@ static bool arg_type_is_release(enum bpf_arg_type type)
 	return type & OBJ_RELEASE;
 }
 
-static bool arg_type_is_dynptr(enum bpf_arg_type type)
-{
-	return base_type(type) == ARG_PTR_TO_DYNPTR;
-}
-
 static int int_ptr_type_to_size(enum bpf_arg_type type)
 {
 	if (type == ARG_PTR_TO_INT)
@@ -6176,31 +6237,6 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
 	return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
 }
 
-static struct bpf_reg_state *get_dynptr_arg_reg(const struct bpf_func_proto *fn,
-						struct bpf_reg_state *regs)
-{
-	int i;
-
-	for (i = 0; i < MAX_BPF_FUNC_REG_ARGS; i++)
-		if (arg_type_is_dynptr(fn->arg_type[i]))
-			return &regs[BPF_REG_1 + i];
-
-	return NULL;
-}
-
-static enum bpf_dynptr_type stack_slot_get_dynptr_info(struct bpf_verifier_env *env,
-						       struct bpf_reg_state *reg,
-						       int *ref_obj_id)
-{
-	struct bpf_func_state *state = func(env, reg);
-	int spi = get_spi(reg->off);
-
-	if (ref_obj_id)
-		*ref_obj_id = state->stack[spi].spilled_ptr.id;
-
-	return state->stack[spi].spilled_ptr.dynptr.type;
-}
-
 static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 			  struct bpf_call_arg_meta *meta,
 			  const struct bpf_func_proto *fn)
@@ -7697,9 +7733,9 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 				return err;
 		}
 
-		err = mark_stack_slots_dynptr(env, &regs[meta.uninit_dynptr_regno],
+		err = mark_stack_slots_dynptr(env, fn, &regs[meta.uninit_dynptr_regno],
 					      fn->arg_type[meta.uninit_dynptr_regno - BPF_REG_1],
-					      insn_idx);
+					      insn_idx, func_id);
 		if (err)
 			return err;
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 5ad52d481cde..f9387c5aba2b 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5594,6 +5594,31 @@ union bpf_attr {
  *	Return
  *		The offset of the dynptr on success, -EINVAL if the dynptr is
  *		invalid.
+ *
+ * long bpf_dynptr_clone(struct bpf_dynptr *ptr, struct bpf_dynptr *clone, u32 offset)
+ *	Description
+ *		Clone an initialized dynptr *ptr*. After this call, both *ptr*
+ *		and *clone* will point to the same underlying data. If non-zero,
+ *		*offset* specifies how many bytes to advance the cloned dynptr by.
+ *
+ *		*clone* must be an uninitialized dynptr.
+ *
+ *		Any data slice or dynptr invalidations will apply equally for
+ *		both dynptrs after this call. For example, if ptr1 is a
+ *		ringbuf-type dynptr with multiple data slices that is cloned to
+ *		ptr2, if ptr2 discards the ringbuf sample, then ptr2, ptr2's
+ *		data slices, ptr1, and ptr1's data slices will all be
+ *		invalidated.
+ *
+ *		This is convenient for getting different "views" to the same
+ *		data. For instance, if one wishes to hash only a particular
+ *		section of data, one can clone the dynptr, advance it to a
+ *		specified offset and trim it to a specified size, pass it
+ *		to the hash function, and discard it after hashing, without
+ *		losing access to the original view of the dynptr.
+ *	Return
+ *		0 on success, -EINVAL if the dynptr to clone is invalid, -ERANGE
+ *		if attempting to clone the dynptr at an out of range offset.
  */
 #define ___BPF_FUNC_MAPPER(FN, ctx...)			\
 	FN(unspec, 0, ##ctx)				\
@@ -5816,6 +5841,7 @@ union bpf_attr {
 	FN(dynptr_is_rdonly, 217, ##ctx)		\
 	FN(dynptr_get_size, 218, ##ctx)		\
 	FN(dynptr_get_offset, 219, ##ctx)		\
+	FN(dynptr_clone, 220, ##ctx)			\
 	/* */
 
 /* backwards-compatibility macros for users of __BPF_FUNC_MAPPER that don't
-- 
2.30.2



* [PATCH v2 bpf-next 5/6] bpf: Add bpf_dynptr_iterator
  2022-12-07 20:55 [PATCH v2 bpf-next 0/6] Dynptr convenience helpers Joanne Koong
                   ` (3 preceding siblings ...)
  2022-12-07 20:55 ` [PATCH v2 bpf-next 4/6] bpf: Add bpf_dynptr_clone Joanne Koong
@ 2022-12-07 20:55 ` Joanne Koong
  2022-12-07 20:55 ` [PATCH v2 bpf-next 6/6] selftests/bpf: Tests for dynptr convenience helpers Joanne Koong
  2022-12-08  1:54 ` [PATCH v2 bpf-next 0/6] Dynptr " Alexei Starovoitov
  6 siblings, 0 replies; 57+ messages in thread
From: Joanne Koong @ 2022-12-07 20:55 UTC (permalink / raw)
  To: bpf; +Cc: andrii, kernel-team, ast, daniel, martin.lau, song, Joanne Koong

Add a new helper function, bpf_dynptr_iterator:

  long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn,
			   void *callback_ctx, u64 flags)

where callback_fn is defined as:

  long (*callback_fn)(struct bpf_dynptr *ptr, void *ctx)

and callback_fn returns the number of bytes to advance the dynptr
by, or a negative error code on failure. Iteration stops when
callback_fn returns 0, returns an error, or tries to advance by
more bytes than are available.
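
The contract above can be sketched in plain user-space C. This is only
an illustrative model, not the kernel implementation: struct view,
view_cb, view_iterate, count_one and take_three are hypothetical names
invented here, and a struct view stands in for a dynptr as a simple
offset/size window. The loop works on a copy, so the caller's view is
left untouched.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for a dynptr: a window of `size` bytes at `offset`. */
struct view {
	const unsigned char *data;
	uint32_t offset;
	uint32_t size;
};

/* Callback returns the number of bytes to advance by, 0 to stop, or < 0 on error. */
typedef long (*view_cb)(struct view *v, void *ctx);

/* Model of the iteration loop: iterate over a copy so *orig is never modified. */
static long view_iterate(const struct view *orig, view_cb cb, void *ctx)
{
	struct view copy = *orig;

	while (copy.size > 0) {
		long n = cb(&copy, ctx);

		if (n <= 0)
			return n;	/* 0 stops the loop, negative is an error */
		if ((unsigned long)n > copy.size)
			return -ERANGE;	/* tried to advance past the end */

		copy.offset += (uint32_t)n;
		copy.size -= (uint32_t)n;
	}

	return 0;
}

/* Example callback: count invocations, consuming one byte per call. */
static long count_one(struct view *v, void *ctx)
{
	(void)v;
	*(int *)ctx += 1;
	return 1;
}

/* Example callback: always asks to advance by three bytes. */
static long take_three(struct view *v, void *ctx)
{
	(void)v;
	(void)ctx;
	return 3;
}
```

On a 4-byte view, count_one runs four times and view_iterate returns 0.
take_three succeeds once and then yields -ERANGE, because only one byte
remains. In both cases the caller's view keeps its original offset and
size.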

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 include/uapi/linux/bpf.h       | 25 +++++++++
 kernel/bpf/helpers.c           | 42 +++++++++++++++
 kernel/bpf/verifier.c          | 93 ++++++++++++++++++++++++++++------
 tools/include/uapi/linux/bpf.h | 25 +++++++++
 4 files changed, 170 insertions(+), 15 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index f9387c5aba2b..11c7e1e52f4d 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5619,6 +5619,30 @@ union bpf_attr {
  *	Return
  *		0 on success, -EINVAL if the dynptr to clone is invalid, -ERANGE
  *		if attempting to clone the dynptr at an out of range offset.
+ *
+ * long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn, void *callback_ctx, u64 flags)
+ *	Description
+ *		Iterate through the dynptr data, calling **callback_fn** on each
+ *		iteration with **callback_ctx** as the context parameter.
+ *		The **callback_fn** should be a static function and
+ *		the **callback_ctx** should be a pointer to the stack.
+ *		Currently **flags** is unused and must be 0.
+ *
+ *		int (\*callback_fn)(struct bpf_dynptr \*ptr, void \*ctx);
+ *
+ *		where **callback_fn** returns the number of bytes to advance
+ *		the callback dynptr by or an error. The iteration will stop if
+ *		**callback_fn** returns 0 or an error or tries to advance by more
+ *		bytes than the remaining size.
+ *
+ *		Please note that **ptr** will remain untouched (e.g. offset and
+ *		size will not be modified) though the data pointed to by **ptr**
+ *		may have been modified. Please also note that you cannot release
+ *		a dynptr within the callback function.
+ *	Return
+ *		0 on success, -EINVAL if the dynptr is invalid or **flags** is not 0,
+ *		-ERANGE if attempting to iterate more bytes than available, or other
+ *		error code if **callback_fn** returns an error.
  */
 #define ___BPF_FUNC_MAPPER(FN, ctx...)			\
 	FN(unspec, 0, ##ctx)				\
@@ -5842,6 +5866,7 @@ union bpf_attr {
 	FN(dynptr_get_size, 218, ##ctx)		\
 	FN(dynptr_get_offset, 219, ##ctx)		\
 	FN(dynptr_clone, 220, ##ctx)			\
+	FN(dynptr_iterator, 221, ##ctx)			\
 	/* */
 
 /* backwards-compatibility macros for users of __BPF_FUNC_MAPPER that don't
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 0c2cfb4ed33c..0e612007601e 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1792,6 +1792,46 @@ static const struct bpf_func_proto bpf_dynptr_clone_proto = {
 	.arg3_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_4(bpf_dynptr_iterator, struct bpf_dynptr_kern *, ptr, void *, callback_fn,
+	   void *, callback_ctx, u64, flags)
+{
+	bpf_callback_t callback = (bpf_callback_t)callback_fn;
+	struct bpf_dynptr_kern ptr_copy;
+	int nr_bytes, err;
+
+	if (flags)
+		return -EINVAL;
+
+	err = ____bpf_dynptr_clone(ptr, &ptr_copy, 0);
+	if (err)
+		return err;
+
+	while (ptr_copy.size > 0) {
+		nr_bytes = callback((uintptr_t)&ptr_copy, (uintptr_t)callback_ctx, 0, 0, 0);
+		if (nr_bytes <= 0)
+			return nr_bytes;
+
+		if (nr_bytes > U32_MAX)
+			return -ERANGE;
+
+		err = bpf_dynptr_adjust(&ptr_copy, nr_bytes, nr_bytes);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static const struct bpf_func_proto bpf_dynptr_iterator_proto = {
+	.func		= bpf_dynptr_iterator,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_DYNPTR,
+	.arg2_type	= ARG_PTR_TO_FUNC,
+	.arg3_type	= ARG_PTR_TO_STACK_OR_NULL,
+	.arg4_type	= ARG_ANYTHING,
+};
+
 const struct bpf_func_proto bpf_get_current_task_proto __weak;
 const struct bpf_func_proto bpf_get_current_task_btf_proto __weak;
 const struct bpf_func_proto bpf_probe_read_user_proto __weak;
@@ -1910,6 +1950,8 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 		return &bpf_dynptr_get_offset_proto;
 	case BPF_FUNC_dynptr_clone:
 		return &bpf_dynptr_clone_proto;
+	case BPF_FUNC_dynptr_iterator:
+		return &bpf_dynptr_iterator_proto;
 #ifdef CONFIG_CGROUPS
 	case BPF_FUNC_cgrp_storage_get:
 		return &bpf_cgrp_storage_get_proto;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3f617f7040b7..8abdc392a48e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -524,7 +524,8 @@ static bool is_callback_calling_function(enum bpf_func_id func_id)
 	       func_id == BPF_FUNC_timer_set_callback ||
 	       func_id == BPF_FUNC_find_vma ||
 	       func_id == BPF_FUNC_loop ||
-	       func_id == BPF_FUNC_user_ringbuf_drain;
+	       func_id == BPF_FUNC_user_ringbuf_drain ||
+	       func_id == BPF_FUNC_dynptr_iterator;
 }
 
 static bool is_storage_get_function(enum bpf_func_id func_id)
@@ -703,6 +704,19 @@ static void mark_verifier_state_scratched(struct bpf_verifier_env *env)
 	env->scratched_stack_slots = ~0ULL;
 }
 
+static enum bpf_dynptr_type stack_slot_get_dynptr_info(struct bpf_verifier_env *env,
+						       struct bpf_reg_state *reg,
+						       int *ref_obj_id)
+{
+	struct bpf_func_state *state = func(env, reg);
+	int spi = get_spi(reg->off);
+
+	if (ref_obj_id)
+		*ref_obj_id = state->stack[spi].spilled_ptr.id;
+
+	return state->stack[spi].spilled_ptr.dynptr.type;
+}
+
 static enum bpf_dynptr_type arg_to_dynptr_type(enum bpf_arg_type arg_type)
 {
 	switch (arg_type & DYNPTR_TYPE_FLAG_MASK) {
@@ -719,6 +733,25 @@ static enum bpf_dynptr_type arg_to_dynptr_type(enum bpf_arg_type arg_type)
 	}
 }
 
+static enum bpf_type_flag dynptr_flag_type(struct bpf_verifier_env *env,
+					   struct bpf_reg_state *state)
+{
+	enum bpf_dynptr_type type = stack_slot_get_dynptr_info(env, state, NULL);
+
+	switch (type) {
+	case BPF_DYNPTR_TYPE_LOCAL:
+		return DYNPTR_TYPE_LOCAL;
+	case BPF_DYNPTR_TYPE_RINGBUF:
+		return DYNPTR_TYPE_RINGBUF;
+	case BPF_DYNPTR_TYPE_SKB:
+		return DYNPTR_TYPE_SKB;
+	case BPF_DYNPTR_TYPE_XDP:
+		return DYNPTR_TYPE_XDP;
+	default:
+		return 0;
+	}
+}
+
 static bool arg_type_is_dynptr(enum bpf_arg_type type)
 {
 	return base_type(type) == ARG_PTR_TO_DYNPTR;
@@ -744,19 +777,6 @@ static struct bpf_reg_state *get_dynptr_arg_reg(const struct bpf_func_proto *fn,
 	return NULL;
 }
 
-static enum bpf_dynptr_type stack_slot_get_dynptr_info(struct bpf_verifier_env *env,
-						       struct bpf_reg_state *reg,
-						       int *ref_obj_id)
-{
-	struct bpf_func_state *state = func(env, reg);
-	int spi = get_spi(reg->off);
-
-	if (ref_obj_id)
-		*ref_obj_id = state->stack[spi].spilled_ptr.id;
-
-	return state->stack[spi].spilled_ptr.dynptr.type;
-}
-
 static int mark_stack_slots_dynptr(struct bpf_verifier_env *env,
 				   const struct bpf_func_proto *fn,
 				   struct bpf_reg_state *reg,
@@ -6053,6 +6073,9 @@ static const struct bpf_reg_types dynptr_types = {
 	.types = {
 		PTR_TO_STACK,
 		PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL,
+		PTR_TO_DYNPTR | DYNPTR_TYPE_RINGBUF,
+		PTR_TO_DYNPTR | DYNPTR_TYPE_SKB,
+		PTR_TO_DYNPTR | DYNPTR_TYPE_XDP,
 	}
 };
 
@@ -6440,8 +6463,13 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 		 * assumption is that if it is, that a helper function
 		 * initialized the dynptr on behalf of the BPF program.
 		 */
-		if (base_type(reg->type) == PTR_TO_DYNPTR)
+		if (base_type(reg->type) == PTR_TO_DYNPTR) {
+			if (arg_type & MEM_UNINIT) {
+				verbose(env, "PTR_TO_DYNPTR is already an initialized dynptr\n");
+				return -EINVAL;
+			}
 			break;
+		}
 		if (arg_type & MEM_UNINIT) {
 			if (!is_dynptr_reg_valid_uninit(env, reg)) {
 				verbose(env, "Dynptr has to be an uninitialized dynptr\n");
@@ -7342,6 +7370,37 @@ static int set_user_ringbuf_callback_state(struct bpf_verifier_env *env,
 	return 0;
 }
 
+static int set_dynptr_iterator_callback_state(struct bpf_verifier_env *env,
+					      struct bpf_func_state *caller,
+					      struct bpf_func_state *callee,
+					      int insn_idx)
+{
+	/* bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn,
+	 * void *callback_ctx, u64 flags);
+	 *
+	 * callback_fn(struct bpf_dynptr *ptr, void *callback_ctx);
+	 */
+
+	enum bpf_type_flag dynptr_flag =
+		dynptr_flag_type(env, &caller->regs[BPF_REG_1]);
+
+	if (dynptr_flag == 0)
+		return -EFAULT;
+
+	callee->regs[BPF_REG_1].type = PTR_TO_DYNPTR | dynptr_flag;
+	__mark_reg_known_zero(&callee->regs[BPF_REG_1]);
+	callee->regs[BPF_REG_2] = caller->regs[BPF_REG_3];
+	callee->callback_ret_range = tnum_range(0, U32_MAX);
+
+	/* unused */
+	__mark_reg_not_init(env, &callee->regs[BPF_REG_3]);
+	__mark_reg_not_init(env, &callee->regs[BPF_REG_4]);
+	__mark_reg_not_init(env, &callee->regs[BPF_REG_5]);
+
+	callee->in_callback_fn = true;
+	return 0;
+}
+
 static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
 {
 	struct bpf_verifier_state *state = env->cur_state;
@@ -7857,6 +7916,10 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 		err = __check_func_call(env, insn, insn_idx_p, meta.subprogno,
 					set_user_ringbuf_callback_state);
 		break;
+	case BPF_FUNC_dynptr_iterator:
+		err = __check_func_call(env, insn, insn_idx_p, meta.subprogno,
+					set_dynptr_iterator_callback_state);
+		break;
 	}
 
 	if (err)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index f9387c5aba2b..11c7e1e52f4d 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5619,6 +5619,30 @@ union bpf_attr {
  *	Return
  *		0 on success, -EINVAL if the dynptr to clone is invalid, -ERANGE
  *		if attempting to clone the dynptr at an out of range offset.
+ *
+ * long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn, void *callback_ctx, u64 flags)
+ *	Description
+ *		Iterate through the dynptr data, calling **callback_fn** on each
+ *		iteration with **callback_ctx** as the context parameter.
+ *		The **callback_fn** should be a static function and
+ *		the **callback_ctx** should be a pointer to the stack.
+ *		Currently **flags** is unused and must be 0.
+ *
+ *		int (\*callback_fn)(struct bpf_dynptr \*ptr, void \*ctx);
+ *
+ *		where **callback_fn** returns the number of bytes to advance
+ *		the callback dynptr by or an error. The iteration will stop if
+ *		**callback_fn** returns 0 or an error or tries to advance by more
+ *		bytes than the remaining size.
+ *
+ *		Please note that **ptr** will remain untouched (e.g. offset and
+ *		size will not be modified) though the data pointed to by **ptr**
+ *		may have been modified. Please also note that you cannot release
+ *		a dynptr within the callback function.
+ *	Return
+ *		0 on success, -EINVAL if the dynptr is invalid or **flags** is not 0,
+ *		-ERANGE if attempting to iterate more bytes than available, or other
+ *		error code if **callback_fn** returns an error.
  */
 #define ___BPF_FUNC_MAPPER(FN, ctx...)			\
 	FN(unspec, 0, ##ctx)				\
@@ -5842,6 +5866,7 @@ union bpf_attr {
 	FN(dynptr_get_size, 218, ##ctx)		\
 	FN(dynptr_get_offset, 219, ##ctx)		\
 	FN(dynptr_clone, 220, ##ctx)			\
+	FN(dynptr_iterator, 221, ##ctx)			\
 	/* */
 
 /* backwards-compatibility macros for users of __BPF_FUNC_MAPPER that don't
-- 
2.30.2



* [PATCH v2 bpf-next 6/6] selftests/bpf: Tests for dynptr convenience helpers
  2022-12-07 20:55 [PATCH v2 bpf-next 0/6] Dynptr convenience helpers Joanne Koong
                   ` (4 preceding siblings ...)
  2022-12-07 20:55 ` [PATCH v2 bpf-next 5/6] bpf: Add bpf_dynptr_iterator Joanne Koong
@ 2022-12-07 20:55 ` Joanne Koong
  2022-12-08  1:54 ` [PATCH v2 bpf-next 0/6] Dynptr " Alexei Starovoitov
  6 siblings, 0 replies; 57+ messages in thread
From: Joanne Koong @ 2022-12-07 20:55 UTC (permalink / raw)
  To: bpf; +Cc: andrii, kernel-team, ast, daniel, martin.lau, song, Joanne Koong

Test dynptr convenience helpers in the following way:

1) bpf_dynptr_trim and bpf_dynptr_advance
    * "test_advance_trim" tests that dynptr offset and size get adjusted
      correctly.
    * "test_advance_trim_err" tests that advances beyond dynptr size and
      trims larger than dynptr size are rejected.
    * "test_zero_size_dynptr" tests that a zero-size dynptr (after
      advancing or trimming) can only read and write 0 bytes.

2) bpf_dynptr_is_null
    * "dynptr_is_null_invalid" tests that only initialized dynptrs can
      be passed in.
    * "test_dynptr_is_null" tests that null dynptrs return true and
      non-null dynptrs return false.

3) bpf_dynptr_is_rdonly
    * "dynptr_is_rdonly_invalid" tests that only initialized dynptrs can
      be passed in.
    * "test_dynptr_is_rdonly" tests that rdonly dynptrs return true and
      non-rdonly or invalid dynptrs return false.

4) bpf_dynptr_get_size
    * "dynptr_get_size_invalid" tests that only initialized dynptrs can
      be passed in.
    * Additional functionality is tested as a by-product in
      "test_advance_trim".

5) bpf_dynptr_get_offset
    * "dynptr_get_offset_invalid" tests that only initialized dynptrs can
      be passed in.
    * Additional functionality is tested as a by-product in
      "test_advance_trim".

6) bpf_dynptr_clone
    * "clone_invalidate{1..6}" tests that invalidating a dynptr
      invalidates all instances and invalidating a dynptr's data slices
      invalidates all data slices for all instances.
    * "clone_skb_packet_data" tests that data slices of skb dynptr instances
      are invalidated when packet data changes.
    * "clone_xdp_packet_data" tests that data slices of xdp dynptr instances
      are invalidated when packet data changes.
    * "clone_invalid1" tests that only initialized dynptrs can be
      cloned.
    * "clone_invalid2" tests that only uninitialized dynptrs can be
      a clone.
    * "test_dynptr_clone" tests that the views from the same dynptr instances
      are independent (advancing or trimming a dynptr doesn't affect other
      instances), and that a clone will return a dynptr with the same
      type, offset, size, and rd-only property.
    * "test_dynptr_clone_offset" tests cloning at invalid offsets and
      at valid offsets.

7) bpf_dynptr_iterator
    * "iterator_invalid1" tests that any dynptr requiring a release
      that gets acquired in an iterator callback must also be released
      within the callback.
    * "iterator_invalid2" tests that bpf_dynptr_iterator can't be called
      on an uninitialized dynptr.
    * "iterator_invalid3" tests that the initialized dynptr can't
      be initialized again in the iterator callback function.
    * "iterator_invalid4" tests that the dynptr in the iterator callback
      function can't be released.
    * "iterator_invalid5" tests that the dynptr passed as a callback ctx
      can't be released within the callback.
    * "iterator_invalid6" tests that the dynptr can't be modified
      within the iterator callback.
    * "iterator_invalid7" tests that the callback function can't return
      a value larger than U32_MAX.
    * "test_dynptr_iterator" tests basic functionality of the iterator.
    * "iterator_parse_strings" tests parsing strings as values.
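
The advance/trim and clone properties listed in 1) and 6) can be
modelled with a small user-space sketch. It is not the kernel code:
struct view and the view_* functions are hypothetical names, the
"valid" flag is a stand-in for the verifier's invalidation tracking,
and returning -ERANGE from advance and trim is an assumption borrowed
from the documented bpf_dynptr_clone error code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a dynptr; not the kernel's dynptr struct. */
struct view {
	uint32_t offset;	/* start of the window in the underlying data */
	uint32_t size;		/* bytes visible from offset */
	int id;			/* shared by a parent and all of its clones */
	int valid;		/* cleared once the underlying data is released */
};

/* Move the start of the window forward by len bytes. */
static int view_advance(struct view *v, uint32_t len)
{
	if (len > v->size)
		return -ERANGE;	/* assumed error code, see lead-in */
	v->offset += len;
	v->size -= len;
	return 0;
}

/* Drop len bytes from the end of the window. */
static int view_trim(struct view *v, uint32_t len)
{
	if (len > v->size)
		return -ERANGE;	/* assumed error code, see lead-in */
	v->size -= len;
	return 0;
}

/* Copy src into clone, then advance the clone by offset bytes.
 * In this model, clone is left unchanged on failure.
 */
static int view_clone(const struct view *src, struct view *clone, uint32_t offset)
{
	if (!src->valid)
		return -EINVAL;
	if (offset > src->size)
		return -ERANGE;
	*clone = *src;		/* same id, offset and size as the parent */
	clone->offset += offset;
	clone->size -= offset;
	return 0;
}

/* Releasing one view invalidates every view that shares its id. */
static void view_release(struct view *views, size_t n, int id)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (views[i].valid && views[i].id == id)
			views[i].valid = 0;
}
```

Advancing or trimming a clone changes only that clone's offset and
size, while view_release() on any one of them marks the parent and all
clones with the same id as invalid. This mirrors what the
"test_dynptr_clone" and "clone_invalidate" tests check.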

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 .../testing/selftests/bpf/prog_tests/dynptr.c |  31 +
 .../testing/selftests/bpf/progs/dynptr_fail.c | 439 ++++++++++++++
 .../selftests/bpf/progs/dynptr_success.c      | 534 +++++++++++++++++-
 3 files changed, 1002 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/dynptr.c b/tools/testing/selftests/bpf/prog_tests/dynptr.c
index 3c55721f8f6d..8052aded2261 100644
--- a/tools/testing/selftests/bpf/prog_tests/dynptr.c
+++ b/tools/testing/selftests/bpf/prog_tests/dynptr.c
@@ -57,12 +57,43 @@ static struct {
 	{"skb_invalid_ctx", "unknown func bpf_dynptr_from_skb"},
 	{"xdp_invalid_ctx", "unknown func bpf_dynptr_from_xdp"},
 	{"skb_invalid_write", "cannot write into rdonly_mem"},
+	{"dynptr_is_null_invalid", "Expected an initialized dynptr as arg #1"},
+	{"dynptr_is_rdonly_invalid", "Expected an initialized dynptr as arg #1"},
+	{"dynptr_get_size_invalid", "Expected an initialized dynptr as arg #1"},
+	{"dynptr_get_offset_invalid", "Expected an initialized dynptr as arg #1"},
+	{"clone_invalid1", "Expected an initialized dynptr as arg #1"},
+	{"clone_invalid2", "Dynptr has to be an uninitialized dynptr"},
+	{"clone_invalidate1", "Expected an initialized dynptr"},
+	{"clone_invalidate2", "Expected an initialized dynptr"},
+	{"clone_invalidate3", "Expected an initialized dynptr"},
+	{"clone_invalidate4", "invalid mem access 'scalar'"},
+	{"clone_invalidate5", "invalid mem access 'scalar'"},
+	{"clone_invalidate6", "invalid mem access 'scalar'"},
+	{"clone_skb_packet_data", "invalid mem access 'scalar'"},
+	{"clone_xdp_packet_data", "invalid mem access 'scalar'"},
+	{"iterator_invalid1", "Unreleased reference id=1"},
+	{"iterator_invalid2", "Expected an initialized dynptr as arg #1"},
+	{"iterator_invalid3", "PTR_TO_DYNPTR is already an initialized dynptr"},
+	{"iterator_invalid4", "arg 1 is an unacquired reference"},
+	{"iterator_invalid5", "Unreleased reference"},
+	{"iterator_invalid6", "invalid mem access 'dynptr_ptr'"},
+	{"iterator_invalid7",
+		"At callback return the register R0 has value (0x100000000; 0x0)"},
 
 	/* these tests should be run and should succeed */
 	{"test_read_write", NULL, SETUP_SYSCALL_SLEEP},
 	{"test_data_slice", NULL, SETUP_SYSCALL_SLEEP},
 	{"test_ringbuf", NULL, SETUP_SYSCALL_SLEEP},
 	{"test_skb_readonly", NULL, SETUP_SKB_PROG},
+	{"test_advance_trim", NULL, SETUP_SYSCALL_SLEEP},
+	{"test_advance_trim_err", NULL, SETUP_SYSCALL_SLEEP},
+	{"test_zero_size_dynptr", NULL, SETUP_SYSCALL_SLEEP},
+	{"test_dynptr_is_null", NULL, SETUP_SYSCALL_SLEEP},
+	{"test_dynptr_is_rdonly", NULL, SETUP_SKB_PROG},
+	{"test_dynptr_clone", NULL, SETUP_SKB_PROG},
+	{"test_dynptr_clone_offset", NULL, SETUP_SKB_PROG},
+	{"test_dynptr_iterator", NULL, SETUP_SKB_PROG},
+	{"iterator_parse_strings", NULL, SETUP_SYSCALL_SLEEP},
 };
 
 static void verify_fail(const char *prog_name, const char *expected_err_msg)
diff --git a/tools/testing/selftests/bpf/progs/dynptr_fail.c b/tools/testing/selftests/bpf/progs/dynptr_fail.c
index fe9b668b4999..2e91642ded16 100644
--- a/tools/testing/selftests/bpf/progs/dynptr_fail.c
+++ b/tools/testing/selftests/bpf/progs/dynptr_fail.c
@@ -733,3 +733,442 @@ int skb_invalid_write(struct __sk_buff *skb)
 
 	return 0;
 }
+
+/* dynptr_is_null can only be called on initialized dynptrs */
+SEC("?raw_tp")
+int dynptr_is_null_invalid(void *ctx)
+{
+	struct bpf_dynptr ptr;
+
+	/* this should fail */
+	bpf_dynptr_is_null(&ptr);
+
+	return 0;
+}
+
+/* dynptr_is_rdonly can only be called on initialized dynptrs */
+SEC("?raw_tp")
+int dynptr_is_rdonly_invalid(void *ctx)
+{
+	struct bpf_dynptr ptr;
+
+	/* this should fail */
+	bpf_dynptr_is_rdonly(&ptr);
+
+	return 0;
+}
+
+/* dynptr_get_size can only be called on initialized dynptrs */
+SEC("?raw_tp")
+int dynptr_get_size_invalid(void *ctx)
+{
+	struct bpf_dynptr ptr;
+
+	/* this should fail */
+	bpf_dynptr_get_size(&ptr);
+
+	return 0;
+}
+
+/* dynptr_get_offset can only be called on initialized dynptrs */
+SEC("?raw_tp")
+int dynptr_get_offset_invalid(void *ctx)
+{
+	struct bpf_dynptr ptr;
+
+	/* this should fail */
+	bpf_dynptr_get_offset(&ptr);
+
+	return 0;
+}
+
+/* Only initialized dynptrs can be cloned */
+SEC("?raw_tp")
+int clone_invalid1(void *ctx)
+{
+	struct bpf_dynptr ptr1;
+	struct bpf_dynptr ptr2;
+
+	/* this should fail */
+	bpf_dynptr_clone(&ptr1, &ptr2, 0);
+
+	return 0;
+}
+
+/* Only uninitialized dynptrs can be clones */
+SEC("?xdp")
+int clone_invalid2(struct xdp_md *xdp)
+{
+	struct bpf_dynptr ptr1;
+	struct bpf_dynptr clone;
+
+	bpf_dynptr_from_xdp(xdp, 0, &ptr1);
+
+	bpf_ringbuf_reserve_dynptr(&ringbuf, 64, 0, &clone);
+
+	/* this should fail */
+	bpf_dynptr_clone(&ptr1, &clone, 0);
+
+	bpf_ringbuf_submit_dynptr(&clone, 0);
+
+	return 0;
+}
+
+/* Invalidating a dynptr should invalidate its clones */
+SEC("?raw_tp")
+int clone_invalidate1(void *ctx)
+{
+	struct bpf_dynptr clone;
+	struct bpf_dynptr ptr;
+	char read_data[64];
+
+	bpf_ringbuf_reserve_dynptr(&ringbuf, val, 0, &ptr);
+
+	bpf_dynptr_clone(&ptr, &clone, 0);
+
+	bpf_ringbuf_submit_dynptr(&ptr, 0);
+
+	/* this should fail */
+	bpf_dynptr_read(read_data, sizeof(read_data), &clone, 0, 0);
+
+	return 0;
+}
+
+/* Invalidating a dynptr should invalidate its parent */
+SEC("?raw_tp")
+int clone_invalidate2(void *ctx)
+{
+	struct bpf_dynptr ptr;
+	struct bpf_dynptr clone;
+	char read_data[64];
+
+	bpf_ringbuf_reserve_dynptr(&ringbuf, val, 0, &ptr);
+
+	bpf_dynptr_clone(&ptr, &clone, 0);
+
+	bpf_ringbuf_submit_dynptr(&clone, 0);
+
+	/* this should fail */
+	bpf_dynptr_read(read_data, sizeof(read_data), &ptr, 0, 0);
+
+	return 0;
+}
+
+/* Invalidating a dynptr should invalidate its siblings */
+SEC("?raw_tp")
+int clone_invalidate3(void *ctx)
+{
+	struct bpf_dynptr ptr;
+	struct bpf_dynptr clone1;
+	struct bpf_dynptr clone2;
+	char read_data[64];
+
+	bpf_ringbuf_reserve_dynptr(&ringbuf, val, 0, &ptr);
+
+	bpf_dynptr_clone(&ptr, &clone1, 0);
+
+	bpf_dynptr_clone(&ptr, &clone2, 0);
+
+	bpf_ringbuf_submit_dynptr(&clone2, 0);
+
+	/* this should fail */
+	bpf_dynptr_read(read_data, sizeof(read_data), &clone1, 0, 0);
+
+	return 0;
+}
+
+/* Invalidating a dynptr should invalidate any data slices
+ * of its clones
+ */
+SEC("?raw_tp")
+int clone_invalidate4(void *ctx)
+{
+	struct bpf_dynptr ptr;
+	struct bpf_dynptr clone;
+	int *data;
+
+	bpf_ringbuf_reserve_dynptr(&ringbuf, val, 0, &ptr);
+
+	bpf_dynptr_clone(&ptr, &clone, 0);
+	data = bpf_dynptr_data(&clone, 0, sizeof(val));
+	if (!data)
+		return 0;
+
+	bpf_ringbuf_submit_dynptr(&ptr, 0);
+
+	/* this should fail */
+	*data = 123;
+
+	return 0;
+}
+
+/* Invalidating a dynptr should invalidate any data slices
+ * of its parent
+ */
+SEC("?raw_tp")
+int clone_invalidate5(void *ctx)
+{
+	struct bpf_dynptr ptr;
+	struct bpf_dynptr clone;
+	int *data;
+
+	bpf_ringbuf_reserve_dynptr(&ringbuf, val, 0, &ptr);
+	data = bpf_dynptr_data(&ptr, 0, sizeof(val));
+	if (!data)
+		return 0;
+
+	bpf_dynptr_clone(&ptr, &clone, 0);
+
+	bpf_ringbuf_submit_dynptr(&clone, 0);
+
+	/* this should fail */
+	*data = 123;
+
+	return 0;
+}
+
+/* Invalidating a dynptr should invalidate any data slices
+ * of its sibling
+ */
+SEC("?raw_tp")
+int clone_invalidate6(void *ctx)
+{
+	struct bpf_dynptr ptr;
+	struct bpf_dynptr clone1;
+	struct bpf_dynptr clone2;
+	int *data;
+
+	bpf_ringbuf_reserve_dynptr(&ringbuf, val, 0, &ptr);
+
+	bpf_dynptr_clone(&ptr, &clone1, 0);
+
+	bpf_dynptr_clone(&ptr, &clone2, 0);
+
+	data = bpf_dynptr_data(&clone1, 0, sizeof(val));
+	if (!data)
+		return 0;
+
+	bpf_ringbuf_submit_dynptr(&clone2, 0);
+
+	/* this should fail */
+	*data = 123;
+
+	return 0;
+}
+
+/* A skb clone's data slices should be invalid anytime packet data changes */
+SEC("?tc")
+int clone_skb_packet_data(struct __sk_buff *skb)
+{
+	struct bpf_dynptr ptr;
+	struct bpf_dynptr clone;
+	__u32 *data;
+
+	bpf_dynptr_from_skb(skb, 0, &ptr);
+
+	bpf_dynptr_clone(&ptr, &clone, 0);
+	data = bpf_dynptr_data(&clone, 0, sizeof(*data));
+	if (!data)
+		return XDP_DROP;
+
+	if (bpf_skb_pull_data(skb, skb->len))
+		return SK_DROP;
+
+	/* this should fail */
+	*data = 123;
+
+	return 0;
+}
+
+/* An xdp clone's data slices should be invalid anytime packet data changes */
+SEC("?xdp")
+int clone_xdp_packet_data(struct xdp_md *xdp)
+{
+	struct bpf_dynptr ptr;
+	struct bpf_dynptr clone;
+	struct ethhdr *hdr;
+	__u32 *data;
+
+	bpf_dynptr_from_xdp(xdp, 0, &ptr);
+
+	bpf_dynptr_clone(&ptr, &clone, 0);
+	data = bpf_dynptr_data(&clone, 0, sizeof(*data));
+	if (!data)
+		return XDP_DROP;
+
+	if (bpf_xdp_adjust_head(xdp, 0 - (int)sizeof(*hdr)))
+		return XDP_DROP;
+
+	/* this should fail */
+	*data = 123;
+
+	return 0;
+}
+
+static int iterator_callback1(struct bpf_dynptr *ptr, void *ctx)
+{
+	struct bpf_dynptr local_ptr;
+
+	bpf_ringbuf_reserve_dynptr(&ringbuf, val, 0, &local_ptr);
+
+	/* missing a call to bpf_ringbuf_discard/submit_dynptr */
+
+	return 0;
+}
+
+/* If a dynptr requiring a release is initialized within the iterator callback
+ * function, then it must also be released within that function
+ */
+SEC("?xdp")
+int iterator_invalid1(struct xdp_md *xdp)
+{
+	struct bpf_dynptr ptr;
+
+	bpf_dynptr_from_xdp(xdp, 0, &ptr);
+
+	bpf_dynptr_iterator(&ptr, iterator_callback1, NULL, 0);
+
+	return 0;
+}
+
+/* bpf_dynptr_iterator can't be called on an uninitialized dynptr */
+SEC("?xdp")
+int iterator_invalid2(struct xdp_md *xdp)
+{
+	struct bpf_dynptr ptr;
+
+	/* this should fail */
+	bpf_dynptr_iterator(&ptr, iterator_callback1, NULL, 0);
+
+	return 0;
+}
+
+static int iterator_callback3(struct bpf_dynptr *ptr, void *ctx)
+{
+	/* this should fail */
+	bpf_ringbuf_reserve_dynptr(&ringbuf, val, 0, ptr);
+
+	bpf_ringbuf_submit_dynptr(ptr, 0);
+
+	return 1;
+}
+
+/* The dynptr passed to the callback can't be re-initialized as a separate dynptr
+ * within the callback function
+ */
+SEC("?raw_tp")
+int iterator_invalid3(void *ctx)
+{
+	struct bpf_dynptr ptr;
+
+	bpf_ringbuf_reserve_dynptr(&ringbuf, val, 0, &ptr);
+
+	bpf_dynptr_iterator(&ptr, iterator_callback3,  NULL, 0);
+
+	bpf_ringbuf_submit_dynptr(&ptr, 0);
+
+	return 0;
+}
+
+static int iterator_callback4(struct bpf_dynptr *ptr, void *ctx)
+{
+	char write_data[64] = "hello there, world!!";
+
+	bpf_dynptr_write(ptr, 0, write_data, sizeof(write_data), 0);
+
+	/* this should fail */
+	bpf_ringbuf_submit_dynptr(ptr, 0);
+
+	return 0;
+}
+
+/* The dynptr can't be released within the iterator callback */
+SEC("?raw_tp")
+int iterator_invalid4(void *ctx)
+{
+	struct bpf_dynptr ptr;
+
+	bpf_ringbuf_reserve_dynptr(&ringbuf, val, 0, &ptr);
+
+	bpf_dynptr_iterator(&ptr, iterator_callback4, NULL, 0);
+
+	return 0;
+}
+
+static int iterator_callback5(struct bpf_dynptr *ptr, void *ctx)
+{
+	bpf_ringbuf_submit_dynptr(ctx, 0);
+
+	return 0;
+}
+
+/* If a dynptr is passed in as the callback ctx, the dynptr
+ * can't be released.
+ *
+ * Currently, the verifier doesn't strictly check for this since
+ * it only runs the callback once when verifying. For now, we
+ * use the fact that the verifier doesn't mark the reference in
+ * the parent func state as released if it's released in the
+ * callback. This is what we currently lean on in bpf_loop() as
+ * well. This is a bit of a hack for now, and will need to be
+ * addressed more thoroughly in the future.
+ */
+SEC("?raw_tp")
+int iterator_invalid5(void *ctx)
+{
+	struct bpf_dynptr ptr;
+
+	bpf_ringbuf_reserve_dynptr(&ringbuf, val, 0, &ptr);
+
+	bpf_dynptr_iterator(&ptr, iterator_callback5, &ptr, 0);
+
+	return 0;
+}
+
+static int iterator_callback6(struct bpf_dynptr *ptr, void *ctx)
+{
+	char write_data[64] = "hello there, world!!";
+
+	bpf_dynptr_write(ptr, 0, write_data, sizeof(write_data), 0);
+
+	/* this should fail */
+	*(int *)ptr = 12;
+
+	return 1;
+}
+
+/* The dynptr struct can't be modified in the iterator callback */
+SEC("?raw_tp")
+int iterator_invalid6(void *ctx)
+{
+	struct bpf_dynptr ptr;
+
+	bpf_ringbuf_reserve_dynptr(&ringbuf, val, 0, &ptr);
+
+	bpf_dynptr_iterator(&ptr, iterator_callback6, NULL, 0);
+
+	bpf_ringbuf_submit_dynptr(&ptr, 0);
+
+	return 0;
+}
+
+static __u64 iterator_callback7(struct bpf_dynptr *ptr, void *ctx)
+{
+	/* callback should return an int */
+	return 1UL << 32;
+}
+
+/* The callback should return an int */
+SEC("?raw_tp")
+int iterator_invalid7(void *ctx)
+{
+	struct bpf_dynptr ptr;
+
+	bpf_ringbuf_reserve_dynptr(&ringbuf, val, 0, &ptr);
+
+	bpf_dynptr_iterator(&ptr, iterator_callback7, NULL, 0);
+
+	bpf_ringbuf_submit_dynptr(&ptr, 0);
+
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/dynptr_success.c b/tools/testing/selftests/bpf/progs/dynptr_success.c
index 349def97f50a..e8866e662b06 100644
--- a/tools/testing/selftests/bpf/progs/dynptr_success.c
+++ b/tools/testing/selftests/bpf/progs/dynptr_success.c
@@ -1,11 +1,11 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2022 Facebook */
 
+#include <errno.h>
 #include <string.h>
 #include <linux/bpf.h>
-#include <bpf/bpf_helpers.h>
 #include "bpf_misc.h"
-#include "errno.h"
+#include <bpf/bpf_helpers.h>
 
 char _license[] SEC("license") = "GPL";
 
@@ -29,6 +29,13 @@ struct {
 	__type(value, __u32);
 } array_map SEC(".maps");
 
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(max_entries, 1);
+	__type(key, __u32);
+	__uint(value_size, 64);
+} array_map2 SEC(".maps");
+
 SEC("tp/syscalls/sys_enter_nanosleep")
 int test_read_write(void *ctx)
 {
@@ -185,3 +192,526 @@ int test_skb_readonly(struct __sk_buff *skb)
 
 	return 0;
 }
+
+SEC("tp/syscalls/sys_enter_nanosleep")
+int test_advance_trim(void *ctx)
+{
+	struct bpf_dynptr ptr;
+	__u32 bytes = 64;
+	__u32 off = 10;
+	__u32 trim = 5;
+
+	if (bpf_get_current_pid_tgid() >> 32 != pid)
+		return 0;
+
+	err = bpf_ringbuf_reserve_dynptr(&ringbuf, bytes, 0, &ptr);
+	if (err) {
+		err = 1;
+		goto done;
+	}
+
+	if (bpf_dynptr_get_size(&ptr) != bytes) {
+		err = 2;
+		goto done;
+	}
+
+	/* Advance the dynptr by off */
+	err = bpf_dynptr_advance(&ptr, off);
+	if (err) {
+		err = 3;
+		goto done;
+	}
+
+	/* Check that the dynptr off and size were adjusted correctly */
+	if (bpf_dynptr_get_offset(&ptr) != off) {
+		err = 4;
+		goto done;
+	}
+	if (bpf_dynptr_get_size(&ptr) != bytes - off) {
+		err = 5;
+		goto done;
+	}
+
+	/* Trim the dynptr */
+	err = bpf_dynptr_trim(&ptr, trim);
+	if (err) {
+		err = 6;
+		goto done;
+	}
+
+	/* Check that the off was unaffected */
+	if (bpf_dynptr_get_offset(&ptr) != off) {
+		err = 7;
+		goto done;
+	}
+	/* Check that the size was adjusted correctly */
+	if (bpf_dynptr_get_size(&ptr) != bytes - off - trim) {
+		err = 8;
+		goto done;
+	}
+
+done:
+	bpf_ringbuf_discard_dynptr(&ptr, 0);
+	return 0;
+}
+
+SEC("tp/syscalls/sys_enter_nanosleep")
+int test_advance_trim_err(void *ctx)
+{
+	char write_data[45] = "hello there, world!!";
+	struct bpf_dynptr ptr;
+	__u32 trim_size = 10;
+	__u32 size = 64;
+	__u32 off = 10;
+
+	if (bpf_get_current_pid_tgid() >> 32 != pid)
+		return 0;
+
+	if (bpf_ringbuf_reserve_dynptr(&ringbuf, size, 0, &ptr)) {
+		err = 1;
+		goto done;
+	}
+
+	/* Check that you can't advance beyond size of dynptr data */
+	if (bpf_dynptr_advance(&ptr, size + 1) != -ERANGE) {
+		err = 2;
+		goto done;
+	}
+
+	if (bpf_dynptr_advance(&ptr, off)) {
+		err = 3;
+		goto done;
+	}
+
+	/* Check that you can't trim more than size of dynptr data */
+	if (bpf_dynptr_trim(&ptr, size - off + 1) != -ERANGE) {
+		err = 4;
+		goto done;
+	}
+
+	/* Check that you can't write more bytes than available into the dynptr
+	 * after you've trimmed it
+	 */
+	if (bpf_dynptr_trim(&ptr, trim_size)) {
+		err = 5;
+		goto done;
+	}
+
+	if (bpf_dynptr_write(&ptr, 0, &write_data, sizeof(write_data), 0) != -E2BIG) {
+		err = 6;
+		goto done;
+	}
+
+	/* Check that even after advancing / trimming, submitting/discarding
+	 * a ringbuf dynptr works
+	 */
+	bpf_ringbuf_submit_dynptr(&ptr, 0);
+	return 0;
+
+done:
+	bpf_ringbuf_discard_dynptr(&ptr, 0);
+	return 0;
+}
+
+SEC("tp/syscalls/sys_enter_nanosleep")
+int test_zero_size_dynptr(void *ctx)
+{
+	char write_data = 'x', read_data;
+	struct bpf_dynptr ptr;
+	__u32 size = 64;
+	__u32 off = 10;
+
+	if (bpf_get_current_pid_tgid() >> 32 != pid)
+		return 0;
+
+	/* check that you can reserve a dynamic size reservation */
+	if (bpf_ringbuf_reserve_dynptr(&ringbuf, size, 0, &ptr)) {
+		err = 1;
+		goto done;
+	}
+
+	/* After this, the dynptr has a size of 0 */
+	if (bpf_dynptr_advance(&ptr, size)) {
+		err = 2;
+		goto done;
+	}
+
+	/* Test that reading + writing non-zero bytes is not ok */
+	if (bpf_dynptr_read(&read_data, sizeof(read_data), &ptr, 0, 0) != -E2BIG) {
+		err = 3;
+		goto done;
+	}
+
+	if (bpf_dynptr_write(&ptr, 0, &write_data, sizeof(write_data), 0) != -E2BIG) {
+		err = 4;
+		goto done;
+	}
+
+	/* Test that reading + writing 0 bytes from a 0-size dynptr is ok */
+	if (bpf_dynptr_read(&read_data, 0, &ptr, 0, 0)) {
+		err = 5;
+		goto done;
+	}
+
+	if (bpf_dynptr_write(&ptr, 0, &write_data, 0, 0)) {
+		err = 6;
+		goto done;
+	}
+
+	err = 0;
+
+done:
+	bpf_ringbuf_discard_dynptr(&ptr, 0);
+	return 0;
+}
+
+SEC("tp/syscalls/sys_enter_nanosleep")
+int test_dynptr_is_null(void *ctx)
+{
+	struct bpf_dynptr ptr1;
+	struct bpf_dynptr ptr2;
+	__u64 size = 4;
+
+	if (bpf_get_current_pid_tgid() >> 32 != pid)
+		return 0;
+
+	/* Pass in invalid flags, get back an invalid dynptr */
+	if (bpf_ringbuf_reserve_dynptr(&ringbuf, size, 123, &ptr1) != -EINVAL) {
+		err = 1;
+		goto exit_early;
+	}
+
+	/* Test that the invalid dynptr is null */
+	if (!bpf_dynptr_is_null(&ptr1)) {
+		err = 2;
+		goto exit_early;
+	}
+
+	/* Get a valid dynptr */
+	if (bpf_ringbuf_reserve_dynptr(&ringbuf, size, 0, &ptr2)) {
+		err = 3;
+		goto exit;
+	}
+
+	/* Test that the valid dynptr is not null */
+	if (bpf_dynptr_is_null(&ptr2)) {
+		err = 4;
+		goto exit;
+	}
+
+exit:
+	bpf_ringbuf_discard_dynptr(&ptr2, 0);
+exit_early:
+	bpf_ringbuf_discard_dynptr(&ptr1, 0);
+	return 0;
+}
+
+SEC("cgroup_skb/egress")
+int test_dynptr_is_rdonly(struct __sk_buff *skb)
+{
+	struct bpf_dynptr ptr1;
+	struct bpf_dynptr ptr2;
+	struct bpf_dynptr ptr3;
+
+	/* Pass in invalid flags, get back an invalid dynptr */
+	if (bpf_dynptr_from_skb(skb, 123, &ptr1) != -EINVAL) {
+		err = 1;
+		return 0;
+	}
+
+	/* Test that an invalid dynptr is_rdonly returns false */
+	if (bpf_dynptr_is_rdonly(&ptr1)) {
+		err = 2;
+		return 0;
+	}
+
+	/* Get a read-only dynptr */
+	if (bpf_dynptr_from_skb(skb, 0, &ptr2)) {
+		err = 3;
+		return 0;
+	}
+
+	/* Test that the dynptr is read-only */
+	if (!bpf_dynptr_is_rdonly(&ptr2)) {
+		err = 4;
+		return 0;
+	}
+
+	/* Get a read-writeable dynptr */
+	if (bpf_ringbuf_reserve_dynptr(&ringbuf, 64, 0, &ptr3)) {
+		err = 5;
+		goto done;
+	}
+
+	/* Test that the dynptr is not read-only */
+	if (bpf_dynptr_is_rdonly(&ptr3)) {
+		err = 6;
+		goto done;
+	}
+
+done:
+	bpf_ringbuf_discard_dynptr(&ptr3, 0);
+	return 0;
+}
+
+SEC("cgroup_skb/egress")
+int test_dynptr_clone(struct __sk_buff *skb)
+{
+	struct bpf_dynptr ptr1;
+	struct bpf_dynptr ptr2;
+	__u32 off = 2, size;
+
+	/* Get a dynptr */
+	if (bpf_dynptr_from_skb(skb, 0, &ptr1)) {
+		err = 1;
+		return 0;
+	}
+
+	if (bpf_dynptr_advance(&ptr1, off)) {
+		err = 2;
+		return 0;
+	}
+
+	/* Clone the dynptr */
+	if (bpf_dynptr_clone(&ptr1, &ptr2, 0)) {
+		err = 3;
+		return 0;
+	}
+
+	size = bpf_dynptr_get_size(&ptr1);
+
+	/* Check that the clone has the same offset, size, and rd-only */
+	if (bpf_dynptr_get_size(&ptr2) != size) {
+		err = 4;
+		return 0;
+	}
+
+	if (bpf_dynptr_get_offset(&ptr2) != off) {
+		err = 5;
+		return 0;
+	}
+
+	if (bpf_dynptr_is_rdonly(&ptr2) != bpf_dynptr_is_rdonly(&ptr1)) {
+		err = 6;
+		return 0;
+	}
+
+	/* Advance and trim the original dynptr */
+	bpf_dynptr_advance(&ptr1, 50);
+	bpf_dynptr_trim(&ptr1, 50);
+
+	/* Check that only original dynptr was affected, and the clone wasn't */
+	if (bpf_dynptr_get_offset(&ptr2) != off) {
+		err = 7;
+		return 0;
+	}
+
+	if (bpf_dynptr_get_size(&ptr2) != size) {
+		err = 8;
+		return 0;
+	}
+
+	return 0;
+}
+
+SEC("cgroup_skb/egress")
+int test_dynptr_clone_offset(struct __sk_buff *skb)
+{
+	struct bpf_dynptr ptr1;
+	struct bpf_dynptr ptr2;
+	struct bpf_dynptr ptr3;
+	__u32 off = 2, size;
+
+	/* Get a dynptr */
+	if (bpf_dynptr_from_skb(skb, 0, &ptr1)) {
+		err = 1;
+		return 0;
+	}
+
+	if (bpf_dynptr_advance(&ptr1, off)) {
+		err = 2;
+		return 0;
+	}
+
+	size = bpf_dynptr_get_size(&ptr1);
+
+	/* Clone the dynptr at an invalid offset */
+	if (bpf_dynptr_clone(&ptr1, &ptr2, size + 1) != -ERANGE) {
+		err = 3;
+		return 0;
+	}
+
+	/* Clone the dynptr at valid offset */
+	if (bpf_dynptr_clone(&ptr1, &ptr3, off)) {
+		err = 4;
+		return 0;
+	}
+
+	if (bpf_dynptr_get_size(&ptr3) != size - off) {
+		err = 5;
+		return 0;
+	}
+
+	return 0;
+}
+
+static int iter_callback1(struct bpf_dynptr *ptr, void *ctx)
+{
+	return bpf_dynptr_get_size(ptr) + 1;
+}
+
+static int iter_callback2(struct bpf_dynptr *ptr, void *ctx)
+{
+	return -EFAULT;
+}
+
+SEC("cgroup_skb/egress")
+int test_dynptr_iterator(struct __sk_buff *skb)
+{
+	struct bpf_dynptr ptr;
+	__u32 off = 1, size;
+	/* Get a dynptr */
+	if (bpf_dynptr_from_skb(skb, 0, &ptr)) {
+		err = 1;
+		return 0;
+	}
+
+	if (bpf_dynptr_advance(&ptr, off)) {
+		err = 2;
+		return 0;
+	}
+
+	size = bpf_dynptr_get_size(&ptr);
+
+	/* Test the case where the callback tries to advance by more
+	 * bytes than available
+	 */
+	if (bpf_dynptr_iterator(&ptr, iter_callback1, NULL, 0) != -ERANGE) {
+		err = 3;
+		return 0;
+	}
+	if (bpf_dynptr_get_size(&ptr) != size) {
+		err = 4;
+		return 0;
+	}
+	if (bpf_dynptr_get_offset(&ptr) != off) {
+		err = 5;
+		return 0;
+	}
+
+	/* Test the case where the callback returns an error code */
+	if (bpf_dynptr_iterator(&ptr, iter_callback2, NULL, 0) != -EFAULT) {
+		err = 6;
+		return 0;
+	}
+
+	return 0;
+}
+
+static char values[3][64] = {};
+
+#define MAX_STRINGS_LEN 10
+static int parse_strings_callback(struct bpf_dynptr *ptr, int *nr_values)
+{
+	__u32 size = bpf_dynptr_get_size(ptr);
+	char buf[MAX_STRINGS_LEN] = {};
+	char *data;
+	int i, j, k;
+	int err;
+
+	if (size < MAX_STRINGS_LEN) {
+		err = bpf_dynptr_read(buf, size, ptr, 0, 0);
+		if (err)
+			return err;
+		data = buf;
+	} else {
+		data = bpf_dynptr_data(ptr, 0, MAX_STRINGS_LEN);
+		if (!data)
+			return -ENOENT;
+		size = MAX_STRINGS_LEN;
+	}
+
+	for (i = 0; i < size; i++) {
+		if (data[i] != '=')
+			continue;
+
+		for (j = i; j < size - i; j++) {
+			int index = 0;
+
+			if (data[j] != '/')
+				continue;
+
+			for (k = i + 1; k < j; k++) {
+				values[*nr_values][index] = data[k];
+				index += 1;
+			}
+
+			*nr_values += 1;
+			return j;
+		}
+
+		return -ENOENT;
+	}
+
+	return 0;
+}
+
+SEC("tp/syscalls/sys_enter_nanosleep")
+int iterator_parse_strings(void *ctx)
+{
+	char val[64] = "x=foo/y=bar/z=baz/";
+	struct bpf_dynptr ptr;
+	__u32 map_val_size;
+	int nr_values = 0;
+	__u32 key = 0;
+	char *map_val;
+
+	if (bpf_get_current_pid_tgid() >> 32 != pid)
+		return 0;
+
+	map_val_size = sizeof(val);
+
+	if (bpf_map_update_elem(&array_map2, &key, &val, 0)) {
+		err = 1;
+		return 0;
+	}
+
+	map_val = bpf_map_lookup_elem(&array_map2, &key);
+	if (!map_val) {
+		err = 2;
+		return 0;
+	}
+
+	if (bpf_dynptr_from_mem(map_val, map_val_size, 0, &ptr)) {
+		err = 3;
+		return 0;
+	}
+
+	if (bpf_dynptr_iterator(&ptr, parse_strings_callback,
+				&nr_values, 0)) {
+		err = 4;
+		return 0;
+	}
+
+	if (nr_values != 3) {
+		err = 8;
+		return 0;
+	}
+
+	if (memcmp(values[0], "foo", sizeof("foo"))) {
+		err = 5;
+		return 0;
+	}
+
+	if (memcmp(values[1], "bar", sizeof("bar"))) {
+		err = 6;
+		return 0;
+	}
+
+	if (memcmp(values[2], "baz", sizeof("baz"))) {
+		err = 7;
+		return 0;
+	}
+
+	return 0;
+}
-- 
2.30.2



* Re: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-07 20:55 [PATCH v2 bpf-next 0/6] Dynptr convenience helpers Joanne Koong
                   ` (5 preceding siblings ...)
  2022-12-07 20:55 ` [PATCH v2 bpf-next 6/6] selftests/bpf: Tests for dynptr convenience helpers Joanne Koong
@ 2022-12-08  1:54 ` Alexei Starovoitov
  2022-12-09  0:42   ` Andrii Nakryiko
  6 siblings, 1 reply; 57+ messages in thread
From: Alexei Starovoitov @ 2022-12-08  1:54 UTC (permalink / raw)
  To: Joanne Koong; +Cc: bpf, andrii, kernel-team, ast, daniel, martin.lau, song

On Wed, Dec 07, 2022 at 12:55:31PM -0800, Joanne Koong wrote:
> This patchset is the 3rd in the dynptr series. The 1st can be found here [0]
> and the 2nd can be found here [1].
> 
> In this patchset, the following convenience helpers are added for interacting
> with bpf dynamic pointers:
> 
>     * bpf_dynptr_data_rdonly
>     * bpf_dynptr_trim
>     * bpf_dynptr_advance
>     * bpf_dynptr_is_null
>     * bpf_dynptr_is_rdonly
>     * bpf_dynptr_get_size
>     * bpf_dynptr_get_offset
>     * bpf_dynptr_clone
>     * bpf_dynptr_iterator

This is great, but it really stretches uapi limits.
Please convert the above and those in [1] to kfuncs.
I know that there can be an argument made for consistency with existing dynptr uapi
helpers, but we got burned on them once and scrambled to add 'flags' argument.
kfuncs are unstable and can be adjusted/removed at any time later.
The verifier now supports dynptr in kfunc verification, so conversion should
be straightforward.
Thanks

> 
> Please note that this patchset will be rebased on top of dynptr refactoring/fixes
> once that is landed upstream.
> 
> [0] https://lore.kernel.org/bpf/20220523210712.3641569-1-joannelkoong@gmail.com/
> [1] https://lore.kernel.org/bpf/20221021011510.1890852-1-joannelkoong@gmail.com/
> 


* Re: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-08  1:54 ` [PATCH v2 bpf-next 0/6] Dynptr " Alexei Starovoitov
@ 2022-12-09  0:42   ` Andrii Nakryiko
  2022-12-09  1:30     ` Alexei Starovoitov
  0 siblings, 1 reply; 57+ messages in thread
From: Andrii Nakryiko @ 2022-12-09  0:42 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Joanne Koong, bpf, andrii, kernel-team, ast, daniel, martin.lau, song

On Wed, Dec 7, 2022 at 5:54 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Dec 07, 2022 at 12:55:31PM -0800, Joanne Koong wrote:
> > This patchset is the 3rd in the dynptr series. The 1st can be found here [0]
> > and the 2nd can be found here [1].
> >
> > In this patchset, the following convenience helpers are added for interacting
> > with bpf dynamic pointers:
> >
> >     * bpf_dynptr_data_rdonly
> >     * bpf_dynptr_trim
> >     * bpf_dynptr_advance
> >     * bpf_dynptr_is_null
> >     * bpf_dynptr_is_rdonly
> >     * bpf_dynptr_get_size
> >     * bpf_dynptr_get_offset
> >     * bpf_dynptr_clone
> >     * bpf_dynptr_iterator
>
> This is great, but it really stretches uapi limits.

Stretches in what sense? They are simple and straightforward getters
and trim/advance/clone are fundamental modifiers to be able to work
with a subset of dynptr's overall memory area.

> Please convert the above and those in [1] to kfuncs.
> I know that there can be an argument made for consistency with existing dynptr uapi

yeah, given we have bpf_dynptr_{read,write} and bpf_dynptr_data() as
BPF helpers, it makes sense to have such basic things like is_null and
trim/advance/clone as BPF helpers as well. Both for consistency and
because there is nothing unstable about them. We are not going to
remove dynptr as a concept, it's pretty well defined.

Out of the above list, perhaps only bpf_dynptr_iterator() might be
a candidate for kfunc. Though, personally, it makes sense to me to
keep it as a BPF helper without GPL restriction as well, given it is
meant for networking applications in the first place, and you don't
need to be GPL-compatible to write a useful networking BPF program, from
what I understand. But all the other ones are something you'd need to
make actual use of the dynptr concept in real-world BPF programs.

Can we please have those as BPF helpers, and we can decide to move
slightly fancier bpf_dynptr_iterator() (and future dynptr-related
extras) into kfunc?

> helpers, but we got burned on them once and scrambled to add 'flags' argument.
> kfuncs are unstable and can be adjusted/removed at any time later.

I don't see why we would remove any of the above list ever? They are
generic and fundamental to dynptr as a concept, they can't restrict
what dynptr can do in the future.

Also GPL restriction of kfuncs doesn't apply to these dynptr helpers
either, IMO.

> The verifier now supports dynptr in kfunc verification, so conversion should
> be straightforward.
> Thanks
>
> >
> > Please note that this patchset will be rebased on top of dynptr refactoring/fixes
> > once that is landed upstream.
> >
> > [0] https://lore.kernel.org/bpf/20220523210712.3641569-1-joannelkoong@gmail.com/
> > [1] https://lore.kernel.org/bpf/20221021011510.1890852-1-joannelkoong@gmail.com/
> >


* Re: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-09  0:42   ` Andrii Nakryiko
@ 2022-12-09  1:30     ` Alexei Starovoitov
  2022-12-09 22:24       ` Joanne Koong
  2022-12-12 20:12       ` Andrii Nakryiko
  0 siblings, 2 replies; 57+ messages in thread
From: Alexei Starovoitov @ 2022-12-09  1:30 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Thu, Dec 8, 2022 at 4:42 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Dec 7, 2022 at 5:54 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, Dec 07, 2022 at 12:55:31PM -0800, Joanne Koong wrote:
> > > This patchset is the 3rd in the dynptr series. The 1st can be found here [0]
> > > and the 2nd can be found here [1].
> > >
> > > In this patchset, the following convenience helpers are added for interacting
> > > with bpf dynamic pointers:
> > >
> > >     * bpf_dynptr_data_rdonly
> > >     * bpf_dynptr_trim
> > >     * bpf_dynptr_advance
> > >     * bpf_dynptr_is_null
> > >     * bpf_dynptr_is_rdonly
> > >     * bpf_dynptr_get_size
> > >     * bpf_dynptr_get_offset
> > >     * bpf_dynptr_clone
> > >     * bpf_dynptr_iterator
> >
> > This is great, but it really stretches uapi limits.
>
> Stretches in what sense? They are simple and straightforward getters
> and trim/advance/clone are fundamental modifiers to be able to work
> with a subset of dynptr's overall memory area.
>
> > Please convert the above and those in [1] to kfuncs.
> > I know that there can be an argument made for consistency with existing dynptr uapi
>
> yeah, given we have bpf_dynptr_{read,write} and bpf_dynptr_data() as
> BPF helpers, it makes sense to have such basic things like is_null and
> trim/advance/clone as BPF helpers as well. Both for consistency and
> because there is nothing unstable about them. We are not going to
> remove dynptr as a concept, it's pretty well defined.
>
> Out of the above list perhaps only move bpf_dynptr_iterator() might be
> a candidate for kfunc. Though, personally, it makes sense to me to
> keep it as BPF helper without GPL restriction as well, given it is
> meant for networking applications in the first place, and you don't
> need to be GPL-compatible to write useful networking BPF program, from
> what I understand. But all the other ones is something you'd need to
> make actual use of dynptr concept in real-world BPF programs.
>
> Can we please have those as BPF helpers, and we can decide to move
> slightly fancier bpf_dynptr_iterator() (and future dynptr-related
> extras) into kfunc?

Sorry, uapi concerns are more important here.
non-gpl and consistency don't even come close.
We've been doing everything new as kfuncs and dynptr is not special.

> > helpers, but we got burned on them once and scrambled to add 'flags' argument.
> > kfuncs are unstable and can be adjusted/removed at any time later.
>
> I don't see why we would remove any of the above list ever? They are
> generic and fundamental to dynptr as a concept, they can't restrict
> what dynptr can do in the future.

It's not about removing them, but about changing them.

Just for example the whole discussion of whether frags should
be handled transparently and how write is handled didn't inspire
confidence that there is a strong consensus on semantics
of these new dynptr accessors.

Scrambling to add flags to dynptr helpers was another red flag.

All signs indicate that we're not ready to fix the dynptr api.
It will evolve and has to evolve without uapi pain.

kfuncs only. For everything. Please.


* Re: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-09  1:30     ` Alexei Starovoitov
@ 2022-12-09 22:24       ` Joanne Koong
  2022-12-12 20:12       ` Andrii Nakryiko
  1 sibling, 0 replies; 57+ messages in thread
From: Joanne Koong @ 2022-12-09 22:24 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Thu, Dec 8, 2022 at 5:30 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Dec 8, 2022 at 4:42 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Wed, Dec 7, 2022 at 5:54 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Wed, Dec 07, 2022 at 12:55:31PM -0800, Joanne Koong wrote:
> > > > This patchset is the 3rd in the dynptr series. The 1st can be found here [0]
> > > > and the 2nd can be found here [1].
> > > >
> > > > In this patchset, the following convenience helpers are added for interacting
> > > > with bpf dynamic pointers:
> > > >
> > > >     * bpf_dynptr_data_rdonly
> > > >     * bpf_dynptr_trim
> > > >     * bpf_dynptr_advance
> > > >     * bpf_dynptr_is_null
> > > >     * bpf_dynptr_is_rdonly
> > > >     * bpf_dynptr_get_size
> > > >     * bpf_dynptr_get_offset
> > > >     * bpf_dynptr_clone
> > > >     * bpf_dynptr_iterator
> > >
> > > This is great, but it really stretches uapi limits.
> >
> > Stretches in what sense? They are simple and straightforward getters
> > and trim/advance/clone are fundamental modifiers to be able to work
> > with a subset of dynptr's overall memory area.
> >
> > > Please convert the above and those in [1] to kfuncs.
> > > I know that there can be an argument made for consistency with existing dynptr uapi
> >
> > yeah, given we have bpf_dynptr_{read,write} and bpf_dynptr_data() as
> > BPF helpers, it makes sense to have such basic things like is_null and
> > trim/advance/clone as BPF helpers as well. Both for consistency and
> > because there is nothing unstable about them. We are not going to
> > remove dynptr as a concept, it's pretty well defined.
> >
> > Out of the above list perhaps only move bpf_dynptr_iterator() might be
> > a candidate for kfunc. Though, personally, it makes sense to me to
> > keep it as BPF helper without GPL restriction as well, given it is
> > meant for networking applications in the first place, and you don't
> > need to be GPL-compatible to write useful networking BPF program, from
> > what I understand. But all the other ones is something you'd need to
> > make actual use of dynptr concept in real-world BPF programs.
> >
> > Can we please have those as BPF helpers, and we can decide to move
> > slightly fancier bpf_dynptr_iterator() (and future dynptr-related
> > extras) into kfunc?
>
> Sorry, uapi concerns are more important here.
> non-gpl and consistency don't even come close.
> We've been doing everything new as kfuncs and dynptr is not special.
>
> > > helpers, but we got burned on them once and scrambled to add 'flags' argument.
> > > kfuncs are unstable and can be adjusted/removed at any time later.
> >
> > I don't see why we would remove any of the above list ever? They are
> > generic and fundamental to dynptr as a concept, they can't restrict
> > what dynptr can do in the future.
>
> It's not about removing them, but about changing them.
>
> Just for example the whole discussion of whether frags should
> be handled transparently and how write is handled didn't inspire
> confidence that there is a strong consensus on semantics
> of these new dynptr accessors.
>
> Scrambling to add flags to dynptr helpers was another red flag.
>
> All signs are pointing out that we're not ready do fix dynptr api.
> It will evolve and has to evolve without uapi pain.
>
> kfuncs only. For everything. Please.

Thanks for your feedback, Alexei and Andrii. I share the same opinion
as Andrii about helpers for the APIs that are straightforward (eg
bpf_dynptr_get_offset), but I see your point as well about doing
everything new as kfuncs.

I'll change this to use kfuncs for v3.


* Re: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-09  1:30     ` Alexei Starovoitov
  2022-12-09 22:24       ` Joanne Koong
@ 2022-12-12 20:12       ` Andrii Nakryiko
  2022-12-13 23:50         ` Joanne Koong
  2022-12-16 17:35         ` Alexei Starovoitov
  1 sibling, 2 replies; 57+ messages in thread
From: Andrii Nakryiko @ 2022-12-12 20:12 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Thu, Dec 8, 2022 at 5:30 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Dec 8, 2022 at 4:42 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Wed, Dec 7, 2022 at 5:54 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Wed, Dec 07, 2022 at 12:55:31PM -0800, Joanne Koong wrote:
> > > > This patchset is the 3rd in the dynptr series. The 1st can be found here [0]
> > > > and the 2nd can be found here [1].
> > > >
> > > > In this patchset, the following convenience helpers are added for interacting
> > > > with bpf dynamic pointers:
> > > >
> > > >     * bpf_dynptr_data_rdonly
> > > >     * bpf_dynptr_trim
> > > >     * bpf_dynptr_advance
> > > >     * bpf_dynptr_is_null
> > > >     * bpf_dynptr_is_rdonly
> > > >     * bpf_dynptr_get_size
> > > >     * bpf_dynptr_get_offset
> > > >     * bpf_dynptr_clone
> > > >     * bpf_dynptr_iterator
> > >
> > > This is great, but it really stretches uapi limits.
> >
> > Stretches in what sense? They are simple and straightforward getters
> > and trim/advance/clone are fundamental modifiers to be able to work
> > with a subset of dynptr's overall memory area.
> >
> > > Please convert the above and those in [1] to kfuncs.
> > > I know that there can be an argument made for consistency with existing dynptr uapi
> >
> > yeah, given we have bpf_dynptr_{read,write} and bpf_dynptr_data() as
> > BPF helpers, it makes sense to have such basic things like is_null and
> > trim/advance/clone as BPF helpers as well. Both for consistency and
> > because there is nothing unstable about them. We are not going to
> > remove dynptr as a concept, it's pretty well defined.
> >
> > Out of the above list perhaps only move bpf_dynptr_iterator() might be
> > a candidate for kfunc. Though, personally, it makes sense to me to
> > keep it as BPF helper without GPL restriction as well, given it is
> > meant for networking applications in the first place, and you don't
> > need to be GPL-compatible to write useful networking BPF program, from
> > what I understand. But all the other ones is something you'd need to
> > make actual use of dynptr concept in real-world BPF programs.
> >
> > Can we please have those as BPF helpers, and we can decide to move
> > slightly fancier bpf_dynptr_iterator() (and future dynptr-related
> > extras) into kfunc?
>
> Sorry, uapi concerns are more important here.

What about the overall user experience and adoption?

There is no clean way to ever move from unstable kfunc to a stable helper.

BPF helpers also have the advantage of working on all architectures,
whether that architecture supports kfuncs or not, whether it supports
JIT or not.

BPF helpers are also nicely self-discoverable and documented in
include/uapi/linux/bpf.h, in one place where other BPF helpers are.
This is a big deal, especially for non-expert BPF users (a vast
majority of BPF users).

> non-gpl and consistency don't even come close.
> We've been doing everything new as kfuncs and dynptr is not special.

I think dynptr is quite special. It's a very generic and fundamental
concept, part of core BPF experience. It's a more dynamic counterpart
to an inflexible statically sized `void * + size` pair of arguments
sent to helpers for input or output memory regions. Dynptr has no
inherent dependencies on BTF, kfuncs, trampolines, JIT, nothing.

By requiring kfunc-based helpers we are significantly raising the
obstacles towards adopting dynptr across a wide range of BPF
applications.

And the only advantage in return is that we get a hypothetical chance
to change something in the future. But let's see if that will ever be
necessary for the helpers Joanne is adding:

1. Generic accessors to check validity of *any* dynptr, and its
inherent properties like offset, available size, read-only property
(just as useful sometimes as bpf_ringbuf_query() is for ringbufs,
both for debugging and for various heuristics in production).

bpf_dynptr_is_null(struct bpf_dynptr *ptr)
long bpf_dynptr_get_size(struct bpf_dynptr *ptr)
long bpf_dynptr_get_offset(struct bpf_dynptr *ptr)
bpf_dynptr_is_rdonly(struct bpf_dynptr *ptr)

There is nothing to add or remove here. No flags, no change in semantics.

2. Manipulators to copy an existing dynptr's view and narrow it down to a
subset (e.g., for when you have a large memory blob, but need to
calculate hashes over a smaller subset, without destroying the original
dynptr, because it will be used later for some other access). We can
debate whether clone should get an offset or not, but it doesn't change
much (except usability in common cases). Again, nothing to add or
remove otherwise, and pretty fundamental for real use of the full power of
dynptr.

long bpf_dynptr_clone(struct bpf_dynptr *ptr, struct bpf_dynptr *clone, u32 offset)
long bpf_dynptr_trim(struct bpf_dynptr *ptr, u32 len)
long bpf_dynptr_advance(struct bpf_dynptr *ptr, u32 len)
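
The clone/trim/advance trio can be modeled in userspace as operations on an (offset, size) window. This is a toy sketch under stated assumptions: `struct toy_dynptr` and the `toy_` functions are hypothetical, advance is assumed to move the start of the view forward, trim is assumed to drop bytes from the end, and the `-EINVAL`/`-ERANGE` error codes are assumed rather than taken from the patches.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a dynptr: a window into a buffer. */
struct toy_dynptr {
	uint8_t *data;		/* underlying buffer, NULL if invalid */
	uint32_t offset;	/* start of the view within data */
	uint32_t size;		/* bytes visible from offset */
	bool rdonly;
};

/* Move the start of the view forward by len bytes. */
static long toy_dynptr_advance(struct toy_dynptr *p, uint32_t len)
{
	if (!p->data)
		return -EINVAL;
	if (len > p->size)
		return -ERANGE;
	p->offset += len;
	p->size -= len;
	return 0;
}

/* Drop len bytes from the end of the view. */
static long toy_dynptr_trim(struct toy_dynptr *p, uint32_t len)
{
	if (!p->data)
		return -EINVAL;
	if (len > p->size)
		return -ERANGE;
	p->size -= len;
	return 0;
}

/* Copy the view into clone, then advance only the copy by offset. */
static long toy_dynptr_clone(const struct toy_dynptr *src,
			     struct toy_dynptr *clone, uint32_t offset)
{
	if (!src->data)
		return -EINVAL;
	if (offset > src->size)
		return -ERANGE;
	*clone = *src;
	return toy_dynptr_advance(clone, offset);
}
```

In this model, hashing bytes 16..23 of a 64-byte blob means cloning at offset 16 and trimming 40 bytes off the clone, while the original keeps its full 64-byte view for later accesses.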

3. This one is the only one I feel less strongly about, but mostly
because I can implement the same (even though less ergonomically, of
course) with bpf_loop() and bpf_dynptr_{clone,advance}.

long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn,
void *callback_ctx, u64 flags)
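
The claim that an iterator can be built from a bounded loop plus clone and advance can be sketched in userspace as follows. All names here (`struct toy_dynptr`, `toy_iterate`, `chunk_fn`, `sum_bytes`) are hypothetical, and the fixed chunk size and "non-zero return stops the loop" callback convention are assumptions of this sketch, not the semantics of bpf_dynptr_iterator() or bpf_loop().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a dynptr: a window into a buffer. */
struct toy_dynptr {
	const uint8_t *data;	/* underlying buffer, NULL if invalid */
	uint32_t offset;	/* start of the view within data */
	uint32_t size;		/* bytes visible from offset */
};

/* Callback sees one chunk at a time; non-zero return stops the loop. */
typedef int (*chunk_fn)(const uint8_t *chunk, uint32_t len, void *ctx);

/*
 * Walk the view in chunks of at most chunk_len bytes, for at most
 * max_iters iterations (the bound plays the role of a bounded loop).
 * Works on a copy, so the caller's dynptr is left untouched.
 * Returns the number of chunks fully consumed, or -EINVAL.
 */
static long toy_iterate(const struct toy_dynptr *ptr, uint32_t chunk_len,
			uint32_t max_iters, chunk_fn fn, void *ctx)
{
	struct toy_dynptr it;
	uint32_t i;

	if (!ptr->data || chunk_len == 0)
		return -EINVAL;

	it = *ptr;	/* stands in for a clone at offset 0 */
	for (i = 0; i < max_iters && it.size > 0; i++) {
		uint32_t n = it.size < chunk_len ? it.size : chunk_len;

		if (fn(it.data + it.offset, n, ctx))
			break;
		it.offset += n;	/* stands in for an advance by n */
		it.size -= n;
	}
	return i;
}

/* Example callback: accumulate a byte sum into *ctx. */
static int sum_bytes(const uint8_t *chunk, uint32_t len, void *ctx)
{
	uint32_t *sum = ctx;
	uint32_t i;

	for (i = 0; i < len; i++)
		*sum += chunk[i];
	return 0;
}
```

The bookkeeping that a dedicated iterator helper would hide is the copy, the per-iteration length clamp, and the advance, which is the ergonomic cost mentioned above.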


None of the above adds to or changes the semantics of dynptr as a
concept. There is nothing that we'd need to change.


>
> > > helpers, but we got burned on them once and scrambled to add 'flags' argument.
> > > kfuncs are unstable and can be adjusted/removed at any time later.

It's unfair to block these helpers just because we decided to add
flags to one of the previous ones (before the final release). And even
if we hadn't managed to do it in time, the worst outcome would probably
be another variant of the BPF helper. Definitely something to avoid, but
not the end of the world. But as I pointed out above, this set of helpers
won't change, as they just complete the already established dynptr
ecosystem of helpers.

> >
> > I don't see why we would remove any of the above list ever? They are
> > generic and fundamental to dynptr as a concept, they can't restrict
> > what dynptr can do in the future.
>
> It's not about removing them, but about changing them.
>
> Just for example the whole discussion of whether frags should
> be handled transparently and how write is handled didn't inspire
> confidence that there is a strong consensus on semantics
> of these new dynptr accessors.

So let's start with acknowledging that skb and xdp buffer abstractions
as logically contiguous memory area are inherently complex and
non-perfect due to the way that kernel handles them for performance
and flexibility reasons.

Let's also note that verifier knows specific flavor of dynptr and thus
can enforce additional restrictions based on specifically SKB/XDP
flavor vs LOCAL/RINGBUF. So just because there is no perfect way to
handle all the SKB/XDP physical non-contiguity, doesn't mean that the
dynptr concept itself is flawed or not well thought out. It's just
that for SKB/XDP there is no perfect solution. Dynptr doesn't change
anything here, rather it actually simplifies a bunch of stuff,
especially for common scenarios.

I'd argue that for wider SKB/XDP dynptr adoption in the networking
world, those dynptr constructor helpers should be helpers and not
kfuncs as well. But I'd wish someone with more networking tie-ins
would argue this instead of me.

>
> Scrambling to add flags to dynptr helpers was another red flag.
>
> All signs are pointing out that we're not ready to fix dynptr api.

I disagree, it's an overly harsh generalization.

> It will evolve and has to evolve without uapi pain.
>
> kfuncs only. For everything. Please.

This is yet another generalized blanket statement I disagree with.
Over the years I've gotten the impression that the BPF subsystem is
generally a proud proponent of pragmatic, flexible, and common sense
engineering approaches, so this hard-and-fast rule with no room for
nuance sounds weird.

There are things that belong in fundamental and core BPF concepts, and
it makes sense to keep them as stable abstractions and helpers. And
there are various things (like interfacing into kernel mechanics, its
types and systems) which totally make sense to keep unstable.


* Re: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-12 20:12       ` Andrii Nakryiko
@ 2022-12-13 23:50         ` Joanne Koong
  2022-12-14  0:57           ` Andrii Nakryiko
  2022-12-16 17:35         ` Alexei Starovoitov
  1 sibling, 1 reply; 57+ messages in thread
From: Joanne Koong @ 2022-12-13 23:50 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Mon, Dec 12, 2022 at 12:12 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Dec 8, 2022 at 5:30 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu, Dec 8, 2022 at 4:42 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Wed, Dec 7, 2022 at 5:54 PM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Wed, Dec 07, 2022 at 12:55:31PM -0800, Joanne Koong wrote:
> > > > > This patchset is the 3rd in the dynptr series. The 1st can be found here [0]
> > > > > and the 2nd can be found here [1].
> > > > >
> > > > > In this patchset, the following convenience helpers are added for interacting
> > > > > with bpf dynamic pointers:
> > > > >
> > > > >     * bpf_dynptr_data_rdonly
> > > > >     * bpf_dynptr_trim
> > > > >     * bpf_dynptr_advance
> > > > >     * bpf_dynptr_is_null
> > > > >     * bpf_dynptr_is_rdonly
> > > > >     * bpf_dynptr_get_size
> > > > >     * bpf_dynptr_get_offset
> > > > >     * bpf_dynptr_clone
> > > > >     * bpf_dynptr_iterator
> > > >
> > > > This is great, but it really stretches uapi limits.
> > >
> > > Stretches in what sense? They are simple and straightforward getters
> > > and trim/advance/clone are fundamental modifiers to be able to work
> > > with a subset of dynptr's overall memory area.
> > >
> > > > Please convert the above and those in [1] to kfuncs.
> > > > I know that there can be an argument made for consistency with existing dynptr uapi
> > >
> > > yeah, given we have bpf_dynptr_{read,write} and bpf_dynptr_data() as
> > > BPF helpers, it makes sense to have such basic things like is_null and
> > > trim/advance/clone as BPF helpers as well. Both for consistency and
> > > because there is nothing unstable about them. We are not going to
> > > remove dynptr as a concept, it's pretty well defined.
> > >
> > > Out of the above list perhaps only bpf_dynptr_iterator() might be
> > > a candidate for kfunc. Though, personally, it makes sense to me to
> > > keep it as BPF helper without GPL restriction as well, given it is
> > > meant for networking applications in the first place, and you don't
> > > need to be GPL-compatible to write useful networking BPF program, from
> > > what I understand. But all the other ones is something you'd need to
> > > make actual use of dynptr concept in real-world BPF programs.
> > >
> > > Can we please have those as BPF helpers, and we can decide to move
> > > slightly fancier bpf_dynptr_iterator() (and future dynptr-related
> > > extras) into kfunc?
> >
> > Sorry, uapi concerns are more important here.
>
> What about the overall user experience and adoption?
>
> There is no clean way to ever move from unstable kfunc to a stable helper.
>
> BPF helpers also have the advantage of working on all architectures,
> whether that architecture supports kfuncs or not, whether it supports
> JIT or not.

Oh interesting, I didn't realize some architectures do not support kfuncs.

Out of curiosity, can you elaborate on "no clean way to move from
unstable kfunc to a stable helper"? If for example we needed to move
something from kfunc -> helper, could we not just remove the code
where we added it as a kfunc (eg defining a BTF_ID for it) and add it
as a helper instead?

>
> BPF helpers are also nicely self-discoverable and documented in
> include/uapi/linux/bpf.h, in one place where other BPF helpers are.
> This is a big deal, especially for non-expert BPF users (a vast
> majority of BPF users).
>
> > non-gpl and consistency don't even come close.
> > We've been doing everything new as kfuncs and dynptr is not special.
>
> I think dynptr is quite special. It's a very generic and fundamental
> concept, part of core BPF experience. It's a more dynamic counterpart
> to an inflexible statically sized `void * + size` pair of arguments
> sent to helpers for input or output memory regions. Dynptr has no
> inherent dependencies on BTF, kfuncs, trampolines, JIT, nothing.
>
> By requiring kfunc-based helpers we are significantly raising the
> obstacles towards adopting dynptr across a wide range of BPF
> applications.
>
> And the only advantage in return is that we get a hypothetical chance
> to change something in the future. But let's see if that will ever be
> necessary for the helpers Joanne is adding:
>
> 1. Generic accessors to check validity of *any* dynptr, and its
> inherent properties like offset, available size, read-only property
> (just as useful sometimes as bpf_ringbuf_query() is for ringbufs,
> both for debugging and for various heuristics in production).
>
> bpf_dynptr_is_null(struct bpf_dynptr *ptr)
> long bpf_dynptr_get_size(struct bpf_dynptr *ptr)
> long bpf_dynptr_get_offset(struct bpf_dynptr *ptr)
> bpf_dynptr_is_rdonly(struct bpf_dynptr *ptr)
>
> There is nothing to add or remove here. No flags, no change in semantics.
>
> 2. Manipulators to copy existing dynptr's view and narrow it down to a
> subset (e.g., for when you have a large memory blob, but need to
> calculate hashes over a smaller subset, without destroying the original
> dynptr, because it will be used later for some other access). We can
> debate whether clone should get offset or not, but it doesn't change
> much (except usability in common cases). Again, nothing to add or
> remove otherwise, and pretty fundamental for real use of full power of
> dynptr.
>
> long bpf_dynptr_clone(struct bpf_dynptr *ptr, struct bpf_dynptr
> *clone, u32 offset)
> long bpf_dynptr_trim(struct bpf_dynptr *ptr, u32 len)
> long bpf_dynptr_advance(struct bpf_dynptr *ptr, u32 len)
>
> 3. This one is the only one I feel less strongly about, but mostly
> because I can implement the same (even though less ergonomically, of
> course) with bpf_loop() and bpf_dynptr_{clone,advance}.
>
> long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn,
> void *callback_ctx, u64 flags)
>
>
> All of the above don't add or change any semantics to dynptr as a
> concept. There is nothing that we'd need to change.
>
>
> >
> > > > helpers, but we got burned on them once and scrambled to add 'flags' argument.
> > > > kfuncs are unstable and can be adjusted/removed at any time later.
>
> It's unfair to block these helpers just because we decided to add
> flags to one of the previous ones (before the final release). And even
> if we hadn't managed to do it in time, the worst outcome would probably
> be another variant of the BPF helper. Definitely something to avoid, but
> not the end of the world. But as I pointed out above, this set of helpers
> won't change, as they just complete the already established dynptr
> ecosystem of helpers.
>
> > >
> > > I don't see why we would remove any of the above list ever? They are
> > > generic and fundamental to dynptr as a concept, they can't restrict
> > > what dynptr can do in the future.
> >
> > It's not about removing them, but about changing them.
> >
> > Just for example the whole discussion of whether frags should
> > be handled transparently and how write is handled didn't inspire
> > confidence that there is a strong consensus on semantics
> > of these new dynptr accessors.
>
> So let's start with acknowledging that skb and xdp buffer abstractions
> as logically contiguous memory area are inherently complex and
> non-perfect due to the way that kernel handles them for performance
> and flexibility reasons.
>
> Let's also note that verifier knows specific flavor of dynptr and thus
> can enforce additional restrictions based on specifically SKB/XDP
> flavor vs LOCAL/RINGBUF. So just because there is no perfect way to
> handle all the SKB/XDP physical non-contiguity, doesn't mean that the
> dynptr concept itself is flawed or not well thought out. It's just
> that for SKB/XDP there is no perfect solution. Dynptr doesn't change
> anything here, rather it actually simplifies a bunch of stuff,
> especially for common scenarios.
>
> I'd argue that for wider SKB/XDP dynptr adoption in the networking
> world, those dynptr constructor helpers should be helpers and not
> kfuncs as well. But I'd wish someone with more networking tie-ins
> would argue this instead of me.

I'm not that familiar with the semantics of bpf kfuncs, so to clarify:
from a user API perspective, is there any difference in calling the
function from the bpf program as a helper vs. kfunc?

>
> >
> > Scrambling to add flags to dynptr helpers was another red flag.
> >
> > All signs are pointing out that we're not ready to fix dynptr api.
>
> I disagree, it's an overly harsh generalization.
>
> > It will evolve and has to evolve without uapi pain.
> >
> > kfuncs only. For everything. Please.
>
> This is yet another generalized blanket statement I disagree with.
> Over the years I've got an impression that the BPF subsystem is
> generally a proud proponent of pragmatic, flexible, and common sense
> engineering approaches, so this hard-and-fast rule with no room for
> nuance sounds weird.
>
> There are things that belong in fundamental and core BPF concepts, and
> it makes sense to keep them as stable abstractions and helpers. And
> there are various things (like interfacing into kernel mechanics, its
> types and systems) which totally make sense to keep unstable.

I agree with all of your points. I know Alexei is on PTO these next
two weeks, so I will in the meantime table this and work on the dynptr
memory allocation patchset and a dynptr documentation write-up.

Thanks for the discussion!


* Re: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-13 23:50         ` Joanne Koong
@ 2022-12-14  0:57           ` Andrii Nakryiko
  2022-12-14 21:25             ` Joanne Koong
  0 siblings, 1 reply; 57+ messages in thread
From: Andrii Nakryiko @ 2022-12-14  0:57 UTC (permalink / raw)
  To: Joanne Koong
  Cc: Alexei Starovoitov, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Tue, Dec 13, 2022 at 3:50 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Mon, Dec 12, 2022 at 12:12 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Dec 8, 2022 at 5:30 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Thu, Dec 8, 2022 at 4:42 PM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Wed, Dec 7, 2022 at 5:54 PM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Wed, Dec 07, 2022 at 12:55:31PM -0800, Joanne Koong wrote:
> > > > > > This patchset is the 3rd in the dynptr series. The 1st can be found here [0]
> > > > > > and the 2nd can be found here [1].
> > > > > >
> > > > > > In this patchset, the following convenience helpers are added for interacting
> > > > > > with bpf dynamic pointers:
> > > > > >
> > > > > >     * bpf_dynptr_data_rdonly
> > > > > >     * bpf_dynptr_trim
> > > > > >     * bpf_dynptr_advance
> > > > > >     * bpf_dynptr_is_null
> > > > > >     * bpf_dynptr_is_rdonly
> > > > > >     * bpf_dynptr_get_size
> > > > > >     * bpf_dynptr_get_offset
> > > > > >     * bpf_dynptr_clone
> > > > > >     * bpf_dynptr_iterator
> > > > >
> > > > > This is great, but it really stretches uapi limits.
> > > >
> > > > Stretches in what sense? They are simple and straightforward getters
> > > > and trim/advance/clone are fundamental modifiers to be able to work
> > > > with a subset of dynptr's overall memory area.
> > > >
> > > > > Please convert the above and those in [1] to kfuncs.
> > > > > I know that there can be an argument made for consistency with existing dynptr uapi
> > > >
> > > > yeah, given we have bpf_dynptr_{read,write} and bpf_dynptr_data() as
> > > > BPF helpers, it makes sense to have such basic things like is_null and
> > > > trim/advance/clone as BPF helpers as well. Both for consistency and
> > > > because there is nothing unstable about them. We are not going to
> > > > remove dynptr as a concept, it's pretty well defined.
> > > >
> > > > Out of the above list perhaps only bpf_dynptr_iterator() might be
> > > > a candidate for kfunc. Though, personally, it makes sense to me to
> > > > keep it as BPF helper without GPL restriction as well, given it is
> > > > meant for networking applications in the first place, and you don't
> > > > need to be GPL-compatible to write useful networking BPF program, from
> > > > what I understand. But all the other ones is something you'd need to
> > > > make actual use of dynptr concept in real-world BPF programs.
> > > >
> > > > Can we please have those as BPF helpers, and we can decide to move
> > > > slightly fancier bpf_dynptr_iterator() (and future dynptr-related
> > > > extras) into kfunc?
> > >
> > > Sorry, uapi concerns are more important here.
> >
> > What about the overall user experience and adoption?
> >
> > There is no clean way to ever move from unstable kfunc to a stable helper.
> >
> > BPF helpers also have the advantage of working on all architectures,
> > whether that architecture supports kfuncs or not, whether it supports
> > JIT or not.
>
> Oh interesting, I didn't realize some architectures do not support kfuncs.
>
> Out of curiosity, can you elaborate on "no clean way to move from
> unstable kfunc to a stable helper"? If for example we needed to move
> something from kfunc -> helper, could we not just remove the code
> where we added it as a kfunc (eg defining a BTF_ID for it) and add it
> as a helper instead?

We could, in the kernel. And we'd make users' lives horrible.

If, say, bpf_dynptr_is_null() is defined as a kfunc, it will be exposed
to the user's BPF application (actually, the user would have to find it
in the kernel and copy/paste the definition manually) as:

extern bool bpf_dynptr_is_null(const struct bpf_dynptr *p) __ksym;

When we "stabilize it" and make it a helper, it turns into the following
definition supplied by libbpf in its bpf_helper_defs.h header
(auto-generated from include/uapi/linux/bpf.h):

static bool (*bpf_dynptr_is_null)(const struct bpf_dynptr *p) = (void *) 777;

From the C source code perspective both will be called exactly the same,
but the BPF assembly generated for them will be different. For a kfunc it
will be a `call -1;` instruction specially patched by libbpf, with an
embedded BTF object ID and BTF type ID corresponding to this kfunc.
For a BPF helper it will be simply `call 777;`. The two are processed
by the verifier very differently.
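
The difference between the two call instructions can be sketched with a local mirror of the 8-byte BPF instruction layout. This is an illustration under assumptions, not libbpf's code: `struct toy_insn` and the `toy_` names are hypothetical, the opcode 0x85 and the pseudo-call marker 2 are meant to mirror `BPF_JMP | BPF_CALL` and `BPF_PSEUDO_KFUNC_CALL` from include/uapi/linux/bpf.h and should be checked against your headers, 777 is the placeholder helper ID used above, and 12345 in the usage note is a made-up BTF type ID.

```c
#include <stdint.h>

/* Local mirror of the BPF instruction layout (cf. struct bpf_insn). */
struct toy_insn {
	uint8_t code;		/* opcode */
	uint8_t dst_reg : 4;	/* destination register */
	uint8_t src_reg : 4;	/* source register, or pseudo-call marker */
	int16_t off;		/* signed offset */
	int32_t imm;		/* signed immediate */
};

#define TOY_JMP_CALL		0x85	/* assumed: BPF_JMP | BPF_CALL */
#define TOY_PSEUDO_KFUNC_CALL	2	/* assumed: BPF_PSEUDO_KFUNC_CALL */

/* Helper call: imm is the stable helper ID from the UAPI header. */
static struct toy_insn toy_call_helper(int32_t helper_id)
{
	struct toy_insn insn = {
		.code = TOY_JMP_CALL,
		.imm = helper_id,
	};

	return insn;
}

/*
 * kfunc call after resolution: src_reg marks it as a kfunc call,
 * imm holds a BTF type ID that is only meaningful for one kernel
 * build, and off selects which BTF object that ID belongs to.
 */
static struct toy_insn toy_call_kfunc(int32_t btf_type_id, int16_t btf_obj_idx)
{
	struct toy_insn insn = {
		.code = TOY_JMP_CALL,
		.src_reg = TOY_PSEUDO_KFUNC_CALL,
		.off = btf_obj_idx,
		.imm = btf_type_id,
	};

	return insn;
}
```

In this model `toy_call_helper(777)` and `toy_call_kfunc(12345, 0)` share an opcode but differ in `src_reg` and in what `imm` means, so one compiled call site cannot serve as both.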

From a BPF program's standpoint it's impossible to support both ways of
calling the same bpf_dynptr_is_null(), because we get a naming conflict,
and there is no single BPF assembly instruction that would support
both ways.

You'd have to get really creative to transparently call this helper
without caring whether it is a kfunc or a BPF helper. Or you'd have to
compile and distribute two variants of the same BPF object file. Both
suck. BPF CO-RE is nice and all, but we do it due to necessity, not
because it's fun and easy. So if we migrate a kfunc to become a BPF
helper, we'd most probably need to pick a new name for the helper
that's different from the kfunc's.

And it's currently not that easy to detect whether kfunc is available
or not (see [0]).

  [0] https://lore.kernel.org/bpf/de495e3a-cf06-ff85-1a4a-185621c9211a@linux.dev/



>
> >
> > BPF helpers are also nicely self-discoverable and documented in
> > include/uapi/linux/bpf.h, in one place where other BPF helpers are.
> > This is a big deal, especially for non-expert BPF users (a vast
> > majority of BPF users).
> >
> > > non-gpl and consistency don't even come close.
> > > We've been doing everything new as kfuncs and dynptr is not special.
> >
> > I think dynptr is quite special. It's a very generic and fundamental
> > concept, part of core BPF experience. It's a more dynamic counterpart
> > to an inflexible statically sized `void * + size` pair of arguments
> > sent to helpers for input or output memory regions. Dynptr has no
> > inherent dependencies on BTF, kfuncs, trampolines, JIT, nothing.
> >
> > By requiring kfunc-based helpers we are significantly raising the
> > obstacles towards adopting dynptr across a wide range of BPF
> > applications.
> >
> > And the only advantage in return is that we get a hypothetical chance
> > to change something in the future. But let's see if that will ever be
> > necessary for the helpers Joanne is adding:
> >
> > 1. Generic accessors to check validity of *any* dynptr, and its
> > inherent properties like offset, available size, read-only property
> > (just as useful sometimes as bpf_ringbuf_query() is for ringbufs,
> > both for debugging and for various heuristics in production).
> >
> > bpf_dynptr_is_null(struct bpf_dynptr *ptr)
> > long bpf_dynptr_get_size(struct bpf_dynptr *ptr)
> > long bpf_dynptr_get_offset(struct bpf_dynptr *ptr)
> > bpf_dynptr_is_rdonly(struct bpf_dynptr *ptr)
> >
> > There is nothing to add or remove here. No flags, no change in semantics.
> >
> > 2. Manipulators to copy existing dynptr's view and narrow it down to a
> > subset (e.g., for when you have a large memory blob, but need to
> > calculate hashes over a smaller subset, without destroying the original
> > dynptr, because it will be used later for some other access). We can
> > debate whether clone should get offset or not, but it doesn't change
> > much (except usability in common cases). Again, nothing to add or
> > remove otherwise, and pretty fundamental for real use of full power of
> > dynptr.
> >
> > long bpf_dynptr_clone(struct bpf_dynptr *ptr, struct bpf_dynptr
> > *clone, u32 offset)
> > long bpf_dynptr_trim(struct bpf_dynptr *ptr, u32 len)
> > long bpf_dynptr_advance(struct bpf_dynptr *ptr, u32 len)
> >
> > 3. This one is the only one I feel less strongly about, but mostly
> > because I can implement the same (even though less ergonomically, of
> > course) with bpf_loop() and bpf_dynptr_{clone,advance}.
> >
> > long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn,
> > void *callback_ctx, u64 flags)
> >
> >
> > All of the above don't add or change any semantics to dynptr as a
> > concept. There is nothing that we'd need to change.
> >
> >
> > >
> > > > > helpers, but we got burned on them once and scrambled to add 'flags' argument.
> > > > > kfuncs are unstable and can be adjusted/removed at any time later.
> >
> > It's unfair to block these helpers just because we decided to add
> > flags to one of the previous ones (before the final release). And even
> > if we hadn't managed to do it in time, the worst outcome would probably
> > be another variant of the BPF helper. Definitely something to avoid, but
> > not the end of the world. But as I pointed out above, this set of helpers
> > won't change, as they just complete the already established dynptr
> > ecosystem of helpers.
> >
> > > >
> > > > I don't see why we would remove any of the above list ever? They are
> > > > generic and fundamental to dynptr as a concept, they can't restrict
> > > > what dynptr can do in the future.
> > >
> > > It's not about removing them, but about changing them.
> > >
> > > Just for example the whole discussion of whether frags should
> > > be handled transparently and how write is handled didn't inspire
> > > confidence that there is a strong consensus on semantics
> > > of these new dynptr accessors.
> >
> > So let's start with acknowledging that skb and xdp buffer abstractions
> > as logically contiguous memory area are inherently complex and
> > non-perfect due to the way that kernel handles them for performance
> > and flexibility reasons.
> >
> > Let's also note that verifier knows specific flavor of dynptr and thus
> > can enforce additional restrictions based on specifically SKB/XDP
> > flavor vs LOCAL/RINGBUF. So just because there is no perfect way to
> > handle all the SKB/XDP physical non-contiguity, doesn't mean that the
> > dynptr concept itself is flawed or not well thought out. It's just
> > that for SKB/XDP there is no perfect solution. Dynptr doesn't change
> > anything here, rather it actually simplifies a bunch of stuff,
> > especially for common scenarios.
> >
> > I'd argue that for wider SKB/XDP dynptr adoption in the networking
> > world, those dynptr constructor helpers should be helpers and not
> > kfuncs as well. But I'd wish someone with more networking tie-ins
> > would argue this instead of me.
>
> I'm not that familiar with the semantics of bpf kfuncs, so to clarify:
> from a user API perspective, is there any difference in calling the
> function from the bpf program as a helper vs. kfunc?

I think I addressed that above, but let me know if not.

>
> >
> > >
> > > Scrambling to add flags to dynptr helpers was another red flag.
> > >
> > > All signs are pointing out that we're not ready to fix dynptr api.
> >
> > I disagree, it's an overly harsh generalization.
> >
> > > It will evolve and has to evolve without uapi pain.
> > >
> > > kfuncs only. For everything. Please.
> >
> > This is yet another generalized blanket statement I disagree with.
> > Over the years I've got an impression that the BPF subsystem is
> > generally a proud proponent of pragmatic, flexible, and common sense
> > engineering approaches, so this hard-and-fast rule with no room for
> > nuance sounds weird.
> >
> > There are things that belong in fundamental and core BPF concepts, and
> > it makes sense to keep them as stable abstractions and helpers. And
> > there are various things (like interfacing into kernel mechanics, its
> > types and systems) which totally make sense to keep unstable.
>
> I agree with all of your points. I know Alexei is on PTO these next
> two weeks, so I will in the meantime table this and work on the dynptr
> memory allocation patchset and a dynptr documentation write-up.
>
> Thanks for the discussion!

SGTM.


* Re: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-14  0:57           ` Andrii Nakryiko
@ 2022-12-14 21:25             ` Joanne Koong
  0 siblings, 0 replies; 57+ messages in thread
From: Joanne Koong @ 2022-12-14 21:25 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Tue, Dec 13, 2022 at 4:57 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Dec 13, 2022 at 3:50 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > On Mon, Dec 12, 2022 at 12:12 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Thu, Dec 8, 2022 at 5:30 PM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Thu, Dec 8, 2022 at 4:42 PM Andrii Nakryiko
> > > > <andrii.nakryiko@gmail.com> wrote:
> > > > >
> > > > > On Wed, Dec 7, 2022 at 5:54 PM Alexei Starovoitov
> > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Dec 07, 2022 at 12:55:31PM -0800, Joanne Koong wrote:
> > > > > > > This patchset is the 3rd in the dynptr series. The 1st can be found here [0]
> > > > > > > and the 2nd can be found here [1].
> > > > > > >
> > > > > > > In this patchset, the following convenience helpers are added for interacting
> > > > > > > with bpf dynamic pointers:
> > > > > > >
> > > > > > >     * bpf_dynptr_data_rdonly
> > > > > > >     * bpf_dynptr_trim
> > > > > > >     * bpf_dynptr_advance
> > > > > > >     * bpf_dynptr_is_null
> > > > > > >     * bpf_dynptr_is_rdonly
> > > > > > >     * bpf_dynptr_get_size
> > > > > > >     * bpf_dynptr_get_offset
> > > > > > >     * bpf_dynptr_clone
> > > > > > >     * bpf_dynptr_iterator
> > > > > >
> > > > > > This is great, but it really stretches uapi limits.
> > > > >
> > > > > Stretches in what sense? They are simple and straightforward getters
> > > > > and trim/advance/clone are fundamental modifiers to be able to work
> > > > > with a subset of dynptr's overall memory area.
> > > > >
> > > > > > Please convert the above and those in [1] to kfuncs.
> > > > > > I know that there can be an argument made for consistency with existing dynptr uapi
> > > > >
> > > > > yeah, given we have bpf_dynptr_{read,write} and bpf_dynptr_data() as
> > > > > BPF helpers, it makes sense to have such basic things like is_null and
> > > > > trim/advance/clone as BPF helpers as well. Both for consistency and
> > > > > because there is nothing unstable about them. We are not going to
> > > > > remove dynptr as a concept, it's pretty well defined.
> > > > >
> > > > > Out of the above list perhaps only bpf_dynptr_iterator() might be
> > > > > a candidate for kfunc. Though, personally, it makes sense to me to
> > > > > keep it as BPF helper without GPL restriction as well, given it is
> > > > > meant for networking applications in the first place, and you don't
> > > > > need to be GPL-compatible to write useful networking BPF program, from
> > > > > what I understand. But all the other ones is something you'd need to
> > > > > make actual use of dynptr concept in real-world BPF programs.
> > > > >
> > > > > Can we please have those as BPF helpers, and we can decide to move
> > > > > slightly fancier bpf_dynptr_iterator() (and future dynptr-related
> > > > > extras) into kfunc?
> > > >
> > > > Sorry, uapi concerns are more important here.
> > >
> > > What about the overall user experience and adoption?
> > >
> > > There is no clean way to ever move from unstable kfunc to a stable helper.
> > >
> > > BPF helpers also have the advantage of working on all architectures,
> > > whether that architecture supports kfuncs or not, whether it supports
> > > JIT or not.
> >
> > Oh interesting, I didn't realize some architectures do not support kfuncs.
> >
> > Out of curiosity, can you elaborate on "no clean way to move from
> > unstable kfunc to a stable helper"? If for example we needed to move
> > something from kfunc -> helper, could we not just remove the code
> > where we added it as a kfunc (eg defining a BTF_ID for it) and add it
> > as a helper instead?
>
> We could in the kernel. And make user life horrible.
>
> If, say, bpf_dynptr_is_null() is defined as kfunc, it will be exposed
> (actually would have to be found in the kernel and definition would be
> copy/pasted by user manually) to user's BPF application as:
>
> extern bool bpf_dynptr_is_null(const struct bpf_dynptr *p) __ksym;
>
> When we "stabilize it" and make it helper, it turns into the following
> definition supplied by libbpf in its bpf_helper_defs.h header
> (auto-generated from include/uapi/linux/bpf.h):
>
> static bool (*bpf_dynptr_is_null)(const struct bpf_dynptr *p) = (void *) 777;
>
> From C source code perspective both will be called exactly the same,
> but BPF assembly generated for them will be different. For kfunc it
> will be a specially patched by libbpf `call -1;` instruction with
> embedded BTF object ID and BTF type ID corresponding to this kfunc.
> For BPF helper it will be simply `call 777;`. Both are processed by
> verifier very differently.
>
> From BPF program's standpoint it's impossible to support both ways of
> calling the same bpf_dynptr_is_null(), because we get naming conflict,
> and there is no single BPF assembly instruction that would support
> both ways.
>
> You'd have to get really creative to transparently call this helper
> without caring whether it is kfunc or BPF helper. Or you'd have to
> compile and distribute two variants of the same BPF object file. Both
> suck. BPF CO-RE is nice and all, but we do it due to necessity, not
> because it's fun and easy. So if we migrate a kfunc to become a BPF
> helper, we'd most probably need to pick a new name for the helper
> that's different from the kfunc's.
>
> And it's currently not that easy to detect whether kfunc is available
> or not (see [0]).
>
>   [0] https://lore.kernel.org/bpf/de495e3a-cf06-ff85-1a4a-185621c9211a@linux.dev/
>
>
Thank you for the explanation! This is very helpful to know!
>
> >
> > >
> > > BPF helpers are also nicely self-discoverable and documented in
> > > include/uapi/linux/bpf.h, in one place where other BPF helpers are.
> > > This is a big deal, especially for non-expert BPF users (a vast
> > > majority of BPF users).
> > >
> > > > non-gpl and consistency don't even come close.
> > > > We've been doing everything new as kfuncs and dynptr is not special.
> > >
> > > I think dynptr is quite special. It's a very generic and fundamental
> > > concept, part of core BPF experience. It's a more dynamic counterpart
> > > to an inflexible statically sized `void * + size` pair of arguments
> > > sent to helpers for input or output memory regions. Dynptr has no
> > > inherent dependencies on BTF, kfuncs, trampolines, JIT, nothing.
> > >
> > > By requiring kfunc-based helpers we are significantly raising the
> > > obstacles towards adopting dynptr across a wide range of BPF
> > > applications.
> > >
> > > And the only advantage in return is that we get a hypothetical chance
> > > to change something in the future. But let's see if that will ever be
> > > necessary for the helpers Joanne is adding:
> > >
> > > > 1. Generic accessors to check validity of *any* dynptr, and its
> > > > inherent properties like offset, available size, read-only property
> > > > (sometimes just as useful as bpf_ringbuf_query() is for ringbufs,
> > > > both for debugging and for various heuristics in production).
> > >
> > > bpf_dynptr_is_null(struct bpf_dynptr *ptr)
> > > long bpf_dynptr_get_size(struct bpf_dynptr *ptr)
> > > long bpf_dynptr_get_offset(struct bpf_dynptr *ptr)
> > > bpf_dynptr_is_rdonly(struct bpf_dynptr *ptr)
> > >
> > > There is nothing to add or remove here. No flags, no change in semantics.
> > >
> > > 2. Manipulators to copy existing dynptr's view and narrow it down to a
> > > > subset (e.g., for when you have a large memory blob, but need to
> > > calculate hashes over smaller subset, without destroying original
> > > dynptr, because it will be used later for some other access). We can
> > > debate whether clone should get offset or not, but it doesn't change
> > > much (except usability in common cases). Again, nothing to add or
> > > remove otherwise, and pretty fundamental for real use of full power of
> > > dynptr.
> > >
> > > long bpf_dynptr_clone(struct bpf_dynptr *ptr, struct bpf_dynptr
> > > *clone, u32 offset)
> > > long bpf_dynptr_trim(struct bpf_dynptr *ptr, u32 len)
> > > long bpf_dynptr_advance(struct bpf_dynptr *ptr, u32 len)
> > >
> > > 3. This one is the only one I feel less strongly about, but mostly
> > > because I can implement the same (even though less ergonomically, of
> > > course) with bpf_loop() and bpf_dynptr_{clone,advance}.
> > >
> > > long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn,
> > > void *callback_ctx, u64 flags)
> > >
> > >
> > > All of the above don't add or change any semantics to dynptr as a
> > > concept. There is nothing that we'd need to change.
> > >
> > >
> > > >
> > > > > > helpers, but we got burned on them once and scrambled to add 'flags' argument.
> > > > > > kfuncs are unstable and can be adjusted/removed at any time later.
> > >
> > > > It's unfair to block these helpers just because we decided to add
> > > > flags to one of the previous ones (before the final release). And even
> > > > if we hadn't managed to do it in time, the worst outcome would probably
> > > > be another variant of the BPF helper. Definitely something to avoid, but
> > > > not the end of the world. But as I pointed out above, this set of helpers
> > > > won't change, as they just complete the already established dynptr
> > > > ecosystem of helpers.
> > >
> > > > >
> > > > > I don't see why we would remove any of the above list ever? They are
> > > > > generic and fundamental to dynptr as a concept, they can't restrict
> > > > > what dynptr can do in the future.
> > > >
> > > > It's not about removing them, but about changing them.
> > > >
> > > > Just for example the whole discussion of whether frags should
> > > > be handled transparently and how write is handled didn't inspire
> > > > confidence that there is a strong consensus on semantics
> > > > of these new dynptr accessors.
> > >
> > > So let's start with acknowledging that skb and xdp buffer abstractions
> > > as logically contiguous memory area are inherently complex and
> > > non-perfect due to the way that kernel handles them for performance
> > > and flexibility reasons.
> > >
> > > Let's also note that verifier knows specific flavor of dynptr and thus
> > > can enforce additional restrictions based on specifically SKB/XDP
> > > flavor vs LOCAL/RINGBUF. So just because there is no perfect way to
> > > handle all the SKB/XDP physical non-contiguity, doesn't mean that the
> > > dynptr concept itself is flawed or not well thought out. It's just
> > > that for SKB/XDP there is no perfect solution. Dynptr doesn't change
> > > anything here, rather it actually simplifies a bunch of stuff,
> > > especially for common scenarios.
> > >
> > > I'd argue that for wider SKB/XDP dynptr adoption in the networking
> > > world, those dynptr constructor helpers should be helpers and not
> > > kfuncs as well. But I'd wish someone with more networking tie-ins
> > > would argue this instead of me.
> >
> > I'm not that familiar with the semantics of bpf kfuncs, so to clarify:
> > from a user API perspective, is there any difference in calling the
> > function from the bpf program as a helper vs. kfunc?
>
> I think I addressed that above, but let me know if not.
>
> >
> > >
> > > >
> > > > Scrambling to add flags to dynptr helpers was another red flag.
> > > >
> > > > > All signs are pointing out that we're not ready to fix dynptr api.
> > >
> > > I disagree, it's an overly harsh generalization.
> > >
> > > > It will evolve and has to evolve without uapi pain.
> > > >
> > > > kfuncs only. For everything. Please.
> > >
> > > This is yet another generalized blanket statement I disagree with.
> > > Over the years I've got an impression that the BPF subsystem is
> > > > generally a proud proponent of pragmatic, flexible, and common sense
> > > engineering approaches, so this hard-and-fast rule with no room for
> > > nuance sounds weird.
> > >
> > > There are things that belong in fundamental and core BPF concepts, and
> > > it makes sense to keep them as stable abstractions and helpers. And
> > > there are various things (like interfacing into kernel mechanics, its
> > > types and systems) which totally make sense to keep unstable.
> >
> > I agree with all of your points. I know Alexei is on PTO these next
> > two weeks, so I will in the meantime table this and work on the dynptr
> > memory allocation patchset and a dynptr documentation write-up.
> >
> > Thanks for the discussion!
>
> SGTM.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-12 20:12       ` Andrii Nakryiko
  2022-12-13 23:50         ` Joanne Koong
@ 2022-12-16 17:35         ` Alexei Starovoitov
  2022-12-20 19:31           ` Andrii Nakryiko
  1 sibling, 1 reply; 57+ messages in thread
From: Alexei Starovoitov @ 2022-12-16 17:35 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Mon, Dec 12, 2022 at 12:12:09PM -0800, Andrii Nakryiko wrote:
> 
> There is no clean way to ever move from unstable kfunc to a stable helper.

No clean way? Yet in the other email you proposed a way.
Not pretty, but workable.
I'm sure if ever there will be a need to stabilize the kfunc we will
find a clean way to do it.
Strongly arguing right now that this is an issue without doing the homework
is not productive.

> BPF helpers also have the advantage of working on all architectures,
> whether that architecture supports kfuncs or not, whether it supports
> JIT or not.

Correct, but applying the same argument we should argue that
all features must work in the interpreter as well, because
not all architectures support JIT.
This way struct-ops and bpf based TCP-CC would never be possible.
Some JITs don't support tail calls with subprogs.
freplace (bpf prog replacement) works when JITed only.
bpf trampoline works on x86-64 only.
while kfuncs work on more than one arch.

Now compare the amount of .text that kernel has to contain
to support hundreds of helpers vs same amount of kfuncs.
In the former it's a whole bunch of code that is there in the kernel
in case bpf prog will call that helper. With 200+ helpers and half
of them already deprecated we have quite a bit of dead code in the kernel
that we cannot delete.
While with kfunc approach there is no extra code that deals with
conversion of the registers from bpf psABI to arch psABI.
With kfuncs we generate this code on demand.

> BPF helpers are also nicely self-discoverable and documented in
> include/uapi/linux/bpf.h, in one place where other BPF helpers are.
> This is a big deal, especially for non-expert BPF users (a vast
> majority of BPF users).

Good point. In general the kfuncs are not up to the level of
documentation of helpers and we should work on improving that,
but some of kfuncs are better documented than helpers.
So it's not black and white.

Discoverability we discussed in the past.
The task to automatically emit kfuncs into vmlinux.h is still not complete.
Time to prioritize it higher.

> 
> > non-gpl and consistency don't even come close.
> > We've been doing everything new as kfuncs and dynptr is not special.
> 
> I think dynptr is quite special. It's a very generic and fundamental
> concept, part of core BPF experience. It's a more dynamic counterpart
> to an inflexible statically sized `void * + size` pair of arguments
> sent to helpers for input or output memory regions. Dynptr has no
> inherent dependencies on BTF, kfuncs, trampolines, JIT, nothing.

imo dynptr and kptr are more or less equivalent in terms of being core
building blocks.
kptrs are done via kfuncs, so dynptr can do just as well.

> By requiring kfunc-based helpers we are significantly raising the
> obstacles towards adopting dynptr across a wide range of BPF
> applications.

Sorry, but I have to disagree. kptr and dynptr are left and right hand.
Both will work just fine as kfuncs.

> And the only advantage in return is that we get a hypothetical chance
> to change something in the future. But let's see if that will ever be
> necessary for the helpers Joanne is adding:
> 
> 1. Generic accessors to check validity of *any* dynptr, and its
> inherent properties like offset, available size, read-only property
> (sometimes just as useful as bpf_ringbuf_query() is for ringbufs,
> both for debugging and for various heuristics in production).
> 
> bpf_dynptr_is_null(struct bpf_dynptr *ptr)
> long bpf_dynptr_get_size(struct bpf_dynptr *ptr)
> long bpf_dynptr_get_offset(struct bpf_dynptr *ptr)
> bpf_dynptr_is_rdonly(struct bpf_dynptr *ptr)
> 
> There is nothing to add or remove here. No flags, no change in semantics.

Disagree, since there is an obvious counter example.
See all of bpf_get_current_task*().
Some of them are still used, but
bpf_get_current_task vs bpf_get_current_task_btf is our acknowledgement
of the fact that we suck in inventing uapi.
It's the lesson that we've learned the hard way.
Not going to repeat that mistake again.

To be completely honest I expect that dynptr may get obsolete
as the whole concept several years from now.
We still don't have a single actual user of it.
Just like kptr. Could be deprecated eventually just as well.

> 3. This one is the only one I feel less strongly about, but mostly
> because I can implement the same (even though less ergonomically, of
> course) with bpf_loop() and bpf_dynptr_{clone,advance}.
> 
> long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn,
> void *callback_ctx, u64 flags)

Speaking of your upcoming inline iterators.
Please make sure that you're adding them as kfuncs.
We've made a mistake with bpf_loop. It's a stable helper,
but inline iterators will immediately deprecate most uses of bpf_loop.
If bpf_loop was a kfunc we would have deleted it.

> Let's also note that verifier knows specific flavor of dynptr and thus
> can enforce additional restrictions based on specifically SKB/XDP
> flavor vs LOCAL/RINGBUF. So just because there is no perfect way to
> handle all the SKB/XDP physical non-contiguity, doesn't mean that the
> dynptr concept itself is flawed or not well thought out. It's just

I think that's exactly what it means. dynptr concept is flawed.
It's ok to add this flawed feature to the kernel right now,
because we don't see a better way today, but that might change
in the future and we gotta be able to fix our mistakes.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-16 17:35         ` Alexei Starovoitov
@ 2022-12-20 19:31           ` Andrii Nakryiko
  2022-12-25 21:52             ` bpf helpers freeze. Was: " Alexei Starovoitov
  0 siblings, 1 reply; 57+ messages in thread
From: Andrii Nakryiko @ 2022-12-20 19:31 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Fri, Dec 16, 2022 at 9:35 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Dec 12, 2022 at 12:12:09PM -0800, Andrii Nakryiko wrote:
> >
> > There is no clean way to ever move from unstable kfunc to a stable helper.
>
> No clean way? Yet in the other email you proposed a way.
> Not pretty, but workable.
> I'm sure if ever there will be a need to stabilize the kfunc we will
> find a clean way to do it.

You can't have stable and unstable helper definitions in the same .c
file, so a workaround would involve having two separate .c files and
statically linking them together just to be able to call one or the
other within the same program.

It's possible, but it's in no way clean or straightforward. And that
was my point.

> Strongly arguing right now that this is an issue without doing the homework
> is not productive.

Not sure what kind of extra homework I should do to be able to point
out that what I said above and in previous emails is a real pain for
users.

>
> > BPF helpers also have the advantage of working on all architectures,
> > whether that architecture supports kfuncs or not, whether it supports
> > JIT or not.
>
> Correct, but applying the same argument we should argue that
> all features must work in the interpreter as well, because
> not all architectures support JIT.
> This way struct-ops and bpf based TCP-CC would never be possible.
> Some JITs don't support tail calls with subprogs.
> freplace (bpf prog replacement) works when JITed only.
> bpf trampoline works on x86-64 only.
> while kfuncs work on more than one arch.

Where did I claim that *everything* should work everywhere?

And yes, if we can make some feature work across JIT and interpreter
*with no extra work*, then yes, we should strive to do it.

>
> Now compare the amount of .text that kernel has to contain
> to support hundreds of helpers vs same amount of kfuncs.

The amount of code is about the same for helpers vs kfuncs assuming
they are used, though, right? so it comes down to being able to remove
stuff, as you mention below.

> In the former it's a whole bunch of code that is there in the kernel
> in case bpf prog will call that helper. With 200+ helpers and half
> of them already deprecated we have quite a bit of dead code in the kernel
> that we cannot delete.

So "half of them already deprecated" is news to me and a pretty strong
statement. I just scrolled through the helpers and lots of them
seem as useful as they were when they were added. Completely ignoring
networking helpers (which I don't use much at all, but that doesn't
mean they are useless and deprecated, right?), I counted at least 40
that I've used personally, and there are more helpers that are
used in practice across various apps I've helped with over time.

> While with kfunc approach there is no extra code that deals with
> conversion of the registers from bpf psABI to arch psABI.
> With kfuncs we generate this code on demand.

First time I'm hearing this .text size concern due to conversion of
the registers from bpf psABI to arch psABI. Can you elaborate, please?
I went spot checking, looked at a few helpers like
bpf_map_lookup_elem, bpf_csum_diff, bpf_skb_store_bytes, etc. I
couldn't guess what bloat you are talking about? And how many bytes
are we talking about here?

>
> > BPF helpers are also nicely self-discoverable and documented in
> > include/uapi/linux/bpf.h, in one place where other BPF helpers are.
> > This is a big deal, especially for non-expert BPF users (a vast
> > majority of BPF users).
>
> Good point. In general the kfuncs are not up to the level of
> documentation of helpers and we should work on improving that,
> but some of kfuncs are better documented than helpers.
> So it's not black and white.

I was not comparing the quality of documentation. I was saying all the
helpers are nicely listed (with their doc comments, yes) in one place
in UAPI, making it simple for users to discover.

Documentation itself can and should be improved for both helpers and
kfuncs as much as possible, of course.

>
> Discoverability we discussed in the past.
> The task to automatically emit kfuncs into vmlinux.h is still not complete.
> Time to prioritize it higher.
>

Yep.

> >
> > > non-gpl and consistency don't even come close.
> > > We've been doing everything new as kfuncs and dynptr is not special.
> >
> > I think dynptr is quite special. It's a very generic and fundamental
> > concept, part of core BPF experience. It's a more dynamic counterpart
> > to an inflexible statically sized `void * + size` pair of arguments
> > sent to helpers for input or output memory regions. Dynptr has no
> > inherent dependencies on BTF, kfuncs, trampolines, JIT, nothing.
>
> imo dynptr and kptr are more or less equivalent in terms of being core
> building blocks.
> kptrs are done via kfuncs, so dynptr can do just as well.

bpf_kptr_xchg() is a BPF helper, so kptr is not 100% done via kfuncs.
(But I'm guessing you'll say it was a mistake and bpf_kptr_xchg()
should have been a kfunc, but it's too late to change that, and it's
just a counter example that proves the rule).

But regardless, dynptr is modeled as black box with hidden state, and
its API surface area is bigger (offset, size, is null or not,
manipulations over those aspects; then there is skb/xdp abstraction to
be taken care of for generic read/write). It has a wider *generic* API
surface to be useful and effectively used.

Kptr is a single pointer that can be NULL or not and you can check for
that directly. The rest is BPF verifier magic that keeps track of
types and "trustedness", and then you can use specific interfacing
kfuncs to work with kernel objects (which as I said before, makes
sense to keep unstable).

Yes, both are fundamental. But they are not apples to apples.
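
To make the "black box with hidden state" point concrete, here is a
minimal userspace model of that generic API surface. It is a sketch under
assumptions, not the kernel implementation: struct toy_dynptr and all
toy_* functions are hypothetical, the field layout is invented, and the
-EINVAL/-ERANGE error codes are my guess at reasonable behavior. The
signatures mirror the proposed helpers listed earlier in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model of a dynptr "view"; not the kernel's struct bpf_dynptr. */
struct toy_dynptr {
	uint8_t *data;		/* NULL models an invalid (null) dynptr */
	uint32_t offset;	/* start of the view within data */
	uint32_t size;		/* bytes available starting at offset */
	bool rdonly;
};

bool toy_dynptr_is_null(const struct toy_dynptr *p) { return p->data == NULL; }
bool toy_dynptr_is_rdonly(const struct toy_dynptr *p) { return p->rdonly; }

long toy_dynptr_get_size(const struct toy_dynptr *p)
{
	return toy_dynptr_is_null(p) ? -EINVAL : (long)p->size;
}

long toy_dynptr_get_offset(const struct toy_dynptr *p)
{
	return toy_dynptr_is_null(p) ? -EINVAL : (long)p->offset;
}

/* Move the start of the view forward by len bytes. */
long toy_dynptr_advance(struct toy_dynptr *p, uint32_t len)
{
	if (toy_dynptr_is_null(p))
		return -EINVAL;
	if (len > p->size)
		return -ERANGE;
	p->offset += len;
	p->size -= len;
	return 0;
}

/* Drop len bytes from the end of the view; the offset is unchanged. */
long toy_dynptr_trim(struct toy_dynptr *p, uint32_t len)
{
	if (toy_dynptr_is_null(p))
		return -EINVAL;
	if (len > p->size)
		return -ERANGE;
	p->size -= len;
	return 0;
}

/* Copy the view and advance the copy; the original is left untouched,
 * and *clone is only written on success. */
long toy_dynptr_clone(const struct toy_dynptr *p, struct toy_dynptr *clone,
		      uint32_t offset)
{
	struct toy_dynptr tmp;
	long err;

	assert(p != NULL && clone != NULL);
	if (toy_dynptr_is_null(p))
		return -EINVAL;
	tmp = *p;
	err = toy_dynptr_advance(&tmp, offset);
	if (err)
		return err;
	*clone = tmp;
	return 0;
}
```

With this model, narrowing a 16-byte buffer to its middle 8 bytes is a
clone with offset 4 followed by a trim of 4, and the original view still
covers all 16 bytes afterwards.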

>
> > By requiring kfunc-based helpers we are significantly raising the
> > obstacles towards adopting dynptr across a wide range of BPF
> > applications.
>
> Sorry, but I have to disagree. kptr and dynptr are left and right hand.
> Both will work just fine as kfuncs.
>

Ok, let's agree to disagree.

> > And the only advantage in return is that we get a hypothetical chance
> > to change something in the future. But let's see if that will ever be
> > necessary for the helpers Joanne is adding:
> >
> > 1. Generic accessors to check validity of *any* dynptr, and its
> > inherent properties like offset, available size, read-only property
> > (sometimes just as useful as bpf_ringbuf_query() is for ringbufs,
> > both for debugging and for various heuristics in production).
> >
> > bpf_dynptr_is_null(struct bpf_dynptr *ptr)
> > long bpf_dynptr_get_size(struct bpf_dynptr *ptr)
> > long bpf_dynptr_get_offset(struct bpf_dynptr *ptr)
> > bpf_dynptr_is_rdonly(struct bpf_dynptr *ptr)
> >
> > There is nothing to add or remove here. No flags, no change in semantics.
>
> Disagree, since there is an obvious counter example.

I'm talking about *specific* dynptr helpers under discussion, and you
are bringing up some other helpers as "counter examples". What kind of
discussion is this? We'll keep branching out with more and more (at
best) tangentially related arguments until I'm exhausted and just give
up?

> See all of bpf_get_current_task*().
> Some of them are still used, but
> bpf_get_current_task vs bpf_get_current_task_btf is our acknowledgement
> of the fact that we suck in inventing uapi.

All *two* of them, bpf_get_current_task() and
bpf_get_current_task_btf(), right? They are 2 years apart.
bpf_get_current_task() was added before BTF era. It is still actively
used today and there is nothing wrong with it. It works on older
kernels just fine, even with BPF CO-RE (as backporting a few simple
patches to generate BTF is simple and easy; not so much with BPF
verifier changes to add native BTF support). I don't see much of a problem
having both; they are not a maintenance burden.

> It's the lesson that we've learned the hard way.
> Not going to repeat that mistake again.

I'm not dismissing the burden of backwards compat and UAPI stability,
you don't have to explain that to me. But I don't see it as a reason
to suddenly make everything unstable, even concepts that are core
parts of the BPF framework.

>
> To be completely honest I expect that dynptr may get obsolete
> as the whole concept several years from now.
> We still don't have a single actual user of it.
> Just like kptr. Could be deprecated eventually just as well.
>

One can say similar things about any technology or API. It doesn't
mean that it was a mistake to implement them in the first place (just
like your example with bpf_get_current_task() -- it served and still
serves its purpose).

For dynptr, time will tell, but we are still missing important parts
for wider adoption. Skb/xdp stuff will be great for networking.
Ringbuf/local (and malloc one, when we get to it) dynptrs will be used
by generic tracing apps, but it will have to be deployed more widely
across all supported kernels to make sense (thinking about our
fleet-wide profiler adoption, for example). And in general, adoption
of new concepts takes time.


> > 3. This one is the only one I feel less strongly about, but mostly
> > because I can implement the same (even though less ergonomically, of
> > course) with bpf_loop() and bpf_dynptr_{clone,advance}.
> >
> > long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn,
> > void *callback_ctx, u64 flags)
>
> Speaking of your upcoming inline iterators.
> Please make sure that you're adding them as kfuncs.
> We've made a mistake with bpf_loop. It's a stable helper,
> but inline iterators will immediately deprecate most uses of bpf_loop.
> If bpf_loop was a kfunc we would have deleted it.

I'm afraid we'll have to have a similar discussion with iterators. For
a generic fundamental number range iterator, which is a generalization
of bounded loops and bpf_loop, I believe it should be in stable UAPI
as well. For stuff like iterators over kernel objects (tasks, cgroups,
etc) -- kfuncs make sense to me.

But let's cross that bridge when we get there.
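
As a rough illustration of the earlier point that the dynptr iterator can
be built from a plain loop plus clone and advance, here is a userspace
sketch. All names (toy_view, toy_view_iterate, toy_sum_cb) are
hypothetical, and the fixed chunk size is a simplification of mine; it is
not the callback contract of the proposed bpf_dynptr_iterator().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for a dynptr view over a byte buffer. */
struct toy_view {
	const uint8_t *data;
	uint32_t offset;
	uint32_t size;
};

/* Callback sees one chunk; returns 0 to continue, nonzero to stop. */
typedef int (*toy_chunk_cb)(const uint8_t *chunk, uint32_t len, void *ctx);

/* Walk the view in chunks of at most chunk_sz bytes. Works on a copy, so
 * the caller's view is not consumed. Returns the number of bytes handed
 * to callbacks that returned 0. */
uint32_t toy_view_iterate(const struct toy_view *v, uint32_t chunk_sz,
			  toy_chunk_cb cb, void *ctx)
{
	struct toy_view it = *v;	/* the "clone" step */
	uint32_t visited = 0;

	assert(chunk_sz > 0);
	/* In a BPF program this would be a bounded loop or bpf_loop(). */
	while (it.size > 0) {
		uint32_t len = it.size < chunk_sz ? it.size : chunk_sz;

		if (cb(it.data + it.offset, len, ctx))
			break;
		it.offset += len;	/* the "advance" step */
		it.size -= len;
		visited += len;
	}
	return visited;
}

/* Example callback: add every byte into a uint32_t accumulator. */
int toy_sum_cb(const uint8_t *chunk, uint32_t len, void *ctx)
{
	uint32_t *sum = ctx;
	uint32_t i;

	for (i = 0; i < len; i++)
		*sum += chunk[i];
	return 0;
}
```

The loop body is the part a dedicated iterator helper would hide; the
clone and advance primitives are enough to express it by hand.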

>
> > Let's also note that verifier knows specific flavor of dynptr and thus
> > can enforce additional restrictions based on specifically SKB/XDP
> > flavor vs LOCAL/RINGBUF. So just because there is no perfect way to
> > handle all the SKB/XDP physical non-contiguity, doesn't mean that the
> > dynptr concept itself is flawed or not well thought out. It's just
>
> I think that's exactly what it means. dynptr concept is flawed.
> It's ok to add this flawed feature to the kernel right now,
> because we don't see a better way today, but that might change
> in the future and we gotta be able to fix our mistakes.

"flawed", "mistakes", "deprecated", etc. You keep using this strongly
negatively connotated language for things that were and are perfectly
valid and working (and, most importantly, used and useful in
practice), but somehow fell out of your favor. Is it really necessary
to denigrate everything like that? It just distracts from the essence
of the discussion.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-20 19:31           ` Andrii Nakryiko
@ 2022-12-25 21:52             ` Alexei Starovoitov
  2022-12-29 23:10               ` Andrii Nakryiko
  0 siblings, 1 reply; 57+ messages in thread
From: Alexei Starovoitov @ 2022-12-25 21:52 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Tue, Dec 20, 2022 at 11:31:25AM -0800, Andrii Nakryiko wrote:
> On Fri, Dec 16, 2022 at 9:35 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Mon, Dec 12, 2022 at 12:12:09PM -0800, Andrii Nakryiko wrote:
> > >
> > > There is no clean way to ever move from unstable kfunc to a stable helper.
> >
> > No clean way? Yet in the other email you proposed a way.
> > Not pretty, but workable.
> > I'm sure if ever there will be a need to stabilize the kfunc we will
> > find a clean way to do it.
> 
> You can't have stable and unstable helper definition in the same .c
> file,

of course we can.
uapi helpers vs kfuncs argument is not a black and white comparison.
It's not just stable vs unstable.
uapi has strict rules and helpers in uapi/bpf.h have to follow those rules.
While kfuncs in terms of stability are equivalent to EXPORT_SYMBOL_GPL.
Meaning they are largely unstable.
The upstream kernel keeps changing those EXPORT_SYMBOL* functions,
but distros can apply their own "stability rules".
See Redhat's kABI, for example. A distro can guarantee a stability
of certain EXPORT_SYMBOL* for their customers, but that doesn't bind
upstream development.

With uapi bpf helpers we have to guarantee their stability,
while with kfuncs we can do whatever we want. Right now all kfuncs are
unstable and to prove the point we changed them couple times already (nf_conn*).
We also have bpf_obj_new_impl() kfunc which is equivalent to EXPORT_SYMBOL(__kmalloc).
Hard to imagine more stable and more fundamental function.
Of course we want bpf programs to use bpf_obj_new() and assume
that it's going to be available in all future kernel releases.
But at the same time we're not bound by uapi rules.
bpf_obj_new() will likely be stable, but not uapi stable.
If we screw up (or find better way to allocate memory in the future)
we can change it.
We can invent our own deprecation rules for stable-ish kfuncs and
invent our more-unstable-than-current-unstable rules for kfuncs that
are too much kernel release dependent.

> But regardless, dynptr is modeled as black box with hidden state, and
> its API surface area is bigger (offset, size, is null or not,
> manipulations over those aspects; then there is skb/xdp abstraction to
> be taken care of for generic read/write). It has a wider *generic* API
> surface to be useful and effectively used.

tbh dynptr as an abstraction of skb/xdp is not convincing.
cilium created their own abstraction on top of skb and xdp and it's zero cost.
While dynptr is not free, so xdp users are unlikely to use dynptr(xdp) for perf reasons.
So I suspect it won't be a success story in the long run, but we
can certainly try it out since they will be kfuncs and can be deprecated
if maintenance outweighs the number of users.

> All *two* of them, bpf_get_current_task() and
> bpf_get_current_task_btf(), right? They are 2 years apart.
> bpf_get_current_task() was added before BTF era. It is still actively
> used today and there is nothing wrong with it. It works on older
> kernels just fine, even with BPF CO-RE (as backporting a few simple
> patches to generate BTF is simple and easy; not so much with BPF
> verifier changes to add native BTF support). I don't see much problem
> having both, they are not maintenance burden.

bpf_get_current_pid_tgid
bpf_get_current_uid_gid
bpf_get_current_comm
bpf_get_current_task
bpf_get_current_task_btf
bpf_get_current_cgroup_id
bpf_get_current_ancestor_cgroup_id
bpf_skb_ancestor_cgroup_id
bpf_sk_cgroup_id
bpf_sk_ancestor_cgroup_id

_are_ a maintenance burden.
The verifier got smarter and we could have removed all of them,
but uapi rules make it impossible.
The bpf prog could have been enabled to access all these task_struct
and cgroup fields directly. Likely without any kfuncs.

bpf_send_signal vs bpf_send_signal_thread
bpf_jiffies64 vs bpf_this_cpu_ptr
etc
there are plenty examples where uapi bpf helpers became a burden.
They are working and will keep working, but we could have done
a much better job if not for uapi.
These are the examples where uapi rules are too strong for bpf development.
Our pace of adding new features is high.
The kernel uapi rules are too strict for us.

At one point DaveM declared freeze on sizeof(struct sk_buff).
It was a difficult, but correct decision.
We have to declare freeze on bpf helpers.
211 helpers that have to be maintained forever is a huge burden.
All new features should use kfuncs and we need to figure out a deprecation
and stability story for them. How to document kfuncs cleanly,
how to discover them, etc.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-25 21:52             ` bpf helpers freeze. Was: " Alexei Starovoitov
@ 2022-12-29 23:10               ` Andrii Nakryiko
  2022-12-30  2:46                 ` Alexei Starovoitov
  0 siblings, 1 reply; 57+ messages in thread
From: Andrii Nakryiko @ 2022-12-29 23:10 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Sun, Dec 25, 2022 at 1:52 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Dec 20, 2022 at 11:31:25AM -0800, Andrii Nakryiko wrote:
> > On Fri, Dec 16, 2022 at 9:35 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Mon, Dec 12, 2022 at 12:12:09PM -0800, Andrii Nakryiko wrote:
> > > >
> > > > There is no clean way to ever move from unstable kfunc to a stable helper.
> > >
> > > No clean way? Yet in the other email you proposed a way.
> > > Not pretty, but workable.
> > > I'm sure if ever there will be a need to stabilize the kfunc we will
> > > find a clean way to do it.
> >
> > You can't have stable and unstable helper definition in the same .c
> > file,
>
> of course we can.
> uapi helpers vs kfuncs argument is not a black and white comparison.
> It's not just stable vs unstable.
> uapi has strict rules and helpers in uapi/bpf.h have to follow those rules.
> While kfuncs in terms of stability are equivalent to EXPORT_SYMBOL_GPL.
> Meaning they are largely unstable.
> The upstream kernel keeps changing those EXPORT_SYMBOL* functions,
> but distros can apply their own "stability rules".
> See Redhat's kABI, for example. A distro can guarantee a stability
> of certain EXPORT_SYMBOL* for their customers, but that doesn't bind
> upstream development.
>
> With uapi bpf helpers we have to guarantee their stability,
> while with kfuncs we can do whatever we want. Right now all kfuncs are
> unstable and to prove the point we changed them couple times already (nf_conn*).
> We also have bpf_obj_new_impl() kfunc which is equivalent to EXPORT_SYMBOL(__kmalloc).
> Hard to imagine more stable and more fundamental function.
> Of course we want bpf programs to use bpf_obj_new() and assume
> that it's going to be available in all future kernel releases.
> But at the same time we're not bound by uapi rules.
> bpf_obj_new() will likely be stable, but not uapi stable.
> If we screw up (or find better way to allocate memory in the future)
> we can change it.
> We can invent our own deprecation rules for stable-ish kfuncs and
> invent our more-unstable-than-current-unstable rules for kfuncs that
> are too much kernel release dependent.

I'm talking about *mechanics* of having two incompatible definitions
of functions with the same name, not the *concept* of stable vs
unstable API. See [0] where I explained this as a reply to Joanne.

  [0] https://lore.kernel.org/bpf/CAEf4BzbRQLEjAFUkzzStv0c0=O+r9iZ8hq33sJB2RtSuGrGAEA@mail.gmail.com/

>
> > But regardless, dynptr is modeled as black box with hidden state, and
> > its API surface area is bigger (offset, size, is null or not,
> > manipulations over those aspects; then there is skb/xdp abstraction to
> > be taken care of for generic read/write). It has a wider *generic* API
> > surface to be useful and effectively used.
>
> tbh dynptr as an abstraction of skb/xdp is not convincing.
> cilium created their own abstraction on top of skb and xdp and it's zero cost.
> While dynptr is not free, so xdp users unlikely to use dynptr(xdp) for perf reasons.
> So I suspect it won't be a success story in the long run, but we
> can certainly try it out since they will be kfuncs and can be deprecated
> if maintenance outweighs the number of users.
>
> > All *two* of them, bpf_get_current_task() and
> > bpf_get_current_task_btf(), right? They are 2 years apart.
> > bpf_get_current_task() was added before BTF era. It is still actively
> > used today and there is nothing wrong with it. It works on older
> > kernels just fine, even with BPF CO-RE (as backporting a few simple
> > patches to generate BTF is simple and easy; not so much with BPF
> > verifier changes to add native BTF support). I don't see much problem
> > having both, they are not maintenance burden.
>
> bpf_get_current_pid_tgid
> bpf_get_current_uid_gid
> bpf_get_current_comm
> bpf_get_current_task
> bpf_get_current_task_btf
> bpf_get_current_cgroup_id
> bpf_get_current_ancestor_cgroup_id
> bpf_skb_ancestor_cgroup_id
> bpf_sk_cgroup_id
> bpf_sk_ancestor_cgroup_id
>
> _are_ a maintenance burden.

bpf_get_current_pid_tgid() was added in 2015, slightly and
uncritically touched by Daniel in 2016 and we never had any problems
with it ever since. No updates, no maintenance. I don't remember much
problem with other helpers in this list, but I didn't check each one.

But we certainly have a different understanding of what "maintenance
burden" is. If some code doesn't require constant change and doesn't
prevent changes in some other parts of the system, it's not a
maintenance burden.


> The verifier got smarter and we could have removed all of them,
> but uapi rules makes it impossible.
> The bpf prog could have been enabled to access all these task_struct
> and cgroup fields directly. Likely without any kfuncs.
>
> bpf_send_signal vs bpf_send_signal_thread
> bpf_jiffies64 vs bpf_this_cpu_ptr
> etc
> there are plenty examples where uapi bpf helpers became a burden.
> They are working and will keep working, but we could have done
> much better job if not for uapi.
> These are the examples where uapi rules are too strong for bpf development.
> Our pace of adding new features is high.
> The kernel uapi rules are too strict for us.

I'm familiar with the burden of maintaining API stability and
backwards compat. But it's not just about the library/system
developer's convenience and burden, it's also about the end user's
experience and convenience. BPF tool developers really appreciate when
there are fewer quirks to remember and work around across kernel
versions, configurations, architectures, etc. It's the pain that
kernel engineers working on BPF bleeding-edge don't experience in the
BPF selftests environment.

>
> At one point DaveM declared freeze on sizeof(struct sk_buff).
> It was a difficult, but correct decision.
> We have to declare freeze on bpf helpers.
> 211 helpers that have to be maintained forever is a huge burden.

I still didn't get why we have to freeze anything and how exactly
helpers are a burden.

But especially in this specific case of few simple dynptr helpers,
especially that other dynptrs generic APIs are already BPF helpers. I
just don't get it and honestly all I see from this discussion is that
you've made up your mind and there is nothing that can be done to
convince you.

The only "BPF helpers are stable and thus a burden" argument is just
not convincing and I'd even say is mostly false. There are no upsides
to having dynptr helpers as kfuncs, as far as I'm concerned. But there
are a bunch of downsides, even if some of those might be lifted in the
future.

The unfortunate thing is that end users that are meant to benefit from
all these helpers and them being "a standard API offering" are not
well represented on the BPF mailing list, unfortunately. And my
opinion and arguments as a proxy for theirs is clearly not enough.

> All new features should use kfuncs and we need to figure out a deprecation
> and stability story for them. How to document kfuncs cleanly,
> how to discover them, etc.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-29 23:10               ` Andrii Nakryiko
@ 2022-12-30  2:46                 ` Alexei Starovoitov
  2022-12-30 18:38                   ` David Vernet
  2023-01-04 18:43                   ` Andrii Nakryiko
  0 siblings, 2 replies; 57+ messages in thread
From: Alexei Starovoitov @ 2022-12-30  2:46 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Thu, Dec 29, 2022 at 03:10:22PM -0800, Andrii Nakryiko wrote:
> On Sun, Dec 25, 2022 at 1:52 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Tue, Dec 20, 2022 at 11:31:25AM -0800, Andrii Nakryiko wrote:
> > > On Fri, Dec 16, 2022 at 9:35 AM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Mon, Dec 12, 2022 at 12:12:09PM -0800, Andrii Nakryiko wrote:
> > > > >
> > > > > There is no clean way to ever move from unstable kfunc to a stable helper.
> > > >
> > > > No clean way? Yet in the other email you proposed a way.
> > > > Not pretty, but workable.
> > > > I'm sure if ever there will be a need to stabilize the kfunc we will
> > > > find a clean way to do it.
> > >
> > > You can't have stable and unstable helper definition in the same .c
> > > file,
> >
> > of course we can.
> > uapi helpers vs kfuncs argument is not a black and white comparison.
> > It's not just stable vs unstable.
> > uapi has strict rules and helpers in uapi/bpf.h have to follow those rules.
> > While kfuncs in terms of stability are equivalent to EXPORT_SYMBOL_GPL.
> > Meaning they are largely unstable.
> > The upstream kernel keeps changing those EXPORT_SYMBOL* functions,
> > but distros can apply their own "stability rules".
> > See Redhat's kABI, for example. A distro can guarantee a stability
> > of certain EXPORT_SYMBOL* for their customers, but that doesn't bind
> > upstream development.
> >
> > With uapi bpf helpers we have to guarantee their stability,
> > while with kfuncs we can do whatever we want. Right now all kfuncs are
> > unstable and to prove the point we changed them couple times already (nf_conn*).
> > We also have bpf_obj_new_impl() kfunc which is equivalent to EXPORT_SYMBOL(__kmalloc).
> > Hard to imagine more stable and more fundamental function.
> > Of course we want bpf programs to use bpf_obj_new() and assume
> > that it's going to be available in all future kernel releases.
> > But at the same time we're not bound by uapi rules.
> > bpf_obj_new() will likely be stable, but not uapi stable.
> > If we screw up (or find better way to allocate memory in the future)
> > we can change it.
> > We can invent our own deprecation rules for stable-ish kfuncs and
> > invent our more-unstable-than-current-unstable rules for kfuncs that
> > are too much kernel release dependent.
> 
> I'm talking about *mechanics* of having two incompatible definitions
> of functions with the same name, not the *concept* of stable vs
> unstable API. See [0] where I explained this as a reply to Joanne.
> 
>   [0] https://lore.kernel.org/bpf/CAEf4BzbRQLEjAFUkzzStv0c0=O+r9iZ8hq33sJB2RtSuGrGAEA@mail.gmail.com/

Mechanics for kfuncs are much better than for helpers.

extern bool bpf_dynptr_is_null(const struct bpf_dynptr *p) __ksym;

will likely work with both gcc and clang.
And if it doesn't we can fix it.

While when gcc folks saw helpers:

static bool (*bpf_dynptr_is_null)(const struct bpf_dynptr *p) = (void *) 777;

they realized that it is a hack that abuses compiler optimizations.
They even invented attr(kernel_helper) to work around this issue.
After a bunch of arguing gcc added support for this hack without attr,
but it's going to be around forever... in gcc, in clang and in kernel.
It's something that we could have fixed if it wasn't for uapi.
Just one more example of an unfixable mistake that is causing issues
to multiple projects.
That's the core issue of kernel uapi rules: inability to fix mistakes.
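
To make the difference between the two declarations above concrete, here is
a minimal user-space sketch (plain C, not BPF code) of what the helper-style
declaration encodes. It reuses the made-up number 777 from the example; the
names DEMO_HELPER_ID, dynptr_is_null_fn, helper_is_null and helper_id are
hypothetical and exist only for illustration. The kfunc form appears only in
a comment, since it needs a BPF toolchain to build.

```c
#include <stdbool.h>
#include <stdint.h>

struct bpf_dynptr;	/* opaque; only pointers to it are used here */

/* Made-up helper number, copied from the example in the text. */
#define DEMO_HELPER_ID 777

typedef bool (*dynptr_is_null_fn)(const struct bpf_dynptr *p);

/*
 * Helper style: the value stored in the pointer is not an address at
 * all, it is the helper number. A BPF backend is expected to turn a call
 * through this pointer into "call helper #777". Calling it from ordinary
 * user space would jump to address 777 and crash, so this sketch never
 * calls it.
 */
static const dynptr_is_null_fn helper_is_null =
	(dynptr_is_null_fn)(uintptr_t)DEMO_HELPER_ID;

/* Recover the number that the pointer value encodes. */
static uintptr_t helper_id(void)
{
	return (uintptr_t)helper_is_null;
}

/*
 * kfunc style, shown only as a comment because it needs a BPF toolchain:
 *
 *   extern bool bpf_dynptr_is_null(const struct bpf_dynptr *p) __ksym;
 *
 * Here the compiler sees an ordinary extern function, with no magic
 * constant hidden in a pointer value.
 */
```

The point of the sketch is only that the helper form relies on the compiler
keeping an integer-as-function-pointer intact through to the call
instruction, whereas the extern form is an ordinary symbol reference.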

> >
> > > But regardless, dynptr is modeled as black box with hidden state, and
> > > its API surface area is bigger (offset, size, is null or not,
> > > manipulations over those aspects; then there is skb/xdp abstraction to
> > > be taken care of for generic read/write). It has a wider *generic* API
> > > surface to be useful and effectively used.
> >
> > tbh dynptr as an abstraction of skb/xdp is not convincing.
> > cilium created their own abstraction on top of skb and xdp and it's zero cost.
> > While dynptr is not free, so xdp users unlikely to use dynptr(xdp) for perf reasons.
> > So I suspect it won't be a success story in the long run, but we
> > can certainly try it out since they will be kfuncs and can be deprecated
> > if maintenance outweighs the number of users.
> >
> > > All *two* of them, bpf_get_current_task() and
> > > bpf_get_current_task_btf(), right? They are 2 years apart.
> > > bpf_get_current_task() was added before BTF era. It is still actively
> > > used today and there is nothing wrong with it. It works on older
> > > kernels just fine, even with BPF CO-RE (as backporting a few simple
> > > patches to generate BTF is simple and easy; not so much with BPF
> > > verifier changes to add native BTF support). I don't see much problem
> > > having both, they are not maintenance burden.
> >
> > bpf_get_current_pid_tgid
> > bpf_get_current_uid_gid
> > bpf_get_current_comm
> > bpf_get_current_task
> > bpf_get_current_task_btf
> > bpf_get_current_cgroup_id
> > bpf_get_current_ancestor_cgroup_id
> > bpf_skb_ancestor_cgroup_id
> > bpf_sk_cgroup_id
> > bpf_sk_ancestor_cgroup_id
> >
> > _are_ a maintenance burden.
> 
> bpf_get_current_pid_tgid() was added in 2015, slightly and
> uncritically touched by Daniel in 2016 and we never had any problems
> with it ever since. No updates, no maintenance. I don't remember much
> problem with other helpers in this list, but I didn't check each one.
> 
> But we certainly have a different understanding of what "maintenance
> burden" is. If some code doesn't require constant change and doesn't
> prevent changes in some other parts of the system, it's not a
> maintenance burden.

As I said it's not about working today. If one doesn't touch code
it will keep working.
It's about being able to change it.
The uapi bits we simply cannot change.

> 
> > The verifier got smarter and we could have removed all of them,
> > but uapi rules makes it impossible.
> > The bpf prog could have been enabled to access all these task_struct
> > and cgroup fields directly. Likely without any kfuncs.
> >
> > bpf_send_signal vs bpf_send_signal_thread
> > bpf_jiffies64 vs bpf_this_cpu_ptr
> > etc
> > there are plenty examples where uapi bpf helpers became a burden.
> > They are working and will keep working, but we could have done
> > much better job if not for uapi.
> > These are the examples where uapi rules are too strong for bpf development.
> > Our pace of adding new features is high.
> > The kernel uapi rules are too strict for us.
> 
> I'm familiar with the burden of maintaining API stability and
> backwards compat. But it's not just about the library/system

libbpf 1.0 wasn't the smoothest example of deprecation.
But we still did it despite all kinds of negative flame.
With uapi helpers we cannot do any of that. No deprecation schemes.
While kfuncs allow innovation.

> developer's convenience and burden, it's also about the end user's
> experience and convenience. BPF tool developers really appreciate when
> there are fewer quirks to remember and work around across kernel
> versions, configurations, architectures, etc. It's the pain that
> kernel engineers working on BPF bleeding-edge don't experience in the
> BPF selftests environment.

There is a trade off between users and developers. We want to make user
experience as smooth as possible while preserving the speed of development
for the kernel. uapi is in the way of that.

> >
> > At one point DaveM declared freeze on sizeof(struct sk_buff).
> > It was a difficult, but correct decision.
> > We have to declare freeze on bpf helpers.
> > 211 helpers that have to be maintained forever is a huge burden.
> 
> I still didn't get why we have to freeze anything and how exactly
> helpers are a burden.
> 
> But especially in this specific case of few simple dynptr helpers,
> especially that other dynptrs generic APIs are already BPF helpers. I
> just don't get it and honestly all I see from this discussion is that
> you've made up your mind and there is nothing that can be done to
> convince you.
> 
> The only "BPF helpers are stable and thus a burden" argument is just
> not convincing and I'd even say is mostly false. There are no upsides
> to having dynptr helpers as kfuncs, as far as I'm concerned. 

The main and only upside for everything as kfunc is that we can change it.
That's it.

> But there
> are a bunch of downsides, even if some of those might be lifted in the
> future.

imo ability to change outweighs all downsides, since downsides are fixable
while inability to change is a burden.

> The unfortunate thing is that end users that are meant to benefit from
> all these helpers and them being "a standard API offering" are not
> well represented on the BPF mailing list, unfortunately. And my
> opinion and arguments as a proxy for theirs is clearly not enough.

I also would like to hear what others on the list are thinking.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-30  2:46                 ` Alexei Starovoitov
@ 2022-12-30 18:38                   ` David Vernet
  2022-12-30 19:31                     ` Alexei Starovoitov
  2023-01-04 18:43                   ` Andrii Nakryiko
  1 sibling, 1 reply; 57+ messages in thread
From: David Vernet @ 2022-12-30 18:38 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Thu, Dec 29, 2022 at 06:46:41PM -0800, Alexei Starovoitov wrote:
> On Thu, Dec 29, 2022 at 03:10:22PM -0800, Andrii Nakryiko wrote:
> > On Sun, Dec 25, 2022 at 1:52 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Tue, Dec 20, 2022 at 11:31:25AM -0800, Andrii Nakryiko wrote:
> > > > On Fri, Dec 16, 2022 at 9:35 AM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Mon, Dec 12, 2022 at 12:12:09PM -0800, Andrii Nakryiko wrote:
> > > > > >
> > > > > > There is no clean way to ever move from unstable kfunc to a stable helper.
> > > > >
> > > > > No clean way? Yet in the other email you proposed a way.
> > > > > Not pretty, but workable.
> > > > > I'm sure if ever there will be a need to stabilize the kfunc we will
> > > > > find a clean way to do it.
> > > >
> > > > You can't have stable and unstable helper definition in the same .c
> > > > file,
> > >
> > > of course we can.
> > > uapi helpers vs kfuncs argument is not a black and white comparison.
> > > It's not just stable vs unstable.
> > > uapi has strict rules and helpers in uapi/bpf.h have to follow those rules.
> > > While kfuncs in terms of stability are equivalent to EXPORT_SYMBOL_GPL.
> > > Meaning they are largely unstable.
> > > The upstream kernel keeps changing those EXPORT_SYMBOL* functions,
> > > but distros can apply their own "stability rules".
> > > See Redhat's kABI, for example. A distro can guarantee a stability
> > > of certain EXPORT_SYMBOL* for their customers, but that doesn't bind
> > > upstream development.

This also sounds more in line with what was discussed at the maintainers
summit [0]. "A BPF program that depends on kernel symbols is not really
a user program anymore." Given that perspective, EXPORT_SYMBOL_GPL
sounds like the correct equivalency to "public BPF symbols".

[0]: https://lwn.net/Articles/908464/

> > >
> > > With uapi bpf helpers we have to guarantee their stability,
> > > while with kfuncs we can do whatever we want. Right now all kfuncs are
> > > unstable and to prove the point we changed them couple times already (nf_conn*).
> > > We also have bpf_obj_new_impl() kfunc which is equivalent to EXPORT_SYMBOL(__kmalloc).
> > > Hard to imagine more stable and more fundamental function.
> > > Of course we want bpf programs to use bpf_obj_new() and assume
> > > that it's going to be available in all future kernel releases.
> > > But at the same time we're not bound by uapi rules.
> > > bpf_obj_new() will likely be stable, but not uapi stable.
> > > If we screw up (or find better way to allocate memory in the future)
> > > we can change it.
> > > We can invent our own deprecation rules for stable-ish kfuncs and
> > > invent our more-unstable-than-current-unstable rules for kfuncs that
> > > are too much kernel release dependent.
> >
> > I'm talking about *mechanics* of having two incompatible definitions
> > of functions with the same name, not the *concept* of stable vs
> > unstable API. See [0] where I explained this as a reply to Joanne.
> >
> >   [0] https://lore.kernel.org/bpf/CAEf4BzbRQLEjAFUkzzStv0c0=O+r9iZ8hq33sJB2RtSuGrGAEA@mail.gmail.com/
>
> Mechanics for kfuncs are much better than for helpers.
>
> extern bool bpf_dynptr_is_null(const struct bpf_dynptr *p) __ksym;
>
> will likely work with both gcc and clang.
> And if it doesn't we can fix it.
>
> While when gcc folks saw helpers:
>
> static bool (*bpf_dynptr_is_null)(const struct bpf_dynptr *p) = (void *) 777;
>
> they realized that it is a hack that abuses compiler optimizations.
> They even invented attr(kernel_helper) to work around this issue.
> After a bunch of arguing gcc added support for this hack without attr,
> but it's going to be around forever... in gcc, in clang and in kernel.
> It's something that we could have fixed if it wasn't for uapi.
> Just one more example of an unfixable mistake that is causing issues
> to multiple projects.
> That's the core issue of kernel uapi rules: inability to fix mistakes.
>
> > >
> > > > But regardless, dynptr is modeled as black box with hidden state, and
> > > > its API surface area is bigger (offset, size, is null or not,
> > > > manipulations over those aspects; then there is skb/xdp abstraction to
> > > > be taken care of for generic read/write). It has a wider *generic* API
> > > > surface to be useful and effectively used.
> > >
> > > tbh dynptr as an abstraction of skb/xdp is not convincing.
> > > cilium created their own abstraction on top of skb and xdp and it's zero cost.
> > > While dynptr is not free, so xdp users unlikely to use dynptr(xdp) for perf reasons.
> > > So I suspect it won't be a success story in the long run, but we
> > > can certainly try it out since they will be kfuncs and can be deprecated
> > > if maintenance outweighs the number of users.
> > >
> > > > All *two* of them, bpf_get_current_task() and
> > > > bpf_get_current_task_btf(), right? They are 2 years apart.
> > > > bpf_get_current_task() was added before BTF era. It is still actively
> > > > used today and there is nothing wrong with it. It works on older
> > > > kernels just fine, even with BPF CO-RE (as backporting a few simple
> > > > patches to generate BTF is simple and easy; not so much with BPF
> > > > verifier changes to add native BTF support). I don't see much problem
> > > > having both, they are not maintenance burden.
> > >
> > > bpf_get_current_pid_tgid
> > > bpf_get_current_uid_gid
> > > bpf_get_current_comm
> > > bpf_get_current_task
> > > bpf_get_current_task_btf
> > > bpf_get_current_cgroup_id
> > > bpf_get_current_ancestor_cgroup_id
> > > bpf_skb_ancestor_cgroup_id
> > > bpf_sk_cgroup_id
> > > bpf_sk_ancestor_cgroup_id
> > >
> > > _are_ a maintenance burden.
> >
> > bpf_get_current_pid_tgid() was added in 2015, slightly and
> > uncritically touched by Daniel in 2016 and we never had any problems
> > with it ever since. No updates, no maintenance. I don't remember much
> > problem with other helpers in this list, but I didn't check each one.

You could argue that this is actually a point in favor of kfuncs. If we
implement these as kfuncs and never touch them again, users will not
need to change anything and will have the same exact experience as if it
was in UAPI (minus being on platforms that don't support kfuncs, which
is something we should work to fix in general). It will just work
indefinitely, as long as we decide to support it.

The only time there will be pain felt by users is if we in fact do
actually have to change it. If we have to add a flags field, or change
the semantics to have different behavior, etc. I think Alexei's point is
that we simply _can't_ do that if we're bound by UAPI. At least with
kfuncs we have the choice to change it if we deem it necessary.

Taking bpf_get_current_task() as an example, I think it's better to have
the debate be "should we keep supporting this / are users still using
it?" rather than, "it's UAPI, there's nothing to even discuss". The
point being that even if bpf_get_current_task() is still used, there may
(and inevitably will) be other UAPI helpers that are useless and that we
just can't remove.

> >
> > But we certainly have a different understanding of what "maintenance
> > burden" is. If some code doesn't require constant change and doesn't
> > prevent changes in some other parts of the system, it's not a
> > maintenance burden.
>
> As I said it's not about working today. If one doesn't touch code
> it will keep working.
> It's about being able to change it.
> The uapi bits we simply cannot change.

I think Michael Kerrisk's classic "Once upon an API" talk [1] provides a
compelling, real-world example of this point:

[1]: https://kernel-recipes.org/en/2022/once-upon-an-api/

APIs can seem innocuous when you first add them, and then as you use
them more and in different ways, your platform grows more featureful and
things change, etc, you realize that the axioms upon which you designed
your APIs in the first place are no longer true. prctl() started out as
a dead-simple syscall where a child process would get a signal if its
parent process dies. Over the years, it's morphed into a monstrosity [2]
of a syscall with tons of odd behavior that's impossible [3] to fix even
a decade+ after the API was first introduced due to the possibility of
breaking applications that have come to rely on that nonsensical
behavior. Never breaking user space is a great philosophy, but I don't
think we need to inflict that same pain on ourselves for _kernel_
programs, which is what we're discussing here.

[2]: https://man7.org/linux/man-pages/man2/prctl.2.html
[3]: https://bugzilla.kernel.org/show_bug.cgi?id=43300#c22

I'm not trying to paint a false equivalency between prctl() and the
helpers you enumerated in [4], because I agree with you that it's very
unlikely that they'll change, but I also think it's impossible to know
that for sure, and I do agree with Alexei that the "hypothetical chance
to change something in the future" is hugely valuable. That being said,
I comment more on the dynptr helpers down below.

[4]: https://lore.kernel.org/all/CAEf4BzZM0+j6DXMgu2o2UvjtzoOxcjsJtT8j-jqVZYvAqxc52g@mail.gmail.com/

> >
> > > The verifier got smarter and we could have removed all of them,
> > > but uapi rules makes it impossible.
> > > The bpf prog could have been enabled to access all these task_struct
> > > and cgroup fields directly. Likely without any kfuncs.
> > >
> > > bpf_send_signal vs bpf_send_signal_thread
> > > bpf_jiffies64 vs bpf_this_cpu_ptr
> > > etc
> > > there are plenty examples where uapi bpf helpers became a burden.
> > > They are working and will keep working, but we could have done
> > > much better job if not for uapi.
> > > These are the examples where uapi rules are too strong for bpf development.
> > > Our pace of adding new features is high.
> > > The kernel uapi rules are too strict for us.
> >
> > I'm familiar with the burden of maintaining API stability and
> > backwards compat. But it's not just about the library/system
>
> libbpf 1.0 wasn't the smoothest example of deprecation.
> But we still did it despite all kinds of negative flame.
> With uapi helpers we cannot do any of that. No deprecation schemes.
> While kfuncs allow innovation.
>
> > developer's convenience and burden, it's also about the end user's
> > experience and convenience. BPF tool developers really appreciate when
> > there are fewer quirks to remember and work around across kernel
> > versions, configurations, architectures, etc. It's the pain that
> > kernel engineers working on BPF bleeding-edge don't experience in the
> > BPF selftests environment.
>
> There is a trade off between users and developers. We want to make user
> experience as smooth as possible while preserving the speed of development
> for the kernel. uapi is in the way of that.

As illustrated in the prctl() example above, UAPI can get in the way of
users as well. If we can't fix an API or its semantics, some users are
stuck with that crappy behavior (while, admittedly, others get to enjoy
the consistency of the weird / existing behavior not changing out from
under them). I certainly see why there are strong reasons to have a
stable UAPI for user space, but for kernel programs I don't think so.

> > >
> > > At one point DaveM declared freeze on sizeof(struct sk_buff).
> > > It was a difficult, but correct decision.
> > > We have to declare freeze on bpf helpers.
> > > 211 helpers that have to be maintained forever is a huge burden.

While I agree that we should freeze helpers at some point, I also think
we need to take care of a few things before that can or should formally
go into effect. You mentioned some things we should take care of in [5]:
automatically emitting kfuncs into vmlinux.h and properly documenting
kfuncs. I think that list is insufficient, and that we need:

[5]: https://lore.kernel.org/all/20221216173526.y3e5go6mgmjrv46l@MacBook-Pro-6.local/

1. A formal, build-enforced policy for documenting kfuncs, as we
currently have for helpers (as you mentioned, minus the
build-enforcement).

2. Emitting kfuncs into vmlinux.h, as you mentioned.

3. Allowing users to specify flags per-argument in kfuncs. In my opinion
this is a big deficiency of kfuncs relative to helpers. This would mean
e.g. getting rid of the __sz and __k hacks. I think it's fine for us to
live with it for now while we're continuing to flesh out and improve
kfuncs (a process which is happening quickly), but IMO it's really not
appropriate for it to be the official only way to add helpers. It's a
beta feature :-)

4. Getting rid of KF_TRUSTED_ARGS and making that the default.

5. Ideally we could improve the story for _defining_ kfuncs as well,
though IMO it's already far less painful than defining helpers. It would
be nice if you could just tag a kfunc with something like a __bpf_kfunc
macro and it would do the following:

- Automatically disable the -Wmissing-prototypes warning. I doubt this
  is possible without adding some compiler features that let you do
  something like __attribute__((__nowarn__("Wmissing-prototypes"))), so
  maybe this isn't a hard blocker, but more of a medium / long-term
  goal.
- Add whatever other attributes we need for the kfuncs to be safe. For
  example, 'noinline' and '__used'. Even if the symbols are global,
  we'll probably need '__used' for LTO.
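
As a rough illustration of item 5, the tagging idea could be sketched in
plain C as below. This is an assumption-laden sketch, not kernel code: the
macro body, the demo_dynptr struct and demo_dynptr_is_null() are all
hypothetical, and the warning is silenced with a diagnostic pragma around
the definition because no per-function "nowarn" attribute is assumed to
exist.

```c
/*
 * Hypothetical tag for kfunc definitions: keep the symbol even if nothing
 * in the translation unit references it, and never inline it away.
 */
#define __bpf_kfunc __attribute__((used, noinline))

/* Stand-in type for illustration only; not the kernel's dynptr layout. */
struct demo_dynptr {
	const void *data;
	unsigned int size;
};

/*
 * A global function with no prior prototype would normally trigger
 * -Wmissing-prototypes, so the warning is disabled for this region only.
 */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wmissing-prototypes"

__bpf_kfunc int demo_dynptr_is_null(const struct demo_dynptr *p)
{
	/* Treat a missing descriptor or missing data as "null". */
	return p == 0 || p->data == 0;
}

#pragma GCC diagnostic pop
```

With something along these lines, a definition would only need the one tag
plus the surrounding pragma pair; folding the pragma into the tag itself is
the part that would likely need new compiler support.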

Overall, my point is really that we still have some homework to do
before we can just unilaterally freeze helpers. We're getting close, but
IMO not quite there yet.

> >
> > I still didn't get why we have to freeze anything and how exactly
> > helpers are a burden.
> >
> > But especially in this specific case of few simple dynptr helpers,
> > especially that other dynptrs generic APIs are already BPF helpers. I
> > just don't get it and honestly all I see from this discussion is that
> > you've made up your mind and there is nothing that can be done to
> > convince you.
> >
> > The only "BPF helpers are stable and thus a burden" argument is just
> > not convincing and I'd even say is mostly false. There are no upsides
> > to having dynptr helpers as kfuncs, as far as I'm concerned.
>
> The main and only upside for everything as kfunc is that we can change it.
> That's it.
>
> > But there
> > are a bunch of downsides, even if some of those might be lifted in the
> > future.
>
> imo ability to change outweighs all downsides, since downsides are fixable
> while inability to change is a burden.
>
> > The unfortunate thing is that end users that are meant to benefit from
> > all these helpers and them being "a standard API offering" are not
> > well represented on the BPF mailing list, unfortunately. And my
> > opinion and arguments as a proxy for theirs is clearly not enough.
>
> I also would like to hear what others on the list are thinking.

The last thing I'll say is that everything I've said above is really in
regards to the more general debate of helpers vs. kfuncs. Specifically
for the dynptrs being added in this set, I agree with Andrii that it's
arguably an odd user experience for certain platforms to support
only specific parts of the dynptr API surface.

I'm not sure whether that's enough to warrant making them helpers
instead of kfuncs, but I do think it's not exactly an apples to apples
comparison with future features that today have no helper API presence.
Putting myself in the shoes of a dynptr user, I would be very surprised
and confused if all of a sudden, I couldn't use some of the core dynptr
APIs due to being on a platform that doesn't have kfunc support. My two
cents are that letting these dynptr functions stay as helpers, while
agreeing that kfuncs is the way forward (though I don't think Andrii
agrees with that even aside from just these dynptrs) is a reasonable
compromise that errs on the side of user-friendliness for dynptr users.

FWIW, I also don't think it's fair or logical to argue at this point in
the game that dynptrs as a concept are inherently flawed. They were super
useful for enabling the user ringbuf map type, which is a key part of
rhone / user-space scheduling in sched_ext, and I wouldn't be surprised
if ghOSt started using it as well, as a way to make scheduling decisions
without trapping into the kernel. Also, the attendees at LSFMM
generally seemed enthusiastic about dynptrs and user ringbuf, though I
admittedly don't know who's using either feature outside of rhone.

That being said, to reiterate, I personally agree that once we take care
of a few more things for kfuncs, they're 100% the way forward over
helpers. BPF programs are kernel programs, no UAPI pain should be
necessary.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-30 18:38                   ` David Vernet
@ 2022-12-30 19:31                     ` Alexei Starovoitov
  2022-12-30 21:00                       ` David Vernet
  0 siblings, 1 reply; 57+ messages in thread
From: Alexei Starovoitov @ 2022-12-30 19:31 UTC (permalink / raw)
  To: David Vernet
  Cc: Andrii Nakryiko, Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Fri, Dec 30, 2022 at 12:38:55PM -0600, David Vernet wrote:
> On Thu, Dec 29, 2022 at 06:46:41PM -0800, Alexei Starovoitov wrote:
> > On Thu, Dec 29, 2022 at 03:10:22PM -0800, Andrii Nakryiko wrote:
> > > On Sun, Dec 25, 2022 at 1:52 PM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Tue, Dec 20, 2022 at 11:31:25AM -0800, Andrii Nakryiko wrote:
> > > > > On Fri, Dec 16, 2022 at 9:35 AM Alexei Starovoitov
> > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, Dec 12, 2022 at 12:12:09PM -0800, Andrii Nakryiko wrote:
> > > > > > >
> > > > > > > There is no clean way to ever move from unstable kfunc to a stable helper.
> > > > > >
> > > > > > No clean way? Yet in the other email you proposed a way.
> > > > > > Not pretty, but workable.
> > > > > > I'm sure if ever there will be a need to stabilize the kfunc we will
> > > > > > find a clean way to do it.
> > > > >
> > > > > You can't have stable and unstable helper definition in the same .c
> > > > > file,
> > > >
> > > > of course we can.
> > > > uapi helpers vs kfuncs argument is not a black and white comparison.
> > > > It's not just stable vs unstable.
> > > > uapi has strict rules and helpers in uapi/bpf.h have to follow those rules.
> > > > While kfuncs in terms of stability are equivalent to EXPORT_SYMBOL_GPL.
> > > > Meaning they are largely unstable.
> > > > The upstream kernel keeps changing those EXPORT_SYMBOL* functions,
> > > > but distros can apply their own "stability rules".
> > > > See Redhat's kABI, for example. A distro can guarantee a stability
> > > > of certain EXPORT_SYMBOL* for their customers, but that doesn't bind
> > > > upstream development.
> 
> This also sounds more in line with what was discussed at the maintainers
> summit [0]. "A BPF program that depends on kernel symbols is not really
> a user program anymore." Given that perspective, EXPORT_SYMBOL_GPL
> sounds like the correct equivalency to "public BPF symbols".
> 
> [0]: https://lwn.net/Articles/908464/
> 
> > > >
> > > > With uapi bpf helpers we have to guarantee their stability,
> > > > while with kfuncs we can do whatever we want. Right now all kfuncs are
> > > > unstable and to prove the point we changed them couple times already (nf_conn*).
> > > > We also have bpf_obj_new_impl() kfunc which is equivalent to EXPORT_SYMBOL(__kmalloc).
> > > > Hard to imagine more stable and more fundamental function.
> > > > Of course we want bpf programs to use bpf_obj_new() and assume
> > > > that it's going to be available in all future kernel releases.
> > > > But at the same time we're not bound by uapi rules.
> > > > bpf_obj_new() will likely be stable, but not uapi stable.
> > > > If we screw up (or find better way to allocate memory in the future)
> > > > we can change it.
> > > > We can invent our own deprecation rules for stable-ish kfuncs and
> > > > invent our more-unstable-than-current-unstable rules for kfuncs that
> > > > are too much kernel release dependent.
> > >
> > > I'm talking about *mechanics* of having two incompatible definitions
> > > of functions with the same name, not the *concept* of stable vs
> > > unstable API. See [0] where I explained this as a reply to Joanne.
> > >
> > >   [0] https://lore.kernel.org/bpf/CAEf4BzbRQLEjAFUkzzStv0c0=O+r9iZ8hq33sJB2RtSuGrGAEA@mail.gmail.com/
> >
> > Mechanics for kfuncs are much better than for helpers.
> >
> > extern bool bpf_dynptr_is_null(const struct bpf_dynptr *p) __ksym;
> >
> > will likely work with both gcc and clang.
> > And if it doesn't we can fix it.
> >
> > While when gcc folks saw helpers:
> >
> > static bool (*bpf_dynptr_is_null)(const struct bpf_dynptr *p) = (void *) 777;
> >
> > they realized that it is a hack that abuses compiler optimizations.
> > They even invented attr(kernel_helper) to workaround this issue.
> > After a bunch of arguing gcc added support for this hack without attr,
> > but it's going to be around forever... in gcc, in clang and in kernel.
> > It's something that we could have fixed if it wasn't for uapi.
> > Just one more example of an unfixable mistake that is causing issues
> > to multiple projects.
> > That's the core issue of kernel uapi rules: inability to fix mistakes.
> >
> > > >
> > > > > But regardless, dynptr is modeled as black box with hidden state, and
> > > > > its API surface area is bigger (offset, size, is null or not,
> > > > > manipulations over those aspects; then there is skb/xdp abstraction to
> > > > > be taken care of for generic read/write). It has a wider *generic* API
> > > > > surface to be useful and effectively used.
> > > >
> > > > tbh dynptr as an abstraction of skb/xdp is not convincing.
> > > > cilium created their own abstraction on top of skb and xdp and it's zero cost.
> > > > While dynptr is not free, so xdp users unlikely to use dynptr(xdp) for perf reasons.
> > > > So I suspect it won't be a success story in the long run, but we
> > > > can certainly try it out since they will be kfuncs and can be deprecated
> > > > if maintenance outweighs the number of users.
> > > >
> > > > > All *two* of them, bpf_get_current_task() and
> > > > > bpf_get_current_task_btf(), right? They are 2 years apart.
> > > > > bpf_get_current_task() was added before BTF era. It is still actively
> > > > > used today and there is nothing wrong with it. It works on older
> > > > > kernels just fine, even with BPF CO-RE (as backporting a few simple
> > > > > patches to generate BTF is simple and easy; not so much with BPF
> > > > > verifier changes to add native BTF support). I don't see much problem
> > > > > having both, they are not maintenance burden.
> > > >
> > > > bpf_get_current_pid_tgid
> > > > bpf_get_current_uid_gid
> > > > bpf_get_current_comm
> > > > bpf_get_current_task
> > > > bpf_get_current_task_btf
> > > > bpf_get_current_cgroup_id
> > > > bpf_get_current_ancestor_cgroup_id
> > > > bpf_skb_ancestor_cgroup_id
> > > > bpf_sk_cgroup_id
> > > > bpf_sk_ancestor_cgroup_id
> > > >
> > > > _are_ a maintenance burden.
> > >
> > > bpf_get_current_pid_tgid() was added in 2015, slightly and
> > > uncritically touched by Daniel in 2016 and we never had any problems
> > > with it ever since. No updates, no maintenance. I don't remember much
> > > problem with other helpers in this list, but I didn't check each one.
> 
> You could argue that this is actually a point in favor of kfuncs. If we
> implement these as kfuncs and never touch them again, users will not
> need to change anything and will have the same exact experience as if it
> was in UAPI (minus being on platforms that don't support kfuncs, which
> is something we should work to fix in general). It will just work
> indefinitely, as long as we decide to support it.
> 
> The only time there will be pain felt by users is if we in fact do
> actually have to change it. If we have to add a flags field, or change
> the semantics to have different behavior, etc. I think Alexei's point is
> that we simply _can't_ do that if we're bound by UAPI. At least with
> kfuncs we have the choice to change it if we deem it necessary.
> 
> Taking bpf_get_current_task() as an example, I think it's better to have
> the debate be "should we keep supporting this / are users still using
> it?" rather than, "it's UAPI, there's nothing to even discuss". The
> point being that even if bpf_get_current_task() is still used, there may
> (and inevitably will) be other UAPI helpers that are useless and that we
> just can't remove.
> 
> > >
> > > But we certainly have a different understanding of what "maintenance
> > > burden" is. If some code doesn't require constant change and doesn't
> > > prevent changes in some other parts of the system, it's not a
> > > maintenance burden.
> >
> > As I said it's not about working today. If one doesn't touch code
> > it will keep working.
> > It's about being able to change it.
> > The uapi bits we simply cannot change.
> 
> I think Michael Kerrisk's classic "Once upon an API" talk [1] provides a
> compelling, real-world example of this point:
> 
> [1]: https://kernel-recipes.org/en/2022/once-upon-an-api/
> 
> APIs can seem innocuous when you first add them, and then as you use
> them more and in different ways, your platform grows more featureful and
> things change, etc, you realize that the axioms upon which you designed
> your APIs in the first place are no longer true. prctl() started out as
> a dead-simple syscall where a child process would get a signal if its
> parent process dies. Over the years, it's morphed into a monstrosity [2]
> of a syscall with tons of odd behavior that's impossible [3] to fix even
> a decade+ after the API was first introduced due to the possibility of
> breaking applications that have come to rely on that non-sensical
> behavior. Never breaking user space is a great philosophy, but I don't
> think we need to inflict that same pain on ourselves for _kernel_
> programs, which is what we're discussing here.
> 
> [2]: https://man7.org/linux/man-pages/man2/prctl.2.html
> [3]: https://bugzilla.kernel.org/show_bug.cgi?id=43300#c22
> 
> I'm not trying to paint a false equivalency between prctl() and the
> helpers you enumerated in [4], because I agree with you that it's very
> unlikely that they'll change, but I also think it's impossible to know
> that for sure, and I do agree with Alexei that the "hypothetical chance
> to change something in the future" is hugely valuable. That being said,
> I comment more on the dynptr helpers down below.
> 
> [4]: https://lore.kernel.org/all/CAEf4BzZM0+j6DXMgu2o2UvjtzoOxcjsJtT8j-jqVZYvAqxc52g@mail.gmail.com/
> 
> > >
> > > > The verifier got smarter and we could have removed all of them,
> > > > but uapi rules makes it impossible.
> > > > The bpf prog could have been enabled to access all these task_struct
> > > > and cgroup fields directly. Likely without any kfuncs.
> > > >
> > > > bpf_send_signal vs bpf_send_signal_thread
> > > > bpf_jiffies64 vs bpf_this_cpu_ptr
> > > > etc
> > > > there are plenty examples where uapi bpf helpers became a burden.
> > > > They are working and will keep working, but we could have done
> > > > much better job if not for uapi.
> > > > These are the examples where uapi rules are too strong for bpf development.
> > > > Our pace of adding new features is high.
> > > > The kernel uapi rules are too strict for us.
> > >
> > > I'm familiar with the burden of maintaining API stability and
> > > backwards compat. But it's not just about the library/system
> >
> > libbpf 1.0 wasn't the smoothest example of deprecation.
> > But we still did it despite all kinds of negative flame.
> > With uapi helpers we cannot do any of that. No deprecation schemes.
> > While kfuncs allow innovation.
> >
> > > developer's convenience and burden, it's also about the end user's
> > > experience and convenience. BPF tool developers really appreciate when
> > > there are few less quirks to remember and work around across kernel
> > > versions, configurations, architectures, etc. It's the pain that
> > > kernel engineers working on BPF bleeding-edge don't experience in the
> > > BPF selftests environment.
> >
> > There is a trade off between users and developers. We want to make user
> > experience as smooth as possible while preserve the speed of development
> > for the kernel. uapi is in the way of that.
> 
> As illustrated in the prctl() example above, UAPI can get in the way of
> users as well. If we can't fix an API or its semantics, some users are
> stuck with that crappy behavior (while, admittedly, others get to enjoy
> the consistency of the weird / existing behavior not changing out from
> under them). I certainly see why there are strong reasons to have a
> stable UAPI for user space, but for kernel programs I don't think so.
> 
> > > >
> > > > At one point DaveM declared freeze on sizeof(struct sk_buff).
> > > > It was a difficult, but correct decision.
> > > > We have to declare freeze on bpf helpers.
> > > > 211 helpers that have to be maintained forever is a huge burden.
> 
> While I agree that we should freeze helpers at some point, I also think
> we need to take care of a few things before that can or should formally
> go into effect. You mentioned some things we should take care of in [5].
> Automatically emitting kfuncs into vmlinux.h, properly documenting
> kfuncs. I think that list is insufficient, and that we need:
> 
> [5]: https://lore.kernel.org/all/20221216173526.y3e5go6mgmjrv46l@MacBook-Pro-6.local/

All of the below are 'nice to have' to improve kfunc user experience,
but certainly not 'must have'.

> 1. A formal, build-enforced policy for documenting kfuncs, as we
> currently have for helpers (as you mentioned, minus the
> build-enforcement).

That would be necessary only for stable-ish kfuncs.
Like recently added bpf_obj_new.
Unstable kfuncs would have to be documented differently and maybe not even documented.
It will take time to figure it all out.

> 2. Emitting kfuncs into vmlinux.h, as you mentioned.

Key kfuncs are already in bpf_experimental.h
Unstable kfuncs might go into vmlinux.h.
Maybe all.
Many ways to go about it.

> 3. Allowing users to specify flags per-argument in kfuncs. In my opinion
> this is a big deficiency of kfuncs relative to helpers. This would mean
> e.g. getting rid of the __sz and __k hacks. I think it's fine for us to
> live with it for now while we're continuing to flesh-out and improve
> kfuncs (a process which is happening quickly), but IMO it's really not
> appropriate for it to be the official only way to add helpers. It's a
> beta feature :-)

This is a huge discussion on pros and cons and correct approach.
That might take years. We already had ~3 refactorings of how kfuncs
are represented in the kernel in the last ~2 years.
Is the 4th refactoring going to be final? Likely no.
It's a bit of wishful thinking that addressing today's problem will somehow
make everything nice and clean and then we will be ready to stop adding helpers.
We'll keep improving the infra for years to come.
There is no "end of the road" sign.
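
For readers unfamiliar with the __sz and __k hacks named in point 3: the
meaning of a kfunc argument is carried by a suffix on the parameter name
rather than by a per-argument flag. The following is a minimal user-space
sketch of that idea, not the verifier's code. The enum arg_kind and the
functions has_suffix() and classify_arg() are hypothetical names made up
for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification, for illustration only. */
enum arg_kind {
	ARG_PLAIN,	/* no suffix: no extra meaning attached */
	ARG_MEM_SIZE,	/* "__sz": size of the memory behind the preceding pointer */
	ARG_CONST,	/* "__k": scalar that must be a known constant */
};

/* Return true if 'name' ends with 'suffix'. */
bool has_suffix(const char *name, const char *suffix)
{
	size_t nlen = strlen(name), slen = strlen(suffix);

	return nlen >= slen && strcmp(name + nlen - slen, suffix) == 0;
}

/* Classify a parameter purely from its name, as a suffix convention requires. */
enum arg_kind classify_arg(const char *param_name)
{
	if (has_suffix(param_name, "__sz"))
		return ARG_MEM_SIZE;
	if (has_suffix(param_name, "__k"))
		return ARG_CONST;
	return ARG_PLAIN;
}
```

With this sketch, classify_arg("data__sz") yields ARG_MEM_SIZE and
classify_arg("ptr") yields ARG_PLAIN. It also shows why the convention is
called a hack: renaming a parameter silently changes how it is treated.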

> 4. Getting rid of KF_TRUSTED_ARGS and making that the default.

We've been talking about this possibility for months.
Are you suggesting to keep adding helpers for another year or so?
We already have 91 kfuncs and 211 helpers.
If we were not asking all developers to use kfuncs we would have had 300+ helpers.

> 5. Ideally we could improve the story for _defining_ kfuncs as well,
> though IMO it's already far less painful than defining helpers. It would
> be nice if you could just tag a kfunc with something like a __bpf_kfunc
> macro and it would do the following:
> 
> - Automatically disable the -Wmissing-prototypes warning. I doubt this
>   is possible without adding some compiler features that let you do
>   something like __attribute__(__nowarn__("Wmissing-prototypes")), so
>   maybe this isn't a hard blocker, but more of a medium / long-term
>   goal.
> - Add whatever other attributes we need for the kfuncs to be safe. For
>   example, 'noinline' and '__used'. Even if the symbols are global,
>   we'll probably need '__used' for LTO.

Would be nice, but that didn't stop the existing 91 kfuncs from appearing
and from being used in production.
Yes. kfuncs are already used in production.
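
The tagging macro proposed in point 5 can be approximated in plain C. This
is a minimal sketch assuming a GCC- or Clang-compatible compiler. DEMO_KFUNC
and demo_kfunc_add() are hypothetical names, not kernel symbols, and the
warning is suppressed with a pragma around the definition because, as point 5
notes, an attribute that does so is not available.

```c
/* Hypothetical macro name; bundles the two attributes named in point 5. */
#define DEMO_KFUNC __attribute__((used, noinline))

/* Silence -Wmissing-prototypes for global definitions that have no header. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wmissing-prototypes"

/* Illustrative stand-in for a kfunc body; not a real kernel function. */
DEMO_KFUNC int demo_kfunc_add(int a, int b)
{
	return a + b;
}

#pragma GCC diagnostic pop
```

Here 'used' keeps the symbol from being discarded and 'noinline' keeps it
as a real call target. The pragma pair still has to be written by hand
around each group of definitions, which is the boilerplate the proposed
macro would ideally hide.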

> Overall, my point is really that we still have some homework to do
> before we can just unilaterally freeze helpers. We're getting close, but
> IMO not quite there yet.

91 vs 211 tells a different story.

> > >
> > > I still didn't get why we have to freeze anything and how exactly
> > > helpers are a burden.
> > >
> > > But especially in this specific case of few simple dynptr helpers,
> > > especially that other dynptrs generic APIs are already BPF helpers. I
> > > just don't get it and honestly all I see from this discussion is that
> > > you've made up your mind and there is nothing that can be done to
> > > convince you.
> > >
> > > The only "BPF helpers are stable and thus a burden" argument is just
> > > not convincing and I'd even say is mostly false. There are no upsides
> > > to having dynptr helpers as kfuncs, as far as I'm concerned.
> >
> > The main and only upside for everything as kfunc is that we can change it.
> > That's it.
> >
> > > But there
> > > are a bunch of downsides, even if some of those might be lifted in the
> > > future.
> >
> > imo ability to change outweighs all downsides, since downsides are fixable
> > while inability to change is a burden.
> >
> > > The unfortunate thing is that end users that are meant to benefit from
> > > all these helpers and them being "a standard API offering" are not
> > > well represented on the BPF mailing list, unfortunately. And my
> > > opinion and arguments as a proxy for theirs is clearly not enough.
> >
> > I also would like to hear what others on the list are thinking.
> 
> The last thing I'll say is that everything I've said above is really in
> regards to the more general debate of helpers vs. kfuncs. Specifically
> for the dynptrs being added in this set, I agree with Andrii that it's
> arguably an odd user experience for certain platforms to support
> only specific parts of the dynptr API surface.
> 
> I'm not sure whether that's enough to warrant making them helpers
> instead of kfuncs, but I do think it's not exactly an apples to apples
> comparison with future features that today have no helper API presence.
> Putting myself in the shoes of a dynptr user, I would be very surprised
> and confused if all of a sudden, I couldn't use some of the core dynptr
> APIs due to being on a platform that doesn't have kfunc support. My two
> cents are that letting these dynptr functions stay as helpers, while
> agreeing that kfuncs is the way forward (though I don't think Andrii
> agrees with that even aside from just these dynptrs) is a reasonable
> compromise that errs on the side of user-friendliness for dynptr users.

We already have this 'discrepancy' of both kfuncs and helpers for kptrs
(bpf_obj_new vs bpf_kptr_xchg) and so far no complaints.
Why is dynptr special?

> FWIW, I also don't think it's fair or logical to argue at this point in
> the game that dynptrs as a concept is inherently flawed. They were super
> useful for enabling the user ringbuf map type, which is a key part of
> rhone / user-space scheduling in sched_ext, and I wouldn't be surprised
> if ghOSt started using it as well as a way to make scheduling decisions
> without trapping into the kernel as well. Also, the attendees at LSFMM
> generally seemed enthusiastic about dynptrs and user ringbuf, though I
> admittedly don't know who's using either feature outside of rhone.

rhone doesn't have stability guarantees just like sched-ext doesn't have them.
To drive that point rhone and sched-ext should really be using kfuncs.
Otherwise somebody might point the finger at helpers and argue that
this somehow makes sched-ext stable.

> That being said, to reiterate, I personally agree that once we take care
> of a few more things for kfuncs, they're 100% the way forward over
> helpers. BPF programs are kernel programs, no UAPI pain should be
> necessary.

Similar arguments were made during sk_buff freeze... let's add few more fields
that are going to be sooo useful and then we'll freeze sk_buff...
dynptr is trying to be that special snowflake.

bpf_rcu_read_lock was added as a kfunc. It's more fundamental than dynptr.
bpf_obj_new is a kfunc too. Also more fundamental than dynptr.
What is so special about dynptr that we need to make an exception for it?

* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-30 19:31                     ` Alexei Starovoitov
@ 2022-12-30 21:00                       ` David Vernet
  2022-12-31  0:42                         ` Alexei Starovoitov
  2023-01-04 18:43                         ` Andrii Nakryiko
  0 siblings, 2 replies; 57+ messages in thread
From: David Vernet @ 2022-12-30 21:00 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Fri, Dec 30, 2022 at 11:31:12AM -0800, Alexei Starovoitov wrote:
> On Fri, Dec 30, 2022 at 12:38:55PM -0600, David Vernet wrote:
> > On Thu, Dec 29, 2022 at 06:46:41PM -0800, Alexei Starovoitov wrote:
> > > On Thu, Dec 29, 2022 at 03:10:22PM -0800, Andrii Nakryiko wrote:
> > > > On Sun, Dec 25, 2022 at 1:52 PM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Tue, Dec 20, 2022 at 11:31:25AM -0800, Andrii Nakryiko wrote:
> > > > > > On Fri, Dec 16, 2022 at 9:35 AM Alexei Starovoitov
> > > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > > >
> > > > > > > On Mon, Dec 12, 2022 at 12:12:09PM -0800, Andrii Nakryiko wrote:
> > > > > > > >
> > > > > > > > There is no clean way to ever move from unstable kfunc to a stable helper.
> > > > > > >
> > > > > > > No clean way? Yet in the other email you proposed a way.
> > > > > > > Not pretty, but workable.
> > > > > > > I'm sure if ever there will be a need to stabilize the kfunc we will
> > > > > > > find a clean way to do it.
> > > > > >
> > > > > > You can't have stable and unstable helper definition in the same .c
> > > > > > file,
> > > > >
> > > > > of course we can.
> > > > > uapi helpers vs kfuncs argument is not a black and white comparison.
> > > > > It's not just stable vs unstable.
> > > > > uapi has strict rules and helpers in uapi/bpf.h have to follow those rules.
> > > > > While kfuncs in terms of stability are equivalent to EXPORT_SYMBOL_GPL.
> > > > > Meaning they are largely unstable.
> > > > > The upstream kernel keeps changing those EXPORT_SYMBOL* functions,
> > > > > but distros can apply their own "stability rules".
> > > > > See Redhat's kABI, for example. A distro can guarantee a stability
> > > > > of certain EXPORT_SYMBOL* for their customers, but that doesn't bind
> > > > > upstream development.
> > 
> > This also sounds more in line with what was discussed at the maintainers
> > summit [0]. "A BPF program that depends on kernel symbols is not really
> > a user program anymore." Given that perspective, EXPORT_SYMBOL_GPL
> > sounds like the correct equivalency to "public BPF symbols".
> > 
> > [0]: https://lwn.net/Articles/908464/
> > 
> > > > >
> > > > > With uapi bpf helpers we have to guarantee their stability,
> > > > > while with kfuncs we can do whatever we want. Right now all kfuncs are
> > > > > unstable and to prove the point we changed them couple times already (nf_conn*).
> > > > > We also have bpf_obj_new_impl() kfunc which is equivalent to EXPORT_SYMBOL(__kmalloc).
> > > > > Hard to imagine more stable and more fundamental function.
> > > > > Of course we want bpf programs to use bpf_obj_new() and assume
> > > > > that it's going to be available in all future kernel releases.
> > > > > But at the same time we're not bound by uapi rules.
> > > > > bpf_obj_new() will likely be stable, but not uapi stable.
> > > > > If we screw up (or find better way to allocate memory in the future)
> > > > > we can change it.
> > > > > We can invent our own deprecation rules for stable-ish kfuncs and
> > > > > invent our more-unstable-than-current-unstable rules for kfuncs that
> > > > > are too much kernel release dependent.
> > > >
> > > > I'm talking about *mechanics* of having two incompatible definitions
> > > > of functions with the same name, not the *concept* of stable vs
> > > > unstable API. See [0] where I explained this as a reply to Joanne.
> > > >
> > > >   [0] https://lore.kernel.org/bpf/CAEf4BzbRQLEjAFUkzzStv0c0=O+r9iZ8hq33sJB2RtSuGrGAEA@mail.gmail.com/
> > >
> > > Mechanics for kfuncs are much better than for helpers.
> > >
> > > extern bool bpf_dynptr_is_null(const struct bpf_dynptr *p) __ksym;
> > >
> > > will likely work with both gcc and clang.
> > > And if it doesn't we can fix it.
> > >
> > > While when gcc folks saw helpers:
> > >
> > > static bool (*bpf_dynptr_is_null)(const struct bpf_dynptr *p) = (void *) 777;
> > >
> > > they realized that it is a hack that abuses compiler optimizations.
> > > They even invented attr(kernel_helper) to workaround this issue.
> > > After a bunch of arguing gcc added support for this hack without attr,
> > > but it's going to be around forever... in gcc, in clang and in kernel.
> > > It's something that we could have fixed if it wasn't for uapi.
> > > Just one more example of an unfixable mistake that is causing issues
> > > to multiple projects.
> > > That's the core issue of kernel uapi rules: inability to fix mistakes.
> > >
> > > > >
> > > > > > But regardless, dynptr is modeled as black box with hidden state, and
> > > > > > its API surface area is bigger (offset, size, is null or not,
> > > > > > manipulations over those aspects; then there is skb/xdp abstraction to
> > > > > > be taken care of for generic read/write). It has a wider *generic* API
> > > > > > surface to be useful and effectively used.
> > > > >
> > > > > tbh dynptr as an abstraction of skb/xdp is not convincing.
> > > > > cilium created their own abstraction on top of skb and xdp and it's zero cost.
> > > > > While dynptr is not free, so xdp users unlikely to use dynptr(xdp) for perf reasons.
> > > > > So I suspect it won't be a success story in the long run, but we
> > > > > can certainly try it out since they will be kfuncs and can be deprecated
> > > > > if maintenance outweighs the number of users.
> > > > >
> > > > > > All *two* of them, bpf_get_current_task() and
> > > > > > bpf_get_current_task_btf(), right? They are 2 years apart.
> > > > > > bpf_get_current_task() was added before BTF era. It is still actively
> > > > > > used today and there is nothing wrong with it. It works on older
> > > > > > kernels just fine, even with BPF CO-RE (as backporting a few simple
> > > > > > patches to generate BTF is simple and easy; not so much with BPF
> > > > > > verifier changes to add native BTF support). I don't see much problem
> > > > > > having both, they are not maintenance burden.
> > > > >
> > > > > bpf_get_current_pid_tgid
> > > > > bpf_get_current_uid_gid
> > > > > bpf_get_current_comm
> > > > > bpf_get_current_task
> > > > > bpf_get_current_task_btf
> > > > > bpf_get_current_cgroup_id
> > > > > bpf_get_current_ancestor_cgroup_id
> > > > > bpf_skb_ancestor_cgroup_id
> > > > > bpf_sk_cgroup_id
> > > > > bpf_sk_ancestor_cgroup_id
> > > > >
> > > > > _are_ a maintenance burden.
> > > >
> > > > bpf_get_current_pid_tgid() was added in 2015, slightly and
> > > > uncritically touched by Daniel in 2016 and we never had any problems
> > > > with it ever since. No updates, no maintenance. I don't remember much
> > > > problem with other helpers in this list, but I didn't check each one.
> > 
> > You could argue that this is actually a point in favor of kfuncs. If we
> > implement these as kfuncs and never touch them again, users will not
> > need to change anything and will have the same exact experience as if it
> > was in UAPI (minus being on platforms that don't support kfuncs, which
> > is something we should work to fix in general). It will just work
> > indefinitely, as long as we decide to support it.
> > 
> > The only time there will be pain felt by users is if we in fact do
> > actually have to change it. If we have to add a flags field, or change
> > the semantics to have different behavior, etc. I think Alexei's point is
> > that we simply _can't_ do that if we're bound by UAPI. At least with
> > kfuncs we have the choice to change it if we deem it necessary.
> > 
> > Taking bpf_get_current_task() as an example, I think it's better to have
> > the debate be "should we keep supporting this / are users still using
> > it?" rather than, "it's UAPI, there's nothing to even discuss". The
> > point being that even if bpf_get_current_task() is still used, there may
> > (and inevitably will) be other UAPI helpers that are useless and that we
> > just can't remove.
> > 
> > > >
> > > > But we certainly have a different understanding of what "maintenance
> > > > burden" is. If some code doesn't require constant change and doesn't
> > > > prevent changes in some other parts of the system, it's not a
> > > > maintenance burden.
> > >
> > > As I said it's not about working today. If one doesn't touch code
> > > it will keep working.
> > > It's about being able to change it.
> > > The uapi bits we simply cannot change.
> > 
> > I think Michael Kerrisk's classic "Once upon an API" talk [1] provides a
> > compelling, real-world example of this point:
> > 
> > [1]: https://kernel-recipes.org/en/2022/once-upon-an-api/
> > 
> > APIs can seem innocuous when you first add them, and then as you use
> > them more and in different ways, your platform grows more featureful and
> > things change, etc, you realize that the axioms upon which you designed
> > your APIs in the first place are no longer true. prctl() started out as
> > a dead-simple syscall where a child process would get a signal if its
> > parent process dies. Over the years, it's morphed into a monstrosity [2]
> > of a syscall with tons of odd behavior that's impossible [3] to fix even
> > a decade+ after the API was first introduced due to the possibility of
> > breaking applications that have come to rely on that non-sensical
> > behavior. Never breaking user space is a great philosophy, but I don't
> > think we need to inflict that same pain on ourselves for _kernel_
> > programs, which is what we're discussing here.
> > 
> > [2]: https://man7.org/linux/man-pages/man2/prctl.2.html
> > [3]: https://bugzilla.kernel.org/show_bug.cgi?id=43300#c22
> > 
> > I'm not trying to paint a false equivalency between prctl() and the
> > helpers you enumerated in [4], because I agree with you that it's very
> > unlikely that they'll change, but I also think it's impossible to know
> > that for sure, and I do agree with Alexei that the "hypothetical chance
> > to change something in the future" is hugely valuable. That being said,
> > I comment more on the dynptr helpers down below.
> > 
> > [4]: https://lore.kernel.org/all/CAEf4BzZM0+j6DXMgu2o2UvjtzoOxcjsJtT8j-jqVZYvAqxc52g@mail.gmail.com/
> > 
> > > >
> > > > > The verifier got smarter and we could have removed all of them,
> > > > > but uapi rules makes it impossible.
> > > > > The bpf prog could have been enabled to access all these task_struct
> > > > > and cgroup fields directly. Likely without any kfuncs.
> > > > >
> > > > > bpf_send_signal vs bpf_send_signal_thread
> > > > > bpf_jiffies64 vs bpf_this_cpu_ptr
> > > > > etc
> > > > > there are plenty examples where uapi bpf helpers became a burden.
> > > > > They are working and will keep working, but we could have done
> > > > > much better job if not for uapi.
> > > > > These are the examples where uapi rules are too strong for bpf development.
> > > > > Our pace of adding new features is high.
> > > > > The kernel uapi rules are too strict for us.
> > > >
> > > > I'm familiar with the burden of maintaining API stability and
> > > > backwards compat. But it's not just about the library/system
> > >
> > > libbpf 1.0 wasn't the smoothest example of deprecation.
> > > But we still did it despite all kinds of negative flame.
> > > With uapi helpers we cannot do any of that. No deprecation schemes.
> > > While kfuncs allow innovation.
> > >
> > > > developer's convenience and burden, it's also about the end user's
> > > > experience and convenience. BPF tool developers really appreciate when
> > > > there are few less quirks to remember and work around across kernel
> > > > versions, configurations, architectures, etc. It's the pain that
> > > > kernel engineers working on BPF bleeding-edge don't experience in the
> > > > BPF selftests environment.
> > >
> > > There is a trade off between users and developers. We want to make user
> > > experience as smooth as possible while preserving the speed of development
> > > for the kernel. uapi is in the way of that.
> > 
> > As illustrated in the prctl() example above, UAPI can get in the way of
> > users as well. If we can't fix an API or its semantics, some users are
> > stuck with that crappy behavior (while, admittedly, others get to enjoy
> > the consistency of the weird / existing behavior not changing out from
> > under them). I certainly see why there are strong reasons to have a
> > stable UAPI for user space, but for kernel programs I don't think so.
> > 
> > > > >
> > > > > At one point DaveM declared freeze on sizeof(struct sk_buff).
> > > > > It was a difficult, but correct decision.
> > > > > We have to declare freeze on bpf helpers.
> > > > > 211 helpers that have to be maintained forever is a huge burden.
> > 
> > While I agree that we should freeze helpers at some point, I also think
> > we need to take care of a few things before that can or should formally
> > go into effect. You mentioned some things we should take care of in [5].
> > Automatically emitting kfuncs into vmlinux.h, properly documenting
> > kfuncs. I think that list is insufficient, and that we need:
> > 
> > [5]: https://lore.kernel.org/all/20221216173526.y3e5go6mgmjrv46l@MacBook-Pro-6.local/
> 
> All of the below are 'nice to have' to improve kfunc user experience,
> but certainly not 'must have'.

I certainly agree that what is 'must have' is subjective.

> 
> > 1. A formal, build-enforced policy for documenting kfuncs, as we
> > currently have for helpers (as you mentioned, minus the
> > build-enforcement).
> 
> That would be necessary only for stable-ish kfuncs.
> Like recently added bpf_obj_new.
> Unstable kfuncs would have to be documented differently and maybe not even documented.
> It will take time to figure it all out.

Why would we only want to make it necessary for stable-ish kfuncs? It's
simpler and less open to interpretation to just have a blanket "you must
document your kfuncs" policy. It seems pretty reasonable to expect
people who are exporting public symbols that can be linked against by
BPF programs to document those functions given that it takes no more
than ~5 minutes?

I also don't want to hijack the larger conversation here to discuss
documentation. I think we all agree that documentation is important. We
already have a pretty good kfuncs docs page [0] anyways. In my
subjective opinion, _the_ platform for documenting public / exported BPF
symbols should have a well-defined documentation story, but yes, arguing
for it to be a blocker is maybe a stretch.

[0]: https://docs.kernel.org/bpf/kfuncs.html

> > 2. Emitting kfuncs into vmlinux.h, as you mentioned.
> 
> Key kfuncs are already in bpf_experimental.h
> Unstable kfuncs might go into vmlinux.h.
> Maybe all.
> Many ways to go about it.

Agreed that there are many possibilities. In my once again subjective
opinion it would be good to get this ironed out, but yes, arguably not a
blocker.

> 
> > 3. Allowing users to specify flags per-argument in kfuncs. In my opinion
> > this is a big deficiency of kfuncs relative to helpers. This would mean
> > e.g. getting rid of the __sz and __k hacks. I think it's fine for us to
> > live with it for now while we're continuing to flesh-out and improve
> > kfuncs (a process which is happening quickly), but IMO it's really not
> > appropriate for it to be the official only way to add helpers. It's a
> > beta feature :-)
> 
> This is a huge discussion on pros and cons and correct approach.
> That might take years. We already had ~3 refactoring of how kfuncs
> are represented in the kernel in the last ~2 years.
> Is 4th refactoring going to be final? Likely no.

I don't think the fact that we'll never be done is a valid counterpoint
to "are we ready now"? The first iteration of kfuncs was definitely not
in a good enough state to freeze all helpers. The usability of kfuncs
has improved drastically since then. The question isn't "when will we be at
a complete stopping point?", it's, "are we sufficiently ready now?".

> It's a bit of wishful thinking that addressing today's problem will somehow
> make everything nice and clean and then we will be ready to stop adding helpers.
> We'll keep improving the infra for years to come.
> There is no "end of the road" sign.

Yes, there's no end of the road, but my point is that there are still
pieces that we know we need to change, and which we know are temporary
(__sz and __k being the main examples).
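
For anyone not familiar with the suffix convention being discussed, here
is a rough user-space sketch of the idea: the meaning of an argument is
encoded in the tail of its parameter name (__sz for "size of the
preceding memory argument", __k for "known constant") rather than in a
per-argument flag. The names example_memzero and param_has_suffix are
made up for illustration, and the suffix check is only a simplified
stand-in for what the verifier does with BTF parameter names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative only: a kfunc-style signature where the size of 'mem'
 * is conveyed by the '__sz' suffix on the next parameter's name.
 */
void example_memzero(void *mem, int mem__sz)
{
	if (mem && mem__sz > 0)
		memset(mem, 0, (size_t)mem__sz);
}

/*
 * Simplified stand-in (an assumption, not kernel code) for checking
 * whether a parameter name carries a given suffix such as "__sz".
 */
bool param_has_suffix(const char *name, const char *suffix)
{
	size_t nlen = strlen(name);
	size_t slen = strlen(suffix);

	return nlen >= slen && strcmp(name + nlen - slen, suffix) == 0;
}
```

The point of the sketch is that the annotation lives in the name, which
is why it reads as a temporary workaround compared to explicit per-arg
flags.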

*That being said*: I completely admit that this is all subjective. From
a technical standpoint, there is nothing stopping us from freezing
helpers. And honestly, I don't disagree with you that getting out of
UAPI immediately and forever is a huge positive; possibly even to the
point that it warrants us just doing it now. More below.

> 
> > 4. Getting rid of KF_TRUSTED_ARGS and making that the default.
> 
> We've been talking about this possibility for months.
> Are you suggesting to keep adding helpers for another year or so?

I think that kfuncs should be the norm for the vast majority of things
being added, and hopefully for everything (I'm going to walk back my
suggestion of adding these new dynptr functions as helpers). Honestly,
my point was really just that I think the API for defining kfuncs needs
to be improved before we can totally and completely freeze helpers due
to the fact that we have __sz and __k, and don't have a consistent
documentation story. That being said, __sz and __k are there, they work,
and as you and I have both said at this point, whether or not they're
"blockers" is subjective.

So my answer to your question of "should we add helpers for another year
or so" in my last reply would have been "absolutely not, unless we truly
have no choice because of the lack of per-arg flags". After reading your
reply, if you're worried that that policy won't be strictly enforced
(meaning that we'll end up having to add helpers that easily could have
just been kfuncs) then I agree that we should just do the hard freeze
now. We've de-facto been doing that anyways for the last year.

That being said, I really would hope that we could at least get some of
the documentation story figured out. Even if it's just something as
simple as spelling out a formal policy on our kfuncs docs page
stipulating that you have to add a doxygen header and link it from a
docs page, it would be nice to have some policy that puts kfuncs on a
road to being as well documented as helpers.

> We already have 91 kfuncs and 211 helpers.
> If we were not asking all developers to use kfuncs we would have had 300+ helpers.

Agreed that this would have been a _very_ unfortunate outcome.

> 
> > 5. Ideally we could improve the story for _defining_ kfuncs as well,
> > though IMO it's already far less painful than defining helpers. It would
> > be nice if you could just tag a kfunc with something like a __bpf_kfunc
> > macro and it would do the following:
> > 
> > - Automatically disable the -Wmissing-prototypes warning. I doubt this
> >   is possible without adding some compiler features that let you do
> >   something like __attribute__(__nowarn__("Wmissing-prototypes")), so
> >   maybe this isn't a hard blocker, but more of a medium / long-term
> >   goal.
> > - Add whatever other attributes we need for the kfuncs to be safe. For
> >   example, 'noinline' and '__used'. Even if the symbols are global,
> >   we'll probably need '__used' for LTO.
> 
> would be nice, but that didn't stop the existing 91 kfuncs from appearing
> and already being used in production.
> Yes. kfuncs are already used in production.

This is something that would literally only take like 1-2 patches
anyways. I'm happy to do it so we don't have to waste cycles thinking
about it as a blocker for anything.

> 
> > Overall, my point is really that we still have some homework to do
> > before we can just unilaterally freeze helpers. We're getting close, but
> > IMO not quite there yet.
> 
> 91 vs 211 tells a different story.

Yeah, the fact that we have 91 kfuncs is strong evidence that kfuncs are
already in a good-enough place to just freeze helpers.

Another counterpoint to my initial claim that not having per-arg flags
could be problematic is that there are certain things that are global in
kfuncs that are also global in helpers despite having per-arg modifiers.
For example, the fact that you can only have one OBJ_RELEASE argument.
And yet another is the fact that none of the helpers we've added in the
last year relied on having per-arg modifiers, so in practice it hasn't
been a problem.

I think it's fair to say that if you just look at the data instead of
from an "API cleanliness" perspective, having per-arg modifiers is not a
blocker. Data wins over subjectivity, so as mentioned above, I'm willing
to change my mind about per-arg modifiers being a blocker, especially
with __sz and __k.

> > > >
> > > > I still didn't get why we have to freeze anything and how exactly
> > > > helpers are a burden.
> > > >
> > > > But especially in this specific case of few simple dynptr helpers,
> > > > especially that other dynptrs generic APIs are already BPF helpers. I
> > > > just don't get it and honestly all I see from this discussion is that
> > > > you've made up your mind and there is nothing that can be done to
> > > > convince you.
> > > >
> > > > The only "BPF helpers are stable and thus a burden" argument is just
> > > > not convincing and I'd even say is mostly false. There are no upsides
> > > > to having dynptr helpers as kfuncs, as far as I'm concerned.
> > >
> > > The main and only upside for everything as kfunc is that we can change it.
> > > That's it.
> > >
> > > > But there
> > > > are a bunch of downsides, even if some of those might be lifted in the
> > > > future.
> > >
> > > imo ability to change outweighs all downsides, since downsides are fixable
> > > while inability to change is a burden.
> > >
> > > > The unfortunate thing is that end users that are meant to benefit from
> > > > all these helpers and them being "a standard API offering" are not
> > > > well represented on the BPF mailing list, unfortunately. And my
> > > > opinion and arguments as a proxy for theirs is clearly not enough.
> > >
> > > I also would like to hear what others on the list are thinking.
> > 
> > The last thing I'll say is that everything I've said above is really in
> > regards to the more general debate of helpers vs. kfuncs. Specifically
> > for the dynptrs being added in this set, I agree with Andrii that it's
> > arguably an odd user experience for certain platforms to support
> > > only specific parts of the dynptr API surface.
> > 
> > I'm not sure whether that's enough to warrant making them helpers
> > instead of kfuncs, but I do think it's not exactly an apples to apples
> > comparison with future features that today have no helper API presence.
> > Putting myself in the shoes of a dynptr user, I would be very surprised
> > and confused if all of a sudden, I couldn't use some of the core dynptr
> > APIs due to being on a platform that doesn't have kfunc support. My two
> > cents are that letting these dynptr functions stay as helpers, while
> > agreeing that kfuncs is the way forward (though I don't think Andrii
> > agrees with that even aside from just these dynptrs) is a reasonable
> > compromise that errs on the side of user-friendliness for dynptr users.
> 
> We already have this 'discrepancy' of both kfuncs and helpers for kptrs
> (bpf_obj_new vs bpf_kptr_xchg) and so far no complaints.
> Why dynptr is special?

Well, lack of usability in one case doesn't necessarily mean we should
allow it in another. That said, the "usability" gains from having a
helper really are minimal to the point of practically being negligible
anyways.

Part of me was trying to find a compromise here to move forward, but
honestly, I do agree with you that we should aggressively make
everything a kfunc unless we have a good reason not to, dynptr functions
included. So I'm willing to walk this suggestion back as well -- let's
just make these kfuncs.

> > FWIW, I also don't think it's fair or logical to argue at this point in
> > the game that dynptrs as a concept is inherently flawed. They were super
> > useful for enabling the user ringbuf map type, which is a key part of
> > rhone / user-space scheduling in sched_ext, and I wouldn't be surprised
> > if ghOSt started using it as well as a way to make scheduling decisions
> > without trapping into the kernel as well. Also, the attendees at LSFMM
> > generally seemed enthusiastic about dynptrs and user ringbuf, though I
> > admittedly don't know who's using either feature outside of rhone.
> 
> rhone doesn't have stability guarantees just like sched-ext doesn't have them.
> To drive that point rhone and sched-ext should really be using kfuncs.
> Otherwise somebody might point the finger at helpers and argue that
> this somehow makes sched-ext stable.

Also a reasonable point. My point above was really just a response to
your claim in [0] that dynptrs are flawed. It wasn't related to kfuncs
vs. helpers.

[0]: https://lore.kernel.org/all/20221216173526.y3e5go6mgmjrv46l@MacBook-Pro-6.local/

> 
> > That being said, to reiterate, I personally agree that once we take care
> > of a few more things for kfuncs, they're 100% the way forward over
> > helpers. BPF programs are kernel programs, no UAPI pain should be
> > necessary.
> 
> Similar arguments were made during sk_buff freeze... let's add few more fields
> that are going to be sooo useful and then we'll freeze sk_buff...
> dynptr is trying to be that special snow flake.

The main points of my initial response were not about dynptrs, they were
about how we define kfuncs. I agree there is nothing at all special
about dynptrs beyond the fact that they as a feature already have
helpers. Sure, let's add them as kfuncs. No reason to be beholden to the
UAPI restrictions.

> 
> bpf_rcu_read_lock was added as a kfunc. It's more fundamental than dynptr.
> bpf_obj_new is a kfunc too. Also more fundamental than dynptr.
> What is so special about dynptr that we need to make an exception for it?

See above.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-30 21:00                       ` David Vernet
@ 2022-12-31  0:42                         ` Alexei Starovoitov
  2023-01-03 11:43                           ` Daniel Borkmann
                                             ` (2 more replies)
  2023-01-04 18:43                         ` Andrii Nakryiko
  1 sibling, 3 replies; 57+ messages in thread
From: Alexei Starovoitov @ 2022-12-31  0:42 UTC (permalink / raw)
  To: David Vernet
  Cc: Andrii Nakryiko, Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Fri, Dec 30, 2022 at 03:00:21PM -0600, David Vernet wrote:
> > > 
> > > Taking bpf_get_current_task() as an example, I think it's better to have
> > > the debate be "should we keep supporting this / are users still using
> > > it?" rather than, "it's UAPI, there's nothing to even discuss". The
> > > point being that even if bpf_get_current_task() is still used, there may
> > > (and inevitably will) be other UAPI helpers that are useless and that we
> > > just can't remove.

Sorry, missed this question in the previous reply.
The answer is "it's UAPI, there's nothing to even discuss".
It doesn't matter whether bpf_get_current_task() is used heavily or not used at all.
The chance of breaking user space is what paralyzes the changes.
Any change to uapi header file is looked at with a magnifying glass.
There is no deprecation story for uapi.
The definition and semantics of bpf helpers are frozen _forever_.
And our uapi/bpf.h is not in good company:
ls -Sla include/uapi/linux/|head
-rw-r--r-- 1 ast users 331159 Nov  3 08:32 nl80211.h
-rw-r--r-- 1 ast users 265312 Dec 25 13:51 bpf.h
-rw-r--r-- 1 ast users 118621 Dec 25 13:51 v4l2-controls.h
-rw-r--r-- 1 ast users  99533 Dec 25 13:51 videodev2.h
-rw-r--r-- 1 ast users  86460 Nov 29 11:15 ethtool.h

"Freeze bpf helpers now" is a minimum we should do right now.
We need to take aggressive steps to freeze the growth of the whole uapi/bpf.h

Support for kfuncs was added in March 2021 in
commit e6ac2450d6de ("bpf: Support bpf program calling kernel function")
In almost 2 years we've learned a lot on how to verify them, how to use and extend them.
The way they're defined in the kernel was refactored ~3 times.
Right now do:
git grep 'BTF_ID_FLAGS(func'
to find all kfuncs.
Including Documentation/bpf/kfuncs.rst that you've made a great contribution to :)

When I mentioned 91 kfuncs in my previous reply I forgot to count another dozen kfuncs
in sched-ext and another dozen in hid-bpf that are not in mainline yet.
fuse-bpf will likely add their own kfuncs and so on.

Your 'todo list' for kfuncs is absolutely correct. Are kfuncs a perfect substitute
for helpers? No. They have downsides and we need to work on addressing downsides
instead of growing bpf.h further.
Are we ready to freeze bpf helpers? Absolutely yes.
"please use kfuncs instead of helpers" was our recommendation for 9 months or so
and now we need to make it an official rule.
For bpf noobs it's certainly easier to add new prog type, new map type, new helper,
but that gotta stop.
Last prog type we added in May 2021 and we should try hard not to add one anymore.
hid-bpf managed to do everything without new prog type.
sched-ext is not adding new prog type either.
This is great. We're breaking free from uapi constraints.

With map types we are not doing so well:
9330986c03006 (Joanne Koong            2021-10-27 16:45:00 -0700  943)  BPF_MAP_TYPE_BLOOM_FILTER,
583c1f420173f (David Vernet            2022-09-19 19:00:57 -0500  944)  BPF_MAP_TYPE_USER_RINGBUF,
c4bcfb38a95ed (Yonghong Song           2022-10-25 21:28:50 -0700  945)  BPF_MAP_TYPE_CGRP_STORAGE,
99c55f7d47c0d (Alexei Starovoitov      2014-09-26 00:16:57 -0700  946) };

I wish these last three were not added as stable uapi.
Right now we're getting close to defining new map types in an unstable way.
The bpf link lists and bpf rbtree are added through kfuncs
(aka new generation data structures, aka graph apis).
They don't have uapi values in 'enum bpf_map_type' and that's
the most important part about them.
Are we ready to freeze map types already? Probably not.
Upcoming qp-trie comes to mind that looks very hard to do without new map type.
I hope it will be the last stable map type.

> > > I think Michael Kerrisk's classic "Once upon an API" talk [1] provides a
> > > compelling, real-world example of this point:
> > > 
> > > [1]: https://kernel-recipes.org/en/2022/once-upon-an-api/

This is a great analogy. We need to learn from the "uapi pain" of others before us
instead of learning it the hard way through our own mistakes.

> I also don't want to hijack the larger conversation here to discuss
> documentation. I think we all agree that documentation is important. We
> already have a pretty good kfuncs docs page [0] anyways. In my
> subjective opinion, _the_ platform for documenting public / exported BPF
> symbols should have a well-defined documentation story, but yes, arguing
> for it to be a blocker is maybe a stretch.
...
> That being said, I really would hope that we could at least get some of
> the documentation story figured out. Even if it's just something as
> simple as spelling out a formal policy on our kfuncs docs page
> stipulating that you have to add a doxygen header and link it from a
> docs page, it would be nice to have some policy that puts kfuncs on a
> road to being as well documented as helpers.

The challenge of requiring the doc with a kfunc is that it can make kfunc
look stable.
We need the whole spectrum of kfuncs from pretty stable (like bpf_obj_new)
to something very unstable (like bpf_kfunc_call_test_mem_len_fail2).
We cannot require a doc with automatic .h for every kfunc.
Therefore right now all kfuncs are completely unstable and
stability story (including good doc and discoverability) is yet to be figured out.

> > 
> > > 5. Ideally we could improve the story for _defining_ kfuncs as well,
> > > though IMO it's already far less painful than defining helpers. It would
> > > be nice if you could just tag a kfunc with something like a __bpf_kfunc
> > > macro and it would do the following:
> > > 
> > > - Automatically disable the -Wmissing-prototypes warning. I doubt this
> > >   is possible without adding some compiler features that let you do
> > >   something like __attribute__(__nowarn__("Wmissing-prototypes")), so
> > >   maybe this isn't a hard blocker, but more of a medium / long-term
> > >   goal.
> > > - Add whatever other attributes we need for the kfuncs to be safe. For
> > >   example, 'noinline' and '__used'. Even if the symbols are global,
> > >   we'll probably need '__used' for LTO.
> > 
> > would be nice, but that didn't stop the existing 91 kfuncs from appearing
> > and already being used in production.
> > Yes. kfuncs are already used in production.
> 
> This is something that would literally only take like 1-2 patches
> anyways. I'm happy to do it so we don't have to waste cycles thinking
> about it as a blocker for anything.

Yeah. __bpf_kfunc tag would be nice to avoid this boilerplate.
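
As a rough idea of what such a tag could bundle, here is a minimal
sketch that compiles as plain C with GCC or Clang. The macro name
__example_kfunc and the function example_kfunc_add are hypothetical and
are not the eventual kernel definition. The sketch assumes the tag only
carries the 'used' and 'noinline' attributes, and that the
-Wmissing-prototypes warning still has to be silenced separately with
diagnostic pragmas, since an attribute cannot do that.

```c
/*
 * Hypothetical tag: keeps the symbol even if nothing in C references it
 * ('used') and prevents the compiler from inlining it away ('noinline').
 */
#define __example_kfunc __attribute__((used, noinline))

/*
 * The warning cannot be disabled by the attribute itself, so the
 * definition is wrapped in push/ignore/pop diagnostic pragmas.
 */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wmissing-prototypes"

__example_kfunc int example_kfunc_add(int a, int b)
{
	return a + b;
}

#pragma GCC diagnostic pop
```

With something like this, the per-function boilerplate shrinks to a
single tag on the definition, while the warning suppression can be
applied once around a whole group of kfuncs.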

In addition to your 'kfunc todo list' I can add:
6. introduce polymorphic kfuncs
We have helpers that have different implementation depending on prog type.
All kfuncs have one-to-one match so far.
We need kfuncs that would work differently depending on bpf prog context.

7. fine grained kfunc scope
Right now a set of available kfuncs is determined by prog type.
Same thing we do for helpers, but kfuncs already outpaced helpers.
We need to be able to define a set of kfuncs for a pair (prog type, attach location)
or something like that. hid-bpf and sched-ext folks asked for it.
That would be similar to EXPORT_SYMBOL namespaces, but with strict
enforcement for safety.
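
To illustrate item 7, here is a small user-space sketch of what a
(prog type, attach location) scope check could look like. Everything in
it is hypothetical: the enum values, the attach point strings, the kfunc
names and the table layout are invented for the example and do not
correspond to existing kernel data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical program types; not the real enum bpf_prog_type values. */
enum example_prog_type {
	EXAMPLE_PROG_TRACING,
	EXAMPLE_PROG_STRUCT_OPS,
};

/* One row: this kfunc may be called from this (prog type, attach point). */
struct example_kfunc_scope {
	enum example_prog_type prog_type;
	const char *attach_point;
	const char *kfunc_name;
};

/* Invented entries, only to give the lookup something to match. */
static const struct example_kfunc_scope example_scopes[] = {
	{ EXAMPLE_PROG_TRACING,    "example_hid_event", "example_hid_get_data" },
	{ EXAMPLE_PROG_STRUCT_OPS, "example_sched_ops", "example_sched_dispatch" },
};

/* Default deny: a kfunc is callable only if an exact row exists. */
bool example_kfunc_allowed(enum example_prog_type prog_type,
			   const char *attach_point, const char *kfunc_name)
{
	size_t i;

	for (i = 0; i < sizeof(example_scopes) / sizeof(example_scopes[0]); i++) {
		const struct example_kfunc_scope *s = &example_scopes[i];

		if (s->prog_type == prog_type &&
		    strcmp(s->attach_point, attach_point) == 0 &&
		    strcmp(s->kfunc_name, kfunc_name) == 0)
			return true;
	}
	return false;
}
```

Keying on the pair instead of the prog type alone is what would let two
subsystems share a prog type without seeing each other's kfuncs.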

> Another counterpoint to my initial claim that not having per-arg flags
> could be problematic is that there are certain things that are global in
> kfuncs that are also global in helpers despite having per-arg modifiers.
> For example, the fact that you can only have one OBJ_RELEASE argument.
> And yet another is the fact that none of the helpers we've added in the
> last year relied on having per-arg modifiers, so in practice it hasn't
> been a problem.

Right. Right now we have OBJ_RELEASE flag for args of helpers,
but that refactoring happened recently. Not that long ago
all helpers with release semantic were hard coded in verifier.c.
We're making progress in both helper and kfunc verification.
We should be able to combine the code eventually.

> Part of me was trying to find a compromise here to move forward, but
> honestly, I do agree with you that we should aggressively make
> everything a kfunc unless we have a good reason not to, dynptr functions
> included. So I'm willing to walk this suggestion back as well -- let's
> just make these kfuncs.

Agree that any hard policy like 'only kfuncs from now on' gotta have its limits.
Maybe there will be a strong reason to add a new helper one day,
so we can keep the door open a tiny bit for an exception,
but for dynptr...
There are kfuncs with dynptr already (bpf_verify_pkcs7_signature)
So precedent is already made.

> Also a reasonable point. My point above was really just a response to
> your claim in [0] that dynptrs are flawed. It wasn't related to kfuncs
> vs. helpers.
> 
> [0]: https://lore.kernel.org/all/20221216173526.y3e5go6mgmjrv46l@MacBook-Pro-6.local/

The flawed part of dynptr I was explaining here:
https://lore.kernel.org/all/20221225215210.ekmfhyczgubx4rih@macbook-pro-6.dhcp.thefacebook.com/

It's not that the whole concept of dynptr is flawed,
but using it as an abstraction on top of skb/xdp.
I don't believe that the extreme performance demands of xdp users are
compatible with 'lets verify in runtime' philosophy of dynptr.
I could be wrong. That's why I'm fine adding dynptr_on_top_of_xdp as kfuncs
and seeing it playing out, but certainly not as a stable helper.
iirc Martin and Kuba had concerns about bits of dynptr(skb | xdp) too.
With kfuncs we can iron out the issues while trying to use it whereas
with helpers we will be stuck for a long time in endless mailing list arguments.
It's a win-win for everyone to switch everything to kfuncs.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-31  0:42                         ` Alexei Starovoitov
@ 2023-01-03 11:43                           ` Daniel Borkmann
  2023-01-03 23:51                             ` Alexei Starovoitov
  2023-01-04  0:55                           ` Jakub Kicinski
  2023-01-04 18:44                           ` Andrii Nakryiko
  2 siblings, 1 reply; 57+ messages in thread
From: Daniel Borkmann @ 2023-01-03 11:43 UTC (permalink / raw)
  To: Alexei Starovoitov, David Vernet
  Cc: Andrii Nakryiko, Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu

On 12/31/22 1:42 AM, Alexei Starovoitov wrote:
> On Fri, Dec 30, 2022 at 03:00:21PM -0600, David Vernet wrote:
>>>>
>>>> Taking bpf_get_current_task() as an example, I think it's better to have
>>>> the debate be "should we keep supporting this / are users still using
>>>> it?" rather than, "it's UAPI, there's nothing to even discuss". The
>>>> point being that even if bpf_get_current_task() is still used, there may
>>>> (and inevitably will) be other UAPI helpers that are useless and that we
>>>> just can't remove.
> 
> Sorry, missed this question in the previous reply.
> The answer is "it's UAPI, there's nothing to even discuss".
> It doesn't matter whether bpf_get_current_task() is used heavily or not used at all.
> The chance of breaking user space is what paralyzes the changes.
> Any change to uapi header file is looked at with a magnifying glass.
> There is no deprecation story for uapi.
> The definition and semantics of bpf helpers are frozen _forever_.
> And our uapi/bpf.h is not in good company:
> ls -Sla include/uapi/linux/|head
> -rw-r--r-- 1 ast users 331159 Nov  3 08:32 nl80211.h
> -rw-r--r-- 1 ast users 265312 Dec 25 13:51 bpf.h
> -rw-r--r-- 1 ast users 118621 Dec 25 13:51 v4l2-controls.h
> -rw-r--r-- 1 ast users  99533 Dec 25 13:51 videodev2.h
> -rw-r--r-- 1 ast users  86460 Nov 29 11:15 ethtool.h
> 
> "Freeze bpf helpers now" is a minimum we should do right now.
> We need to take aggressive steps to freeze the growth of the whole uapi/bpf.h

Imho, freezing BPF helpers now is way too aggressive a step. One aspect which was
not discussed here is that unstable kfuncs will be a pain for user experience
compared to BPF helpers. Probably not for FB or G who maintain their own limited
set of kernels, but for all others. If there is valid reason that kfuncs will have
to change one way or another, then BPF applications using them will have to carry
the maintenance burden on their side to be able to support a variety of kernel
versions with working around the kfunc quirks. So you're essentially outsourcing
the problem from kernel to users, which will suck from a user experience (and add
to development cost on their side). Ofc there is interest in keeping changes to a
minimum, but it's not the same as BPF helpers where there is a significantly higher
guarantee that things continue to keep working going forward. Today in Cilium we
don't use any of the kfuncs, we might at some point when we see it necessary, but
likely to a limited degree if something cannot be solved as-is and only kfunc is present
as a solution. But again, from a UX it's not great having to know that things can
break anytime soon with newer kernels (things might already with verifier/LLVM
upgrade and kfunc potentially adds yet another level). Generally speaking, I'm not
against kfuncs but I suggest only making "freeze bpf helpers now" a soft freeze
with a path forward for promoting some of the kfuncs which have been around and
matured for a while and didn't need changes as stable BPF helpers to indicate their
maturity level when we see it fit. So it's not a hard "no", but possible promotion
when suitable.

[...]
> When I mentioned 91 kfuncs in my previous reply I forgot to count another dozen kfuncs
> in sched-ext and another dozen in hid-bpf that are not in mainline yet.
> fuse-bpf will likely add their own kfuncs and so on.

Agreed on the latter as well: from a bigger-picture view, they are mainly niche use
cases, both at this point and in the future.

> Your 'todo list' for kfuncs is absolutely correct. Are kfuncs a perfect substitute
> for helpers? No. They have downsides and we need to work on addressing downsides
> instead of growing bpf.h further.
> Are we ready to freeze bpf helpers? Absolutely yes.
> "please use kfuncs instead of helpers" was our recommendation for 9 months or so
> and now we need to make it an official rule.
> For bpf noobs it's certainly easier to add new prog type, new map type, new helper,
> but that gotta stop.
> Last prog type we added in May 2021 and we should try hard not to add one anymore.
> hid-bpf managed to do everything without new prog type.
> sched-ext is not adding new prog type either.
> This is great. We're breaking free from uapi constraints.
[...]

> The challenge of requiring the doc with a kfunc is that it can make kfunc
> look stable.
> We need the whole spectrum of kfuncs from pretty stable (like bpf_obj_new)
> to something very unstable (like bpf_kfunc_call_test_mem_len_fail2).
> We cannot require a doc with automatic .h for every kfunc.
> Therefore right now all kfuncs are completely unstable and
> stability story (including good doc and discoverability) is yet to be figured out.
[...]

Discoverability plus being able to know semantics from a user PoV to figure out when
workarounds for older/newer kernels are required to be able to support both kernels.
"something very unstable" sounds like it probably shouldn't even be merged in the
first place, but generally speaking a spectrum from pretty stable to very unstable
is imho repeating the same story as BPF helpers vs kfuncs. Saying a kfunc is 'pretty
stable' is kind of hinting to users that it's close to UAPI, but yet it's unstable.
It'll confuse even more. I'd rather have a path forward where those kfuncs get promoted
to actual BPF helpers by then where we go and say, that kfunc has proven itself in production
and from an API PoV that it is ready to be a proper BPF helper, and until this point
it's unstable, expect things to change, period. If a kfunc actually changed for the
kernels that users develop against, they need to go and figure out anyway as part of
their development process (/ maintenance cost).

> Agree that any hard policy like 'only kfuncs from now on' gotta have its limits.
> Maybe there will be a strong reason to add a new helper one day,
> so we can keep the door open a tiny bit for an exception,

+1

> but for dynptr...
> There are kfuncs with dynptr already (bpf_verify_pkcs7_signature)
> So precedent is already made.

bpf_verify_pkcs7_signature as kfunc also makes sense given wider-spread adoption (and
ideally as part of an OSS project) is yet to be seen.

>> Also a reasonable point. My point above was really just a response to
>> your claim in [0] that dynptrs are flawed. It wasn't related to kfuncs
>> vs. helpers.
>>
>> [0]: https://lore.kernel.org/all/20221216173526.y3e5go6mgmjrv46l@MacBook-Pro-6.local/
> 
> The flawed part of dynptr I was explaining here:
> https://lore.kernel.org/all/20221225215210.ekmfhyczgubx4rih@macbook-pro-6.dhcp.thefacebook.com/
> 
> It's not that the whole concept of dynptr is flawed,
> but using it as an abstraction on top of skb/xdp.
> I don't believe that the extreme performance demands of xdp users are
> compatible with 'lets verify in runtime' philosophy of dynptr.
> I could be wrong. That's why I'm fine adding dynptr_on_top_of_xdp as kfuncs
> and seeing it playing out, but certainly not as a stable helper.
> iirc Martin and Kuba had concerns about bits of dynptr(skb | xdp) too.

(My assumption was that you're adding it because you were planning to use
it internally?)

> With kfuncs we can iron out the issues while trying to use it whereas
> with helpers we will be stuck for long time in endless mailing list arguments.
> It's a win-win for everyone to switch everything to kfuncs.

Thanks,
Daniel


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-03 11:43                           ` Daniel Borkmann
@ 2023-01-03 23:51                             ` Alexei Starovoitov
  2023-01-04 14:25                               ` Daniel Borkmann
  2023-01-11 22:56                               ` Maxim Mikityanskiy
  0 siblings, 2 replies; 57+ messages in thread
From: Alexei Starovoitov @ 2023-01-03 23:51 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David Vernet, Andrii Nakryiko, Joanne Koong, bpf,
	Andrii Nakryiko, kernel-team, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu

On Tue, Jan 03, 2023 at 12:43:58PM +0100, Daniel Borkmann wrote:
> On 12/31/22 1:42 AM, Alexei Starovoitov wrote:
> > On Fri, Dec 30, 2022 at 03:00:21PM -0600, David Vernet wrote:
> > > > > 
> > > > > Taking bpf_get_current_task() as an example, I think it's better to have
> > > > > the debate be "should we keep supporting this / are users still using
> > > > > it?" rather than, "it's UAPI, there's nothing to even discuss". The
> > > > > point being that even if bpf_get_current_task() is still used, there may
> > > > > (and inevitably will) be other UAPI helpers that are useless and that we
> > > > > just can't remove.
> > 
> > Sorry, missed this question in the previous reply.
> > The answer is "it's UAPI, there's nothing to even discuss".
> > It doesn't matter whether bpf_get_current_task() is used heavily or not used at all.
> > The chance of breaking user space is what paralyzes the changes.
> > Any change to uapi header file is looked at with a magnifying glass.
> > There is no deprecation story for uapi.
> > The definition and semantics of bpf helpers are frozen _forever_.
> > And our uapi/bpf.h is not in a good company:
> > ls -Sla include/uapi/linux/|head
> > -rw-r--r-- 1 ast users 331159 Nov  3 08:32 nl80211.h
> > -rw-r--r-- 1 ast users 265312 Dec 25 13:51 bpf.h
> > -rw-r--r-- 1 ast users 118621 Dec 25 13:51 v4l2-controls.h
> > -rw-r--r-- 1 ast users  99533 Dec 25 13:51 videodev2.h
> > -rw-r--r-- 1 ast users  86460 Nov 29 11:15 ethtool.h
> > 
> > "Freeze bpf helpers now" is a minimum we should do right now.
> > We need to take aggressive steps to freeze the growth of the whole uapi/bpf.h
> 
> Imho, freezing BPF helpers now is way too aggressive step. One aspect which was
> not discussed here is that unstable kfuncs will be a pain for user experience
> compared to BPF helpers. Probably not for FB or G who maintain they own limited
> set of kernels, but for all others. If there is valid reason that kfuncs will have
> to change one way or another, then BPF applications using them will have to carry
> the maintenance burden on their side to be able to support a variety of kernel
> versions with working around the kfunc quirks. So you're essentially outsourcing
> the problem from kernel to users, which will suck from a user experience (and add
> to development cost on their side). 

It's actually the opposite.
A small company that wants to use BPF needs to have a workaround/plan B for
different kernels and different distros.
That's why cilium and others have to detect availability of helpers and bpf features.
One bpf prog for newer kernel and potentially completely different solution
for older kernels.
That's the biggest obstacle in bpf adoption: the required features are in
the latest kernels, but companies have to support older kernels too.
Now look at the problem from a different angle:
Detecting kfuncs is no different than detecting helpers.
The bpf user has to have a workaround when a helper/kfunc is not available.
In that sense stability of the helpers vs instability of kfuncs is irrelevant.
Both might not exist in a particular kernel.
So if cilium starts to use kfunc it won't be extra development cost and
bpf program writer experience using kfuncs vs using helpers is the same as well.
But with kfuncs we can solve this bpf adoption issue.
The helpers are not easily backportable and cannot be added in modules,
so a company's workarounds for older kernels are painful.
Kfuncs, by contrast, are trivially added in a module.

Let's take bpf_sock_destroy that Aditi wants to add as an example.
If it's done as a helper, cilium would need to wait for the next kernel release
and next distro release some years from now to actually use it at the customer site.
If bpf_sock_destroy is added as kfunc you can ship an extra kernel module
with just that kfunc to your customers. You can also attempt to convince a distro
that this module with kfuncs should be certified, since the same kfunc is in upstream kernel.
The customer can use cilium that relies on bpf_sock_destroy much sooner
and likely there won't be a need to develop a completely different workaround
for kernels without that kfunc.
There is no need to actually backport bpf_sock_destroy to older kernels.
As long as verifier infrastructure for kfuncs is feature rich all new kfuncs
can be shipped by distro or by cilium in a module without affecting
support contract of the main kernel.

The verification of kfuncs is still actively evolving, but in the not too distant future
people will be able to ship/add kfuncs without touching the kernel.
The faster the whole bpf community switches to a 'use kfuncs for everything' model,
the faster their verification becomes solid and the bpf adoption issue gets addressed.

> Ofc there is interest in keeping changes to a
> minimum, but it's not the same as BPF helpers where there is a significantly higher
> guarantee that things continue to keep working going forward. Today in Cilium we
> don't use any of the kfuncs, we might at some point when we see it necessary, but
> likely to a limited degree if sth cannot be solved as-is and only kfunc is present
> as a solution. But again, from a UX it's not great having to know that things can
> break anytime soon with newer kernels (things might already with verifier/LLVM
> upgrade and kfunc potentially adds yet another level). Generally speaking, I'm not
> against kfuncs but I suggest only making "freeze bpf helpers now" a soft freeze
> with a path forward for promoting some of the kfuncs which have been around and
> matured for a while and didn't need changes as stable BPF helpers to indicate their
> maturity level when we see it fit. So it's not a hard "no", but possible promotion
> when suitable.

The problem with a 'soft' freeze is that it's open to interpretation and abuse.
It feels to me you're saying that cilium is not using kfuncs and
therefore all cilium feature additions are ok to be done as helpers.
That doesn't sound fair to other bpf devs.

> 
> [...]
> > When I mentioned 91 kfunc in my previous reply I forgot to count another dozen kfuncs
> > in sched-ext and another dozen in hid-bpf that are not in mainline yet.
> > fuse-bpf will likely add their own kfuncs and so on.
> 
> For the latter agree as well given from a bigger picture, they are mainly niche use
> cases at this point and in future.

I'd argue that cilium's bpf_sock_destroy is just as niche as sched-ext scheduling kfuncs.

> 
> > Your 'todo list' for kfuncs is absolutely correct. Are kfuncs a perfect substitute
> > for helpers? No. They have downsides and we need to work on addressing downsides
> > instead of growing bpf.h further.
> > Are we ready to freeze bpf helpers? Absolutely yes.
> > "please use kfuncs instead of helpers" was our recommendation for 9 month or so
> > and now we need to make it an official rule.
> > For bpf noobs it's certainly easier to add new prog type, new map type, new helper,
> > but that gotta stop.
> > Last prog type we added in May 2021 and we should try hard not to add one anymore.
> > hid-bpf managed to do everything without new prog type.
> > sched-ext is not adding new prog type either.
> > This is great. We're breaking free from uapi constraints.
> [...]
> 
> > The challenge of requiring the doc with a kfunc is that it can make kfunc
> > look stable.
> > We need the whole spectrum of kfuncs from pretty stable (like bpf_obj_new)
> > to something very unstable (like bpf_kfunc_call_test_mem_len_fail2).
> > We cannot require a doc with automatic .h for every kfunc.
> > Therefore right now all kfuncs are completely unstable and
> > stability story (including good doc and discoverability) is yet to be figured out.
> [...]
> 
> Discoverability plus being able to know semantics from a user PoV to figure out when
> workarounds for older/newer kernels are required to be able to support both kernels.

Sounds like your concern is that there could be a kfunc that changed its semantics,
but kept the exact same name and arguments? Yeah. That would be bad, but we should prevent
such patches from landing. It's up to us to define sane and user friendly deprecation of kfuncs.

> "something very unstable" sounds like it probably shouldn't even be merged in the
> first place, but generally speaking a spectrum from pretty stable to very unstable

See bpf_kfunc_call_test_mem_len_fail2.
It's very much 'very unstable'. It's a test function.
Currently it's in net/bpf/test_run.c. It's there only because at that time we
didn't have the ability to add kfuncs in modules. Soon we will move all test kfuncs
from the main kernel to bpf_testmod.ko.

> is imho repeating the same story as BPF helpers vs kfuncs. Saying a kfunc is 'pretty
> stable' is kind of hinting to users that it's close to UAPI, but yet it's unstable.

correct.

> It'll confuse even more. I'd rather have a path forward where those kfuncs get promoted

Why confuse more? There are EXPORT_SYMBOL functions like kmalloc that are quite stable,
yet they can change.
EXPORT_SYMBOL_GPL is an exact analogy to kfunc.

> to actual BPF helpers by then where we go and say, that kfunc has proven itself in production
> and from an API PoV that it is ready to be a proper BPF helper, and until this point

"Proper BPF helper" model is broken.
static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;

is a hack that works only when compiler optimizes the code.
See gcc's attr(kernel_helper) workaround.
This 'proper helper' hack is the reason we cannot compile bpf programs with -O0.
And because it's uapi we cannot even fix this.
With kfuncs we will be able to compile with -O0 and debug bpf programs with better tools.
These tools don't exist yet, but we have a way forward whereas with helpers
we are stuck with -O2.
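
To make the shape of that hack concrete, here is a minimal userspace sketch
(an illustration only, not kernel or libbpf code). The declaration is the one
quoted above; helper_id_of_lookup() is a hypothetical name added purely to
expose the constant hidden in the pointer.

```c
#include <stdint.h>

/*
 * Helper-style declaration, as quoted above: the "function pointer" never
 * points at code. Its value is the helper ID (1 is BPF_FUNC_map_lookup_elem).
 */
static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;

/*
 * Hypothetical accessor, for illustration only: reads back the ID that is
 * carried in the pointer value.
 */
static uintptr_t helper_id_of_lookup(void)
{
	return (uintptr_t) bpf_map_lookup_elem;
}
```

With optimization enabled the compiler can fold the constant and emit a direct
call to ID 1. Without it, the pointer is typically loaded from memory and
called indirectly, which is not what the helper convention expects.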

> it's unstable, expect things to change, period. If a kfunc actually changed for the
> kernels that users develop against, they need to go and figure out anyway as part of
> their development process (/ maintenance cost).

The stable kfuncs will still use the same kfuncs mechanics: libbpf searches BTF
and supplies kernel with btf_id of that kfunc before loading the bpf prog.
We won't be hacking stable kfuncs into '= (void *) 1;'
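
As a rough model of that mechanism (a toy sketch, not libbpf's actual code):
the loader looks the kfunc up by name in the kernel's BTF and writes the
resulting id into the call instruction. Every toy_* name and the ids 101/102
are made up for illustration; only the opcode value 0x85 and
BPF_PSEUDO_KFUNC_CALL == 2 are taken from the uapi header.

```c
#include <stdint.h>
#include <string.h>

/* Toy stand-in for kernel BTF: function name -> BTF type id (ids invented). */
struct toy_btf_func {
	const char *name;
	int32_t btf_id;
};

static const struct toy_btf_func toy_vmlinux_btf[] = {
	{ "bpf_obj_new_impl", 101 },
	{ "bpf_dynptr_is_null", 102 },
};

/* Only the fields of a BPF call instruction that matter for this sketch. */
struct toy_call_insn {
	uint8_t code;    /* BPF_JMP | BPF_CALL */
	uint8_t src_reg; /* BPF_PSEUDO_KFUNC_CALL marks a kfunc call */
	int32_t imm;     /* BTF id of the kfunc once resolved */
};

#define TOY_BPF_JMP_CALL          0x85 /* BPF_JMP (0x05) | BPF_CALL (0x80) */
#define TOY_BPF_PSEUDO_KFUNC_CALL 2

/* Look a kfunc up by name; returns its id, or -1 if this "kernel" lacks it. */
static int32_t toy_find_kfunc(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(toy_vmlinux_btf) / sizeof(toy_vmlinux_btf[0]); i++)
		if (strcmp(toy_vmlinux_btf[i].name, name) == 0)
			return toy_vmlinux_btf[i].btf_id;
	return -1;
}

/* Fill in a call instruction for the named kfunc; 0 on success, -1 if absent. */
static int toy_resolve_kfunc_call(struct toy_call_insn *insn, const char *name)
{
	int32_t id = toy_find_kfunc(name);

	if (id < 0)
		return -1;
	insn->code = TOY_BPF_JMP_CALL;
	insn->src_reg = TOY_BPF_PSEUDO_KFUNC_CALL;
	insn->imm = id;
	return 0;
}
```

The -1 path is where a program would fall back to its workaround: a missing
kfunc shows up as a failed lookup at load time, not as a magic constant baked
into the source.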

> > Agree that any hard policy like 'only kfuncs from now on' gotta have its limits.
> > Maybe there will be a strong reason to add a new helper one day,
> > so we can keep the door open a tiny bit for an exception,
> 
> +1
> 
> > but for dynptr...
> > There are kfuncs with dynptr already (bpf_verify_pkcs7_signature)
> > So precedent is already made.
> 
> bpf_verify_pkcs7_signature as kfunc also makes sense given wider-spread adoption (and
> ideally as part of an OSS project) is yet to be seen.
> 
> > > Also a reasonable point. My point above was really just a response to
> > > your claim in [0] that dynptrs are flawed. It wasn't related to kfuncs
> > > vs. helpers.
> > > 
> > > [0]: https://lore.kernel.org/all/20221216173526.y3e5go6mgmjrv46l@MacBook-Pro-6.local/
> > 
> > The flawed part of dynptr I was explaining here:
> > https://lore.kernel.org/all/20221225215210.ekmfhyczgubx4rih@macbook-pro-6.dhcp.thefacebook.com/
> > 
> > It's not that the whole concept of dynptr is flawed,
> > but using it as an abstraction on top of skb/xdp.
> > I don't believe that the extreme performance demands of xdp users are
> > compatible with 'lets verify in runtime' philosophy of dynptr.
> > I could be wrong. That's why I'm fine adding dynptr_on_top_of_xdp as kfuncs
> > and seeing it playing out, but certainly not as a stable helper.
> > iirc Martin and Kuba had concerns about bits of dynptr(skb | xdp) too.
> 
> (My assumption was that you're adding it because you were planning to use
> it internally?)

The bar is not that some project wants to use this new feature, but rather that
the feature looks useful and may potentially be used. We as maintainers are
making this judgement call every single day.
When we make a mistake we should be able to fix it. With uapi we cannot fix our mistakes.

> > With kfuncs we can iron out the issues while trying to use it whereas
> > with helpers we will be stuck for long time in endless mailing list arguments.
> > It's a win-win for everyone to switch everything to kfuncs.
> 
> Thanks,
> Daniel


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-31  0:42                         ` Alexei Starovoitov
  2023-01-03 11:43                           ` Daniel Borkmann
@ 2023-01-04  0:55                           ` Jakub Kicinski
  2023-01-04 18:44                           ` Andrii Nakryiko
  2 siblings, 0 replies; 57+ messages in thread
From: Jakub Kicinski @ 2023-01-04  0:55 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David Vernet, Andrii Nakryiko, Joanne Koong, bpf,
	Andrii Nakryiko, kernel-team, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Song Liu

On Fri, 30 Dec 2022 16:42:13 -0800 Alexei Starovoitov wrote:
> iirc Martin and Kuba had concerns about bits of dynptr(skb | xdp) too.

FWIW yes, I withdrew my objections because Joanne showed me some changes
which reduced LOC in user space even with the limited functionality.
But dynptrs are not the efficient skb/xdp buf abstraction I was hoping
for :(


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-03 23:51                             ` Alexei Starovoitov
@ 2023-01-04 14:25                               ` Daniel Borkmann
  2023-01-04 18:59                                 ` Andrii Nakryiko
                                                   ` (2 more replies)
  2023-01-11 22:56                               ` Maxim Mikityanskiy
  1 sibling, 3 replies; 57+ messages in thread
From: Daniel Borkmann @ 2023-01-04 14:25 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David Vernet, Andrii Nakryiko, Joanne Koong, bpf,
	Andrii Nakryiko, kernel-team, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu

On 1/4/23 12:51 AM, Alexei Starovoitov wrote:
> On Tue, Jan 03, 2023 at 12:43:58PM +0100, Daniel Borkmann wrote:
>> On 12/31/22 1:42 AM, Alexei Starovoitov wrote:
>>> On Fri, Dec 30, 2022 at 03:00:21PM -0600, David Vernet wrote:
>>>>>>
>>>>>> Taking bpf_get_current_task() as an example, I think it's better to have
>>>>>> the debate be "should we keep supporting this / are users still using
>>>>>> it?" rather than, "it's UAPI, there's nothing to even discuss". The
>>>>>> point being that even if bpf_get_current_task() is still used, there may
>>>>>> (and inevitably will) be other UAPI helpers that are useless and that we
>>>>>> just can't remove.
>>>
>>> Sorry, missed this question in the previous reply.
>>> The answer is "it's UAPI, there's nothing to even discuss".
>>> It doesn't matter whether bpf_get_current_task() is used heavily or not used at all.
>>> The chance of breaking user space is what paralyzes the changes.
>>> Any change to uapi header file is looked at with a magnifying glass.
>>> There is no deprecation story for uapi.
>>> The definition and semantics of bpf helpers are frozen _forever_.
>>> And our uapi/bpf.h is not in a good company:
>>> ls -Sla include/uapi/linux/|head
>>> -rw-r--r-- 1 ast users 331159 Nov  3 08:32 nl80211.h
>>> -rw-r--r-- 1 ast users 265312 Dec 25 13:51 bpf.h
>>> -rw-r--r-- 1 ast users 118621 Dec 25 13:51 v4l2-controls.h
>>> -rw-r--r-- 1 ast users  99533 Dec 25 13:51 videodev2.h
>>> -rw-r--r-- 1 ast users  86460 Nov 29 11:15 ethtool.h
>>>
>>> "Freeze bpf helpers now" is a minimum we should do right now.
>>> We need to take aggressive steps to freeze the growth of the whole uapi/bpf.h
>>
>> Imho, freezing BPF helpers now is way too aggressive step. One aspect which was
>> not discussed here is that unstable kfuncs will be a pain for user experience
>> compared to BPF helpers. Probably not for FB or G who maintain they own limited
>> set of kernels, but for all others. If there is valid reason that kfuncs will have
>> to change one way or another, then BPF applications using them will have to carry
>> the maintenance burden on their side to be able to support a variety of kernel
>> versions with working around the kfunc quirks. So you're essentially outsourcing
>> the problem from kernel to users, which will suck from a user experience (and add
>> to development cost on their side).
> 
> It's actually the opposite.
> A small company that wants to use BPF needs to have a workaround/plan B for
> different kernels and different distros.
> That's why cilium and others have to detect availability of helpers and bpf features.
> One bpf prog for newer kernel and potentially completely different solution
> for older kernels.
> That's the biggest obstacle in bpf adoption: the required features are in
> the latest kernels, but companies have to support older kernels too.
> Now look at the problem from different angle:
> Detecting kfuncs is no different than detecting helpers.
> The bpf users has to have a workaround when helper/kfunc is not available.
> In that sense stability of the helpers vs instability of kfuncs is irrelevant.
> Both might not exist in a particular kernel.
> So if cilium starts to use kfunc it won't be extra development cost and
> bpf program writer experience using kfuncs vs using helpers is the same as well.

But that was not the point I was making. What you describe above is the baseline
cost which is there regardless of BPF helper vs kfunc.. detecting availability
and having a workaround for older kernel if needed. The added cost is if kfunc
changes over time for whichever valid reason, then you are essentially pushing
the maintenance cost _from kernel to users_ when they need to keep track of that
and implement workarounds specifically to make the kfunc work in their program
for a set of kernels they plan to support, which they otherwise would /not/ have
if it was a BPF helper. It raises the barrier on the user side. Similarly, if users
started out using a kfunc from a base kernel and it later gets removed, given
it's not stable, then a workaround (if possible) needs to be
implemented for newer kernels - probably a rare occasion but not impossible or
something that can be ruled out entirely. So the stability of the helpers vs
instability of kfuncs is relevant in that case, not for the case you describe
above, and that is extra development cost on user side. Generally, what I'm saying
is, there needs to be a path forward where we are still open for both instead of
completely freezing the former.

> But with kfuncs we can solve this bpf adoption issue.
> The helpers are not easily backportable and cannot be added in modules,
> so company's workarounds for older kernel are painful.
> While kfuncs are trivially added in a module.

Maybe to a small degree. Shipping an out-of-tree kernel module is often
a no-go under corp policy, and there's nothing you can do about it in such a case.

"trivially added" is a bit oversimplified as well.. depends on the kfunc of course,
but potentially painful in terms of having to work around various changing kernel
internals for your kfunc implementation and only possible if kernel actually exposes
the needed functionality to modules. While the adoption issue /can/ in some cases be
solved, I don't think solving it this way will be widely practical. Eventually
only time will solve it when everyone is on decent enough kernel as baseline, this
is what is there today at least for networking and tracing side where BPF is widely
adopted and its available framework big enough to solve many use cases.

Aside and independent of all that, kfuncs added in out of tree modules should be
discouraged. After all we want developers to contribute back to upstream kernel,
and for a very long time we've had the stance that no extra functionality should be
possible via out of tree module extensions.

> Let's take bpf_sock_destroy that Aditi wants to add as an example.
> If it's done as a helper the cilium would need to wait for the next kernel release
> and next distro release some years from now to actually use it at the customer site.

Yeap, with some distros in K8s space being better than others, for example, some like
Flatcar tend to be fairly up to date. With major LTS ones it takes 1+ years though.

> If bpf_sock_destroy is added as kfunc you can ship an extra kernel module
> with just that kfunc to your customers. You can also attempt to convince a distro
> that this module with kfuncs should be certified, since the same kfunc is in upstream kernel.
> The customer can use cilium that relies on bpf_sock_destroy much sooner
> and likely there won't be a need to develop a completely different workaround
> for kernels without that kfunc.

See above wrt modules. Some larger users which run their own DC infra also build
kernels for themselves, so in some cases it's possible and easier from corp policy
PoV to just cherry-pick upstream commits and roll them into their own kernel build
until they upgrade at some point to a base kernel where this comes by default. Some
of the distro vendors build "hw enablement" kernels for cloud providers and there
it is possible too to ask for backports on core functionality even if not in stable,
it's a slow process however.

[...]
>> Ofc there is interest in keeping changes to a
>> minimum, but it's not the same as BPF helpers where there is a significantly higher
>> guarantee that things continue to keep working going forward. Today in Cilium we
>> don't use any of the kfuncs, we might at some point when we see it necessary, but
>> likely to a limited degree if sth cannot be solved as-is and only kfunc is present
>> as a solution. But again, from a UX it's not great having to know that things can
>> break anytime soon with newer kernels (things might already with verifier/LLVM
>> upgrade and kfunc potentially adds yet another level). Generally speaking, I'm not
>> against kfuncs but I suggest only making "freeze bpf helpers now" a soft freeze
>> with a path forward for promoting some of the kfuncs which have been around and
>> matured for a while and didn't need changes as stable BPF helpers to indicate their
>> maturity level when we see it fit. So it's not a hard "no", but possible promotion
>> when suitable.
> 
> The problem with 'soft' freeze that it's open to interpretation and abuse.
> It feels to me you're saying that cilium is not using kfuncs and
> therefore all cilium features additions are ok to be done as helpers.
> That doesn't sound fair to other bpf devs.

I think you misread, let's not twist what I mentioned. All I was saying is that we
should keep the door open for both to continue to co-exist; both have a place, both
come with their advantages but also baggage. It's not that one is absolutely better
than the other, and that maintenance baggage is either on our side or pushed towards
users.

[...]
>> Discoverability plus being able to know semantics from a user PoV to figure out when
>> workarounds for older/newer kernels are required to be able to support both kernels.
> 
> Sounds like your concern is that there could be a kfunc that changed it semantics,
> but kept exact same name and arguments? Yeah. That would be bad, but we should prevent
> such patches from landing. It's up to us to define sane and user friendly deprecation of kfuncs.

Yes, that is a concern. New kfunc and deprecation with eventual removal of the old
one might be better in such case, agree.

[...]
>> is imho repeating the same story as BPF helpers vs kfuncs. Saying a kfunc is 'pretty
>> stable' is kind of hinting to users that it's close to UAPI, but yet it's unstable.
> 
> correct.
> 
>> It'll confuse even more. I'd rather have a path forward where those kfuncs get promoted
> 
> why confuse more? There are EXPORT_SYMBOL like kmalloc that are quite stable,
> yet they can change.
> EXPORT_SYMBOL_GPL is exact analogy to kfunc.

They are quite stable because they are used in lots of places in-tree and changing
would cause a ton of needless churn and merge conflicts for everyone, etc. You might
not always have this kind of visibility on usage of kfuncs. The data you have is
from your internal code base and what's in some of the larger OSS projects, but
certainly a more limited/biased view. So as with 'soft' freeze this is just as well open
to interpretation. "confuse more" because you declare it quite stable, yet not stable.
Why is there fear to make them proper uapi then with the given known guarantees? From
user side this guarantee is a good thing, not a bad thing. Mistakes were/are made all
the time and learned from. Imagine syscall API is not stable anymore. Would you invest
the cost to develop an application against it? Imho, it's one of BPF's strengths and
we should keep the door open, not close it.

>> to actual BPF helpers by then where we go and say, that kfunc has proven itself in production
>> and from an API PoV that it is ready to be a proper BPF helper, and until this point
> 
> "Proper BPF helper" model is broken.
> static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
> 
> is a hack that works only when compiler optimizes the code.
> See gcc's attr(kernel_helper) workaround.
> This 'proper helper' hack is the reason we cannot compile bpf programs with -O0.
> And because it's uapi we cannot even fix this
> With kfuncs we will be able to compile with -O0 and debug bpf programs with better tools.
> These tools don't exist yet, but we have a way forward whereas with helpers
> we are stuck with -O2.

Better debugging tools are needed either way, independent of -O0 or -O2. I don't
think -O0 is a requirement or barrier for that. It may open up possibilities for
new tools, but production is still running with -O2. The proper BPF helper model is
broken, but everyone relies on it, and will for a very long time to come,
whether we like it or not. There is a larger ecosystem around BPF devs outside of
kernel, and developers will use the existing means today. There are recommendations /
guidelines that we can provide but we also don't have control over what developers
are doing. Yet we should make their life easier, not harder. Better debugging
possibilities should cater to everyone.

Thanks,
Daniel


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-30  2:46                 ` Alexei Starovoitov
  2022-12-30 18:38                   ` David Vernet
@ 2023-01-04 18:43                   ` Andrii Nakryiko
  2023-01-04 19:44                     ` Alexei Starovoitov
  1 sibling, 1 reply; 57+ messages in thread
From: Andrii Nakryiko @ 2023-01-04 18:43 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Thu, Dec 29, 2022 at 6:46 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Dec 29, 2022 at 03:10:22PM -0800, Andrii Nakryiko wrote:
> > On Sun, Dec 25, 2022 at 1:52 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Tue, Dec 20, 2022 at 11:31:25AM -0800, Andrii Nakryiko wrote:
> > > > On Fri, Dec 16, 2022 at 9:35 AM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Mon, Dec 12, 2022 at 12:12:09PM -0800, Andrii Nakryiko wrote:
> > > > > >
> > > > > > There is no clean way to ever move from unstable kfunc to a stable helper.
> > > > >
> > > > > No clean way? Yet in the other email you proposed a way.
> > > > > Not pretty, but workable.
> > > > > I'm sure if ever there will be a need to stabilize the kfunc we will
> > > > > find a clean way to do it.
> > > >
> > > > You can't have stable and unstable helper definition in the same .c
> > > > file,
> > >
> > > of course we can.
> > > uapi helpers vs kfuncs argument is not a black and white comparison.
> > > It's not just stable vs unstable.
> > > uapi has strict rules and helpers in uapi/bpf.h have to follow those rules.
> > > While kfuncs in terms of stability are equivalent to EXPORT_SYMBOL_GPL.
> > > Meaning they are largely unstable.
> > > The upsteam kernel keeps changing those EXPORT_SYMBOL* functions,
> > > but distros can apply their own "stability rules".
> > > See Redhat's kABI, for example. A distro can guarantee a stability
> > > of certain EXPORT_SYMBOL* for their customers, but that doesn't bind
> > > upstream development.
> > >
> > > With uapi bpf helpers we have to guarantee their stability,
> > > while with kfuncs we can do whatever we want. Right now all kfuncs are
> > > unstable and to prove the point we changed them couple times already (nf_conn*).
> > > We also have bpf_obj_new_impl() kfunc which is equivalent to EXPORT_SYMBOL(__kmalloc).
> > > Hard to imagine more stable and more fundamental function.
> > > Of course we want bpf programs to use bpf_obj_new() and assume
> > > that it's going to be available in all future kernel releases.
> > > But at the same time we're not bound by uapi rules.
> > > bpf_obj_new() will likely be stable, but not uapi stable.
> > > If we screw up (or find better way to allocate memory in the future)
> > > we can change it.
> > > We can invent our own deprecation rules for stable-ish kfuncs and
> > > invent our more-unstable-than-current-unstable rules for kfuncs that
> > > are too much kernel release dependent.
> >
> > I'm talking about *mechanics* of having two incompatible definitions
> > of functions with the same name, not the *concept* of stable vs
> > unstable API. See [0] where I explained this as a reply to Joanne.
> >
> >   [0] https://lore.kernel.org/bpf/CAEf4BzbRQLEjAFUkzzStv0c0=O+r9iZ8hq33sJB2RtSuGrGAEA@mail.gmail.com/
>
> Mechanics for kfuncs are much better than for helpers.

>> *mechanics* of having two incompatible definitions
>> of functions with the same name,

but you made it clear that no unstable kfunc will ever be promoted to
BPF helper, so I see no point in arguing further

>
> extern bool bpf_dynptr_is_null(const struct bpf_dynptr *p) __ksym;
>
> will likely work with both gcc and clang.
> And if it doesn't we can fix it.
>
> While when gcc folks saw helpers:
>
> static bool (*bpf_dynptr_is_null)(const struct bpf_dynptr *p) = (void *) 777;
>
> they realized that it is a hack that abuses compiler optimizations.
> They even invented attr(kernel_helper) to workaround this issue.
> After a bunch of arguing gcc added support for this hack without attr,
> but it's going to be around forever... in gcc, in clang and in kernel.
> It's something that we could have fixed if it wasn't for uapi.
> Just one more example of an unfixable mistake that is causing issues
> to multiple projects.
> That's the core issue of kernel uapi rules: inability to fix mistakes.

This is the BPF ISA defining `call #N;` as a call to the helper with
ID N, and you agree that the ISA has to be stable, documented and
standardized, right?

Everything else is just how we expose those constants into C code and
how libbpf deals with them. Libbpf could support new attribute or even
extern-based convention, if necessary.

But it wasn't necessary for years and only was brought up during GCC's
attempt to invent a new convention here. And they successfully dealt
with this challenge.

>
> > >
> > > > But regardless, dynptr is modeled as black box with hidden state, and
> > > > its API surface area is bigger (offset, size, is null or not,
> > > > manipulations over those aspects; then there is skb/xdp abstraction to
> > > > be taken care of for generic read/write). It has a wider *generic* API
> > > > surface to be useful and effectively used.
> > >
> > > tbh dynptr as an abstraction of skb/xdp is not convincing.
> > > cilium created their own abstraction on top of skb and xdp and it's zero cost.
> > > While dynptr is not free, so xdp users unlikely to use dynptr(xdp) for perf reasons.
> > > So I suspect it won't be a success story in the long run, but we
> > > can certainly try it out since they will be kfuncs and can be deprecated
> > > if maintenance outweighs the number of users.
> > >
> > > > All *two* of them, bpf_get_current_task() and
> > > > bpf_get_current_task_btf(), right? They are 2 years apart.
> > > > bpf_get_current_task() was added before BTF era. It is still actively
> > > > used today and there is nothing wrong with it. It works on older
> > > > kernels just fine, even with BPF CO-RE (as backporting a few simple
> > > > patches to generate BTF is simple and easy; not so much with BPF
> > > > verifier changes to add native BTF support). I don't see much problem
> > > > having both, they are not maintenance burden.
> > >
> > > bpf_get_current_pid_tgid
> > > bpf_get_current_uid_gid
> > > bpf_get_current_comm
> > > bpf_get_current_task
> > > bpf_get_current_task_btf
> > > bpf_get_current_cgroup_id
> > > bpf_get_current_ancestor_cgroup_id
> > > bpf_skb_ancestor_cgroup_id
> > > bpf_sk_cgroup_id
> > > bpf_sk_ancestor_cgroup_id
> > >
> > > _are_ a maintenance burden.
> >
> > bpf_get_current_pid_tgid() was added in 2015, slightly and
> > uncritically touched by Daniel in 2016 and we never had any problems
> > with it ever since. No updates, no maintenance. I don't remember much
> > problem with other helpers in this list, but I didn't check each one.
> >
> > But we certainly have a different understanding of what "maintenance
> > burden" is. If some code doesn't require constant change and doesn't
> > prevent changes in some other parts of the system, it's not a
> > maintenance burden.
>
> As I said it's not about working today. If one doesn't touch code

Where do you see "working today"? Quoting myself, just a few lines above:

> > If some code doesn't require constant change and doesn't
> > prevent changes in some other parts of the system, it's not a
> > maintenance burden.

Which of those helpers prevent us from doing something new? Which ones
are slowing us down and by how much?

> it will keep working.
> It's about being able to change it.
> The uapi bits we simply cannot change.

Yes, we won't change existing helpers, but we can add new ones if we
need to extend them. That's how APIs work. Yes, new APIs need careful
consideration when we design and implement them. Yes, mistakes do
happen, that's just a fact of life and par for the course in software
development. Yes, we have to live with those mistakes. Nothing changed
about that.

But somehow libraries and kernel still produce stable APIs and
maintain them because they clearly provide benefits to end users.

>
> >
> > > The verifier got smarter and we could have removed all of them,
> > > but uapi rules make it impossible.
> > > The bpf prog could have been enabled to access all these task_struct
> > > and cgroup fields directly. Likely without any kfuncs.
> > >
> > > bpf_send_signal vs bpf_send_signal_thread
> > > bpf_jiffies64 vs bpf_this_cpu_ptr
> > > etc
> > > there are plenty examples where uapi bpf helpers became a burden.
> > > They are working and will keep working, but we could have done
> > > much better job if not for uapi.
> > > These are the examples where uapi rules are too strong for bpf development.
> > > Our pace of adding new features is high.
> > > The kernel uapi rules are too strict for us.
> >
> > I'm familiar with the burden of maintaining API stability and
> > backwards compat. But it's not just about the library/system
>
> libbpf 1.0 wasn't the smoothest example of deprecation.
> But we still did it despite all kinds of negative flame.
> With uapi helpers we cannot do any of that. No deprecation schemes.
> While kfuncs allow innovation.

We'll get the same amount of flame when we try to change a kfunc
that's widely adopted.

You are missing the point, though, in trying to pit BPF helpers
against kfuncs. I'm not saying it has to always be BPF helpers and
never kfuncs. Both have the right to exist. My point is that in some
cases BPF helpers are better, in others - kfuncs are more adequate.
Why is this so controversial?

>
> > developer's convenience and burden, it's also about the end user's
> > experience and convenience. BPF tool developers really appreciate when
> > there are a few fewer quirks to remember and work around across kernel
> > versions, configurations, architectures, etc. It's the pain that
> > kernel engineers working on BPF bleeding-edge don't experience in the
> > BPF selftests environment.
>
> There is a trade off between users and developers. We want to make user
> experience as smooth as possible while preserving the speed of development
> for the kernel. uapi is in the way of that.
>
> > >
> > > At one point DaveM declared freeze on sizeof(struct sk_buff).
> > > It was a difficult, but correct decision.
> > > We have to declare freeze on bpf helpers.
> > > 211 helpers that have to be maintained forever is a huge burden.
> >
> > I still didn't get why we have to freeze anything and how exactly
> > helpers are a burden.
> >
> > But especially in this specific case of few simple dynptr helpers,
> > especially given that other generic dynptr APIs are already BPF helpers. I
> > just don't get it and honestly all I see from this discussion is that
> > you've made up your mind and there is nothing that can be done to
> > convince you.
> >
> > The only "BPF helpers are stable and thus a burden" argument is just
> > not convincing and I'd even say is mostly false. There are no upsides
> > to having dynptr helpers as kfuncs, as far as I'm concerned.
>
> The main and only upside for everything as kfunc is that we can change it.
> That's it.

And that's not reason enough to outlaw new BPF helpers wholesale.

>
> > But there
> > are a bunch of downsides, even if some of those might be lifted in the
> > future.
>
> imo ability to change outweighs all downsides, since downsides are fixable
> while inability to change is a burden.

I'm curious what's the mechanism when people disagree with your "imo"
and have good reasons for that? Is there a scenario where opinion
other than yours prevails even if you disagree with it?


>
> > The unfortunate thing is that end users that are meant to benefit from
> > all these helpers and them being "a standard API offering" are not
> > well represented on the BPF mailing list, unfortunately. And my
> > opinion and arguments as a proxy for theirs is clearly not enough.
>
> I also would like to hear what others on the list are thinking.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-30 21:00                       ` David Vernet
  2022-12-31  0:42                         ` Alexei Starovoitov
@ 2023-01-04 18:43                         ` Andrii Nakryiko
  2023-01-04 19:51                           ` Alexei Starovoitov
  1 sibling, 1 reply; 57+ messages in thread
From: Andrii Nakryiko @ 2023-01-04 18:43 UTC (permalink / raw)
  To: David Vernet
  Cc: Alexei Starovoitov, Joanne Koong, bpf, Andrii Nakryiko,
	kernel-team, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu

On Fri, Dec 30, 2022 at 1:00 PM David Vernet <void@manifault.com> wrote:
>
> On Fri, Dec 30, 2022 at 11:31:12AM -0800, Alexei Starovoitov wrote:
> > On Fri, Dec 30, 2022 at 12:38:55PM -0600, David Vernet wrote:
> > > On Thu, Dec 29, 2022 at 06:46:41PM -0800, Alexei Starovoitov wrote:
> > > > On Thu, Dec 29, 2022 at 03:10:22PM -0800, Andrii Nakryiko wrote:
> > > > > On Sun, Dec 25, 2022 at 1:52 PM Alexei Starovoitov
> > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > >

[...]

> I don't think the fact that we'll never be done is a valid counterpoint
> to "are we ready now"? The first iteration of kfuncs was definitely not
> in a good enough state to freeze all helpers. The usability of kfuncs
> has improved drastically since then. The question isn't "when will we be at
> a complete stopping point?", it's, "are we sufficiently ready now?".
>
> > It's a bit of wishful thinking that addressing today's problem will somehow
> > make everything nice and clean and then we will be ready to stop adding helpers.
> > We'll keep improving the infra for years to come.
> > There is no "end of the road" sign.
>
> Yes, there's no end of the road, but my point is that there are still
> pieces that we know we need to change, and which we know are temporary
> (__sz and __k being the main examples).
>
> *That being said*: I completely admit that this is all subjective. From
> a technical standpoint, there is nothing stopping us from freezing
> helpers. And honestly, I don't disagree with you that getting out of
> UAPI immediately and forever is a huge positive; possibly even to the

"huge positive" for whom? for happy kernel engineers that only care
about the latest version of everything in BPF selftests or
samples/bpf? Sure. But let's think about the poor end user. As a
hypothetical and trivial example, take dynptr and
bpf_dynptr_is_null(). Basic dynptr is usable in an earlier kernel
release than the bpf_dynptr_is_null() helper, so you could write a BPF
app that uses some work-arounds instead of bpf_dynptr_is_null() on an
old kernel, but happily switches to the new helper/kfunc, if available.
With BPF helpers I can detect this on the BPF side, completely
transparently to the user-space part of my app:
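struct bpf_dynptr dptr = ...;
bool is_null = false;

/* the helper's ID is in enum bpf_func_id only if the kernel has the helper */
if (bpf_core_enum_value_exists(enum bpf_func_id, BPF_FUNC_dynptr_is_null)) {
    is_null = bpf_dynptr_is_null(&dptr);
} else {
    /* fallback: a dynptr with no underlying data pointer is null */
    struct bpf_dynptr_kern *kdptr = (void *)&dptr;
    is_null = !BPF_CORE_READ(kdptr, data);
}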

How do you detect the existence of a kfunc today? Preferably without
doing extra work in user-space.

Now, let's say a kfunc changes its signature. Show me a short example
of how you deal with that in BPF C code.


Think about sched_ext. Right now it's so bleeding edge that you have
to assume the very latest and freshest kernel code. So you know all
the kfuncs that you need should exist otherwise sched_ext doesn't work
at all. Ok, happy place.

Now a year or two passes by. Some kfuncs are added, some are changed.
We still believe that BPF CO-RE (compile once - run everywhere) is
good and we don't want to compile and distribute multiple versions of
BPF application, right? You'll want to do some extra (or more
performant) stuff if the kernel is recent and has some new kfunc, but
fall back to some default suboptimal behavior otherwise. How do you do
that in a simple and straightforward way? Even worse, what if some
critical kfunc is changed between kernel versions and you do *need* to
support both versions? Think about those aspects, because sched_ext
will run into them almost inevitably soon after its inclusion into the
kernel.


One way or another there are technical solutions of varying degrees
of creativity. And I'm actually not sure if I have a solution
for kfunc signature change at all. Without BTF we could use two
separate .c files and statically link them together, which would work
because extern is untyped in pure C. But with BPF static linking we do
have BTF information for each extern, and those BTF types will be
incompatible for the same extern func.

We can probably come up with some hacks and conventions, as usual, but
better start thinking about them now.

But hopefully you can empathize a bit more with poor end users that
have to do hacks like this, and see why it matters to have the
bpf_dynptr API defined as stable BPF helpers, with no extra
dependencies on BTF in the kernel, on kfunc support for the
architecture, or on whatever other hidden dependencies we all forgot
or haven't thought about yet (believe me, there will always be users
trying to do something on some embedded system with "unusual" kernel
configs or architectures).


But again. Let me repeat my point *again*. BPF helpers and kfuncs are
not mutually exclusive, both can and should exist and evolve. That's
one of the main points which is somehow eluding this conversation.

> point that it warrants us just doing it now. More below.
>
> >
> > > 4. Getting rid of KF_TRUSTED_ARGS and making that the default.
> >
> > We've been talking about this possibility for months.
> > Are you suggesting to keep adding helpers for another year or so?
>
> I think that kfuncs should be the norm for the vast majority of things
> being added, and hopefully for everything (I'm going to walk back my
> suggestion of adding these new dynptr functions as helpers). Honestly,
> my point was really just that I think the API for defining kfuncs needs
> to be improved before we can totally and completely freeze helpers due
> to the fact that we have __sz and __k, and don't have a consistent
> documentation story. That being said, __sz and __k are there, they work,
> and as you and I have both said at this point, whether or not they're
> "blockers" is subjective.
>
> So my answer to your question of "should we add helpers for another year
> or so" in my last reply would have been "absolutely not, unless we truly
> have no choice because of the lack of per-arg flags". After reading your
> reply, if you're worried that that policy won't be strictly enforced
> (meaning that we'll end up having to add helpers that easily could have
> just been kfuncs) then I agree that we should just do the hard freeze
> now. We've de-facto been doing that anyways for the last year.
>
> That being said, I really would hope that we could at least get some of
> the documentation story figured out. Even if it's just something as
> simple as spelling out a formal policy on our kfuncs docs page
> stipulating that you have to add a doxygen header and link it from a
> docs page, it would be nice to have some policy that puts kfuncs on a
> road to being as well documented as helpers.
>
> > We already have 91 kfuncs and 211 helpers.
> > If we were not asking all developers to use kfuncs we would have had 300+ helpers.
>
> Agreed that this would have been a _very_ unfortunate outcome.

Again, this is a false dichotomy. Just because there are 91 kfuncs
(out of which 25-ish are test-only kfuncs that should really be in
bpf_testmod, but somehow that doesn't bother anyone), doesn't mean
they would all have had to be done as BPF helpers. dynptr is a stable
generic concept, it should be done as BPF helpers. ct, xfrm, hid-bpf
are interfaces to kernel objects, they are a perfect fit for kfuncs.

There is no contradiction there. Just some questionable conclusions.

>
> >
> > > 5. Ideally we could improve the story for _defining_ kfuncs as well,
> > > though IMO it's already far less painful than defining helpers. It would
> > > be nice if you could just tag a kfunc with something like a __bpf_kfunc
> > > macro and it would do the following:
> > >
> > > - Automatically disable the -Wmissing-prototypes warning. I doubt this
> > >   is possible without adding some compiler features that let you do
> > >   something like __attribute__(__nowarn__("Wmissing-prototypes")), so
> > >   maybe this isn't a hard blocker, but more of a medium / long-term
> > >   goal.
> > > - Add whatever other attributes we need for the kfuncs to be safe. For
> > >   example, 'noinline' and '__used'. Even if the symbols are global,
> > >   we'll probably need '__used' for LTO.
> >
> > would be nice, but that didn't stop the existing 91 kfuncs from appearing
> > and already being used in production.
> > Yes. kfuncs are already used in production.
>
> This is something that would literally only take like 1-2 patches
> anyways. I'm happy to do it so we don't have to waste cycles thinking
> about it as a blocker for anything.
>
> >
> > > Overall, my point is really that we still have some homework to do
> > > before we can just unilaterally freeze helpers. We're getting close, but
> > > IMO not quite there yet.
> >
> > 91 vs 211 tells a different story.
>
> Yeah, the fact that we have 91 kfuncs is strong evidence that kfuncs are
> already in a good-enough place to just freeze helpers.
>
> Another counterpoint to my initial claim that not having per-arg flags
> could be problematic is that there are certain things that are global in
> kfuncs that are also global in helpers despite having per-arg modifiers.
> For example, the fact that you can only have one OBJ_RELEASE argument.
> And yet another is the fact that none of the helpers we've added in the
> last year relied on having per-arg modifiers, so in practice it hasn't
> been a problem.

You are conflating "single flag per func" with "which arg it belongs
to doesn't matter". There could be only one OBJ_RELEASE, but we need
to know which argument it applies to. Sure, today we take a shortcut
and say it should apply to the only ref_obj_id-enabled argument.

But think about some hypothetical kfunc:

int do_something_weird(struct bpf_dynptr *dptr1, struct bpf_dynptr *dptr2)

If it has OBJ_RELEASE, which arg (dptr1 or dptr2) does it apply to?

OBJ_RELEASE is still an argument flag.
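
To make the ambiguity concrete, here is a minimal host-side C sketch. It
is not verifier code: struct arg_desc, find_release_arg() and both
argument tables are hypothetical names made up for illustration. It only
models the shortcut described above, where a function-wide release flag
is taken to apply to the only reference-carrying argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-argument description; not the verifier's data model. */
struct arg_desc {
	const char *name;
	bool has_ref_obj_id;	/* argument carries a releasable reference */
};

/*
 * Resolve which argument a function-wide "release" flag applies to,
 * using the shortcut of picking the only reference-carrying argument.
 * Returns the argument index, or -1 when there is no candidate or when
 * several candidates make the function-wide flag ambiguous.
 */
static int find_release_arg(const struct arg_desc *args, size_t nargs)
{
	int found = -1;

	for (size_t i = 0; i < nargs; i++) {
		if (!args[i].has_ref_obj_id)
			continue;
		if (found >= 0)
			return -1;	/* second candidate: ambiguous */
		found = (int)i;
	}
	return found;
}

/* A release-style function with a single referenced argument. */
static const struct arg_desc release_one_args[] = {
	{ "obj", true },
	{ "flags", false },
};

/* The hypothetical do_something_weird(dptr1, dptr2), both referenced. */
static const struct arg_desc weird_args[] = {
	{ "dptr1", true },
	{ "dptr2", true },
};

/* Small self-check of the two cases discussed in the text. */
static void check_release_examples(void)
{
	assert(find_release_arg(release_one_args, 2) == 0);
	assert(find_release_arg(weird_args, 2) == -1);
}
```

In the single-candidate table the flag resolves to argument 0. In the
two-dynptr table nothing in a function-wide flag says which argument is
meant, which is why the information is per-argument in nature.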

>
> I think it's fair to say that if you just look at the data instead of
> from an "API cleanliness" perspective, having per-arg modifiers is not a
> blocker. Data wins over subjectivity, so as mentioned above, I'm willing
> to change my mind about per-arg modifiers being a blocker, especially
> with __sz and __k.
>

[...]

> > > I'm not sure whether that's enough to warrant making them helpers
> > > instead of kfuncs, but I do think it's not exactly an apples to apples
> > > comparison with future features that today have no helper API presence.
> > > Putting myself in the shoes of a dynptr user, I would be very surprised
> > > and confused if all of a sudden, I couldn't use some of the core dynptr
> > > APIs due to being on a platform that doesn't have kfunc support. My two
> > > cents are that letting these dynptr functions stay as helpers, while
> > > agreeing that kfuncs is the way forward (though I don't think Andrii
> > > agrees with that even aside from just these dynptrs) is a reasonable
> > > compromise that errs on the side of user-friendliness for dynptr users.
> >
> > We already have this 'discrepancy' of both kfuncs and helpers for kptrs
> > (bpf_obj_new vs bpf_kptr_xchg) and so far no complaints.
> > Why is dynptr special?
>
> Well, lack of usability in one case doesn't necessarily mean we should
> allow it in another. That said, the "usability" gains from having a
> helper really are minimal to the point of practically being negligible
> anyways.

Depends on perspective. If I was some humble dev trying to build a
BPF-based tool that should work on x86, arm64, s390x, and riscv (or
whatever other architecture), and the dynptr API is based only on
kfuncs, I'm screwed. I can't sponsor or do kfunc support for my
favorite architecture, I'm stuck waiting for this to be done by
someone some time, if ever.

And all because we arbitrarily decided not to do a BPF helper.

From a good engineering perspective, if some functionality doesn't
require dependency X to work in principle, it shouldn't depend on that
feature X. Even if feature X is beloved BTF.

>
> Part of me was trying to find a compromise here to move forward, but
> honestly, I do agree with you that we should aggressively make
> everything a kfunc unless we have a good reason not to, dynptr functions
> included. So I'm willing to walk this suggestion back as well -- let's
> just make these kfuncs.

How about the policy of "let's use common sense and decide on what's
best in each particular case"? Isn't that the best policy? Blanket
statements and hard-defined rules are easy to follow, but they do not
produce best outcomes (IMO).


>
> > > FWIW, I also don't think it's fair or logical to argue at this point in
> > > the game that dynptrs as a concept is inherently flawed. They were super
> > > useful for enabling the user ringbuf map type, which is a key part of
> > > rhone / user-space scheduling in sched_ext, and I wouldn't be surprised
> > > if ghOSt started using it as well as a way to make scheduling decisions
> > > without trapping into the kernel as well. Also, the attendees at LSFMM
> > > generally seemed enthusiastic about dynptrs and user ringbuf, though I
> > > admittedly don't know who's using either feature outside of rhone.
> >
> > rhone doesn't have stability guarantees just like sched-ext doesn't have them.
> > To drive that point rhone and sched-ext should really be using kfuncs.
> > Otherwise somebody might point the finger at helpers and argue that
> > this somehow makes sched-ext stable.
>
> Also a reasonable point. My point above was really just a response to
> your claim in [0] that dynptrs are flawed. It wasn't related to kfuncs
> vs. helpers.
>
> [0]: https://lore.kernel.org/all/20221216173526.y3e5go6mgmjrv46l@MacBook-Pro-6.local/
>
> >
> > > That being said, to reiterate, I personally agree that once we take care
> > > of a few more things for kfuncs, they're 100% the way forward over
> > > helpers. BPF programs are kernel programs, no UAPI pain should be
> > > necessary.
> >
> > Similar arguments were made during sk_buff freeze... let's add few more fields
> > that are going to be sooo useful and then we'll freeze sk_buff...
> > dynptr is trying to be that special snow flake.
>
> The main points of my initial response were not about dynptrs, they were
> about how we define kfuncs. I agree there is nothing at all special
> about dynptrs beyond the fact that they as a feature already have
> helpers. Sure, let's add them as kfuncs. No reason to be beholden to the
> UAPI restrictions.
>
> >
> > bpf_rcu_read_lock was added as a kfunc. It's more fundamental than dynptr.
> > bpf_obj_new is a kfunc too. Also more fundamental than dynptr.
> > What is so special about dynptr that we need to make an exception for it?
>
> See above.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2022-12-31  0:42                         ` Alexei Starovoitov
  2023-01-03 11:43                           ` Daniel Borkmann
  2023-01-04  0:55                           ` Jakub Kicinski
@ 2023-01-04 18:44                           ` Andrii Nakryiko
  2023-01-04 19:56                             ` Alexei Starovoitov
  2 siblings, 1 reply; 57+ messages in thread
From: Andrii Nakryiko @ 2023-01-04 18:44 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David Vernet, Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Fri, Dec 30, 2022 at 4:42 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Dec 30, 2022 at 03:00:21PM -0600, David Vernet wrote:
> > > >
> > > > Taking bpf_get_current_task() as an example, I think it's better to have
> > > > the debate be "should we keep supporting this / are users still using
> > > > it?" rather than, "it's UAPI, there's nothing to even discuss". The
> > > > point being that even if bpf_get_current_task() is still used, there may
> > > > (and inevitably will) be other UAPI helpers that are useless and that we
> > > > just can't remove.
>
> Sorry, missed this question in the previous reply.

[...]

> > Part of me was trying to find a compromise here to move forward, but
> > honestly, I do agree with you that we should aggressively make
> > everything a kfunc unless we have a good reason not to, dynptr functions
> > included. So I'm willing to walk this suggestion back as well -- let's
> > just make these kfuncs.
>
> Agree that any hard policy like 'only kfuncs from now on' gotta have its limits.
> Maybe there will be a strong reason to add a new helper one day,
> so we can keep the door open a tiny bit for an exception,
> but for dynptr...
> There are kfuncs with dynptr already (bpf_verify_pkcs7_signature)
> So precedent is already made.

bpf_verify_pkcs7_signature() is using dynptr as a pointer to memory.
It's a totally valid and intended use case, to pass memory area of
statically unknown size, yes.

But that's very different from having basic dynptr helpers like
is_null() and trim/advance as kfuncs. Such helpers are stable, they
manipulate generic attributes of dynptr: size, offset, underlying
memory pointer. There is nothing unstable and potentially changing
about them.

>
> > Also a reasonable point. My point above was really just a response to
> > your claim in [0] that dynptrs are flawed. It wasn't related to kfuncs
> > vs. helpers.
> >
> > [0]: https://lore.kernel.org/all/20221216173526.y3e5go6mgmjrv46l@MacBook-Pro-6.local/
>
> The flawed part of dynptr I was explaining here:
> https://lore.kernel.org/all/20221225215210.ekmfhyczgubx4rih@macbook-pro-6.dhcp.thefacebook.com/
>
> It's not that the whole concept of dynptr is flawed,
> but using it as an abstraction on top of skb/xdp.

From original exchange:

> > > So just because there is no perfect way to
> > > handle all the SKB/XDP physical non-contiguity, doesn't mean that the
> > > dynptr concept itself is flawed or not well thought out. It's just
> >
> > I think that's exactly what it means. dynptr concept is flawed.

Must be a lot of typos in here ;) because as written it clearly states
that the whole concept of dynptr is flawed.

But I'm glad we are finally on the same page at least on this point now.


> I don't believe that the extreme performance demands of xdp users are
> compatible with 'lets verify in runtime' philosophy of dynptr.
> I could be wrong. That's why I'm fine adding dynptr_on_top_of_xdp as kfuncs
> and seeing it playing out, but certainly not as a stable helper.
> iirc Martin and Kuba had concerns about bits of dynptr(skb | xdp) too.
> With kfuncs we can iron out the issues while trying to use it whereas
> with helpers we will be stuck for long time in endless mailing list arguments.
> It's a win-win for everyone to switch everything to kfuncs.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-04 14:25                               ` Daniel Borkmann
@ 2023-01-04 18:59                                 ` Andrii Nakryiko
  2023-01-04 20:03                                   ` Alexei Starovoitov
  2023-01-04 19:37                                 ` Alexei Starovoitov
  2023-01-04 20:50                                 ` David Vernet
  2 siblings, 1 reply; 57+ messages in thread
From: Andrii Nakryiko @ 2023-01-04 18:59 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Alexei Starovoitov, David Vernet, Joanne Koong, bpf,
	Andrii Nakryiko, kernel-team, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu

On Wed, Jan 4, 2023 at 6:25 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 1/4/23 12:51 AM, Alexei Starovoitov wrote:
> > On Tue, Jan 03, 2023 at 12:43:58PM +0100, Daniel Borkmann wrote:
> >> On 12/31/22 1:42 AM, Alexei Starovoitov wrote:
> >>> On Fri, Dec 30, 2022 at 03:00:21PM -0600, David Vernet wrote:
> >>>>>>
> >>>>>> Taking bpf_get_current_task() as an example, I think it's better to have
> >>>>>> the debate be "should we keep supporting this / are users still using
> >>>>>> it?" rather than, "it's UAPI, there's nothing to even discuss". The
> >>>>>> point being that even if bpf_get_current_task() is still used, there may
> >>>>>> (and inevitably will) be other UAPI helpers that are useless and that we
> >>>>>> just can't remove.
> >>>

+1 to all the things Daniel said about end user pains and barriers for
adoption, glad I'm not the only one arguing this anymore.

[...]

> >> to actual BPF helpers by then where we go and say, that kfunc has proven itself in production
> >> and from an API PoV that it is ready to be a proper BPF helper, and until this point
> >
> > "Proper BPF helper" model is broken.
> > static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
> >
> > is a hack that works only when compiler optimizes the code.
> > See gcc's attr(kernel_helper) workaround.
> > This 'proper helper' hack is the reason we cannot compile bpf programs with -O0.
> > And because it's uapi we cannot even fix this.
> > With kfuncs we will be able to compile with -O0 and debug bpf programs with better tools.
> > These tools don't exist yet, but we have a way forward whereas with helpers
> > we are stuck with -O2.
>

But specifically about how the BPF helper model is broken, that's at
least an exaggeration. A BPF helper call is defined at the BPF ISA
level: it has to be a `call <some constant>;`, and as long as the
compiler generates such code, it's all good. From the C standpoint, the
UAPI is just a function call:

bpf_map_lookup_elem(&map, ...);

As long as this compiles and generates a proper `call 1;` assembly
instruction, we are good. If/when both Clang and GCC support an
alternative way to define a helper, other than as a static func
pointer, -O0 builds (at least in the aspect of calling BPF helpers, I
suspect other stuff will still break) will just work. And what's
better, bpf_helper_defs.h would be able to pick the best option based
on the compiler's support, with end users not having to care or notice
the difference.

This is not a UAPI problem at all.
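
As a rough sketch of how a header could pick the declaration style (this
is not what bpf_helper_defs.h does today; HELPER_DECL,
HELPER_DECL_IS_POINTER and lookup_helper_id() are hypothetical names),
the choice can be made at preprocessing time. The kernel_helper
attribute is the GCC BPF-target one mentioned in the quoted text; on any
other compiler the sketch falls back to the static function pointer
whose value is the helper ID (1 for bpf_map_lookup_elem, as in the line
quoted above). It builds with an ordinary host compiler and never calls
the helper, it only reads the ID back.

```c
#include <assert.h>
#include <stdint.h>

#ifndef __has_attribute
#define __has_attribute(x) 0	/* very old compilers: assume no attribute */
#endif

#if __has_attribute(kernel_helper)
/* Compiler understands the attribute: declare a real function prototype. */
#define HELPER_DECL(ret, name, id, ...) \
	ret name(__VA_ARGS__) __attribute__((kernel_helper(id)))
#define HELPER_DECL_IS_POINTER 0
#else
/* Fallback: static function pointer whose value is the helper ID. */
#define HELPER_DECL(ret, name, id, ...) \
	static ret (*name)(__VA_ARGS__) = (void *)(id)
#define HELPER_DECL_IS_POINTER 1
#endif

/* User code sees the same name and signature in both branches. */
HELPER_DECL(void *, bpf_map_lookup_elem, 1, void *map, const void *key);

#if HELPER_DECL_IS_POINTER
/* Read back the ID carried by the pointer value, without calling it. */
static uintptr_t lookup_helper_id(void)
{
	return (uintptr_t)bpf_map_lookup_elem;
}

/* Small self-check: the fallback declaration encodes helper ID 1. */
static void check_helper_decl(void)
{
	assert(lookup_helper_id() == 1);
}
#endif
```

Either way the calling code is still just bpf_map_lookup_elem(&map, ...),
so which branch was taken stays invisible to the BPF program's source.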


> Better debugging tools are needed either way, independent of -O0 or -O2. I don't
> think -O0 is a requirement or barrier for that. It may open up possibilities for
> new tools, but production is still running with -O2. Proper BPF helper model is
> broken, but everyone relies on it, and will be for a very very long time to come,
> whether we like it or not. There is a larger ecosystem around BPF devs outside of
> kernel, and developers will use the existing means today. There are recommendations /
> guidelines that we can provide but we also don't have control over what developers
> are doing. Yet we should make their life easier, not harder. Better debugging
> possibilities should cater to everyone.
>
> Thanks,
> Daniel


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-04 14:25                               ` Daniel Borkmann
  2023-01-04 18:59                                 ` Andrii Nakryiko
@ 2023-01-04 19:37                                 ` Alexei Starovoitov
  2023-01-05  0:13                                   ` Martin KaFai Lau
  2023-01-04 20:50                                 ` David Vernet
  2 siblings, 1 reply; 57+ messages in thread
From: Alexei Starovoitov @ 2023-01-04 19:37 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David Vernet, Andrii Nakryiko, Joanne Koong, bpf,
	Andrii Nakryiko, kernel-team, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu

On Wed, Jan 04, 2023 at 03:25:00PM +0100, Daniel Borkmann wrote:
> On 1/4/23 12:51 AM, Alexei Starovoitov wrote:
> > On Tue, Jan 03, 2023 at 12:43:58PM +0100, Daniel Borkmann wrote:
> > > On 12/31/22 1:42 AM, Alexei Starovoitov wrote:
> > > > On Fri, Dec 30, 2022 at 03:00:21PM -0600, David Vernet wrote:
> > > > > > > 
> > > > > > > Taking bpf_get_current_task() as an example, I think it's better to have
> > > > > > > the debate be "should we keep supporting this / are users still using
> > > > > > > it?" rather than, "it's UAPI, there's nothing to even discuss". The
> > > > > > > point being that even if bpf_get_current_task() is still used, there may
> > > > > > > (and inevitably will) be other UAPI helpers that are useless and that we
> > > > > > > just can't remove.
> > > > 
> > > > Sorry, missed this question in the previous reply.
> > > > The answer is "it's UAPI, there's nothing to even discuss".
> > > > It doesn't matter whether bpf_get_current_task() is used heavily or not used at all.
> > > > The chance of breaking user space is what paralyzes the changes.
> > > > Any change to uapi header file is looked at with a magnifying glass.
> > > > There is no deprecation story for uapi.
> > > > The definition and semantics of bpf helpers are frozen _forever_.
> > > > And our uapi/bpf.h is not in a good company:
> > > > ls -Sla include/uapi/linux/|head
> > > > -rw-r--r-- 1 ast users 331159 Nov  3 08:32 nl80211.h
> > > > -rw-r--r-- 1 ast users 265312 Dec 25 13:51 bpf.h
> > > > -rw-r--r-- 1 ast users 118621 Dec 25 13:51 v4l2-controls.h
> > > > -rw-r--r-- 1 ast users  99533 Dec 25 13:51 videodev2.h
> > > > -rw-r--r-- 1 ast users  86460 Nov 29 11:15 ethtool.h
> > > > 
> > > > "Freeze bpf helpers now" is a minimum we should do right now.
> > > > We need to take aggressive steps to freeze the growth of the whole uapi/bpf.h
> > > 
> > > Imho, freezing BPF helpers now is way too aggressive step. One aspect which was
> > > not discussed here is that unstable kfuncs will be a pain for user experience
> > > compared to BPF helpers. Probably not for FB or G who maintain they own limited
> > > set of kernels, but for all others. If there is valid reason that kfuncs will have
> > > to change one way or another, then BPF applications using them will have to carry
> > > the maintenance burden on their side to be able to support a variety of kernel
> > > versions with working around the kfunc quirks. So you're essentially outsourcing
> > > the problem from kernel to users, which will suck from a user experience (and add
> > > to development cost on their side).
> > 
> > It's actually the opposite.
> > A small company that wants to use BPF needs to have a workaround/plan B for
> > different kernels and different distros.
> > That's why cilium and others have to detect availability of helpers and bpf features.
> > One bpf prog for newer kernel and potentially completely different solution
> > for older kernels.
> > That's the biggest obstacle in bpf adoption: the required features are in
> > the latest kernels, but companies have to support older kernels too.
> > Now look at the problem from different angle:
> > Detecting kfuncs is no different than detecting helpers.
> > The bpf users has to have a workaround when helper/kfunc is not available.
> > In that sense stability of the helpers vs instability of kfuncs is irrelevant.
> > Both might not exist in a particular kernel.
> > So if cilium starts to use kfunc it won't be extra development cost and
> > bpf program writer experience using kfuncs vs using helpers is the same as well.
> 
> But that was not the point I was making. What you describe above is the baseline
> cost which is there regardless of BPF helper vs kfunc.. detecting availability
> and having a workaround for older kernel if needed. The added cost is if kfunc
> changes over time for whichever valid reason, then you are essentially pushing
> the maintenance cost _from kernel to users_ when they need to keep track of that
> and implement workarounds specifically to make the kfunc work in their program
> for a set of kernels they plan to support, which they otherwise would /not/ have
> if it was a BPF helper. It raises the barrier from user side. Similarly, if users
> started out with using kfunc from a base kernel, and in future it might get
> removed given its not stable, then a workaround (if possible) needs to be
> implemented for newer kernels - probably rare occasion but not impossible or
> something that can be ruled out entirely. 

In theory it all makes sense assuming that kernel devs keep changing kfuncs
to make users suffer. You're painting kernel as malicious towards users
whereas in reality it's exactly the opposite. When we add a kfunc we think
just as hard about its usefulness. We don't have a deprecation strategy yet
and that's the point I'm making: while we think about helpers as the only
stable medium we won't be making progress in kfunc deprecation and kfunc stability areas.

> So the stability of the helpers vs
> instability of kfuncs is relevant in that case, not for the case you describe
> above, and that is extra development cost on user side. Generally, what I'm saying
> is, there needs to be a path forward where we are still open for both instead of
> completely freezing the former.

'extra development cost on user side'... in theory.
None of it happened in practice yet.
kfuncs is the best answer to uapi rigidness we have.
Maybe years from now we realize that kfunc mechanism sucks too and we will replace
it with something else. It's a possibility and opportunity to make our own
decisions and fix our mistakes, whereas uapi rules are something we cannot change.

> > But with kfuncs we can solve this bpf adoption issue.
> > The helpers are not easily backportable and cannot be added in modules,
> > so company's workarounds for older kernel are painful.
> > While kfuncs are trivially added in a module.
> 
> Maybe to a small degree. Often shipping out-of-tree kernel module is generally
> a no-go from corp policy and there's nothing you can do about it in such case.

Often yes, but in many cases the customers are ok with additional ko-s when
it's clear that there is a path forward to upstream the ko's functionality.
In this case the kfuncs will be already upstream, so selling out-of-tree ko
that implements what's already upstream is much easier.

> "trivially added" is a bit oversimplified as well.. depends on the kfunc of course,
> but potentially painful in terms of having to work around various changing kernel
> internals for your kfunc implementation and only possible if kernel actually exposes
> the needed functionality to modules. While the adoption issue /can/ in some cases be
> solved, I don't think it will be widely practical to solve adoption issue. Eventually
> only time will solve it when everyone is on decent enough kernel as baseline, this
> is what is there today at least for networking and tracing side where BPF is widely
> adopted and its available framework big enough to solve many use cases.

Of course. The verification of kfuncs still rapidly evolves. Today we cannot claim
that 6.1 kernel will be a stable base and the model of 'kfuncs in extra ko' will
work from now on. The point is that we need to stop thinking about helpers as the only
stable option we have and align all our efforts behind kfuncs, define deprecation
and stability rules.

> Aside and independent of all that, kfuncs added in out of tree modules should be
> discouraged. After all we want developers to contribute back to upstream kernel,
> and for a very long time we've had the stance that no extra functionality should be
> possible via out of tree module extensions.

Right. That model worked until windows came along and started defining their own
stable helpers with different func_id numbers.
Now if cilium wants to run on linux and windows it still needs to use different
bpf_helper_defs.h. The C code stays largely the same, but the numbers change and
their semantics between OSes likely differ a tiny bit to be annoying long term.
The point is the stability of helpers is relative.

> > Let's take bpf_sock_destroy that Aditi wants to add as an example.
> > If it's done as a helper the cilium would need to wait for the next kernel release
> > and next distro release some years from now to actually use it at the customer site.
> 
> Yeap, with some distros in K8s space being better than others, for example, some like
> Flatcar tend to be fairly up to date. With major LTS ones it takes 1+ years though.
> 
> > If bpf_sock_destroy is added as kfunc you can ship an extra kernel module
> > with just that kfunc to your customers. You can also attempt to convince a distro
> > that this module with kfuncs should be certified, since the same kfunc is in upstream kernel.
> > The customer can use cilium that relies on bpf_sock_destroy much sooner
> > and likely there won't be a need to develop a completely different workaround
> > for kernels without that kfunc.
> 
> See above wrt modules. Some larger users which run their own DC infra also build
> kernels for themselves, so in some cases it's possible and easier from corp policy
> PoV to just cherry-pick upstream commits and roll them into their own kernel build
> until they upgrade at some point to a base kernel where this comes by default. Some
> of the distro vendors build "hw enablement" kernels for cloud providers and there
> it is possible too to ask for backports on core functionality even if not in stable,
> it's a slow process however.

Right. Redhat is backporting quite a bit of upstream bpf features into their official
kernels and that's great. With kfuncs in ko-s it will become much easier.
No need to validate the whole kernel. The QA effort is smaller, code reviews are easier, etc.
The kfuncs in ko-s will be easier on support team too, since any kernel crash
is easier to attribute. "pls unload kfunc-ko and repeat your work".

> 
> [...]
> > > Ofc there is interest in keeping changes to a
> > > minimum, but it's not the same as BPF helpers where there is a significantly higher
> > > guarantee that things continue to keep working going forward. Today in Cilium we
> > > don't use any of the kfuncs, we might at some point when we see it necessary, but
> > > likely to a limited degree if sth cannot be solved as-is and only kfunc is present
> > > as a solution. But again, from a UX it's not great having to know that things can
> > > break anytime soon with newer kernels (things might already with verifier/LLVM
> > > upgrade and kfunc potentially adds yet another level). Generally speaking, I'm not
> > > against kfuncs but I suggest only making "freeze bpf helpers now" a soft freeze
> > > with a path forward for promoting some of the kfuncs which have been around and
> > > matured for a while and didn't need changes as stable BPF helpers to indicate their
> > > maturity level when we see it fit. So it's not a hard "no", but possible promotion
> > > when suitable.
> > 
> > The problem with 'soft' freeze that it's open to interpretation and abuse.
> > It feels to me you're saying that cilium is not using kfuncs and
> > therefore all cilium features additions are ok to be done as helpers.
> > That doesn't sound fair to other bpf devs.
> 
> I think you misread, lets not twist what I mentioned. All I was saying is that we
> should keep the door open for both to continue to co-exist; both have a place, both
> come with their advantages but also baggage. It's not that one is absolutely better
> than the other, and that maintenance baggage is either on our side or pushed towards
> users.
> 
> [...]
> > > Discoverability plus being able to know semantics from a user PoV to figure out when
> > > workarounds for older/newer kernels are required to be able to support both kernels.
> > 
> > Sounds like your concern is that there could be a kfunc that changed it semantics,
> > but kept exact same name and arguments? Yeah. That would be bad, but we should prevent
> > such patches from landing. It's up to us to define sane and user friendly deprecation of kfuncs.
> 
> Yes, that is a concern. New kfunc and deprecation with eventual removal of the old
> one might be better in such case, agree.
> 
> [...]
> > > is imho repeating the same story as BPF helpers vs kfuncs. Saying a kfunc is 'pretty
> > > stable' is kind of hinting to users that it's close to UAPI, but yet it's unstable.
> > 
> > correct.
> > 
> > > It'll confuse even more. I'd rather have a path forward where those kfuncs get promoted
> > 
> > why confuse more? There are EXPORT_SYMBOL like kmalloc that are quite stable,
> > yet they can change.
> > EXPORT_SYMBOL_GPL is exact analogy to kfunc.
> 
> They are quite stable because they are used in lots of places in-tree and changing
> would cause a ton of needless churn and merge conflicts for everyone, etc. You might
> not always have this kind of visibility on usage of kfuncs. The data you have is
> from your internal code base and what's in some of the larger OSS projects, but
> certainly a more limited/biased view. So as with 'soft' freeze this is just as well open
> to interpretation. "confuse more" because you declare it quite stable, yet not stable.
> Why is there fear to make them proper uapi then with the given known guarantees? From
> user side this guarantee is a good thing, not a bad thing. Mistakes were/are made all
> the time and learned from. Imagine syscall API is not stable anymore. Would you invest
> the cost to develop an application against it? 

Would you invest in developing application against unstable syscall API? Absolutely.
People develop tons of stuff on top of fuse-fs. People develop apps that interact
with tracing bpf progs that are clearly unstable. They do suffer when kernel side
changes and people accept that cost. BPF and tracing in general contributed to that mind change.
In a datacenter quite a few user apps are tied to kernel internals.

> Imho, it's one of BPF's strengths and
> we should keep the door open, not close it.

The strength of BPF was and still is that it has both stable and unstable interfaces.
Roughly: networking is stable, tracing is unstable.
The point is that to be stable one doesn't need to use helpers.
We can make kfuncs stable too if we focus all our efforts this way and
for that we need to abandon adding helpers though it's a pain short term.

> 
> > > to actual BPF helpers by then where we go and say, that kfunc has proven itself in production
> > > and from an API PoV that it is ready to be a proper BPF helper, and until this point
> > 
> > "Proper BPF helper" model is broken.
> > static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
> > 
> > is a hack that works only when compiler optimizes the code.
> > See gcc's attr(kernel_helper) workaround.
> > This 'proper helper' hack is the reason we cannot compile bpf programs with -O0.
> > And because it's uapi we cannot even fix this
> > With kfuncs we will be able to compile with -O0 and debug bpf programs with better tools.
> > These tools don't exist yet, but we have a way forward whereas with helpers
> > we are stuck with -O2.
> 
> Better debugging tools are needed either way, independent of -O0 or -O2. I don't
> think -O0 is a requirement or barrier for that. It may open up possibilities for
> new tools, but production is still running with -O2. Proper BPF helper model is
> broken, but everyone relies on it, and will be for a very very long time to come,
> whether we like it or not. There is a larger ecosystem around BPF devs outside of
> kernel, and developers will use the existing means today. There are recommendations /
> guidelines that we can provide but we also don't have control over what developers
> are doing. Yet we should make their life easier, not harder.

Fully fleshed out kfunc infra will make developers' job easier. No one is advocating
to make users suffer.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-04 18:43                   ` Andrii Nakryiko
@ 2023-01-04 19:44                     ` Alexei Starovoitov
  2023-01-04 21:55                       ` Andrii Nakryiko
  0 siblings, 1 reply; 57+ messages in thread
From: Alexei Starovoitov @ 2023-01-04 19:44 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Wed, Jan 04, 2023 at 10:43:37AM -0800, Andrii Nakryiko wrote:
> > extern bool bpf_dynptr_is_null(const struct bpf_dynptr *p) __ksym;
> >
> > will likely work with both gcc and clang.
> > And if it doesn't we can fix it.
> >
> > While when gcc folks saw helpers:
> >
> > static bool (*bpf_dynptr_is_null)(const struct bpf_dynptr *p) = (void *) 777;
> >
> > they realized that it is a hack that abuses compiler optimizations.
> > They even invented attr(kernel_helper) to workaround this issue.
> > After a bunch of arguing gcc added support for this hack without attr,
> > but it's going to be around forever... in gcc, in clang and in kernel.
> > It's something that we could have fixed if it wasn't for uapi.
> > Just one more example of unfixable mistake that causing issues
> > to multiple projects.
> > That's the core issue of kernel uapi rules: inability to fix mistakes.
> 
> This is BPF ISA defining `call #N;` to call helper with ID N, which
> you agree that it (ISA) has to be stable, documented and standardized,
> right?
> 
> Everything else is just how we expose those constants into C code and
> how libbpf deals with them. Libbpf could support new attribute or even
> extern-based convention, if necessary.
> 
> But it wasn't necessary for years and only was brought up during GCC's
> attempt to invent a new convention here. And they successfully dealt
> with this challenge.

'dealt with this challenge'? You mean didn't, right?
gcc doesn't guarantee that '= (void *) 777;' will work even with optimization on.
In clang we cannot guarantee that either.
Nothing requires a compiler to do constant propagation.
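
To make the constant propagation point concrete, here is a minimal host-side
sketch (plain C, not BPF code). It assumes the made-up helper ID 777 from the
quoted mail; the function names helper_id_as_stored() and helper_retarget()
are hypothetical and exist only for this illustration. It shows that the
helper-style declaration is, to the language, an ordinary writable pointer
object whose value has to be loaded before the call.

```c
#include <stdbool.h>
#include <stdint.h>

struct bpf_dynptr; /* opaque for the purposes of this sketch */

/* Helper-style declaration: a static, non-const function pointer whose
 * initial value is the helper ID. 777 is the illustrative ID from the
 * mail, not a real helper number. */
static bool (*bpf_dynptr_is_null)(const struct bpf_dynptr *p) =
	(bool (*)(const struct bpf_dynptr *))777;

/* Hypothetical accessor: reads the pointer object back as an integer.
 * Semantically a call through the pointer is "load, then call
 * indirectly"; only constant folding can turn it into "call 777". */
uintptr_t helper_id_as_stored(void)
{
	return (uintptr_t)bpf_dynptr_is_null;
}

/* Hypothetical mutator: the object is not const, so the language
 * permits a store to it. This is one reason folding the constant is an
 * optimization and not something the language guarantees. */
void helper_retarget(uintptr_t id)
{
	bpf_dynptr_is_null = (bool (*)(const struct bpf_dynptr *))id;
}
```

An extern declaration, by contrast, names a symbol directly and leaves no
pointer object for the compiler to fold.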

> 
> Yes, we won't change existing helpers, but we can add new ones if we
> need to extend them. That's how APIs work. Yes, they need careful
> considerations when designing and implementing new APIs. Yes, mistakes
> do happen, that's just fact of life and par for the course of software
> development. Yes, we have to live with those mistakes. Nothing changed
> about that.
> 
> But somehow libraries and kernel still produce stable APIs and
> maintain them because they clearly provide benefits to end users.

Did you 'live with mistakes done in libbpf 0.x' ? No.
You've introduced libbpf 1.0 with incompatible api and some users suffered.

> We'll get the same amount of flame when we try to change kfunc that's
> widely adopted.

Of course. That's why we need to define a stability and deprecation
plan for them.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-04 18:43                         ` Andrii Nakryiko
@ 2023-01-04 19:51                           ` Alexei Starovoitov
  2023-01-04 21:56                             ` Andrii Nakryiko
  0 siblings, 1 reply; 57+ messages in thread
From: Alexei Starovoitov @ 2023-01-04 19:51 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David Vernet, Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Wed, Jan 04, 2023 at 10:43:52AM -0800, Andrii Nakryiko wrote:
> 
> struct bpf_dynptr dptr = ...;
> bool is_null = false;
> 
> if (bpf_core_value_exists(enum bpf_func_id, BPF_FUNC_dynptr_is_null)) {
>     is_null = bpf_dynptr_is_null(&dptr);
> } else {
>     struct bpf_dynptr_kern *kdptr = (void*)&dptr;
>     is_null = !BPF_CORE_READ(kdptr, data);
> }
> 
> How do you detect the existence of kfunc today? Preferably without
> doing extra work in user-space.
> 
> Now, let's say kfunc changes its signature. Show me a short example on
> how you deal with that in BPF C code?

Didn't we add bpf_core_type_matches for func protos specifically
to deal with function signature changes in the kernel after tracepoint
args got swapped?
I'm assuming the same mechanism will work for kfuncs.
If not we can come up with a new one.

> 
> Think about sched_ext. Right now it's so bleeding edge that you have
> to assume the very latest and freshest kernel code. So you know all
> the kfuncs that you need should exist otherwise sched_ext doesn't work
> at all. Ok, happy place.
> 
> Now a year or two passes by. Some kfuncs are added, some are changed.
> We still believe that BPF CO-RE (compile once - run everywhere) is
> good and we don't want to compile and distribute multiple versions of
> BPF application, right? You'll want to do some extra (or more
> performant) stuff if kernel is recent and has some new kfunc, but
> fallback to some default suboptimal behavior otherwise. How do you do
> that in a simple and straightforward way? 

With the help of CO-RE, of course.
If it doesn't exist today we can add it.

> But even worse is what if
> some critical kfunc is changed between kernel versions and you do
> *need* to support both versions. Think about those aspects, because
> sched_ext will run into them almost inevitably soon after its
> inclusion into kernel.
> 
> 
> One way or another there are some technical solution of various
> degrees of creativity. And I'm actually not sure if I have a solution
> for kfunc signature change at all. Without BTF we could use two
> separate .c files and statically link them together, which would work
> because extern is untyped in pure C. But with BPF static linking we do
> have BTF information for each extern, and those BTF types will be
> incompatible for the same extern func.
> 
> We can probably come up with some hacks and conventions, as usual, but
> better start thinking about them now.
> 
> But hopefully you can empathize a bit more with poor end users that
> have to do hack like this and why having bpf_dynptr API defined as
> stable BPF helpers, with no extra dependencies on BTF in kernel, 

BTF is a reasonable dependency.
You've just used it to detect whether helper exists or not.
So it's fine to use the same to check whether kfunc exists or not.
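
The "check whether it exists, otherwise fall back" pattern can be sketched in
host C with a weak symbol. This is only a userspace analogy: it uses the
GCC/Clang weak attribute, not BTF and not any libbpf API, and the names
example_dynptr, example_dynptr_is_null() and dynptr_is_null_compat() are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dynptr; not a kernel type. */
struct example_dynptr {
	void *data;
};

/* Weak declaration: if nothing defines this symbol at link time, its
 * address resolves to NULL instead of causing a link error. */
extern bool example_dynptr_is_null(const struct example_dynptr *p)
	__attribute__((weak));

/* Use the function when it is present, otherwise fall back to
 * inspecting the field directly. */
bool dynptr_is_null_compat(const struct example_dynptr *p)
{
	if (example_dynptr_is_null)
		return example_dynptr_is_null(p);

	/* Fallback path for an environment without the function. */
	return p->data == NULL;
}
```

In this sketch nothing defines example_dynptr_is_null(), so the fallback
branch is the one that runs.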

> 
> Depends on perspective. If I was some humble dev trying to build
> BPF-based tool that should work on x86, arm64, s390x, and riscv (or
> whatever other architecture), and dynptr API is only based on kfuncs,
> I'm screwed. I can't sponsor or do kfunc support for my favorite
> architecture, I'm stuck waiting for this to be done by someone some
> time, if ever.

If kfuncs and bpf trampoline don't work on a particular architecture
that developer is likely screwed anyway. Dynptr is the last thing they
would worry about.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-04 18:44                           ` Andrii Nakryiko
@ 2023-01-04 19:56                             ` Alexei Starovoitov
  0 siblings, 0 replies; 57+ messages in thread
From: Alexei Starovoitov @ 2023-01-04 19:56 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David Vernet, Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Wed, Jan 04, 2023 at 10:44:07AM -0800, Andrii Nakryiko wrote:
> >
> > Agree that any hard policy like 'only kfuncs from now on' gotta have its limits.
> > Maybe there will be a strong reason to add a new helper one day,
> > so we can keep the door open a tiny bit for an exception,
> > but for dynptr...
> > There are kfuncs with dynptr already (bpf_verify_pkcs7_signature)
> > So precedent is already made.
> 
> bpf_verify_pkcs7_signature() is using dynptr as a pointer to memory.
> It's a totally valid and intended use case, to pass memory area of
> statically unknown size, yes.
> 
> But that's very different from having basic dynptr helpers like
> is_null() and trim/advance as kfunc. Such helpers are stable, they
> manipulate generic attributes of dynptr: size, offset, underlying
> memory pointer. There is nothing unstable and potentially changing
> about them.

dynptr is defined in uapi as:
struct bpf_dynptr {
        __u64 :64;
        __u64 :64;
} __attribute__((aligned(8)));

So sizes, offset and memory pointer are not stable today and
there is no need to stabilize this part of it.
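
A small host-side sketch of what that definition does and does not promise,
assuming a GNU C compiler (a struct with only unnamed bit-fields is an
extension). The private layout example_dynptr_priv and the three functions
are hypothetical and only illustrate that any 16-byte internal layout can
hide behind the opaque type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the uapi definition above: two unnamed 64-bit
 * bit-fields, so users see 16 opaque bytes and no field names. */
struct bpf_dynptr {
	uint64_t :64;
	uint64_t :64;
} __attribute__((aligned(8)));

/* HYPOTHETICAL private layout, for illustration only. */
struct example_dynptr_priv {
	void *data;
	uint32_t size;
	uint32_t offset;
};

/* The only things the opaque definition pins down: size and alignment. */
static_assert(sizeof(struct bpf_dynptr) == 16, "opaque dynptr is 16 bytes");

size_t dynptr_opaque_size(void)
{
	return sizeof(struct bpf_dynptr);
}

size_t dynptr_opaque_align(void)
{
	return _Alignof(struct bpf_dynptr);
}

/* Any private layout is acceptable as long as it fits in those bytes. */
int priv_fits_in_opaque(void)
{
	return sizeof(struct example_dynptr_priv) <= sizeof(struct bpf_dynptr);
}
```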

> From original exchange:
> 
> > > > So just because there is no perfect way to
> > > > handle all the SKB/XDP physical non-contiguity, doesn't mean that the
> > > > dynptr concept itself is flawed or not well thought out. It's just
> > >
> > > I think that's exactly what it means. dynptr concept is flawed.
> 
> Must be a lot of typos in here ;) because as written it clearly states
> that the whole concept of dynptr is flawed.

Maybe we will realize a year from now that it is?
We have some exposure of dynptr in uapi. I think it's a safer bet
to keep it to the minimum.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-04 18:59                                 ` Andrii Nakryiko
@ 2023-01-04 20:03                                   ` Alexei Starovoitov
  2023-01-04 21:57                                     ` Andrii Nakryiko
  0 siblings, 1 reply; 57+ messages in thread
From: Alexei Starovoitov @ 2023-01-04 20:03 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Daniel Borkmann, David Vernet, Joanne Koong, bpf,
	Andrii Nakryiko, kernel-team, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu

On Wed, Jan 04, 2023 at 10:59:15AM -0800, Andrii Nakryiko wrote:
> 
> > >> to actual BPF helpers by then where we go and say, that kfunc has proven itself in production
> > >> and from an API PoV that it is ready to be a proper BPF helper, and until this point
> > >
> > > "Proper BPF helper" model is broken.
> > > static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
> > >
> > > is a hack that works only when compiler optimizes the code.
> > > See gcc's attr(kernel_helper) workaround.
> > > This 'proper helper' hack is the reason we cannot compile bpf programs with -O0.
> > > And because it's uapi we cannot even fix this
> > > With kfuncs we will be able to compile with -O0 and debug bpf programs with better tools.
> > > These tools don't exist yet, but we have a way forward whereas with helpers
> > > we are stuck with -O2.
> >
> 
> But specifically about how the BPF helper model is broken, that's at
> least an exaggeration. BPF helper call is defined at BPF ISA level, it
> has to be a `call <some constant>;`, and as long as compiler generates
> such code, it's all good. From C standpoint UAPI is just a function
> call:
> 
> bpf_map_lookup_elem(&map, ...);
> 
> As long as this compiles and generates proper `call 1;` assembly
> instruction, we are good. If/when both Clang and GCC support an
> alternative way to define helper and not as a static func pointer, -O0
> builds (at least in the aspect of calling BPF helpers, I suspect other
> stuff will break still) will just work. And what's better,
> bpf_helper_defs.h would be able to pick the best option based on
> compiler's support with end users not having to care or notice the
> difference.

Right and that's what gcc did with attribute((kernel_helper(1))),
but we didn't like it because gcc and clang would diverge.
Now you're arguing it's just a bpf_helper_defs.h change and we should
have allowed it?

Also consider that 'call <some constant>' or more precisely 'call absolute_address'
as an instruction exists in only one CPU architecture. It's BPF ISA.
It's a mistake that I made 8 years ago and inability to fix it bothers me.
Now we have 100 times more developers than we had 8 years ago.
I expect 100 times more UAPI and ABI mistakes.
Minimizing unfixable mistakes is what I'm after.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-04 14:25                               ` Daniel Borkmann
  2023-01-04 18:59                                 ` Andrii Nakryiko
  2023-01-04 19:37                                 ` Alexei Starovoitov
@ 2023-01-04 20:50                                 ` David Vernet
  2 siblings, 0 replies; 57+ messages in thread
From: David Vernet @ 2023-01-04 20:50 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Alexei Starovoitov, Andrii Nakryiko, Joanne Koong, bpf,
	Andrii Nakryiko, kernel-team, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu

On Wed, Jan 04, 2023 at 03:25:00PM +0100, Daniel Borkmann wrote:
> On 1/4/23 12:51 AM, Alexei Starovoitov wrote:
> > On Tue, Jan 03, 2023 at 12:43:58PM +0100, Daniel Borkmann wrote:
> > > On 12/31/22 1:42 AM, Alexei Starovoitov wrote:
> > > > On Fri, Dec 30, 2022 at 03:00:21PM -0600, David Vernet wrote:
> > > > > > > 
> > > > > > > Taking bpf_get_current_task() as an example, I think it's better to have
> > > > > > > the debate be "should we keep supporting this / are users still using
> > > > > > > it?" rather than, "it's UAPI, there's nothing to even discuss". The
> > > > > > > point being that even if bpf_get_current_task() is still used, there may
> > > > > > > (and inevitably will) be other UAPI helpers that are useless and that we
> > > > > > > just can't remove.
> > > > 
> > > > Sorry, missed this question in the previous reply.
> > > > The answer is "it's UAPI, there's nothing to even discuss".
> > > > It doesn't matter whether bpf_get_current_task() is used heavily or not used at all.
> > > > The chance of breaking user space is what paralyzes the changes.
> > > > Any change to uapi header file is looked at with a magnifying glass.
> > > > There is no deprecation story for uapi.
> > > > The definition and semantics of bpf helpers are frozen _forever_.
> > > > And our uapi/bpf.h is not in a good company:
> > > > ls -Sla include/uapi/linux/|head
> > > > -rw-r--r-- 1 ast users 331159 Nov  3 08:32 nl80211.h
> > > > -rw-r--r-- 1 ast users 265312 Dec 25 13:51 bpf.h
> > > > -rw-r--r-- 1 ast users 118621 Dec 25 13:51 v4l2-controls.h
> > > > -rw-r--r-- 1 ast users  99533 Dec 25 13:51 videodev2.h
> > > > -rw-r--r-- 1 ast users  86460 Nov 29 11:15 ethtool.h
> > > > 
> > > > "Freeze bpf helpers now" is a minimum we should do right now.
> > > > We need to take aggressive steps to freeze the growth of the whole uapi/bpf.h
> > > 
> > > Imho, freezing BPF helpers now is way too aggressive step. One aspect which was
> > > not discussed here is that unstable kfuncs will be a pain for user experience
> > > compared to BPF helpers. Probably not for FB or G who maintain they own limited
> > > set of kernels, but for all others. If there is valid reason that kfuncs will have
> > > to change one way or another, then BPF applications using them will have to carry
> > > the maintenance burden on their side to be able to support a variety of kernel
> > > versions with working around the kfunc quirks. So you're essentially outsourcing
> > > the problem from kernel to users, which will suck from a user experience (and add
> > > to development cost on their side).
> > 
> > It's actually the opposite.
> > A small company that wants to use BPF needs to have a workaround/plan B for
> > different kernels and different distros.
> > That's why cilium and others have to detect availability of helpers and bpf features.
> > One bpf prog for newer kernel and potentially completely different solution
> > for older kernels.
> > That's the biggest obstacle in bpf adoption: the required features are in
> > the latest kernels, but companies have to support older kernels too.
> > Now look at the problem from different angle:
> > Detecting kfuncs is no different than detecting helpers.
> > The bpf users has to have a workaround when helper/kfunc is not available.
> > In that sense stability of the helpers vs instability of kfuncs is irrelevant.
> > Both might not exist in a particular kernel.
> > So if cilium starts to use kfunc it won't be extra development cost and
> > bpf program writer experience using kfuncs vs using helpers is the same as well.
> 
> But that was not the point I was making. What you describe above is the baseline
> cost which is there regardless of BPF helper vs kfunc.. detecting availability
> and having a workaround for older kernel if needed. The added cost is if kfunc
> changes over time for whichever valid reason, then you are essentially pushing

But if there is a "valid reason" to change something, then it's better
to have the _option_ to change it, no? IMHO that's the key point here.
With kfuncs, "reasons" are allowed to be part of the discussion. With
UAPI, there is nothing to discuss.

And that's the fundamental problem with having things in UAPI. Forever
is a very long time. Do we really not want to have the option of
changing or removing something after (e.g.) 20 years? 40 years? 60
years? I agree with you that it's unambiguous that using kfuncs instead
of helpers does shift some maintenance cost from the kernel to users,
but IMO the point is that with kfuncs we at least have the ability to
control that cost. Taking an extreme example, we could decide to support
a kfunc for 30 years, and then deprecate it for 10 years, and then
finally remove it. With UAPI our children's children's children
will have to support it. I don't think guaranteed stability is worth
that cost. Not for symbols exported by the kernel, used by other kernel
programs, which is fundamentally what BPF programs are.

Another way to look at it would be: do we expect tooling to support all
kernel versions and features indefinitely? When we're on Linux 50.15, do
we expect that there will be tooling that requires us to support
bpf_get_current_task() instead of bpf_get_current_task_btf()? And even
if there is a tool that needs it, is it worth the cost of keeping it
around? With kfuncs the question would matter, even if it's "yes it's
worth it". With UAPI, the question is meaningless.

I realize that I'm being a bit hyperbolic here, and it is not my
intention to misrepresent any points made in favor of not freezing UAPI.
I just think it's necessary to be hyperbolic when it comes to UAPI to
really underscore the implications of using it.  There are very good
reasons for having UAPI in general, but IMHO, those reasons don't apply
to kernel programs, which is really what we're talking about here.

> the maintenance cost _from kernel to users_ when they need to keep track of that
> and implement workarounds specifically to make the kfunc work in their program
> for a set of kernels they plan to support, which they otherwise would /not/ have
> if it was a BPF helper. It raises the barrier from user side. Similarly, if users
> started out with using kfunc from a base kernel, and in future it might get
> removed given it's not stable, then a workaround (if possible) needs to be
> implemented for newer kernels - probably rare occasion but not impossible or
> something that can be ruled out entirely. So the stability of the helpers vs
> instability of kfuncs is relevant in that case, not for the case you describe
> above, and that is extra development cost on user side. Generally, what I'm saying
> is, there needs to be a path forward where we are still open for both instead of
> completely freezing the former.

Curious what you envision as the policy long term (i.e. after the path
forward)?

The reason I ask is that on the one hand we're claiming that kfuncs work
for some things, while on the other we seem to be claiming that UAPI is
_necessary_ for users to have guaranteed stability and adopt the
platform (and I will preemptively apologize if I'm unintentionally
misrepresenting your view by saying that).

If we operate under the assumption that helpers are necessary for
certain things due to their stability guarantees, whereas kfuncs are
appropriate in some cases, I think that begs the question: what criteria
are we using to decide when stability is really necessary? We could say
"for core functionality", but how do we know that there aren't other
users out there who are using "non-core-functionality" kfuncs instead of
helpers? Why do we give stability to some users but not others? The fact
that we don't have a crystal ball seems to be the central argument
around why we need UAPI, but I think it's a fallacy to have that view at
the same time as also supporting the existence of kfuncs.

[...]

> > > Discoverability plus being able to know semantics from a user PoV to figure out when
> > > workarounds for older/newer kernels are required to be able to support both kernels.
> > 
> > Sounds like your concern is that there could be a kfunc that changed its semantics,
> > but kept exact same name and arguments? Yeah. That would be bad, but we should prevent
> > such patches from landing. It's up to us to define sane and user friendly deprecation of kfuncs.
> 
> Yes, that is a concern. New kfunc and deprecation with eventual removal of the old
> one might be better in such case, agree.

Agreed. With kfuncs, say that the scenario described comes to pass. We
could have a hypothetical deprecation policy like the following:

1. Add the new kfunc with the changed semantics, arguments, etc, under a
different name.
2. Deprecate the old kfunc for X years / releases, where X is whatever
conservative deprecation value we deem appropriate (and one which we
could always extend if need be).
3. Once we feel we're ready to remove the old kfunc, we remove it,
rename the new (now old) kfunc from (1) to that name, and then keep the
temporary name from the new-old kfunc in (1) as a wrapper / alias around
it. That temporary alias can itself then be deprecated and removed after
X years.
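
As a rough illustration of where step 3 ends up, here is a minimal sketch
in plain C. The names demo_kfunc and demo_kfunc_v2 are made up for the
example and the doubling "semantics" are only a stand-in; nothing here is
an actual kernel kfunc or an agreed-upon policy.

```c
/*
 * Hypothetical end state (step 3) of the deprecation flow sketched above.
 * demo_kfunc / demo_kfunc_v2 are illustrative names, not real kfuncs.
 *
 * Step 1: demo_kfunc_v2() was added with the new semantics.
 * Step 2: the old demo_kfunc() was deprecated for X releases.
 * Step 3: the old body is gone; the original name now carries the new
 *         semantics and the temporary name is only a thin alias.
 */

/* Original name, now implementing the new semantics (here: doubling). */
int demo_kfunc(int x)
{
	return x * 2;
}

/*
 * Temporary name kept as a wrapper so that callers from step 1 keep
 * working; it can itself be deprecated and removed after another
 * X releases.
 */
int demo_kfunc_v2(int x)
{
	return demo_kfunc(x);
}
```

In this sketch both names resolve to the same behavior during the overlap
window, so a program written against either one keeps working until the
alias is finally dropped.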

All of this is carefully orchestrated, and we have the flexibility to be
as conservative as we'd like in support of users. Maybe we decide that
we can never stop supporting the original kfunc because it's too
ubiquitous. It will surely depend on the policy we end up crafting for
kfuncs, and will probably sometimes require a case-by-case
determination, but at least we'll have the flexibility to choose.

> 
> [...]
> > > is imho repeating the same story as BPF helpers vs kfuncs. Saying a kfunc is 'pretty
> > > stable' is kind of hinting to users that it's close to UAPI, but yet it's unstable.
> > 
> > correct.
> > 
> > > It'll confuse even more. I'd rather have a path forward where those kfuncs get promoted
> > 
> > why confuse more? There are EXPORT_SYMBOL like kmalloc that are quite stable,
> > yet they can change.
> > EXPORT_SYMBOL_GPL is exact analogy to kfunc.
> 
> They are quite stable because they are used in lots of places in-tree and changing
> would cause a ton of needless churn and merge conflicts for everyone, etc. You might
> not always have this kind of visibility on usage of kfuncs. The data you have is
> from your internal code base and what's in some of the larger OSS projects, but
> certainly a more limited/biased view. So as with 'soft' freeze this is just as well open
> to interpretation. "confuse more" because you declare it quite stable, yet not stable.
> Why is there fear to make them proper uapi then with the given known guarantees? From
> user side this guarantee is a good thing, not a bad thing. Mistakes were/are made all
> the time and learned from. Imagine syscall API is not stable anymore. Would you invest
> the cost to develop an application against it? Imho, it's one of BPF's strengths and
> we should keep the door open, not close it.

But we're talking about _kernel_ programs here, not user programs. And
from that perspective, one could argue that having kfuncs actually
promotes more upstreaming of BPF programs for the exact reasons you're
spelling out here, just as EXPORT_SYMBOL_GPL promotes the upstreaming of
modules. Of course, it won't be the exact same as EXPORT_SYMBOL_GPL
because we'll still come up with a well documented, reliable deprecation
story, but the benefits of upstreaming the BPF program still apply.

In general, I think BPF programs vs. the syscall layer is really an
apples and oranges comparison. The kernel has internally never had a
stable interface as Greg describes in [0]. I don't see why we'd frame
BPF programs differently than any other kernel program in that regard.

[0]: https://www.kernel.org/doc/Documentation/process/stable-api-nonsense.rst

> > > to actual BPF helpers by then where we go and say, that kfunc has proven itself in production
> > > and from an API PoV that it is ready to be a proper BPF helper, and until this point
> > 
> > "Proper BPF helper" model is broken.
> > static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
> > 
> > is a hack that works only when compiler optimizes the code.
> > See gcc's attr(kernel_helper) workaround.
> > This 'proper helper' hack is the reason we cannot compile bpf programs with -O0.
> > And because it's uapi we cannot even fix this
> > With kfuncs we will be able to compile with -O0 and debug bpf programs with better tools.
> > These tools don't exist yet, but we have a way forward whereas with helpers
> > we are stuck with -O2.
> 
> Better debugging tools are needed either way, independent of -O0 or -O2. I don't
> think -O0 is a requirement or barrier for that. It may open up possibilities for

I personally disagree that not being able to support -O0 is sane for a
debugging tool, but IMHO that's not the main point. Rather, it's that
what we have now is kind of a mess (I think we're all in agreement on
that?), and we can never fix it because of UAPI. IMO, that is a sign
that things need to change.

> new tools, but production is still running with -O2. Proper BPF helper model is
> broken, but everyone relies on it, and will be for a very very long time to come,
> whether we like it or not. There is a larger ecosystem around BPF devs outside of
> kernel, and developers will use the existing means today. There are recommendations /
> guidelines that we can provide but we also don't have control over what developers
> are doing. Yet we should make their life easier, not harder. Better debugging
> possibilities should cater to everyone.
> 
> Thanks,
> Daniel


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-04 19:44                     ` Alexei Starovoitov
@ 2023-01-04 21:55                       ` Andrii Nakryiko
  2023-01-04 23:47                         ` David Vernet
  0 siblings, 1 reply; 57+ messages in thread
From: Andrii Nakryiko @ 2023-01-04 21:55 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Wed, Jan 4, 2023 at 11:44 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Jan 04, 2023 at 10:43:37AM -0800, Andrii Nakryiko wrote:
> > > extern bool bpf_dynptr_is_null(const struct bpf_dynptr *p) __ksym;
> > >
> > > will likely work with both gcc and clang.
> > > And if it doesn't we can fix it.
> > >
> > > While when gcc folks saw helpers:
> > >
> > > static bool (*bpf_dynptr_is_null)(const struct bpf_dynptr *p) = (void *) 777;
> > >
> > > they realized that it is a hack that abuses compiler optimizations.
> > > They even invented attr(kernel_helper) to workaround this issue.
> > > After a bunch of arguing gcc added support for this hack without attr,
> > > but it's going to be around forever... in gcc, in clang and in kernel.
> > > It's something that we could have fixed if it wasn't for uapi.
> > > Just one more example of an unfixable mistake that is causing issues
> > > to multiple projects.
> > > That's the core issue of kernel uapi rules: inability to fix mistakes.
> >
> > This is BPF ISA defining `call #N;` to call helper with ID N, which
> > you agree that it (ISA) has to be stable, documented and standardized,
> > right?
> >
> > Everything else is just how we expose those constants into C code and
> > how libbpf deals with them. Libbpf could support new attribute or even
> > extern-based convention, if necessary.
> >
> > But it wasn't necessary for years and only was brought up during GCC's
> > attempt to invent a new convention here. And they successfully dealt
> > with this challenge.
>
> 'dealt with this challenge'? You mean didn't, right?
> gcc doesn't guarantee that '= (void *) 777;' will work even with optimization on.

I don't use gcc-bpf, but given they dropped kernel_helper attribute,
and given you said "After a bunch of arguing gcc added support for
this hack without attr but it's going to be around forever..." I
assumed it does work. Are you saying it doesn't?

> In clang we cannot guarantee that either.

It works today, if it ever regresses there will be a lot of noise and
this regression will be fixed. So maybe technically it's not
guaranteed, but in practice it will keep working.

We had a `const volatile` case recently, variables were not being put
into .rodata section properly. GCC was changed to do it the same way
as Clang so that all the existing apps can keep working.


> Nothing requires a compiler to do constant propagation.
>
> >
> > Yes, we won't change existing helpers, but we can add new ones if we
> > need to extend them. That's how APIs work. Yes, they need careful
> > considerations when designing and implementing new APIs. Yes, mistakes
> > do happen, that's just fact of life and par for the course of software
> > development. Yes, we have to live with those mistakes. Nothing changed
> > about that.
> >
> > But somehow libraries and kernel still produce stable APIs and
> > maintain them because they clearly provide benefits to end users.
>
> Did you 'live with mistakes done in libbpf 0.x' ? No.

for a long time yes. And it's not apples to apples comparison, with
library it is possible to deprecate APIs, which is what we did. With
lots of work and gradual transition, but did it.

If we couldn't pull this through, yeah, I would live with whatever
APIs are there. And added new ones as a better replacement. As is
always done for APIs, nothing new here.

Within 0.x and 1.x APIs are stable and we live with them. This API
stability fear doesn't paralyze libbpf development, we still add new
stable APIs, if they are considered useful and thought through enough.

> You've introduced libbpf 1.0 with incompatible api and some users suffered.

By "suffered" you mean a few systemd folks being grumpy about this?
And having to do 100 lines of code changes ([0]) to support two
incompatible major versions of libbpf *simultaneously*?

On the other hand we got a library with saner error propagation
behavior and various API normalizations and additions. Not too bad of
a trade off.

Sure, deprecation is not easy or free, there was a lot of prep work,
and some users had to adjust their code to use new APIs. But this is
quite a tangent.

  [0] https://github.com/systemd/systemd/pull/24511/

>
> > We'll get the same amount of flame when we try to change kfunc that's
> > widely adopted.
>
> Of course. That's why we need to define a stability and deprecation
> plan for them.

Lots of things that need to be defined and figured out, but we are
already quick to freeze BPF helpers.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-04 19:51                           ` Alexei Starovoitov
@ 2023-01-04 21:56                             ` Andrii Nakryiko
  0 siblings, 0 replies; 57+ messages in thread
From: Andrii Nakryiko @ 2023-01-04 21:56 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David Vernet, Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Wed, Jan 4, 2023 at 11:51 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Jan 04, 2023 at 10:43:52AM -0800, Andrii Nakryiko wrote:
> >
> > struct bpf_dynptr dptr = ...;
> > bool is_null = false;
> >
> > if (bpf_core_value_exists(enum bpf_func_id, BPF_FUNC_dynptr_is_null)) {
> >     is_null = bpf_dynptr_is_null(&dptr);
> > } else {
> >     struct bpf_dynptr_kern *kdptr = (void*)&dptr;
> >     is_null = !!BPF_CORE_READ(kdptr, data);
> > }
> >
> > How do you detect the existence of kfunc today? Preferably without
> > doing extra work in user-space.
> >
> > Now, let's say kfunc changes its signature. Show me a short example on
> > how you deal with that in BPF C code?
>
> Didn't we add bpf_core_type_matches for func protos specifically
> to deal with function signature changes in the kernel after tracepoint
> args got swapped?
> I'm assuming the same mechanism will work for kfuncs.
> If not we can come up with a new one.

It would be good if someone actually tried that to see if it works and,
if it doesn't, came up with an approach that does. Right now I just
see hand-wavy arguments that BPF helpers and BPF kfuncs are equivalent
in this regard. Which currently I'm afraid they are not.

>
> >
> > Think about sched_ext. Right now it's so bleeding edge that you have
> > to assume the very latest and freshest kernel code. So you know all
> > the kfuncs that you need should exist otherwise sched_ext doesn't work
> > at all. Ok, happy place.
> >
> > Now a year or two passes by. Some kfuncs are added, some are changed.
> > We still believe that BPF CO-RE (compile once - run everywhere) is
> > good and we don't want to compile and distribute multiple versions of
> > BPF application, right? You'll want to do some extra (or more
> > performant) stuff if kernel is recent and has some new kfunc, but
> > fallback to some default suboptimal behavior otherwise. How do you do
> > that in a simple and straightforward way?
>
> with a help of CORE, of course.
> If it doesn't exist today we can add it.
>
> > But even worse is what if
> > some critical kfunc is changed between kernel versions and you do

How about this one? I'm honestly curious to see someone try and figure
out what works and what doesn't.

> > *need* to support both versions. Think about those aspects, because
> > sched_ext will run into them almost inevitably soon after its
> > inclusion into kernel.
> >
> >
> > One way or another there are some technical solutions of various
> > degrees of creativity. And I'm actually not sure if I have a solution
> > for kfunc signature change at all. Without BTF we could use two
> > separate .c files and statically link them together, which would work
> > because extern is untyped in pure C. But with BPF static linking we do
> > have BTF information for each extern, and those BTF types will be
> > incompatible for the same extern func.
> >
> > We can probably come up with some hacks and conventions, as usual, but
> > better start thinking about them now.
> >
> > But hopefully you can empathize a bit more with poor end users that
> > have to do hacks like this and why having bpf_dynptr API defined as
> > stable BPF helpers, with no extra dependencies on BTF in kernel,
>
> BTF is a reasonable dependency.
> You've just used it to detect whether helper exists or not.
> So it's fine to use the same to check whether kfunc exists or not.

BTFGen doesn't require kernel to be built with BTF, and yet I get BPF
CO-RE stuff. But you are jumbling everything together. I don't need
BPF CO-RE to build a useful BPF application that needs to use
ringbuf+dynptr (think uprobe'ing of some app, USDTs, etc), yet we will
require BTF for no reason.

Just as you are afraid of not getting UAPI right because we can't
anticipate possible changes, let's be just as much afraid of
unnecessary dependencies, which can be a blocker or pain for some
users in some situations. Isn't that fair?

>
> >
> > Depends on perspective. If I was some humble dev trying to build
> > BPF-based tool that should work on x86, arm64, s390x, and riscv (or
> > whatever other architecture), and dynptr API is only based on kfuncs,
> > I'm screwed. I can't sponsor or do kfunc support for my favorite
> > architecture, I'm stuck waiting for this to be done by someone some
> > time, if ever.
>
> If kfuncs and bpf trampoline don't work on a particular architecture
> that developer is likely screwed anyway. Dynptr is the last thing they
> would worry about.

uprobe+dynptr+ringbuf is all I need for useful apps. Likely or not can
be argued to the end of times.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-04 20:03                                   ` Alexei Starovoitov
@ 2023-01-04 21:57                                     ` Andrii Nakryiko
  0 siblings, 0 replies; 57+ messages in thread
From: Andrii Nakryiko @ 2023-01-04 21:57 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Daniel Borkmann, David Vernet, Joanne Koong, bpf,
	Andrii Nakryiko, kernel-team, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu

On Wed, Jan 4, 2023 at 12:03 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Jan 04, 2023 at 10:59:15AM -0800, Andrii Nakryiko wrote:
> >
> > > >> to actual BPF helpers by then where we go and say, that kfunc has proven itself in production
> > > >> and from an API PoV that it is ready to be a proper BPF helper, and until this point
> > > >
> > > > "Proper BPF helper" model is broken.
> > > > static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
> > > >
> > > > is a hack that works only when compiler optimizes the code.
> > > > See gcc's attr(kernel_helper) workaround.
> > > > This 'proper helper' hack is the reason we cannot compile bpf programs with -O0.
> > > > And because it's uapi we cannot even fix this
> > > > With kfuncs we will be able to compile with -O0 and debug bpf programs with better tools.
> > > > These tools don't exist yet, but we have a way forward whereas with helpers
> > > > we are stuck with -O2.
> > >
> >
> > But specifically about how the BPF helper model is broken, that's at
> > least an exaggeration. BPF helper call is defined at BPF ISA level, it
> > has to be a `call <some constant>;`, and as long as compiler generates
> > such code, it's all good. From C standpoint UAPI is just a function
> > call:
> >
> > bpf_map_lookup_elem(&map, ...);
> >
> > As long as this compiles and generates proper `call 1;` assembly
> > instruction, we are good. If/when both Clang and GCC support an
> > alternative way to define helper and not as a static func pointer, -O0
> > builds (at least in the aspect of calling BPF helpers, I suspect other
> > stuff will break still) will just work. And what's better,
> > bpf_helper_defs.h would be able to pick the best option based on
> > compiler's support with end users not having to care or notice the
> > difference.
>
> Right and that's what gcc did with attribute((kernel_helper(1))),
> but we didn't like it because gcc and clang would diverge.
> Now you're arguing it's just a bpf_helper_defs.h change and we should
> have allowed it?

No, I'm saying if you feel so strongly that the current situation is
bad and attribute-based approach is preferable (presumably to allow
-O0 to work), then we can do that (both on GCC and Clang sides) and
everything will work with no UAPI changes. And I did suggest a
relatively clean approach with BPF_HELPER_DEF() ([0]) which would
combine both old and new ways.

But I personally have no problem with the current approach. You are
bringing it up as an UAPI problem, which I'm claiming it is not.

  [0] https://lore.kernel.org/bpf/CAEf4BzYwRyXG1zE5BK1ZXmxLh+ZPU0=yQhNhpqr0JmfNA30tdQ@mail.gmail.com/
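
To make the "no UAPI changes" point a bit more concrete, here is a
minimal sketch of what a declaration macro along these lines could look
like. DEMO_HELPER_DEF and demo_map_lookup_elem are hypothetical names,
and this is not the BPF_HELPER_DEF() from [0]; the kernel_helper branch
assumes a compiler that provides such an attribute, which may not be the
case for any compiler you have at hand.

```c
/* Older compilers may lack __has_attribute; treat that as "no support". */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

#if __has_attribute(kernel_helper)
/*
 * Assumed attribute-based form: the helper ID is attached to a normal
 * function declaration, so no constant propagation is required.
 */
#define DEMO_HELPER_DEF(ret, name, id, ...) \
	ret name(__VA_ARGS__) __attribute__((kernel_helper(id)))
#else
/*
 * Existing convention: a static function pointer whose value is the
 * helper ID. The compiler has to fold it into a direct `call <id>`.
 */
#define DEMO_HELPER_DEF(ret, name, id, ...) \
	static ret (*name)(__VA_ARGS__) = (void *)(id)
#endif

/* Call sites look the same whichever branch was picked. */
DEMO_HELPER_DEF(void *, demo_map_lookup_elem, 1,
		void *map, const void *key);
```

Either way the BPF program source keeps calling the helper by name, and
only the header that defines the macro needs to know which convention
the compiler supports.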


>
> Also consider that 'call <some constant>' or more precise 'call absolute_address'
> as an instruction exist in only one CPU architecture. It's BPF ISA.
> It's a mistake that I made 8 years ago and inability to fix it bothers me.
> Now we have 100 times more developers than we had 8 years ago.
> I expect 100 times more UAPI and ABI mistakes.
> Minimizing unfixable mistakes is what I'm after.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-04 21:55                       ` Andrii Nakryiko
@ 2023-01-04 23:47                         ` David Vernet
  2023-01-05 21:01                           ` Andrii Nakryiko
  0 siblings, 1 reply; 57+ messages in thread
From: David Vernet @ 2023-01-04 23:47 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Joanne Koong, bpf, Andrii Nakryiko,
	kernel-team, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu

On Wed, Jan 04, 2023 at 01:55:32PM -0800, Andrii Nakryiko wrote:

[...]

> > > Yes, we won't change existing helpers, but we can add new ones if we
> > > need to extend them. That's how APIs work. Yes, they need careful
> > > considerations when designing and implementing new APIs. Yes, mistakes
> > > do happen, that's just fact of life and par for the course of software
> > > development. Yes, we have to live with those mistakes. Nothing changed
> > > about that.
> > >
> > > But somehow libraries and kernel still produce stable APIs and
> > > maintain them because they clearly provide benefits to end users.
> >
> > Did you 'live with mistakes done in libbpf 0.x' ? No.
> 
> for a long time yes. And it's not apples to apples comparison, with
> library it is possible to deprecate APIs, which is what we did. With
> lots of work and gradual transition, but did it.

User space <-> kernel is not an apples to apples comparison with kernel
<-> BPF programs either. Also, you're using the word "possible" here
like it's a foregone conclusion. It is "possible" to deprecate BPF APIs
as well, if we start using kfuncs going forward instead of adding to the
UAPI boundary.

> If we couldn't pull this through, yeah, I would live with whatever
> APIs are there. And added new ones as a better replacement. As is
> always done for APIs, nothing new here.

The point is that you had a choice.

> Within 0.x and 1.x APIs are stable and we live with them. This API
> stability fear doesn't paralyze libbpf development, we still add new
> stable APIs, if they are considered useful and thought through enough.

Nobody is claiming that we can't have stable APIs. We're arguing in
favor of being able to _choose_ which APIs to deprecate. Using your
logic, you wouldn't have been able to deprecate _anything_ for fear of
some user, somewhere being affected by it. I understand the sentiment,
and I agree that it's very important to have conservative and
predictable approaches to deprecation. What I don't think is important
is to provide _indefinite_ guarantees for _all_ APIs between two
different kernel contexts.

And to reiterate, as I've said a few times now but nobody seems to be
responding to (unless I missed something), this is for kernel <-> kernel
programs. We're not even talking about APIs that are available to user
space. Let's at least be clear about the boundaries for which we're
debating the merits of stability, because while some user space tooling
would certainly be affected by choosing to freeze BPF helpers, kfuncs and
BPF helpers are only ever invoked by _kernel_ programs.

> > You've introduced libbpf 1.0 with incompatible api and some users suffered.
> 
> By "suffered" you mean a few systemd folks being grumpy about this?
> And having to do 100 lines of code changes ([0]) to support two
> incompatible major versions of libbpf *simultaneously*?
> 
> On the other hand we got a library with saner error propagation
> behavior and various API normalizations and additions. Not too bad of
> a trade off.

This sounds like an argument in favor of why it is acceptable to
deprecate some things? Why are some users allowed to feel "pain" (a term
you've used in other threads), but other users who are affected by your
choices are just "grumpy"? Also, what about the myriad hypothetical
users you've never heard of (the ones who we're really protecting with
UAPI) who had to deal with breaking API stability changes?

> Sure, deprecation is not easy or free, there was a lot of prep work,
> and some users had to adjust their code to use new APIs. But this is
> quite a tangent.

I don't see how this is tangential to the discussion -- it seems very
relevant. From my perspective, the core of the discussion has been
whether it's acceptable to shift _any_ of the burden of API stability to
users. My point, and I believe Alexei's point as well, is that the
answer is "it depends and it's a tradeoff", as you've essentially said
here.

What I'm failing to understand is why your argument that there are
tradeoffs applies here, but not for kernel <-> BPF kernel programs? I'm
genuinely trying to understand what the distinction is, because from
where I'm sitting it feels like we're being selective about when the
unknown _threat_ of API instability automatically completely overrides
our ability to choose our own deprecation and stability story (a
stability story which is informed by our perception of an API's
importance, usage, etc).

Note that my point here applies to something you've raised on other
threads as well, such as on [0] where you (reasonably) reiterated this
point:

[0]: https://lore.kernel.org/all/CAEf4BzY0aJNGT321Y7Fx01sjHAMT_ynu2-kN_8gB_UELvd7+vw@mail.gmail.com/

> But again. Let me repeat my point *again*. BPF helpers and kfuncs are
> not mutually exclusive, both can and should exist and evolve. That's
> one of the main points which is somehow eluding this conversation.

This is one of the big disconnects for me. If you argue that both BPF
helpers and kfuncs can and should continue to coexist indefinitely, it
feels like you're arguing for two incompatible points (and please
correct me anywhere that I'm unintentionally misrepresenting your
perspective here):

- On the one hand you're arguing that in some cases, _no_ API
  instability is acceptable. That in general, the main kernel <-> kernel
  BPF program API boundary is equivalent to UAPI, and that it's _never_
  acceptable for us to ever, _ever_ deprecate certain APIs because
  _some_ users may be using them, and the possibility of APIs ever
  changing or being deprecated will impose an unacceptable pain to users
  which will make it too difficult to build tooling, and end up
  discouraging adoption onto BPF. It seems that you've been making
  this argument in favor of what you consider to be "core" BPF
  helpers such as bpf_dynptr_is_null(), etc.

- At the same time, on the other hand, you're arguing that _some_ of the
  API boundary between kernel <-> BPF program can be unstable. That it's
  acceptable for _some_ users and _some_ tooling to feel the pain of
  certain APIs changing. To perhaps extrapolate your point a bit
  further, you're arguing that niche / non-core kfuncs can be unstable,
  and that we don't have to worry about the unknown, hypothetical user
  who would feel pain from having to deal with them being deprecated,
  because they're not "core".

Assuming that's all true, my question is:

Why not just give ourselves the _option_ of being able to deem those
core helpers as being indefinitely stable for the foreseeable future,
and keep the unstable kfuncs to have the same stability guarantees as
what they have today? In terms of _stability_ specifically (so ignoring
other concerns you've raised, such as that we need BTF and BPF
trampoline support for kfuncs -- not because they're irrelevant, but
just to keep the discussion focused on stability), what do we gain by
keeping the "core" / "stable" functions as BPF helpers, instead of just
making them "super stable" kfuncs? At least then we have the option in
the far-far-far future to deprecate them if they eventually, way later,
become 100% obsolete. Plus you get the other benefits that Alexei
mentioned such as potentially being able to backport them to older
kernels by including them in modules, etc.

Note that I'm not saying with 100% conviction that we don't have _any_
work to do before freezing helpers (though IMO we should just rip the
bandaid off and do it now), but I am arguing with strong conviction that
once any of that precursor work is taken care of, there is no reason to
use BPF helpers in place of kfuncs. At least, that's how I see it at
this point.

>   [0] https://github.com/systemd/systemd/pull/24511/
> 
> >
> > > We'll get the same amount of flame when we try to change kfunc that's
> > > widely adopted.
> >
> > Of course. That's why we need to define a stability and deprecation
> > plan for them.
> 
> Lots of things that need to be defined and figured out, but we are
> already quick to freeze BPF helpers.

I agree with you that it would be prudent for us to iron some of this
out more concretely. In this discussion it seems like one of the key
points of contention has been around stability, and that the lack of a
concrete policy for kfuncs has largely (but not completely) been the
cause for concern. Perhaps it would help clarify things if someone
submitted a patch set that included a more formal kfunc stability
proposal?

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-04 19:37                                 ` Alexei Starovoitov
@ 2023-01-05  0:13                                   ` Martin KaFai Lau
  2023-01-05 17:17                                     ` KP Singh
  2023-01-05 21:02                                     ` Andrii Nakryiko
  0 siblings, 2 replies; 57+ messages in thread
From: Martin KaFai Lau @ 2023-01-05  0:13 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: David Vernet, Andrii Nakryiko, Joanne Koong, bpf, kernel-team,
	Alexei Starovoitov, Song Liu

On 1/4/23 11:37 AM, Alexei Starovoitov wrote:
> Would you invest in developing application against unstable syscall API? Absolutely.
> People develop all tons of stuff on top of fuse-fs. People develop apps that interact
> with tracing bpf progs that are clearly unstable. They do suffer when kernel side
> changes and people accept that cost. BPF and tracing in general contributed to that mind change.
> In a datacenter quite a few user apps are tied to kernel internals.
> 
>> Imho, it's one of BPF's strengths and
>> we should keep the door open, not close it.
> The strength of BPF was and still is that it has both stable and unstable interfaces.
> Roughly: networking is stable, tracing is unstable.
> The point is that to be stable one doesn't need to use helpers.
> We can make kfuncs stable too if we focus all our efforts this way and
> for that we need to abandon adding helpers though it's a pain short term.
> 
>>>> to actual BPF helpers by then where we go and say, that kfunc has proven itself in production
>>>> and from an API PoV that it is ready to be a proper BPF helper, and until this point
>>> "Proper BPF helper" model is broken.
>>> static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
>>>
>>> is a hack that works only when compiler optimizes the code.
>>> See gcc's attr(kernel_helper) workaround.
>>> This 'proper helper' hack is the reason we cannot compile bpf programs with -O0.
>>> And because it's uapi we cannot even fix this
>>> With kfuncs we will be able to compile with -O0 and debug bpf programs with better tools.
>>> These tools don't exist yet, but we have a way forward whereas with helpers
>>> we are stuck with -O2.
>> Better debugging tools are needed either way, independent of -O0 or -O2. I don't
>> think -O0 is a requirement or barrier for that. It may open up possibilities for
>> new tools, but production is still running with -O2. Proper BPF helper model is
>> broken, but everyone relies on it, and will be for a very very long time to come,
>> whether we like it or not. There is a larger ecosystem around BPF devs outside of
>> kernel, and developers will use the existing means today. There are recommendations /
>> guidelines that we can provide but we also don't have control over what developers
>> are doing. Yet we should make their life easier, not harder.
> Fully fleshed out kfunc infra will make developers job easier. No one is advocating
> to make users suffer.

It is a long discussion. I am replying on this thread with points that I
have also been thinking about regarding kfuncs and helpers.

I think a bpf helper is a kernel function, but helpers need to be defined in
a more tedious form. It requires defining a bpf_func_proto and then wrapping
the function in BPF_CALL_x. It took me a while to understand the reason
behind it. A kfunc is a more natural way for other kernel developers to
expose subsystem features to bpf progs. In time, I believe we will be able to
make kfuncs a similar experience to EXPORT_SYMBOL_*.

Thus, for subsystems (hid, fuse, netdev, etc.) exposing functions to bpf
progs, I think it makes sense to stay with kfuncs from now on. The subsystem
is not exposing something like a syscall as uapi. A bpf prog is part of the
kernel in the sense that it extends that subsystem's code. I don't think bpf
needs to provide more guarantees than EXPORT_SYMBOL_* in terms of api. That
said, we should still review kfuncs to ensure they are sound to the best of
our knowledge at that point, given the limited initial use cases at hand.
I won't be surprised if some of the existing EXPORT_SYMBOL_* kernel functions
are exposed to bpf progs as kfuncs as-is without any change in the future. For
example, a few tcp cc kfuncs such as tcp_slow_start. They are likely to stay
stable without much change for a long time and can be directly exposed as bpf
kfuncs. kfunc is a way to expose subsystem functions without needing the
bpf_func_proto and BPF_CALL_x quirks. When the function can be dual compiled
later, the kfunc can also be inlined.

If kfuncs will be used for subsystems, it is very likely the number of kfuncs
will grow and exceed the bpf helpers soon. This seems to be a stronger reason
to work on the kfunc user experience problems that have been mentioned in this
thread sooner rather than later. They have to be solved regardless. Maybe
start with stable kfuncs first. If a new helper is guaranteed stable, then why
can it not be a kfunc instead of going through bpf_func_proto and BPF_CALL_x?
In time, I hope the bpf helper support in the verifier can be quieted down
(e.g. check_helper_call vs check_kfunc_call) and energy can be focused on
kfuncs, like inlining kfuncs, etc.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-05  0:13                                   ` Martin KaFai Lau
@ 2023-01-05 17:17                                     ` KP Singh
  2023-01-05 21:03                                       ` Andrii Nakryiko
  2023-01-05 21:02                                     ` Andrii Nakryiko
  1 sibling, 1 reply; 57+ messages in thread
From: KP Singh @ 2023-01-05 17:17 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	David Vernet, Andrii Nakryiko, Joanne Koong, bpf, kernel-team,
	Alexei Starovoitov, Song Liu

On Thu, Jan 5, 2023 at 1:14 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 1/4/23 11:37 AM, Alexei Starovoitov wrote:
> > Would you invest in developing application against unstable syscall API? Absolutely.
> > People develop all tons of stuff on top of fuse-fs. People develop apps that interact
> > with tracing bpf progs that are clearly unstable. They do suffer when kernel side
> > changes and people accept that cost. BPF and tracing in general contributed to that mind change.
> > In a datacenter quite a few user apps are tied to kernel internals.
> >
> >> Imho, it's one of BPF's strengths and
> >> we should keep the door open, not close it.
> > The strength of BPF was and still is that it has both stable and unstable interfaces.
> > Roughly: networking is stable, tracing is unstable.
> > The point is that to be stable one doesn't need to use helpers.
> > We can make kfuncs stable too if we focus all our efforts this way and
> > for that we need to abandon adding helpers though it's a pain short term.
> >
> >>>> to actual BPF helpers by then where we go and say, that kfunc has proven itself in production
> >>>> and from an API PoV that it is ready to be a proper BPF helper, and until this point
> >>> "Proper BPF helper" model is broken.
> >>> static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
> >>>
> >>> is a hack that works only when compiler optimizes the code.
> >>> See gcc's attr(kernel_helper) workaround.
> >>> This 'proper helper' hack is the reason we cannot compile bpf programs with -O0.
> >>> And because it's uapi we cannot even fix this
> >>> With kfuncs we will be able to compile with -O0 and debug bpf programs with better tools.
> >>> These tools don't exist yet, but we have a way forward whereas with helpers
> >>> we are stuck with -O2.
> >> Better debugging tools are needed either way, independent of -O0 or -O2. I don't
> >> think -O0 is a requirement or barrier for that. It may open up possibilities for
> >> new tools, but production is still running with -O2. Proper BPF helper model is
> >> broken, but everyone relies on it, and will be for a very very long time to come,
> >> whether we like it or not. There is a larger ecosystem around BPF devs outside of
> >> kernel, and developers will use the existing means today. There are recommendations /
> >> guidelines that we can provide but we also don't have control over what developers
> >> are doing. Yet we should make their life easier, not harder.
> > Fully fleshed out kfunc infra will make developers job easier. No one is advocating
> > to make users suffer.
>
> It is a long discussion. I am replying on this thread with points that I
> have also been thinking about regarding kfuncs and helpers.
>
> I think a bpf helper is a kernel function, but helpers need to be defined in
> a more tedious form. It requires defining a bpf_func_proto and then wrapping
> the function in BPF_CALL_x. It took me a while to understand the reason
> behind it. A kfunc is a more natural way for other kernel developers to
> expose subsystem features to bpf progs. In time, I believe we will be able to
> make kfuncs a similar experience to EXPORT_SYMBOL_*.
>
> Thus, for subsystems (hid, fuse, netdev, etc.) exposing functions to bpf
> progs, I think it makes sense to stay with kfuncs from now on. The subsystem
> is not exposing something like a syscall as uapi. A bpf prog is part of the
> kernel in the sense that it extends that subsystem's code. I don't think bpf
> needs to provide more guarantees than EXPORT_SYMBOL_* in terms of api. That
> said, we should still review kfuncs to ensure they are sound to the best of
> our knowledge at that point, given the limited initial use cases at hand.
> I won't be surprised if some of the existing EXPORT_SYMBOL_* kernel functions
> are exposed to bpf progs as kfuncs as-is without any change in the future. For
> example, a few tcp cc kfuncs such as tcp_slow_start. They are likely to stay
> stable without much change for a long time and can be directly exposed as bpf
> kfuncs. kfunc is a way to expose subsystem functions without needing the
> bpf_func_proto and BPF_CALL_x quirks. When the function can be dual compiled
> later, the kfunc can also be inlined.
>
> If kfuncs will be used for subsystems, it is very likely the number of kfuncs
> will grow and exceed the bpf helpers soon. This seems to be a stronger reason
> to work on the kfunc user experience problems that have been mentioned in this
> thread sooner rather than later. They have to be solved regardless. Maybe
> start with stable kfuncs first. If a new helper is guaranteed stable, then why
> can it not be a kfunc instead of going through bpf_func_proto and BPF_CALL_x?
> In time, I hope the bpf helper support in the verifier can be quieted down
> (e.g. check_helper_call vs check_kfunc_call) and energy can be focused on
> kfuncs, like inlining kfuncs, etc.


Sorry, I am late to this discussion. The way I read this is that
kfuncs and helpers are implementation details and the real question is
about the stability and mutability of the helper methods.

I think there are two kinds of BPF program developers, and I might be
oversimplifying to convey a point here:

[1] Tracing people: They craft tracing programs and are more
accustomed to probing deeper into kernel internals, handling variable
renames and consequently will tolerate a kfunc changing its signature,
being renamed or disappearing.

[2] Network people: They are not accustomed to mutability the same way
as the tracing people. If there is mutability here, these users will
face a change in developer experience.

I see two paths forward here:

[a] We want to somewhat preserve the developer experience of [2] and
we find a way to do somewhat stable APIs. kfuncs have the benefit that
they are eventually mutable, but a longer stability guarantee for
helpers used by [2] could ameliorate the pains of mutability.
Something we could do for certain helpers is a deprecation story (e.g.
a kfunc won't change for X kernel versions, or when we annotate kfuncs
as deprecated, libbpf can warn users "this kfunc is going away in
kernel version Z").

If this would be difficult to guarantee and we do care about developer
experience, we might need to have some helpers exposed as UAPI.

[b] We accept the fact the user experience will change more for [2]
and that's a trade-off we accept. IMHO, this is not ideal and while
tracing folks have found a way to cope, it would be yet another thing
to worry about for folks who are not used to it.

There are things we can do to make it slightly less burdensome for the
user, such as adding a shim in BPF headers (it won't solve problems
for everyone, e.g. inline BPF or other languages, but it will give
them a template for their respective "shims").

Another thing to consider is whether there are use-cases where some users
disable BTF (for whatever reason, like running BPF in a pacemaker :P
or in extremely low memory cases).

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-04 23:47                         ` David Vernet
@ 2023-01-05 21:01                           ` Andrii Nakryiko
  2023-01-06  2:54                             ` Alexei Starovoitov
  0 siblings, 1 reply; 57+ messages in thread
From: Andrii Nakryiko @ 2023-01-05 21:01 UTC (permalink / raw)
  To: David Vernet
  Cc: Alexei Starovoitov, Joanne Koong, bpf, Andrii Nakryiko,
	kernel-team, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu

Didn't find the best place to put this, so it will be here. I think it
would be beneficial to discuss BPF helpers freeze in BPF office hours.
So I took the liberty to put it up for next BPF office hours, 9am, Jan
12th 2023. I hope that some more people that have exposure to
real-world BPF applications and pains associated with all that could
join the discussion, but obviously anyone is welcome as well, no
matter which way they are leaning.

Please consider joining, see details on Zoom meeting at [0]

For the rest, please see below. I'll be out for a few days and won't
be able to reply, my apologies.

  [0] https://docs.google.com/spreadsheets/d/1LfrDXZ9-fdhvPEp_LHkxAMYyxxpwBXjywWa0AejEveU/edit#gid=0

On Wed, Jan 4, 2023 at 3:47 PM David Vernet <void@manifault.com> wrote:
>
> On Wed, Jan 04, 2023 at 01:55:32PM -0800, Andrii Nakryiko wrote:
>
> [...]
>
> > > > Yes, we won't change existing helpers, but we can add new ones if we
> > > > need to extend them. That's how APIs work. Yes, they need careful
> > > > considerations when designing and implementing new APIs. Yes, mistakes
> > > > do happen, that's just fact of life and par for the course of software
> > > > development. Yes, we have to live with those mistakes. Nothing changed
> > > > about that.
> > > >
> > > > But somehow libraries and kernel still produce stable APIs and
> > > > maintain them because they clearly provide benefits to end users.
> > >
> > > Did you 'live with mistakes done in libbpf 0.x' ? No.
> >
> > for a long time yes. And it's not apples to apples comparison, with
> > library it is possible to deprecate APIs, which is what we did. With
> > lots of work and gradual transition, but did it.
>
> User space <-> kernel is not an apples to apples comparison with kernel
> <-> BPF programs either. Also, you're using the word "possible" here
> like it's a foregone conclusion. It is "possible" to deprecate BPF APIs
> as well, if we start using kfuncs going forward instead of adding to the
> UAPI boundary.

I'm not sure what to make out of this reply, to be honest. Yes, I
think kernel and libraries are sufficiently different to not draw
direct comparisons. No, I didn't claim anything about foregone
conclusions. I think it's even possible to deprecate BPF helpers, if
we really want to. In the end, technically, the only UAPI part about
BPF helper is its ID. That should stay fixed. We do change over time
which helpers are available in which program types. Yes, it would be
really bad to change helper signature and I'd be very much against
this, but from my perspective (and I'm sure others will disagree),
it's in the realm of possibility to do gradual deprecation of some
helpers. We'll leave BPF_FUNC_xxx enumerator intact, of course, but
add a simple wrapper that will just return -ENOTSUP.

E.g., Linus requested bpf_probe_read() to not exist and not be used,
everyone agreed. Good opportunity?
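
A minimal user-space sketch of that deprecation idea, under stated
assumptions: every name here (demo_func_id, demo_call, and so on) is
hypothetical and none of it is kernel code. It only shows the shape of
"the ID stays fixed, the implementation becomes a stub that returns
-ENOTSUP".

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical ID space standing in for BPF_FUNC_xxx. IDs are never
 * renumbered or reused, even after a function is deprecated. */
enum demo_func_id {
	DEMO_FUNC_unspec = 0,
	DEMO_FUNC_probe_read = 1,	/* deprecated, ID stays reserved */
	DEMO_FUNC_get_answer = 2,
	DEMO_FUNC_MAX,
};

typedef long (*demo_helper_fn)(long arg);

/* A helper that is still supported. */
static long demo_get_answer(long arg)
{
	return arg + 42;
}

/* Shared stub for deprecated helpers: the call resolves, but fails. */
static long demo_deprecated(long arg)
{
	(void)arg;
	return -ENOTSUP;
}

static const demo_helper_fn demo_helpers[DEMO_FUNC_MAX] = {
	[DEMO_FUNC_probe_read] = demo_deprecated,
	[DEMO_FUNC_get_answer] = demo_get_answer,
};

/* Dispatch by ID; unknown or unassigned IDs are rejected outright. */
static long demo_call(int id, long arg)
{
	if (id <= DEMO_FUNC_unspec || id >= DEMO_FUNC_MAX || !demo_helpers[id])
		return -EINVAL;
	return demo_helpers[id](arg);
}
```

In this sketch a caller built against the old ID still resolves, and it
gets an explicit error at call time instead of silently running
something else.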

But really, we are going on so many tangents instead of addressing
specific points. As I said early on in the discussion, this will be a
discussion to exhaustion of one side or the other, unfortunately.

>
> > If we couldn't pull this through, yeah, I would live with whatever
> > APIs are there. And added new ones as a better replacement. As is
> > always done for APIs, nothing new here.
>
> The point is that you had a choice.

The point is that UAPI stability is not the end of the world and
paranoia is bad. We shouldn't get paralyzed because we add APIs. We do
that to libbpf and APIs will stay stable within the entire 1.x version.
Yes, we don't have such a nice "luxury" with kernel, but see above.
There are libraries that go to great lengths to keep old APIs, however
broken or inconvenient they are. Yes, it's a pain, but it doesn't
paralyze development.

>
> > Within 0.x and 1.x APIs are stable and we live with them. This API
> > stability fear doesn't paralyze libbpf development, we still add new
> > stable APIs, if they are considered useful and thought through enough.
>
> Nobody is claiming that we can't have stable APIs. We're arguing in
> favor of being able to _choose_ which APIs to deprecate. Using your
> logic, you wouldn't have been able to deprecate _anything_ for fear of
> some user, somewhere being affected by it. I understand the sentiment,
> and I agree that it's very important to have conservative and
> predictable approaches to deprecation. What I don't think is important
> is to provide _indefinite_ guarantees for _all_ APIs between two
> different kernel contexts.
>
> And to reiterate, as I've said a few times now but nobody seems to be
> responding to (unless I missed something), this is for kernel <-> kernel
> programs. We're not even talking about APIs that are available to user
> space. Let's at least be clear about the boundaries for which we're
> debating the merits of stability, because while some user space tooling
> would certainly be affected by choosing to freeze BPF helpers, kfuncs and
> BPF helpers are only ever invoked by _kernel_ programs.

I'm also for the choice. And freezing BPF helpers removes this choice.
I want to have functionality that won't depend on arch-specific kfunc
support, won't depend on BTF, etc.

Think about it this way (and try to avoid the temptation to point out
imperfections of analogy). How would you feel if Rust added slice
support, and said that it will work in some super basic form
everywhere. But some things, like deriving subslice or checking
slice's size would be architecture-specific, they will initially work
in Tier 1 supported architectures, maybe soon they might work on
Tier 2, but unlikely to work on Tier 3, unless someone does a bunch
of highly technical work and signs up to maintain it going forward.
Does this sound reasonable for something that is a stable and simple
abstraction, which should feel like an integral part of the BPF
framework? It doesn't have any ties into arch-specific details, it
doesn't require debug information to be usable and efficient, etc.
Alas.

Another example. I'm adding BPF open-coded iterators. One of them is
fundamentally an improved (in terms of functionality and ergonomics)
version of bpf_loop() and bounded BPF loop support. It consists of a
black-box struct bpf_iter to keep state and three helpers:
bpf_iter_range_new(), bpf_iter_range_next() and
bpf_iter_range_destroy(). It can be used roughly like this:

struct bpf_iter it;
int N = ..., *v, i;

bpf_iter_range_new(&it, 0, N);
while ((v = bpf_iter_range_next(&it))) {
  i = *v;
  /* use i which will take values from 0 to N-1 */
}

Not too bad, but a bit verbose. I'd like to add a simple macro to help
write this a bit more naturally. Right now I know how to do it so that
it looks roughly like this:

bpf_for(i, 0, N, ({
    /* my code using i and any other local variables */
}));
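
For readers who want to see how such a macro could expand, here is a
user-space mock, not the proposed kernel implementation. The struct
layout and the three function bodies are assumptions made up for
illustration (the real struct bpf_iter is a black box), and the loop
body is passed as a GNU C statement expression, so this needs gcc or
clang.

```c
#include <stddef.h>

/* Mock iterator state; the fields are an illustrative guess. */
struct bpf_iter {
	int cur;	/* value most recently handed out */
	int next;	/* value to hand out on the next call */
	int end;	/* exclusive upper bound */
};

static void bpf_iter_range_new(struct bpf_iter *it, int start, int end)
{
	it->cur = start;
	it->next = start;
	it->end = end;
}

/* Returns a pointer to the next value, or NULL once the range is done. */
static int *bpf_iter_range_next(struct bpf_iter *it)
{
	if (it->next >= it->end)
		return NULL;
	it->cur = it->next++;
	return &it->cur;
}

static void bpf_iter_range_destroy(struct bpf_iter *it)
{
	(void)it;	/* nothing to release in this mock */
}

/* One possible expansion: new, loop over next, destroy. */
#define bpf_for(i, start, end, body) do {			\
	struct bpf_iter it_;					\
	int *v_;						\
	bpf_iter_range_new(&it_, (start), (end));		\
	while ((v_ = bpf_iter_range_next(&it_))) {		\
		(i) = *v_;					\
		body;						\
	}							\
	bpf_iter_range_destroy(&it_);				\
} while (0)

/* Usage: sums 0 + 1 + ... + (n - 1). */
static int sum_below(int n)
{
	int i, sum = 0;

	bpf_for(i, 0, n, ({ sum += i; }));
	return sum;
}
```

The macro hides only the new/next/destroy boilerplate; the body still
sees the caller's local variables, which is the ergonomic point being
made.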

Here are a few concerns if I'm made to do these bpf_iter_xxx() functions as kfuncs:

a) I'll have an ability to do this iteration only on architectures
that do support kfuncs, which is not *all* architectures that support
BPF. So there are cases where I can write some BPF programs, the kernel
could be recent enough to support bpf_iter_*() APIs, but I won't be
able to rely on them in my BPF application (which is some simple tool
that doesn't need anything fancy from BPF, no BTF, no BPF trampoline, no
nothing, I just want to trace some uprobes and USDTs, fetch some data
from a user-space app, do post-processing, maintain a few simple ARRAY
and HASH maps, dump data through perfbuf/ringbuf).

Why do I need to explain to customers why they can't use bpf_iter_*()
even if they have a recent kernel? There is no reason for a simple
looping construct to require all this extra baggage. ZERO.

b) I'd like to provide bpf_for() macro from libbpf. Well, whether you
agree or not, libbpf does provide stable APIs as well. bpf_for()
can't be really stable because bpf_iter_*() funcs are declared
unstable (and if they are stable, then why can't I make them BPF
helpers). If something changes, it will be on libbpf to come up with
some creative, ingenious workarounds. If they get removed -- oops, too
bad, libbpf.

Also given that kfuncs are not part of bpf_helper_defs.h (and
shouldn't, they are unstable), I'll have to define __ksym definitions
for necessary APIs somewhere in the same header where bpf_for() is
defined. Luckily (I checked, not too lazy to try to solve problems
end-to-end, would be happy for someone to reply to my specific request
to do the same, but alas), it's ok to have multiple duplicated extern
__ksym definitions. So it's annoying, but at least not impossible.

I know what will come next: proposal to add some unstable headers and
APIs to libbpf and stuff. It's another discussion, everything is
possible, etc, etc. But I'm hoping that at least some people will
garner a bit of empathy for consequences of these helpers vs kfunc
choices.


Just to reiterate. I have no problem with kfuncs per se. Task struct,
ct, xfrm, whatever other things that are working with kernel objects
-- totally makes sense to have them as kfunc. Totally.

But concepts like dynptr (memory slice), for loop, etc. I see zero,
absolutely zero, reason to dictate that they should be unstable and
arch-specific.

>
> > > You've introduced libbpf 1.0 with incompatible api and some users suffered.
> >
> > By "suffered" you mean a few systemd folks being grumpy about this?
> > And having to do 100 lines of code changes ([0]) to support two
> > incompatible major versions of libbpf *simultaneously*?
> >
> > On the other hand we got a library with saner error propagation
> > behavior and various API normalizations and additions. Not too bad of
> > a trade off.
>
> This sounds like an argument in favor of why it is acceptable to
> deprecate some things? Why are some users allowed to feel "pain" (a term
> you've used in other threads), but other users who are affected by your
> choices are just "grumpy"? Also, what about the myriad hypothetical
> users you've never heard of (the ones who we're really protecting with
> UAPI) who had to deal with breaking API stability changes?

I think you are twisting what I'm arguing for. I didn't say that
everything should be stable, did I? I'm saying some things should be
stable, like dynptr and for loop iterator.

As for the libbpf deprecation process. I'm happy to discuss how it
went and what could have been done better. But I don't think this
thread is the place to discuss this. Please, ping me offline or start
a separate thread.

>
> > Sure, deprecation is not easy or free, there was a lot of prep work,
> > and some users had to adjust their code to use new APIs. But this is
> > quite a tangent.
>
> I don't see how this is tangential to the discussion -- it seems very
> relevant. From my perspective, the core of the discussion has been
> whether it's acceptable to shift _any_ of the burden of API stability to
> users. My point, and I believe Alexei's point as well, is that the
> answer is "it depends and it's a tradeoff", as you've essentially said
> here.

Interesting. Alexei is saying "no more BPF helpers", and that has all
the consequences I outlined above (and probably more I haven't thought
about). Daniel is asking to have this "it depends" option by not
taking such a hard line on BPF helpers freeze.

From my perspective, the core of the discussion is whether stability
of UAPI is the paramount issue that overshadows everything else or
not. Me and Daniel are saying no, you and Alexei are arguing yes.

>
> What I'm failing to understand is why your argument that there are
> tradeoffs applies here, but not for kernel <-> BPF kernel programs? I'm
> genuinely trying to understand what the distinction is, because from
> where I'm sitting it feels like we're being selective about when the
> unknown _threat_ of API instability automatically completely overrides
> our ability to choose our own deprecation and stability story (a
> stability story which is informed by our perception of an API's
> importance, usage, etc).

There is some misunderstanding obviously. I'm all for flexibility and
considering tradeoffs. But dictating "no more BPF helpers" is not
that, it's the opposite of that. And yes, I do not believe that UAPI
stability is the most important and the only aspect that should be
taken into consideration.

I really hope that specific points about dynptr and for loop iterator
help you understand my position. It's not even so much about stability
(though that matters for core concepts, obviously), but rather all the
incidental complexities, dependencies, and limitations that come with
kfuncs (and some, like arch-specific support, are fundamental; while
others, like detecting their support are currently big hurdles, but
could be solved; and let's solve them first, before taking these hard
stances, not the other way around).

>
> Note that my point here applies to something you've raised on other
> threads as well, such as on [0] where you (reasonably) reiterated this
> point:
>
> [0]: https://lore.kernel.org/all/CAEf4BzY0aJNGT321Y7Fx01sjHAMT_ynu2-kN_8gB_UELvd7+vw@mail.gmail.com/
>
> > But again. Let me repeat my point *again*. BPF helpers and kfuncs are
> > not mutually exclusive, both can and should exist and evolve. That's
> > one of the main points which is somehow eluding this conversation.
>
> This is one of the big disconnects for me. If you argue that both BPF
> helpers and kfuncs can and should continue to coexist indefinitely, it
> feels like you're arguing for two incompatible points (and please
> correct me anywhere that I'm unintentionally misrepresenting your
> perspective here):
>
> - On the one hand you're arguing that in some cases, _no_ API
>   instability is acceptable. That in general, the main kernel <-> kernel
>   BPF program API boundary is equivalent to UAPI, and that it's _never_
>   acceptable for us to ever, _ever_ deprecate certain APIs because

you are being hyperbolic and overdramatic again for no good reason,
"ever, _ever_" -- really? There is no such thing.

>   _some_ users may be using them, and the possibility of APIs ever
>   changing or being deprecated will impose an unacceptable pain on users
>   which will make it too difficult to build tooling, and end up
>   discouraging adoption of BPF. It seems that you've been making
>   this argument in favor of what you consider to be "core" BPF
>   helpers such as bpf_dynptr_is_null(), etc.
>
> - At the same time, on the other hand, you're arguing that _some_ of the
>   API boundary between kernel <-> BPF program can be unstable. That it's
>   acceptable for _some_ users and _some_ tooling to feel the pain of
>   certain APIs changing. To perhaps extrapolate your point a bit
>   further, you're arguing that niche / non-core kfuncs can be unstable,
>   and that we don't have to worry about the unknown, hypothetical user
>   who would feel pain from having to deal with them being deprecated,
>   because they're not "core".
>

Yes, but I don't see the contradiction. If BPF map abstraction and its
API was declared unstable (and made arch-specific, this is not a small
detail which you conveniently want to ignore below), I as a user would
think twice before using them. Depends on the situation and what I'm
trying to do. Developing some app within Meta internally -- sure,
I'd probably still go for it. But building some tool like perf or
retsnoop -- I'd think twice if I want to take dependency on BPF map
(or dynptr for that matter), if it potentially limits the
applicability of my application.

But when we think about kfuncs that work with kernel objects
(task_struct, sockets, whatnot), yes, it's reasonable that we in BPF
can't guarantee stability of those (though I'd very much hope that we
wouldn't willy-nilly keep changing them for no good reason and would make
a reasonable effort to isolate end users from some reasonable underlying
changes to how task/socket/etc are handled within kernel). If tomorrow
the kernel decides to drop socket abstraction, I don't think BPF
subsystem should "emulate" it somehow (though even that depends, tbh).

So yes, I don't see contradictions. With BPF map, dynptr, (some)
iterators -- BPF controls its destiny, it can and should provide an
unassuming interface, abstractions, APIs and stick to supporting them
and not dictating arbitrary extra dependencies.

> Assuming that's all true, my question is:
>
> Why not just give ourselves the _option_ of being able to deem those
> core helpers as being indefinitely stable for the foreseeable future,
> and keep the unstable kfuncs to have the same stability guarantees as
> what they have today? In terms of _stability_ specifically (so ignoring
> other concerns you've raised, such as that we need BTF and BPF
> trampoline support for kfuncs -- not because they're irrelevant, but
> just to keep the discussion focused on stability), what do we gain by

Quite convenient to ignore very important limitations, of course. But
hopefully I addressed your question above?

> keeping the "core" / "stable" functions as BPF helpers, instead of just
> making them "super stable" kfuncs? At least then we have the option in
> the far-far-far future to deprecate them if they eventually, way later,
> become 100% obsolete. Plus you get the other benefits that Alexei
> mentioned such as potentially being able to backport them to older
> kernels by including them in modules, etc.
>
> Note that I'm not saying with 100% conviction that we don't have _any_
> work to do before freezing helpers (though IMO we should just rip the
> bandaid and do it now), but I am arguing with strong conviction that
> once any of that precursor work is taken care of, there is no reason to
> use BPF helpers in place of kfuncs. At least, that's how I see it at
> this point.

I disagree about ripping the bandaid and precluding the dynptr framework
from being whole before we solve various problems I pointed out in [1]
(which unfortunately was mostly ignored, it seems).

And for the "for loop iterator", I absolutely do not want to have a
useful generic abstraction for a repeatable loop that will have a few
asterisks associated with it, dictating which arches and what kernel
config values (beyond basic BPF ones) should be ensured to make
iteration work. Kills any motivation to finish it. Imagine if HASH map
didn't work on some new minor platform, even though basic BPF works
there. How does that sound to you?

  [1] https://lore.kernel.org/all/CAEf4BzY0aJNGT321Y7Fx01sjHAMT_ynu2-kN_8gB_UELvd7+vw@mail.gmail.com/

>
> >   [0] https://github.com/systemd/systemd/pull/24511/
> >
> > >
> > > > We'll get the same amount of flame when we try to change kfunc that's
> > > > widely adopted.
> > >
> > > Of course. That's why we need to define a stability and deprecation
> > > plan for them.
> >
> > Lots of things that need to be defined and figured out, but we are
> > already quick to freeze BPF helpers.
>
> I agree with you that it would be prudent for us to iron some of this
> out more concretely. In this discussion it seems like one of the key
> points of contention has been around stability, and that the lack of a
> concrete policy for kfuncs has largely (but not completely) been the
> cause for concern. Perhaps it would help clarify things if someone
> submitted a patch set that included a more formal kfunc stability
> proposal?

Stability isn't the only concern, hopefully I made this clear above.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-05  0:13                                   ` Martin KaFai Lau
  2023-01-05 17:17                                     ` KP Singh
@ 2023-01-05 21:02                                     ` Andrii Nakryiko
  1 sibling, 0 replies; 57+ messages in thread
From: Andrii Nakryiko @ 2023-01-05 21:02 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	David Vernet, Joanne Koong, bpf, kernel-team, Alexei Starovoitov,
	Song Liu

On Wed, Jan 4, 2023 at 4:13 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 1/4/23 11:37 AM, Alexei Starovoitov wrote:
> > Would you invest in developing application against unstable syscall API? Absolutely.
> > People develop all tons of stuff on top of fuse-fs. People develop apps that interact
> > with tracing bpf progs that are clearly unstable. They do suffer when kernel side
> > changes and people accept that cost. BPF and tracing in general contributed to that mind change.
> > In a datacenter quite a few user apps are tied to kernel internals.
> >
> >> Imho, it's one of BPF's strengths and
> >> we should keep the door open, not close it.
> > The strength of BPF was and still is that it has both stable and unstable interfaces.
> > Roughly: networking is stable, tracing is unstable.
> > The point is that to be stable one doesn't need to use helpers.
> > We can make kfuncs stable too if we focus all our efforts this way and
> > for that we need to abandon adding helpers though it's a pain short term.
> >
> >>>> to actual BPF helpers by then where we go and say, that kfunc has proven itself in production
> >>>> and from an API PoV that it is ready to be a proper BPF helper, and until this point
> >>> "Proper BPF helper" model is broken.
> >>> static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
> >>>
> >>> is a hack that works only when compiler optimizes the code.
> >>> See gcc's attr(kernel_helper) workaround.
> >>> This 'proper helper' hack is the reason we cannot compile bpf programs with -O0.
> >>> And because it's uapi we cannot even fix this
> >>> With kfuncs we will be able to compile with -O0 and debug bpf programs with better tools.
> >>> These tools don't exist yet, but we have a way forward whereas with helpers
> >>> we are stuck with -O2.
> >> Better debugging tools are needed either way, independent of -O0 or -O2. I don't
> >> think -O0 is a requirement or barrier for that. It may open up possibilities for
> >> new tools, but production is still running with -O2. Proper BPF helper model is
> >> broken, but everyone relies on it, and will be for a very very long time to come,
> >> whether we like it or not. There is a larger ecosystem around BPF devs outside of
> >> kernel, and developers will use the existing means today. There are recommendations /
> >> guidelines that we can provide but we also don't have control over what developers
> >> are doing. Yet we should make their life easier, not harder.
> > Fully fleshed out kfunc infra will make developers job easier. No one is advocating
> > to make users suffer.
>
> It is a long discussion. I am replying on a thread with points that I have also
> been thinking about kfunc and helper.
>
> I think bpf helper is a kernel function but helpers need to be defined in a more
> tedious form. It requires to define bpf_func_proto and then wrap into
> BPF_CALL_x. It was not obvious for me to get around to understand the reason

This is subjective and there is no point in arguing about that. I find
BPF helper definitions more obvious and more discoverable, for
example. But it doesn't matter what I prefer personally.

Whatever the case might be, this is purely internal implementation
detail that can be improved and unified much more between helpers and
kfuncs, and it's way less important compared to stability and
usability issues brought up in this thread, as it has no bearing on
user's experience.

> behind it. With kfunc, it is a more natural way for other kernel developers to
> expose subsystem features to bpf prog. In time, I believe we will be able to
> make kfunc have a similar experience to EXPORT_SYMBOL_*.

The original goal for kfuncs was to just directly expose kernel
functions as is, but then we ended up adding allowlists, tuning them,
fixing them, reworking them. We are talking about different lists per
different program types, etc. But again, this is internal matters.

There is fundamentally no difference between how kfunc and helpers
can/should be defined, they are both kernel functions with additional
annotations. If we put work into it we can converge the mechanics of
how they are defined.
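
To make the "same function, different annotations" point concrete,
here is a minimal userspace sketch, not kernel code. MODEL_CALL_1,
struct model_helper_proto and struct toy_dynptr are hypothetical
stand-ins (loosely modeled on BPF_CALL_x, bpf_func_proto and a dynptr);
they only illustrate where the extra helper boilerplate comes from:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;

/* Hypothetical stand-in for a kernel object; not the real dynptr layout. */
struct toy_dynptr {
    void *data;
    uint32_t offset;
};

/* kfunc style: a plain typed function. In the kernel the type
 * information would come from BTF rather than a hand-written table. */
long toy_kfunc_get_offset(const struct toy_dynptr *ptr)
{
    if (!ptr || !ptr->data)
        return -EINVAL;
    return (long)ptr->offset;
}

/* Helper style, loosely modeled on BPF_CALL_x: the callable entry point
 * takes five u64 "registers" and casts the first one back to a typed
 * argument before calling the typed body. */
#define MODEL_CALL_1(name, type1, arg1)                         \
    static long name##_typed(type1 arg1);                       \
    u64 name(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)            \
    {                                                           \
        (void)r2; (void)r3; (void)r4; (void)r5;                 \
        return (u64)name##_typed((type1)(uintptr_t)r1);         \
    }                                                           \
    static long name##_typed(type1 arg1)

MODEL_CALL_1(toy_helper_get_offset, const struct toy_dynptr *, ptr)
{
    return toy_kfunc_get_offset(ptr);
}

/* Hypothetical stand-in for bpf_func_proto: a hand-written description
 * of the same signature the typed function already carries. */
struct model_helper_proto {
    u64 (*func)(u64, u64, u64, u64, u64);
    const char *ret_type;
    const char *arg1_type;
};

const struct model_helper_proto toy_helper_get_offset_proto = {
    .func = toy_helper_get_offset,
    .ret_type = "integer",
    .arg1_type = "ptr_to_dynptr",
};

/* Usage: both entry points return the same value for the same object. */
void toy_selfcheck(void)
{
    struct toy_dynptr d = { .data = &d, .offset = 7 };

    assert(toy_kfunc_get_offset(&d) == 7);
    assert((long)toy_helper_get_offset_proto.func((u64)(uintptr_t)&d,
                                                  0, 0, 0, 0) == 7);
}
```

The typed body is identical in both styles; only the register-casting
wrapper and the hand-written description differ, and that is the part
that could plausibly be converged.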


>
> Thus, for subsystem (hid, fuse, netdev...etc) exposing functions to bpf prog, I
> think it makes sense to stay with kfunc from now on. The subsystem is not
> exposing something like syscall as an uapi. bpf prog is part of the kernel in
> the sense that it extends that subsystem code. I don't think bpf needs to
> provide extra and more guarantee than the EXPORT_SYMBOL_* in term of api. That
> said, we should still review kfunc in a way that ensuring it is competent to the
> best of our knowledge at that point with the limited initial use cases at hand.
> I won't be surprised some of the existing EXPORT_SYMBOL_* kernel functions will
> be exposed to the bpf prog as kfunc as-is without any change in the future. For
> example, a few tcp cc kfuncs such as tcp_slow_start. They are likely stable
> without much change for a long time. It can be directly exposed as bpf kfunc.
> kfunc is a way to expose subsystem function without needing the bpf_func_proto
> and BPF_CALL_x quirks. When the function can be dual compiled later, the kfunc
> can also be inlined.
>
> If kfunc will be used for subsystem, it is very likely the number of kfunc will
> grow and exceed the bpf helpers soon.  This seems to be a stronger need to work
> on the user experience problems about kfunc that have been mentioned in this thread
> sooner than later. They have to be solved regardless. May be start with stable
> kfunc first. If the new helper is guaranteed stable, then why it cannot be kfunc
> but instead needs to go through the bpf_func_proto and BPF_CALL_x?  In time, I
> hope the bpf helper support in the verifier can be quieted down (eg.
> check_helper_call vs check_kfunc_call) and focus energy into kfunc like inlining
> kfunc...etc.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-05 17:17                                     ` KP Singh
@ 2023-01-05 21:03                                       ` Andrii Nakryiko
  2023-01-06  1:32                                         ` KP Singh
  0 siblings, 1 reply; 57+ messages in thread
From: Andrii Nakryiko @ 2023-01-05 21:03 UTC (permalink / raw)
  To: KP Singh
  Cc: Martin KaFai Lau, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, David Vernet, Joanne Koong, bpf, kernel-team,
	Alexei Starovoitov, Song Liu

On Thu, Jan 5, 2023 at 9:17 AM KP Singh <kpsingh@kernel.org> wrote:
>
> On Thu, Jan 5, 2023 at 1:14 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >
> > On 1/4/23 11:37 AM, Alexei Starovoitov wrote:
> > > Would you invest in developing application against unstable syscall API? Absolutely.
> > > People develop all tons of stuff on top of fuse-fs. People develop apps that interact
> > > with tracing bpf progs that are clearly unstable. They do suffer when kernel side
> > > changes and people accept that cost. BPF and tracing in general contributed to that mind change.
> > > In a datacenter quite a few user apps are tied to kernel internals.
> > >
> > >> Imho, it's one of BPF's strengths and
> > >> we should keep the door open, not close it.
> > > The strength of BPF was and still is that it has both stable and unstable interfaces.
> > > Roughly: networking is stable, tracing is unstable.
> > > The point is that to be stable one doesn't need to use helpers.
> > > We can make kfuncs stable too if we focus all our efforts this way and
> > > for that we need to abandon adding helpers though it's a pain short term.
> > >
> > >>>> to actual BPF helpers by then where we go and say, that kfunc has proven itself in production
> > >>>> and from an API PoV that it is ready to be a proper BPF helper, and until this point
> > >>> "Proper BPF helper" model is broken.
> > >>> static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
> > >>>
> > >>> is a hack that works only when compiler optimizes the code.
> > >>> See gcc's attr(kernel_helper) workaround.
> > >>> This 'proper helper' hack is the reason we cannot compile bpf programs with -O0.
> > >>> And because it's uapi we cannot even fix this
> > >>> With kfuncs we will be able to compile with -O0 and debug bpf programs with better tools.
> > >>> These tools don't exist yet, but we have a way forward whereas with helpers
> > >>> we are stuck with -O2.
> > >> Better debugging tools are needed either way, independent of -O0 or -O2. I don't
> > >> think -O0 is a requirement or barrier for that. It may open up possibilities for
> > >> new tools, but production is still running with -O2. Proper BPF helper model is
> > >> broken, but everyone relies on it, and will be for a very very long time to come,
> > >> whether we like it or not. There is a larger ecosystem around BPF devs outside of
> > >> kernel, and developers will use the existing means today. There are recommendations /
> > >> guidelines that we can provide but we also don't have control over what developers
> > >> are doing. Yet we should make their life easier, not harder.
> > > Fully fleshed out kfunc infra will make developers job easier. No one is advocating
> > > to make users suffer.
> >
> > It is a long discussion. I am replying on a thread with points that I have also
> > been thinking about kfunc and helper.
> >
> > I think bpf helper is a kernel function but helpers need to be defined in a more
> > tedious form. It requires to define bpf_func_proto and then wrap into
> > BPF_CALL_x. It was not obvious for me to get around to understand the reason
> > behind it. With kfunc, it is a more natural way for other kernel developers to
> > expose subsystem features to bpf prog. In time, I believe we will be able to
> > > make kfunc have a similar experience to EXPORT_SYMBOL_*.
> >
> > Thus, for subsystem (hid, fuse, netdev...etc) exposing functions to bpf prog, I
> > think it makes sense to stay with kfunc from now on. The subsystem is not
> > exposing something like syscall as an uapi. bpf prog is part of the kernel in
> > the sense that it extends that subsystem code. I don't think bpf needs to
> > provide extra and more guarantee than the EXPORT_SYMBOL_* in term of api. That
> > said, we should still review kfunc in a way that ensuring it is competent to the
> > best of our knowledge at that point with the limited initial use cases at hand.
> > I won't be surprised some of the existing EXPORT_SYMBOL_* kernel functions will
> > be exposed to the bpf prog as kfunc as-is without any change in the future. For
> > example, a few tcp cc kfuncs such as tcp_slow_start. They are likely stable
> > without much change for a long time. It can be directly exposed as bpf kfunc.
> > kfunc is a way to expose subsystem function without needing the bpf_func_proto
> > and BPF_CALL_x quirks. When the function can be dual compiled later, the kfunc
> > can also be inlined.
> >
> > If kfunc will be used for subsystem, it is very likely the number of kfunc will
> > grow and exceed the bpf helpers soon.  This seems to be a stronger need to work
> > > on the user experience problems about kfunc that have been mentioned in this thread
> > sooner than later. They have to be solved regardless. May be start with stable
> > kfunc first. If the new helper is guaranteed stable, then why it cannot be kfunc
> > but instead needs to go through the bpf_func_proto and BPF_CALL_x?  In time, I
> > hope the bpf helper support in the verifier can be quieted down (eg.
> > check_helper_call vs check_kfunc_call) and focus energy into kfunc like inlining
> > kfunc...etc.
>
>
> Sorry, I am late to this discussion. The way I read this is that
> kfuncs and helpers are implementation details and the real question is
> about the stability and mutability of the helper methods.
>
> I think there are two kinds of BPF program developers, and I might be
> oversimplifying to convey a point here:
>
> [1] Tracing people: They craft tracing programs and are more
> accustomed to probing deeper into kernel internals, handling variable
> renames and consequently will tolerate a kfunc changing its signature,
> being renamed or disappearing.
>
> [2] Network people: They are not accustomed to mutability the same way
> as the tracing people. If there is mutability here, these users will
> face a change in developer experience.
>
> I see two paths forward here:

As I mentioned in another reply, I took the liberty of adding "BPF helpers
freeze" as a topic for the next BPF office hours. It's probably going to
be a bit more productive to discuss it there. WDYT?

>
> [a] We want to somewhat preserve the developer experience of [2] and
> we find a way to do somewhat stable APIs. kfuncs have the benefit that
> they are eventually mutable, but a longer stability guarantee for
> helpers used by [2] could ameliorate the pains of mutability. e.g.
> something we could do for certain helpers is a deprecation story, e.g.
> a kfunc won't change for X kernel versions, or when we annotate kfuncs
> as deprecated, libbpf can warn users "this kfunc is going away in
> kernel version Z").
>
> If this would be difficult to guarantee and we do care about developer
> experience, we might need to have some helpers exposed as UAPI.
>
> [b] We accept the fact the user experience will change more for [2]
> and that's a trade-off we accept. IMHO, this is not ideal and while
> tracing folks have found a way to cope, it would be yet another thing
> to worry about for folks who are not used to it.
>
> There are things we can do to make it slightly less burdensome for the
> user by adding a shim in BPF headers (it won't solve problems for
> everyone, e.g. inline BPF or other languages, but it will give
> them a template for their respective "shims").
>
> Another thing to consider is whether there are use-cases where some users
> disable BTF (for whatever reason, like running BPF in a pacemaker :P
> or in extremely low memory cases).

There are various embedded systems (which usually means stricter
memory requirements and less mainstream architectures) and people are
experimenting with them, trying to run libbpf-tools and such there, or
building their own tracing tools. I keep getting Github issues in
libbpf-bootstrap and libbpf about something not working on some
embedded system and it's absolutely unclear why. I'd rather not have
to debug stuff like this for dynptr or for the loop iterator.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-05 21:03                                       ` Andrii Nakryiko
@ 2023-01-06  1:32                                         ` KP Singh
  0 siblings, 0 replies; 57+ messages in thread
From: KP Singh @ 2023-01-06  1:32 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Martin KaFai Lau, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, David Vernet, Joanne Koong, bpf, kernel-team,
	Alexei Starovoitov, Song Liu

[...]

> > I see two paths forward here:
>
> As I mentioned in another reply, I took the liberty of adding "BPF helpers
> freeze" as a topic for the next BPF office hours. It's probably going to
> be a bit more productive to discuss it there. WDYT?

Perfect, much easier to discuss during office hours. Thanks for adding it!

>
> >
> > [a] We want to somewhat preserve the developer experience of [2] and
> > we find a way to do somewhat stable APIs. kfuncs have the benefit that

[...]


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-05 21:01                           ` Andrii Nakryiko
@ 2023-01-06  2:54                             ` Alexei Starovoitov
  2023-01-09 17:46                               ` Andrii Nakryiko
  0 siblings, 1 reply; 57+ messages in thread
From: Alexei Starovoitov @ 2023-01-06  2:54 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David Vernet, Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Thu, Jan 05, 2023 at 01:01:56PM -0800, Andrii Nakryiko wrote:
> Didn't find the best place to put this, so it will be here. I think it
> would be beneficial to discuss BPF helpers freeze in BPF office hours.
> So I took the liberty to put it up for next BPF office hours, 9am, Jan
> 12th 2023. I hope that some more people that have exposure to
> real-world BPF application and pains associated with all that could
> join the discussion, but obviously anyone is welcome as well, no
> matter which way they are leaning.
> 
> Please consider joining, see details on Zoom meeting at [0]
> 
> For the rest, please see below. I'll be out for a few days and won't
> be able to reply, my apologies.
> 
>   [0] https://docs.google.com/spreadsheets/d/1LfrDXZ9-fdhvPEp_LHkxAMYyxxpwBXjywWa0AejEveU/edit#gid=0

Thanks for adding it to the agenda.
Hopefully we'll be able to converge faster on a call.

There are several things to discuss:
1. whether or not to freeze helpers.
2. whether dynptr accessors should be helpers or kfuncs.
3. whether your future inline iterators should be helpers or kfuncs.
4. whether cilium's bpf_sock_destroy should be helper or kfunc.

If we hard freeze helpers in 1 it automatically decides the fate for 2, 3, 4.
We can soft freeze the helpers then 2,3,4 are up for discussion.
Looks like the thread so far was primarily about 1.
4 was touched separately. Daniel hasn't replied yet to my suggestion for it to be kfunc.
You insist that 2 and 3 must be helpers.
No one has seen the patches for 3. I've seen you whiteboard them. It's impossible
for others to participate without patches, so let's postpone that.

Let's try to focus this thread on 2 assuming both helpers and kfuncs
are on the table for dynptrs...

> conclusions. I think it's even possible to deprecate BPF helpers, if
> we really want to. In the end, technically, the only UAPI part about
> BPF helper is its ID. That should stay fixed. We do change over time
> which helpers are available in which program types. Yes, it would be
> really bad to change helper signature and I'd be very much against
> this, but from my perspective (and I'm sure others will disagree),
> it's in the realm of possibility to do gradual deprecation of some
> helpers. We'll leave BPF_FUNC_xxx enumerator intact, of course, but
> add a simple wrapper that will just -ENOTSUP.

Unfortunately you're completely wrong in the above paragraph.
I suggest to read this Linus's rant first:
https://lkml.org/lkml/2012/12/23/75

Everything that user space sees we cannot change.
We can try to, but it will be reverted if users complain.
That's why we never try unless there is a very strong reason like security issue.

For example your last commit to uapi/bpf.h
commit 8a76145a2ec2 ("bpf: explicitly define BPF_FUNC_xxx integer values")
is a leap of faith.
Though we tried to make it as transparent as possible and
I googled BPF_FUNC_MAPPER before applying the patch to see in what weird ways
people can use the macro, there is still a non zero chance that
we would have to revert it if users complain loud enough.

For example cilium has this bit of code:
https://github.com/cilium/ebpf/blob/master/asm/func.go
I suspect it's broken now, because you've changed 'FN' macro in that commit.
Cilium folks are unlikely to complain and demand a revert, so we should be safe
in this regard, but we cannot assume that for other users.

It should be obvious that we cannot deprecate helpers with ENOTSUP
or deprecate them in any other way.

> E.g., Linus requested bpf_probe_read() to not exist and not be used,
> everyone agreed. Good opportunity?

It's an exception that proves the rule.
1. it's a security issue that's why uapi breakage was on the table.
2. it wasn't completely removed. See:

#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
        case BPF_FUNC_probe_read:
                return security_locked_down(LOCKDOWN_BPF_READ_KERNEL) < 0 ?
                       NULL : &bpf_probe_read_compat_proto;

> The point is that UAPI stability is not the end of the world and
> paranoia is bad. We shouldn't get paralyzed because we add APIs. We do
> that to libbpf and APIs will stay stable within entire 1.x version.
> Yes, we don't have such a nice "luxury" with kernel, but see above.

Exactly. See above. There is no way at all to deprecate helpers.

> >
> > - On the one hand you're arguing that in some cases, _no_ API
> >   instability is acceptable. That in general, the main kernel <-> kernel
> >   BPF program API boundary is equivalent to UAPI, and that it's _never_
> >   acceptable for us to ever, _ever_ deprecate certain APIs because
> 
> you are being hyperbolic and overdramatic again for no good reason,
> "ever, _ever_" -- really? There is no such thing.

Andrii, it's really _ever_. You need to internalize that first
before we discuss this topic again during office hours.

> I'd probably still go for it. But building some tool like perf or
> retsnoop -- I'd think twice if I want to take dependency on BPF map
> (or dynptr for that matter), if it potentially limits the
> applicability of my application.

A quote from retsnoop readme:
"
NOTE: Retsnoop relies on BPF CO-RE technology, so please make sure your Linux
kernel is built with CONFIG_DEBUG_INFO_BTF=y kernel config. Without this
retsnoop will refuse to start.
"
and in calib_feat.bpf.c
/* Detect if bpf_get_func_ip() helper is supported by the kernel.
/* Detect if fentry/fexit re-entry protection is implemented.
/* Detect if fexit is safe to use for long-running and sleepable
/* Detect if bpf_get_branch_snapshot() helper is supported.
/* Detect if BPF_MAP_TYPE_RINGBUF map is supported.
/* Detect if BPF cookie is supported for kprobes.
/* Detect if multi-attach kprobes are supported.

If the feature is useful you will use it. In retsnoop and everywhere else.
Regardless whether it's arch dependent, kernel dependent or unstable.
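
The pattern those calibration checks imply is: probe once at startup,
then pick an implementation. A minimal sketch of that pattern in plain
C follows. The feature names and fallback choices are hypothetical and
are not taken from retsnoop's sources:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result of startup probing. A real tool would fill this
 * in by testing the running kernel, e.g. by attempting to load small
 * probe programs; here it is just plain data. */
struct toy_features {
    bool has_ringbuf;
    bool has_kprobe_multi;
    bool has_func_ip_helper;
};

enum toy_output { TOY_OUT_PERFBUF, TOY_OUT_RINGBUF };
enum toy_attach { TOY_ATTACH_SINGLE, TOY_ATTACH_MULTI };

struct toy_plan {
    enum toy_output output;
    enum toy_attach attach;
    bool can_report_func_ip;
};

/* Use the newer feature when present, otherwise fall back. */
struct toy_plan toy_pick_plan(const struct toy_features *f)
{
    struct toy_plan plan = {
        .output = f->has_ringbuf ? TOY_OUT_RINGBUF : TOY_OUT_PERFBUF,
        .attach = f->has_kprobe_multi ? TOY_ATTACH_MULTI
                                      : TOY_ATTACH_SINGLE,
        .can_report_func_ip = f->has_func_ip_helper,
    };

    return plan;
}

/* Usage: an older kernel gets the fallback plan, a newer one does not. */
void toy_plan_selfcheck(void)
{
    struct toy_features old_kernel = { false, false, false };
    struct toy_features new_kernel = { true, true, true };

    assert(toy_pick_plan(&old_kernel).output == TOY_OUT_PERFBUF);
    assert(toy_pick_plan(&new_kernel).output == TOY_OUT_RINGBUF);
}
```

The application keeps working either way; the dependency only decides
which code path it takes.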

> I disagree about ripping the bandaid and precluding the dynptr framework
> from being whole before we solve various problems I pointed out in [1]
> (which unfortunately was mostly ignored, it seems).

Let's look at your
https://lore.kernel.org/all/CAEf4BzZM0+j6DXMgu2o2UvjtzoOxcjsJtT8j-jqVZYvAqxc52g@mail.gmail.com/
"
1. Generic accessors to check validity of *any* dynptr, and its
inherent properties like offset, available size, read-only property
(just as useful sometimes as bpf_ringbuf_query() is for ringbufs,
both for debugging and for various heuristics in production).

bpf_dynptr_is_null(struct bpf_dynptr *ptr)
long bpf_dynptr_get_size(struct bpf_dynptr *ptr)
long bpf_dynptr_get_offset(struct bpf_dynptr *ptr)
bpf_dynptr_is_rdonly(struct bpf_dynptr *ptr)

There is nothing to add or remove here. No flags, no change in semantics.
"

You're arguing that it's obviously stable material.
Like:
+BPF_CALL_1(bpf_dynptr_get_offset, struct bpf_dynptr_kern *, ptr)
+{
+    if (!ptr->data)
+         return -EINVAL;
+
+    return ptr->offset;
+}

but we can do it now in native bpf code:

static inline int bpf_dynptr_get_offset(const struct bpf_dynptr *uptr)
{
     struct bpf_dynptr_kern *ptr = bpf_rdonly_cast(uptr, bpf_core_type_id_kernel(struct bpf_dynptr_kern));

     if (!ptr->data)
          return -EINVAL;

     return ptr->offset;
}

No kernel changes necessary. No UAPI helpers. No kfuncs.
CO-RE will take care of kernel version differences.

Do you still insist that it should be a stable uapi helper ?
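
For anyone unfamiliar with what "CO-RE will take care of kernel version
differences" means in practice, here is a rough userspace model. The
two struct layouts and struct field_relo are made up for illustration.
Real CO-RE resolves field offsets from BTF at load time and patches
them into the program, whereas this sketch simply passes them in as
data:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Two hypothetical layouts of the same kernel-internal struct, as they
 * might look in two different kernel versions. */
struct dynptr_kern_v1 {
    void *data;
    uint32_t size;
    uint32_t offset;
};

struct dynptr_kern_v2 {
    uint32_t offset;    /* field moved in the "newer kernel" */
    uint32_t size;
    void *data;
};

/* Toy relocation record: what a loader would resolve once for the
 * kernel it is actually running on. */
struct field_relo {
    size_t data_off;
    size_t offset_off;
};

const struct field_relo relo_v1 = {
    .data_off = offsetof(struct dynptr_kern_v1, data),
    .offset_off = offsetof(struct dynptr_kern_v1, offset),
};

const struct field_relo relo_v2 = {
    .data_off = offsetof(struct dynptr_kern_v2, data),
    .offset_off = offsetof(struct dynptr_kern_v2, offset),
};

/* Accessor written once; it never hardcodes a field offset. */
long toy_dynptr_get_offset(const void *ptr, const struct field_relo *relo)
{
    void *data;
    uint32_t off;

    memcpy(&data, (const char *)ptr + relo->data_off, sizeof(data));
    if (!data)
        return -EINVAL;

    memcpy(&off, (const char *)ptr + relo->offset_off, sizeof(off));
    return (long)off;
}

/* Usage: the same accessor reads both layouts correctly. */
void toy_relo_selfcheck(void)
{
    struct dynptr_kern_v1 a = { .data = &a, .size = 64, .offset = 8 };
    struct dynptr_kern_v2 b = { .offset = 16, .size = 64, .data = &b };

    assert(toy_dynptr_get_offset(&a, &relo_v1) == 8);
    assert(toy_dynptr_get_offset(&b, &relo_v2) == 16);
}
```

The accessor source stays the same across layouts; only the resolved
offsets change.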

> And for the "for loop iterator", I absolutely do not want to have a
> useful generic abstraction for a repeatable loop that will have a few
> asterisks associated with it, dictating which arches and what kernel
> config values (beyond basic BPF ones) should be ensured to make
> iteration work. Kills any motivation to finish it.

I'm really sad that you went down this ultimatum path.
Essentially you're saying: "loop iterator has to be stable helper or
I quit working on it."
Say we cave in and accepted your demand. Later you do another ultimatum
and we cannot cave in for whatever reason. You stay true to your words
and quit BPF development. Now we're stuck with your uapi that we cannot
change, cannot improve, but still have to maintain it _forever_
without you because you quit. That would suck.
Let's get back to discussing technical merits without ultimatums. Ok?


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-06  2:54                             ` Alexei Starovoitov
@ 2023-01-09 17:46                               ` Andrii Nakryiko
  2023-01-11 21:29                                 ` Song Liu
  0 siblings, 1 reply; 57+ messages in thread
From: Andrii Nakryiko @ 2023-01-09 17:46 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David Vernet, Joanne Koong, bpf, Andrii Nakryiko, kernel-team,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu

On Thu, Jan 5, 2023 at 6:54 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Jan 05, 2023 at 01:01:56PM -0800, Andrii Nakryiko wrote:
> > Didn't find the best place to put this, so it will be here. I think it
> > would be beneficial to discuss BPF helpers freeze in BPF office hours.
> > So I took the liberty to put it up for next BPF office hours, 9am, Jan
> > 12th 2023. I hope that some more people that have exposure to
> > real-world BPF application and pains associated with all that could
> > join the discussion, but obviously anyone is welcome as well, no
> > matter which way they are leaning.
> >
> > Please consider joining, see details on Zoom meeting at [0]
> >
> > For the rest, please see below. I'll be out for a few days and won't
> > be able to reply, my apologies.
> >
> >   [0] https://docs.google.com/spreadsheets/d/1LfrDXZ9-fdhvPEp_LHkxAMYyxxpwBXjywWa0AejEveU/edit#gid=0
>
> Thanks for adding it to the agenda.
> Hopefully we'll be able to converge faster on a call.

Yep, hopefully. Looking forward to BPF office hours this week.

>
> There are several things to discuss:
> 1. whether or not to freeze helpers.
> 2. whether dynptr accessors should be helpers or kfuncs.
> 3. whether your future inline iterators should be helpers or kfuncs.
> 4. whether cilium's bpf_sock_destroy should be helper or kfunc.
>
> If we hard freeze helpers in 1 it automatically decides the fate for 2, 3, 4.
> We can soft freeze the helpers then 2,3,4 are up for discussion.
> Looks like the thread so far was primarily about 1.

The thread started as 2 and got expanded to 1, but I agree that 2, 3,
and 4 are all separate topics (just predicated on 1 being decided in
favor of not freezing helpers).

> 4 was touched separately. Daniel hasn't replied yet to my suggestion for it to be kfunc.
> You insist that 2 and 3 must be helpers.
> No one has seen the patches for 3. I've seen you whiteboard them. It's impossible
> for others to participate without patches, so let's postpone that.

Sure, as I intended to do in [0], except if BPF helpers are
hard-frozen, there would be no discussion to have. But hopefully it's
clear that my example with iterators was about stability and
generality of certain concepts (looping) and how libbpf has stable API
expectations and responsibilities as well.

  [0] https://lore.kernel.org/bpf/CAEf4BzbVoiVSa1_49CMNu-q5NnOvmaaHsOWxed-nZo9rioooWg@mail.gmail.com/

>
> Let's try to focus this thread on 2 assuming both helpers and kfuncs
> are on the table for dynptrs...
>
> > conclusions. I think it's even possible to deprecate BPF helpers, if
> > we really want to. In the end, technically, the only UAPI part about
> > BPF helper is its ID. That should stay fixed. We do change over time
> > which helpers are available in which program types. Yes, it would be
> > really bad to change helper signature and I'd be very much against
> > this, but from my perspective (and I'm sure others will disagree),
> > it's in the realm of possibility to do gradual deprecation of some
> > helpers. We'll leave BPF_FUNC_xxx enumerator intact, of course, but
> > add a simple wrapper that will just -ENOTSUP.
>
> Unfortunately you're completely wrong in the above paragraph.
> I suggest to read this Linus's rant first:
> https://lkml.org/lkml/2012/12/23/75
>
> Everything that user space sees we cannot change.
> We can try to, but it will be reverted if users complain.

I very well might be and it was my opinion (which I explicitly
acknowledged as certainly being controversial).

This is a completely separate discussion, but on one hand we say it's
fine to remove or change kfuncs, because kfuncs are only visible to
BPF programs, which are kernel-to-kernel programs and user-space rules
do not apply. On the other hand, BPF helpers are also only visible to
BPF programs, the only user-space visible part is enum name and ID.
Yet they are treated very differently.

It's fine, but to me it's more of an issue of a user contract, rather
than some technicality about being defined in some header. It feels
like we should be able to define a contract that some range of IDs
will be "unstable" in the sense that they might start eventually
returning -ENOTSUP if we have reasonable confidence they are not
useful anymore.
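
To show what I mean by such a contract, here is a toy userspace
sketch. None of this is an existing kernel mechanism: the ID values,
the TOY_FUNC_UNSTABLE_BASE split and the dispatch table are all
hypothetical, and it only models the idea that an ID stays reserved
while its implementation is swapped for a stub:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper IDs. The numbering is the only part treated as
 * fixed; IDs at or above TOY_FUNC_UNSTABLE_BASE carry the weaker
 * "may start returning -ENOTSUP" contract. */
enum toy_func_id {
    TOY_FUNC_map_lookup = 1,
    TOY_FUNC_probe_read = 2,
    TOY_FUNC_UNSTABLE_BASE = 1000,
    TOY_FUNC_experimental_a = 1000,
    TOY_FUNC_experimental_b = 1001,
};

typedef long (*toy_helper_fn)(long arg);

static long toy_map_lookup(long arg) { return arg + 1; }
static long toy_probe_read(long arg) { return arg * 2; }
static long toy_experimental_b(long arg) { return -arg; }

/* Shared stub for retired IDs: the ID is never reused or renumbered. */
static long toy_retired(long arg)
{
    (void)arg;
    return -ENOTSUP;
}

struct toy_entry {
    int id;
    toy_helper_fn fn;
};

static const struct toy_entry toy_table[] = {
    { TOY_FUNC_map_lookup, toy_map_lookup },
    { TOY_FUNC_probe_read, toy_probe_read },
    { TOY_FUNC_experimental_a, toy_retired },  /* deprecated over time */
    { TOY_FUNC_experimental_b, toy_experimental_b },
};

int toy_id_is_unstable(int id)
{
    return id >= TOY_FUNC_UNSTABLE_BASE;
}

long toy_call(int id, long arg)
{
    size_t i;

    for (i = 0; i < sizeof(toy_table) / sizeof(toy_table[0]); i++) {
        if (toy_table[i].id == id)
            return toy_table[i].fn(arg);
    }
    return -EINVAL;  /* ID was never defined */
}

/* Usage: a retired ID is still recognized, it just reports -ENOTSUP. */
void toy_ids_selfcheck(void)
{
    assert(toy_call(TOY_FUNC_map_lookup, 41) == 42);
    assert(toy_call(TOY_FUNC_experimental_a, 5) == -ENOTSUP);
}
```

A caller can tell "retired" (-ENOTSUP) apart from "never existed"
(-EINVAL), and can check up front which contract an ID falls under.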

But it's just my opinion, and no amount of shouting at me will change that fact.

And as I said before, I don't think BPF helpers are a big maintenance
liability in the first place.


> That's why we never try unless there is a very strong reason like security issue.
>
> For example your last commit to uapi/bpf.h
> commit 8a76145a2ec2 ("bpf: explicitly define BPF_FUNC_xxx integer values")
> is a leap of faith.
> Though we tried to make it as transparent as possible and
> I googled BPF_FUNC_MAPPER before applying the patch to see in what weird ways
> people can use the macro, there is still a non zero chance that
> we would have to revert it if users complain loud enough.
>
> For example cilium has this bit of code:
> https://github.com/cilium/ebpf/blob/master/asm/func.go
> I suspect it's broken now, because you've changed 'FN' macro in that commit.
> Cilium folks are unlikely to complain and demand a revert, so we should be safe
> in this regard, but we cannot assume that for other users.

Sure, all above is true and we discussed all that when reviewing that
patch. And I liked that we could weigh pros and cons in that
particular case, and hopefully can keep doing that.

>
> It should be obvious that we cannot deprecate helpers with ENOTSUP
> or deprecate them in any other way.

I'm fine with that.

>
> > E.g., Linus requested bpf_probe_read() to not exist and not be used,
> > everyone agreed. Good opportunity?
>
> It's an exception that proves the rule.
> 1. it's a security issue that's why uapi breakage was on the table.
> 2. it wasn't completely removed. See:
>
> #ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
>         case BPF_FUNC_probe_read:
>                 return security_locked_down(LOCKDOWN_BPF_READ_KERNEL) < 0 ?
>                        NULL : &bpf_probe_read_compat_proto;

Sure, not disputing this. I do think that it's just another example
emphasizing that the world is not black and white and there *has* to
be nuance in every decision.

>
> > The point is that UAPI stability is not the end of the world and
> > paranoia is bad. We shouldn't get paralyzed because we add APIs. We do
> > that to libbpf and APIs will stay stable within entire 1.x version.
> > Yes, we don't have such a nice "luxury" with kernel, but see above.
>
> Exactly. See above. There is no way at all to deprecate helpers.

OK.

>
> > >
> > > - On the one hand you're arguing that in some cases, _no_ API
> > >   instability is acceptable. That in general, the main kernel <-> kernel
> > >   BPF program API boundary is equivalent to UAPI, and that it's _never_
> > >   acceptable for us to ever, _ever_ deprecate certain APIs because
> >
> > you are being hyperbolic and overdramatic again for no good reason,
> > "ever, _ever_" -- really? There is no such thing.
>
> Andrii, it's really _ever_. You need to internalize that first
> before we discuss this topic again during office hours.

I'll try to. My point (somewhat subtle, perhaps) was that humans are
very bad at planning 5-10-20-50 years ahead. So any "ever" is
overdramatized and hyperbolic. There might be no BPF, Linux, or
computers in current form in 50 years. I refuse to stress about not
being able to remove BPF helpers in 50 years, sorry.

>
> > I'd probably still go for it. But building some tool like perf or
> > retsnoop -- I'd think twice if I want to take dependency on BPF map
> > (or dynptr for that matter), if it potentially limits the
> > applicability of my application.
>
> A quote from retsnoop readme:
> "
> NOTE: Retsnoop relies on BPF CO-RE technology, so please make sure your Linux
> kernel is built with CONFIG_DEBUG_INFO_BTF=y kernel config. Without this
> retsnoop will refuse to start.
> "
> and in calib_feat.bpf.c
> /* Detect if bpf_get_func_ip() helper is supported by the kernel.
> /* Detect if fentry/fexit re-entry protection is implemented.
> /* Detect if fexit is safe to use for long-running and sleepable
> /* Detect if bpf_get_branch_snapshot() helper is supported.
> /* Detect if BPF_MAP_TYPE_RINGBUF map is supported.
> /* Detect if BPF cookie is supported for kprobes.
> /* Detect if multi-attach kprobes are supported.
>
> If the feature is useful you will use it. In retsnoop and everywhere else.
> Regardless whether it's arch dependent, kernel dependent or unstable.

But I'm just a hostage of these BPF quirks and I very much would like
not to be (or at the very least minimize them)! Do you think I'm happy
that retsnoop won't work on so many different kernel configs and
arches, even though retsnoop would be very useful there? I'm happy I
don't make money off of retsnoop, so I can afford to just say "sorry,
retsnoop won't work in your particular situation, too bad". But if I
had a company and some product that relied on BPF, any such hurdle
would be painful and result in extra support, maintenance, developer
work, lost opportunity, hurdles in adoption, just headaches.

"If the feature is useful you will use it" is missing the nuance
again. Almost every feature can be worked around. And if some feature
adds too many unnecessary complexities and/or dependencies, I might
choose to just work around it. Or use some older feature that's less
convenient, less performant, maybe more fragile, but works.

E.g., instead of using bpf_ringbuf_reserve_dynptr() to minimize the
amount of data sent over ringbuf, I'll choose to send a bigger
fixed-size chunk and lose efficiency, but not reduce the variety of
kernels and systems that my app will work on. But in some other
situation this extra efficiency
might be the difference between product viability and death, so yeah,
I'll take that hit and do the extra work.

But again, as a BPF user I will feel like a hostage, knowing that it
didn't *have* to be this way.

That's why I'm fighting so passionately *to not add unnecessary
dependencies and complications*.

>
> > I disagree about ripping the bandaid and precluding dynptr framework
> > to be whole before we solve various problems I pointed out in [1]
> > (which unfortunately was mostly ignored, it seems).
>
> Let's look at your
> https://lore.kernel.org/all/CAEf4BzZM0+j6DXMgu2o2UvjtzoOxcjsJtT8j-jqVZYvAqxc52g@mail.gmail.com/
> "
> 1. Generic accessors to check validity of *any* dynptr, and its
> inherent properties like offset, available size, read-only property
> (just as useful sometimes as bpf_ringbuf_query() is for ringbufs,
> both for debugging and for various heuristics in production).
>
> bpf_dynptr_is_null(struct bpf_dynptr *ptr)
> long bpf_dynptr_get_size(struct bpf_dynptr *ptr)
> long bpf_dynptr_get_offset(struct bpf_dynptr *ptr)
> bpf_dynptr_is_rdonly(struct bpf_dynptr *ptr)
>
> There is nothing to add or remove here. No flags, no change in semantics.
> "
>
> You're arguing that it's obviously stable material.
> Like:
> +BPF_CALL_1(bpf_dynptr_get_offset, struct bpf_dynptr_kern *, ptr)
> +{
> +    if (!ptr->data)
> +         return -EINVAL;
> +
> +    return ptr->offset;
> +}
>
> but we can do it now in native bpf code:
>
> static inline int bpf_dynptr_get_offset(const struct bpf_dynptr *uptr)
> {
>      struct bpf_dynptr_kern *ptr = bpf_rdonly_cast(uptr, bpf_core_type_id_kernel(struct bpf_dynptr_kern));
>
>      if (!ptr->data)
>           return -EINVAL;
>
>      return ptr->offset;
> }
>
> No kernel changes necessary. No UAPI helpers. No kfuncs.
> CO-RE will take care of kernel version differences.
>
> Do you still insist that it should be a stable uapi helper ?

Yes!

bpf_rdonly_cast() is a kfunc, with all the consequences. And we are not
just exposing internal implementation details of dynptr, we *expect*
users to know, care, and follow them. Neither is great.

These simple helpers I can implement with BPF_CORE_READ() even,
without kfunc dependency, as I already explained before. And it will
even work on kernels with no CO-RE support, thanks to BTFgen.

But I do not consider that a good approach and good API, sorry.
Certainly doesn't make me feel like dynptr is a core first-class
concept in BPF.


And I actually have no such solution for
bpf_dynptr_clone()/bpf_dynptr_advance()/bpf_dynptr_trim(), which is
absolutely critical to make dynptr a standard interface for passing
variable-sized chunks of memory to other helpers and kfuncs.
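
To illustrate why these three differ from the read-only accessors,
here is a simplified user-space model (not kernel code). The struct
layout, the function names and the error codes are assumptions made
for this sketch; the real struct bpf_dynptr_kern is laid out
differently. The point is only that advance/trim/clone change the
dynptr's own bookkeeping, which a BPF program cannot write directly:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's dynptr bookkeeping. */
struct dynptr_model {
	void *data;      /* NULL means the dynptr is invalid ("null") */
	uint32_t offset; /* start of the view, relative to data */
	uint32_t size;   /* bytes visible from offset onward */
	bool rdonly;     /* whether writes through the view are allowed */
};

/* Move the start of the view forward by len bytes. */
static int dynptr_model_advance(struct dynptr_model *p, uint32_t len)
{
	if (!p->data)
		return -EINVAL;
	if (len > p->size)
		return -ERANGE;
	p->offset += len;
	p->size -= len;
	return 0;
}

/* Drop len bytes from the end of the view. */
static int dynptr_model_trim(struct dynptr_model *p, uint32_t len)
{
	if (!p->data)
		return -EINVAL;
	if (len > p->size)
		return -ERANGE;
	p->size -= len;
	return 0;
}

/* Copy the view into *clone, then skip its first offset bytes.
 * The original view is left untouched. */
static int dynptr_model_clone(const struct dynptr_model *p,
			      struct dynptr_model *clone, uint32_t offset)
{
	if (!p->data)
		return -EINVAL;
	*clone = *p;
	return dynptr_model_advance(clone, offset);
}
```

In this model a 16-byte view advanced by 4 and trimmed by 2 ends up
with offset 4 and size 10, and a clone taken with offset 3 starts at 7
with 7 bytes left while the original stays as it was.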

>
> > And for the "for loop iterator", I absolutely do not want to have a
> > useful generic abstraction for repeatable loop, that will have few
> > asterisks associated with them, dictating which arches and what kernel
> > config values (beyond basic BPF ones) should be ensured to make
> > iteration work. Kills any motivation to finish it.
>
> I'm really sad that you went down this ultimatum path.

This wasn't my intent and that's not what I'm doing here. I'm
explaining my motivation and how I feel about core concepts being part
of stable BPF API offerings. And how the inflexible BPF freeze
approach will hurt adoption. And yes, I'm afraid it might hurt even
the addition of new features if people feel that their work can't be
used universally because of arbitrary policies.

Human factor is real. Don't be sad, but try to see the argument behind
all the words and examples.

> Essentially you're saying: "loop iterator has to be stable helper or
> I quit working on it."
> Say we cave in and accepted your demand. Later you do another ultimatum

I hope you can "cave in" based on technical arguments and feedback
from users of BPF technology, which have to deal with real-world
aspects of all the BPF machinery. And they already have enough to care
about, no need to make their life harder.

I'm saying the loop iterator has to be a stable helper to be
universally used and universally recommended as *the solution for
doing repeatable work*. Without thinking about BTF, kfuncs,
arch-specific stuff. Because there is no reason why a loop iterator
would require any of that.

> and we cannot cave in for whatever reason. You stay true to your words
> and quit BPF development. Now we're stuck with your uapi that we cannot
> change, cannot improve, but still have to maintain it _forever_
> without you because you quit. That would suck.

This was always a risk for many years, and that didn't stop BPF from
gaining lots of useful functionality, even if in retrospect we'd
like to have done some things differently.

> Let's get back to discussing technical merits without ultimatums. Ok?

That's what I've been (and still am) doing all this time.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-09 17:46                               ` Andrii Nakryiko
@ 2023-01-11 21:29                                 ` Song Liu
  2023-01-12  4:23                                   ` Alexei Starovoitov
  0 siblings, 1 reply; 57+ messages in thread
From: Song Liu @ 2023-01-11 21:29 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, David Vernet, Joanne Koong, bpf,
	Andrii Nakryiko, kernel-team, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau

On Mon, Jan 9, 2023 at 9:47 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Jan 5, 2023 at 6:54 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu, Jan 05, 2023 at 01:01:56PM -0800, Andrii Nakryiko wrote:
> > > Didn't find the best place to put this, so it will be here. I think it
> > > would be beneficial to discuss BPF helpers freeze in BPF office hours.
> > > So I took the liberty to put it up for next BPF office hours, 9am, Jan
> > > > 12th 2023. I hope that some more people that have exposure to
> > > real-world BPF application and pains associated with all that could
> > > join the discussion, but obviously anyone is welcome as well, no
> > > matter which way they are leaning.
> > >
> > > Please consider joining, see details on Zoom meeting at [0]
> > >
> > > For the rest, please see below. I'll be out for a few days and won't
> > > be able to reply, my apologies.
> > >
> > >   [0] https://docs.google.com/spreadsheets/d/1LfrDXZ9-fdhvPEp_LHkxAMYyxxpwBXjywWa0AejEveU/edit#gid=0
> >
> > Thanks for adding it to the agenda.
> > Hopefully we'll be able to converge faster on a call.
>
> Yep, hopefully. Looking forward to BPF office hours this week.
>
> >
> > There are several things to discuss:
> > 1. whether or not to freeze helpers.
> > 2. whether dynptr accessors should be helpers or kfuncs.
> > 3. whether your future inline iterators should be helpers or kfuncs.
> > 4. whether cilium's bpf_sock_destroy should be helper or kfunc.

I think these are all big questions. Maybe we can start with some
smaller questions? Here is a list of questions I have:

1. Do we want stable kfuncs (as stable as helpers)? Do we want
   almost stable kfuncs? Will most users of stable APIs be as happy
   with almost stable alternatives?

2. Do we decide the stability of a kfunc when it is first added? Or
    do we plan to promote (maybe also demote?) stability later?

3. Besides stability, what are the concerns with kfuncs? How hard
    is it to resolve them?
    AFAICT, the concerns are: require BTF, require trampoline.
    Anything else? I guess we will never remove BTF dependency.
    Trampoline dependency is hard to resolve, but still possible?

4. We have feature-rich BPF with Linux-x86_64. Do we need some
   bare-minimal BPF, say for Linux-MIPS, or Windows-ARM, or
   even nvme-something? I guess this is also related to the BPF
   standard?

Thanks,
Song

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-03 23:51                             ` Alexei Starovoitov
  2023-01-04 14:25                               ` Daniel Borkmann
@ 2023-01-11 22:56                               ` Maxim Mikityanskiy
  2023-01-12  4:48                                 ` Alexei Starovoitov
  1 sibling, 1 reply; 57+ messages in thread
From: Maxim Mikityanskiy @ 2023-01-11 22:56 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Daniel Borkmann, David Vernet, Andrii Nakryiko, Joanne Koong,
	bpf, Andrii Nakryiko, kernel-team, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, KP Singh

On Tue, Jan 03, 2023 at 03:51:07PM -0800, Alexei Starovoitov wrote:
> On Tue, Jan 03, 2023 at 12:43:58PM +0100, Daniel Borkmann wrote:
> > Discoverability plus being able to know semantics from a user PoV to figure out when
> > workarounds for older/newer kernels are required to be able to support both kernels.
> 
> Sounds like your concern is that there could be a kfunc that changed its semantics,
> but kept exact same name and arguments? Yeah. That would be bad, but we should prevent
> such patches from landing. It's up to us to define sane and user friendly deprecation of kfuncs.

I would advocate for adding versioning to BPF API (be it helpers or
"stable" kfuncs). Right now we have two extremes: helpers that can't be
changed/fixed/deprecated ever, and kfuncs that can be changed at any
time, so the end users can't be sure new kernel won't break their stuff.
Detecting and fixing the breakage can also be tricky: end users have to
write different probes on a case-by-case basis, and sometimes it's not
just a matter of checking the number of function parameters or presence
of some definition (such difficulties happen when backporting drivers to
older kernels, so I assume it may be an issue for BPF programs as well).

Let's say we add a version number to the kernel, and the BPF program
also has an API version number it's compiled for. Whenever something
changes in the stable API on the kernel side, the version number is
increased. At the same time, compatibility on the kernel side is
preserved for some reasonable period of time (2 years, 5 years,
whatever), which means that if the kernel loads a BPF program with an
older version number, and that version is within the supported period of
time, the kernel will behave in the old way, i.e. verify the old
signature of a function, preserve the old behavior, etc.
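
A minimal sketch of the load-time check described above, in plain C.
Nothing like this exists in the kernel today; struct api_window,
api_negotiate() and the chosen error codes are hypothetical names
picked only for illustration:

```c
#include <errno.h>

/* Hypothetical description of what one kernel build supports. */
struct api_window {
	unsigned int current; /* newest API version this kernel implements */
	unsigned int oldest;  /* oldest version still inside the support period */
};

/* Returns the API version the kernel should emulate for this program,
 * or a negative errno if the program cannot be loaded. */
static int api_negotiate(const struct api_window *k, unsigned int prog_version)
{
	if (prog_version > k->current)
		return -ENOTSUP; /* program was built for a newer kernel */
	if (prog_version < k->oldest)
		return -EINVAL;  /* that API version was removed on schedule */
	return (int)prog_version; /* verify and run with the old semantics */
}
```

With a window of {current = 7, oldest = 3}, a program declaring
version 5 is loaded with version-5 semantics, while versions 8 and 2
are both rejected, for opposite reasons.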

This approach has the following upsides:

1. End users can stop worrying that some function changes unexpectedly,
and they can have a smoother migration plan.

2. Clear deprecation schedule.

3. Easy way to probe for needed functionality, it's just a matter of
comparing numbers: the BPF program loader checks that the kernel is new
enough, and the kernel checks that the BPF program's API is not too old.

4. Kernel maintainers will have a deprecation strategy.

Cons:

1. Arguably a maintenance burden to preserve compatibility on the
kernel side, but I would say it's a balance between helpers (which are
a maintenance burden forever) and kfuncs (which can be changed in every
kernel version without keeping any compatibility). "Kfunc that changed
its semantics is bad, we should prevent such patches" are just words,
but if the developer needs to keep both versions for a while, it will
serve as a calm-down mechanism to prevent changes that aren't really
necessary. At the same time, the dead code will stop accumulating,
because it can be removed according to the schedule.

2. Having a single version number complicates backporting features to
older kernels, it would require backporting all previous features
chronologically, even if there is no direct dependency. Having multiple
version numbers (per feature) is cumbersome for the BPF program to
declare. However, this issue is not new, it's already the case for BPF
helpers (you can't backport new helpers skipping some other, because the
numbers in the list must match).

The above description intentionally doesn't specify whether it should be
applied to helpers or kfuncs, because it's a universal concept, and I
would like to hear opinions about versioning without bias toward
helpers or kfuncs.

Regarding freezing helpers, I think there should be a solution for
deprecating obsolete stuff. There are historical examples of removing
things from UAPI: removing i386 support, ipchains, devfs, IrDA
subsystem, even a few architectures [1]. If we apply the versioning
approach to helpers, we can make long-waiting incompatible changes in
v1, keeping the current set of helpers as v0, used for programs that
don't declare a version. Eventually (in 5 years, in 10 years, whatever
sounds reasonable) we can drop v0 and remove the support for unversioned
BPF programs altogether, similar to how other big things were removed
from the kernel. Does it sound feasible?

[1]: https://lwn.net/Articles/748074/

> "Proper BPF helper" model is broken.
> static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
> 
> is a hack that works only when compiler optimizes the code.

What if we replace codegen for helpers, so that it becomes something
like this?

static inline void *bpf_map_lookup_elem(void *map, const void *key)
{
	// pseudocode alert!
	asm("call 1" : : "r1"(map), "r2"(key));
}

I.e. can we just throw in some inline BPF assembly that prepares
registers and invokes a call instruction with the helper number? That
should be portable across clang and gcc, allowing us to stop relying on
optimizations.

Any caveats?

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-11 21:29                                 ` Song Liu
@ 2023-01-12  4:23                                   ` Alexei Starovoitov
  2023-01-12  7:35                                     ` Song Liu
  0 siblings, 1 reply; 57+ messages in thread
From: Alexei Starovoitov @ 2023-01-12  4:23 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, David Vernet, Joanne Koong, bpf,
	Andrii Nakryiko, Kernel Team, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau

On Wed, Jan 11, 2023 at 1:29 PM Song Liu <song@kernel.org> wrote:
>
> On Mon, Jan 9, 2023 at 9:47 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Jan 5, 2023 at 6:54 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Thu, Jan 05, 2023 at 01:01:56PM -0800, Andrii Nakryiko wrote:
> > > > Didn't find the best place to put this, so it will be here. I think it
> > > > would be beneficial to discuss BPF helpers freeze in BPF office hours.
> > > > So I took the liberty to put it up for next BPF office hours, 9am, Jan
> > > > > 12th 2023. I hope that some more people that have exposure to
> > > > real-world BPF application and pains associated with all that could
> > > > join the discussion, but obviously anyone is welcome as well, no
> > > > matter which way they are leaning.
> > > >
> > > > Please consider joining, see details on Zoom meeting at [0]
> > > >
> > > > For the rest, please see below. I'll be out for a few days and won't
> > > > be able to reply, my apologies.
> > > >
> > > >   [0] https://docs.google.com/spreadsheets/d/1LfrDXZ9-fdhvPEp_LHkxAMYyxxpwBXjywWa0AejEveU/edit#gid=0
> > >
> > > Thanks for adding it to the agenda.
> > > Hopefully we'll be able to converge faster on a call.
> >
> > Yep, hopefully. Looking forward to BPF office hours this week.
> >
> > >
> > > There are several things to discuss:
> > > 1. whether or not to freeze helpers.
> > > 2. whether dynptr accessors should be helpers or kfuncs.
> > > 3. whether your future inline iterators should be helpers or kfuncs.
> > > 4. whether cilium's bpf_sock_destroy should be helper or kfunc.
>
> I think these are all big questions. Maybe we can start with some
> smaller questions? Here is a list of questions I have:
>
> 1. Do we want stable kfuncs (as stable as helpers)? Do we want
>    almost stable kfuncs?

Yes. We've touched on some of that earlier.
We can talk about a range:
unstable, deprecated, starting to deprecate, stable
plus orthogonal versioning scheme.

> Will most users of stable APIs be as happy
>    with almost stable alternatives?

kfuncs are very much analogous to EXPORT_SYMBOL_GPL.
There is no versioning scheme, nor deprecation scheme for that.
Yet in-kernel and out-of-tree users have been dealing with it.
There are kABI things that make things stable to various degrees.
So 'happy' is relative.
Using that analogy...
In-kernel bpf progs won't care. Unstable or not, they will get
carried along automatically when kfuncs change.
Out-of-tree bpf progs can be divided into kernel-dependent
and kernel-independent. The former are similar to in-tree
with extra pain that can be mitigated with kfunc detection.
The latter will always use stable kfuncs with an understandable deprecation path.
Yet it's all in theory.
In practice networking folks are using conntrack kfuncs and
xfrm kfuncs assuming we will make it all work somehow,
though right now we're saying kfuncs are unstable only.

So 'happy' and 'pain' are relative depending on the usefulness
of kfunc. If bpf prog needs a feature it will use it.
If it's a shiny new feature, the prog authors might wait
until kfunc stabilizes.
Which is exactly the point.
We can wish for something to be useful, but we won't know
until we actually use it for real and not in some selftest.

And it becomes chicken and egg. If it's a cool new feature
the bpf prog wants it to be stable to rely on it later,
but because it's so new it's not clear whether it's actually useful,
so we shouldn't be declaring it stable and cause kernel pains.

> 2. Do we decide the stability of a kfunc when it is first added? Or
>     do we plan to promote (maybe also demote?) stability later?

Claiming that something is stable on day one
is a subjective opinion of the developer who's adding that feature.
There could even be a giant user space project next to it
attempting to use that feature, but we've seen that with other
uapi-s in the past.

> 3. Besides stability, what are the concerns with kfuncs? How hard
>     is it to resolve them?
>     AFAICT, the concerns are: require BTF, require trampoline.

Only the former. kfuncs do not require bpf trampoline.

$ git grep bpf_jit_supports_kfunc_call
arch/arm64/net/bpf_jit_comp.c:bool bpf_jit_supports_kfunc_call(void)
arch/loongarch/net/bpf_jit.c:bool bpf_jit_supports_kfunc_call(void)
arch/x86/net/bpf_jit_comp.c:bool bpf_jit_supports_kfunc_call(void)
arch/x86/net/bpf_jit_comp32.c:bool bpf_jit_supports_kfunc_call(void)

iirc I've seen the patches for risc-v and arm32.

>     Anything else? I guess we will never remove BTF dependency.
>     Trampoline dependency is hard to resolve, but still possible?
>
> 4. We have feature-rich BPF with Linux-x86_64. Do we need some
>    bare-minimal BPF, say for Linux-MIPS, or Windows-ARM, or
>    even nvme-something? I guess this is also related to the BPF
>    standard?

It's not related to ISA standardization.
We're not even talking about BTF standardization.
Nor about psABI (calling convention and such).
It's going to happen much much later.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-11 22:56                               ` Maxim Mikityanskiy
@ 2023-01-12  4:48                                 ` Alexei Starovoitov
  2023-01-13  9:48                                   ` Jose E. Marchesi
  0 siblings, 1 reply; 57+ messages in thread
From: Alexei Starovoitov @ 2023-01-12  4:48 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: Daniel Borkmann, David Vernet, Andrii Nakryiko, Joanne Koong,
	bpf, Andrii Nakryiko, Kernel Team, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, KP Singh

On Wed, Jan 11, 2023 at 2:57 PM Maxim Mikityanskiy <maxtram95@gmail.com> wrote:
>
> On Tue, Jan 03, 2023 at 03:51:07PM -0800, Alexei Starovoitov wrote:
> > On Tue, Jan 03, 2023 at 12:43:58PM +0100, Daniel Borkmann wrote:
> > > Discoverability plus being able to know semantics from a user PoV to figure out when
> > > workarounds for older/newer kernels are required to be able to support both kernels.
> >
> > Sounds like your concern is that there could be a kfunc that changed its semantics,
> > but kept exact same name and arguments? Yeah. That would be bad, but we should prevent
> > such patches from landing. It's up to us to define sane and user friendly deprecation of kfuncs.
>
> I would advocate for adding versioning to BPF API (be it helpers or
> "stable" kfuncs). Right now we have two extremes: helpers that can't be
> changed/fixed/deprecated ever, and kfuncs that can be changed at any
> time, so the end users can't be sure new kernel won't break their stuff.
> Detecting and fixing the breakage can also be tricky: end users have to
> write different probes on a case-by-case basis, and sometimes it's not
> just a matter of checking the number of function parameters or presence
> of some definition (such difficulties happen when backporting drivers to
> older kernels, so I assume it may be an issue for BPF programs as well).
>
> Let's say we add a version number to the kernel, and the BPF program
> also has an API version number it's compiled for. Whenever something
> changes in the stable API on the kernel side, the version number is
> increased. At the same time, compatibility on the kernel side is
> preserved for some reasonable period of time (2 years, 5 years,
> whatever), which means that if the kernel loads a BPF program with an
> older version number, and that version is within the supported period of
> time, the kernel will behave in the old way, i.e. verify the old
> signature of a function, preserve the old behavior, etc.

Right. I think somebody proposed a version scheme for kfuncs already.
There were so many replies I've lost track.
But yes it's definitely on the table and
we should consider it.
Something like libbpf.map
We can declare which stable features are supported in which "version".

> This approach has the following upsides:
>
> 1. End users can stop worrying that some function changes unexpectedly,
> and they can have a smoother migration plan.
>
> 2. Clear deprecation schedule.
>
> 3. Easy way to probe for needed functionality, it's just a matter of
> comparing numbers: the BPF program loader checks that the kernel is new
> enough, and the kernel checks that the BPF program's API is not too old.
>
> 4. Kernel maintainers will have a deprecation strategy.

+1

> Cons:
>
> 1. Arguably a maintenance burden to preserve compatibility on the
> kernel side, but I would say it's a balance between helpers (which are
> a maintenance burden forever) and kfuncs (which can be changed in every
> kernel version without keeping any compatibility). "Kfunc that changed
> its semantics is bad, we should prevent such patches" are just words,
> but if the developer needs to keep both versions for a while, it will
> serve as a calm-down mechanism to prevent changes that aren't really
> necessary. At the same time, the dead code will stop accumulating,
> because it can be removed according to the schedule.

That sounds like 'pro' instead of 'con' to me :)

> 2. Having a single version number complicates backporting features to
> older kernels, it would require backporting all previous features
> chronologically, even if there is no direct dependency. Having multiple
> version numbers (per feature) is cumbersome for the BPF program to
> declare. However, this issue is not new, it's already the case for BPF
> helpers (you can't backport new helpers skipping some other, because the
> numbers in the list must match).

Yeah. I recall Amazon Linux or something else backported
helpers out of order and that screwed up bpf progs.
That was the reason we added numbers to the FN macro in uapi/bpf.h.
That will hopefully prevent such mistakes.
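
For readers who have not looked at that macro, a miniature sketch of
the idea follows. The DEMO_* names are made up for this example and
the real list in uapi/bpf.h is far longer; only the first four helper
names and their IDs mirror upstream. With the ID written next to each
name, a tree that skips an entry no longer renumbers everything after
it:

```c
/* Hypothetical miniature of the helper mapper macro. */
#define DEMO_FUNC_MAPPER(FN)        \
	FN(unspec, 0)               \
	FN(map_lookup_elem, 1)      \
	FN(map_update_elem, 2)      \
	FN(map_delete_elem, 3)

#define DEMO_ENUM_FN(name, id) DEMO_FUNC_##name = id,
enum demo_func_id { DEMO_FUNC_MAPPER(DEMO_ENUM_FN) };

/* A downstream tree that leaves out map_update_elem. Because the
 * number travels with the name, map_delete_elem keeps ID 3. */
#define DEMO_BACKPORT_MAPPER(FN)    \
	FN(unspec, 0)               \
	FN(map_lookup_elem, 1)      \
	FN(map_delete_elem, 3)

#define DEMO_BACKPORT_ENUM_FN(name, id) DEMO_BACKPORT_##name = id,
enum demo_backport_id { DEMO_BACKPORT_MAPPER(DEMO_BACKPORT_ENUM_FN) };
```

With implicit enumeration the second list would have assigned 2 to
map_delete_elem, which is the kind of silent mismatch described above.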

But practically speaking...
The distro that does out-of-order backporting and skips
certain helpers is saying: I'm defining my own kABI equivalent
for bpf progs.
In that sense there is zero difference between helpers and kfuncs
from distro point of view and from point of view of their customers.
Both helpers and kfuncs are neither stable nor unstable.

This discussion is only about pros and cons of the upstream kernel
and bpf progs that consume upstream kernel.

If we include hyperscalers in the discussion then all
helpers and all kfuncs immediately become stable from
point of view of their engineers.
Big datacenters can maintain kernels with whatever helpers
and kfuncs they need.

>
> The above description intentionally doesn't specify whether it should be
> applied to helpers or kfuncs, because it's a universal concept, and I
> would like to hear opinions about versioning without bias toward
> helpers or kfuncs.
>
> Regarding freezing helpers, I think there should be a solution for
> deprecating obsolete stuff. There are historical examples of removing
> things from UAPI: removing i386 support, ipchains, devfs, IrDA
> subsystem, even a few architectures [1]. If we apply the versioning
> approach to helpers, we can make long-waiting incompatible changes in
> v1, keeping the current set of helpers as v0, used for programs that
> don't declare a version. Eventually (in 5 years, in 10 years, whatever
> sounds reasonable) we can drop v0 and remove the support for unversioned
> BPF programs altogether, similar to how other big things were removed
> from the kernel. Does it sound feasible?

Not to me. Breaking uapi in whichever way with whatever excuse
is not on the table.
We've documented our rules long ago:

Q: Does BPF have a stable ABI?
------------------------------
A: YES. BPF instructions, arguments to BPF programs, set of helper
functions and their arguments, recognized return codes are all part
of ABI.

> > "Proper BPF helper" model is broken.
> > static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
> >
> > is a hack that works only when compiler optimizes the code.
>
> What if we replace codegen for helpers, so that it becomes something
> like this?
>
> static inline void *bpf_map_lookup_elem(void *map, const void *key)
> {
>         // pseudocode alert!
>         asm("call 1" : : "r1"(map), "r2"(key));
> }
>
> I.e. can we just throw in some inline BPF assembly that prepares
> registers and invokes a call instruction with the helper number? That
> should be portable across clang and gcc, allowing us to stop relying on
> optimizations.

Great idea!
It needs "=r" to capture R0 into the 'ret' variable and then it should work.
clang may have issues with such asm, but should be fixable.
gcc is less clear.
iirc they had their own incompatible inline asm :(
It's a bigger issue.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-12  4:23                                   ` Alexei Starovoitov
@ 2023-01-12  7:35                                     ` Song Liu
  0 siblings, 0 replies; 57+ messages in thread
From: Song Liu @ 2023-01-12  7:35 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, David Vernet, Joanne Koong, bpf,
	Andrii Nakryiko, Kernel Team, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau

On Wed, Jan 11, 2023 at 8:24 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
[...]

> >
> > 1. Do we want stable kfuncs (as stable as helpers)? Do we want
> >    almost stable kfuncs?
>
> Yes. We've touched on some of that earlier.
> We can talk about a range:
> unstable, deprecated, starting to deprecate, stable
> plus orthogonal versioning scheme.
>
> > Will most users of stable APIs be as happy
> >    with almost stable alternatives?
>
> kfuncs are very much analogous to EXPORT_SYMBOL_GPL.
> There is no versioning scheme, nor deprecation scheme for that.
> Yet in-kernel and out-of-tree users have been dealing with it.
> There are kABI things that make things stable to various degrees.
> So 'happy' is relative.
> Using that analogy...
> In-kernel bpf progs won't care. unstable or not they will get
> carried along automatically when kfuncs change.
> Out of tree bpf progs can be divided into kernel dependent
> and kernel independent. The former are similar to in-tree
> with extra pain that can be mitigated with kfunc detection.
> The latter will always use stable with understandable deprecation path.
> Yet it's all in theory.
> In practice networking folks are using conntrack kfuncs and
> xfrm kfuncs assuming we will make it all work somehow,
> though right now we're saying kfuncs are unstable only.

I think we need something more stable than EXPORT_SYMBOL_GPL,
because: 1) there are more OOT bpf progs than OOT drivers;
2) some BPF developers (network people in KP's categories)
have less kernel experience, and thus have a stronger
preference for more stable APIs. The range of stability on top
of EXPORT_SYMBOL_GPL could be really helpful for these
users.

>
> So 'happy' and 'pain' are relative depending on the usefulness
> of kfunc. If bpf prog needs a feature it will use it.
> If it's a shiny new feature, the prog authors might wait
> until kfunc stabilizes.
> Which is exactly the point.
> We can wish for something to be useful, but we won't know
> until we actually use it for real and not in some selftest.
>
> And it becomes chicken and egg. If it's a cool new feature
> the bpf prog wants it to be stable to rely on it later,
> but because it's so new it's not clear whether it's actually useful,
> so we shouldn't be declaring it stable and cause kernel pains.
>
> > 2. Do we decide the stability of a kfunc when it is first added? Or
> >     do we plan to promote (maybe also demote?) stability later?
>
> Claiming that something is stable on day one
> is a subjective opinion of the developer who's adding that feature.
> There could even be a giant user space project next to it
> attempting to use that feature, but we've seen that with other
> uapi-s in the past.

With the range of stability, stable could mean "not going away for
at least 5 years". Then claiming something is stable means "I/we
will support it for at least 5 years". It is probably not too crazy to
make this kind of promise for some core APIs.

>
> > 3. Besides stability, what are the concerns with kfuncs? How hard
> >     is it to resolve them?
> >     AFAICT, the concerns are: require BTF, require trampoline.
>
> Only the former. kfuncs do not require bpf trampoline.
>
> $ git grep bpf_jit_supports_kfunc_call
> arch/arm64/net/bpf_jit_comp.c:bool bpf_jit_supports_kfunc_call(void)
> arch/loongarch/net/bpf_jit.c:bool bpf_jit_supports_kfunc_call(void)
> arch/x86/net/bpf_jit_comp.c:bool bpf_jit_supports_kfunc_call(void)
> arch/x86/net/bpf_jit_comp32.c:bool bpf_jit_supports_kfunc_call(void)
>
> iirc I've seen the patches for risc-v and arm32.

Thanks for the correction. Reading commits that enabled kfunc for
different archs, I think it is easier than enabling trampolines.

AFAICT, more stability of some kfuncs and better availability of
kfuncs should address most of the concerns. I would like to hear
Andrii's thoughts on this.

Thanks,
Song

>
> >     Anything else? I guess we will never remove BTF dependency.
> >     Trampoline dependency is hard to resolve, but still possible?
> >
> > 4. We have feature-rich BPF with Linux-x86_64. Do we need some
> >    bare-minimal BPF, say for Linux-MIPS, or Windows-ARM, or
> >    even nvme-something? I guess this is also related to the BPF
> >    standard?
>
> It's not related to ISA standardization.
> We're not even talking about BTF standardization.
> Nor about psABI (calling convention and such).
> It's going to happen much much later.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-12  4:48                                 ` Alexei Starovoitov
@ 2023-01-13  9:48                                   ` Jose E. Marchesi
  2023-01-13 16:35                                     ` Alexei Starovoitov
  0 siblings, 1 reply; 57+ messages in thread
From: Jose E. Marchesi @ 2023-01-13  9:48 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Maxim Mikityanskiy, Daniel Borkmann, David Vernet,
	Andrii Nakryiko, Joanne Koong, bpf, Andrii Nakryiko, Kernel Team,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, KP Singh,
	david.faust


> On Wed, Jan 11, 2023 at 2:57 PM Maxim Mikityanskiy <maxtram95@gmail.com> wrote:
>>
>> On Tue, Jan 03, 2023 at 03:51:07PM -0800, Alexei Starovoitov wrote:
>> > On Tue, Jan 03, 2023 at 12:43:58PM +0100, Daniel Borkmann wrote:
>> > > Discoverability plus being able to know semantics from a user PoV to figure out when
>> > > workarounds for older/newer kernels are required to be able to support both kernels.
>> >
>> > Sounds like your concern is that there could be a kfunc that changed its semantics,
>> > but kept exact same name and arguments? Yeah. That would be bad, but we should prevent
>> > such patches from landing. It's up to us to define sane and user
>> > friendly deprecation of kfuncs.
>>
>> I would advocate for adding versioning to BPF API (be it helpers or
>> "stable" kfuncs). Right now we have two extremes: helpers that can't be
>> changed/fixed/deprecated ever, and kfuncs that can be changed at any
>> time, so the end users can't be sure a new kernel won't break their stuff.
>> Detecting and fixing the breakage can also be tricky: end users have to
>> write different probes on a case-by-case basis, and sometimes it's not
>> just a matter of checking the number of function parameters or presence
>> of some definition (such difficulties happen when backporting drivers to
>> older kernels, so I assume it may be an issue for BPF programs as well).
>>
>> Let's say we add a version number to the kernel, and the BPF program
>> also has an API version number it's compiled for. Whenever something
>> changes in the stable API on the kernel side, the version number is
>> increased. At the same time, compatibility on the kernel side is
>> preserved for some reasonable period of time (2 years, 5 years,
>> whatever), which means that if the kernel loads a BPF program with an
>> older version number, and that version is within the supported period of
>> time, the kernel will behave in the old way, i.e. verify the old
>> signature of a function, preserve the old behavior, etc.
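A minimal sketch of the load-time check this proposal implies, written as plain C so it compiles anywhere. Nothing here exists in the kernel: `struct api_window`, its fields, and `api_version_ok()` are hypothetical names used only for illustration. The sketch folds both checks from the proposal (kernel new enough, program not too old) into one range test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: the range of BPF API versions one kernel build accepts. */
struct api_window {
	unsigned int oldest;  /* oldest program API version still supported */
	unsigned int current; /* newest API version this kernel implements */
};

/*
 * Hypothetical load-time check: the program must not be older than the
 * kernel's compatibility window and not newer than what the kernel knows.
 */
static bool api_version_ok(const struct api_window *w, unsigned int prog_version)
{
	assert(w->oldest <= w->current); /* a malformed window is a caller bug */
	return prog_version >= w->oldest && prog_version <= w->current;
}
```

With a window of { .oldest = 3, .current = 7 }, a program declaring version 5 would load, while version 2 (already past its deprecation date) and version 8 (newer than the kernel) would be rejected.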
>
> Right. I think somebody proposed a version scheme for kfuncs already.
> There were so many replies I've lost track.
> But yes it's definitely on the table and
> we should consider it.
> Something like libbpf.map
> We can declare which stable features are supported in which "version".
>
>> This approach has the following upsides:
>>
>> 1. End users can stop worrying that some function changes unexpectedly,
>> and they can have a smoother migration plan.
>>
>> 2. Clear deprecation schedule.
>>
>> 3. Easy way to probe for needed functionality, it's just a matter of
>> comparing numbers: the BPF program loader checks that the kernel is new
>> enough, and the kernel checks that the BPF program's API is not too old.
>>
>> 4. Kernel maintainers will have a deprecation strategy.
>
> +1
>
>> Cons:
>>
>> 1. Arguably a maintenance burden to preserve compatibility on the
>> kernel side, but I would say it's a balance between helpers (which are
>> a maintenance burden forever) and kfuncs (which can be changed in every
>> kernel version without keeping any compatibility). "Kfunc that changed
>> its semantics is bad, we should prevent such patches" are just words,
>> but if the developer needs to keep both versions for a while, it will
>> serve as a calm-down mechanism to prevent changes that aren't really
>> necessary. At the same time, the dead code will stop accumulating,
>> because it can be removed according to the schedule.
>
> That sounds like 'pro' instead of 'con' to me :)
>
>> 2. Having a single version number complicates backporting features to
>> older kernels, it would require backporting all previous features
>> chronologically, even if there is no direct dependency. Having multiple
>> version numbers (per feature) is cumbersome for the BPF program to
>> declare. However, this issue is not new, it's already the case for BPF
>> helpers (you can't backport new helpers skipping some other, because the
>> numbers in the list must match).
>
> yeah. I recall amazon linux or something else backported
> helpers out of order and that screwed up bpf progs.
> That was the reason we added numbers to the FN macro in uapi/bpf.h
> That will hopefully prevent such mistakes.
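To illustrate why explicit numbers help, here is a small self-contained sketch of the X-macro pattern with a number attached to each entry. The `DEMO_` names are hypothetical and this is not the actual content of uapi/bpf.h; the point is only that an entry's value no longer depends on its position in the list.

```c
/* Hypothetical mapper: every entry carries its own number. */
#define DEMO_FUNC_MAPPER(FN)		\
	FN(unspec, 0)			\
	FN(map_lookup_elem, 1)		\
	FN(map_update_elem, 2)		\
	FN(map_delete_elem, 3)

/* Expand each entry into an enumerator with an explicit value. */
#define DEMO_ENUM_FN(name, num) DEMO_FUNC_ ## name = num,

enum demo_func_id {
	DEMO_FUNC_MAPPER(DEMO_ENUM_FN)
	DEMO_FUNC_MAX_ID, /* one past the last explicit value */
};

/* A backport that leaves out map_update_elem keeps the other numbers intact. */
#define DEMO_BACKPORT_MAPPER(FN)	\
	FN(unspec, 0)			\
	FN(map_lookup_elem, 1)		\
	FN(map_delete_elem, 3)

#define DEMO_BACKPORT_ENUM_FN(name, num) DEMO_BACKPORT_ ## name = num,

enum demo_backport_id {
	DEMO_BACKPORT_MAPPER(DEMO_BACKPORT_ENUM_FN)
};

/* Compile-time check: skipping an entry did not renumber the ones after it. */
_Static_assert((int)DEMO_FUNC_map_delete_elem == (int)DEMO_BACKPORT_map_delete_elem,
	       "explicit numbering keeps IDs stable");
```

Had the enumerators relied on implicit, position-based values, dropping `map_update_elem` would have shifted `map_delete_elem` from 3 to 2, which is the kind of mismatch described above.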
>
> But practically speaking...
> The distro that does out-of-order backporting and skips
> certain helpers is saying: I'm defining my own kABI equivalent
> for bpf progs.
> In that sense there is zero difference between helpers and kfuncs
> from distro point of view and from point of view of their customers.
> Both helpers and kfuncs are neither stable nor unstable.
>
> This discussion is only about pros and cons of the upstream kernel
> and bpf progs that consume upstream kernel.
>
> If we include hyperscalers in the discussion then all
> helpers and all kfuncs immediately become stable from
> point of view of their engineers.
> Big datacenters can maintain kernels with whatever helpers
> and kfuncs they need.
>
>>
>> The above description intentionally doesn't specify whether it should be
>> applied to helpers or kfuncs, because it's a universal concept, and I
>> would like to hear opinions about versioning without bias toward
>> helpers or kfuncs.
>>
>> Regarding freezing helpers, I think there should be a solution for
>> deprecating obsolete stuff. There are historical examples of removing
>> things from UAPI: removing i386 support, ipchains, devfs, IrDA
>> subsystem, even a few architectures [1]. If we apply the versioning
>> approach to helpers, we can make long-awaited incompatible changes in
>> v1, keeping the current set of helpers as v0, used for programs that
>> don't declare a version. Eventually (in 5 years, in 10 years, whatever
>> sounds reasonable) we can drop v0 and remove the support for unversioned
>> BPF programs altogether, similar to how other big things were removed
>> from the kernel. Does it sound feasible?
>
> Not to me. Breaking uapi in whichever way with whatever excuse
> is not on the table.
> We've documented our rules long ago:
>
> Q: Does BPF have a stable ABI?
> ------------------------------
> A: YES. BPF instructions, arguments to BPF programs, set of helper
> functions and their arguments, recognized return codes are all part
> of ABI.
>
>> > "Proper BPF helper" model is broken.
>> > static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
>> >
>> > is a hack that works only when compiler optimizes the code.
>>
>> What if we replace codegen for helpers, so that it becomes something
>> like this?
>>
>> static inline void *bpf_map_lookup_elem(void *map, const void *key)
>> {
>>         // pseudocode alert!
>>         asm("call 1" : : "r1"(map), "r2"(key));
>> }
>>
>> I.e. can we just throw in some inline BPF assembly that prepares
>> registers and invokes a call instruction with the helper number? That
>> should be portable across clang and gcc, allowing us to stop relying on
>> optimizations.
>
> Great idea!

+1

> It needs "=r" to capture R0 into the 'ret' variable and then it should work.
> clang may have issues with such asm, but should be fixable.
> gcc is less clear.

That inline assembly should work with GCC as it is now.  Both compilers
use the same syntax for the `call' instruction.

> iirc they had their own incompatible inline asm :(
> It's a bigger issue.

We are taking care of that, by adding support to the GNU assembler to
also understand the pseudo-C syntax used by llvm.  This covers both .s
files specified in the compilation line, and inline asm statements.

Should be ready soon.


* Re: bpf helpers freeze. Was: [PATCH v2 bpf-next 0/6] Dynptr convenience helpers
  2023-01-13  9:48                                   ` Jose E. Marchesi
@ 2023-01-13 16:35                                     ` Alexei Starovoitov
  0 siblings, 0 replies; 57+ messages in thread
From: Alexei Starovoitov @ 2023-01-13 16:35 UTC (permalink / raw)
  To: Jose E. Marchesi
  Cc: Maxim Mikityanskiy, Daniel Borkmann, David Vernet,
	Andrii Nakryiko, Joanne Koong, bpf, Andrii Nakryiko, Kernel Team,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, KP Singh,
	David Faust

On Fri, Jan 13, 2023 at 1:45 AM Jose E. Marchesi
<jose.marchesi@oracle.com> wrote:
>
> > iirc they had their own incompatible inline asm :(
> > It's a bigger issue.
>
> We are taking care of that, by adding support to the GNU assembler to
> also understand the pseudo-C syntax used by llvm.  This covers both .s
> files specified in the compilation line, and inline asm statements.
>
> Should be ready soon.

This is awesome! Thank you.


end of thread, other threads:[~2023-01-13 16:38 UTC | newest]

Thread overview: 57+ messages
2022-12-07 20:55 [PATCH v2 bpf-next 0/6] Dynptr convenience helpers Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 1/6] bpf: Add bpf_dynptr_trim and bpf_dynptr_advance Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 2/6] bpf: Add bpf_dynptr_is_null and bpf_dynptr_is_rdonly Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 3/6] bpf: Add bpf_dynptr_get_size and bpf_dynptr_get_offset Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 4/6] bpf: Add bpf_dynptr_clone Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 5/6] bpf: Add bpf_dynptr_iterator Joanne Koong
2022-12-07 20:55 ` [PATCH v2 bpf-next 6/6] selftests/bpf: Tests for dynptr convenience helpers Joanne Koong
2022-12-08  1:54 ` [PATCH v2 bpf-next 0/6] Dynptr " Alexei Starovoitov
2022-12-09  0:42   ` Andrii Nakryiko
2022-12-09  1:30     ` Alexei Starovoitov
2022-12-09 22:24       ` Joanne Koong
2022-12-12 20:12       ` Andrii Nakryiko
2022-12-13 23:50         ` Joanne Koong
2022-12-14  0:57           ` Andrii Nakryiko
2022-12-14 21:25             ` Joanne Koong
2022-12-16 17:35         ` Alexei Starovoitov
2022-12-20 19:31           ` Andrii Nakryiko
2022-12-25 21:52             ` bpf helpers freeze. Was: " Alexei Starovoitov
2022-12-29 23:10               ` Andrii Nakryiko
2022-12-30  2:46                 ` Alexei Starovoitov
2022-12-30 18:38                   ` David Vernet
2022-12-30 19:31                     ` Alexei Starovoitov
2022-12-30 21:00                       ` David Vernet
2022-12-31  0:42                         ` Alexei Starovoitov
2023-01-03 11:43                           ` Daniel Borkmann
2023-01-03 23:51                             ` Alexei Starovoitov
2023-01-04 14:25                               ` Daniel Borkmann
2023-01-04 18:59                                 ` Andrii Nakryiko
2023-01-04 20:03                                   ` Alexei Starovoitov
2023-01-04 21:57                                     ` Andrii Nakryiko
2023-01-04 19:37                                 ` Alexei Starovoitov
2023-01-05  0:13                                   ` Martin KaFai Lau
2023-01-05 17:17                                     ` KP Singh
2023-01-05 21:03                                       ` Andrii Nakryiko
2023-01-06  1:32                                         ` KP Singh
2023-01-05 21:02                                     ` Andrii Nakryiko
2023-01-04 20:50                                 ` David Vernet
2023-01-11 22:56                               ` Maxim Mikityanskiy
2023-01-12  4:48                                 ` Alexei Starovoitov
2023-01-13  9:48                                   ` Jose E. Marchesi
2023-01-13 16:35                                     ` Alexei Starovoitov
2023-01-04  0:55                           ` Jakub Kicinski
2023-01-04 18:44                           ` Andrii Nakryiko
2023-01-04 19:56                             ` Alexei Starovoitov
2023-01-04 18:43                         ` Andrii Nakryiko
2023-01-04 19:51                           ` Alexei Starovoitov
2023-01-04 21:56                             ` Andrii Nakryiko
2023-01-04 18:43                   ` Andrii Nakryiko
2023-01-04 19:44                     ` Alexei Starovoitov
2023-01-04 21:55                       ` Andrii Nakryiko
2023-01-04 23:47                         ` David Vernet
2023-01-05 21:01                           ` Andrii Nakryiko
2023-01-06  2:54                             ` Alexei Starovoitov
2023-01-09 17:46                               ` Andrii Nakryiko
2023-01-11 21:29                                 ` Song Liu
2023-01-12  4:23                                   ` Alexei Starovoitov
2023-01-12  7:35                                     ` Song Liu
