[PATCH bpf-next 0/3] bpf: allow cgroup progs to export custom errnos to userspace

bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH bpf-next 0/3] bpf: allow cgroup progs to export custom errnos to userspace
@ 2021-10-06 16:02 YiFei Zhu
  2021-10-06 16:02 ` [PATCH bpf-next 1/3] bpf: Make BPF_PROG_RUN_ARRAY return -errno instead of allow boolean YiFei Zhu
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: YiFei Zhu @ 2021-10-06 16:02 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev, YiFei Zhu

From: YiFei Zhu <zhuyifei@google.com>

Right now, most cgroup hooks are best used for permission checks. They
can only reject a syscall with -EPERM, so a cause of a rejection, if
the rejected by eBPF cgroup hooks, is ambiguous to userspace.
Additionally, if the syscalls are implemented in eBPF, all permission
checks and the implementation has to happen within the same filter,
as programs executed later in the series of progs are unaware of the
return values return by the previous progs.

This patch series adds a helper, bpf_export_errno, that allows hooks
to get/set the errno exported by eBPF to userspace. Invoking the helper
with a positive errno will set the exported errno, and invoking the
helper with zero will return the previously set errno. This means
that an errno, once set, can be overridden but cannot be unset. This
also allows later progs to retrieve errnos set by previous progs.

For legacy programs that rejects a syscall without setting the exported
errno, for backwards compatibility, if a prog rejects without itself
or a prior prog setting errno, the errno is set by the kernel to -EPERM.

Tests have been added in this series to test the behavior of the helper
with cgroup setsockopt getsockopt hooks.

Patch 1 changes the API of macros to prepare for the next patch and
  should be a no-op.
Patch 2 implements the helper and the tracking of errno.
Patch 3 tests the behaviors of the helper.

YiFei Zhu (3):
  bpf: Make BPF_PROG_RUN_ARRAY return -errno instead of allow boolean
  bpf: Add cgroup helper bpf_export_errno to get/set exported errno
    value
  selftests/bpf: Test bpf_export_errno behavior with cgroup/sockopt

 include/linux/bpf.h                           |  27 +-
 include/uapi/linux/bpf.h                      |  14 +
 kernel/bpf/cgroup.c                           |  65 ++-
 security/device_cgroup.c                      |   2 +-
 tools/include/uapi/linux/bpf.h                |  14 +
 .../bpf/prog_tests/cgroup_export_errno.c      | 472 ++++++++++++++++++
 .../progs/cgroup_export_errno_getsockopt.c    |  45 ++
 .../progs/cgroup_export_errno_setsockopt.c    |  52 ++
 8 files changed, 651 insertions(+), 40 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/cgroup_export_errno.c
 create mode 100644 tools/testing/selftests/bpf/progs/cgroup_export_errno_getsockopt.c
 create mode 100644 tools/testing/selftests/bpf/progs/cgroup_export_errno_setsockopt.c

--
2.33.0

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH bpf-next 1/3] bpf: Make BPF_PROG_RUN_ARRAY return -errno instead of allow boolean
  2021-10-06 16:02 [PATCH bpf-next 0/3] bpf: allow cgroup progs to export custom errnos to userspace YiFei Zhu
@ 2021-10-06 16:02 ` YiFei Zhu
  2021-10-07  0:36   ` Song Liu
  2021-10-06 16:02 ` [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value YiFei Zhu
  2021-10-06 16:02 ` [PATCH bpf-next 3/3] selftests/bpf: Test bpf_export_errno behavior with cgroup/sockopt YiFei Zhu
  2 siblings, 1 reply; 19+ messages in thread
From: YiFei Zhu @ 2021-10-06 16:02 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev, YiFei Zhu

From: YiFei Zhu <zhuyifei@google.com>

Right now BPF_PROG_RUN_ARRAY and related macros return 1 or 0
for whether the prog array allows or rejects whatever is being
hooked. The caller of these macros then return -EPERM or continue
processing based on thw macro's return value. Unforunately this is
inflexible, since -EPERM is the only errno that can be returned.

This patch should be a no-op; it prepares for the next patch. The
returning of the -EPERM is moved to inside the macros, so the outer
functions are directly returning what the macros returned if they
are non-zero.

Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
---
 include/linux/bpf.h      | 16 +++++++++-------
 kernel/bpf/cgroup.c      | 41 +++++++++++++++-------------------------
 security/device_cgroup.c |  2 +-
 3 files changed, 25 insertions(+), 34 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 1c7fd7c4c6d3..938885562d68 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1187,7 +1187,7 @@ static inline void bpf_reset_run_ctx(struct bpf_run_ctx *old_ctx)
 
 typedef u32 (*bpf_prog_run_fn)(const struct bpf_prog *prog, const void *ctx);
 
-static __always_inline u32
+static __always_inline int
 BPF_PROG_RUN_ARRAY_CG_FLAGS(const struct bpf_prog_array __rcu *array_rcu,
 			    const void *ctx, bpf_prog_run_fn run_prog,
 			    u32 *ret_flags)
@@ -1197,7 +1197,7 @@ BPF_PROG_RUN_ARRAY_CG_FLAGS(const struct bpf_prog_array __rcu *array_rcu,
 	const struct bpf_prog_array *array;
 	struct bpf_run_ctx *old_run_ctx;
 	struct bpf_cg_run_ctx run_ctx;
-	u32 ret = 1;
+	int ret = 0;
 	u32 func_ret;
 
 	migrate_disable();
@@ -1208,7 +1208,8 @@ BPF_PROG_RUN_ARRAY_CG_FLAGS(const struct bpf_prog_array __rcu *array_rcu,
 	while ((prog = READ_ONCE(item->prog))) {
 		run_ctx.prog_item = item;
 		func_ret = run_prog(prog, ctx);
-		ret &= (func_ret & 1);
+		if (!(func_ret & 1))
+			ret = -EPERM;
 		*(ret_flags) |= (func_ret >> 1);
 		item++;
 	}
@@ -1218,7 +1219,7 @@ BPF_PROG_RUN_ARRAY_CG_FLAGS(const struct bpf_prog_array __rcu *array_rcu,
 	return ret;
 }
 
-static __always_inline u32
+static __always_inline int
 BPF_PROG_RUN_ARRAY_CG(const struct bpf_prog_array __rcu *array_rcu,
 		      const void *ctx, bpf_prog_run_fn run_prog)
 {
@@ -1227,7 +1228,7 @@ BPF_PROG_RUN_ARRAY_CG(const struct bpf_prog_array __rcu *array_rcu,
 	const struct bpf_prog_array *array;
 	struct bpf_run_ctx *old_run_ctx;
 	struct bpf_cg_run_ctx run_ctx;
-	u32 ret = 1;
+	int ret = 0;
 
 	migrate_disable();
 	rcu_read_lock();
@@ -1236,7 +1237,8 @@ BPF_PROG_RUN_ARRAY_CG(const struct bpf_prog_array __rcu *array_rcu,
 	old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
 	while ((prog = READ_ONCE(item->prog))) {
 		run_ctx.prog_item = item;
-		ret &= run_prog(prog, ctx);
+		if (!run_prog(prog, ctx))
+			ret = -EPERM;
 		item++;
 	}
 	bpf_reset_run_ctx(old_run_ctx);
@@ -1304,7 +1306,7 @@ BPF_PROG_RUN_ARRAY(const struct bpf_prog_array __rcu *array_rcu,
 		u32 _ret;				\
 		_ret = BPF_PROG_RUN_ARRAY_CG_FLAGS(array, ctx, func, &_flags); \
 		_cn = _flags & BPF_RET_SET_CN;		\
-		if (_ret)				\
+		if (!_ret)				\
 			_ret = (_cn ? NET_XMIT_CN : NET_XMIT_SUCCESS);	\
 		else					\
 			_ret = (_cn ? NET_XMIT_DROP : -EPERM);		\
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 03145d45e3d5..5efe2588575e 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -1044,7 +1044,6 @@ int __cgroup_bpf_run_filter_skb(struct sock *sk,
 	} else {
 		ret = BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[atype], skb,
 					    __bpf_prog_run_save_cb);
-		ret = (ret == 1 ? 0 : -EPERM);
 	}
 	bpf_restore_data_end(skb, saved_data_end);
 	__skb_pull(skb, offset);
@@ -1071,10 +1070,9 @@ int __cgroup_bpf_run_filter_sk(struct sock *sk,
 			       enum cgroup_bpf_attach_type atype)
 {
 	struct cgroup *cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
-	int ret;
 
-	ret = BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[atype], sk, bpf_prog_run);
-	return ret == 1 ? 0 : -EPERM;
+	return BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[atype], sk,
+				     bpf_prog_run);
 }
 EXPORT_SYMBOL(__cgroup_bpf_run_filter_sk);
 
@@ -1106,7 +1104,6 @@ int __cgroup_bpf_run_filter_sock_addr(struct sock *sk,
 	};
 	struct sockaddr_storage unspec;
 	struct cgroup *cgrp;
-	int ret;
 
 	/* Check socket family since not all sockets represent network
 	 * endpoint (e.g. AF_UNIX).
@@ -1120,10 +1117,8 @@ int __cgroup_bpf_run_filter_sock_addr(struct sock *sk,
 	}
 
 	cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
-	ret = BPF_PROG_RUN_ARRAY_CG_FLAGS(cgrp->bpf.effective[atype], &ctx,
-				          bpf_prog_run, flags);
-
-	return ret == 1 ? 0 : -EPERM;
+	return BPF_PROG_RUN_ARRAY_CG_FLAGS(cgrp->bpf.effective[atype], &ctx,
+					   bpf_prog_run, flags);
 }
 EXPORT_SYMBOL(__cgroup_bpf_run_filter_sock_addr);
 
@@ -1148,11 +1143,9 @@ int __cgroup_bpf_run_filter_sock_ops(struct sock *sk,
 				     enum cgroup_bpf_attach_type atype)
 {
 	struct cgroup *cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
-	int ret;
 
-	ret = BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[atype], sock_ops,
-				    bpf_prog_run);
-	return ret == 1 ? 0 : -EPERM;
+	return BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[atype], sock_ops,
+				     bpf_prog_run);
 }
 EXPORT_SYMBOL(__cgroup_bpf_run_filter_sock_ops);
 
@@ -1165,15 +1158,15 @@ int __cgroup_bpf_check_dev_permission(short dev_type, u32 major, u32 minor,
 		.major = major,
 		.minor = minor,
 	};
-	int allow;
+	int ret;
 
 	rcu_read_lock();
 	cgrp = task_dfl_cgroup(current);
-	allow = BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[atype], &ctx,
-				      bpf_prog_run);
+	ret = BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[atype], &ctx,
+				    bpf_prog_run);
 	rcu_read_unlock();
 
-	return !allow;
+	return ret;
 }
 
 static const struct bpf_func_proto *
@@ -1314,7 +1307,7 @@ int __cgroup_bpf_run_filter_sysctl(struct ctl_table_header *head,
 		kfree(ctx.new_val);
 	}
 
-	return ret == 1 ? 0 : -EPERM;
+	return ret;
 }
 
 #ifdef CONFIG_NET
@@ -1419,10 +1412,8 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
 				    &ctx, bpf_prog_run);
 	release_sock(sk);
 
-	if (!ret) {
-		ret = -EPERM;
+	if (ret)
 		goto out;
-	}
 
 	if (ctx.optlen == -1) {
 		/* optlen set to -1, bypass kernel */
@@ -1529,10 +1520,8 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
 				    &ctx, bpf_prog_run);
 	release_sock(sk);
 
-	if (!ret) {
-		ret = -EPERM;
+	if (ret)
 		goto out;
-	}
 
 	if (ctx.optlen > max_optlen || ctx.optlen < 0) {
 		ret = -EFAULT;
@@ -1588,8 +1577,8 @@ int __cgroup_bpf_run_filter_getsockopt_kern(struct sock *sk, int level,
 
 	ret = BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[CGROUP_GETSOCKOPT],
 				    &ctx, bpf_prog_run);
-	if (!ret)
-		return -EPERM;
+	if (ret)
+		return ret;
 
 	if (ctx.optlen > *optlen)
 		return -EFAULT;
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index 04375df52fc9..cd15c7994d34 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -837,7 +837,7 @@ int devcgroup_check_permission(short type, u32 major, u32 minor, short access)
 	int rc = BPF_CGROUP_RUN_PROG_DEVICE_CGROUP(type, major, minor, access);
 
 	if (rc)
-		return -EPERM;
+		return rc;
 
 	#ifdef CONFIG_CGROUP_DEVICE
 	return devcgroup_legacy_check_permission(type, major, minor, access);
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value
  2021-10-06 16:02 [PATCH bpf-next 0/3] bpf: allow cgroup progs to export custom errnos to userspace YiFei Zhu
  2021-10-06 16:02 ` [PATCH bpf-next 1/3] bpf: Make BPF_PROG_RUN_ARRAY return -errno instead of allow boolean YiFei Zhu
@ 2021-10-06 16:02 ` YiFei Zhu
  2021-10-07  0:41   ` Song Liu
  2021-10-20 23:28   ` Andrii Nakryiko
  2021-10-06 16:02 ` [PATCH bpf-next 3/3] selftests/bpf: Test bpf_export_errno behavior with cgroup/sockopt YiFei Zhu
  2 siblings, 2 replies; 19+ messages in thread
From: YiFei Zhu @ 2021-10-06 16:02 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev, YiFei Zhu

From: YiFei Zhu <zhuyifei@google.com>

When passed in a positive errno, it sets the errno and returns 0.
When passed in 0, it gets the previously set errno. When passed in
an out of bound number, it returns -EINVAL. This is unambiguous:
negative return values are error in invoking the helper itself,
and positive return values are errnos being exported. Errnos once
set cannot be unset, but can be overridden.

The errno value is stored inside bpf_cg_run_ctx for ease of access
different prog types with different context structs layouts. The
helper implementation can simply perform a container_of from
current->bpf_ctx to retrieve bpf_cg_run_ctx.

For backward compatibility, if a program rejects without calling
the helper, and the errno has not been set by any prior progs, the
BPF_PROG_RUN_ARRAY_CG family macros automatically set the errno to
EPERM. If a prog sets an errno but returns 1 (allow), the outcome
is considered implementation-defined. This patch treat it the same
way as if 0 (reject) is returned.

For BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY, the prior behavior is
that, if the return value is NET_XMIT_DROP, the packet is silently
dropped. We preserve this behavior for backward compatibility
reasons, so even if an errno is set, the errno does not return to
caller.

For getsockopt hooks, they are different in that bpf progs runs
after kernel processes the getsockopt syscall instead of before.
There is also a retval in its context struct in which bpf progs
can unset the retval, and can force an -EPERM by returning 0.
We preseve the same semantics. Even though there is retval,
that value can only be unset, while progs can set (and not unset)
additional errno by using the helper, and that will override
whatever is in retval.

Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
---
 include/linux/bpf.h            | 23 +++++++++++------------
 include/uapi/linux/bpf.h       | 14 ++++++++++++++
 kernel/bpf/cgroup.c            | 24 ++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h | 14 ++++++++++++++
 4 files changed, 63 insertions(+), 12 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 938885562d68..5e3f3d2f5871 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1155,6 +1155,7 @@ struct bpf_run_ctx {};
 struct bpf_cg_run_ctx {
 	struct bpf_run_ctx run_ctx;
 	const struct bpf_prog_array_item *prog_item;
+	int errno_val;
 };
 
 struct bpf_trace_run_ctx {
@@ -1196,8 +1197,7 @@ BPF_PROG_RUN_ARRAY_CG_FLAGS(const struct bpf_prog_array __rcu *array_rcu,
 	const struct bpf_prog *prog;
 	const struct bpf_prog_array *array;
 	struct bpf_run_ctx *old_run_ctx;
-	struct bpf_cg_run_ctx run_ctx;
-	int ret = 0;
+	struct bpf_cg_run_ctx run_ctx = {};
 	u32 func_ret;
 
 	migrate_disable();
@@ -1208,15 +1208,15 @@ BPF_PROG_RUN_ARRAY_CG_FLAGS(const struct bpf_prog_array __rcu *array_rcu,
 	while ((prog = READ_ONCE(item->prog))) {
 		run_ctx.prog_item = item;
 		func_ret = run_prog(prog, ctx);
-		if (!(func_ret & 1))
-			ret = -EPERM;
+		if (!(func_ret & 1) && !run_ctx.errno_val)
+			run_ctx.errno_val = EPERM;
 		*(ret_flags) |= (func_ret >> 1);
 		item++;
 	}
 	bpf_reset_run_ctx(old_run_ctx);
 	rcu_read_unlock();
 	migrate_enable();
-	return ret;
+	return -run_ctx.errno_val;
 }
 
 static __always_inline int
@@ -1227,8 +1227,7 @@ BPF_PROG_RUN_ARRAY_CG(const struct bpf_prog_array __rcu *array_rcu,
 	const struct bpf_prog *prog;
 	const struct bpf_prog_array *array;
 	struct bpf_run_ctx *old_run_ctx;
-	struct bpf_cg_run_ctx run_ctx;
-	int ret = 0;
+	struct bpf_cg_run_ctx run_ctx = {};
 
 	migrate_disable();
 	rcu_read_lock();
@@ -1237,14 +1236,14 @@ BPF_PROG_RUN_ARRAY_CG(const struct bpf_prog_array __rcu *array_rcu,
 	old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
 	while ((prog = READ_ONCE(item->prog))) {
 		run_ctx.prog_item = item;
-		if (!run_prog(prog, ctx))
-			ret = -EPERM;
+		if (!run_prog(prog, ctx) && !run_ctx.errno_val)
+			run_ctx.errno_val = EPERM;
 		item++;
 	}
 	bpf_reset_run_ctx(old_run_ctx);
 	rcu_read_unlock();
 	migrate_enable();
-	return ret;
+	return -run_ctx.errno_val;
 }
 
 static __always_inline u32
@@ -1297,7 +1296,7 @@ BPF_PROG_RUN_ARRAY(const struct bpf_prog_array __rcu *array_rcu,
  *   0: NET_XMIT_SUCCESS  skb should be transmitted
  *   1: NET_XMIT_DROP     skb should be dropped and cn
  *   2: NET_XMIT_CN       skb should be transmitted and cn
- *   3: -EPERM            skb should be dropped
+ *   3: -errno            skb should be dropped
  */
 #define BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY(array, ctx, func)		\
 	({						\
@@ -1309,7 +1308,7 @@ BPF_PROG_RUN_ARRAY(const struct bpf_prog_array __rcu *array_rcu,
 		if (!_ret)				\
 			_ret = (_cn ? NET_XMIT_CN : NET_XMIT_SUCCESS);	\
 		else					\
-			_ret = (_cn ? NET_XMIT_DROP : -EPERM);		\
+			_ret = (_cn ? NET_XMIT_DROP : _ret);		\
 		_ret;					\
 	})
 
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 6fc59d61937a..d8126f8c0541 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4909,6 +4909,19 @@ union bpf_attr {
  *	Return
  *		The number of bytes written to the buffer, or a negative error
  *		in case of failure.
+ *
+ * int bpf_export_errno(int errno_val)
+ *	Description
+ *		If *errno_val* is positive, set the syscall's return error code;
+ *		if *errno_val* is zero, retrieve the previously set code.
+ *
+ *		This helper is currently supported by cgroup programs only.
+ *	Return
+ *		Zero if set is successful, or the previously set error code on
+ *		retrieval. Previously set code may be zero if it was never set.
+ *		On error, a negative value.
+ *
+ *		**-EINVAL** if *errno_val* not between zero and MAX_ERRNO inclusive.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5089,6 +5102,7 @@ union bpf_attr {
 	FN(task_pt_regs),		\
 	FN(get_branch_snapshot),	\
 	FN(trace_vprintk),		\
+	FN(export_errno),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 5efe2588575e..5b5051eb43e6 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -1169,6 +1169,28 @@ int __cgroup_bpf_check_dev_permission(short dev_type, u32 major, u32 minor,
 	return ret;
 }
 
+BPF_CALL_1(bpf_export_errno, int, errno_val)
+{
+	struct bpf_cg_run_ctx *ctx =
+		container_of(current->bpf_ctx, struct bpf_cg_run_ctx, run_ctx);
+
+	if (errno_val < 0 || errno_val > MAX_ERRNO)
+		return -EINVAL;
+
+	if (!errno_val)
+		return ctx->errno_val;
+
+	ctx->errno_val = errno_val;
+	return 0;
+}
+
+static const struct bpf_func_proto bpf_export_errno_proto = {
+	.func		= bpf_export_errno,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_ANYTHING,
+};
+
 static const struct bpf_func_proto *
 cgroup_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -1181,6 +1203,8 @@ cgroup_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_current_cgroup_id_proto;
 	case BPF_FUNC_perf_event_output:
 		return &bpf_event_output_data_proto;
+	case BPF_FUNC_export_errno:
+		return &bpf_export_errno_proto;
 	default:
 		return bpf_base_func_proto(func_id);
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 6fc59d61937a..d8126f8c0541 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4909,6 +4909,19 @@ union bpf_attr {
  *	Return
  *		The number of bytes written to the buffer, or a negative error
  *		in case of failure.
+ *
+ * int bpf_export_errno(int errno_val)
+ *	Description
+ *		If *errno_val* is positive, set the syscall's return error code;
+ *		if *errno_val* is zero, retrieve the previously set code.
+ *
+ *		This helper is currently supported by cgroup programs only.
+ *	Return
+ *		Zero if set is successful, or the previously set error code on
+ *		retrieval. Previously set code may be zero if it was never set.
+ *		On error, a negative value.
+ *
+ *		**-EINVAL** if *errno_val* not between zero and MAX_ERRNO inclusive.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5089,6 +5102,7 @@ union bpf_attr {
 	FN(task_pt_regs),		\
 	FN(get_branch_snapshot),	\
 	FN(trace_vprintk),		\
+	FN(export_errno),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH bpf-next 3/3] selftests/bpf: Test bpf_export_errno behavior with cgroup/sockopt
  2021-10-06 16:02 [PATCH bpf-next 0/3] bpf: allow cgroup progs to export custom errnos to userspace YiFei Zhu
  2021-10-06 16:02 ` [PATCH bpf-next 1/3] bpf: Make BPF_PROG_RUN_ARRAY return -errno instead of allow boolean YiFei Zhu
  2021-10-06 16:02 ` [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value YiFei Zhu
@ 2021-10-06 16:02 ` YiFei Zhu
  2021-10-18 17:51   ` Song Liu
  2 siblings, 1 reply; 19+ messages in thread
From: YiFei Zhu @ 2021-10-06 16:02 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev, YiFei Zhu

From: YiFei Zhu <zhuyifei@google.com>

The tests checks how different ways of interacting with the helper
(getting errno, setting EUNATCH, EISCONN, and legacy reject
returning 0 without setting errno), produce different results in
both the setsockopt syscall and the errno value returned by the
helper. A few more tests verify the interaction between the
exported errno and the retval in getsockopt context.

Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
---
 .../bpf/prog_tests/cgroup_export_errno.c      | 472 ++++++++++++++++++
 .../progs/cgroup_export_errno_getsockopt.c    |  45 ++
 .../progs/cgroup_export_errno_setsockopt.c    |  52 ++
 3 files changed, 569 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/cgroup_export_errno.c
 create mode 100644 tools/testing/selftests/bpf/progs/cgroup_export_errno_getsockopt.c
 create mode 100644 tools/testing/selftests/bpf/progs/cgroup_export_errno_setsockopt.c

diff --git a/tools/testing/selftests/bpf/prog_tests/cgroup_export_errno.c b/tools/testing/selftests/bpf/prog_tests/cgroup_export_errno.c
new file mode 100644
index 000000000000..c472267f8427
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/cgroup_export_errno.c
@@ -0,0 +1,472 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright 2021 Google LLC.
+ */
+
+#include <test_progs.h>
+#include <cgroup_helpers.h>
+#include <network_helpers.h>
+
+#include "cgroup_export_errno_setsockopt.skel.h"
+#include "cgroup_export_errno_getsockopt.skel.h"
+
+#define SOL_CUSTOM	0xdeadbeef
+
+static int zero;
+
+static void test_setsockopt_set(int cgroup_fd, int sock_fd)
+{
+	struct cgroup_export_errno_setsockopt *obj;
+	struct bpf_link *link_set_eunatch = NULL;
+
+	obj = cgroup_export_errno_setsockopt__open_and_load();
+	if (!ASSERT_OK_PTR(obj, "skel-load"))
+		return;
+
+	/* Attach setsockopt that sets EUNATCH, assert that
+	 * we actually get that error when we run setsockopt()
+	 */
+	link_set_eunatch = bpf_program__attach_cgroup(obj->progs.set_eunatch,
+						      cgroup_fd);
+	if (!ASSERT_OK_PTR(link_set_eunatch, "cg-attach-set_eunatch"))
+		goto close_bpf_object;
+
+	if (!ASSERT_ERR(setsockopt(sock_fd, SOL_SOCKET, SO_REUSEADDR,
+				   &zero, sizeof(int)), "setsockopt"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(errno, EUNATCH, "setsockopt-errno"))
+		goto close_bpf_object;
+
+	if (!ASSERT_EQ(obj->bss->invocations, 1, "invocations"))
+		goto close_bpf_object;
+	if (!ASSERT_FALSE(obj->bss->assertion_error, "assertion_error"))
+		goto close_bpf_object;
+
+close_bpf_object:
+	bpf_link__destroy(link_set_eunatch);
+
+	cgroup_export_errno_setsockopt__destroy(obj);
+}
+
+static void test_setsockopt_set_and_get(int cgroup_fd, int sock_fd)
+{
+	struct cgroup_export_errno_setsockopt *obj;
+	struct bpf_link *link_set_eunatch = NULL, *link_get_errno = NULL;
+
+	obj = cgroup_export_errno_setsockopt__open_and_load();
+	if (!ASSERT_OK_PTR(obj, "skel-load"))
+		return;
+
+	/* Attach setsockopt that sets EUNATCH, and one that gets the
+	 * previously set errno. Assert that we get the same errno back.
+	 */
+	link_set_eunatch = bpf_program__attach_cgroup(obj->progs.set_eunatch,
+						      cgroup_fd);
+	if (!ASSERT_OK_PTR(link_set_eunatch, "cg-attach-set_eunatch"))
+		goto close_bpf_object;
+	link_get_errno = bpf_program__attach_cgroup(obj->progs.get_errno,
+						    cgroup_fd);
+	if (!ASSERT_OK_PTR(link_get_errno, "cg-attach-get_errno"))
+		goto close_bpf_object;
+
+	if (!ASSERT_ERR(setsockopt(sock_fd, SOL_SOCKET, SO_REUSEADDR,
+				   &zero, sizeof(int)), "setsockopt"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(errno, EUNATCH, "setsockopt-errno"))
+		goto close_bpf_object;
+
+	if (!ASSERT_EQ(obj->bss->invocations, 2, "invocations"))
+		goto close_bpf_object;
+	if (!ASSERT_FALSE(obj->bss->assertion_error, "assertion_error"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(obj->bss->errno_value, EUNATCH, "errno_value"))
+		goto close_bpf_object;
+
+close_bpf_object:
+	bpf_link__destroy(link_set_eunatch);
+	bpf_link__destroy(link_get_errno);
+
+	cgroup_export_errno_setsockopt__destroy(obj);
+}
+
+static void test_setsockopt_default_zero(int cgroup_fd, int sock_fd)
+{
+	struct cgroup_export_errno_setsockopt *obj;
+	struct bpf_link *link_get_errno = NULL;
+
+	obj = cgroup_export_errno_setsockopt__open_and_load();
+	if (!ASSERT_OK_PTR(obj, "skel-load"))
+		return;
+
+	/* Attach setsockopt that gets the previously set errno.
+	 * Assert that, without anything setting one, we get 0.
+	 */
+	link_get_errno = bpf_program__attach_cgroup(obj->progs.get_errno,
+						    cgroup_fd);
+	if (!ASSERT_OK_PTR(link_get_errno, "cg-attach-get_errno"))
+		goto close_bpf_object;
+
+	if (!ASSERT_OK(setsockopt(sock_fd, SOL_SOCKET, SO_REUSEADDR,
+				  &zero, sizeof(int)), "setsockopt"))
+		goto close_bpf_object;
+
+	if (!ASSERT_EQ(obj->bss->invocations, 1, "invocations"))
+		goto close_bpf_object;
+	if (!ASSERT_FALSE(obj->bss->assertion_error, "assertion_error"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(obj->bss->errno_value, 0, "errno_value"))
+		goto close_bpf_object;
+
+close_bpf_object:
+	bpf_link__destroy(link_get_errno);
+
+	cgroup_export_errno_setsockopt__destroy(obj);
+}
+
+static void test_setsockopt_default_zero_and_set(int cgroup_fd, int sock_fd)
+{
+	struct cgroup_export_errno_setsockopt *obj;
+	struct bpf_link *link_get_errno = NULL, *link_set_eunatch = NULL;
+
+	obj = cgroup_export_errno_setsockopt__open_and_load();
+	if (!ASSERT_OK_PTR(obj, "skel-load"))
+		return;
+
+	/* Attach setsockopt that gets the previously set errno, and then
+	 * one that sets the errno to EUNATCH. Assert that the get does not
+	 * see EUNATCH set later, and does not prevent EUNATCH from being set.
+	 */
+	link_get_errno = bpf_program__attach_cgroup(obj->progs.get_errno,
+						    cgroup_fd);
+	if (!ASSERT_OK_PTR(link_get_errno, "cg-attach-get_errno"))
+		goto close_bpf_object;
+	link_set_eunatch = bpf_program__attach_cgroup(obj->progs.set_eunatch,
+						      cgroup_fd);
+	if (!ASSERT_OK_PTR(link_set_eunatch, "cg-attach-set_eunatch"))
+		goto close_bpf_object;
+
+	if (!ASSERT_ERR(setsockopt(sock_fd, SOL_SOCKET, SO_REUSEADDR,
+				   &zero, sizeof(int)), "setsockopt"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(errno, EUNATCH, "setsockopt-errno"))
+		goto close_bpf_object;
+
+	if (!ASSERT_EQ(obj->bss->invocations, 2, "invocations"))
+		goto close_bpf_object;
+	if (!ASSERT_FALSE(obj->bss->assertion_error, "assertion_error"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(obj->bss->errno_value, 0, "errno_value"))
+		goto close_bpf_object;
+
+close_bpf_object:
+	bpf_link__destroy(link_get_errno);
+	bpf_link__destroy(link_set_eunatch);
+
+	cgroup_export_errno_setsockopt__destroy(obj);
+}
+
+static void test_setsockopt_override(int cgroup_fd, int sock_fd)
+{
+	struct cgroup_export_errno_setsockopt *obj;
+	struct bpf_link *link_set_eunatch = NULL, *link_set_eisconn = NULL;
+	struct bpf_link *link_get_errno = NULL;
+
+	obj = cgroup_export_errno_setsockopt__open_and_load();
+	if (!ASSERT_OK_PTR(obj, "skel-load"))
+		return;
+
+	/* Attach setsockopt that sets EUNATCH, then one that sets EISCONN,
+	 * and then one that gets the exported errno. Assert both the syscall
+	 * and the helper sees the last set errno.
+	 */
+	link_set_eunatch = bpf_program__attach_cgroup(obj->progs.set_eunatch,
+						      cgroup_fd);
+	if (!ASSERT_OK_PTR(link_set_eunatch, "cg-attach-set_eunatch"))
+		goto close_bpf_object;
+	link_set_eisconn = bpf_program__attach_cgroup(obj->progs.set_eisconn,
+						      cgroup_fd);
+	if (!ASSERT_OK_PTR(link_set_eisconn, "cg-attach-set_eisconn"))
+		goto close_bpf_object;
+	link_get_errno = bpf_program__attach_cgroup(obj->progs.get_errno,
+						    cgroup_fd);
+	if (!ASSERT_OK_PTR(link_get_errno, "cg-attach-get_errno"))
+		goto close_bpf_object;
+
+	if (!ASSERT_ERR(setsockopt(sock_fd, SOL_SOCKET, SO_REUSEADDR,
+				   &zero, sizeof(int)), "setsockopt"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(errno, EISCONN, "setsockopt-errno"))
+		goto close_bpf_object;
+
+	if (!ASSERT_EQ(obj->bss->invocations, 3, "invocations"))
+		goto close_bpf_object;
+	if (!ASSERT_FALSE(obj->bss->assertion_error, "assertion_error"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(obj->bss->errno_value, EISCONN, "errno_value"))
+		goto close_bpf_object;
+
+close_bpf_object:
+	bpf_link__destroy(link_set_eunatch);
+	bpf_link__destroy(link_set_eisconn);
+	bpf_link__destroy(link_get_errno);
+
+	cgroup_export_errno_setsockopt__destroy(obj);
+}
+
+static void test_setsockopt_legacy_eperm(int cgroup_fd, int sock_fd)
+{
+	struct cgroup_export_errno_setsockopt *obj;
+	struct bpf_link *link_legacy_eperm = NULL, *link_get_errno = NULL;
+
+	obj = cgroup_export_errno_setsockopt__open_and_load();
+	if (!ASSERT_OK_PTR(obj, "skel-load"))
+		return;
+
+	/* Attach setsockopt that return a reject without setting errno
+	 * (legacy reject), and one that gets the errno. Assert that for
+	 * backward compatibility the syscall result in EPERM, and this
+	 * is also visible to the helper.
+	 */
+	link_legacy_eperm = bpf_program__attach_cgroup(obj->progs.legacy_eperm,
+						       cgroup_fd);
+	if (!ASSERT_OK_PTR(link_legacy_eperm, "cg-attach-legacy_eperm"))
+		goto close_bpf_object;
+	link_get_errno = bpf_program__attach_cgroup(obj->progs.get_errno,
+						    cgroup_fd);
+	if (!ASSERT_OK_PTR(link_get_errno, "cg-attach-get_errno"))
+		goto close_bpf_object;
+
+	if (!ASSERT_ERR(setsockopt(sock_fd, SOL_SOCKET, SO_REUSEADDR,
+				   &zero, sizeof(int)), "setsockopt"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(errno, EPERM, "setsockopt-errno"))
+		goto close_bpf_object;
+
+	if (!ASSERT_EQ(obj->bss->invocations, 2, "invocations"))
+		goto close_bpf_object;
+	if (!ASSERT_FALSE(obj->bss->assertion_error, "assertion_error"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(obj->bss->errno_value, EPERM, "errno_value"))
+		goto close_bpf_object;
+
+close_bpf_object:
+	bpf_link__destroy(link_legacy_eperm);
+	bpf_link__destroy(link_get_errno);
+
+	cgroup_export_errno_setsockopt__destroy(obj);
+}
+
+static void test_setsockopt_legacy_no_override(int cgroup_fd, int sock_fd)
+{
+	struct cgroup_export_errno_setsockopt *obj;
+	struct bpf_link *link_set_eunatch = NULL, *link_legacy_eperm = NULL;
+	struct bpf_link *link_get_errno = NULL;
+
+	obj = cgroup_export_errno_setsockopt__open_and_load();
+	if (!ASSERT_OK_PTR(obj, "skel-load"))
+		return;
+
+	/* Attach setsockopt that sets EUNATCH, then one that return a reject
+	 * without setting errno, and then one that gets the exported errno.
+	 * Assert both the syscall and the helper's errno are unaffected by
+	 * the second prog (i.e. legacy rejects does not override the errno
+	 * to EPERM).
+	 */
+	link_set_eunatch = bpf_program__attach_cgroup(obj->progs.set_eunatch,
+						      cgroup_fd);
+	if (!ASSERT_OK_PTR(link_set_eunatch, "cg-attach-set_eunatch"))
+		goto close_bpf_object;
+	link_legacy_eperm = bpf_program__attach_cgroup(obj->progs.legacy_eperm,
+						       cgroup_fd);
+	if (!ASSERT_OK_PTR(link_legacy_eperm, "cg-attach-legacy_eperm"))
+		goto close_bpf_object;
+	link_get_errno = bpf_program__attach_cgroup(obj->progs.get_errno,
+						    cgroup_fd);
+	if (!ASSERT_OK_PTR(link_get_errno, "cg-attach-get_errno"))
+		goto close_bpf_object;
+
+	if (!ASSERT_ERR(setsockopt(sock_fd, SOL_SOCKET, SO_REUSEADDR,
+				   &zero, sizeof(int)), "setsockopt"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(errno, EUNATCH, "setsockopt-errno"))
+		goto close_bpf_object;
+
+	if (!ASSERT_EQ(obj->bss->invocations, 3, "invocations"))
+		goto close_bpf_object;
+	if (!ASSERT_FALSE(obj->bss->assertion_error, "assertion_error"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(obj->bss->errno_value, EUNATCH, "errno_value"))
+		goto close_bpf_object;
+
+close_bpf_object:
+	bpf_link__destroy(link_set_eunatch);
+	bpf_link__destroy(link_legacy_eperm);
+	bpf_link__destroy(link_get_errno);
+
+	cgroup_export_errno_setsockopt__destroy(obj);
+}
+
+static void test_getsockopt_get(int cgroup_fd, int sock_fd)
+{
+	struct cgroup_export_errno_getsockopt *obj;
+	struct bpf_link *link_get_errno = NULL;
+	int buf;
+	socklen_t optlen = sizeof(buf);
+
+	obj = cgroup_export_errno_getsockopt__open_and_load();
+	if (!ASSERT_OK_PTR(obj, "skel-load"))
+		return;
+
+	/* Attach getsockopt that gets previously set errno. Assert that the
+	 * error from kernel is in retval_value and not errno_value.
+	 */
+	link_get_errno = bpf_program__attach_cgroup(obj->progs.get_errno,
+						    cgroup_fd);
+	if (!ASSERT_OK_PTR(link_get_errno, "cg-attach-get_errno"))
+		goto close_bpf_object;
+
+	if (!ASSERT_ERR(getsockopt(sock_fd, SOL_CUSTOM, 0,
+				   &buf, &optlen), "getsockopt"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(errno, EOPNOTSUPP, "getsockopt-errno"))
+		goto close_bpf_object;
+
+	if (!ASSERT_EQ(obj->bss->invocations, 1, "invocations"))
+		goto close_bpf_object;
+	if (!ASSERT_FALSE(obj->bss->assertion_error, "assertion_error"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(obj->bss->errno_value, 0, "errno_value"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(obj->bss->retval_value, -EOPNOTSUPP, "errno_value"))
+		goto close_bpf_object;
+
+close_bpf_object:
+	bpf_link__destroy(link_get_errno);
+
+	cgroup_export_errno_getsockopt__destroy(obj);
+}
+
+static void test_getsockopt_override(int cgroup_fd, int sock_fd)
+{
+	struct cgroup_export_errno_getsockopt *obj;
+	struct bpf_link *link_set_eisconn = NULL;
+	int buf;
+	socklen_t optlen = sizeof(buf);
+
+	obj = cgroup_export_errno_getsockopt__open_and_load();
+	if (!ASSERT_OK_PTR(obj, "skel-load"))
+		return;
+
+	/* Attach getsockopt that sets errno to EISCONN. Assert that this
+	 * overrides the value from kernel.
+	 */
+	link_set_eisconn = bpf_program__attach_cgroup(obj->progs.set_eisconn,
+						      cgroup_fd);
+	if (!ASSERT_OK_PTR(link_set_eisconn, "cg-attach-set_eisconn"))
+		goto close_bpf_object;
+
+	if (!ASSERT_ERR(getsockopt(sock_fd, SOL_CUSTOM, 0,
+				   &buf, &optlen), "getsockopt"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(errno, EISCONN, "getsockopt-errno"))
+		goto close_bpf_object;
+
+	if (!ASSERT_EQ(obj->bss->invocations, 1, "invocations"))
+		goto close_bpf_object;
+	if (!ASSERT_FALSE(obj->bss->assertion_error, "assertion_error"))
+		goto close_bpf_object;
+
+close_bpf_object:
+	bpf_link__destroy(link_set_eisconn);
+
+	cgroup_export_errno_getsockopt__destroy(obj);
+}
+
+static void test_getsockopt_retval_no_clear_errno(int cgroup_fd, int sock_fd)
+{
+	struct cgroup_export_errno_getsockopt *obj;
+	struct bpf_link *link_set_eisconn = NULL, *link_clear_retval = NULL;
+	int buf;
+	socklen_t optlen = sizeof(buf);
+
+	obj = cgroup_export_errno_getsockopt__open_and_load();
+	if (!ASSERT_OK_PTR(obj, "skel-load"))
+		return;
+
+	/* Attach getsockopt that sets errno to EISCONN, and one that clears
+	 * retval. Assert that the clearing retval does not clear EISCONN.
+	 */
+	link_set_eisconn = bpf_program__attach_cgroup(obj->progs.set_eisconn,
+						      cgroup_fd);
+	if (!ASSERT_OK_PTR(link_set_eisconn, "cg-attach-set_eisconn"))
+		goto close_bpf_object;
+	link_clear_retval = bpf_program__attach_cgroup(obj->progs.clear_retval,
+						       cgroup_fd);
+	if (!ASSERT_OK_PTR(link_clear_retval, "cg-attach-clear_retval"))
+		goto close_bpf_object;
+
+	if (!ASSERT_ERR(getsockopt(sock_fd, SOL_CUSTOM, 0,
+				   &buf, &optlen), "getsockopt"))
+		goto close_bpf_object;
+	if (!ASSERT_EQ(errno, EISCONN, "getsockopt-errno"))
+		goto close_bpf_object;
+
+	if (!ASSERT_EQ(obj->bss->invocations, 2, "invocations"))
+		goto close_bpf_object;
+	if (!ASSERT_FALSE(obj->bss->assertion_error, "assertion_error"))
+		goto close_bpf_object;
+
+close_bpf_object:
+	bpf_link__destroy(link_set_eisconn);
+	bpf_link__destroy(link_clear_retval);
+
+	cgroup_export_errno_getsockopt__destroy(obj);
+}
+
+void test_cgroup_export_errno(void)
+{
+	int cgroup_fd = -1;
+	int sock_fd = -1;
+
+	cgroup_fd = test__join_cgroup("/cgroup_export_errno");
+	if (!ASSERT_GE(cgroup_fd, 0, "cg-create"))
+		goto close_fd;
+
+	sock_fd = start_server(AF_INET, SOCK_DGRAM, NULL, 0, 0);
+	if (!ASSERT_GE(sock_fd, 0, "start-server"))
+		goto close_fd;
+
+	if (test__start_subtest("setsockopt-set"))
+		test_setsockopt_set(cgroup_fd, sock_fd);
+
+	if (test__start_subtest("setsockopt-set_and_get"))
+		test_setsockopt_set_and_get(cgroup_fd, sock_fd);
+
+	if (test__start_subtest("setsockopt-default_zero"))
+		test_setsockopt_default_zero(cgroup_fd, sock_fd);
+
+	if (test__start_subtest("setsockopt-default_zero_and_set"))
+		test_setsockopt_default_zero_and_set(cgroup_fd, sock_fd);
+
+	if (test__start_subtest("setsockopt-override"))
+		test_setsockopt_override(cgroup_fd, sock_fd);
+
+	if (test__start_subtest("setsockopt-legacy_eperm"))
+		test_setsockopt_legacy_eperm(cgroup_fd, sock_fd);
+
+	if (test__start_subtest("setsockopt-legacy_no_override"))
+		test_setsockopt_legacy_no_override(cgroup_fd, sock_fd);
+
+	if (test__start_subtest("getsockopt-get"))
+		test_getsockopt_get(cgroup_fd, sock_fd);
+
+	if (test__start_subtest("getsockopt-override"))
+		test_getsockopt_override(cgroup_fd, sock_fd);
+
+	if (test__start_subtest("getsockopt-retval_no_clear_errno"))
+		test_getsockopt_retval_no_clear_errno(cgroup_fd, sock_fd);
+
+close_fd:
+	close(cgroup_fd);
+}
diff --git a/tools/testing/selftests/bpf/progs/cgroup_export_errno_getsockopt.c b/tools/testing/selftests/bpf/progs/cgroup_export_errno_getsockopt.c
new file mode 100644
index 000000000000..2429e66b325a
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/cgroup_export_errno_getsockopt.c
@@ -0,0 +1,45 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright 2021 Google LLC.
+ */
+
+#include <errno.h>
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+__u32 invocations = 0;
+__u32 assertion_error = 0;
+__u32 errno_value = 0;
+__u32 retval_value = 0;
+
+SEC("cgroup/getsockopt")
+int get_errno(struct bpf_sockopt *ctx)
+{
+	errno_value = bpf_export_errno(0);
+	retval_value = ctx->retval;
+	__sync_fetch_and_add(&invocations, 1);
+
+	return 1;
+}
+
+SEC("cgroup/getsockopt")
+int set_eisconn(struct bpf_sockopt *ctx)
+{
+	__sync_fetch_and_add(&invocations, 1);
+
+	if (bpf_export_errno(EISCONN))
+		assertion_error = 1;
+
+	return 1;
+}
+
+SEC("cgroup/getsockopt")
+int clear_retval(struct bpf_sockopt *ctx)
+{
+	__sync_fetch_and_add(&invocations, 1);
+
+	ctx->retval = 0;
+
+	return 1;
+}
diff --git a/tools/testing/selftests/bpf/progs/cgroup_export_errno_setsockopt.c b/tools/testing/selftests/bpf/progs/cgroup_export_errno_setsockopt.c
new file mode 100644
index 000000000000..f8585e100863
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/cgroup_export_errno_setsockopt.c
@@ -0,0 +1,52 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright 2021 Google LLC.
+ */
+
+#include <errno.h>
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+__u32 invocations = 0;
+__u32 assertion_error = 0;
+__u32 errno_value = 0;
+
+SEC("cgroup/setsockopt")
+int get_errno(struct bpf_sockopt *ctx)
+{
+	errno_value = bpf_export_errno(0);
+	__sync_fetch_and_add(&invocations, 1);
+
+	return 1;
+}
+
+SEC("cgroup/setsockopt")
+int set_eunatch(struct bpf_sockopt *ctx)
+{
+	__sync_fetch_and_add(&invocations, 1);
+
+	if (bpf_export_errno(EUNATCH))
+		assertion_error = 1;
+
+	return 0;
+}
+
+SEC("cgroup/setsockopt")
+int set_eisconn(struct bpf_sockopt *ctx)
+{
+	__sync_fetch_and_add(&invocations, 1);
+
+	if (bpf_export_errno(EISCONN))
+		assertion_error = 1;
+
+	return 0;
+}
+
+SEC("cgroup/setsockopt")
+int legacy_eperm(struct bpf_sockopt *ctx)
+{
+	__sync_fetch_and_add(&invocations, 1);
+
+	return 0;
+}
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 1/3] bpf: Make BPF_PROG_RUN_ARRAY return -errno instead of allow boolean
  2021-10-06 16:02 ` [PATCH bpf-next 1/3] bpf: Make BPF_PROG_RUN_ARRAY return -errno instead of allow boolean YiFei Zhu
@ 2021-10-07  0:36   ` Song Liu
  0 siblings, 0 replies; 19+ messages in thread
From: Song Liu @ 2021-10-07  0:36 UTC (permalink / raw)
  To: YiFei Zhu
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev, YiFei Zhu

On Wed, Oct 6, 2021 at 9:04 AM YiFei Zhu <zhuyifei1999@gmail.com> wrote:
>
> From: YiFei Zhu <zhuyifei@google.com>
>
> Right now BPF_PROG_RUN_ARRAY and related macros return 1 or 0
> for whether the prog array allows or rejects whatever is being
> hooked. The caller of these macros then return -EPERM or continue
> processing based on thw macro's return value. Unforunately this is
> inflexible, since -EPERM is the only errno that can be returned.
>
> This patch should be a no-op; it prepares for the next patch. The
> returning of the -EPERM is moved to inside the macros, so the outer
> functions are directly returning what the macros returned if they
> are non-zero.
>
> Signed-off-by: YiFei Zhu <zhuyifei@google.com>
> Reviewed-by: Stanislav Fomichev <sdf@google.com>

Acked-by: Song Liu <songliubraving@fb.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value
  2021-10-06 16:02 ` [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value YiFei Zhu
@ 2021-10-07  0:41   ` Song Liu
  2021-10-07  5:59     ` Song Liu
  2021-10-20 23:28   ` Andrii Nakryiko
  1 sibling, 1 reply; 19+ messages in thread
From: Song Liu @ 2021-10-07  0:41 UTC (permalink / raw)
  To: YiFei Zhu
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev, YiFei Zhu

On Wed, Oct 6, 2021 at 9:04 AM YiFei Zhu <zhuyifei1999@gmail.com> wrote:
>
> From: YiFei Zhu <zhuyifei@google.com>
>
> When passed in a positive errno, it sets the errno and returns 0.
> When passed in 0, it gets the previously set errno. When passed in
> an out of bound number, it returns -EINVAL. This is unambiguous:
> negative return values are error in invoking the helper itself,
> and positive return values are errnos being exported. Errnos once
> set cannot be unset, but can be overridden.
>
> The errno value is stored inside bpf_cg_run_ctx for ease of access
> different prog types with different context structs layouts. The
> helper implementation can simply perform a container_of from
> current->bpf_ctx to retrieve bpf_cg_run_ctx.
>
> For backward compatibility, if a program rejects without calling
> the helper, and the errno has not been set by any prior progs, the
> BPF_PROG_RUN_ARRAY_CG family macros automatically set the errno to
> EPERM. If a prog sets an errno but returns 1 (allow), the outcome
> is considered implementation-defined. This patch treat it the same
> way as if 0 (reject) is returned.
>
> For BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY, the prior behavior is
> that, if the return value is NET_XMIT_DROP, the packet is silently
> dropped. We preserve this behavior for backward compatibility
> reasons, so even if an errno is set, the errno does not return to
> caller.
>
> For getsockopt hooks, they are different in that bpf progs runs
> after kernel processes the getsockopt syscall instead of before.
> There is also a retval in its context struct in which bpf progs
> can unset the retval, and can force an -EPERM by returning 0.
> We preseve the same semantics. Even though there is retval,
> that value can only be unset, while progs can set (and not unset)
> additional errno by using the helper, and that will override
> whatever is in retval.
>
> Signed-off-by: YiFei Zhu <zhuyifei@google.com>
> Reviewed-by: Stanislav Fomichev <sdf@google.com>

This is pretty complicated, but the logic looks all correct. Thus,

Acked-by: Song Liu <songliubraving@fb.com>

One question, if the program want to retrieve existing errno_val, and
set a different one, it needs to call the helper twice, right? I guess it
is possible to do that in one call with a "swap" logic. Would this work?

Thanks,
Song

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value
  2021-10-07  0:41   ` Song Liu
@ 2021-10-07  5:59     ` Song Liu
  2021-10-07 15:11       ` sdf
  0 siblings, 1 reply; 19+ messages in thread
From: Song Liu @ 2021-10-07  5:59 UTC (permalink / raw)
  To: YiFei Zhu
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev, YiFei Zhu

On Wed, Oct 6, 2021 at 5:41 PM Song Liu <song@kernel.org> wrote:
>
> On Wed, Oct 6, 2021 at 9:04 AM YiFei Zhu <zhuyifei1999@gmail.com> wrote:
> >
> > From: YiFei Zhu <zhuyifei@google.com>
> >
> > When passed in a positive errno, it sets the errno and returns 0.
> > When passed in 0, it gets the previously set errno. When passed in
> > an out of bound number, it returns -EINVAL. This is unambiguous:
> > negative return values are error in invoking the helper itself,
> > and positive return values are errnos being exported. Errnos once
> > set cannot be unset, but can be overridden.
> >
> > The errno value is stored inside bpf_cg_run_ctx for ease of access
> > different prog types with different context structs layouts. The
> > helper implementation can simply perform a container_of from
> > current->bpf_ctx to retrieve bpf_cg_run_ctx.
> >
> > For backward compatibility, if a program rejects without calling
> > the helper, and the errno has not been set by any prior progs, the
> > BPF_PROG_RUN_ARRAY_CG family macros automatically set the errno to
> > EPERM. If a prog sets an errno but returns 1 (allow), the outcome
> > is considered implementation-defined. This patch treat it the same
> > way as if 0 (reject) is returned.
> >
> > For BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY, the prior behavior is
> > that, if the return value is NET_XMIT_DROP, the packet is silently
> > dropped. We preserve this behavior for backward compatibility
> > reasons, so even if an errno is set, the errno does not return to
> > caller.
> >
> > For getsockopt hooks, they are different in that bpf progs runs
> > after kernel processes the getsockopt syscall instead of before.
> > There is also a retval in its context struct in which bpf progs
> > can unset the retval, and can force an -EPERM by returning 0.
> > We preseve the same semantics. Even though there is retval,
> > that value can only be unset, while progs can set (and not unset)
> > additional errno by using the helper, and that will override
> > whatever is in retval.
> >
> > Signed-off-by: YiFei Zhu <zhuyifei@google.com>
> > Reviewed-by: Stanislav Fomichev <sdf@google.com>
>
> This is pretty complicated, but the logic looks all correct. Thus,
>
> Acked-by: Song Liu <songliubraving@fb.com>
>
> One question, if the program want to retrieve existing errno_val, and
> set a different one, it needs to call the helper twice, right? I guess it
> is possible to do that in one call with a "swap" logic. Would this work?

Actually, how about we split this into two helpers:bpf_set_errno() and
bpf_get_errno(). This should avoid some confusion in long term.

Thanks,
Song

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value
  2021-10-07  5:59     ` Song Liu
@ 2021-10-07 15:11       ` sdf
  2021-10-07 16:23         ` YiFei Zhu
  0 siblings, 1 reply; 19+ messages in thread
From: sdf @ 2021-10-07 15:11 UTC (permalink / raw)
  To: Song Liu; +Cc: YiFei Zhu, bpf, Alexei Starovoitov, Daniel Borkmann, YiFei Zhu

On 10/06, Song Liu wrote:
> On Wed, Oct 6, 2021 at 5:41 PM Song Liu <song@kernel.org> wrote:
> >
> > On Wed, Oct 6, 2021 at 9:04 AM YiFei Zhu <zhuyifei1999@gmail.com> wrote:
> > >
> > > From: YiFei Zhu <zhuyifei@google.com>
> > >
> > > When passed in a positive errno, it sets the errno and returns 0.
> > > When passed in 0, it gets the previously set errno. When passed in
> > > an out of bound number, it returns -EINVAL. This is unambiguous:
> > > negative return values are error in invoking the helper itself,
> > > and positive return values are errnos being exported. Errnos once
> > > set cannot be unset, but can be overridden.
> > >
> > > The errno value is stored inside bpf_cg_run_ctx for ease of access
> > > different prog types with different context structs layouts. The
> > > helper implementation can simply perform a container_of from
> > > current->bpf_ctx to retrieve bpf_cg_run_ctx.
> > >
> > > For backward compatibility, if a program rejects without calling
> > > the helper, and the errno has not been set by any prior progs, the
> > > BPF_PROG_RUN_ARRAY_CG family macros automatically set the errno to
> > > EPERM. If a prog sets an errno but returns 1 (allow), the outcome
> > > is considered implementation-defined. This patch treat it the same
> > > way as if 0 (reject) is returned.
> > >
> > > For BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY, the prior behavior is
> > > that, if the return value is NET_XMIT_DROP, the packet is silently
> > > dropped. We preserve this behavior for backward compatibility
> > > reasons, so even if an errno is set, the errno does not return to
> > > caller.
> > >
> > > For getsockopt hooks, they are different in that bpf progs runs
> > > after kernel processes the getsockopt syscall instead of before.
> > > There is also a retval in its context struct in which bpf progs
> > > can unset the retval, and can force an -EPERM by returning 0.
> > > We preseve the same semantics. Even though there is retval,
> > > that value can only be unset, while progs can set (and not unset)
> > > additional errno by using the helper, and that will override
> > > whatever is in retval.
> > >
> > > Signed-off-by: YiFei Zhu <zhuyifei@google.com>
> > > Reviewed-by: Stanislav Fomichev <sdf@google.com>
> >
> > This is pretty complicated, but the logic looks all correct. Thus,
> >
> > Acked-by: Song Liu <songliubraving@fb.com>
> >
> > One question, if the program want to retrieve existing errno_val, and
> > set a different one, it needs to call the helper twice, right? I guess  
> it
> > is possible to do that in one call with a "swap" logic. Would this work?

> Actually, how about we split this into two helpers:bpf_set_errno() and
> bpf_get_errno(). This should avoid some confusion in long term.

We've agreed on the single helper during bpf office hours (about 2 weeks
ago), but we can do two, I don't think it matters that much.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value
  2021-10-07 15:11       ` sdf
@ 2021-10-07 16:23         ` YiFei Zhu
  2021-10-07 16:34           ` Song Liu
  0 siblings, 1 reply; 19+ messages in thread
From: YiFei Zhu @ 2021-10-07 16:23 UTC (permalink / raw)
  To: sdf; +Cc: Song Liu, YiFei Zhu, bpf, Alexei Starovoitov, Daniel Borkmann

Yeah it felt like we only needed one helper for the parameters and
return values to be unambiguous. But if two better avoid confusion for
users, we can do that.

YiFei Zhu

On Thu, Oct 7, 2021 at 8:11 AM <sdf@google.com> wrote:
>
> On 10/06, Song Liu wrote:
> > On Wed, Oct 6, 2021 at 5:41 PM Song Liu <song@kernel.org> wrote:
> > >
> > > On Wed, Oct 6, 2021 at 9:04 AM YiFei Zhu <zhuyifei1999@gmail.com> wrote:
> > > >
> > > > From: YiFei Zhu <zhuyifei@google.com>
> > > >
> > > > When passed in a positive errno, it sets the errno and returns 0.
> > > > When passed in 0, it gets the previously set errno. When passed in
> > > > an out of bound number, it returns -EINVAL. This is unambiguous:
> > > > negative return values are error in invoking the helper itself,
> > > > and positive return values are errnos being exported. Errnos once
> > > > set cannot be unset, but can be overridden.
> > > >
> > > > The errno value is stored inside bpf_cg_run_ctx for ease of access
> > > > different prog types with different context structs layouts. The
> > > > helper implementation can simply perform a container_of from
> > > > current->bpf_ctx to retrieve bpf_cg_run_ctx.
> > > >
> > > > For backward compatibility, if a program rejects without calling
> > > > the helper, and the errno has not been set by any prior progs, the
> > > > BPF_PROG_RUN_ARRAY_CG family macros automatically set the errno to
> > > > EPERM. If a prog sets an errno but returns 1 (allow), the outcome
> > > > is considered implementation-defined. This patch treat it the same
> > > > way as if 0 (reject) is returned.
> > > >
> > > > For BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY, the prior behavior is
> > > > that, if the return value is NET_XMIT_DROP, the packet is silently
> > > > dropped. We preserve this behavior for backward compatibility
> > > > reasons, so even if an errno is set, the errno does not return to
> > > > caller.
> > > >
> > > > For getsockopt hooks, they are different in that bpf progs runs
> > > > after kernel processes the getsockopt syscall instead of before.
> > > > There is also a retval in its context struct in which bpf progs
> > > > can unset the retval, and can force an -EPERM by returning 0.
> > > > We preseve the same semantics. Even though there is retval,
> > > > that value can only be unset, while progs can set (and not unset)
> > > > additional errno by using the helper, and that will override
> > > > whatever is in retval.
> > > >
> > > > Signed-off-by: YiFei Zhu <zhuyifei@google.com>
> > > > Reviewed-by: Stanislav Fomichev <sdf@google.com>
> > >
> > > This is pretty complicated, but the logic looks all correct. Thus,
> > >
> > > Acked-by: Song Liu <songliubraving@fb.com>
> > >
> > > One question, if the program want to retrieve existing errno_val, and
> > > set a different one, it needs to call the helper twice, right? I guess
> > it
> > > is possible to do that in one call with a "swap" logic. Would this work?
>
> > Actually, how about we split this into two helpers:bpf_set_errno() and
> > bpf_get_errno(). This should avoid some confusion in long term.
>
> We've agreed on the single helper during bpf office hours (about 2 weeks
> ago), but we can do two, I don't think it matters that much.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value
  2021-10-07 16:23         ` YiFei Zhu
@ 2021-10-07 16:34           ` Song Liu
  2021-10-08 20:49             ` YiFei Zhu
  0 siblings, 1 reply; 19+ messages in thread
From: Song Liu @ 2021-10-07 16:34 UTC (permalink / raw)
  To: YiFei Zhu
  Cc: Stanislav Fomichev, YiFei Zhu, bpf, Alexei Starovoitov, Daniel Borkmann

On Thu, Oct 7, 2021 at 9:23 AM YiFei Zhu <zhuyifei@google.com> wrote:
>
> Yeah it felt like we only needed one helper for the parameters and
> return values to be unambiguous. But if two better avoid confusion for
> users, we can do that.
>
> YiFei Zhu
>
[...]
> > > >
> > > > One question, if the program want to retrieve existing errno_val, and
> > > > set a different one, it needs to call the helper twice, right? I guess
> > > it
> > > > is possible to do that in one call with a "swap" logic. Would this work?
> >
> > > Actually, how about we split this into two helpers:bpf_set_errno() and
> > > bpf_get_errno(). This should avoid some confusion in long term.
> >
> > We've agreed on the single helper during bpf office hours (about 2 weeks
> > ago), but we can do two, I don't think it matters that much.

I see. If we agreed on this syntax, I won't object.

Thanks,
Song

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value
  2021-10-07 16:34           ` Song Liu
@ 2021-10-08 20:49             ` YiFei Zhu
  2021-10-08 21:00               ` Stanislav Fomichev
  0 siblings, 1 reply; 19+ messages in thread
From: YiFei Zhu @ 2021-10-08 20:49 UTC (permalink / raw)
  To: Song Liu
  Cc: Stanislav Fomichev, YiFei Zhu, bpf, Alexei Starovoitov, Daniel Borkmann

On Thu, Oct 7, 2021 at 9:34 AM Song Liu <song@kernel.org> wrote:
>
> On Thu, Oct 7, 2021 at 9:23 AM YiFei Zhu <zhuyifei@google.com> wrote:
> >
> > Yeah it felt like we only needed one helper for the parameters and
> > return values to be unambiguous. But if two better avoid confusion for
> > users, we can do that.
> >
> > YiFei Zhu
> >
> [...]
> > > > >
> > > > > One question, if the program want to retrieve existing errno_val, and
> > > > > set a different one, it needs to call the helper twice, right? I guess
> > > > it
> > > > > is possible to do that in one call with a "swap" logic. Would this work?
> > >
> > > > Actually, how about we split this into two helpers:bpf_set_errno() and
> > > > bpf_get_errno(). This should avoid some confusion in long term.
> > >
> > > We've agreed on the single helper during bpf office hours (about 2 weeks
> > > ago), but we can do two, I don't think it matters that much.
>
> I see. If we agreed on this syntax, I won't object.
>
> Thanks,
> Song

Shall I do the swap then? I don't think it has been discussed, and I
don't see any downsides from doing so, but I don't really see a
scenario in which someone would want to get and set at the same time
either.

YiFei Zhu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value
  2021-10-08 20:49             ` YiFei Zhu
@ 2021-10-08 21:00               ` Stanislav Fomichev
  0 siblings, 0 replies; 19+ messages in thread
From: Stanislav Fomichev @ 2021-10-08 21:00 UTC (permalink / raw)
  To: YiFei Zhu; +Cc: Song Liu, YiFei Zhu, bpf, Alexei Starovoitov, Daniel Borkmann

On Fri, Oct 8, 2021 at 1:49 PM YiFei Zhu <zhuyifei@google.com> wrote:
>
> On Thu, Oct 7, 2021 at 9:34 AM Song Liu <song@kernel.org> wrote:
> >
> > On Thu, Oct 7, 2021 at 9:23 AM YiFei Zhu <zhuyifei@google.com> wrote:
> > >
> > > Yeah it felt like we only needed one helper for the parameters and
> > > return values to be unambiguous. But if two better avoid confusion for
> > > users, we can do that.
> > >
> > > YiFei Zhu
> > >
> > [...]
> > > > > >
> > > > > > One question, if the program want to retrieve existing errno_val, and
> > > > > > set a different one, it needs to call the helper twice, right? I guess
> > > > > it
> > > > > > is possible to do that in one call with a "swap" logic. Would this work?
> > > >
> > > > > Actually, how about we split this into two helpers:bpf_set_errno() and
> > > > > bpf_get_errno(). This should avoid some confusion in long term.
> > > >
> > > > We've agreed on the single helper during bpf office hours (about 2 weeks
> > > > ago), but we can do two, I don't think it matters that much.
> >
> > I see. If we agreed on this syntax, I won't object.
> >
> > Thanks,
> > Song
>
> Shall I do the swap then? I don't think it has been discussed, and I
> don't see any downsides from doing so, but I don't really see a
> scenario in which someone would want to get and set at the same time
> either.

What kind of swap do you have in mind? IMO it's such a corner case
operation that doing 2 calls is fine. I'm assuming the majority of
use-cases are: (1) export a custom errno regardless of was was
previously done in the chain (2) see if there was already an errno set
in the chain and bail out early. I don't see any real need for some
efficient swapping and rewriting, but I might be missing something..

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 3/3] selftests/bpf: Test bpf_export_errno behavior with cgroup/sockopt
  2021-10-06 16:02 ` [PATCH bpf-next 3/3] selftests/bpf: Test bpf_export_errno behavior with cgroup/sockopt YiFei Zhu
@ 2021-10-18 17:51   ` Song Liu
  0 siblings, 0 replies; 19+ messages in thread
From: Song Liu @ 2021-10-18 17:51 UTC (permalink / raw)
  To: YiFei Zhu
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev, YiFei Zhu

On Wed, Oct 6, 2021 at 9:05 AM YiFei Zhu <zhuyifei1999@gmail.com> wrote:
>
> From: YiFei Zhu <zhuyifei@google.com>
>
> The tests checks how different ways of interacting with the helper
> (getting errno, setting EUNATCH, EISCONN, and legacy reject
> returning 0 without setting errno), produce different results in
> both the setsockopt syscall and the errno value returned by the
> helper. A few more tests verify the interaction between the
> exported errno and the retval in getsockopt context.
>
> Signed-off-by: YiFei Zhu <zhuyifei@google.com>
> Reviewed-by: Stanislav Fomichev <sdf@google.com>

Acked-by: Song Liu <songliubraving@fb.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value
  2021-10-06 16:02 ` [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value YiFei Zhu
  2021-10-07  0:41   ` Song Liu
@ 2021-10-20 23:28   ` Andrii Nakryiko
  2021-10-26  0:06     ` YiFei Zhu
  1 sibling, 1 reply; 19+ messages in thread
From: Andrii Nakryiko @ 2021-10-20 23:28 UTC (permalink / raw)
  To: YiFei Zhu
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev, YiFei Zhu

On Wed, Oct 6, 2021 at 9:04 AM YiFei Zhu <zhuyifei1999@gmail.com> wrote:
>
> From: YiFei Zhu <zhuyifei@google.com>
>
> When passed in a positive errno, it sets the errno and returns 0.
> When passed in 0, it gets the previously set errno. When passed in
> an out of bound number, it returns -EINVAL. This is unambiguous:
> negative return values are error in invoking the helper itself,
> and positive return values are errnos being exported. Errnos once
> set cannot be unset, but can be overridden.
>
> The errno value is stored inside bpf_cg_run_ctx for ease of access
> different prog types with different context structs layouts. The
> helper implementation can simply perform a container_of from
> current->bpf_ctx to retrieve bpf_cg_run_ctx.
>
> For backward compatibility, if a program rejects without calling
> the helper, and the errno has not been set by any prior progs, the
> BPF_PROG_RUN_ARRAY_CG family macros automatically set the errno to
> EPERM. If a prog sets an errno but returns 1 (allow), the outcome
> is considered implementation-defined. This patch treat it the same
> way as if 0 (reject) is returned.
>
> For BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY, the prior behavior is
> that, if the return value is NET_XMIT_DROP, the packet is silently
> dropped. We preserve this behavior for backward compatibility
> reasons, so even if an errno is set, the errno does not return to
> caller.
>
> For getsockopt hooks, they are different in that bpf progs runs
> after kernel processes the getsockopt syscall instead of before.
> There is also a retval in its context struct in which bpf progs
> can unset the retval, and can force an -EPERM by returning 0.
> We preseve the same semantics. Even though there is retval,
> that value can only be unset, while progs can set (and not unset)
> additional errno by using the helper, and that will override
> whatever is in retval.
>
> Signed-off-by: YiFei Zhu <zhuyifei@google.com>
> Reviewed-by: Stanislav Fomichev <sdf@google.com>
> ---
>  include/linux/bpf.h            | 23 +++++++++++------------
>  include/uapi/linux/bpf.h       | 14 ++++++++++++++
>  kernel/bpf/cgroup.c            | 24 ++++++++++++++++++++++++
>  tools/include/uapi/linux/bpf.h | 14 ++++++++++++++
>  4 files changed, 63 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 938885562d68..5e3f3d2f5871 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1155,6 +1155,7 @@ struct bpf_run_ctx {};
>  struct bpf_cg_run_ctx {
>         struct bpf_run_ctx run_ctx;
>         const struct bpf_prog_array_item *prog_item;
> +       int errno_val;
>  };
>
>  struct bpf_trace_run_ctx {
> @@ -1196,8 +1197,7 @@ BPF_PROG_RUN_ARRAY_CG_FLAGS(const struct bpf_prog_array __rcu *array_rcu,
>         const struct bpf_prog *prog;
>         const struct bpf_prog_array *array;
>         struct bpf_run_ctx *old_run_ctx;
> -       struct bpf_cg_run_ctx run_ctx;
> -       int ret = 0;
> +       struct bpf_cg_run_ctx run_ctx = {};

you are zero-initializing this struct unnecessarily here. It's a
microoptimization, but it would be a bit cheaper to just
run_ctx.errno_val = 0; before the loop.

>         u32 func_ret;
>
>         migrate_disable();
> @@ -1208,15 +1208,15 @@ BPF_PROG_RUN_ARRAY_CG_FLAGS(const struct bpf_prog_array __rcu *array_rcu,
>         while ((prog = READ_ONCE(item->prog))) {
>                 run_ctx.prog_item = item;
>                 func_ret = run_prog(prog, ctx);
> -               if (!(func_ret & 1))
> -                       ret = -EPERM;
> +               if (!(func_ret & 1) && !run_ctx.errno_val)
> +                       run_ctx.errno_val = EPERM;
>                 *(ret_flags) |= (func_ret >> 1);
>                 item++;
>         }
>         bpf_reset_run_ctx(old_run_ctx);
>         rcu_read_unlock();
>         migrate_enable();
> -       return ret;
> +       return -run_ctx.errno_val;
>  }
>
>  static __always_inline int
> @@ -1227,8 +1227,7 @@ BPF_PROG_RUN_ARRAY_CG(const struct bpf_prog_array __rcu *array_rcu,
>         const struct bpf_prog *prog;
>         const struct bpf_prog_array *array;
>         struct bpf_run_ctx *old_run_ctx;
> -       struct bpf_cg_run_ctx run_ctx;
> -       int ret = 0;
> +       struct bpf_cg_run_ctx run_ctx = {};
>
>         migrate_disable();
>         rcu_read_lock();
> @@ -1237,14 +1236,14 @@ BPF_PROG_RUN_ARRAY_CG(const struct bpf_prog_array __rcu *array_rcu,
>         old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
>         while ((prog = READ_ONCE(item->prog))) {
>                 run_ctx.prog_item = item;
> -               if (!run_prog(prog, ctx))
> -                       ret = -EPERM;
> +               if (!run_prog(prog, ctx) && !run_ctx.errno_val)
> +                       run_ctx.errno_val = EPERM;
>                 item++;
>         }
>         bpf_reset_run_ctx(old_run_ctx);
>         rcu_read_unlock();
>         migrate_enable();
> -       return ret;
> +       return -run_ctx.errno_val;
>  }
>
>  static __always_inline u32
> @@ -1297,7 +1296,7 @@ BPF_PROG_RUN_ARRAY(const struct bpf_prog_array __rcu *array_rcu,
>   *   0: NET_XMIT_SUCCESS  skb should be transmitted
>   *   1: NET_XMIT_DROP     skb should be dropped and cn
>   *   2: NET_XMIT_CN       skb should be transmitted and cn
> - *   3: -EPERM            skb should be dropped
> + *   3: -errno            skb should be dropped
>   */
>  #define BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY(array, ctx, func)                \
>         ({                                              \
> @@ -1309,7 +1308,7 @@ BPF_PROG_RUN_ARRAY(const struct bpf_prog_array __rcu *array_rcu,
>                 if (!_ret)                              \
>                         _ret = (_cn ? NET_XMIT_CN : NET_XMIT_SUCCESS);  \
>                 else                                    \
> -                       _ret = (_cn ? NET_XMIT_DROP : -EPERM);          \
> +                       _ret = (_cn ? NET_XMIT_DROP : _ret);            \
>                 _ret;                                   \
>         })
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 6fc59d61937a..d8126f8c0541 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -4909,6 +4909,19 @@ union bpf_attr {
>   *     Return
>   *             The number of bytes written to the buffer, or a negative error
>   *             in case of failure.
> + *
> + * int bpf_export_errno(int errno_val)

it's subjective, but "bpf_export_errno" name is quite confusing. What
are we "exporting" and where?

I actually like Song's proposal for two helpers,
bpf_set_err()/bpf_get_err(). It makes the semantics less confusing. I
honestly don't remember the requirement to have one combined helper
from the BPF office hour discussion, but if there was a good reason
for that, please remind us.

> + *     Description
> + *             If *errno_val* is positive, set the syscall's return error code;

This inversion of error code is also confusing. If we are to return
-EXXX, bpf_set_err(EXXX) is quite confusing.

> + *             if *errno_val* is zero, retrieve the previously set code.

Also, are there use cases where zero is the valid "error" (or lack of
it, rather). I.e., wouldn't there be cases where you want to clear a
previous error? We might have discussed this, sorry if I forgot.

But either way, if bpf_set_err() accepted <= 0 and used that as error
value as-is (> 0 should be rejected, probably) that would make for
straightforward logic. Then for getting the current error we can have
a well-paired bpf_get_err()?


BTW, "errno" is very strongly associated with user-space errno, do we
want to have this naming association (this is the reason I used "err"
terminology above).

> + *
> + *             This helper is currently supported by cgroup programs only.
> + *     Return
> + *             Zero if set is successful, or the previously set error code on
> + *             retrieval. Previously set code may be zero if it was never set.
> + *             On error, a negative value.
> + *
> + *             **-EINVAL** if *errno_val* not between zero and MAX_ERRNO inclusive.

[...]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value
  2021-10-20 23:28   ` Andrii Nakryiko
@ 2021-10-26  0:06     ` YiFei Zhu
  2021-10-26 15:44       ` Stanislav Fomichev
  0 siblings, 1 reply; 19+ messages in thread
From: YiFei Zhu @ 2021-10-26  0:06 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: YiFei Zhu, bpf, Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev

On Wed, Oct 20, 2021 at 4:28 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> it's subjective, but "bpf_export_errno" name is quite confusing. What
> are we "exporting" and where?
>
> I actually like Song's proposal for two helpers,
> bpf_set_err()/bpf_get_err(). It makes the semantics less confusing. I
> honestly don't remember the requirement to have one combined helper
> from the BPF office hour discussion, but if there was a good reason
> for that, please remind us.
>
> > + *     Description
> > + *             If *errno_val* is positive, set the syscall's return error code;
>
> This inversion of error code is also confusing. If we are to return
> -EXXX, bpf_set_err(EXXX) is quite confusing.
>
> > + *             if *errno_val* is zero, retrieve the previously set code.
>
> Also, are there use cases where zero is the valid "error" (or lack of
> it, rather). I.e., wouldn't there be cases where you want to clear a
> previous error? We might have discussed this, sorry if I forgot.

Hmm, originally I thought it's best to assume the underlying
assumption is that filters may set policies and it would violate it if
policies become ignored; however one could argue that debugging would
be a use case for an error-clearing filter.

Let's say we do bpf_set_err()/bpf_get_err(), with the ability to clear
errors. I'm having trouble thinking of the best way to have it
interact with the getsockopt "retval" in its context:
* Let's say the kernel initially sets an error code in the retval. I
think it would be a surprising behavior if only "retval" but not
bpf_get_err() shows the error. Therefore we'd need to initialize "err"
with the "retval" if retval is an error.
* If we initialize "err" with the "retval", then for a prog to clear
the error they'd need to clear it twice, once with bpf_set_err(0) with
and another with ctx->retval = 0. This will immediately break backward
compatibility. Therefore, we'd need to mirror the setting of
ctx->retval = 0 to bpf_set_err(0)
* In that case, what to do if a user uses ctx->retval as a way to pass
data between filters? I mean, whether ctx->retval is set to 0 or the
original is only checked after all filters are run. It could be any
value while the filters are running.
* A second issue, if we have first a legacy filter that returns 0 to
set EPERM, and then there's another filter that does a ctx->retval =
0. The original behavior would be that the syscall fails with EPERM,
but if we mirror ctx->retval = 0 to bpf_set_err(0), then that EPERM
would be cleared.

One of the reasons I liked "export" is that it's slightly clearer that
this value is strictly from the BPF's side and has nothing to do with
what the kernel sets (as in the getsockopt case). But yeah I agree
it's not an ideal name.

> But either way, if bpf_set_err() accepted <= 0 and used that as error
> value as-is (> 0 should be rejected, probably) that would make for
> straightforward logic. Then for getting the current error we can have
> a well-paired bpf_get_err()?
>
>
> BTW, "errno" is very strongly associated with user-space errno, do we
> want to have this naming association (this is the reason I used "err"
> terminology above).

Ack.

YiFei Zhu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value
  2021-10-26  0:06     ` YiFei Zhu
@ 2021-10-26 15:44       ` Stanislav Fomichev
  2021-10-26 20:50         ` YiFei Zhu
  0 siblings, 1 reply; 19+ messages in thread
From: Stanislav Fomichev @ 2021-10-26 15:44 UTC (permalink / raw)
  To: YiFei Zhu
  Cc: Andrii Nakryiko, YiFei Zhu, bpf, Alexei Starovoitov, Daniel Borkmann

On Mon, Oct 25, 2021 at 5:06 PM YiFei Zhu <zhuyifei@google.com> wrote:
>
> On Wed, Oct 20, 2021 at 4:28 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > it's subjective, but "bpf_export_errno" name is quite confusing. What
> > are we "exporting" and where?
> >
> > I actually like Song's proposal for two helpers,
> > bpf_set_err()/bpf_get_err(). It makes the semantics less confusing. I
> > honestly don't remember the requirement to have one combined helper
> > from the BPF office hour discussion, but if there was a good reason
> > for that, please remind us.
> >
> > > + *     Description
> > > + *             If *errno_val* is positive, set the syscall's return error code;
> >
> > This inversion of error code is also confusing. If we are to return
> > -EXXX, bpf_set_err(EXXX) is quite confusing.
> >
> > > + *             if *errno_val* is zero, retrieve the previously set code.
> >
> > Also, are there use cases where zero is the valid "error" (or lack of
> > it, rather). I.e., wouldn't there be cases where you want to clear a
> > previous error? We might have discussed this, sorry if I forgot.
>
> Hmm, originally I thought it's best to assume the underlying
> assumption is that filters may set policies and it would violate it if
> policies become ignored; however one could argue that debugging would
> be a use case for an error-clearing filter.
>
> Let's say we do bpf_set_err()/bpf_get_err(), with the ability to clear
> errors. I'm having trouble thinking of the best way to have it
> interact with the getsockopt "retval" in its context:
> * Let's say the kernel initially sets an error code in the retval. I
> think it would be a surprising behavior if only "retval" but not
> bpf_get_err() shows the error. Therefore we'd need to initialize "err"
> with the "retval" if retval is an error.
> * If we initialize "err" with the "retval", then for a prog to clear
> the error they'd need to clear it twice, once with bpf_set_err(0) with
> and another with ctx->retval = 0. This will immediately break backward
> compatibility. Therefore, we'd need to mirror the setting of
> ctx->retval = 0 to bpf_set_err(0)
> * In that case, what to do if a user uses ctx->retval as a way to pass
> data between filters? I mean, whether ctx->retval is set to 0 or the
> original is only checked after all filters are run. It could be any
> value while the filters are running.
> * A second issue, if we have first a legacy filter that returns 0 to
> set EPERM, and then there's another filter that does a ctx->retval =
> 0. The original behavior would be that the syscall fails with EPERM,
> but if we mirror ctx->retval = 0 to bpf_set_err(0), then that EPERM
> would be cleared.
>
> One of the reasons I liked "export" is that it's slightly clearer that
> this value is strictly from the BPF's side and has nothing to do with
> what the kernel sets (as in the getsockopt case). But yeah I agree
> it's not an ideal name.

For getsockopt, maybe the best way to go is to point ctx->retval to
run_ctx.errno_val? (i.e., bpf_set_err would be equivalent to doing
ctx->retval = x;). We can leave ctx->retval as a backwards-compatible
legacy way of doing things. For new programs, bpf_set_err would work
universally, regardless of attach type. Any cons here?

> > But either way, if bpf_set_err() accepted <= 0 and used that as error
> > value as-is (> 0 should be rejected, probably) that would make for
> > straightforward logic. Then for getting the current error we can have
> > a well-paired bpf_get_err()?
> >
> >
> > BTW, "errno" is very strongly associated with user-space errno, do we
> > want to have this naming association (this is the reason I used "err"
> > terminology above).
>
> Ack.
>
> YiFei Zhu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value
  2021-10-26 15:44       ` Stanislav Fomichev
@ 2021-10-26 20:50         ` YiFei Zhu
  2021-10-26 21:26           ` Stanislav Fomichev
  0 siblings, 1 reply; 19+ messages in thread
From: YiFei Zhu @ 2021-10-26 20:50 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Andrii Nakryiko, YiFei Zhu, bpf, Alexei Starovoitov, Daniel Borkmann

On Tue, Oct 26, 2021 at 8:44 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Mon, Oct 25, 2021 at 5:06 PM YiFei Zhu <zhuyifei@google.com> wrote:
> >
> > On Wed, Oct 20, 2021 at 4:28 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > it's subjective, but "bpf_export_errno" name is quite confusing. What
> > > are we "exporting" and where?
> > >
> > > I actually like Song's proposal for two helpers,
> > > bpf_set_err()/bpf_get_err(). It makes the semantics less confusing. I
> > > honestly don't remember the requirement to have one combined helper
> > > from the BPF office hour discussion, but if there was a good reason
> > > for that, please remind us.
> > >
> > > > + *     Description
> > > > + *             If *errno_val* is positive, set the syscall's return error code;
> > >
> > > This inversion of error code is also confusing. If we are to return
> > > -EXXX, bpf_set_err(EXXX) is quite confusing.
> > >
> > > > + *             if *errno_val* is zero, retrieve the previously set code.
> > >
> > > Also, are there use cases where zero is the valid "error" (or lack of
> > > it, rather). I.e., wouldn't there be cases where you want to clear a
> > > previous error? We might have discussed this, sorry if I forgot.
> >
> > Hmm, originally I thought it's best to assume the underlying
> > assumption is that filters may set policies and it would violate it if
> > policies become ignored; however one could argue that debugging would
> > be a use case for an error-clearing filter.
> >
> > Let's say we do bpf_set_err()/bpf_get_err(), with the ability to clear
> > errors. I'm having trouble thinking of the best way to have it
> > interact with the getsockopt "retval" in its context:
> > * Let's say the kernel initially sets an error code in the retval. I
> > think it would be a surprising behavior if only "retval" but not
> > bpf_get_err() shows the error. Therefore we'd need to initialize "err"
> > with the "retval" if retval is an error.
> > * If we initialize "err" with the "retval", then for a prog to clear
> > the error they'd need to clear it twice, once with bpf_set_err(0) with
> > and another with ctx->retval = 0. This will immediately break backward
> > compatibility. Therefore, we'd need to mirror the setting of
> > ctx->retval = 0 to bpf_set_err(0)
> > * In that case, what to do if a user uses ctx->retval as a way to pass
> > data between filters? I mean, whether ctx->retval is set to 0 or the
> > original is only checked after all filters are run. It could be any
> > value while the filters are running.
> > * A second issue, if we have first a legacy filter that returns 0 to
> > set EPERM, and then there's another filter that does a ctx->retval =
> > 0. The original behavior would be that the syscall fails with EPERM,
> > but if we mirror ctx->retval = 0 to bpf_set_err(0), then that EPERM
> > would be cleared.
> >
> > One of the reasons I liked "export" is that it's slightly clearer that
> > this value is strictly from the BPF's side and has nothing to do with
> > what the kernel sets (as in the getsockopt case). But yeah I agree
> > it's not an ideal name.
>
> For getsockopt, maybe the best way to go is to point ctx->retval to
> run_ctx.errno_val? (i.e., bpf_set_err would be equivalent to doing
> ctx->retval = x;). We can leave ctx->retval as a backwards-compatible
> legacy way of doing things. For new programs, bpf_set_err would work
> universally, regardless of attach type. Any cons here?

Is it a concern that AFAICT getsockopt retval may be a positive number
whereas the err here must be non-negative?

Also the fourth point still stands. If any getsockopt returns 0,
original behavior is return -EPERM whereas new behavior, clearing
retval will clear -EPERM.

YiFei Zhu

> > > But either way, if bpf_set_err() accepted <= 0 and used that as error
> > > value as-is (> 0 should be rejected, probably) that would make for
> > > straightforward logic. Then for getting the current error we can have
> > > a well-paired bpf_get_err()?
> > >
> > >
> > > BTW, "errno" is very strongly associated with user-space errno, do we
> > > want to have this naming association (this is the reason I used "err"
> > > terminology above).
> >
> > Ack.
> >
> > YiFei Zhu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value
  2021-10-26 20:50         ` YiFei Zhu
@ 2021-10-26 21:26           ` Stanislav Fomichev
  2021-11-01 10:23             ` YiFei Zhu
  0 siblings, 1 reply; 19+ messages in thread
From: Stanislav Fomichev @ 2021-10-26 21:26 UTC (permalink / raw)
  To: YiFei Zhu
  Cc: Andrii Nakryiko, YiFei Zhu, bpf, Alexei Starovoitov, Daniel Borkmann

On Tue, Oct 26, 2021 at 1:50 PM YiFei Zhu <zhuyifei@google.com> wrote:
>
> On Tue, Oct 26, 2021 at 8:44 AM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > On Mon, Oct 25, 2021 at 5:06 PM YiFei Zhu <zhuyifei@google.com> wrote:
> > >
> > > On Wed, Oct 20, 2021 at 4:28 PM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > it's subjective, but "bpf_export_errno" name is quite confusing. What
> > > > are we "exporting" and where?
> > > >
> > > > I actually like Song's proposal for two helpers,
> > > > bpf_set_err()/bpf_get_err(). It makes the semantics less confusing. I
> > > > honestly don't remember the requirement to have one combined helper
> > > > from the BPF office hour discussion, but if there was a good reason
> > > > for that, please remind us.
> > > >
> > > > > + *     Description
> > > > > + *             If *errno_val* is positive, set the syscall's return error code;
> > > >
> > > > This inversion of error code is also confusing. If we are to return
> > > > -EXXX, bpf_set_err(EXXX) is quite confusing.
> > > >
> > > > > + *             if *errno_val* is zero, retrieve the previously set code.
> > > >
> > > > Also, are there use cases where zero is the valid "error" (or lack of
> > > > it, rather). I.e., wouldn't there be cases where you want to clear a
> > > > previous error? We might have discussed this, sorry if I forgot.
> > >
> > > Hmm, originally I thought it's best to assume the underlying
> > > assumption is that filters may set policies and it would violate it if
> > > policies become ignored; however one could argue that debugging would
> > > be a use case for an error-clearing filter.
> > >
> > > Let's say we do bpf_set_err()/bpf_get_err(), with the ability to clear
> > > errors. I'm having trouble thinking of the best way to have it
> > > interact with the getsockopt "retval" in its context:
> > > * Let's say the kernel initially sets an error code in the retval. I
> > > think it would be a surprising behavior if only "retval" but not
> > > bpf_get_err() shows the error. Therefore we'd need to initialize "err"
> > > with the "retval" if retval is an error.
> > > * If we initialize "err" with the "retval", then for a prog to clear
> > > the error they'd need to clear it twice, once with bpf_set_err(0) with
> > > and another with ctx->retval = 0. This will immediately break backward
> > > compatibility. Therefore, we'd need to mirror the setting of
> > > ctx->retval = 0 to bpf_set_err(0)
> > > * In that case, what to do if a user uses ctx->retval as a way to pass
> > > data between filters? I mean, whether ctx->retval is set to 0 or the
> > > original is only checked after all filters are run. It could be any
> > > value while the filters are running.
> > > * A second issue, if we have first a legacy filter that returns 0 to
> > > set EPERM, and then there's another filter that does a ctx->retval =
> > > 0. The original behavior would be that the syscall fails with EPERM,
> > > but if we mirror ctx->retval = 0 to bpf_set_err(0), then that EPERM
> > > would be cleared.
> > >
> > > One of the reasons I liked "export" is that it's slightly clearer that
> > > this value is strictly from the BPF's side and has nothing to do with
> > > what the kernel sets (as in the getsockopt case). But yeah I agree
> > > it's not an ideal name.
> >
> > For getsockopt, maybe the best way to go is to point ctx->retval to
> > run_ctx.errno_val? (i.e., bpf_set_err would be equivalent to doing
> > ctx->retval = x;). We can leave ctx->retval as a backwards-compatible
> > legacy way of doing things. For new programs, bpf_set_err would work
> > universally, regardless of attach type. Any cons here?
>
> Is it a concern that AFAICT getsockopt retval may be a positive number
> whereas the err here must be non-negative?

getsockopt retval is either -errno or 0. It's not really enforced at
load/attach time, but there is a runtime check which returns -EFAULT
if the prog sets it to something else.

> Also the fourth point still stands. If any getsockopt returns 0,
> original behavior is return -EPERM whereas new behavior, clearing
> retval will clear -EPERM.

True, but do you think these cases exist out there? I guess somebody
can do it inadvertently, but the example you've mentioned doesn't
really make sense, right?
This is why we are adding a way to propagate the status, so the
programs in the chain can understand whether they should do anything
at all (previous prog returned EPERM). Returning EPERM from the child
and then doing ctx->retval=0 in the parent should already not work as
expected.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value
  2021-10-26 21:26           ` Stanislav Fomichev
@ 2021-11-01 10:23             ` YiFei Zhu
  0 siblings, 0 replies; 19+ messages in thread
From: YiFei Zhu @ 2021-11-01 10:23 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Andrii Nakryiko, YiFei Zhu, bpf, Alexei Starovoitov, Daniel Borkmann

On Tue, Oct 26, 2021 at 2:26 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Tue, Oct 26, 2021 at 1:50 PM YiFei Zhu <zhuyifei@google.com> wrote:
> >
> > On Tue, Oct 26, 2021 at 8:44 AM Stanislav Fomichev <sdf@google.com> wrote:
> > >
> > > On Mon, Oct 25, 2021 at 5:06 PM YiFei Zhu <zhuyifei@google.com> wrote:
> > > >
> > > > On Wed, Oct 20, 2021 at 4:28 PM Andrii Nakryiko
> > > > <andrii.nakryiko@gmail.com> wrote:
> > > > >
> > > > > it's subjective, but "bpf_export_errno" name is quite confusing. What
> > > > > are we "exporting" and where?
> > > > >
> > > > > I actually like Song's proposal for two helpers,
> > > > > bpf_set_err()/bpf_get_err(). It makes the semantics less confusing. I
> > > > > honestly don't remember the requirement to have one combined helper
> > > > > from the BPF office hour discussion, but if there was a good reason
> > > > > for that, please remind us.
> > > > >
> > > > > > + *     Description
> > > > > > + *             If *errno_val* is positive, set the syscall's return error code;
> > > > >
> > > > > This inversion of error code is also confusing. If we are to return
> > > > > -EXXX, bpf_set_err(EXXX) is quite confusing.
> > > > >
> > > > > > + *             if *errno_val* is zero, retrieve the previously set code.
> > > > >
> > > > > Also, are there use cases where zero is the valid "error" (or lack of
> > > > > it, rather). I.e., wouldn't there be cases where you want to clear a
> > > > > previous error? We might have discussed this, sorry if I forgot.
> > > >
> > > > Hmm, originally I thought it's best to assume the underlying
> > > > assumption is that filters may set policies and it would violate it if
> > > > policies become ignored; however one could argue that debugging would
> > > > be a use case for an error-clearing filter.
> > > >
> > > > Let's say we do bpf_set_err()/bpf_get_err(), with the ability to clear
> > > > errors. I'm having trouble thinking of the best way to have it
> > > > interact with the getsockopt "retval" in its context:
> > > > * Let's say the kernel initially sets an error code in the retval. I
> > > > think it would be a surprising behavior if only "retval" but not
> > > > bpf_get_err() shows the error. Therefore we'd need to initialize "err"
> > > > with the "retval" if retval is an error.
> > > > * If we initialize "err" with the "retval", then for a prog to clear
> > > > the error they'd need to clear it twice, once with bpf_set_err(0) with
> > > > and another with ctx->retval = 0. This will immediately break backward
> > > > compatibility. Therefore, we'd need to mirror the setting of
> > > > ctx->retval = 0 to bpf_set_err(0)
> > > > * In that case, what to do if a user uses ctx->retval as a way to pass
> > > > data between filters? I mean, whether ctx->retval is set to 0 or the
> > > > original is only checked after all filters are run. It could be any
> > > > value while the filters are running.
> > > > * A second issue, if we have first a legacy filter that returns 0 to
> > > > set EPERM, and then there's another filter that does a ctx->retval =
> > > > 0. The original behavior would be that the syscall fails with EPERM,
> > > > but if we mirror ctx->retval = 0 to bpf_set_err(0), then that EPERM
> > > > would be cleared.
> > > >
> > > > One of the reasons I liked "export" is that it's slightly clearer that
> > > > this value is strictly from the BPF's side and has nothing to do with
> > > > what the kernel sets (as in the getsockopt case). But yeah I agree
> > > > it's not an ideal name.
> > >
> > > For getsockopt, maybe the best way to go is to point ctx->retval to
> > > run_ctx.errno_val? (i.e., bpf_set_err would be equivalent to doing
> > > ctx->retval = x;). We can leave ctx->retval as a backwards-compatible
> > > legacy way of doing things. For new programs, bpf_set_err would work
> > > universally, regardless of attach type. Any cons here?
> >
> > Is it a concern that AFAICT getsockopt retval may be a positive number
> > whereas the err here must be non-negative?
>
> getsockopt retval is either -errno or 0. It's not really enforced at
> load/attach time, but there is a runtime check which returns -EFAULT
> if the prog sets it to something else.
>
> > Also the fourth point still stands. If any getsockopt returns 0,
> > original behavior is return -EPERM whereas new behavior, clearing
> > retval will clear -EPERM.
>
> True, but do you think these cases exist out there? I guess somebody
> can do it inadvertently, but the example you've mentioned doesn't
> really make sense, right?
> This is why we are adding a way to propagate the status, so the
> programs in the chain can understand whether they should do anything
> at all (previous prog returned EPERM). Returning EPERM from the child
> and then doing ctx->retval=0 in the parent should already not work as
> expected.

How about this? Have a bpf_{get,set}_retval that mirrors (in both
directions) the ctx->retval without any processing. Considering
in-kernel implementations of getsockopt sometimes return positive
values (usually optlen), we could allow eBPF-implemented getsockopt to
do so too, by relaxing the current 'only change to zero or keep the
same restriction, and allow it the filter to set arbitrary return
values to user space. For a filter that runs before the in-kernel
implementation, such as setsockopt or cgroup_skb, we verify after
running all the hooks, that it must be 0 or a negative number in
-errno; -EFAULT otherwise.

For legacy -EPERM programs that do it by returning 0, a filter that
bpf_set_retval(0) or ctx->retval = 0 will clear the -EPERM, this will
be different from the current behavior of getsockopt programs. I'm not
really sure of any use cases where users would rely on the current
behavior -- one would do ctx->retval = 0 to tell userspace that
something is done, yet another filter denies that 'something'? Doesn't
make sense to me, but correct me if I'm wrong or if we think this UAPI
must be kept exactly the same.

Another potential UAPI breakage is that originally getsockopt hooks
can inject an -EFAULT instead of -EPERM by setting bogus values to
ctx->retval. Now they have to do it by setting ctx->retval = -EFAULT;
any other value, even bogus values will be passed to userspace. That
said, I'm not sure why anyone would want to return an -EFAULT instead
of -EPERM; some unusual fault injection maybe? And in that case if
they literally want -EFAULT, the statement that makes sense would
already be ctx->retval = -EFAULT, which is usually a bogus value.

Considering that here we would have bpf_{get,set}_retval with no
in-kernel processing at all to mirror a value in ctx... I think it
would make a lot of sense to just use a context variable instead of
helpers (i.e. provide ctx->retval to all cgroup program types)?

YiFei Zhu

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2021-11-01 10:23 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-06 16:02 [PATCH bpf-next 0/3] bpf: allow cgroup progs to export custom errnos to userspace YiFei Zhu
2021-10-06 16:02 ` [PATCH bpf-next 1/3] bpf: Make BPF_PROG_RUN_ARRAY return -errno instead of allow boolean YiFei Zhu
2021-10-07  0:36   ` Song Liu
2021-10-06 16:02 ` [PATCH bpf-next 2/3] bpf: Add cgroup helper bpf_export_errno to get/set exported errno value YiFei Zhu
2021-10-07  0:41   ` Song Liu
2021-10-07  5:59     ` Song Liu
2021-10-07 15:11       ` sdf
2021-10-07 16:23         ` YiFei Zhu
2021-10-07 16:34           ` Song Liu
2021-10-08 20:49             ` YiFei Zhu
2021-10-08 21:00               ` Stanislav Fomichev
2021-10-20 23:28   ` Andrii Nakryiko
2021-10-26  0:06     ` YiFei Zhu
2021-10-26 15:44       ` Stanislav Fomichev
2021-10-26 20:50         ` YiFei Zhu
2021-10-26 21:26           ` Stanislav Fomichev
2021-11-01 10:23             ` YiFei Zhu
2021-10-06 16:02 ` [PATCH bpf-next 3/3] selftests/bpf: Test bpf_export_errno behavior with cgroup/sockopt YiFei Zhu
2021-10-18 17:51   ` Song Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).