All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 bpf-next 0/2] libbpf: auto-bumpd RLIMIT_MEMLOCK on old kernels
@ 2021-12-10 20:13 Andrii Nakryiko
  2021-12-10 20:13 ` [PATCH v2 bpf-next 1/2] libbpf: auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF Andrii Nakryiko
  2021-12-10 20:13 ` [PATCH v2 bpf-next 2/2] selftests/bpf: remove explicit setrlimit(RLIMIT_MEMLOCK) in main selftests Andrii Nakryiko
  0 siblings, 2 replies; 6+ messages in thread
From: Andrii Nakryiko @ 2021-12-10 20:13 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team

Make libbpf bump RLIMIT_MEMLOCK, similarly to BCC, if kernel is old enough to
use memcg-based memory accounting for BPF. Patch #2 drops explicit
setrlimi(RLIMIT_MEMLOCK) calls in test_progs, test_maps, and test_verifier.

v1->v2:
  - fix up out-of-sync comments (Toke).

Andrii Nakryiko (2):
  libbpf: auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF
  selftests/bpf: remove explicit setrlimit(RLIMIT_MEMLOCK) in main
    selftests

 tools/lib/bpf/bpf.c                           | 121 ++++++++++++++++++
 tools/lib/bpf/bpf.h                           |   2 +
 tools/lib/bpf/libbpf.c                        |  47 ++-----
 tools/lib/bpf/libbpf.map                      |   1 +
 tools/lib/bpf/libbpf_internal.h               |  39 ++++++
 tools/lib/bpf/libbpf_legacy.h                 |  12 +-
 tools/testing/selftests/bpf/bench.c           |  16 ---
 tools/testing/selftests/bpf/prog_tests/btf.c  |   1 -
 .../bpf/prog_tests/select_reuseport.c         |   1 -
 .../selftests/bpf/prog_tests/sk_lookup.c      |   1 -
 .../selftests/bpf/prog_tests/sock_fields.c    |   1 -
 tools/testing/selftests/bpf/test_maps.c       |   1 -
 tools/testing/selftests/bpf/test_progs.c      |   2 -
 tools/testing/selftests/bpf/test_verifier.c   |   4 +-
 14 files changed, 186 insertions(+), 63 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 bpf-next 1/2] libbpf: auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF
  2021-12-10 20:13 [PATCH v2 bpf-next 0/2] libbpf: auto-bumpd RLIMIT_MEMLOCK on old kernels Andrii Nakryiko
@ 2021-12-10 20:13 ` Andrii Nakryiko
  2021-12-11 19:36   ` Toke Høiland-Jørgensen
  2021-12-10 20:13 ` [PATCH v2 bpf-next 2/2] selftests/bpf: remove explicit setrlimit(RLIMIT_MEMLOCK) in main selftests Andrii Nakryiko
  1 sibling, 1 reply; 6+ messages in thread
From: Andrii Nakryiko @ 2021-12-10 20:13 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team

The need to increase RLIMIT_MEMLOCK to do anything useful with BPF is
one of the first extremely frustrating gotchas that all new BPF users go
through and in some cases have to learn it a very hard way.

Luckily, starting with upstream Linux kernel version 5.11, BPF subsystem
dropped the dependency on memlock and uses memcg-based memory accounting
instead. Unfortunately, detecting memcg-based BPF memory accounting is
far from trivial (as can be evidenced by this patch), so in practice
most BPF applications still do unconditional RLIMIT_MEMLOCK increase.

As we move towards libbpf 1.0, it would be good to allow users to forget
about RLIMIT_MEMLOCK vs memcg and let libbpf do the sensible adjustment
automatically. This patch paves the way forward in this matter. Libbpf
will do feature detection of memcg-based accounting, and if detected,
will do nothing. But if the kernel is too old, just like BCC, libbpf
will automatically increase RLIMIT_MEMLOCK on behalf of user
application ([0]).

As this is technically a breaking change, during the transition period
applications have to opt into libbpf 1.0 mode by setting
LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK bit when calling
libbpf_set_strict_mode().

Libbpf allows to control the exact amount of set RLIMIT_MEMLOCK limit
with libbpf_set_memlock_rlim_max() API. Passing 0 will make libbpf do
nothing with RLIMIT_MEMLOCK. libbpf_set_memlock_rlim_max() has to be
called before the first bpf_prog_load(), bpf_btf_load(), or
bpf_object__load() call, otherwise it has no effect and will return
-EBUSY.

  [0] Closes: https://github.com/libbpf/libbpf/issues/369

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/bpf.c             | 121 ++++++++++++++++++++++++++++++++
 tools/lib/bpf/bpf.h             |   2 +
 tools/lib/bpf/libbpf.c          |  47 +++----------
 tools/lib/bpf/libbpf.map        |   1 +
 tools/lib/bpf/libbpf_internal.h |  39 ++++++++++
 tools/lib/bpf/libbpf_legacy.h   |  12 +++-
 6 files changed, 183 insertions(+), 39 deletions(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 6b2407e12060..7c82136979bf 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -29,6 +29,7 @@
 #include <errno.h>
 #include <linux/bpf.h>
 #include <limits.h>
+#include <sys/resource.h>
 #include "bpf.h"
 #include "libbpf.h"
 #include "libbpf_internal.h"
@@ -94,6 +95,118 @@ static inline int sys_bpf_prog_load(union bpf_attr *attr, unsigned int size, int
 	return fd;
 }
 
+/* Probe whether kernel switched from memlock-based (RLIMIT_MEMLOCK) to
+ * memcg-based memory accounting for BPF maps and progs. This was done in [0],
+ * but it's not straightforward to detect those changes from the user-space.
+ * So instead we'll try to detect whether [1] is in the kernel, which follows
+ * [0] almost immediately and made it into the upstream kernel in the same
+ * release.
+ *
+ * For this, we'll upload a trivial BTF into the kernel and will try to load
+ * a trivial BPF program with attach_btf_obj_fd pointing to our BTF. If it
+ * returns anything but -ENOTSUPP, we'll assume we still need RLIMIT_MEMLOCK.
+ *
+ *   [0] https://lore.kernel.org/bpf/20201201215900.3569844-1-guro@fb.com/
+ *   [1] Fixes: 8bdd8e275ede ("bpf: Return -ENOTSUPP when attaching to non-kernel BTF")
+ */
+int probe_memcg_account(void)
+{
+	const size_t bpf_load_attr_sz = offsetofend(union bpf_attr, attach_btf_obj_fd);
+	const size_t btf_load_attr_sz = offsetofend(union bpf_attr, btf_log_level);
+	int prog_fd = -1, btf_fd = -1, err;
+	struct bpf_insn insns[1] = {}; /* instructions don't matter */
+	const void *btf_data;
+	union bpf_attr attr;
+	__u32 btf_data_sz;
+	struct btf *btf;
+
+	/* create the simplest BTF object and upload it into the kernel */
+	btf = btf__new_empty();
+	err = libbpf_get_error(btf);
+	if (err)
+		return err;
+	err = btf__add_int(btf, "int", 4, 0);
+	btf_data = btf__raw_data(btf, &btf_data_sz);
+	if (!btf_data) {
+		err = -ENOMEM;
+		goto cleanup;
+	}
+
+	/* we won't use bpf_btf_load() or bpf_prog_load() because they will
+	 * be trying to detect this feature and will cause an infinite loop
+	 */
+	memset(&attr, 0, btf_load_attr_sz);
+	attr.btf = ptr_to_u64(btf_data);
+	attr.btf_size = btf_data_sz;
+	btf_fd = sys_bpf_fd(BPF_BTF_LOAD, &attr, btf_load_attr_sz);
+	if (btf_fd < 0) {
+		err = -errno;
+		goto cleanup;
+	}
+
+	/* attempt loading freplace trying to use custom BTF */
+	memset(&attr, 0, bpf_load_attr_sz);
+	attr.prog_type = BPF_PROG_TYPE_TRACING;
+	attr.expected_attach_type = BPF_TRACE_FENTRY;
+	attr.insns = ptr_to_u64(insns);
+	attr.insn_cnt = sizeof(insns) / sizeof(insns[0]);
+	attr.license = ptr_to_u64("GPL");
+	attr.attach_btf_obj_fd = btf_fd;
+
+	prog_fd = sys_bpf_fd(BPF_PROG_LOAD, &attr, bpf_load_attr_sz);
+	if (prog_fd >= 0)
+		/* bug? we shouldn't be able to successfully load program */
+		err = -EFAULT;
+	else
+		/* assume memcg accounting is supported if we get -ENOTSUPP */
+		err = errno == 524 /* ENOTSUPP */ ? 1 : 0;
+
+cleanup:
+	btf__free(btf);
+	if (btf_fd >= 0)
+		close(btf_fd);
+	if (prog_fd >= 0)
+		close(prog_fd);
+	return err;
+}
+
+static bool memlock_bumped;
+static rlim_t memlock_rlim_max = RLIM_INFINITY;
+
+int libbpf_set_memlock_rlim_max(size_t memlock_max)
+{
+	if (memlock_bumped)
+		return libbpf_err(-EBUSY);
+
+	memlock_rlim_max = memlock_max;
+	return 0;
+}
+
+int bump_rlimit_memlock(void)
+{
+	struct rlimit rlim;
+
+	/* this the default in libbpf 1.0, but for now user has to opt-in explicitly */
+	if (!(libbpf_mode & LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK))
+		return 0;
+
+	/* if kernel supports memcg-based accounting, skip bumping RLIMIT_MEMLOCK */
+	if (memlock_bumped || kernel_supports(NULL, FEAT_MEMCG_ACCOUNT))
+		return 0;
+
+	memlock_bumped = true;
+
+	/* zero memlock_rlim_max disables auto-bumping RLIMIT_MEMLOCK */
+	if (memlock_rlim_max == 0)
+		return 0;
+
+	rlim.rlim_cur = rlim.rlim_max = memlock_rlim_max;
+	if (setrlimit(RLIMIT_MEMLOCK, &rlim))
+		return -errno;
+
+	return 0;
+}
+
 int bpf_map_create(enum bpf_map_type map_type,
 		   const char *map_name,
 		   __u32 key_size,
@@ -105,6 +218,8 @@ int bpf_map_create(enum bpf_map_type map_type,
 	union bpf_attr attr;
 	int fd;
 
+	bump_rlimit_memlock();
+
 	memset(&attr, 0, attr_sz);
 
 	if (!OPTS_VALID(opts, bpf_map_create_opts))
@@ -251,6 +366,8 @@ int bpf_prog_load_v0_6_0(enum bpf_prog_type prog_type,
 	union bpf_attr attr;
 	char *log_buf;
 
+	bump_rlimit_memlock();
+
 	if (!OPTS_VALID(opts, bpf_prog_load_opts))
 		return libbpf_err(-EINVAL);
 
@@ -456,6 +573,8 @@ int bpf_verify_program(enum bpf_prog_type type, const struct bpf_insn *insns,
 	union bpf_attr attr;
 	int fd;
 
+	bump_rlimit_memlock();
+
 	memset(&attr, 0, sizeof(attr));
 	attr.prog_type = type;
 	attr.insn_cnt = (__u32)insns_cnt;
@@ -1056,6 +1175,8 @@ int bpf_btf_load(const void *btf_data, size_t btf_size, const struct bpf_btf_loa
 	__u32 log_level;
 	int fd;
 
+	bump_rlimit_memlock();
+
 	memset(&attr, 0, attr_sz);
 
 	if (!OPTS_VALID(opts, bpf_btf_load_opts))
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 94e553a0ff9d..752c076e9bb8 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -35,6 +35,8 @@
 extern "C" {
 #endif
 
+int libbpf_set_memlock_rlim_max(size_t memlock_max);
+
 struct bpf_map_create_opts {
 	size_t sz; /* size of this struct for forward/backward compatibility */
 
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index d027e1d620fc..4e842ce14080 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -187,42 +187,6 @@ const char *libbpf_version_string(void)
 #undef __S
 }
 
-enum kern_feature_id {
-	/* v4.14: kernel support for program & map names. */
-	FEAT_PROG_NAME,
-	/* v5.2: kernel support for global data sections. */
-	FEAT_GLOBAL_DATA,
-	/* BTF support */
-	FEAT_BTF,
-	/* BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO support */
-	FEAT_BTF_FUNC,
-	/* BTF_KIND_VAR and BTF_KIND_DATASEC support */
-	FEAT_BTF_DATASEC,
-	/* BTF_FUNC_GLOBAL is supported */
-	FEAT_BTF_GLOBAL_FUNC,
-	/* BPF_F_MMAPABLE is supported for arrays */
-	FEAT_ARRAY_MMAP,
-	/* kernel support for expected_attach_type in BPF_PROG_LOAD */
-	FEAT_EXP_ATTACH_TYPE,
-	/* bpf_probe_read_{kernel,user}[_str] helpers */
-	FEAT_PROBE_READ_KERN,
-	/* BPF_PROG_BIND_MAP is supported */
-	FEAT_PROG_BIND_MAP,
-	/* Kernel support for module BTFs */
-	FEAT_MODULE_BTF,
-	/* BTF_KIND_FLOAT support */
-	FEAT_BTF_FLOAT,
-	/* BPF perf link support */
-	FEAT_PERF_LINK,
-	/* BTF_KIND_DECL_TAG support */
-	FEAT_BTF_DECL_TAG,
-	/* BTF_KIND_TYPE_TAG support */
-	FEAT_BTF_TYPE_TAG,
-	__FEAT_CNT,
-};
-
-static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id);
-
 enum reloc_type {
 	RELO_LD64,
 	RELO_CALL,
@@ -4354,6 +4318,10 @@ bpf_object__probe_loading(struct bpf_object *obj)
 	if (obj->gen_loader)
 		return 0;
 
+	ret = bump_rlimit_memlock();
+	if (ret)
+		pr_warn("Failed to bump RLIMIT_MEMLOCK (err = %d), you might need to do it explicitly!\n", ret);
+
 	/* make sure basic loading works */
 	ret = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, NULL);
 	if (ret < 0)
@@ -4720,14 +4688,17 @@ static struct kern_feature_desc {
 	[FEAT_BTF_TYPE_TAG] = {
 		"BTF_KIND_TYPE_TAG support", probe_kern_btf_type_tag,
 	},
+	[FEAT_MEMCG_ACCOUNT] = {
+		"memcg-based memory accounting", probe_memcg_account,
+	},
 };
 
-static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id)
+bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id)
 {
 	struct kern_feature_desc *feat = &feature_probes[feat_id];
 	int ret;
 
-	if (obj->gen_loader)
+	if (obj && obj->gen_loader)
 		/* To generate loader program assume the latest kernel
 		 * to avoid doing extra prog_load, map_create syscalls.
 		 */
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 4d483af7dba6..b3938b3f8fc9 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -427,4 +427,5 @@ LIBBPF_0.7.0 {
 		bpf_program__log_level;
 		bpf_program__set_log_buf;
 		bpf_program__set_log_level;
+		libbpf_set_memlock_rlim_max;
 };
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 355c41019aed..0a6754606234 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -272,6 +272,45 @@ static inline bool libbpf_validate_opts(const char *opts,
 					(opts)->sz - __off);		      \
 })
 
+enum kern_feature_id {
+	/* v4.14: kernel support for program & map names. */
+	FEAT_PROG_NAME,
+	/* v5.2: kernel support for global data sections. */
+	FEAT_GLOBAL_DATA,
+	/* BTF support */
+	FEAT_BTF,
+	/* BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO support */
+	FEAT_BTF_FUNC,
+	/* BTF_KIND_VAR and BTF_KIND_DATASEC support */
+	FEAT_BTF_DATASEC,
+	/* BTF_FUNC_GLOBAL is supported */
+	FEAT_BTF_GLOBAL_FUNC,
+	/* BPF_F_MMAPABLE is supported for arrays */
+	FEAT_ARRAY_MMAP,
+	/* kernel support for expected_attach_type in BPF_PROG_LOAD */
+	FEAT_EXP_ATTACH_TYPE,
+	/* bpf_probe_read_{kernel,user}[_str] helpers */
+	FEAT_PROBE_READ_KERN,
+	/* BPF_PROG_BIND_MAP is supported */
+	FEAT_PROG_BIND_MAP,
+	/* Kernel support for module BTFs */
+	FEAT_MODULE_BTF,
+	/* BTF_KIND_FLOAT support */
+	FEAT_BTF_FLOAT,
+	/* BPF perf link support */
+	FEAT_PERF_LINK,
+	/* BTF_KIND_DECL_TAG support */
+	FEAT_BTF_DECL_TAG,
+	/* BTF_KIND_TYPE_TAG support */
+	FEAT_BTF_TYPE_TAG,
+	/* memcg-based accounting for BPF maps and progs */
+	FEAT_MEMCG_ACCOUNT,
+	__FEAT_CNT,
+};
+
+int probe_memcg_account(void);
+bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id);
+int bump_rlimit_memlock(void);
 
 int parse_cpu_mask_str(const char *s, bool **mask, int *mask_sz);
 int parse_cpu_mask_file(const char *fcpu, bool **mask, int *mask_sz);
diff --git a/tools/lib/bpf/libbpf_legacy.h b/tools/lib/bpf/libbpf_legacy.h
index bb03c568af7b..79131f761a27 100644
--- a/tools/lib/bpf/libbpf_legacy.h
+++ b/tools/lib/bpf/libbpf_legacy.h
@@ -45,7 +45,6 @@ enum libbpf_strict_mode {
 	 * (positive) error code.
 	 */
 	LIBBPF_STRICT_DIRECT_ERRS = 0x02,
-
 	/*
 	 * Enforce strict BPF program section (SEC()) names.
 	 * E.g., while prefiously SEC("xdp_whatever") or SEC("perf_event_blah") were
@@ -63,6 +62,17 @@ enum libbpf_strict_mode {
 	 * Clients can maintain it on their own if it is valuable for them.
 	 */
 	LIBBPF_STRICT_NO_OBJECT_LIST = 0x08,
+	/*
+	 * Automatically bump RLIMIT_MEMLOCK using setrlimit() before the
+	 * first BPF program or map creation operation. This is done only if
+	 * kernel is too old to support memcg-based memory accounting for BPF
+	 * subsystem. By default, RLIMIT_MEMLOCK limit is set to RLIM_INFINITY,
+	 * but it can be overriden with libbpf_set_memlock_rlim_max() API.
+	 * Note that libbpf_set_memlock_rlim_max() needs to be called before
+	 * the very first bpf_prog_load(), bpf_map_create() or bpf_object__load()
+	 * operation.
+	 */
+	LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK = 0x10,
 
 	__LIBBPF_STRICT_LAST,
 };
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 bpf-next 2/2] selftests/bpf: remove explicit setrlimit(RLIMIT_MEMLOCK) in main selftests
  2021-12-10 20:13 [PATCH v2 bpf-next 0/2] libbpf: auto-bumpd RLIMIT_MEMLOCK on old kernels Andrii Nakryiko
  2021-12-10 20:13 ` [PATCH v2 bpf-next 1/2] libbpf: auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF Andrii Nakryiko
@ 2021-12-10 20:13 ` Andrii Nakryiko
  1 sibling, 0 replies; 6+ messages in thread
From: Andrii Nakryiko @ 2021-12-10 20:13 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team

As libbpf now is able to automatically take care of RLIMIT_MEMLOCK
increase (or skip it altogether on recent enough kernels), remove
explicit setrlimit() invocations in bench, test_maps, test_verifier, and
test_progs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/testing/selftests/bpf/bench.c              | 16 ----------------
 tools/testing/selftests/bpf/prog_tests/btf.c     |  1 -
 .../selftests/bpf/prog_tests/select_reuseport.c  |  1 -
 .../testing/selftests/bpf/prog_tests/sk_lookup.c |  1 -
 .../selftests/bpf/prog_tests/sock_fields.c       |  1 -
 tools/testing/selftests/bpf/test_maps.c          |  1 -
 tools/testing/selftests/bpf/test_progs.c         |  2 --
 tools/testing/selftests/bpf/test_verifier.c      |  4 +++-
 8 files changed, 3 insertions(+), 24 deletions(-)

diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c
index 3d6082b97a56..bdb071dc41f2 100644
--- a/tools/testing/selftests/bpf/bench.c
+++ b/tools/testing/selftests/bpf/bench.c
@@ -29,26 +29,10 @@ static int libbpf_print_fn(enum libbpf_print_level level,
 	return vfprintf(stderr, format, args);
 }
 
-static int bump_memlock_rlimit(void)
-{
-	struct rlimit rlim_new = {
-		.rlim_cur	= RLIM_INFINITY,
-		.rlim_max	= RLIM_INFINITY,
-	};
-
-	return setrlimit(RLIMIT_MEMLOCK, &rlim_new);
-}
-
 void setup_libbpf()
 {
-	int err;
-
 	libbpf_set_strict_mode(LIBBPF_STRICT_ALL);
 	libbpf_set_print(libbpf_print_fn);
-
-	err = bump_memlock_rlimit();
-	if (err)
-		fprintf(stderr, "failed to increase RLIMIT_MEMLOCK: %d", err);
 }
 
 void false_hits_report_progress(int iter, struct bench_res *res, long delta_ns)
diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c
index 01b776a7beeb..8ba53acf9eb4 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf.c
@@ -22,7 +22,6 @@
 #include <bpf/libbpf.h>
 #include <bpf/btf.h>
 
-#include "bpf_rlimit.h"
 #include "bpf_util.h"
 #include "../test_btf.h"
 #include "test_progs.h"
diff --git a/tools/testing/selftests/bpf/prog_tests/select_reuseport.c b/tools/testing/selftests/bpf/prog_tests/select_reuseport.c
index 980ac0f2c0bb..1cbd8cd64044 100644
--- a/tools/testing/selftests/bpf/prog_tests/select_reuseport.c
+++ b/tools/testing/selftests/bpf/prog_tests/select_reuseport.c
@@ -18,7 +18,6 @@
 #include <netinet/in.h>
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
-#include "bpf_rlimit.h"
 #include "bpf_util.h"
 
 #include "test_progs.h"
diff --git a/tools/testing/selftests/bpf/prog_tests/sk_lookup.c b/tools/testing/selftests/bpf/prog_tests/sk_lookup.c
index 57846cc7ce36..597d0467a926 100644
--- a/tools/testing/selftests/bpf/prog_tests/sk_lookup.c
+++ b/tools/testing/selftests/bpf/prog_tests/sk_lookup.c
@@ -30,7 +30,6 @@
 #include <bpf/bpf.h>
 
 #include "test_progs.h"
-#include "bpf_rlimit.h"
 #include "bpf_util.h"
 #include "cgroup_helpers.h"
 #include "network_helpers.h"
diff --git a/tools/testing/selftests/bpf/prog_tests/sock_fields.c b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
index fae40db4d81f..9fc040eaa482 100644
--- a/tools/testing/selftests/bpf/prog_tests/sock_fields.c
+++ b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
@@ -15,7 +15,6 @@
 #include "network_helpers.h"
 #include "cgroup_helpers.h"
 #include "test_progs.h"
-#include "bpf_rlimit.h"
 #include "test_sock_fields.skel.h"
 
 enum bpf_linum_array_idx {
diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
index f4cd658bbe00..50f7e74ca0b9 100644
--- a/tools/testing/selftests/bpf/test_maps.c
+++ b/tools/testing/selftests/bpf/test_maps.c
@@ -23,7 +23,6 @@
 #include <bpf/libbpf.h>
 
 #include "bpf_util.h"
-#include "bpf_rlimit.h"
 #include "test_maps.h"
 #include "testing_helpers.h"
 
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index 296928948bb9..2ecb73a65206 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -4,7 +4,6 @@
 #define _GNU_SOURCE
 #include "test_progs.h"
 #include "cgroup_helpers.h"
-#include "bpf_rlimit.h"
 #include <argp.h>
 #include <pthread.h>
 #include <sched.h>
@@ -1342,7 +1341,6 @@ int main(int argc, char **argv)
 
 	/* Use libbpf 1.0 API mode */
 	libbpf_set_strict_mode(LIBBPF_STRICT_ALL);
-
 	libbpf_set_print(libbpf_print_fn);
 
 	srand(time(NULL));
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 07b88a8f504f..ee661424eb2f 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -41,7 +41,6 @@
 #  define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS 1
 # endif
 #endif
-#include "bpf_rlimit.h"
 #include "bpf_rand.h"
 #include "bpf_util.h"
 #include "test_btf.h"
@@ -1355,6 +1354,9 @@ int main(int argc, char **argv)
 		return EXIT_FAILURE;
 	}
 
+	/* Use libbpf 1.0 API mode */
+	libbpf_set_strict_mode(LIBBPF_STRICT_ALL);
+
 	bpf_semi_rand_init();
 	return do_test(unpriv, from, to);
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 bpf-next 1/2] libbpf: auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF
  2021-12-10 20:13 ` [PATCH v2 bpf-next 1/2] libbpf: auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF Andrii Nakryiko
@ 2021-12-11 19:36   ` Toke Høiland-Jørgensen
  2021-12-12  2:22     ` Alexei Starovoitov
  0 siblings, 1 reply; 6+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-12-11 19:36 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, ast, daniel; +Cc: andrii, kernel-team

Andrii Nakryiko <andrii@kernel.org> writes:

> The need to increase RLIMIT_MEMLOCK to do anything useful with BPF is
> one of the first extremely frustrating gotchas that all new BPF users go
> through and in some cases have to learn it a very hard way.
>
> Luckily, starting with upstream Linux kernel version 5.11, BPF subsystem
> dropped the dependency on memlock and uses memcg-based memory accounting
> instead. Unfortunately, detecting memcg-based BPF memory accounting is
> far from trivial (as can be evidenced by this patch), so in practice
> most BPF applications still do unconditional RLIMIT_MEMLOCK increase.
>
> As we move towards libbpf 1.0, it would be good to allow users to forget
> about RLIMIT_MEMLOCK vs memcg and let libbpf do the sensible adjustment
> automatically. This patch paves the way forward in this matter. Libbpf
> will do feature detection of memcg-based accounting, and if detected,
> will do nothing. But if the kernel is too old, just like BCC, libbpf
> will automatically increase RLIMIT_MEMLOCK on behalf of user
> application ([0]).
>
> As this is technically a breaking change, during the transition period
> applications have to opt into libbpf 1.0 mode by setting
> LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK bit when calling
> libbpf_set_strict_mode().
>
> Libbpf allows to control the exact amount of set RLIMIT_MEMLOCK limit
> with libbpf_set_memlock_rlim_max() API. Passing 0 will make libbpf do
> nothing with RLIMIT_MEMLOCK. libbpf_set_memlock_rlim_max() has to be
> called before the first bpf_prog_load(), bpf_btf_load(), or
> bpf_object__load() call, otherwise it has no effect and will return
> -EBUSY.
>
>   [0] Closes: https://github.com/libbpf/libbpf/issues/369
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

The probing approach breaks with out-of-order backports, I suppose.
Hopefully no one will do those for that particular patch, though (it's
not really a bugfix), and at least for RHEL we did backport them
together.

Can't think of any better ways of doing the detection either, but maybe
something to be aware of in the future (i.e., "don't change things in a
way that can't be detected from userspace")?

Anyway, with the nits below:

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>

> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index 6b2407e12060..7c82136979bf 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c

[...]

> +	/* attempt loading freplace trying to use custom BTF */
> +	memset(&attr, 0, bpf_load_attr_sz);
> +	attr.prog_type = BPF_PROG_TYPE_TRACING;
> +	attr.expected_attach_type = BPF_TRACE_FENTRY;

This comment also seems to be disagreeing with the code it's commenting
on?

[...]

> +static bool memlock_bumped;
> +static rlim_t memlock_rlim_max = RLIM_INFINITY;
> +
> +int libbpf_set_memlock_rlim_max(size_t memlock_max)
> +{
> +	if (memlock_bumped)
> +		return libbpf_err(-EBUSY);
> +
> +	memlock_rlim_max = memlock_max;
> +	return 0;
> +}
> +
> +int bump_rlimit_memlock(void)
> +{
> +	struct rlimit rlim;
> +
> +	/* this the default in libbpf 1.0, but for now user has to opt-in explicitly */
> +	if (!(libbpf_mode & LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK))
> +		return 0;
> +
> +	/* if kernel supports memcg-based accounting, skip bumping RLIMIT_MEMLOCK */
> +	if (memlock_bumped || kernel_supports(NULL, FEAT_MEMCG_ACCOUNT))
> +		return 0;
> +
> +	memlock_bumped = true;
> +
> +	/* zero memlock_rlim_max disables auto-bumping RLIMIT_MEMLOCK */
> +	if (memlock_rlim_max == 0)
> +		return 0;
> +
> +	rlim.rlim_cur = rlim.rlim_max = memlock_rlim_max;
> +	if (setrlimit(RLIMIT_MEMLOCK, &rlim))
> +		return -errno;
> +
> +	return 0;
> +}
> +

"rlim_max" seems to imply this will only ever increase the limit, but if
I'm reading the code correctly it could actually end up lowering the
effective limit?

-Toke


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 bpf-next 1/2] libbpf: auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF
  2021-12-11 19:36   ` Toke Høiland-Jørgensen
@ 2021-12-12  2:22     ` Alexei Starovoitov
  2021-12-12 19:45       ` Andrii Nakryiko
  0 siblings, 1 reply; 6+ messages in thread
From: Alexei Starovoitov @ 2021-12-12  2:22 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team

On Sat, Dec 11, 2021 at 11:36 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Andrii Nakryiko <andrii@kernel.org> writes:
>
> > The need to increase RLIMIT_MEMLOCK to do anything useful with BPF is
> > one of the first extremely frustrating gotchas that all new BPF users go
> > through and in some cases have to learn it a very hard way.
> >
> > Luckily, starting with upstream Linux kernel version 5.11, BPF subsystem
> > dropped the dependency on memlock and uses memcg-based memory accounting
> > instead. Unfortunately, detecting memcg-based BPF memory accounting is
> > far from trivial (as can be evidenced by this patch), so in practice
> > most BPF applications still do unconditional RLIMIT_MEMLOCK increase.
> >
> > As we move towards libbpf 1.0, it would be good to allow users to forget
> > about RLIMIT_MEMLOCK vs memcg and let libbpf do the sensible adjustment
> > automatically. This patch paves the way forward in this matter. Libbpf
> > will do feature detection of memcg-based accounting, and if detected,
> > will do nothing. But if the kernel is too old, just like BCC, libbpf
> > will automatically increase RLIMIT_MEMLOCK on behalf of user
> > application ([0]).
> >
> > As this is technically a breaking change, during the transition period
> > applications have to opt into libbpf 1.0 mode by setting
> > LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK bit when calling
> > libbpf_set_strict_mode().
> >
> > Libbpf allows to control the exact amount of set RLIMIT_MEMLOCK limit
> > with libbpf_set_memlock_rlim_max() API. Passing 0 will make libbpf do
> > nothing with RLIMIT_MEMLOCK. libbpf_set_memlock_rlim_max() has to be
> > called before the first bpf_prog_load(), bpf_btf_load(), or
> > bpf_object__load() call, otherwise it has no effect and will return
> > -EBUSY.
> >
> >   [0] Closes: https://github.com/libbpf/libbpf/issues/369
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>
> The probing approach breaks with out-of-order backports, I suppose.
> Hopefully no one will do those for that particular patch, though (it's
> not really a bugfix), and at least for RHEL we did backport them
> together.
>
> Can't think of any better ways of doing the detection either, but maybe
> something to be aware of in the future (i.e., "don't change things in a
> way that can't be detected from userspace")?
>
> Anyway, with the nits below:
>
> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
>
> > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > index 6b2407e12060..7c82136979bf 100644
> > --- a/tools/lib/bpf/bpf.c
> > +++ b/tools/lib/bpf/bpf.c
>
> [...]
>
> > +     /* attempt loading freplace trying to use custom BTF */
> > +     memset(&attr, 0, bpf_load_attr_sz);
> > +     attr.prog_type = BPF_PROG_TYPE_TRACING;
> > +     attr.expected_attach_type = BPF_TRACE_FENTRY;
>
> This comment also seems to be disagreeing with the code it's commenting
> on?
>
> [...]
>
> > +static bool memlock_bumped;
> > +static rlim_t memlock_rlim_max = RLIM_INFINITY;
> > +
> > +int libbpf_set_memlock_rlim_max(size_t memlock_max)
> > +{
> > +     if (memlock_bumped)
> > +             return libbpf_err(-EBUSY);
> > +
> > +     memlock_rlim_max = memlock_max;
> > +     return 0;
> > +}
> > +
> > +int bump_rlimit_memlock(void)
> > +{
> > +     struct rlimit rlim;
> > +
> > +     /* this the default in libbpf 1.0, but for now user has to opt-in explicitly */
> > +     if (!(libbpf_mode & LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK))
> > +             return 0;
> > +
> > +     /* if kernel supports memcg-based accounting, skip bumping RLIMIT_MEMLOCK */
> > +     if (memlock_bumped || kernel_supports(NULL, FEAT_MEMCG_ACCOUNT))
> > +             return 0;
> > +
> > +     memlock_bumped = true;
> > +
> > +     /* zero memlock_rlim_max disables auto-bumping RLIMIT_MEMLOCK */
> > +     if (memlock_rlim_max == 0)
> > +             return 0;
> > +
> > +     rlim.rlim_cur = rlim.rlim_max = memlock_rlim_max;
> > +     if (setrlimit(RLIMIT_MEMLOCK, &rlim))
> > +             return -errno;
> > +
> > +     return 0;
> > +}
> > +
>
> "rlim_max" seems to imply this will only ever increase the limit, but if
> I'm reading the code correctly it could actually end up lowering the
> effective limit?

_max suffix is confusing to me as well.
libbpf_set_memlock_rlim() would be more accurate.

The other part of the patch:
+               /* assume memcg accounting is supported if we get -ENOTSUPP */
+               err = errno == 524 /* ENOTSUPP */ ? 1 : 0;

is dangerous.
ENOTSUPP is not a standard error code.
The kernel returns it from a few places, but we might fix
all of them eventually and return EOPNOTSUPP everywhere.
In other words user space cannot rely on this 524 error code number.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 bpf-next 1/2] libbpf: auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF
  2021-12-12  2:22     ` Alexei Starovoitov
@ 2021-12-12 19:45       ` Andrii Nakryiko
  0 siblings, 0 replies; 6+ messages in thread
From: Andrii Nakryiko @ 2021-12-12 19:45 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Toke Høiland-Jørgensen, Andrii Nakryiko, bpf,
	Alexei Starovoitov, Daniel Borkmann, Kernel Team

On Sat, Dec 11, 2021 at 6:23 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Sat, Dec 11, 2021 at 11:36 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >
> > Andrii Nakryiko <andrii@kernel.org> writes:
> >
> > > The need to increase RLIMIT_MEMLOCK to do anything useful with BPF is
> > > one of the first extremely frustrating gotchas that all new BPF users go
> > > through and in some cases have to learn it a very hard way.
> > >
> > > Luckily, starting with upstream Linux kernel version 5.11, BPF subsystem
> > > dropped the dependency on memlock and uses memcg-based memory accounting
> > > instead. Unfortunately, detecting memcg-based BPF memory accounting is
> > > far from trivial (as can be evidenced by this patch), so in practice
> > > most BPF applications still do unconditional RLIMIT_MEMLOCK increase.
> > >
> > > As we move towards libbpf 1.0, it would be good to allow users to forget
> > > about RLIMIT_MEMLOCK vs memcg and let libbpf do the sensible adjustment
> > > automatically. This patch paves the way forward in this matter. Libbpf
> > > will do feature detection of memcg-based accounting, and if detected,
> > > will do nothing. But if the kernel is too old, just like BCC, libbpf
> > > will automatically increase RLIMIT_MEMLOCK on behalf of user
> > > application ([0]).
> > >
> > > As this is technically a breaking change, during the transition period
> > > applications have to opt into libbpf 1.0 mode by setting
> > > LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK bit when calling
> > > libbpf_set_strict_mode().
> > >
> > > Libbpf allows to control the exact amount of set RLIMIT_MEMLOCK limit
> > > with libbpf_set_memlock_rlim_max() API. Passing 0 will make libbpf do
> > > nothing with RLIMIT_MEMLOCK. libbpf_set_memlock_rlim_max() has to be
> > > called before the first bpf_prog_load(), bpf_btf_load(), or
> > > bpf_object__load() call, otherwise it has no effect and will return
> > > -EBUSY.
> > >
> > >   [0] Closes: https://github.com/libbpf/libbpf/issues/369
> > >
> > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> >
> > The probing approach breaks with out-of-order backports, I suppose.
> > Hopefully no one will do those for that particular patch, though (it's
> > not really a bugfix), and at least for RHEL we did backport them
> > together.
> >
> > Can't think of any better ways of doing the detection either, but maybe
> > something to be aware of in the future (i.e., "don't change things in a
> > way that can't be detected from userspace")?

Initial version of memcg changes was supposed to return 0 memlock
value, when queried for BPF map. But it was decided to output some
approximation instead, so that one convenient way to detect this was
removed. I agree that kernel changes should show more care for
user-space detection in cases like this.

> >
> > Anyway, with the nits below:
> >
> > Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
> >
> > > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > > index 6b2407e12060..7c82136979bf 100644
> > > --- a/tools/lib/bpf/bpf.c
> > > +++ b/tools/lib/bpf/bpf.c
> >
> > [...]
> >
> > > +     /* attempt loading freplace trying to use custom BTF */
> > > +     memset(&attr, 0, bpf_load_attr_sz);
> > > +     attr.prog_type = BPF_PROG_TYPE_TRACING;
> > > +     attr.expected_attach_type = BPF_TRACE_FENTRY;
> >
> > This comment also seems to be disagreeing with the code it's commenting
> > on?
> >

That's what I get for writing comments. Will fix it up, thanks for spotting.

> > [...]
> >
> > > +static bool memlock_bumped;
> > > +static rlim_t memlock_rlim_max = RLIM_INFINITY;
> > > +
> > > +int libbpf_set_memlock_rlim_max(size_t memlock_max)
> > > +{
> > > +     if (memlock_bumped)
> > > +             return libbpf_err(-EBUSY);
> > > +
> > > +     memlock_rlim_max = memlock_max;
> > > +     return 0;
> > > +}
> > > +
> > > +int bump_rlimit_memlock(void)
> > > +{
> > > +     struct rlimit rlim;
> > > +
> > > +     /* this the default in libbpf 1.0, but for now user has to opt-in explicitly */
> > > +     if (!(libbpf_mode & LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK))
> > > +             return 0;
> > > +
> > > +     /* if kernel supports memcg-based accounting, skip bumping RLIMIT_MEMLOCK */
> > > +     if (memlock_bumped || kernel_supports(NULL, FEAT_MEMCG_ACCOUNT))
> > > +             return 0;
> > > +
> > > +     memlock_bumped = true;
> > > +
> > > +     /* zero memlock_rlim_max disables auto-bumping RLIMIT_MEMLOCK */
> > > +     if (memlock_rlim_max == 0)
> > > +             return 0;
> > > +
> > > +     rlim.rlim_cur = rlim.rlim_max = memlock_rlim_max;
> > > +     if (setrlimit(RLIMIT_MEMLOCK, &rlim))
> > > +             return -errno;
> > > +
> > > +     return 0;
> > > +}
> > > +
> >
> > "rlim_max" seems to imply this will only ever increase the limit, but if
> > I'm reading the code correctly it could actually end up lowering the
> > effective limit?
>
> _max suffix is confusing to me as well.
> libbpf_set_memlock_rlim() would be more accurate.

The thinking was that we bump rlim_max value with setrlimit(), but
yeah, I'm totally fine naming it as libbpf_set_memlock_rlim(), I'll
update in the next revision.

>
> The other part of the patch:
> +               /* assume memcg accounting is supported if we get -ENOTSUPP */
> +               err = errno == 524 /* ENOTSUPP */ ? 1 : 0;
>
> is dangerous.
> ENOTSUPP is not a standard error code.
> The kernel returns it from a few places, but we might fix
> all of them eventually and return EOPNOTSUPP everywhere.
> In other words user space cannot rely on this 524 error code number.

I can check both ENOTSUPP and EOPNOTSUPP, if the error code is the
concern. Worst case, even if this check breaks, we'll just do
setrlimit() unnecessarily, which the user can prevent with
libbpf_set_memlock_rlim(0), if they know that all supported systems
they expect to run on have memcg accounting for BPF. If not, they
should be fine with setrlimit() anyways (or prevent it explicitly, for
libbpf 1.0 mode).

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-12-12 19:46 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-10 20:13 [PATCH v2 bpf-next 0/2] libbpf: auto-bumpd RLIMIT_MEMLOCK on old kernels Andrii Nakryiko
2021-12-10 20:13 ` [PATCH v2 bpf-next 1/2] libbpf: auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF Andrii Nakryiko
2021-12-11 19:36   ` Toke Høiland-Jørgensen
2021-12-12  2:22     ` Alexei Starovoitov
2021-12-12 19:45       ` Andrii Nakryiko
2021-12-10 20:13 ` [PATCH v2 bpf-next 2/2] selftests/bpf: remove explicit setrlimit(RLIMIT_MEMLOCK) in main selftests Andrii Nakryiko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.