* [PATCH bpf-next v4 0/3] Introduce BPF map tracing capability
@ 2022-01-05  3:03 Joe Burton
  2022-01-05  3:03 ` [PATCH bpf-next v4 1/3] bpf: Add map tracing functions and call sites Joe Burton
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Joe Burton @ 2022-01-05  3:03 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, linux-kernel, netdev, bpf, ppenkov, sdf, haoluo
  Cc: Joe Burton

From: Joe Burton <jevburton@google.com>

This is the fourth version of a patch series implementing map tracing.

Map tracing enables executing BPF programs upon BPF map updates. This
can be used to upgrade stateful programs: for example, tracing programs
can propagate map changes that occur while an upgrade is in progress.

This version uses trampoline hooks to provide the capability.
fentry/fexit/fmod_ret programs can attach to two new functions:
        int bpf_map_trace_update_elem(struct bpf_map* map, void* key,
                void* val, u32 flags);
        int bpf_map_trace_delete_elem(struct bpf_map* map, void* key);

These hooks work as intended for the following map types:
        BPF_MAP_TYPE_ARRAY
        BPF_MAP_TYPE_PERCPU_ARRAY
        BPF_MAP_TYPE_HASH
        BPF_MAP_TYPE_PERCPU_HASH
        BPF_MAP_TYPE_LRU_HASH
        BPF_MAP_TYPE_LRU_PERCPU_HASH

The only guarantee about the semantics of these hooks is that they execute
after the operation takes place. We cannot call them with locks held
because the hooked program might try to acquire the same locks.
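
As a rough sketch of a consumer, a minimal tracer might look like the
following (the `update_count' array map is hypothetical and exists only
for this example; the selftests in patch 2/3 use the same attach point):

        #include "vmlinux.h"
        #include <bpf/bpf_helpers.h>
        #include <bpf/bpf_tracing.h>

        struct {
                __uint(type, BPF_MAP_TYPE_ARRAY);
                __uint(max_entries, 1);
                __type(key, u32);
                __type(value, u64);
        } update_count SEC(".maps");

        SEC("fentry/bpf_map_trace_update_elem")
        int BPF_PROG(count_updates, struct bpf_map *map, void *key,
                     void *value, u64 map_flags)
        {
                u32 idx = 0;
                u64 *cnt;

                /* Count every traced update; a real tracer could instead
                 * copy the key/value into a second map.
                 */
                cnt = bpf_map_lookup_elem(&update_count, &idx);
                if (cnt)
                        __sync_fetch_and_add(cnt, 1);
                return 0;
        }

        char _license[] SEC("license") = "GPL";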

Changes from v3 -> v4:
* Hooks execute *after* the associated operation, not before.
* Replaced `#pragma once' with traditional #ifndef include guards.
* Explicitly constrained selftests to x86, since trampolines are only
  implemented there.
* Use /dev/null instead of /tmp/map_trace_test_file in selftests.

Changes from v2 -> v3:
* Reimplemented using trampoline hooks, simplifying greatly.

Changes from v1 -> v2:
* None. Resent series to a broader audience.

Joe Burton (3):
  bpf: Add map tracing functions and call sites
  bpf: Add selftests
  bpf: Add real world example for map tracing

 kernel/bpf/Makefile                           |   2 +-
 kernel/bpf/arraymap.c                         |   4 +-
 kernel/bpf/hashtab.c                          |  20 +-
 kernel/bpf/map_trace.c                        |  17 +
 kernel/bpf/map_trace.h                        |  19 +
 .../selftests/bpf/prog_tests/map_trace.c      | 427 ++++++++++++++++++
 .../selftests/bpf/progs/bpf_map_trace.c       |  95 ++++
 .../bpf/progs/bpf_map_trace_common.h          |  12 +
 .../progs/bpf_map_trace_real_world_common.h   | 125 +++++
 .../bpf_map_trace_real_world_migration.c      | 102 +++++
 .../bpf/progs/bpf_map_trace_real_world_new.c  |   4 +
 .../bpf/progs/bpf_map_trace_real_world_old.c  |   5 +
 12 files changed, 829 insertions(+), 3 deletions(-)
 create mode 100644 kernel/bpf/map_trace.c
 create mode 100644 kernel/bpf/map_trace.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/map_trace.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_map_trace.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_map_trace_common.h
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_common.h
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_migration.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_new.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_old.c

-- 
2.34.1.448.ga2b2bfdf31-goog



* [PATCH bpf-next v4 1/3] bpf: Add map tracing functions and call sites
  2022-01-05  3:03 [PATCH bpf-next v4 0/3] Introduce BPF map tracing capability Joe Burton
@ 2022-01-05  3:03 ` Joe Burton
  2022-01-06 23:11   ` Daniel Borkmann
  2022-01-05  3:03 ` [PATCH bpf-next v4 2/3] bpf: Add selftests Joe Burton
  2022-01-05  3:03 ` [PATCH bpf-next v4 3/3] bpf: Add real world example for map tracing Joe Burton
  2 siblings, 1 reply; 6+ messages in thread
From: Joe Burton @ 2022-01-05  3:03 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, linux-kernel, netdev, bpf, ppenkov, sdf, haoluo
  Cc: Joe Burton

From: Joe Burton <jevburton@google.com>

Add two functions that fentry/fexit/fmod_ret programs can attach to:
	bpf_map_trace_update_elem
	bpf_map_trace_delete_elem
These functions have the same arguments as bpf_map_{update,delete}_elem.

Invoke these functions from the following map types:
	BPF_MAP_TYPE_ARRAY
	BPF_MAP_TYPE_PERCPU_ARRAY
	BPF_MAP_TYPE_HASH
	BPF_MAP_TYPE_PERCPU_HASH
	BPF_MAP_TYPE_LRU_HASH
	BPF_MAP_TYPE_LRU_PERCPU_HASH

The only guarantee about these functions is that they are invoked after
the corresponding action takes effect. They are not called with locks
held, since an attached program might try to acquire the same locks.
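
The call sites share a common pattern: perform the map operation, drop
any bucket lock, then invoke the hook only if the operation succeeded.
Schematically (simplified from the hashtab changes below):

	ret = ...;				/* the actual update */
	htab_unlock_bucket(htab, b, hash, flags);
	if (!ret)
		ret = bpf_map_trace_update_elem(map, key, value, map_flags);
	return ret;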

Signed-off-by: Joe Burton <jevburton@google.com>
---
 kernel/bpf/Makefile    |  2 +-
 kernel/bpf/arraymap.c  |  4 +++-
 kernel/bpf/hashtab.c   | 20 +++++++++++++++++++-
 kernel/bpf/map_trace.c | 17 +++++++++++++++++
 kernel/bpf/map_trace.h | 19 +++++++++++++++++++
 5 files changed, 59 insertions(+), 3 deletions(-)
 create mode 100644 kernel/bpf/map_trace.c
 create mode 100644 kernel/bpf/map_trace.h

diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index c1a9be6a4b9f..0cf38dab339a 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -9,7 +9,7 @@ CFLAGS_core.o += $(call cc-disable-warning, override-init) $(cflags-nogcse-yy)
 obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o bpf_iter.o map_iter.o task_iter.o prog_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o bloom_filter.o
 obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o
-obj-$(CONFIG_BPF_SYSCALL) += bpf_local_storage.o bpf_task_storage.o
+obj-$(CONFIG_BPF_SYSCALL) += bpf_local_storage.o bpf_task_storage.o map_trace.o
 obj-${CONFIG_BPF_LSM}	  += bpf_inode_storage.o
 obj-$(CONFIG_BPF_SYSCALL) += disasm.o
 obj-$(CONFIG_BPF_JIT) += trampoline.o
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index c7a5be3bf8be..e9e7bd27ffad 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -13,6 +13,7 @@
 #include <linux/rcupdate_trace.h>
 
 #include "map_in_map.h"
+#include "map_trace.h"
 
 #define ARRAY_CREATE_FLAG_MASK \
 	(BPF_F_NUMA_NODE | BPF_F_MMAPABLE | BPF_F_ACCESS_MASK | \
@@ -329,7 +330,8 @@ static int array_map_update_elem(struct bpf_map *map, void *key, void *value,
 			copy_map_value(map, val, value);
 		check_and_free_timer_in_array(array, val);
 	}
-	return 0;
+
+	return bpf_map_trace_update_elem(map, key, value, map_flags);
 }
 
 int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value,
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index d29af9988f37..8fb19ed707e8 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -13,6 +13,7 @@
 #include "percpu_freelist.h"
 #include "bpf_lru_list.h"
 #include "map_in_map.h"
+#include "map_trace.h"
 
 #define HTAB_CREATE_FLAG_MASK						\
 	(BPF_F_NO_PREALLOC | BPF_F_NO_COMMON_LRU | BPF_F_NUMA_NODE |	\
@@ -1055,7 +1056,8 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
 			copy_map_value_locked(map,
 					      l_old->key + round_up(key_size, 8),
 					      value, false);
-			return 0;
+			return bpf_map_trace_update_elem(map, key, value,
+							 map_flags);
 		}
 		/* fall through, grab the bucket lock and lookup again.
 		 * 99.9% chance that the element won't be found,
@@ -1109,6 +1111,8 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
 	ret = 0;
 err:
 	htab_unlock_bucket(htab, b, hash, flags);
+	if (!ret)
+		ret = bpf_map_trace_update_elem(map, key, value, map_flags);
 	return ret;
 }
 
@@ -1133,6 +1137,10 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
 		/* unknown flags */
 		return -EINVAL;
 
+	ret = bpf_map_trace_update_elem(map, key, value, map_flags);
+	if (unlikely(ret))
+		return ret;
+
 	WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held() &&
 		     !rcu_read_lock_bh_held());
 
@@ -1182,6 +1190,8 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
 	else if (l_old)
 		htab_lru_push_free(htab, l_old);
 
+	if (!ret)
+		ret = bpf_map_trace_update_elem(map, key, value, map_flags);
 	return ret;
 }
 
@@ -1237,6 +1247,8 @@ static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
 	ret = 0;
 err:
 	htab_unlock_bucket(htab, b, hash, flags);
+	if (!ret)
+		ret = bpf_map_trace_update_elem(map, key, value, map_flags);
 	return ret;
 }
 
@@ -1304,6 +1316,8 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
 	htab_unlock_bucket(htab, b, hash, flags);
 	if (l_new)
 		bpf_lru_push_free(&htab->lru, &l_new->lru_node);
+	if (!ret)
+		ret = bpf_map_trace_update_elem(map, key, value, map_flags);
 	return ret;
 }
 
@@ -1354,6 +1368,8 @@ static int htab_map_delete_elem(struct bpf_map *map, void *key)
 	}
 
 	htab_unlock_bucket(htab, b, hash, flags);
+	if (!ret)
+		ret = bpf_map_trace_delete_elem(map, key);
 	return ret;
 }
 
@@ -1390,6 +1406,8 @@ static int htab_lru_map_delete_elem(struct bpf_map *map, void *key)
 	htab_unlock_bucket(htab, b, hash, flags);
 	if (l)
 		htab_lru_push_free(htab, l);
+	if (!ret)
+		ret = bpf_map_trace_delete_elem(map, key);
 	return ret;
 }
 
diff --git a/kernel/bpf/map_trace.c b/kernel/bpf/map_trace.c
new file mode 100644
index 000000000000..336848e83daf
--- /dev/null
+++ b/kernel/bpf/map_trace.c
@@ -0,0 +1,17 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2022 Google */
+#include "map_trace.h"
+
+noinline int bpf_map_trace_update_elem(struct bpf_map *map, void *key,
+				       void *value, u64 map_flags)
+{
+	return 0;
+}
+ALLOW_ERROR_INJECTION(bpf_map_trace_update_elem, ERRNO);
+
+noinline int bpf_map_trace_delete_elem(struct bpf_map *map, void *key)
+{
+	return 0;
+}
+ALLOW_ERROR_INJECTION(bpf_map_trace_delete_elem, ERRNO);
+
diff --git a/kernel/bpf/map_trace.h b/kernel/bpf/map_trace.h
new file mode 100644
index 000000000000..ae943af9e2a5
--- /dev/null
+++ b/kernel/bpf/map_trace.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (c) 2022 Google */
+#ifndef __BPF_MAP_TRACE_H_
+#define __BPF_MAP_TRACE_H_
+
+#include <linux/bpf.h>
+
+/*
+ * Map tracing hooks. They are called from some, but not all, bpf map types.
+ * For those map types which call them, the only guarantee is that they are
+ * called after the corresponding action (bpf_map_update_elem, etc.) takes
+ * effect.
+ */
+int bpf_map_trace_update_elem(struct bpf_map *map, void *key,
+			      void *value, u64 map_flags);
+
+int bpf_map_trace_delete_elem(struct bpf_map *map, void *key);
+
+#endif  // __BPF_MAP_TRACE_H_
-- 
2.34.1.448.ga2b2bfdf31-goog



* [PATCH bpf-next v4 2/3] bpf: Add selftests
  2022-01-05  3:03 [PATCH bpf-next v4 0/3] Introduce BPF map tracing capability Joe Burton
  2022-01-05  3:03 ` [PATCH bpf-next v4 1/3] bpf: Add map tracing functions and call sites Joe Burton
@ 2022-01-05  3:03 ` Joe Burton
  2022-01-07  0:56   ` Alexei Starovoitov
  2022-01-05  3:03 ` [PATCH bpf-next v4 3/3] bpf: Add real world example for map tracing Joe Burton
  2 siblings, 1 reply; 6+ messages in thread
From: Joe Burton @ 2022-01-05  3:03 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, linux-kernel, netdev, bpf, ppenkov, sdf, haoluo
  Cc: Joe Burton

From: Joe Burton <jevburton@google.com>

Add selftests verifying that each supported map type is traced.
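
The tests can be run with the usual BPF selftest runner, e.g. (as root,
from tools/testing/selftests/bpf, after building the selftests):

	./test_progs -t map_trace

They are restricted to x86_64 since trampolines are currently only
implemented there.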

Signed-off-by: Joe Burton <jevburton@google.com>
---
 .../selftests/bpf/prog_tests/map_trace.c      | 171 ++++++++++++++++++
 .../selftests/bpf/progs/bpf_map_trace.c       |  95 ++++++++++
 .../bpf/progs/bpf_map_trace_common.h          |  12 ++
 3 files changed, 278 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/map_trace.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_map_trace.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_map_trace_common.h

diff --git a/tools/testing/selftests/bpf/prog_tests/map_trace.c b/tools/testing/selftests/bpf/prog_tests/map_trace.c
new file mode 100644
index 000000000000..a622bbedd99d
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/map_trace.c
@@ -0,0 +1,171 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Google */
+#include <test_progs.h>
+
+#include "bpf_map_trace.skel.h"
+#include "progs/bpf_map_trace_common.h"
+
+#include <sys/mount.h>
+#include <sys/stat.h>
+
+enum BoolOrErr {
+	TRUE = 0,
+	FALSE = 1,
+	ERROR = 2,
+};
+
+enum BoolOrErr percpu_key_is_set(struct bpf_map *map, uint32_t map_key)
+{
+	int num_cpus = libbpf_num_possible_cpus();
+	uint64_t *percpu_map_val = NULL;
+	int map_fd = bpf_map__fd(map);
+	enum BoolOrErr ret = ERROR;
+	int err;
+	int i;
+
+	if (!ASSERT_GE(num_cpus, 1, "get number of cpus"))
+		goto out;
+
+	percpu_map_val = malloc(sizeof(*percpu_map_val) * num_cpus);
+	if (!ASSERT_NEQ(percpu_map_val, NULL, "allocate percpu map array"))
+		goto out;
+
+	err = bpf_map_lookup_elem(map_fd, &map_key, percpu_map_val);
+	if (!ASSERT_EQ(err, 0, "map lookup update_elem"))
+		goto out;
+
+	ret = FALSE;
+	for (i = 0; i < num_cpus; i++)
+		if (percpu_map_val[i] != 0)
+			ret = TRUE;
+
+out:
+	if (percpu_map_val != NULL)
+		free(percpu_map_val);
+
+	return ret;
+}
+
+enum BoolOrErr key_is_set(struct bpf_map *map, uint32_t map_key)
+{
+	int map_fd = bpf_map__fd(map);
+	uint32_t map_val;
+	int rc;
+
+	rc = bpf_map_lookup_elem(map_fd, &map_key, &map_val);
+	if (!ASSERT_EQ(rc, 0, "array map lookup update_elem"))
+		return ERROR;
+
+	return (map_val == 0 ? FALSE : TRUE);
+}
+
+void verify_map_contents(struct bpf_map_trace *skel)
+{
+	enum BoolOrErr rc_or_err;
+	struct bpf_map *map;
+
+	map = skel->maps.array_map;
+	rc_or_err = key_is_set(map, ACCESS_LOC__TRACE_UPDATE);
+	if (!ASSERT_EQ(rc_or_err, TRUE, "array map updates are traced"))
+		return;
+	rc_or_err = key_is_set(map, ACCESS_LOC__TRACE_DELETE);
+	if (!ASSERT_EQ(rc_or_err, FALSE, "array map deletions are not traced"))
+		return;
+
+	map = skel->maps.percpu_array_map;
+	rc_or_err = percpu_key_is_set(map, ACCESS_LOC__TRACE_UPDATE);
+	if (!ASSERT_EQ(rc_or_err, TRUE, "percpu array map updates are traced"))
+		return;
+	rc_or_err = percpu_key_is_set(map, ACCESS_LOC__TRACE_DELETE);
+	if (!ASSERT_EQ(rc_or_err, FALSE,
+		       "percpu array map deletions are not traced"))
+		return;
+
+	map = skel->maps.hash_map;
+	rc_or_err = key_is_set(map, ACCESS_LOC__TRACE_UPDATE);
+	if (!ASSERT_EQ(rc_or_err, TRUE, "hash map updates are traced"))
+		return;
+	rc_or_err = key_is_set(map, ACCESS_LOC__TRACE_DELETE);
+	if (!ASSERT_EQ(rc_or_err, TRUE, "hash map deletions are traced"))
+		return;
+
+	map = skel->maps.percpu_hash_map;
+	rc_or_err = percpu_key_is_set(map, ACCESS_LOC__TRACE_UPDATE);
+	if (!ASSERT_EQ(rc_or_err, TRUE, "percpu hash map updates are traced"))
+		return;
+	rc_or_err = percpu_key_is_set(map, ACCESS_LOC__TRACE_DELETE);
+	if (!ASSERT_EQ(rc_or_err, TRUE,
+		       "percpu hash map deletions are traced"))
+		return;
+
+	map = skel->maps.lru_hash_map;
+	rc_or_err = key_is_set(map, ACCESS_LOC__TRACE_UPDATE);
+	if (!ASSERT_EQ(rc_or_err, TRUE, "lru_hash map updates are traced"))
+		return;
+	rc_or_err = key_is_set(map, ACCESS_LOC__TRACE_DELETE);
+	if (!ASSERT_EQ(rc_or_err, TRUE, "lru_hash map deletions are traced"))
+		return;
+
+	map = skel->maps.percpu_lru_hash_map;
+	rc_or_err = percpu_key_is_set(map, ACCESS_LOC__TRACE_UPDATE);
+	if (!ASSERT_EQ(rc_or_err, TRUE,
+		       "percpu lru hash map updates are traced"))
+		return;
+	rc_or_err = percpu_key_is_set(map, ACCESS_LOC__TRACE_DELETE);
+	if (!ASSERT_EQ(rc_or_err, TRUE,
+		       "percpu lru hash map deletions are traced"))
+		return;
+}
+
+void map_trace_test(void)
+{
+	struct bpf_map_trace *skel;
+	ssize_t bytes_written;
+	char write_buf = 'a';
+	int write_fd = -1;
+	int rc;
+
+	/*
+	 * Load and attach programs.
+	 */
+	skel = bpf_map_trace__open_and_load();
+	if (!ASSERT_NEQ(skel, NULL, "open/load skeleton"))
+		return;
+
+	rc = bpf_map_trace__attach(skel);
+	if (!ASSERT_EQ(rc, 0, "attach skeleton"))
+		goto out;
+
+	/*
+	 * Invoke core BPF program.
+	 */
+	write_fd = open("/tmp/map_trace_test_file", O_CREAT | O_WRONLY);
+	if (!ASSERT_GE(write_fd, 0, "open tmp file for writing"))
+		goto out;
+
+	bytes_written = write(write_fd, &write_buf, sizeof(write_buf));
+	if (!ASSERT_EQ(bytes_written, sizeof(write_buf), "write to tmp file"))
+		goto out;
+
+	/*
+	 * Verify that tracing programs were invoked as expected.
+	 */
+	verify_map_contents(skel);
+
+out:
+	if (skel)
+		bpf_map_trace__destroy(skel);
+	if (write_fd != -1)
+		close(write_fd);
+}
+
+void test_map_trace(void)
+{
+	/*
+	 * Trampoline programs are only supported on x86.
+	 */
+#if defined(__x86_64__)
+	map_trace_test();
+#endif
+}
+
diff --git a/tools/testing/selftests/bpf/progs/bpf_map_trace.c b/tools/testing/selftests/bpf/progs/bpf_map_trace.c
new file mode 100644
index 000000000000..90d4d435fe59
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_map_trace.c
@@ -0,0 +1,95 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Google */
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <errno.h>
+#include <string.h>
+
+#include "bpf_map_trace_common.h"
+
+#define DECLARE_MAP(name, map_type) \
+		struct { \
+			__uint(type, map_type); \
+			__uint(max_entries, __ACCESS_LOC__MAX); \
+			__type(key, u32); \
+			__type(value, u32); \
+		} name SEC(".maps")
+
+DECLARE_MAP(array_map, BPF_MAP_TYPE_ARRAY);
+DECLARE_MAP(percpu_array_map, BPF_MAP_TYPE_PERCPU_ARRAY);
+DECLARE_MAP(hash_map, BPF_MAP_TYPE_HASH);
+DECLARE_MAP(percpu_hash_map, BPF_MAP_TYPE_PERCPU_HASH);
+DECLARE_MAP(lru_hash_map, BPF_MAP_TYPE_LRU_HASH);
+DECLARE_MAP(percpu_lru_hash_map, BPF_MAP_TYPE_LRU_PERCPU_HASH);
+
+static inline void __log_location(void *map,
+				  enum MapAccessLocations location)
+{
+	u32 key = location;
+	u32 val = 1;
+
+	bpf_map_update_elem(map, &key, &val, /*flags=*/0);
+}
+
+static inline void log_location(struct bpf_map *map,
+				enum MapAccessLocations location)
+{
+	if (map == &array_map)
+		__log_location(&array_map, location);
+	if (map == &percpu_array_map)
+		__log_location(&percpu_array_map, location);
+	if (map == &hash_map)
+		__log_location(&hash_map, location);
+	if (map == &percpu_hash_map)
+		__log_location(&percpu_hash_map, location);
+	if (map == &lru_hash_map)
+		__log_location(&lru_hash_map, location);
+	if (map == &percpu_lru_hash_map)
+		__log_location(&percpu_lru_hash_map, location);
+}
+
+SEC("fentry/bpf_map_trace_update_elem")
+int BPF_PROG(fentry__bpf_map_trace_update_elem,
+	     struct bpf_map *map, void *key,
+	     void *value, u64 map_flags)
+{
+	log_location(map, ACCESS_LOC__TRACE_UPDATE);
+	return 0;
+}
+
+SEC("fentry/bpf_map_trace_delete_elem")
+int BPF_PROG(fentry__bpf_map_trace_delete_elem,
+	     struct bpf_map *map, void *key)
+{
+	log_location(map, ACCESS_LOC__TRACE_DELETE);
+	return 0;
+}
+
+static inline void do_map_accesses(void *map)
+{
+	u32 key = ACCESS_LOC__APP;
+	u32 val = 1;
+
+	bpf_map_update_elem(map, &key, &val, /*flags=*/0);
+	bpf_map_delete_elem(map, &key);
+}
+
+SEC("fentry/__x64_sys_write")
+int BPF_PROG(fentry__x64_sys_write, struct pt_regs *regs, int ret)
+{
+	/*
+	 * Trigger an update and a delete for every map type under test.
+	 * We want to verify that bpf_map_trace_{update,delete}_elem() fire
+	 * for each map type.
+	 */
+	do_map_accesses(&array_map);
+	do_map_accesses(&percpu_array_map);
+	do_map_accesses(&hash_map);
+	do_map_accesses(&percpu_hash_map);
+	do_map_accesses(&lru_hash_map);
+	do_map_accesses(&percpu_lru_hash_map);
+	return 0;
+}
+
diff --git a/tools/testing/selftests/bpf/progs/bpf_map_trace_common.h b/tools/testing/selftests/bpf/progs/bpf_map_trace_common.h
new file mode 100644
index 000000000000..8986e6286350
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_map_trace_common.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2022 Google */
+#pragma once
+
+enum MapAccessLocations {
+	ACCESS_LOC__APP,
+	ACCESS_LOC__TRACE_UPDATE,
+	ACCESS_LOC__TRACE_DELETE,
+
+	__ACCESS_LOC__MAX,
+};
+
-- 
2.34.1.448.ga2b2bfdf31-goog



* [PATCH bpf-next v4 3/3] bpf: Add real world example for map tracing
  2022-01-05  3:03 [PATCH bpf-next v4 0/3] Introduce BPF map tracing capability Joe Burton
  2022-01-05  3:03 ` [PATCH bpf-next v4 1/3] bpf: Add map tracing functions and call sites Joe Burton
  2022-01-05  3:03 ` [PATCH bpf-next v4 2/3] bpf: Add selftests Joe Burton
@ 2022-01-05  3:03 ` Joe Burton
  2 siblings, 0 replies; 6+ messages in thread
From: Joe Burton @ 2022-01-05  3:03 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, linux-kernel, netdev, bpf, ppenkov, sdf, haoluo
  Cc: Joe Burton

From: Joe Burton <jevburton@google.com>

Add an example that demonstrates how map tracing helps us avoid race
conditions while upgrading stateful programs.
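
The test walks through an upgrade: the old program runs against its map,
migration programs attach to bpf_map_trace_{update,delete}_elem to give
that map copy-on-write semantics, a bpf_iter program bulk-copies the
existing entries into the new program's map, and finally the new program
attaches and the old one is unloaded. The copy-on-write half reduces to
a small fentry program, roughly (simplified from
bpf_map_trace_real_world_migration.c below):

	SEC("fentry/bpf_map_trace_update_elem")
	int BPF_PROG(copy_on_write__update, struct bpf_map *map, void *key,
		     void *value, u64 map_flags)
	{
		/* Mirror writes to the old map into the new map. */
		if (map == &old_map)
			read_migrate_write(key, value);
		return 0;
	}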

Signed-off-by: Joe Burton <jevburton@google.com>
---
 .../selftests/bpf/prog_tests/map_trace.c      | 258 +++++++++++++++++-
 .../progs/bpf_map_trace_real_world_common.h   | 125 +++++++++
 .../bpf_map_trace_real_world_migration.c      | 102 +++++++
 .../bpf/progs/bpf_map_trace_real_world_new.c  |   4 +
 .../bpf/progs/bpf_map_trace_real_world_old.c  |   5 +
 5 files changed, 493 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_common.h
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_migration.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_new.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_old.c

diff --git a/tools/testing/selftests/bpf/prog_tests/map_trace.c b/tools/testing/selftests/bpf/prog_tests/map_trace.c
index a622bbedd99d..00648d44b2b6 100644
--- a/tools/testing/selftests/bpf/prog_tests/map_trace.c
+++ b/tools/testing/selftests/bpf/prog_tests/map_trace.c
@@ -3,6 +3,9 @@
 #include <test_progs.h>
 
 #include "bpf_map_trace.skel.h"
+#include "bpf_map_trace_real_world_migration.skel.h"
+#include "bpf_map_trace_real_world_new.skel.h"
+#include "bpf_map_trace_real_world_old.skel.h"
 #include "progs/bpf_map_trace_common.h"
 
 #include <sys/mount.h>
@@ -139,7 +142,7 @@ void map_trace_test(void)
 	/*
 	 * Invoke core BPF program.
 	 */
-	write_fd = open("/tmp/map_trace_test_file", O_CREAT | O_WRONLY);
+	write_fd = open("/dev/null", O_CREAT | O_WRONLY);
 	if (!ASSERT_GE(write_fd, 0, "open tmp file for writing"))
 		goto out;
 
@@ -159,6 +162,258 @@ void map_trace_test(void)
 		close(write_fd);
 }
 
+int real_world_example__attach_migration(
+		struct bpf_map_trace_real_world_migration *migration_skel,
+		struct bpf_link **iter_link,
+		struct bpf_link **map_trace_link_update,
+		struct bpf_link **map_trace_link_delete)
+{
+	union bpf_iter_link_info iter_link_info;
+	struct bpf_iter_attach_opts iter_opts;
+	int64_t error;
+
+	*map_trace_link_update = bpf_program__attach(
+			migration_skel->progs.copy_on_write__update);
+	error = libbpf_get_error(*map_trace_link_update);
+	if (!ASSERT_EQ(error, 0,
+		       "copy_on_write update bpf_program__attach failure"))
+		return 1;
+
+	*map_trace_link_delete = bpf_program__attach(
+			migration_skel->progs.copy_on_write__delete);
+	error = libbpf_get_error(*map_trace_link_delete);
+	if (!ASSERT_EQ(error, 0,
+		       "copy_on_write delete bpf_program__attach failure"))
+		return 1;
+
+	memset(&iter_link_info, 0, sizeof(iter_link_info));
+	iter_link_info.map.map_fd = bpf_map__fd(migration_skel->maps.old_map);
+
+	memset(&iter_opts, 0, sizeof(iter_opts));
+	iter_opts.sz = sizeof(iter_opts);
+	iter_opts.link_info = &iter_link_info;
+	iter_opts.link_info_len = sizeof(iter_link_info);
+	*iter_link = bpf_program__attach_iter(
+			migration_skel->progs.bulk_migration, &iter_opts);
+	error = libbpf_get_error(iter_link);
+	if (!ASSERT_EQ(error, 0, "bpf_program__attach_iter failure"))
+		return 1;
+
+	return 0;
+}
+
+int open_and_write_files(const char *path, size_t num_files)
+{
+	int *fds = malloc(sizeof(int) * num_files);
+	ssize_t bytes_written;
+	const char buf = 'a';
+	size_t i, j;
+	int ret = 0;
+
+	if (fds == NULL)
+		return 1;
+
+	for (i = 0; i < num_files; i++) {
+		fds[i] = open(path, O_WRONLY | O_CREAT, 0644);
+
+		if (fds[i] < 0) {
+			ret = 2;
+			break;
+		}
+		bytes_written = write(fds[i], &buf, sizeof(buf));
+		if (bytes_written != sizeof(buf)) {
+			ret = 3;
+			break;
+		}
+	}
+	for (j = 0; j < i; j++)
+		close(fds[j]);
+	return ret;
+}
+
+void real_world_example(void)
+{
+	struct bpf_map_trace_real_world_migration *migration_skel = NULL;
+	int file_fd_should_write = -1, file_fd_should_not_write = -1;
+	struct bpf_map_trace_real_world_new *new_skel = NULL;
+	struct bpf_map_trace_real_world_old *old_skel = NULL;
+	struct bpf_link *map_trace_link_update = NULL;
+	struct bpf_link *map_trace_link_delete = NULL;
+	struct bpf_link *iter_link = NULL;
+	const bool enable_filtering = 1;
+	const uint32_t pid = getpid();
+	uint32_t max_open_files;
+	char file_buf = 'a';
+	int iter_fd = -1;
+	char iter_buf[1];
+	int rc;
+
+	/*
+	 * Begin by loading and attaching the old version of our program.
+	 */
+	old_skel = bpf_map_trace_real_world_old__open_and_load();
+	if (!ASSERT_NEQ(old_skel, NULL, "open/load old skeleton"))
+		return;
+	rc = bpf_map_trace_real_world_old__attach(old_skel);
+	if (!ASSERT_EQ(rc, 0, "attach old skeleton")) {
+		fprintf(stderr, "Failed to attach skeleton: %d\n", errno);
+		goto out;
+	}
+	rc = bpf_map_update_elem(bpf_map__fd(old_skel->maps.filtered_pids),
+				 &pid, &enable_filtering, /*flags=*/0);
+	if (!ASSERT_EQ(rc, 0, "configure process to be filtered"))
+		goto out;
+	if (!ASSERT_EQ(open_and_write_files("/tmp/tst_file", 1), 0,
+		       "program allows writing a single new file"))
+		goto out;
+	max_open_files = bpf_map__max_entries(old_skel->maps.allow_reads);
+	if (!ASSERT_NEQ(open_and_write_files("/tmp/tst_file",
+					     max_open_files + 1), 0,
+		       "program blocks writing too many new files"))
+		goto out;
+
+	/*
+	 * Then load the new version of the program.
+	 */
+	new_skel = bpf_map_trace_real_world_new__open_and_load();
+	if (!ASSERT_NEQ(new_skel, NULL, "open/load new skeleton"))
+		goto out;
+
+	/*
+	 * Hook up the migration programs. This gives the old map
+	 * copy-on-write semantics.
+	 */
+	migration_skel = bpf_map_trace_real_world_migration__open();
+	if (!ASSERT_NEQ(migration_skel, NULL, "open migration skeleton"))
+		goto out;
+	rc = bpf_map__reuse_fd(migration_skel->maps.old_map,
+			       bpf_map__fd(old_skel->maps.allow_reads));
+	if (!ASSERT_EQ(rc, 0, "reuse old map fd"))
+		goto out;
+	rc = bpf_map__reuse_fd(migration_skel->maps.new_map,
+			       bpf_map__fd(new_skel->maps.allow_reads));
+	if (!ASSERT_EQ(rc, 0, "reuse new map fd"))
+		goto out;
+	rc = bpf_map_trace_real_world_migration__load(migration_skel);
+	if (!ASSERT_EQ(rc, 0, "load migration skeleton"))
+		goto out;
+	rc = real_world_example__attach_migration(migration_skel,
+						  &iter_link,
+						  &map_trace_link_update,
+						  &map_trace_link_delete);
+	if (!ASSERT_EQ(rc, 0, "attach migration programs"))
+		goto out;
+
+	/*
+	 * Simulated race condition type 1: An application opens an fd before
+	 * bulk transfer and closes it after.
+	 */
+	file_fd_should_not_write = open("/tmp/tst_file", O_WRONLY | O_CREAT, 0644);
+	if (!ASSERT_GE(file_fd_should_not_write, 0,
+		       "open file before bulk migration"))
+		goto out;
+
+	/*
+	 * Perform bulk transfer.
+	 */
+	iter_fd = bpf_iter_create(bpf_link__fd(iter_link));
+	if (!ASSERT_GE(iter_fd, 0, "create iterator"))
+		goto out;
+	rc = read(iter_fd, &iter_buf, sizeof(iter_buf));
+	if (!ASSERT_EQ(rc, 0, "execute map iterator"))
+		goto out;
+	rc = bpf_map_update_elem(bpf_map__fd(new_skel->maps.filtered_pids),
+				 &pid, &enable_filtering, /*flags=*/0);
+	if (!ASSERT_EQ(rc, 0, "configure process to be filtered"))
+		goto out;
+
+	/*
+	 * Simulated race condition type 1 (continued). This close() does not
+	 * propagate to the new map without copy-on-write semantics, so it
+	 * would occupy a spot in the map until our app happens to close an fd
+	 * with the same number. This would subtly degrade the contract with
+	 * the application.
+	 */
+	close(file_fd_should_not_write);
+	file_fd_should_not_write = -1;
+
+	/*
+	 * Simulated race condition type 2: An application opens a file
+	 * descriptor after bulk transfer. This openat() does not propagate to
+	 * the new map without copy-on-write, so our app would not be able to
+	 * write to it.
+	 */
+	file_fd_should_write = open("/tmp/tst_file", O_WRONLY | O_CREAT, 0644);
+	if (!ASSERT_GE(file_fd_should_write, 0,
+		       "open file after bulk migration"))
+		goto out;
+
+	/*
+	 * State is migrated. Load new programs.
+	 */
+	rc = bpf_map_trace_real_world_new__attach(new_skel);
+	if (!ASSERT_EQ(rc, 0, "failed to attach new programs"))
+		goto out;
+
+	/*
+	 * Unload migration progs.
+	 */
+	close(iter_fd);
+	iter_fd = -1;
+	bpf_link__destroy(map_trace_link_update);
+	map_trace_link_update = NULL;
+	bpf_link__destroy(map_trace_link_delete);
+	map_trace_link_delete = NULL;
+	bpf_link__destroy(iter_link);
+	iter_link = NULL;
+	bpf_map_trace_real_world_migration__destroy(migration_skel);
+	migration_skel = NULL;
+
+	/*
+	 * Unload old programs.
+	 */
+	bpf_map_trace_real_world_old__destroy(old_skel);
+	old_skel = NULL;
+
+	if (!ASSERT_EQ(open_and_write_files("/tmp/tst_file", 1), 0,
+		       "program allows writing a single new file"))
+		goto out;
+	max_open_files = bpf_map__max_entries(new_skel->maps.allow_reads);
+	if (!ASSERT_NEQ(open_and_write_files("/tmp/tst_file",
+					     max_open_files + 1), 0,
+		       "program blocks writing too many new files"))
+		goto out;
+	/*
+	 * Simulated race condition type 2 (continued): If we didn't do
+	 * copy-on-write, this would be expected to fail, since the FD would
+	 * not be in the new map.
+	 */
+	rc = write(file_fd_should_write, &file_buf, sizeof(file_buf));
+	if (!ASSERT_EQ(rc, sizeof(file_buf),
+		       "migrated program allows writing to file opened before migration"))
+		goto out;
+
+out:
+	if (old_skel)
+		bpf_map_trace_real_world_old__destroy(old_skel);
+	if (new_skel)
+		bpf_map_trace_real_world_new__destroy(new_skel);
+	if (migration_skel)
+		bpf_map_trace_real_world_migration__destroy(migration_skel);
+	if (map_trace_link_update)
+		bpf_link__destroy(map_trace_link_update);
+	if (map_trace_link_delete)
+		bpf_link__destroy(map_trace_link_delete);
+	if (iter_link)
+		bpf_link__destroy(iter_link);
+	if (iter_fd > -1)
+		close(iter_fd);
+	if (file_fd_should_write > -1)
+		close(file_fd_should_write);
+	if (file_fd_should_not_write > -1)
+		close(file_fd_should_not_write);
+}
+
 void test_map_trace(void)
 {
 	/*
@@ -166,6 +421,7 @@ void test_map_trace(void)
 	 */
 #if defined(__x86_64__)
 	map_trace_test();
+	real_world_example();
 #endif
 }
 
diff --git a/tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_common.h b/tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_common.h
new file mode 100644
index 000000000000..0311f1d4e99f
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_common.h
@@ -0,0 +1,125 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2022 Google */
+#pragma once
+
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <errno.h>
+#include <string.h>
+
+/*
+ * Mock "real world" application.
+ *
+ * Blocks all writes from a set of applications. A limited number of newly
+ * openat()ed file descriptors may be written to. Writes to
+ * already-open file descriptors are blocked.
+ *
+ * The affected processes are selected by populating filtered_pids.
+ *
+ * It is intended as an example of a stateful policy-enforcement application
+ * which benefits from map tracing. It is not intended to be useful.
+ */
+
+/*
+ * This is the only difference between the old and new application. Since we're
+ * enforcing a policy based on this data, we want to migrate it. Since the
+ * application can modify the data in parallel, we need to give this map
+ * copy-on-write semantics so that those changes propagate.
+ */
+#if defined(OLD_VERSION)
+struct allow_reads_key {
+	uint32_t pid;
+	int fd;
+};
+#else
+struct allow_reads_key {
+	int fd;
+	uint32_t pid;
+};
+#endif
+struct allow_reads_value {
+	bool do_allow;
+};
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(max_entries, 16);
+	__type(key, struct allow_reads_key);
+	__type(value, struct allow_reads_value);
+} allow_reads SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(max_entries, 16);
+	__type(key, uint32_t);
+	__type(value, bool);
+} filtered_pids SEC(".maps");
+
+
+SEC("kretprobe/__x64_sys_openat")
+int BPF_KRETPROBE(kretprobe__x64_sys_openat, int ret)
+{
+	struct allow_reads_key key;
+	struct allow_reads_value val;
+	uint32_t pid;
+	char *pid_is_filtered;
+
+	pid = (bpf_get_current_pid_tgid() >> 32) & 0xFFFFFFFF;
+	memset(&key, 0, sizeof(key));
+	key.pid = pid;
+	key.fd = ret;
+	val.do_allow = true;
+
+	if (ret < 0)
+		return 0;
+
+	pid_is_filtered = bpf_map_lookup_elem(&filtered_pids, &pid);
+	if (!pid_is_filtered)
+		return 0;
+
+	if (!*pid_is_filtered)
+		return 0;
+
+	/*
+	 * Ignore errors. Failing to insert has the effect of blocking writes
+	 * on that file descriptor.
+	 */
+	bpf_map_update_elem(&allow_reads, &key, &val, /*flags=*/0);
+	return 0;
+}
+
+SEC("fmod_ret/__x64_sys_write")
+int BPF_PROG(fmod_ret__x64_sys_write, struct pt_regs *regs, int ret)
+{
+	int fd = PT_REGS_PARM1(regs);
+	struct allow_reads_value *val;
+	struct allow_reads_key key;
+
+	memset(&key, 0, sizeof(key));
+	key.pid = (bpf_get_current_pid_tgid() >> 32) & 0xFFFFFFFF;
+	key.fd = fd;
+	val = bpf_map_lookup_elem(&allow_reads, &key);
+	if (!val)
+		return -EPERM;
+	return val->do_allow ? 0 : -EPERM;
+}
+
+SEC("fmod_ret/__x64_sys_close")
+int BPF_PROG(fmod_ret__x64_sys_close, struct pt_regs *regs, int ret)
+{
+	int fd = PT_REGS_PARM1(regs);
+	struct allow_reads_key key;
+	struct allow_reads_value val;
+
+	memset(&key, 0, sizeof(key));
+	key.pid = (bpf_get_current_pid_tgid() >> 32) & 0xFFFFFFFF;
+	key.fd = fd;
+	val.do_allow = true;
+
+	bpf_map_delete_elem(&allow_reads, &key);
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
+
diff --git a/tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_migration.c b/tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_migration.c
new file mode 100644
index 000000000000..842dc4bc382a
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_migration.c
@@ -0,0 +1,102 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Google */
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+/* In the "real" real world, we would use BTF to generate a program which knows
+ * about the old and new map ABI. To keep things simple we'll just use a
+ * statically defined program which knows about them.
+ */
+struct allow_reads_key__old {
+	uint32_t pid;
+	int fd;
+};
+struct allow_reads_key__new {
+	int fd;
+	uint32_t pid;
+};
+struct allow_reads_value__old {
+	bool do_drop;
+};
+struct allow_reads_value__new {
+	bool do_drop;
+};
+
+/* Likewise, in the "real" real world we would simply generate a program
+ * containing the fd of this map. For libbpf to generate a skeleton for us we
+ * need to duplicate this definition.
+ */
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(max_entries, 100);
+	__type(key, struct allow_reads_key__old);
+	__type(value, struct allow_reads_value__old);
+} old_map SEC(".maps");
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(max_entries, 100);
+	__type(key, struct allow_reads_key__new);
+	__type(value, struct allow_reads_value__new);
+} new_map SEC(".maps");
+
+static inline void read_migrate_write(void *key, void *value)
+{
+	struct allow_reads_key__old old_key = {};
+	struct allow_reads_key__new new_key = {};
+	char old_value = 0;
+
+	if (bpf_probe_read_kernel(&old_key, sizeof(old_key), key))
+		return;
+	if (bpf_probe_read_kernel(&old_value, sizeof(old_value), value))
+		return;
+
+	new_key.pid = old_key.pid;
+	new_key.fd = old_key.fd;
+
+	bpf_map_update_elem(&new_map, &new_key, &old_value, /*flags=*/0);
+}
+
+SEC("fentry/bpf_map_trace_update_elem")
+int BPF_PROG(copy_on_write__update,
+	     struct bpf_map *map, void *key,
+	     void *value, u64 map_flags)
+{
+	if (map == &old_map)
+		read_migrate_write(key, value);
+	return 0;
+}
+
+static inline void read_migrate_delete(void *key)
+{
+	struct allow_reads_key__old old_key = {};
+	struct allow_reads_key__new new_key = {};
+
+	if (bpf_probe_read(&old_key, sizeof(old_key), key))
+		return; /* Could write to a map here */
+
+	new_key.pid = old_key.pid;
+	new_key.fd = old_key.fd;
+
+	bpf_map_delete_elem(&new_map, &new_key);
+}
+
+SEC("fentry/bpf_map_trace_delete_elem")
+int BPF_PROG(copy_on_write__delete,
+	     struct bpf_map *map, void *key)
+{
+	if (map == &old_map)
+		read_migrate_delete(key);
+	return 0;
+}
+
+SEC("iter/bpf_map_elem")
+int bulk_migration(struct bpf_iter__bpf_map_elem *ctx)
+{
+	read_migrate_write(ctx->key, ctx->value);
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
+
diff --git a/tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_new.c b/tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_new.c
new file mode 100644
index 000000000000..4291a39c8009
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_new.c
@@ -0,0 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Google */
+#include "bpf_map_trace_real_world_common.h"
+
diff --git a/tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_old.c b/tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_old.c
new file mode 100644
index 000000000000..4124d4c96d55
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_map_trace_real_world_old.c
@@ -0,0 +1,5 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Google */
+#define OLD_VERSION
+#include "bpf_map_trace_real_world_common.h"
+
-- 
2.34.1.448.ga2b2bfdf31-goog



* Re: [PATCH bpf-next v4 1/3] bpf: Add map tracing functions and call sites
  2022-01-05  3:03 ` [PATCH bpf-next v4 1/3] bpf: Add map tracing functions and call sites Joe Burton
@ 2022-01-06 23:11   ` Daniel Borkmann
  0 siblings, 0 replies; 6+ messages in thread
From: Daniel Borkmann @ 2022-01-06 23:11 UTC (permalink / raw)
  To: Joe Burton, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, linux-kernel, netdev, bpf, ppenkov, sdf, haoluo
  Cc: Joe Burton

On 1/5/22 4:03 AM, Joe Burton wrote:
> From: Joe Burton <jevburton@google.com>
> 
> Add two functions that fentry/fexit/fmod_ret programs can attach to:
> 	bpf_map_trace_update_elem
> 	bpf_map_trace_delete_elem
> These functions have the same arguments as bpf_map_{update,delete}_elem.
> 
> Invoke these functions from the following map types:
> 	BPF_MAP_TYPE_ARRAY
> 	BPF_MAP_TYPE_PERCPU_ARRAY
> 	BPF_MAP_TYPE_HASH
> 	BPF_MAP_TYPE_PERCPU_HASH
> 	BPF_MAP_TYPE_LRU_HASH
> 	BPF_MAP_TYPE_LRU_PERCPU_HASH
> 
> The only guarantee about these functions is that they are invoked after
> the corresponding action takes effect. They are not called with locks
> held, since an attached program might try to acquire the same locks.
> 
> Signed-off-by: Joe Burton <jevburton@google.com>
> ---
>   kernel/bpf/Makefile    |  2 +-
>   kernel/bpf/arraymap.c  |  4 +++-
>   kernel/bpf/hashtab.c   | 20 +++++++++++++++++++-
>   kernel/bpf/map_trace.c | 17 +++++++++++++++++
>   kernel/bpf/map_trace.h | 19 +++++++++++++++++++
>   5 files changed, 59 insertions(+), 3 deletions(-)
>   create mode 100644 kernel/bpf/map_trace.c
>   create mode 100644 kernel/bpf/map_trace.h
> 
> diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
> index c1a9be6a4b9f..0cf38dab339a 100644
> --- a/kernel/bpf/Makefile
> +++ b/kernel/bpf/Makefile
> @@ -9,7 +9,7 @@ CFLAGS_core.o += $(call cc-disable-warning, override-init) $(cflags-nogcse-yy)
>   obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o bpf_iter.o map_iter.o task_iter.o prog_iter.o
>   obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o bloom_filter.o
>   obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o
> -obj-$(CONFIG_BPF_SYSCALL) += bpf_local_storage.o bpf_task_storage.o
> +obj-$(CONFIG_BPF_SYSCALL) += bpf_local_storage.o bpf_task_storage.o map_trace.o
>   obj-${CONFIG_BPF_LSM}	  += bpf_inode_storage.o
>   obj-$(CONFIG_BPF_SYSCALL) += disasm.o
>   obj-$(CONFIG_BPF_JIT) += trampoline.o
> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> index c7a5be3bf8be..e9e7bd27ffad 100644
> --- a/kernel/bpf/arraymap.c
> +++ b/kernel/bpf/arraymap.c
> @@ -13,6 +13,7 @@
>   #include <linux/rcupdate_trace.h>
>   
>   #include "map_in_map.h"
> +#include "map_trace.h"
>   
>   #define ARRAY_CREATE_FLAG_MASK \
>   	(BPF_F_NUMA_NODE | BPF_F_MMAPABLE | BPF_F_ACCESS_MASK | \
> @@ -329,7 +330,8 @@ static int array_map_update_elem(struct bpf_map *map, void *key, void *value,
>   			copy_map_value(map, val, value);
>   		check_and_free_timer_in_array(array, val);
>   	}
> -	return 0;
> +
> +	return bpf_map_trace_update_elem(map, key, value, map_flags);

Given post-update, do you have a use case where you make use of the fmod_ret for
propagating non-zero ret codes?

Could you also extend the commit description with the rationale for not adding
these dummy funcs at a higher level, e.g. into map_update_elem() upon success?

[...]
> @@ -1133,6 +1137,10 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
>   		/* unknown flags */
>   		return -EINVAL;
>   
> +	ret = bpf_map_trace_update_elem(map, key, value, map_flags);
> +	if (unlikely(ret))
> +		return ret;
> +
>   	WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held() &&
>   		     !rcu_read_lock_bh_held());
>   
> @@ -1182,6 +1190,8 @@ static int htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value,
>   	else if (l_old)
>   		htab_lru_push_free(htab, l_old);
>   
> +	if (!ret)
> +		ret = bpf_map_trace_update_elem(map, key, value, map_flags);
>   	return ret;
>   }

Here, it's pre and post update, is that intentional?

Thanks,
Daniel


* Re: [PATCH bpf-next v4 2/3] bpf: Add selftests
  2022-01-05  3:03 ` [PATCH bpf-next v4 2/3] bpf: Add selftests Joe Burton
@ 2022-01-07  0:56   ` Alexei Starovoitov
  0 siblings, 0 replies; 6+ messages in thread
From: Alexei Starovoitov @ 2022-01-07  0:56 UTC (permalink / raw)
  To: Joe Burton
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, LKML, Network Development, bpf, Petar Penkov,
	Stanislav Fomichev, Hao Luo, Joe Burton

On Tue, Jan 4, 2022 at 7:03 PM Joe Burton <jevburton.kernel@gmail.com> wrote:
> +
> +enum BoolOrErr {
> +       TRUE = 0,
> +       FALSE = 1,
> +       ERROR = 2,
> +};

No camel style in the kernel please.

> +++ b/tools/testing/selftests/bpf/progs/bpf_map_trace_common.h
> @@ -0,0 +1,12 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright (c) 2022 Google */
> +#pragma once

Didn't you say that pragma once was removed?

> +
> +enum MapAccessLocations {

here and other places.

