* [RFC PATCH bpf-next 0/5] Allow storage of flexible metadata information for eBPF programs
@ 2020-08-14 19:15 YiFei Zhu
  2020-08-14 19:15 ` [RFC PATCH bpf-next 1/5] bpf: RCU protect used_maps array and count YiFei Zhu
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: YiFei Zhu @ 2020-08-14 19:15 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev,
	Mahesh Bandewar, YiFei Zhu

From: YiFei Zhu <zhuyifei@google.com>

Currently, if a user wants to store arbitrary metadata for an eBPF
program, for example the program's build commit hash or version, they
could store it in a map, and libbpf conveniently populates an internal
map from the .data section. However, if the program does not actually
reference the map, the map is de-refcounted and freed.

This patch set introduces a new syscall, BPF_PROG_ADD_MAP, to attach a
map to a program's used_maps, even if the program instructions do not
reference the map. libbpf is extended to recognize the .metadata section
and load it as an internal map, and to use the new syscall to ensure the
map is attached. bpftool is also extended with a new flag to the prog
subcommand, "--metadata", to dump the contents of the metadata section
without a separate map dump call.

An example use of this would be a BPF C file declaring:

 char commit_hash[] SEC(".metadata") = "abcdef123456";

and bpftool would emit:

  $ bpftool prog --metadata
  [...]
        metadata:
                commit_hash = "abcdef123456"

Patch 1 moves the used_maps array and count into an RCU-protected
struct.

Patch 2 implements the new syscall.

Patch 3 extends libbpf to add a wrapper around the syscall, probe the
kernel for support of this new syscall, and use it on the .metadata
section if supported and the section exists.

Patch 4 extends bpftool so that it is able to dump metadata from prog
show.

Patch 5 adds a test to check the metadata loading and dumping.

YiFei Zhu (5):
  bpf: RCU protect used_maps array and count
  bpf: Add BPF_PROG_ADD_MAP syscall
  libbpf: Add BPF_PROG_ADD_MAP syscall and use it on .metadata section
  bpftool: support dumping metadata
  selftests/bpf: Test bpftool loading and dumping metadata

 .../net/ethernet/netronome/nfp/bpf/offload.c  |  33 +++--
 include/linux/bpf.h                           |  11 +-
 include/uapi/linux/bpf.h                      |   7 +
 kernel/bpf/core.c                             |  25 +++-
 kernel/bpf/syscall.c                          | 102 +++++++++++++-
 kernel/bpf/verifier.c                         |  37 +++--
 net/core/dev.c                                |  12 +-
 tools/bpf/bpftool/json_writer.c               |   6 +
 tools/bpf/bpftool/json_writer.h               |   3 +
 tools/bpf/bpftool/main.c                      |  10 ++
 tools/bpf/bpftool/main.h                      |   1 +
 tools/bpf/bpftool/prog.c                      | 132 ++++++++++++++++++
 tools/include/uapi/linux/bpf.h                |   7 +
 tools/lib/bpf/bpf.c                           |  11 ++
 tools/lib/bpf/bpf.h                           |   1 +
 tools/lib/bpf/libbpf.c                        |  97 ++++++++++++-
 tools/lib/bpf/libbpf.map                      |   1 +
 tools/testing/selftests/bpf/Makefile          |   3 +-
 .../selftests/bpf/progs/metadata_unused.c     |  15 ++
 .../selftests/bpf/progs/metadata_used.c       |  15 ++
 .../selftests/bpf/test_bpftool_metadata.sh    |  82 +++++++++++
 21 files changed, 572 insertions(+), 39 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/metadata_unused.c
 create mode 100644 tools/testing/selftests/bpf/progs/metadata_used.c
 create mode 100755 tools/testing/selftests/bpf/test_bpftool_metadata.sh

-- 
2.28.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC PATCH bpf-next 1/5] bpf: RCU protect used_maps array and count
  2020-08-14 19:15 [RFC PATCH bpf-next 0/5] Allow storage of flexible metadata information for eBPF programs YiFei Zhu
@ 2020-08-14 19:15 ` YiFei Zhu
  2020-08-19  0:48   ` Alexei Starovoitov
  2020-08-14 19:15 ` [RFC PATCH bpf-next 2/5] bpf: Add BPF_PROG_ADD_MAP syscall YiFei Zhu
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 7+ messages in thread
From: YiFei Zhu @ 2020-08-14 19:15 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev,
	Mahesh Bandewar, YiFei Zhu

From: YiFei Zhu <zhuyifei@google.com>

To support modifying the used_maps array, we use RCU to protect use
of the counter and the array. A mutex serves as the write-side lock,
and it is initialized in the verifier, where the RCU-protected struct
is allocated.

Most uses are non-sleeping and very straightforward, holding only
rcu_read_lock. bpf_check_tail_call can be called for a cBPF program
(by bpf_prog_select_runtime via bpf_migrate_filter), so an extra check
is added for the case where the program did not pass through the
eBPF verifier and the used_maps pointer is NULL.

The Netronome driver allocates memory while reading the used_maps
array, so for simplicity it is made to hold the write-side lock.

Signed-off-by: YiFei Zhu <zhuyifei@google.com>
---
 .../net/ethernet/netronome/nfp/bpf/offload.c  | 33 +++++++++++------
 include/linux/bpf.h                           | 11 +++++-
 kernel/bpf/core.c                             | 25 ++++++++++---
 kernel/bpf/syscall.c                          | 19 ++++++++--
 kernel/bpf/verifier.c                         | 37 +++++++++++++------
 net/core/dev.c                                | 12 ++++--
 6 files changed, 100 insertions(+), 37 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/offload.c b/drivers/net/ethernet/netronome/nfp/bpf/offload.c
index ac02369174a9..74ed42b678e2 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/offload.c
@@ -111,34 +111,45 @@ static int
 nfp_map_ptrs_record(struct nfp_app_bpf *bpf, struct nfp_prog *nfp_prog,
 		    struct bpf_prog *prog)
 {
-	int i, cnt, err;
+	struct bpf_used_maps *used_maps;
+	int i, cnt, err = 0;
+
+	/* We are calling sleepable functions while reading used_maps array */
+	mutex_lock(&prog->aux->used_maps_mutex);
+
+	used_maps = rcu_dereference_protected(prog->aux->used_maps,
+			lockdep_is_held(&prog->aux->used_maps_mutex));
 
 	/* Quickly count the maps we will have to remember */
 	cnt = 0;
-	for (i = 0; i < prog->aux->used_map_cnt; i++)
-		if (bpf_map_offload_neutral(prog->aux->used_maps[i]))
+	for (i = 0; i < used_maps->cnt; i++)
+		if (bpf_map_offload_neutral(used_maps->arr[i]))
 			cnt++;
 	if (!cnt)
-		return 0;
+		goto out;
 
 	nfp_prog->map_records = kmalloc_array(cnt,
 					      sizeof(nfp_prog->map_records[0]),
 					      GFP_KERNEL);
-	if (!nfp_prog->map_records)
-		return -ENOMEM;
+	if (!nfp_prog->map_records) {
+		err = -ENOMEM;
+		goto out;
+	}
 
-	for (i = 0; i < prog->aux->used_map_cnt; i++)
-		if (bpf_map_offload_neutral(prog->aux->used_maps[i])) {
+	for (i = 0; i < used_maps->cnt; i++)
+		if (bpf_map_offload_neutral(used_maps->arr[i])) {
 			err = nfp_map_ptr_record(bpf, nfp_prog,
-						 prog->aux->used_maps[i]);
+						 used_maps->arr[i]);
 			if (err) {
 				nfp_map_ptrs_forget(bpf, nfp_prog);
-				return err;
+				goto out;
 			}
 		}
 	WARN_ON(cnt != nfp_prog->map_records_cnt);
 
-	return 0;
+out:
+	mutex_unlock(&prog->aux->used_maps_mutex);
+	return err;
 }
 
 static int
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index cef4ef0d2b4e..417189b4061d 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -689,9 +689,15 @@ struct bpf_ctx_arg_aux {
 	u32 btf_id;
 };
 
+/* rcu struct for prog used_maps */
+struct bpf_used_maps {
+	u32 cnt;
+	struct bpf_map **arr;
+	struct rcu_head rcu;
+};
+
 struct bpf_prog_aux {
 	atomic64_t refcnt;
-	u32 used_map_cnt;
 	u32 max_ctx_offset;
 	u32 max_pkt_offset;
 	u32 max_tp_access;
@@ -722,7 +728,8 @@ struct bpf_prog_aux {
 	u32 size_poke_tab;
 	struct bpf_ksym ksym;
 	const struct bpf_prog_ops *ops;
-	struct bpf_map **used_maps;
+	struct mutex used_maps_mutex; /* write-side mutex for used_maps */
+	struct bpf_used_maps __rcu *used_maps;
 	struct bpf_prog *prog;
 	struct user_struct *user;
 	u64 load_time; /* ns since boottime */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index bde93344164d..9766aa0337d9 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1746,11 +1746,16 @@ bool bpf_prog_array_compatible(struct bpf_array *array,
 
 static int bpf_check_tail_call(const struct bpf_prog *fp)
 {
-	struct bpf_prog_aux *aux = fp->aux;
+	const struct bpf_used_maps *used_maps;
 	int i;
 
-	for (i = 0; i < aux->used_map_cnt; i++) {
-		struct bpf_map *map = aux->used_maps[i];
+	rcu_read_lock();
+	used_maps = rcu_dereference(fp->aux->used_maps);
+	if (!used_maps)
+		goto out;
+
+	for (i = 0; i < used_maps->cnt; i++) {
+		struct bpf_map *map = used_maps->arr[i];
 		struct bpf_array *array;
 
 		if (map->map_type != BPF_MAP_TYPE_PROG_ARRAY)
@@ -1761,6 +1766,8 @@ static int bpf_check_tail_call(const struct bpf_prog *fp)
 			return -EINVAL;
 	}
 
+out:
+	rcu_read_unlock();
 	return 0;
 }
 
@@ -2113,8 +2120,16 @@ void __bpf_free_used_maps(struct bpf_prog_aux *aux,
 
 static void bpf_free_used_maps(struct bpf_prog_aux *aux)
 {
-	__bpf_free_used_maps(aux, aux->used_maps, aux->used_map_cnt);
-	kfree(aux->used_maps);
+	struct bpf_used_maps *used_maps = aux->used_maps;
+
+	if (!used_maps)
+		return;
+
+	__bpf_free_used_maps(aux, used_maps->arr, used_maps->cnt);
+	kfree(used_maps->arr);
+	kfree(used_maps);
+
+	mutex_destroy(&aux->used_maps_mutex);
 }
 
 static void bpf_prog_free_deferred(struct work_struct *work)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 2f343ce15747..3fde9dc4b595 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3149,11 +3149,15 @@ static const struct bpf_map *bpf_map_from_imm(const struct bpf_prog *prog,
 					      unsigned long addr, u32 *off,
 					      u32 *type)
 {
+	const struct bpf_used_maps *used_maps;
 	const struct bpf_map *map;
 	int i;
 
-	for (i = 0, *off = 0; i < prog->aux->used_map_cnt; i++) {
-		map = prog->aux->used_maps[i];
+	rcu_read_lock();
+	used_maps = rcu_dereference(prog->aux->used_maps);
+
+	for (i = 0, *off = 0; i < used_maps->cnt; i++) {
+		map = used_maps->arr[i];
 		if (map == (void *)addr) {
 			*type = BPF_PSEUDO_MAP_FD;
 			return map;
@@ -3166,6 +3170,7 @@ static const struct bpf_map *bpf_map_from_imm(const struct bpf_prog *prog,
 		}
 	}
 
+	rcu_read_unlock();
 	return NULL;
 }
 
@@ -3263,6 +3268,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
 	struct bpf_prog_stats stats;
 	char __user *uinsns;
 	u32 ulen;
+	const struct bpf_used_maps *used_maps;
 	int err;
 
 	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
@@ -3284,19 +3290,24 @@ static int bpf_prog_get_info_by_fd(struct file *file,
 	memcpy(info.tag, prog->tag, sizeof(prog->tag));
 	memcpy(info.name, prog->aux->name, sizeof(prog->aux->name));
 
+	rcu_read_lock();
+	used_maps = rcu_dereference(prog->aux->used_maps);
+
 	ulen = info.nr_map_ids;
-	info.nr_map_ids = prog->aux->used_map_cnt;
+	info.nr_map_ids = used_maps->cnt;
 	ulen = min_t(u32, info.nr_map_ids, ulen);
 	if (ulen) {
 		u32 __user *user_map_ids = u64_to_user_ptr(info.map_ids);
 		u32 i;
 
 		for (i = 0; i < ulen; i++)
-			if (put_user(prog->aux->used_maps[i]->id,
+			if (put_user(used_maps->arr[i]->id,
 				     &user_map_ids[i]))
 				return -EFAULT;
 	}
 
+	rcu_read_unlock();
+
 	err = set_info_rec_size(&info);
 	if (err)
 		return err;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b6ccfce3bf4c..9a6ca16667c7 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11232,25 +11232,38 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 		goto err_release_maps;
 	}
 
-	if (ret == 0 && env->used_map_cnt) {
+	if (ret == 0) {
 		/* if program passed verifier, update used_maps in bpf_prog_info */
-		env->prog->aux->used_maps = kmalloc_array(env->used_map_cnt,
-							  sizeof(env->used_maps[0]),
+		struct bpf_used_maps *used_maps = kzalloc(sizeof(*used_maps),
 							  GFP_KERNEL);
-
-		if (!env->prog->aux->used_maps) {
+		if (!used_maps) {
 			ret = -ENOMEM;
 			goto err_release_maps;
 		}
 
-		memcpy(env->prog->aux->used_maps, env->used_maps,
-		       sizeof(env->used_maps[0]) * env->used_map_cnt);
-		env->prog->aux->used_map_cnt = env->used_map_cnt;
+		if (env->used_map_cnt) {
+			used_maps->arr = kmalloc_array(env->used_map_cnt,
+						       sizeof(env->used_maps[0]),
+						       GFP_KERNEL);
 
-		/* program is valid. Convert pseudo bpf_ld_imm64 into generic
-		 * bpf_ld_imm64 instructions
-		 */
-		convert_pseudo_ld_imm64(env);
+			if (!used_maps->arr) {
+				kfree(used_maps);
+				ret = -ENOMEM;
+				goto err_release_maps;
+			}
+
+			memcpy(used_maps->arr, env->used_maps,
+			       sizeof(env->used_maps[0]) * env->used_map_cnt);
+			used_maps->cnt = env->used_map_cnt;
+
+			/* program is valid. Convert pseudo bpf_ld_imm64 into generic
+			 * bpf_ld_imm64 instructions
+			 */
+			convert_pseudo_ld_imm64(env);
+		}
+
+		env->prog->aux->used_maps = used_maps;
+		mutex_init(&env->prog->aux->used_maps_mutex);
 	}
 
 	if (ret == 0)
diff --git a/net/core/dev.c b/net/core/dev.c
index 7df6c9617321..bebbf8abd9a7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5439,17 +5439,23 @@ static int generic_xdp_install(struct net_device *dev, struct netdev_bpf *xdp)
 	int ret = 0;
 
 	if (new) {
+		const struct bpf_used_maps *used_maps;
 		u32 i;
 
+		rcu_read_lock();
+		used_maps = rcu_dereference(new->aux->used_maps);
+
 		/* generic XDP does not work with DEVMAPs that can
 		 * have a bpf_prog installed on an entry
 		 */
-		for (i = 0; i < new->aux->used_map_cnt; i++) {
-			if (dev_map_can_have_prog(new->aux->used_maps[i]))
+		for (i = 0; i < used_maps->cnt; i++) {
+			if (dev_map_can_have_prog(used_maps->arr[i]))
 				return -EINVAL;
-			if (cpu_map_prog_allowed(new->aux->used_maps[i]))
+			if (cpu_map_prog_allowed(used_maps->arr[i]))
 				return -EINVAL;
 		}
+
+		rcu_read_unlock();
 	}
 
 	switch (xdp->command) {
-- 
2.28.0



* [RFC PATCH bpf-next 2/5] bpf: Add BPF_PROG_ADD_MAP syscall
  2020-08-14 19:15 [RFC PATCH bpf-next 0/5] Allow storage of flexible metadata information for eBPF programs YiFei Zhu
  2020-08-14 19:15 ` [RFC PATCH bpf-next 1/5] bpf: RCU protect used_maps array and count YiFei Zhu
@ 2020-08-14 19:15 ` YiFei Zhu
  2020-08-14 19:15 ` [RFC PATCH bpf-next 3/5] libbpf: Add BPF_PROG_ADD_MAP syscall and use it on .metadata section YiFei Zhu
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: YiFei Zhu @ 2020-08-14 19:15 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev,
	Mahesh Bandewar, YiFei Zhu

From: YiFei Zhu <zhuyifei@google.com>

This syscall attaches a map to a program, returning -EEXIST if the
map is already attached to the program. call_rcu is used to free the
old used_maps struct after an RCU grace period.

Signed-off-by: YiFei Zhu <zhuyifei@google.com>
---
 include/uapi/linux/bpf.h       |  7 +++
 kernel/bpf/syscall.c           | 83 ++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h |  7 +++
 3 files changed, 97 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index b134e679e9db..01b32036a0f5 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -118,6 +118,7 @@ enum bpf_cmd {
 	BPF_ENABLE_STATS,
 	BPF_ITER_CREATE,
 	BPF_LINK_DETACH,
+	BPF_PROG_ADD_MAP,
 };
 
 enum bpf_map_type {
@@ -648,6 +649,12 @@ union bpf_attr {
 		__u32		flags;
 	} iter_create;
 
+	struct { /* struct used by BPF_PROG_ADD_MAP command */
+		__u32		prog_fd;
+		__u32		map_fd;
+		__u32		flags;		/* extra flags */
+	} prog_add_map;
+
 } __attribute__((aligned(8)));
 
 /* The description below is an attempt at providing documentation to eBPF
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 3fde9dc4b595..0564a4291184 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -4144,6 +4144,86 @@ static int bpf_iter_create(union bpf_attr *attr)
 	return err;
 }
 
+#define BPF_PROG_ADD_MAP_LAST_FIELD prog_add_map.flags
+
+static void __bpf_free_used_maps_rcu(struct rcu_head *rcu)
+{
+	struct bpf_used_maps *used_maps = container_of(rcu, struct bpf_used_maps, rcu);
+
+	kfree(used_maps->arr);
+	kfree(used_maps);
+}
+
+static int bpf_prog_add_map(union bpf_attr *attr)
+{
+	struct bpf_prog *prog;
+	struct bpf_map *map;
+	struct bpf_used_maps *used_maps_old, *used_maps_new;
+	int i, ret = 0;
+
+	if (CHECK_ATTR(BPF_PROG_ADD_MAP))
+		return -EINVAL;
+
+	if (attr->prog_add_map.flags)
+		return -EINVAL;
+
+	prog = bpf_prog_get(attr->prog_add_map.prog_fd);
+	if (IS_ERR(prog))
+		return PTR_ERR(prog);
+
+	map = bpf_map_get(attr->prog_add_map.map_fd);
+	if (IS_ERR(map)) {
+		ret = PTR_ERR(map);
+		goto out_prog_put;
+	}
+
+	used_maps_new = kzalloc(sizeof(*used_maps_new), GFP_KERNEL);
+	if (!used_maps_new) {
+		ret = -ENOMEM;
+		goto out_map_put;
+	}
+
+	mutex_lock(&prog->aux->used_maps_mutex);
+
+	used_maps_old = rcu_dereference_protected(prog->aux->used_maps,
+				lockdep_is_held(&prog->aux->used_maps_mutex));
+
+	for (i = 0; i < used_maps_old->cnt; i++)
+		if (used_maps_old->arr[i] == map) {
+			ret = -EEXIST;
+			goto out_unlock;
+		}
+
+	used_maps_new->cnt = used_maps_old->cnt + 1;
+	used_maps_new->arr = kmalloc_array(used_maps_new->cnt,
+					   sizeof(used_maps_new->arr[0]),
+					   GFP_KERNEL);
+	if (!used_maps_new->arr) {
+		ret = -ENOMEM;
+		goto out_unlock;
+	}
+
+	memcpy(used_maps_new->arr, used_maps_old->arr,
+	       sizeof(used_maps_old->arr[0]) * used_maps_old->cnt);
+	used_maps_new->arr[used_maps_old->cnt] = map;
+
+	rcu_assign_pointer(prog->aux->used_maps, used_maps_new);
+	call_rcu(&used_maps_old->rcu, __bpf_free_used_maps_rcu);
+
+out_unlock:
+	mutex_unlock(&prog->aux->used_maps_mutex);
+
+	if (ret)
+		kfree(used_maps_new);
+
+out_map_put:
+	if (ret)
+		bpf_map_put(map);
+out_prog_put:
+	bpf_prog_put(prog);
+	return ret;
+}
+
 SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
 {
 	union bpf_attr attr;
@@ -4277,6 +4357,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	case BPF_LINK_DETACH:
 		err = link_detach(&attr);
 		break;
+	case BPF_PROG_ADD_MAP:
+		err = bpf_prog_add_map(&attr);
+		break;
 	default:
 		err = -EINVAL;
 		break;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index b134e679e9db..01b32036a0f5 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -118,6 +118,7 @@ enum bpf_cmd {
 	BPF_ENABLE_STATS,
 	BPF_ITER_CREATE,
 	BPF_LINK_DETACH,
+	BPF_PROG_ADD_MAP,
 };
 
 enum bpf_map_type {
@@ -648,6 +649,12 @@ union bpf_attr {
 		__u32		flags;
 	} iter_create;
 
+	struct { /* struct used by BPF_PROG_ADD_MAP command */
+		__u32		prog_fd;
+		__u32		map_fd;
+		__u32		flags;		/* extra flags */
+	} prog_add_map;
+
 } __attribute__((aligned(8)));
 
 /* The description below is an attempt at providing documentation to eBPF
-- 
2.28.0



* [RFC PATCH bpf-next 3/5] libbpf: Add BPF_PROG_ADD_MAP syscall and use it on .metadata section
  2020-08-14 19:15 [RFC PATCH bpf-next 0/5] Allow storage of flexible metadata information for eBPF programs YiFei Zhu
  2020-08-14 19:15 ` [RFC PATCH bpf-next 1/5] bpf: RCU protect used_maps array and count YiFei Zhu
  2020-08-14 19:15 ` [RFC PATCH bpf-next 2/5] bpf: Add BPF_PROG_ADD_MAP syscall YiFei Zhu
@ 2020-08-14 19:15 ` YiFei Zhu
  2020-08-14 19:15 ` [RFC PATCH bpf-next 4/5] bpftool: support dumping metadata YiFei Zhu
  2020-08-14 19:15 ` [RFC PATCH bpf-next 5/5] selftests/bpf: Test bpftool loading and " YiFei Zhu
  4 siblings, 0 replies; 7+ messages in thread
From: YiFei Zhu @ 2020-08-14 19:15 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev,
	Mahesh Bandewar, YiFei Zhu

From: YiFei Zhu <zhuyifei@google.com>

This patch adds a simple wrapper, bpf_prog_add_map, around the syscall.
When libbpf loads a program, it probes the kernel for support of this
syscall, scans for a .metadata ELF section, and loads that section as
an internal map, like a .data section.

If the kernel supports the BPF_PROG_ADD_MAP syscall and a .metadata
section exists, the map is explicitly attached to the program via the
syscall immediately after the program is loaded. -EEXIST from the
syscall is ignored.

Signed-off-by: YiFei Zhu <zhuyifei@google.com>
---
 tools/lib/bpf/bpf.c      | 11 +++++
 tools/lib/bpf/bpf.h      |  1 +
 tools/lib/bpf/libbpf.c   | 97 +++++++++++++++++++++++++++++++++++++++-
 tools/lib/bpf/libbpf.map |  1 +
 4 files changed, 109 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index eab14c97c15d..9b2173d4f92c 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -872,3 +872,14 @@ int bpf_enable_stats(enum bpf_stats_type type)
 
 	return sys_bpf(BPF_ENABLE_STATS, &attr, sizeof(attr));
 }
+
+int bpf_prog_add_map(int prog_fd, int map_fd, int flags)
+{
+	union bpf_attr attr = {};
+
+	attr.prog_add_map.prog_fd = prog_fd;
+	attr.prog_add_map.map_fd = map_fd;
+	attr.prog_add_map.flags = flags;
+
+	return sys_bpf(BPF_PROG_ADD_MAP, &attr, sizeof(attr));
+}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 28855fd5b5f4..d76fee8f84e0 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -240,6 +240,7 @@ LIBBPF_API int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf,
 enum bpf_stats_type; /* defined in up-to-date linux/bpf.h */
 LIBBPF_API int bpf_enable_stats(enum bpf_stats_type type);
 
+LIBBPF_API int bpf_prog_add_map(int prog_fd, int map_fd, int flags);
 #ifdef __cplusplus
 } /* extern "C" */
 #endif
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 7be04e45d29c..3ab1cb1f2af3 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -180,6 +180,8 @@ struct bpf_capabilities {
 	__u32 btf_func_global:1;
 	/* kernel support for expected_attach_type in BPF_PROG_LOAD */
 	__u32 exp_attach_type:1;
+	/* kernel support for BPF_PROG_ADD_MAP */
+	__u32 prog_add_map:1;
 };
 
 enum reloc_type {
@@ -288,6 +290,7 @@ struct bpf_struct_ops {
 #define KCONFIG_SEC ".kconfig"
 #define KSYMS_SEC ".ksyms"
 #define STRUCT_OPS_SEC ".struct_ops"
+#define METADATA_SEC ".metadata"
 
 enum libbpf_map_type {
 	LIBBPF_MAP_UNSPEC,
@@ -295,6 +298,7 @@ enum libbpf_map_type {
 	LIBBPF_MAP_BSS,
 	LIBBPF_MAP_RODATA,
 	LIBBPF_MAP_KCONFIG,
+	LIBBPF_MAP_METADATA,
 };
 
 static const char * const libbpf_type_to_btf_name[] = {
@@ -302,6 +306,7 @@ static const char * const libbpf_type_to_btf_name[] = {
 	[LIBBPF_MAP_BSS]	= BSS_SEC,
 	[LIBBPF_MAP_RODATA]	= RODATA_SEC,
 	[LIBBPF_MAP_KCONFIG]	= KCONFIG_SEC,
+	[LIBBPF_MAP_METADATA]	= METADATA_SEC,
 };
 
 struct bpf_map {
@@ -380,6 +385,8 @@ struct bpf_object {
 	size_t nr_maps;
 	size_t maps_cap;
 
+	struct bpf_map *metadata_map;
+
 	char *kconfig;
 	struct extern_desc *externs;
 	int nr_extern;
@@ -403,6 +410,7 @@ struct bpf_object {
 		Elf_Data *rodata;
 		Elf_Data *bss;
 		Elf_Data *st_ops_data;
+		Elf_Data *metadata;
 		size_t strtabidx;
 		struct {
 			GElf_Shdr shdr;
@@ -418,6 +426,7 @@ struct bpf_object {
 		int rodata_shndx;
 		int bss_shndx;
 		int st_ops_shndx;
+		int metadata_shndx;
 	} efile;
 	/*
 	 * All loaded bpf_object is linked in a list, which is
@@ -1030,6 +1039,7 @@ static struct bpf_object *bpf_object__new(const char *path,
 	obj->efile.obj_buf_sz = obj_buf_sz;
 	obj->efile.maps_shndx = -1;
 	obj->efile.btf_maps_shndx = -1;
+	obj->efile.metadata_shndx = -1;
 	obj->efile.data_shndx = -1;
 	obj->efile.rodata_shndx = -1;
 	obj->efile.bss_shndx = -1;
@@ -1391,6 +1401,9 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
 	if (data)
 		memcpy(map->mmaped, data, data_sz);
 
+	if (type == LIBBPF_MAP_METADATA)
+		obj->metadata_map = map;
+
 	pr_debug("map %td is \"%s\"\n", map - obj->maps, map->name);
 	return 0;
 }
@@ -1426,6 +1439,14 @@ static int bpf_object__init_global_data_maps(struct bpf_object *obj)
 		if (err)
 			return err;
 	}
+	if (obj->efile.metadata_shndx >= 0) {
+		err = bpf_object__init_internal_map(obj, LIBBPF_MAP_METADATA,
+						    obj->efile.metadata_shndx,
+						    obj->efile.metadata->d_buf,
+						    obj->efile.metadata->d_size);
+		if (err)
+			return err;
+	}
 	return 0;
 }
 
@@ -2665,6 +2686,9 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 			} else if (strcmp(name, STRUCT_OPS_SEC) == 0) {
 				obj->efile.st_ops_data = data;
 				obj->efile.st_ops_shndx = idx;
+			} else if (strcmp(name, METADATA_SEC) == 0) {
+				obj->efile.metadata = data;
+				obj->efile.metadata_shndx = idx;
 			} else {
 				pr_debug("skip section(%d) %s\n", idx, name);
 			}
@@ -3078,7 +3102,8 @@ static bool bpf_object__shndx_is_data(const struct bpf_object *obj,
 {
 	return shndx == obj->efile.data_shndx ||
 	       shndx == obj->efile.bss_shndx ||
-	       shndx == obj->efile.rodata_shndx;
+	       shndx == obj->efile.rodata_shndx ||
+	       shndx == obj->efile.metadata_shndx;
 }
 
 static bool bpf_object__shndx_is_maps(const struct bpf_object *obj,
@@ -3099,6 +3124,8 @@ bpf_object__section_to_libbpf_map_type(const struct bpf_object *obj, int shndx)
 		return LIBBPF_MAP_RODATA;
 	else if (shndx == obj->efile.symbols_shndx)
 		return LIBBPF_MAP_KCONFIG;
+	else if (shndx == obj->efile.metadata_shndx)
+		return LIBBPF_MAP_METADATA;
 	else
 		return LIBBPF_MAP_UNSPEC;
 }
@@ -3633,6 +3660,59 @@ bpf_object__probe_exp_attach_type(struct bpf_object *obj)
 	return 0;
 }
 
+static int
+bpf_object__probe_prog_add_map(struct bpf_object *obj)
+{
+	struct bpf_load_program_attr prog_attr;
+	struct bpf_create_map_attr map_attr;
+	char *cp, errmsg[STRERR_BUFSIZE];
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	int prog, map;
+
+	if (!obj->caps.global_data)
+		return 0;
+
+	memset(&map_attr, 0, sizeof(map_attr));
+	map_attr.map_type = BPF_MAP_TYPE_ARRAY;
+	map_attr.key_size = sizeof(int);
+	map_attr.value_size = 32;
+	map_attr.max_entries = 1;
+
+	map = bpf_create_map_xattr(&map_attr);
+	if (map < 0) {
+		cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg));
+		pr_warn("Error in %s():%s(%d). Couldn't create simple array map.\n",
+			__func__, cp, errno);
+		return -errno;
+	}
+
+	memset(&prog_attr, 0, sizeof(prog_attr));
+	prog_attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
+	prog_attr.insns = insns;
+	prog_attr.insns_cnt = ARRAY_SIZE(insns);
+	prog_attr.license = "GPL";
+
+	prog = bpf_load_program_xattr(&prog_attr, NULL, 0);
+	if (prog < 0) {
+		cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg));
+		pr_warn("Error in %s():%s(%d). Couldn't create simple program.\n",
+			__func__, cp, errno);
+
+		close(map);
+		return -errno;
+	}
+
+	if (!bpf_prog_add_map(prog, map, 0))
+		obj->caps.prog_add_map = 1;
+
+	close(map);
+	close(prog);
+	return 0;
+}
+
 static int
 bpf_object__probe_caps(struct bpf_object *obj)
 {
@@ -3644,6 +3724,7 @@ bpf_object__probe_caps(struct bpf_object *obj)
 		bpf_object__probe_btf_datasec,
 		bpf_object__probe_array_mmap,
 		bpf_object__probe_exp_attach_type,
+		bpf_object__probe_prog_add_map,
 	};
 	int i, ret;
 
@@ -5404,6 +5485,20 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
 	if (ret >= 0) {
 		if (log_buf && load_attr.log_level)
 			pr_debug("verifier log:\n%s", log_buf);
+
+		if (prog->obj->metadata_map && prog->obj->caps.prog_add_map) {
+			if (bpf_prog_add_map(ret, bpf_map__fd(prog->obj->metadata_map), 0) &&
+			    errno != EEXIST) {
+				int fd = ret;
+
+				ret = -errno;
+				cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg));
+				pr_warn("add metadata map failed: %s\n", cp);
+				close(fd);
+				goto out;
+			}
+		}
+
 		*pfd = ret;
 		ret = 0;
 		goto out;
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 0c4722bfdd0a..86494a464c34 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -288,6 +288,7 @@ LIBBPF_0.1.0 {
 		bpf_map__set_value_size;
 		bpf_map__type;
 		bpf_map__value_size;
+		bpf_prog_add_map;
 		bpf_program__attach_xdp;
 		bpf_program__autoload;
 		bpf_program__is_sk_lookup;
-- 
2.28.0



* [RFC PATCH bpf-next 4/5] bpftool: support dumping metadata
  2020-08-14 19:15 [RFC PATCH bpf-next 0/5] Allow storage of flexible metadata information for eBPF programs YiFei Zhu
                   ` (2 preceding siblings ...)
  2020-08-14 19:15 ` [RFC PATCH bpf-next 3/5] libbpf: Add BPF_PROG_ADD_MAP syscall and use it on .metadata section YiFei Zhu
@ 2020-08-14 19:15 ` YiFei Zhu
  2020-08-14 19:15 ` [RFC PATCH bpf-next 5/5] selftests/bpf: Test bpftool loading and " YiFei Zhu
  4 siblings, 0 replies; 7+ messages in thread
From: YiFei Zhu @ 2020-08-14 19:15 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev,
	Mahesh Bandewar, YiFei Zhu

From: YiFei Zhu <zhuyifei@google.com>

Add a "--metadata" flag to `bpftool prog list` to dump the metadata
contents. For formatting, some BTF code is used directly in the
metadata dumping. Sanity checks are performed on the map and on the
kind of the btf_type to make sure we are actually dumping what we
expect.

A helper, jsonw_reset, is added to the JSON writer so we can reuse the
same json writer without emitting extraneous commas.

Sample output:

  $ bpftool prog --metadata
  6: cgroup_skb  name prog  tag bcf7977d3b93787c  gpl
  [...]
  	btf_id 4
  	metadata:
  		metadata_a = "foo"
  		metadata_b = 1

  $ bpftool prog --metadata --json --pretty
  [{
          "id": 6,
  [...]
          "btf_id": 4,
          "metadata": {
              "metadata_a": "foo",
              "metadata_b": 1
          }
      }
  ]

Signed-off-by: YiFei Zhu <zhuyifei@google.com>
---
 tools/bpf/bpftool/json_writer.c |   6 ++
 tools/bpf/bpftool/json_writer.h |   3 +
 tools/bpf/bpftool/main.c        |  10 +++
 tools/bpf/bpftool/main.h        |   1 +
 tools/bpf/bpftool/prog.c        | 132 ++++++++++++++++++++++++++++++++
 5 files changed, 152 insertions(+)

diff --git a/tools/bpf/bpftool/json_writer.c b/tools/bpf/bpftool/json_writer.c
index 86501cd3c763..7fea83bedf48 100644
--- a/tools/bpf/bpftool/json_writer.c
+++ b/tools/bpf/bpftool/json_writer.c
@@ -119,6 +119,12 @@ void jsonw_pretty(json_writer_t *self, bool on)
 	self->pretty = on;
 }
 
+void jsonw_reset(json_writer_t *self)
+{
+	assert(self->depth == 0);
+	self->sep = '\0';
+}
+
 /* Basic blocks */
 static void jsonw_begin(json_writer_t *self, int c)
 {
diff --git a/tools/bpf/bpftool/json_writer.h b/tools/bpf/bpftool/json_writer.h
index 35cf1f00f96c..8ace65cdb92f 100644
--- a/tools/bpf/bpftool/json_writer.h
+++ b/tools/bpf/bpftool/json_writer.h
@@ -27,6 +27,9 @@ void jsonw_destroy(json_writer_t **self_p);
 /* Cause output to have pretty whitespace */
 void jsonw_pretty(json_writer_t *self, bool on);
 
+/* Reset separator to create new JSON */
+void jsonw_reset(json_writer_t *self);
+
 /* Add property name */
 void jsonw_name(json_writer_t *self, const char *name);
 
diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
index 4a191fcbeb82..a681d568cfa7 100644
--- a/tools/bpf/bpftool/main.c
+++ b/tools/bpf/bpftool/main.c
@@ -28,6 +28,7 @@ bool show_pinned;
 bool block_mount;
 bool verifier_logs;
 bool relaxed_maps;
+bool dump_metadata;
 struct pinned_obj_table prog_table;
 struct pinned_obj_table map_table;
 struct pinned_obj_table link_table;
@@ -351,6 +352,10 @@ static int do_batch(int argc, char **argv)
 	return err;
 }
 
+enum bpftool_longonly_opts {
+	OPT_METADATA = 256,
+};
+
 int main(int argc, char **argv)
 {
 	static const struct option options[] = {
@@ -362,6 +367,7 @@ int main(int argc, char **argv)
 		{ "mapcompat",	no_argument,	NULL,	'm' },
 		{ "nomount",	no_argument,	NULL,	'n' },
 		{ "debug",	no_argument,	NULL,	'd' },
+		{ "metadata",	no_argument,	NULL,	OPT_METADATA },
 		{ 0 }
 	};
 	int opt, ret;
@@ -371,6 +377,7 @@ int main(int argc, char **argv)
 	json_output = false;
 	show_pinned = false;
 	block_mount = false;
+	dump_metadata = false;
 	bin_name = argv[0];
 
 	hash_init(prog_table.table);
@@ -412,6 +419,9 @@ int main(int argc, char **argv)
 			libbpf_set_print(print_all_levels);
 			verifier_logs = true;
 			break;
+		case OPT_METADATA:
+			dump_metadata = true;
+			break;
 		default:
 			p_err("unrecognized option '%s'", argv[optind - 1]);
 			if (json_output)
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index e3a79b5a9960..54adfda5a9c9 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -82,6 +82,7 @@ extern bool show_pids;
 extern bool block_mount;
 extern bool verifier_logs;
 extern bool relaxed_maps;
+extern bool dump_metadata;
 extern struct pinned_obj_table prog_table;
 extern struct pinned_obj_table map_table;
 extern struct pinned_obj_table link_table;
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index 158995d853b0..9f803a84d132 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -151,6 +151,132 @@ static void show_prog_maps(int fd, __u32 num_maps)
 	}
 }
 
+static void show_prog_metadata(int fd, __u32 num_maps)
+{
+	struct bpf_prog_info prog_info = {};
+	struct bpf_map_info map_info = {};
+	__u32 prog_info_len = sizeof(prog_info);
+	__u32 map_info_len = sizeof(map_info);
+	__u32 map_ids[num_maps];
+	void *value = NULL;
+	struct btf *btf = NULL;
+	const struct btf_type *t_datasec, *t_var;
+	struct btf_var_secinfo *vsi;
+	int key = 0;
+	unsigned int i, vlen;
+	int map_fd;
+	int err;
+
+	prog_info.nr_map_ids = num_maps;
+	prog_info.map_ids = ptr_to_u64(map_ids);
+
+	err = bpf_obj_get_info_by_fd(fd, &prog_info, &prog_info_len);
+	if (err || !prog_info.nr_map_ids)
+		return;
+
+	for (i = 0; i < prog_info.nr_map_ids; i++) {
+		map_fd = bpf_map_get_fd_by_id(map_ids[i]);
+		if (map_fd < 0)
+			return;
+
+		err = bpf_obj_get_info_by_fd(map_fd, &map_info, &map_info_len);
+		if (err)
+			goto out_close;
+
+		if (map_info.type != BPF_MAP_TYPE_ARRAY)
+			continue;
+		if (map_info.key_size != sizeof(int))
+			continue;
+		if (map_info.max_entries != 1)
+			continue;
+		if (!map_info.btf_value_type_id)
+			continue;
+		if (!strstr(map_info.name, ".metadata"))
+			continue;
+
+		goto found;
+	}
+
+	goto out_close;
+
+found:
+	value = malloc(map_info.value_size);
+	if (!value)
+		goto out_close;
+
+	if (bpf_map_lookup_elem(map_fd, &key, value))
+		goto out_free;
+
+	err = btf__get_from_id(map_info.btf_id, &btf);
+	if (err || !btf)
+		goto out_free;
+
+	t_datasec = btf__type_by_id(btf, map_info.btf_value_type_id);
+	if (BTF_INFO_KIND(t_datasec->info) != BTF_KIND_DATASEC)
+		goto out_free;
+
+	vlen = BTF_INFO_VLEN(t_datasec->info);
+	vsi = (struct btf_var_secinfo *)(t_datasec + 1);
+
+	if (json_output) {
+		struct btf_dumper d = {
+			.btf = btf,
+			.jw = json_wtr,
+			.is_plain_text = false,
+		};
+
+		jsonw_name(json_wtr, "metadata");
+
+		jsonw_start_object(json_wtr);
+		for (i = 0; i < vlen; i++) {
+			t_var = btf__type_by_id(btf, vsi[i].type);
+
+			if (BTF_INFO_KIND(t_var->info) != BTF_KIND_VAR)
+				continue;
+
+			jsonw_name(json_wtr, btf__name_by_offset(btf, t_var->name_off));
+			err = btf_dumper_type(&d, t_var->type, value + vsi[i].offset);
+			if (err)
+				break;
+		}
+		jsonw_end_object(json_wtr);
+	} else {
+		json_writer_t *btf_wtr = jsonw_new(stdout);
+		struct btf_dumper d = {
+			.btf = btf,
+			.jw = btf_wtr,
+			.is_plain_text = true,
+		};
+		if (!btf_wtr)
+			goto out_free;
+
+		printf("\tmetadata:");
+
+		for (i = 0; i < vlen; i++) {
+			t_var = btf__type_by_id(btf, vsi[i].type);
+
+			if (BTF_INFO_KIND(t_var->info) != BTF_KIND_VAR)
+				continue;
+
+			printf("\n\t\t%s = ", btf__name_by_offset(btf, t_var->name_off));
+
+			jsonw_reset(btf_wtr);
+			err = btf_dumper_type(&d, t_var->type, value + vsi[i].offset);
+			if (err)
+				break;
+		}
+
+		jsonw_destroy(&btf_wtr);
+	}
+
+out_free:
+	btf__free(btf);
+	free(value);
+
+out_close:
+	close(map_fd);
+}
+
 static void print_prog_header_json(struct bpf_prog_info *info)
 {
 	jsonw_uint_field(json_wtr, "id", info->id);
@@ -228,6 +354,9 @@ static void print_prog_json(struct bpf_prog_info *info, int fd)
 
 	emit_obj_refs_json(&refs_table, info->id, json_wtr);
 
+	if (dump_metadata)
+		show_prog_metadata(fd, info->nr_map_ids);
+
 	jsonw_end_object(json_wtr);
 }
 
@@ -297,6 +426,9 @@ static void print_prog_plain(struct bpf_prog_info *info, int fd)
 	emit_obj_refs_plain(&refs_table, info->id, "\n\tpids ");
 
 	printf("\n");
+
+	if (dump_metadata)
+		show_prog_metadata(fd, info->nr_map_ids);
 }
 
 static int show_prog(int fd)
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH bpf-next 5/5] selftests/bpf: Test bpftool loading and dumping metadata
  2020-08-14 19:15 [RFC PATCH bpf-next 0/5] Allow storage of flexible metadata information for eBPF programs YiFei Zhu
                   ` (3 preceding siblings ...)
  2020-08-14 19:15 ` [RFC PATCH bpf-next 4/5] bpftool: support dumping metadata YiFei Zhu
@ 2020-08-14 19:15 ` YiFei Zhu
  4 siblings, 0 replies; 7+ messages in thread
From: YiFei Zhu @ 2020-08-14 19:15 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev,
	Mahesh Bandewar, YiFei Zhu

From: YiFei Zhu <zhuyifei@google.com>

This is a simple test to check that loading and dumping metadata
works, whether or not metadata contents are used by the program.

Signed-off-by: YiFei Zhu <zhuyifei@google.com>
---
 tools/testing/selftests/bpf/Makefile          |  3 +-
 .../selftests/bpf/progs/metadata_unused.c     | 15 ++++
 .../selftests/bpf/progs/metadata_used.c       | 15 ++++
 .../selftests/bpf/test_bpftool_metadata.sh    | 82 +++++++++++++++++++
 4 files changed, 114 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/progs/metadata_unused.c
 create mode 100644 tools/testing/selftests/bpf/progs/metadata_used.c
 create mode 100755 tools/testing/selftests/bpf/test_bpftool_metadata.sh

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index e7a8cf83ba48..5f559c79a977 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -68,7 +68,8 @@ TEST_PROGS := test_kmod.sh \
 	test_tc_edt.sh \
 	test_xdping.sh \
 	test_bpftool_build.sh \
-	test_bpftool.sh
+	test_bpftool.sh \
+	test_bpftool_metadata.sh
 
 TEST_PROGS_EXTENDED := with_addr.sh \
 	with_tunnels.sh \
diff --git a/tools/testing/selftests/bpf/progs/metadata_unused.c b/tools/testing/selftests/bpf/progs/metadata_unused.c
new file mode 100644
index 000000000000..523b3c332426
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/metadata_unused.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+char metadata_a[] SEC(".metadata") = "foo";
+int metadata_b SEC(".metadata") = 1;
+
+SEC("cgroup_skb/egress")
+int prog(struct __sk_buff *ctx)
+{
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/metadata_used.c b/tools/testing/selftests/bpf/progs/metadata_used.c
new file mode 100644
index 000000000000..59785404f7bb
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/metadata_used.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+char metadata_a[] SEC(".metadata") = "bar";
+int metadata_b SEC(".metadata") = 2;
+
+SEC("cgroup_skb/egress")
+int prog(struct __sk_buff *ctx)
+{
+	return metadata_b ? 1 : 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_bpftool_metadata.sh b/tools/testing/selftests/bpf/test_bpftool_metadata.sh
new file mode 100755
index 000000000000..a7515c09dc2d
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_bpftool_metadata.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+TESTNAME=bpftool_metadata
+BPF_FS=$(awk '$3 == "bpf" {print $2; exit}' /proc/mounts)
+BPF_DIR=$BPF_FS/test_$TESTNAME
+
+_cleanup()
+{
+	set +e
+	rm -rf $BPF_DIR 2> /dev/null
+}
+
+cleanup_skip()
+{
+	echo "selftests: $TESTNAME [SKIP]"
+	_cleanup
+
+	exit $ksft_skip
+}
+
+cleanup()
+{
+	if [ "$?" = 0 ]; then
+		echo "selftests: $TESTNAME [PASS]"
+	else
+		echo "selftests: $TESTNAME [FAILED]"
+	fi
+	_cleanup
+}
+
+if [ $(id -u) -ne 0 ]; then
+	echo "selftests: $TESTNAME [SKIP] Need root privileges"
+	exit $ksft_skip
+fi
+
+if [ -z "$BPF_FS" ]; then
+	echo "selftests: $TESTNAME [SKIP] Could not run test without bpffs mounted"
+	exit $ksft_skip
+fi
+
+if ! bpftool version > /dev/null 2>&1; then
+	echo "selftests: $TESTNAME [SKIP] Could not run test without bpftool"
+	exit $ksft_skip
+fi
+
+set -e
+
+trap cleanup_skip EXIT
+
+mkdir $BPF_DIR
+
+trap cleanup EXIT
+
+bpftool prog load metadata_unused.o $BPF_DIR/unused
+
+METADATA_PLAIN="$(bpftool prog --metadata)"
+echo "$METADATA_PLAIN" | grep 'metadata_a = "foo"' > /dev/null
+echo "$METADATA_PLAIN" | grep 'metadata_b = 1' > /dev/null
+
+bpftool prog --metadata --json | grep '"metadata":{"metadata_a":"foo","metadata_b":1}' > /dev/null
+
+bpftool map | grep 'metada.metadata' > /dev/null
+
+rm $BPF_DIR/unused
+
+bpftool prog load metadata_used.o $BPF_DIR/used
+
+METADATA_PLAIN="$(bpftool prog --metadata)"
+echo "$METADATA_PLAIN" | grep 'metadata_a = "bar"' > /dev/null
+echo "$METADATA_PLAIN" | grep 'metadata_b = 2' > /dev/null
+
+bpftool prog --metadata --json | grep '"metadata":{"metadata_a":"bar","metadata_b":2}' > /dev/null
+
+bpftool map | grep 'metada.metadata' > /dev/null
+
+rm $BPF_DIR/used
+
+exit 0
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH bpf-next 1/5] bpf: RCU protect used_maps array and count
  2020-08-14 19:15 ` [RFC PATCH bpf-next 1/5] bpf: RCU protect used_maps array and count YiFei Zhu
@ 2020-08-19  0:48   ` Alexei Starovoitov
  0 siblings, 0 replies; 7+ messages in thread
From: Alexei Starovoitov @ 2020-08-19  0:48 UTC (permalink / raw)
  To: YiFei Zhu
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev,
	Mahesh Bandewar, YiFei Zhu

On Fri, Aug 14, 2020 at 02:15:54PM -0500, YiFei Zhu wrote:
> @@ -3263,6 +3268,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
>  	struct bpf_prog_stats stats;
>  	char __user *uinsns;
>  	u32 ulen;
> +	const struct bpf_used_maps *used_maps;
>  	int err;
>  
>  	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
> @@ -3284,19 +3290,24 @@ static int bpf_prog_get_info_by_fd(struct file *file,
>  	memcpy(info.tag, prog->tag, sizeof(prog->tag));
>  	memcpy(info.name, prog->aux->name, sizeof(prog->aux->name));
>  
> +	rcu_read_lock();
> +	used_maps = rcu_dereference(prog->aux->used_maps);
> +
>  	ulen = info.nr_map_ids;
> -	info.nr_map_ids = prog->aux->used_map_cnt;
> +	info.nr_map_ids = used_maps->cnt;
>  	ulen = min_t(u32, info.nr_map_ids, ulen);
>  	if (ulen) {
>  		u32 __user *user_map_ids = u64_to_user_ptr(info.map_ids);
>  		u32 i;
>  
>  		for (i = 0; i < ulen; i++)
> -			if (put_user(prog->aux->used_maps[i]->id,
> +			if (put_user(used_maps->arr[i]->id,

put_user() under rcu_read_lock() is not allowed.
You should see a splat like this:
[    2.297943] BUG: sleeping function called from invalid context at kernel/bpf/syscall.c:3305
[    2.305554] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 81, name: test_progs
[    2.312802] 1 lock held by test_progs/81:
[    2.316545]  #0: ffffffff82265760 (rcu_read_lock){....}-{1:2}, at: bpf_prog_get_info_by_fd.isra.31+0xb9/0xdf0
[    2.325446] Preemption disabled at:
[    2.325454] [<ffffffff812aec64>] __fd_install+0x24/0x2b0
[    2.333571] CPU: 1 PID: 81 Comm: test_progs Not tainted 5.8.0-ga96d3fdf9 #1
[    2.339818] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[    2.347795] Call Trace:
[    2.350082]  dump_stack+0x78/0xab
[    2.353109]  ___might_sleep+0x17d/0x260
[    2.356554]  __might_fault+0x34/0x90
[    2.359873]  bpf_prog_get_info_by_fd.isra.31+0x21b/0xdf0
[    2.364674]  ? __lock_acquire+0x2e4/0x1df0
[    2.368377]  ? bpf_obj_get_info_by_fd+0x20f/0x3e0
[    2.372621]  bpf_obj_get_info_by_fd+0x20f/0x3e0

Please test your patches with kernel debug flags
like CONFIG_DEBUG_ATOMIC_SLEEP=y at a minimum.
KASAN and lockdep are good to have as well.

In general I think RCU here is overkill.
Why not use a mutex everywhere? It's not like this is in the critical path.

Also I think a better name is needed. ADD_MAP is too specific.
When we get around to implementing Daniel's suggestion of replacing
one map with another, ADD_MAP as a command won't fit.
Single-purpose commands are OK, but this one feels too narrow.
Maybe BPF_PROG_BIND_MAP?
Then later adding a target_map_fd field to the prog_bind_map struct would do it.
I thought about the name BPF_PROG_ATTACH_MAP, but we use the word "attach"
in too many places and it could be confusing.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-08-19  0:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-14 19:15 [RFC PATCH bpf-next 0/5] Allow storage of flexible metadata information for eBPF programs YiFei Zhu
2020-08-14 19:15 ` [RFC PATCH bpf-next 1/5] bpf: RCU protect used_maps array and count YiFei Zhu
2020-08-19  0:48   ` Alexei Starovoitov
2020-08-14 19:15 ` [RFC PATCH bpf-next 2/5] bpf: Add BPF_PROG_ADD_MAP syscall YiFei Zhu
2020-08-14 19:15 ` [RFC PATCH bpf-next 3/5] libbpf: Add BPF_PROG_ADD_MAP syscall and use it on .metadata section YiFei Zhu
2020-08-14 19:15 ` [RFC PATCH bpf-next 4/5] bpftool: support dumping metadata YiFei Zhu
2020-08-14 19:15 ` [RFC PATCH bpf-next 5/5] selftests/bpf: Test bpftool loading and " YiFei Zhu
