bpf.vger.kernel.org archive mirror
* [PATCH bpf-next v5 0/3] Sockmap iterator
@ 2020-09-09 16:27 Lorenz Bauer
  2020-09-09 16:27 ` [PATCH bpf-next v5 1/3] net: sockmap: Remove unnecessary sk_fullsock checks Lorenz Bauer
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Lorenz Bauer @ 2020-09-09 16:27 UTC (permalink / raw)
  To: ast, yhs, daniel, jakub, john.fastabend, kafai
  Cc: bpf, kernel-team, Lorenz Bauer

I've decided to drop the changes that extend BTF pointers from this series.
They turned out to be a lot more work than I anticipated, and I don't want
to hold this up any longer.

The context for a sockmap / sockhash iterator contains the key and a BTF
pointer to a socket. Iterating a sockmap will yield a NULL socket if the
slot in the array is empty; iterating a sockhash will never yield a NULL
socket.
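
For example, a minimal iterator program has to NULL check both fields
before using them (an illustrative sketch, assuming the usual selftest
includes; patch 3 contains the real test program):

	SEC("iter/sockmap")
	int dump_socks(struct bpf_iter__sockmap *ctx)
	{
		/* both the key and the socket may be NULL */
		if (!ctx->key || !ctx->sk)
			return 0;

		/* use ctx->key and ctx->sk here */
		return 0;
	}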

I'll add support for calling map_update_elem from bpf_iter in a follow-up
to this series.

Changes in v5:
- Drop pointer to struct sock shenanigans

Changes in v4:
- Alias struct sock* to PTR_TO_SOCK_COMMON instead of PTR_TO_SOCKET (Martin)

Changes in v3:
- Use PTR_TO_BTF_ID in iterator context (Yonghong, Martin)
- Use rcu_dereference instead of rcu_dereference_raw (Jakub)
- Fix various test nits (Jakub, Andrii)

Changes in v2:
- Remove unnecessary sk_fullsock checks (Jakub)
- Nits for test output (Jakub)
- Increase number of sockets in tests to 64 (Jakub)
- Handle ENOENT in tests (Jakub)
- Actually test SOCKHASH iteration (myself)
- Fix SOCKHASH iterator initialization (myself)

Lorenz Bauer (3):
  net: sockmap: Remove unnecessary sk_fullsock checks
  net: Allow iterating sockmap and sockhash
  selftests: bpf: Test iterating a sockmap

 net/core/sock_map.c                           | 284 +++++++++++++++++-
 .../selftests/bpf/prog_tests/sockmap_basic.c  |  89 ++++++
 tools/testing/selftests/bpf/progs/bpf_iter.h  |   9 +
 .../selftests/bpf/progs/bpf_iter_sockmap.c    |  43 +++
 .../selftests/bpf/progs/bpf_iter_sockmap.h    |   3 +
 5 files changed, 424 insertions(+), 4 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_sockmap.h

-- 
2.25.1



* [PATCH bpf-next v5 1/3] net: sockmap: Remove unnecessary sk_fullsock checks
  2020-09-09 16:27 [PATCH bpf-next v5 0/3] Sockmap iterator Lorenz Bauer
@ 2020-09-09 16:27 ` Lorenz Bauer
  2020-09-09 16:27 ` [PATCH bpf-next v5 2/3] net: Allow iterating sockmap and sockhash Lorenz Bauer
  2020-09-09 16:27 ` [PATCH bpf-next v5 3/3] selftests: bpf: Test iterating a sockmap Lorenz Bauer
  2 siblings, 0 replies; 7+ messages in thread
From: Lorenz Bauer @ 2020-09-09 16:27 UTC (permalink / raw)
  To: ast, yhs, daniel, jakub, john.fastabend, kafai
  Cc: bpf, kernel-team, Lorenz Bauer

The lookup paths for sockmap and sockhash currently include a check
that returns NULL if the socket we just found is not a full socket.
However, this check is not necessary. On insertion we ensure that
we have a full socket (caveat around sock_ops), so request sockets
are not a problem. Time-wait sockets are allocated separately from
the original socket and then fed into the hashdance. They don't
affect the sockets already stored in the sockmap.

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
---
 net/core/sock_map.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 078386d7d9a2..82494810d0ee 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -382,7 +382,7 @@ static void *sock_map_lookup(struct bpf_map *map, void *key)
 	struct sock *sk;
 
 	sk = __sock_map_lookup_elem(map, *(u32 *)key);
-	if (!sk || !sk_fullsock(sk))
+	if (!sk)
 		return NULL;
 	if (sk_is_refcounted(sk) && !refcount_inc_not_zero(&sk->sk_refcnt))
 		return NULL;
@@ -1110,7 +1110,7 @@ static void *sock_hash_lookup(struct bpf_map *map, void *key)
 	struct sock *sk;
 
 	sk = __sock_hash_lookup_elem(map, key);
-	if (!sk || !sk_fullsock(sk))
+	if (!sk)
 		return NULL;
 	if (sk_is_refcounted(sk) && !refcount_inc_not_zero(&sk->sk_refcnt))
 		return NULL;
-- 
2.25.1



* [PATCH bpf-next v5 2/3] net: Allow iterating sockmap and sockhash
  2020-09-09 16:27 [PATCH bpf-next v5 0/3] Sockmap iterator Lorenz Bauer
  2020-09-09 16:27 ` [PATCH bpf-next v5 1/3] net: sockmap: Remove unnecessary sk_fullsock checks Lorenz Bauer
@ 2020-09-09 16:27 ` Lorenz Bauer
  2020-09-10 17:40   ` Yonghong Song
  2020-09-09 16:27 ` [PATCH bpf-next v5 3/3] selftests: bpf: Test iterating a sockmap Lorenz Bauer
  2 siblings, 1 reply; 7+ messages in thread
From: Lorenz Bauer @ 2020-09-09 16:27 UTC (permalink / raw)
  To: ast, yhs, daniel, jakub, john.fastabend, kafai
  Cc: bpf, kernel-team, Lorenz Bauer

Add bpf_iter support for sockmap / sockhash, based on the bpf_sk_storage
and hashtable implementations. sockmap and sockhash share the same
iteration context: a pointer to an arbitrary key and a pointer to a
socket. Both pointers may be NULL, so a BPF program has to perform a
NULL check before accessing them. Technically it's not possible for
sockhash iteration to yield a NULL socket, but we ignore this so that
both map types can use a single iterator context.

Iteration will visit all keys that remain unmodified during the lifetime of
the iterator. It may or may not visit newly added ones.

Switch from using rcu_dereference_raw to plain rcu_dereference, so we gain
another guard rail if CONFIG_PROVE_RCU is enabled.
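
In userspace, the iterator is attached by passing the map fd via
bpf_iter_link_info, like the other map iterators. A sketch based on the
selftest in the next patch (error handling omitted):

	union bpf_iter_link_info linfo = { .map.map_fd = map_fd };
	DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts,
			    .link_info = &linfo,
			    .link_info_len = sizeof(linfo));
	struct bpf_link *link = bpf_program__attach_iter(prog, &opts);
	int iter_fd = bpf_iter_create(bpf_link__fd(link));

	/* reading from iter_fd until EOF runs the program over the map */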

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
---
 net/core/sock_map.c | 280 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 278 insertions(+), 2 deletions(-)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 82494810d0ee..e1f05e3fa1d0 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -2,6 +2,7 @@
 /* Copyright (c) 2017 - 2018 Covalent IO, Inc. http://covalent.io */
 
 #include <linux/bpf.h>
+#include <linux/btf_ids.h>
 #include <linux/filter.h>
 #include <linux/errno.h>
 #include <linux/file.h>
@@ -703,6 +704,109 @@ const struct bpf_func_proto bpf_msg_redirect_map_proto = {
 	.arg4_type      = ARG_ANYTHING,
 };
 
+struct sock_map_seq_info {
+	struct bpf_map *map;
+	struct sock *sk;
+	u32 index;
+};
+
+struct bpf_iter__sockmap {
+	__bpf_md_ptr(struct bpf_iter_meta *, meta);
+	__bpf_md_ptr(struct bpf_map *, map);
+	__bpf_md_ptr(void *, key);
+	__bpf_md_ptr(struct sock *, sk);
+};
+
+DEFINE_BPF_ITER_FUNC(sockmap, struct bpf_iter_meta *meta,
+		     struct bpf_map *map, void *key,
+		     struct sock *sk)
+
+static void *sock_map_seq_lookup_elem(struct sock_map_seq_info *info)
+{
+	if (unlikely(info->index >= info->map->max_entries))
+		return NULL;
+
+	info->sk = __sock_map_lookup_elem(info->map, info->index);
+
+	/* can't return sk directly, since that might be NULL */
+	return info;
+}
+
+static void *sock_map_seq_start(struct seq_file *seq, loff_t *pos)
+{
+	struct sock_map_seq_info *info = seq->private;
+
+	if (*pos == 0)
+		++*pos;
+
+	/* pairs with sock_map_seq_stop */
+	rcu_read_lock();
+	return sock_map_seq_lookup_elem(info);
+}
+
+static void *sock_map_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+	struct sock_map_seq_info *info = seq->private;
+
+	++*pos;
+	++info->index;
+
+	return sock_map_seq_lookup_elem(info);
+}
+
+static int sock_map_seq_show(struct seq_file *seq, void *v)
+{
+	struct sock_map_seq_info *info = seq->private;
+	struct bpf_iter__sockmap ctx = {};
+	struct bpf_iter_meta meta;
+	struct bpf_prog *prog;
+
+	meta.seq = seq;
+	prog = bpf_iter_get_info(&meta, !v);
+	if (!prog)
+		return 0;
+
+	ctx.meta = &meta;
+	ctx.map = info->map;
+	if (v) {
+		ctx.key = &info->index;
+		ctx.sk = info->sk;
+	}
+
+	return bpf_iter_run_prog(prog, &ctx);
+}
+
+static void sock_map_seq_stop(struct seq_file *seq, void *v)
+{
+	if (!v)
+		(void)sock_map_seq_show(seq, NULL);
+
+	/* pairs with sock_map_seq_start */
+	rcu_read_unlock();
+}
+
+static const struct seq_operations sock_map_seq_ops = {
+	.start	= sock_map_seq_start,
+	.next	= sock_map_seq_next,
+	.stop	= sock_map_seq_stop,
+	.show	= sock_map_seq_show,
+};
+
+static int sock_map_init_seq_private(void *priv_data,
+				     struct bpf_iter_aux_info *aux)
+{
+	struct sock_map_seq_info *info = priv_data;
+
+	info->map = aux->map;
+	return 0;
+}
+
+static const struct bpf_iter_seq_info sock_map_iter_seq_info = {
+	.seq_ops		= &sock_map_seq_ops,
+	.init_seq_private	= sock_map_init_seq_private,
+	.seq_priv_size		= sizeof(struct sock_map_seq_info),
+};
+
 static int sock_map_btf_id;
 const struct bpf_map_ops sock_map_ops = {
 	.map_meta_equal		= bpf_map_meta_equal,
@@ -717,6 +821,7 @@ const struct bpf_map_ops sock_map_ops = {
 	.map_check_btf		= map_check_no_btf,
 	.map_btf_name		= "bpf_stab",
 	.map_btf_id		= &sock_map_btf_id,
+	.iter_seq_info		= &sock_map_iter_seq_info,
 };
 
 struct bpf_shtab_elem {
@@ -953,7 +1058,7 @@ static int sock_hash_get_next_key(struct bpf_map *map, void *key,
 	if (!elem)
 		goto find_first_elem;
 
-	elem_next = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(&elem->node)),
+	elem_next = hlist_entry_safe(rcu_dereference(hlist_next_rcu(&elem->node)),
 				     struct bpf_shtab_elem, node);
 	if (elem_next) {
 		memcpy(key_next, elem_next->key, key_size);
@@ -965,7 +1070,7 @@ static int sock_hash_get_next_key(struct bpf_map *map, void *key,
 find_first_elem:
 	for (; i < htab->buckets_num; i++) {
 		head = &sock_hash_select_bucket(htab, i)->head;
-		elem_next = hlist_entry_safe(rcu_dereference_raw(hlist_first_rcu(head)),
+		elem_next = hlist_entry_safe(rcu_dereference(hlist_first_rcu(head)),
 					     struct bpf_shtab_elem, node);
 		if (elem_next) {
 			memcpy(key_next, elem_next->key, key_size);
@@ -1199,6 +1304,117 @@ const struct bpf_func_proto bpf_msg_redirect_hash_proto = {
 	.arg4_type      = ARG_ANYTHING,
 };
 
+struct sock_hash_seq_info {
+	struct bpf_map *map;
+	struct bpf_shtab *htab;
+	u32 bucket_id;
+};
+
+static void *sock_hash_seq_find_next(struct sock_hash_seq_info *info,
+				     struct bpf_shtab_elem *prev_elem)
+{
+	const struct bpf_shtab *htab = info->htab;
+	struct bpf_shtab_bucket *bucket;
+	struct bpf_shtab_elem *elem;
+	struct hlist_node *node;
+
+	/* try to find next elem in the same bucket */
+	if (prev_elem) {
+		node = rcu_dereference(hlist_next_rcu(&prev_elem->node));
+		elem = hlist_entry_safe(node, struct bpf_shtab_elem, node);
+		if (elem)
+			return elem;
+
+		/* no more elements, continue in the next bucket */
+		info->bucket_id++;
+	}
+
+	for (; info->bucket_id < htab->buckets_num; info->bucket_id++) {
+		bucket = &htab->buckets[info->bucket_id];
+		node = rcu_dereference(hlist_first_rcu(&bucket->head));
+		elem = hlist_entry_safe(node, struct bpf_shtab_elem, node);
+		if (elem)
+			return elem;
+	}
+
+	return NULL;
+}
+
+static void *sock_hash_seq_start(struct seq_file *seq, loff_t *pos)
+{
+	struct sock_hash_seq_info *info = seq->private;
+
+	if (*pos == 0)
+		++*pos;
+
+	/* pairs with sock_hash_seq_stop */
+	rcu_read_lock();
+	return sock_hash_seq_find_next(info, NULL);
+}
+
+static void *sock_hash_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+	struct sock_hash_seq_info *info = seq->private;
+
+	++*pos;
+	return sock_hash_seq_find_next(info, v);
+}
+
+static int sock_hash_seq_show(struct seq_file *seq, void *v)
+{
+	struct sock_hash_seq_info *info = seq->private;
+	struct bpf_iter__sockmap ctx = {};
+	struct bpf_shtab_elem *elem = v;
+	struct bpf_iter_meta meta;
+	struct bpf_prog *prog;
+
+	meta.seq = seq;
+	prog = bpf_iter_get_info(&meta, !elem);
+	if (!prog)
+		return 0;
+
+	ctx.meta = &meta;
+	ctx.map = info->map;
+	if (elem) {
+		ctx.key = elem->key;
+		ctx.sk = elem->sk;
+	}
+
+	return bpf_iter_run_prog(prog, &ctx);
+}
+
+static void sock_hash_seq_stop(struct seq_file *seq, void *v)
+{
+	if (!v)
+		(void)sock_hash_seq_show(seq, NULL);
+
+	/* pairs with sock_hash_seq_start */
+	rcu_read_unlock();
+}
+
+static const struct seq_operations sock_hash_seq_ops = {
+	.start	= sock_hash_seq_start,
+	.next	= sock_hash_seq_next,
+	.stop	= sock_hash_seq_stop,
+	.show	= sock_hash_seq_show,
+};
+
+static int sock_hash_init_seq_private(void *priv_data,
+				     struct bpf_iter_aux_info *aux)
+{
+	struct sock_hash_seq_info *info = priv_data;
+
+	info->map = aux->map;
+	info->htab = container_of(aux->map, struct bpf_shtab, map);
+	return 0;
+}
+
+static const struct bpf_iter_seq_info sock_hash_iter_seq_info = {
+	.seq_ops		= &sock_hash_seq_ops,
+	.init_seq_private	= sock_hash_init_seq_private,
+	.seq_priv_size		= sizeof(struct sock_hash_seq_info),
+};
+
 static int sock_hash_map_btf_id;
 const struct bpf_map_ops sock_hash_ops = {
 	.map_meta_equal		= bpf_map_meta_equal,
@@ -1213,6 +1429,7 @@ const struct bpf_map_ops sock_hash_ops = {
 	.map_check_btf		= map_check_no_btf,
 	.map_btf_name		= "bpf_shtab",
 	.map_btf_id		= &sock_hash_map_btf_id,
+	.iter_seq_info		= &sock_hash_iter_seq_info,
 };
 
 static struct sk_psock_progs *sock_map_progs(struct bpf_map *map)
@@ -1323,3 +1540,62 @@ void sock_map_close(struct sock *sk, long timeout)
 	release_sock(sk);
 	saved_close(sk, timeout);
 }
+
+static int sock_map_iter_attach_target(struct bpf_prog *prog,
+				       union bpf_iter_link_info *linfo,
+				       struct bpf_iter_aux_info *aux)
+{
+	struct bpf_map *map;
+	int err = -EINVAL;
+
+	if (!linfo->map.map_fd)
+		return -EBADF;
+
+	map = bpf_map_get_with_uref(linfo->map.map_fd);
+	if (IS_ERR(map))
+		return PTR_ERR(map);
+
+	if (map->map_type != BPF_MAP_TYPE_SOCKMAP &&
+	    map->map_type != BPF_MAP_TYPE_SOCKHASH)
+		goto put_map;
+
+	if (prog->aux->max_rdonly_access > map->key_size) {
+		err = -EACCES;
+		goto put_map;
+	}
+
+	aux->map = map;
+	return 0;
+
+put_map:
+	bpf_map_put_with_uref(map);
+	return err;
+}
+
+static void sock_map_iter_detach_target(struct bpf_iter_aux_info *aux)
+{
+	bpf_map_put_with_uref(aux->map);
+}
+
+static struct bpf_iter_reg sock_map_iter_reg = {
+	.target			= "sockmap",
+	.attach_target		= sock_map_iter_attach_target,
+	.detach_target		= sock_map_iter_detach_target,
+	.show_fdinfo		= bpf_iter_map_show_fdinfo,
+	.fill_link_info		= bpf_iter_map_fill_link_info,
+	.ctx_arg_info_size	= 2,
+	.ctx_arg_info		= {
+		{ offsetof(struct bpf_iter__sockmap, key),
+		  PTR_TO_RDONLY_BUF_OR_NULL },
+		{ offsetof(struct bpf_iter__sockmap, sk),
+		  PTR_TO_BTF_ID_OR_NULL },
+	},
+};
+
+static int __init bpf_sockmap_iter_init(void)
+{
+	sock_map_iter_reg.ctx_arg_info[1].btf_id =
+		btf_sock_ids[BTF_SOCK_TYPE_SOCK];
+	return bpf_iter_reg_target(&sock_map_iter_reg);
+}
+late_initcall(bpf_sockmap_iter_init);
-- 
2.25.1



* [PATCH bpf-next v5 3/3] selftests: bpf: Test iterating a sockmap
  2020-09-09 16:27 [PATCH bpf-next v5 0/3] Sockmap iterator Lorenz Bauer
  2020-09-09 16:27 ` [PATCH bpf-next v5 1/3] net: sockmap: Remove unnecessary sk_fullsock checks Lorenz Bauer
  2020-09-09 16:27 ` [PATCH bpf-next v5 2/3] net: Allow iterating sockmap and sockhash Lorenz Bauer
@ 2020-09-09 16:27 ` Lorenz Bauer
  2020-09-10 18:11   ` Yonghong Song
  2 siblings, 1 reply; 7+ messages in thread
From: Lorenz Bauer @ 2020-09-09 16:27 UTC (permalink / raw)
  To: ast, yhs, daniel, jakub, john.fastabend, kafai
  Cc: bpf, kernel-team, Lorenz Bauer

Add a test that exercises a basic sockmap / sockhash iteration. For
now we simply count the number of elements seen. Once sockmap update
from iterators works we can extend this to perform a full copy.
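
The copying variant would then look roughly like this in the iterator
program (hypothetical sketch: it assumes a destination map "dst" and
that map updates are permitted in this context, which they aren't yet):

	if (ctx->key && ctx->sk)
		bpf_map_update_elem(&dst, ctx->key, ctx->sk, 0);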

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
---
 .../selftests/bpf/prog_tests/sockmap_basic.c  | 89 +++++++++++++++++++
 tools/testing/selftests/bpf/progs/bpf_iter.h  |  9 ++
 .../selftests/bpf/progs/bpf_iter_sockmap.c    | 43 +++++++++
 .../selftests/bpf/progs/bpf_iter_sockmap.h    |  3 +
 4 files changed, 144 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_sockmap.h

diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
index 0b79d78b98db..3215f4d22720 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
@@ -6,6 +6,9 @@
 #include "test_skmsg_load_helpers.skel.h"
 #include "test_sockmap_update.skel.h"
 #include "test_sockmap_invalid_update.skel.h"
+#include "bpf_iter_sockmap.skel.h"
+
+#include "progs/bpf_iter_sockmap.h"
 
 #define TCP_REPAIR		19	/* TCP sock is under repair right now */
 
@@ -171,6 +174,88 @@ static void test_sockmap_invalid_update(void)
 		test_sockmap_invalid_update__destroy(skel);
 }
 
+static void test_sockmap_iter(enum bpf_map_type map_type)
+{
+	DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts);
+	int err, len, src_fd, iter_fd, duration;
+	union bpf_iter_link_info linfo = {0};
+	__s64 sock_fd[SOCKMAP_MAX_ENTRIES];
+	__u32 i, num_sockets, max_elems;
+	struct bpf_iter_sockmap *skel;
+	struct bpf_link *link;
+	struct bpf_map *src;
+	char buf[64];
+
+	skel = bpf_iter_sockmap__open_and_load();
+	if (CHECK(!skel, "bpf_iter_sockmap__open_and_load", "skeleton open_and_load failed\n"))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(sock_fd); i++)
+		sock_fd[i] = -1;
+
+	/* Make sure we have at least one "empty" entry to test iteration of
+	 * an empty slot.
+	 */
+	num_sockets = ARRAY_SIZE(sock_fd) - 1;
+
+	if (map_type == BPF_MAP_TYPE_SOCKMAP) {
+		src = skel->maps.sockmap;
+		max_elems = bpf_map__max_entries(src);
+	} else {
+		src = skel->maps.sockhash;
+		max_elems = num_sockets;
+	}
+
+	src_fd = bpf_map__fd(src);
+
+	for (i = 0; i < num_sockets; i++) {
+		sock_fd[i] = connected_socket_v4();
+		if (CHECK(sock_fd[i] == -1, "connected_socket_v4", "cannot connect\n"))
+			goto out;
+
+		err = bpf_map_update_elem(src_fd, &i, &sock_fd[i], BPF_NOEXIST);
+		if (CHECK(err, "map_update", "failed: %s\n", strerror(errno)))
+			goto out;
+	}
+
+	linfo.map.map_fd = src_fd;
+	opts.link_info = &linfo;
+	opts.link_info_len = sizeof(linfo);
+	link = bpf_program__attach_iter(skel->progs.count_elems, &opts);
+	if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n"))
+		goto out;
+
+	iter_fd = bpf_iter_create(bpf_link__fd(link));
+	if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n"))
+		goto free_link;
+
+	/* read until EOF so the iterator program visits every element */
+	while ((len = read(iter_fd, buf, sizeof(buf))) > 0)
+		;
+	if (CHECK(len < 0, "read", "failed: %s\n", strerror(errno)))
+		goto close_iter;
+
+	/* test results */
+	if (CHECK(skel->bss->elems != max_elems, "elems", "got %u expected %u\n",
+		  skel->bss->elems, max_elems))
+		goto close_iter;
+
+	if (CHECK(skel->bss->socks != num_sockets, "socks", "got %u expected %u\n",
+		  skel->bss->socks, num_sockets))
+		goto close_iter;
+
+close_iter:
+	close(iter_fd);
+free_link:
+	bpf_link__destroy(link);
+out:
+	for (i = 0; i < num_sockets; i++) {
+		if (sock_fd[i] >= 0)
+			close(sock_fd[i]);
+	}
+	bpf_iter_sockmap__destroy(skel);
+}
+
 void test_sockmap_basic(void)
 {
 	if (test__start_subtest("sockmap create_update_free"))
@@ -187,4 +272,8 @@ void test_sockmap_basic(void)
 		test_sockmap_update(BPF_MAP_TYPE_SOCKHASH);
 	if (test__start_subtest("sockmap update in unsafe context"))
 		test_sockmap_invalid_update();
+	if (test__start_subtest("sockmap iter"))
+		test_sockmap_iter(BPF_MAP_TYPE_SOCKMAP);
+	if (test__start_subtest("sockhash iter"))
+		test_sockmap_iter(BPF_MAP_TYPE_SOCKHASH);
 }
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter.h b/tools/testing/selftests/bpf/progs/bpf_iter.h
index c196280df90d..df682af75510 100644
--- a/tools/testing/selftests/bpf/progs/bpf_iter.h
+++ b/tools/testing/selftests/bpf/progs/bpf_iter.h
@@ -13,6 +13,7 @@
 #define udp6_sock udp6_sock___not_used
 #define bpf_iter__bpf_map_elem bpf_iter__bpf_map_elem___not_used
 #define bpf_iter__bpf_sk_storage_map bpf_iter__bpf_sk_storage_map___not_used
+#define bpf_iter__sockmap bpf_iter__sockmap___not_used
 #include "vmlinux.h"
 #undef bpf_iter_meta
 #undef bpf_iter__bpf_map
@@ -26,6 +27,7 @@
 #undef udp6_sock
 #undef bpf_iter__bpf_map_elem
 #undef bpf_iter__bpf_sk_storage_map
+#undef bpf_iter__sockmap
 
 struct bpf_iter_meta {
 	struct seq_file *seq;
@@ -96,3 +98,10 @@ struct bpf_iter__bpf_sk_storage_map {
 	struct sock *sk;
 	void *value;
 };
+
+struct bpf_iter__sockmap {
+	struct bpf_iter_meta *meta;
+	struct bpf_map *map;
+	void *key;
+	struct sock *sk;
+};
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c b/tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c
new file mode 100644
index 000000000000..0e27f73dd803
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 Cloudflare */
+#include "bpf_iter.h"
+#include "bpf_tracing_net.h"
+#include "bpf_iter_sockmap.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <errno.h>
+
+char _license[] SEC("license") = "GPL";
+
+struct {
+	__uint(type, BPF_MAP_TYPE_SOCKMAP);
+	__uint(max_entries, SOCKMAP_MAX_ENTRIES);
+	__type(key, __u32);
+	__type(value, __u64);
+} sockmap SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_SOCKHASH);
+	__uint(max_entries, SOCKMAP_MAX_ENTRIES);
+	__type(key, __u32);
+	__type(value, __u64);
+} sockhash SEC(".maps");
+
+__u32 elems = 0;
+__u32 socks = 0;
+
+SEC("iter/sockmap")
+int count_elems(struct bpf_iter__sockmap *ctx)
+{
+	struct sock *sk = ctx->sk;
+	__u32 tmp, *key = ctx->key;
+	int ret;
+
+	if (key)
+		elems++;
+
+	if (sk)
+		socks++;
+
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_sockmap.h b/tools/testing/selftests/bpf/progs/bpf_iter_sockmap.h
new file mode 100644
index 000000000000..35a675d13c0f
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_iter_sockmap.h
@@ -0,0 +1,3 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#define SOCKMAP_MAX_ENTRIES (64)
-- 
2.25.1



* Re: [PATCH bpf-next v5 2/3] net: Allow iterating sockmap and sockhash
  2020-09-09 16:27 ` [PATCH bpf-next v5 2/3] net: Allow iterating sockmap and sockhash Lorenz Bauer
@ 2020-09-10 17:40   ` Yonghong Song
  0 siblings, 0 replies; 7+ messages in thread
From: Yonghong Song @ 2020-09-10 17:40 UTC (permalink / raw)
  To: Lorenz Bauer, ast, daniel, jakub, john.fastabend, kafai; +Cc: bpf, kernel-team



On 9/9/20 9:27 AM, Lorenz Bauer wrote:
> Add bpf_iter support for sockmap / sockhash, based on the bpf_sk_storage
> and hashtable implementations. sockmap and sockhash share the same
> iteration context: a pointer to an arbitrary key and a pointer to a
> socket. Both pointers may be NULL, so a BPF program has to perform a
> NULL check before accessing them. Technically it's not possible for
> sockhash iteration to yield a NULL socket, but we ignore this so that
> both map types can use a single iterator context.
> 
> Iteration will visit all keys that remain unmodified during the lifetime of
> the iterator. It may or may not visit newly added ones.
> 
> Switch from using rcu_dereference_raw to plain rcu_dereference, so we gain
> another guard rail if CONFIG_PROVE_RCU is enabled.
> 
> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>

Acked-by: Yonghong Song <yhs@fb.com>


* Re: [PATCH bpf-next v5 3/3] selftests: bpf: Test iterating a sockmap
  2020-09-09 16:27 ` [PATCH bpf-next v5 3/3] selftests: bpf: Test iterating a sockmap Lorenz Bauer
@ 2020-09-10 18:11   ` Yonghong Song
  2020-09-10 19:36     ` Alexei Starovoitov
  0 siblings, 1 reply; 7+ messages in thread
From: Yonghong Song @ 2020-09-10 18:11 UTC (permalink / raw)
  To: Lorenz Bauer, ast, daniel, jakub, john.fastabend, kafai; +Cc: bpf, kernel-team



On 9/9/20 9:27 AM, Lorenz Bauer wrote:
> Add a test that exercises a basic sockmap / sockhash iteration. For
> now we simply count the number of elements seen. Once sockmap update
> from iterators works we can extend this to perform a full copy.
> 
> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
> ---
>   .../selftests/bpf/prog_tests/sockmap_basic.c  | 89 +++++++++++++++++++
>   tools/testing/selftests/bpf/progs/bpf_iter.h  |  9 ++
>   .../selftests/bpf/progs/bpf_iter_sockmap.c    | 43 +++++++++
>   .../selftests/bpf/progs/bpf_iter_sockmap.h    |  3 +
>   4 files changed, 144 insertions(+)
>   create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c
>   create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_sockmap.h
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> index 0b79d78b98db..3215f4d22720 100644
> --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> @@ -6,6 +6,9 @@
>   #include "test_skmsg_load_helpers.skel.h"
>   #include "test_sockmap_update.skel.h"
>   #include "test_sockmap_invalid_update.skel.h"
> +#include "bpf_iter_sockmap.skel.h"
> +
> +#include "progs/bpf_iter_sockmap.h"
>   
>   #define TCP_REPAIR		19	/* TCP sock is under repair right now */
>   
> @@ -171,6 +174,88 @@ static void test_sockmap_invalid_update(void)
>   		test_sockmap_invalid_update__destroy(skel);
>   }
>   
> +static void test_sockmap_iter(enum bpf_map_type map_type)
> +{
> +	DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts);
> +	int err, len, src_fd, iter_fd, duration;
> +	union bpf_iter_link_info linfo = {0};
> +	__s64 sock_fd[SOCKMAP_MAX_ENTRIES];
> +	__u32 i, num_sockets, max_elems;
> +	struct bpf_iter_sockmap *skel;
> +	struct bpf_link *link;
> +	struct bpf_map *src;
> +	char buf[64];
> +
> +	skel = bpf_iter_sockmap__open_and_load();
> +	if (CHECK(!skel, "bpf_iter_sockmap__open_and_load", "skeleton open_and_load failed\n"))
> +		return;
> +
> +	for (i = 0; i < ARRAY_SIZE(sock_fd); i++)
> +		sock_fd[i] = -1;
> +
> +	/* Make sure we have at least one "empty" entry to test iteration of
> +	 * an empty slot.
> +	 */
> +	num_sockets = ARRAY_SIZE(sock_fd) - 1;
> +
> +	if (map_type == BPF_MAP_TYPE_SOCKMAP) {
> +		src = skel->maps.sockmap;
> +		max_elems = bpf_map__max_entries(src);
> +	} else {
> +		src = skel->maps.sockhash;
> +		max_elems = num_sockets;
> +	}

I know you include the shared header progs/bpf_iter_sockmap.h to
supply SOCKMAP_MAX_ENTRIES in order to define the sock_fd array.

I think it would be easier to understand if we just used
bpf_map__max_entries() for both sockmap and sockhash to get max_elems,
and allocated sock_fd dynamically. WDYT?
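
Something like this (untested):

	__s64 *sock_fd;
	__u32 max_entries = bpf_map__max_entries(src);

	sock_fd = calloc(max_entries, sizeof(*sock_fd));
	if (CHECK(!sock_fd, "calloc", "failed\n"))
		goto out;
	/* ... then use max_entries instead of SOCKMAP_MAX_ENTRIES
	 * and free(sock_fd) on the way out
	 */
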

> +
> +	src_fd = bpf_map__fd(src);
> +
> +	for (i = 0; i < num_sockets; i++) {
> +		sock_fd[i] = connected_socket_v4();
> +		if (CHECK(sock_fd[i] == -1, "connected_socket_v4", "cannot connect\n"))
> +			goto out;
> +
> +		err = bpf_map_update_elem(src_fd, &i, &sock_fd[i], BPF_NOEXIST);
> +		if (CHECK(err, "map_update", "failed: %s\n", strerror(errno)))
> +			goto out;
> +	}
> +
[...]


* Re: [PATCH bpf-next v5 3/3] selftests: bpf: Test iterating a sockmap
  2020-09-10 18:11   ` Yonghong Song
@ 2020-09-10 19:36     ` Alexei Starovoitov
  0 siblings, 0 replies; 7+ messages in thread
From: Alexei Starovoitov @ 2020-09-10 19:36 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Lorenz Bauer, Alexei Starovoitov, Daniel Borkmann,
	Jakub Sitnicki, John Fastabend, Martin KaFai Lau, bpf,
	kernel-team

On Thu, Sep 10, 2020 at 11:11 AM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 9/9/20 9:27 AM, Lorenz Bauer wrote:
> > Add a test that exercises a basic sockmap / sockhash iteration. For
> > now we simply count the number of elements seen. Once sockmap update
> > from iterators works we can extend this to perform a full copy.
> >
> > Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
> > ---
> >   .../selftests/bpf/prog_tests/sockmap_basic.c  | 89 +++++++++++++++++++
> >   tools/testing/selftests/bpf/progs/bpf_iter.h  |  9 ++
> >   .../selftests/bpf/progs/bpf_iter_sockmap.c    | 43 +++++++++
> >   .../selftests/bpf/progs/bpf_iter_sockmap.h    |  3 +
> >   4 files changed, 144 insertions(+)
> >   create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c
> >   create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_sockmap.h
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> > index 0b79d78b98db..3215f4d22720 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> > @@ -6,6 +6,9 @@
> >   #include "test_skmsg_load_helpers.skel.h"
> >   #include "test_sockmap_update.skel.h"
> >   #include "test_sockmap_invalid_update.skel.h"
> > +#include "bpf_iter_sockmap.skel.h"
> > +
> > +#include "progs/bpf_iter_sockmap.h"
> >
> >   #define TCP_REPAIR          19      /* TCP sock is under repair right now */
> >
> > @@ -171,6 +174,88 @@ static void test_sockmap_invalid_update(void)
> >               test_sockmap_invalid_update__destroy(skel);
> >   }
> >
> > +static void test_sockmap_iter(enum bpf_map_type map_type)
> > +{
> > +     DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts);
> > +     int err, len, src_fd, iter_fd, duration;
> > +     union bpf_iter_link_info linfo = {0};
> > +     __s64 sock_fd[SOCKMAP_MAX_ENTRIES];
> > +     __u32 i, num_sockets, max_elems;
> > +     struct bpf_iter_sockmap *skel;
> > +     struct bpf_link *link;
> > +     struct bpf_map *src;
> > +     char buf[64];
> > +
> > +     skel = bpf_iter_sockmap__open_and_load();
> > +     if (CHECK(!skel, "bpf_iter_sockmap__open_and_load", "skeleton open_and_load failed\n"))
> > +             return;
> > +
> > +     for (i = 0; i < ARRAY_SIZE(sock_fd); i++)
> > +             sock_fd[i] = -1;
> > +
> > +     /* Make sure we have at least one "empty" entry to test iteration of
> > +      * an empty slot.
> > +      */
> > +     num_sockets = ARRAY_SIZE(sock_fd) - 1;
> > +
> > +     if (map_type == BPF_MAP_TYPE_SOCKMAP) {
> > +             src = skel->maps.sockmap;
> > +             max_elems = bpf_map__max_entries(src);
> > +     } else {
> > +             src = skel->maps.sockhash;
> > +             max_elems = num_sockets;
> > +     }
>
> I know you include the shared header progs/bpf_iter_sockmap.h to
> supply SOCKMAP_MAX_ENTRIES in order to define the sock_fd array.
>
> I think it would be easier to understand if we just used
> bpf_map__max_entries() for both sockmap and sockhash to get max_elems,
> and allocated sock_fd dynamically. WDYT?

imo makes sense as a follow up.

I fixed up "duration = 0" in this test because of build warnings
and pushed the whole set.
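
That is, initializing the variable where it is declared:

	int err, len, src_fd, iter_fd, duration = 0;
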
Thanks

