bpf.vger.kernel.org archive mirror
* [PATCH bpf-next 0/3] Sockmap iterator
@ 2020-08-28  9:48 Lorenz Bauer
  2020-08-28  9:48 ` [PATCH bpf-next 1/3] net: Allow iterating sockmap and sockhash Lorenz Bauer
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Lorenz Bauer @ 2020-08-28  9:48 UTC (permalink / raw)
  To: ast, yhs, daniel, jakub, john.fastabend; +Cc: bpf, kernel-team, Lorenz Bauer

Add a new bpf_iter for sockmap and sockhash. As previously discussed, we
want to use this to copy a sockhash in kernel space while modifying the
format of the keys.

The implementation leans heavily on the existing bpf_sk_storage and
hashtable iterators. However, there is a key difference for the sockmap
iterator: we don't take any bucket locks during iteration. It seems to
me that there is a risk of deadlock here if the iterator program
attempts to insert an item into the bucket that we are currently
iterating over. I think the semantics are reasonable even without the
lock.

In the iteration context I expose a PTR_TO_SOCKET_OR_NULL, aka struct
bpf_sock*. This is in contrast to bpf_sk_storage which uses a
PTR_TO_BTF_ID_OR_NULL, aka struct sock*. My personal preference would
be to use PTR_TO_BTF_ID_OR_NULL for sockmap as well, however the
verifier currently doesn't understand that PTR_TO_BTF_ID for struct
sock can be coerced to PTR_TO_SOCKET_OR_NULL. So we can't call
map_update_elem, etc. and the whole exercise is for naught. I'm
considering teaching the verifier this trick; does anyone have concerns,
or ideas on how to achieve this?

Thanks to Yonghong for guidance on how to go about this, and for
adding bpf_iter in the first place!

Lorenz Bauer (3):
  net: Allow iterating sockmap and sockhash
  selftests: bpf: Add helper to compare socket cookies
  selftests: bpf: Test copying a sockmap via bpf_iter

 net/core/sock_map.c                           | 283 ++++++++++++++++++
 .../selftests/bpf/prog_tests/sockmap_basic.c  | 129 +++++++-
 tools/testing/selftests/bpf/progs/bpf_iter.h  |   9 +
 .../selftests/bpf/progs/bpf_iter_sockmap.c    |  50 ++++
 4 files changed, 457 insertions(+), 14 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c

-- 
2.25.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH bpf-next 1/3] net: Allow iterating sockmap and sockhash
  2020-08-28  9:48 [PATCH bpf-next 0/3] Sockmap iterator Lorenz Bauer
@ 2020-08-28  9:48 ` Lorenz Bauer
  2020-08-31 10:03   ` Jakub Sitnicki
  2020-08-28  9:48 ` [PATCH bpf-next 2/3] selftests: bpf: Add helper to compare socket cookies Lorenz Bauer
  2020-08-28  9:48 ` [PATCH bpf-next 3/3] selftests: bpf: Test copying a sockmap via bpf_iter Lorenz Bauer
  2 siblings, 1 reply; 11+ messages in thread
From: Lorenz Bauer @ 2020-08-28  9:48 UTC (permalink / raw)
  To: ast, yhs, daniel, jakub, john.fastabend; +Cc: bpf, kernel-team, Lorenz Bauer

Add bpf_iter support for sockmap / sockhash, based on the bpf_sk_storage
and hashtable iterator implementations. sockmap and sockhash share the
same iteration context: a pointer to an arbitrary key and a pointer to a
socket. Both pointers may be NULL, so a BPF program has to perform a
NULL check before accessing them. Technically it's not possible for
sockhash iteration to yield a NULL socket, but we ignore this to be able
to use a single iteration point.

Iteration will visit all keys that remain unmodified during the lifetime of
the iterator. It may or may not visit newly added ones.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
---
 net/core/sock_map.c | 283 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 283 insertions(+)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index d6c6e1e312fc..31c4332f06e4 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -703,6 +703,116 @@ const struct bpf_func_proto bpf_msg_redirect_map_proto = {
 	.arg4_type      = ARG_ANYTHING,
 };
 
+struct sock_map_seq_info {
+	struct bpf_map *map;
+	struct sock *sk;
+	u32 index;
+};
+
+struct bpf_iter__sockmap {
+	__bpf_md_ptr(struct bpf_iter_meta *, meta);
+	__bpf_md_ptr(struct bpf_map *, map);
+	__bpf_md_ptr(void *, key);
+	__bpf_md_ptr(struct bpf_sock *, sk);
+};
+
+DEFINE_BPF_ITER_FUNC(sockmap, struct bpf_iter_meta *meta,
+		     struct bpf_map *map, void *key,
+		     struct sock *sk)
+
+static void *sock_map_seq_lookup_elem(struct sock_map_seq_info *info)
+{
+	if (unlikely(info->index >= info->map->max_entries))
+		return NULL;
+
+	info->sk = __sock_map_lookup_elem(info->map, info->index);
+	if (!info->sk || !sk_fullsock(info->sk))
+		info->sk = NULL;
+
+	/* continue iterating */
+	return info;
+}
+
+static void *sock_map_seq_start(struct seq_file *seq, loff_t *pos)
+{
+	struct sock_map_seq_info *info = seq->private;
+
+	if (*pos == 0)
+		++*pos;
+
+	/* pairs with sock_map_seq_stop */
+	rcu_read_lock();
+	return sock_map_seq_lookup_elem(info);
+}
+
+static void *sock_map_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+	struct sock_map_seq_info *info = seq->private;
+
+	++*pos;
+	++info->index;
+
+	return sock_map_seq_lookup_elem(info);
+}
+
+static int __sock_map_seq_show(struct seq_file *seq, void *v)
+{
+	struct sock_map_seq_info *info = seq->private;
+	struct bpf_iter__sockmap ctx = {};
+	struct bpf_iter_meta meta;
+	struct bpf_prog *prog;
+
+	meta.seq = seq;
+	prog = bpf_iter_get_info(&meta, !v);
+	if (!prog)
+		return 0;
+
+	ctx.meta = &meta;
+	ctx.map = info->map;
+	if (v) {
+		ctx.key = &info->index;
+		ctx.sk = (struct bpf_sock *)info->sk;
+	}
+
+	return bpf_iter_run_prog(prog, &ctx);
+}
+
+static int sock_map_seq_show(struct seq_file *seq, void *v)
+{
+	return __sock_map_seq_show(seq, v);
+}
+
+static void sock_map_seq_stop(struct seq_file *seq, void *v)
+{
+	if (!v)
+		(void)__sock_map_seq_show(seq, NULL);
+
+	/* pairs with sock_map_seq_start */
+	rcu_read_unlock();
+}
+
+static const struct seq_operations sock_map_seq_ops = {
+	.start	= sock_map_seq_start,
+	.next	= sock_map_seq_next,
+	.stop	= sock_map_seq_stop,
+	.show	= sock_map_seq_show,
+};
+
+static int sock_map_init_seq_private(void *priv_data,
+				     struct bpf_iter_aux_info *aux)
+{
+	struct sock_map_seq_info *info = priv_data;
+
+	info->map = aux->map;
+	return 0;
+}
+
+static const struct bpf_iter_seq_info sock_map_iter_seq_info = {
+	.seq_ops		= &sock_map_seq_ops,
+	.init_seq_private	= sock_map_init_seq_private,
+	.seq_priv_size		= sizeof(struct sock_map_seq_info),
+};
+
 static int sock_map_btf_id;
 const struct bpf_map_ops sock_map_ops = {
 	.map_alloc		= sock_map_alloc,
@@ -716,6 +826,7 @@ const struct bpf_map_ops sock_map_ops = {
 	.map_check_btf		= map_check_no_btf,
 	.map_btf_name		= "bpf_stab",
 	.map_btf_id		= &sock_map_btf_id,
+	.iter_seq_info		= &sock_map_iter_seq_info,
 };
 
 struct bpf_shtab_elem {
@@ -1198,6 +1309,120 @@ const struct bpf_func_proto bpf_msg_redirect_hash_proto = {
 	.arg4_type      = ARG_ANYTHING,
 };
 
+struct sock_hash_seq_info {
+	struct bpf_map *map;
+	struct bpf_shtab *htab;
+	u32 bucket_id;
+};
+
+static void *sock_hash_seq_find_next(struct sock_hash_seq_info *info,
+				     struct bpf_shtab_elem *prev_elem)
+{
+	const struct bpf_shtab *htab = info->htab;
+	struct bpf_shtab_bucket *bucket;
+	struct bpf_shtab_elem *elem;
+	struct hlist_node *node;
+
+	/* try to find next elem in the same bucket */
+	if (prev_elem) {
+		node = rcu_dereference_raw(hlist_next_rcu(&prev_elem->node));
+		elem = hlist_entry_safe(node, struct bpf_shtab_elem, node);
+		if (elem)
+			return elem;
+
+		/* no more elements, continue in the next bucket */
+		info->bucket_id++;
+	}
+
+	for (; info->bucket_id < htab->buckets_num; info->bucket_id++) {
+		bucket = &htab->buckets[info->bucket_id];
+		node = rcu_dereference_raw(hlist_first_rcu(&bucket->head));
+		elem = hlist_entry_safe(node, struct bpf_shtab_elem, node);
+		if (elem)
+			return elem;
+	}
+
+	return NULL;
+}
+
+static void *sock_hash_seq_start(struct seq_file *seq, loff_t *pos)
+{
+	struct sock_hash_seq_info *info = seq->private;
+
+	if (*pos == 0)
+		++*pos;
+
+	/* pairs with sock_hash_seq_stop */
+	rcu_read_lock();
+	return sock_hash_seq_find_next(info, NULL);
+}
+
+static void *sock_hash_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+	struct sock_hash_seq_info *info = seq->private;
+
+	++*pos;
+	return sock_hash_seq_find_next(info, v);
+}
+
+static int __sock_hash_seq_show(struct seq_file *seq, struct bpf_shtab_elem *elem)
+{
+	struct sock_hash_seq_info *info = seq->private;
+	struct bpf_iter__sockmap ctx = {};
+	struct bpf_iter_meta meta;
+	struct bpf_prog *prog;
+
+	meta.seq = seq;
+	prog = bpf_iter_get_info(&meta, !elem);
+	if (!prog)
+		return 0;
+
+	ctx.meta = &meta;
+	ctx.map = info->map;
+	if (elem) {
+		ctx.key = elem->key;
+		ctx.sk = (struct bpf_sock *)elem->sk;
+	}
+
+	return bpf_iter_run_prog(prog, &ctx);
+}
+
+static int sock_hash_seq_show(struct seq_file *seq, void *v)
+{
+	return __sock_hash_seq_show(seq, v);
+}
+
+static void sock_hash_seq_stop(struct seq_file *seq, void *v)
+{
+	if (!v)
+		(void)__sock_hash_seq_show(seq, NULL);
+
+	/* pairs with sock_hash_seq_start */
+	rcu_read_unlock();
+}
+
+static const struct seq_operations sock_hash_seq_ops = {
+	.start	= sock_hash_seq_start,
+	.next	= sock_hash_seq_next,
+	.stop	= sock_hash_seq_stop,
+	.show	= sock_hash_seq_show,
+};
+
+static int sock_hash_init_seq_private(void *priv_data,
+				     struct bpf_iter_aux_info *aux)
+{
+	struct sock_hash_seq_info *info = priv_data;
+
+	info->map = aux->map;
+	return 0;
+}
+
+static const struct bpf_iter_seq_info sock_hash_iter_seq_info = {
+	.seq_ops		= &sock_hash_seq_ops,
+	.init_seq_private	= sock_hash_init_seq_private,
+	.seq_priv_size		= sizeof(struct sock_hash_seq_info),
+};
+
 static int sock_hash_map_btf_id;
 const struct bpf_map_ops sock_hash_ops = {
 	.map_alloc		= sock_hash_alloc,
@@ -1211,6 +1436,7 @@ const struct bpf_map_ops sock_hash_ops = {
 	.map_check_btf		= map_check_no_btf,
 	.map_btf_name		= "bpf_shtab",
 	.map_btf_id		= &sock_hash_map_btf_id,
+	.iter_seq_info		= &sock_hash_iter_seq_info,
 };
 
 static struct sk_psock_progs *sock_map_progs(struct bpf_map *map)
@@ -1321,3 +1547,60 @@ void sock_map_close(struct sock *sk, long timeout)
 	release_sock(sk);
 	saved_close(sk, timeout);
 }
+
+static int sock_map_iter_attach_target(struct bpf_prog *prog,
+				       union bpf_iter_link_info *linfo,
+				       struct bpf_iter_aux_info *aux)
+{
+	struct bpf_map *map;
+	int err = -EINVAL;
+
+	if (!linfo->map.map_fd)
+		return -EBADF;
+
+	map = bpf_map_get_with_uref(linfo->map.map_fd);
+	if (IS_ERR(map))
+		return PTR_ERR(map);
+
+	if (map->map_type != BPF_MAP_TYPE_SOCKMAP)
+		goto put_map;
+
+	if (prog->aux->max_rdonly_access > map->key_size) {
+		err = -EACCES;
+		goto put_map;
+	}
+
+	aux->map = map;
+	return 0;
+
+put_map:
+	bpf_map_put_with_uref(map);
+	return err;
+}
+
+static void sock_map_iter_detach_target(struct bpf_iter_aux_info *aux)
+{
+	bpf_map_put_with_uref(aux->map);
+}
+
+static struct bpf_iter_reg sock_map_iter_reg = {
+	.target			= "sockmap",
+	.attach_target		= sock_map_iter_attach_target,
+	.detach_target		= sock_map_iter_detach_target,
+	.show_fdinfo		= bpf_iter_map_show_fdinfo,
+	.fill_link_info		= bpf_iter_map_fill_link_info,
+	.ctx_arg_info_size	= 2,
+	.ctx_arg_info		= {
+		{ offsetof(struct bpf_iter__sockmap, key),
+		  PTR_TO_RDONLY_BUF_OR_NULL },
+		{ offsetof(struct bpf_iter__sockmap, sk),
+		  PTR_TO_SOCKET_OR_NULL },
+	},
+	.seq_info		= &sock_map_iter_seq_info,
+};
+
+static int __init bpf_sockmap_iter_init(void)
+{
+	return bpf_iter_reg_target(&sock_map_iter_reg);
+}
+late_initcall(bpf_sockmap_iter_init);
-- 
2.25.1



* [PATCH bpf-next 2/3] selftests: bpf: Add helper to compare socket cookies
  2020-08-28  9:48 [PATCH bpf-next 0/3] Sockmap iterator Lorenz Bauer
  2020-08-28  9:48 ` [PATCH bpf-next 1/3] net: Allow iterating sockmap and sockhash Lorenz Bauer
@ 2020-08-28  9:48 ` Lorenz Bauer
  2020-08-28 10:50   ` Jakub Sitnicki
  2020-08-28 11:28   ` Jakub Sitnicki
  2020-08-28  9:48 ` [PATCH bpf-next 3/3] selftests: bpf: Test copying a sockmap via bpf_iter Lorenz Bauer
  2 siblings, 2 replies; 11+ messages in thread
From: Lorenz Bauer @ 2020-08-28  9:48 UTC (permalink / raw)
  To: ast, yhs, daniel, jakub, john.fastabend; +Cc: bpf, kernel-team, Lorenz Bauer

We compare socket cookies to ensure that insertion into a sockmap worked.
Pull this out into a helper function for use in other tests.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
---
 .../selftests/bpf/prog_tests/sockmap_basic.c  | 51 ++++++++++++++-----
 1 file changed, 37 insertions(+), 14 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
index 0b79d78b98db..b989f8760f1a 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
@@ -47,6 +47,38 @@ static int connected_socket_v4(void)
 	return -1;
 }
 
+static void compare_cookies(struct bpf_map *src, struct bpf_map *dst)
+{
+	__u32 i, max_entries = bpf_map__max_entries(src);
+	int err, duration, src_fd, dst_fd;
+
+	src_fd = bpf_map__fd(src);
+	dst_fd = bpf_map__fd(src);
+
+	for (i = 0; i < max_entries; i++) {
+		__u64 src_cookie, dst_cookie;
+
+		err = bpf_map_lookup_elem(src_fd, &i, &src_cookie);
+		if (err && errno == ENOENT) {
+			err = bpf_map_lookup_elem(dst_fd, &i, &dst_cookie);
+			if (err && errno == ENOENT)
+				continue;
+
+			CHECK(err, "map_lookup_elem(dst)", "element not deleted\n");
+			continue;
+		}
+		if (CHECK(err, "lookup_elem(src, cookie)", "%s\n", strerror(errno)))
+			continue;
+
+		err = bpf_map_lookup_elem(dst_fd, &i, &dst_cookie);
+		if (CHECK(err, "lookup_elem(dst, cookie)", "%s\n", strerror(errno)))
+			continue;
+
+		CHECK(dst_cookie != src_cookie, "cookie mismatch",
+		      "%llu != %llu (pos %u)\n", dst_cookie, src_cookie, i);
+	}
+}
+
 /* Create a map, populate it with one socket, and free the map. */
 static void test_sockmap_create_update_free(enum bpf_map_type map_type)
 {
@@ -106,9 +138,9 @@ static void test_skmsg_helpers(enum bpf_map_type map_type)
 static void test_sockmap_update(enum bpf_map_type map_type)
 {
 	struct bpf_prog_test_run_attr tattr;
-	int err, prog, src, dst, duration = 0;
+	int err, prog, src, duration = 0;
 	struct test_sockmap_update *skel;
-	__u64 src_cookie, dst_cookie;
+	struct bpf_map *dst_map;
 	const __u32 zero = 0;
 	char dummy[14] = {0};
 	__s64 sk;
@@ -124,18 +156,14 @@ static void test_sockmap_update(enum bpf_map_type map_type)
 	prog = bpf_program__fd(skel->progs.copy_sock_map);
 	src = bpf_map__fd(skel->maps.src);
 	if (map_type == BPF_MAP_TYPE_SOCKMAP)
-		dst = bpf_map__fd(skel->maps.dst_sock_map);
+		dst_map = skel->maps.dst_sock_map;
 	else
-		dst = bpf_map__fd(skel->maps.dst_sock_hash);
+		dst_map = skel->maps.dst_sock_hash;
 
 	err = bpf_map_update_elem(src, &zero, &sk, BPF_NOEXIST);
 	if (CHECK(err, "update_elem(src)", "errno=%u\n", errno))
 		goto out;
 
-	err = bpf_map_lookup_elem(src, &zero, &src_cookie);
-	if (CHECK(err, "lookup_elem(src, cookie)", "errno=%u\n", errno))
-		goto out;
-
 	tattr = (struct bpf_prog_test_run_attr){
 		.prog_fd = prog,
 		.repeat = 1,
@@ -148,12 +176,7 @@ static void test_sockmap_update(enum bpf_map_type map_type)
 		       "errno=%u retval=%u\n", errno, tattr.retval))
 		goto out;
 
-	err = bpf_map_lookup_elem(dst, &zero, &dst_cookie);
-	if (CHECK(err, "lookup_elem(dst, cookie)", "errno=%u\n", errno))
-		goto out;
-
-	CHECK(dst_cookie != src_cookie, "cookie mismatch", "%llu != %llu\n",
-	      dst_cookie, src_cookie);
+	compare_cookies(skel->maps.src, dst_map);
 
 out:
 	test_sockmap_update__destroy(skel);
-- 
2.25.1



* [PATCH bpf-next 3/3] selftests: bpf: Test copying a sockmap via bpf_iter
  2020-08-28  9:48 [PATCH bpf-next 0/3] Sockmap iterator Lorenz Bauer
  2020-08-28  9:48 ` [PATCH bpf-next 1/3] net: Allow iterating sockmap and sockhash Lorenz Bauer
  2020-08-28  9:48 ` [PATCH bpf-next 2/3] selftests: bpf: Add helper to compare socket cookies Lorenz Bauer
@ 2020-08-28  9:48 ` Lorenz Bauer
  2020-08-31 10:58   ` Jakub Sitnicki
  2 siblings, 1 reply; 11+ messages in thread
From: Lorenz Bauer @ 2020-08-28  9:48 UTC (permalink / raw)
  To: ast, yhs, daniel, jakub, john.fastabend; +Cc: bpf, kernel-team, Lorenz Bauer

Add a test that exercises a basic sockmap / sockhash copy using bpf_iter.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
---
 .../selftests/bpf/prog_tests/sockmap_basic.c  | 78 +++++++++++++++++++
 tools/testing/selftests/bpf/progs/bpf_iter.h  |  9 +++
 .../selftests/bpf/progs/bpf_iter_sockmap.c    | 50 ++++++++++++
 3 files changed, 137 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c

diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
index b989f8760f1a..386aecf1f7ff 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
@@ -6,6 +6,7 @@
 #include "test_skmsg_load_helpers.skel.h"
 #include "test_sockmap_update.skel.h"
 #include "test_sockmap_invalid_update.skel.h"
+#include "bpf_iter_sockmap.skel.h"
 
 #define TCP_REPAIR		19	/* TCP sock is under repair right now */
 
@@ -194,6 +195,79 @@ static void test_sockmap_invalid_update(void)
 		test_sockmap_invalid_update__destroy(skel);
 }
 
+static void test_sockmap_copy(enum bpf_map_type map_type)
+{
+	DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts);
+	int err, i, len, src_fd, iter_fd, num_sockets, duration;
+	struct bpf_iter_sockmap *skel;
+	struct bpf_map *src, *dst;
+	union bpf_iter_link_info linfo = {0};
+	__s64 sock_fd[2] = {-1, -1};
+	struct bpf_link *link;
+	char buf[64];
+	__u32 max_elems;
+
+	skel = bpf_iter_sockmap__open_and_load();
+	if (CHECK(!skel, "bpf_iter_sockmap__open_and_load",
+		  "skeleton open_and_load failed\n"))
+		return;
+
+	if (map_type == BPF_MAP_TYPE_SOCKMAP)
+		src = skel->maps.sockmap;
+	else
+		src = skel->maps.sockhash;
+
+	dst = skel->maps.dst;
+	src_fd = bpf_map__fd(src);
+	max_elems = bpf_map__max_entries(src);
+
+	num_sockets = ARRAY_SIZE(sock_fd);
+	for (i = 0; i < num_sockets; i++) {
+		sock_fd[i] = connected_socket_v4();
+		if (CHECK(sock_fd[i] == -1, "connected_socket_v4", "cannot connect\n"))
+			goto out;
+
+		err = bpf_map_update_elem(src_fd, &i, &sock_fd[i], BPF_NOEXIST);
+		if (CHECK(err, "map_update", "map_update failed\n"))
+			goto out;
+	}
+
+	linfo.map.map_fd = src_fd;
+	opts.link_info = &linfo;
+	opts.link_info_len = sizeof(linfo);
+	link = bpf_program__attach_iter(skel->progs.copy_sockmap, &opts);
+	if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n"))
+		goto out;
+
+	iter_fd = bpf_iter_create(bpf_link__fd(link));
+	if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n"))
+		goto free_link;
+
+	/* do some tests */
+	while ((len = read(iter_fd, buf, sizeof(buf))) > 0)
+		;
+	if (CHECK(len < 0, "read", "failed: %s\n", strerror(errno)))
+		goto close_iter;
+
+	/* test results */
+	if (CHECK(skel->bss->elems != max_elems, "elems", "got %u expected %u\n",
+		  skel->bss->elems, max_elems))
+		goto close_iter;
+
+	compare_cookies(src, dst);
+
+close_iter:
+	close(iter_fd);
+free_link:
+	bpf_link__destroy(link);
+out:
+	for (i = 0; i < num_sockets; i++) {
+		if (sock_fd[i] >= 0)
+			close(sock_fd[i]);
+	}
+	bpf_iter_sockmap__destroy(skel);
+}
+
 void test_sockmap_basic(void)
 {
 	if (test__start_subtest("sockmap create_update_free"))
@@ -210,4 +284,8 @@ void test_sockmap_basic(void)
 		test_sockmap_update(BPF_MAP_TYPE_SOCKHASH);
 	if (test__start_subtest("sockmap update in unsafe context"))
 		test_sockmap_invalid_update();
+	if (test__start_subtest("sockmap copy"))
+		test_sockmap_copy(BPF_MAP_TYPE_SOCKMAP);
+	if (test__start_subtest("sockhash copy"))
+		test_sockmap_copy(BPF_MAP_TYPE_SOCKHASH);
 }
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter.h b/tools/testing/selftests/bpf/progs/bpf_iter.h
index c196280df90d..ac32a29f5153 100644
--- a/tools/testing/selftests/bpf/progs/bpf_iter.h
+++ b/tools/testing/selftests/bpf/progs/bpf_iter.h
@@ -13,6 +13,7 @@
 #define udp6_sock udp6_sock___not_used
 #define bpf_iter__bpf_map_elem bpf_iter__bpf_map_elem___not_used
 #define bpf_iter__bpf_sk_storage_map bpf_iter__bpf_sk_storage_map___not_used
+#define bpf_iter__sockmap bpf_iter__sockmap___not_used
 #include "vmlinux.h"
 #undef bpf_iter_meta
 #undef bpf_iter__bpf_map
@@ -26,6 +27,7 @@
 #undef udp6_sock
 #undef bpf_iter__bpf_map_elem
 #undef bpf_iter__bpf_sk_storage_map
+#undef bpf_iter__sockmap
 
 struct bpf_iter_meta {
 	struct seq_file *seq;
@@ -96,3 +98,10 @@ struct bpf_iter__bpf_sk_storage_map {
 	struct sock *sk;
 	void *value;
 };
+
+struct bpf_iter__sockmap {
+	struct bpf_iter_meta *meta;
+	struct bpf_map *map;
+	void *key;
+	struct bpf_sock *sk;
+};
diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c b/tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c
new file mode 100644
index 000000000000..1b4268c9cd31
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c
@@ -0,0 +1,50 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 Cloudflare */
+#include "bpf_iter.h"
+#include "bpf_tracing_net.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+struct {
+	__uint(type, BPF_MAP_TYPE_SOCKMAP);
+	__uint(max_entries, 3);
+	__type(key, __u32);
+	__type(value, __u64);
+} sockmap SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_SOCKMAP);
+	__uint(max_entries, 3);
+	__type(key, __u32);
+	__type(value, __u64);
+} sockhash SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_SOCKHASH);
+	__uint(max_entries, 3);
+	__type(key, __u32);
+	__type(value, __u64);
+} dst SEC(".maps");
+
+__u32 elems = 0;
+
+SEC("iter/sockmap")
+int copy_sockmap(struct bpf_iter__sockmap *ctx)
+{
+	__u32 tmp, *key = ctx->key;
+	struct bpf_sock *sk = ctx->sk;
+
+	if (key == (void *)0)
+		return 0;
+
+	elems++;
+	tmp = *key;
+
+	if (sk != (void *)0)
+		return bpf_map_update_elem(&dst, &tmp, sk, 0) != 0;
+
+	bpf_map_delete_elem(&dst, &tmp);
+	return 0;
+}
-- 
2.25.1



* Re: [PATCH bpf-next 2/3] selftests: bpf: Add helper to compare socket cookies
  2020-08-28  9:48 ` [PATCH bpf-next 2/3] selftests: bpf: Add helper to compare socket cookies Lorenz Bauer
@ 2020-08-28 10:50   ` Jakub Sitnicki
  2020-08-28 15:48     ` Lorenz Bauer
  2020-08-28 11:28   ` Jakub Sitnicki
  1 sibling, 1 reply; 11+ messages in thread
From: Jakub Sitnicki @ 2020-08-28 10:50 UTC (permalink / raw)
  To: Lorenz Bauer; +Cc: ast, yhs, daniel, john.fastabend, bpf, kernel-team

On Fri, Aug 28, 2020 at 11:48 AM CEST, Lorenz Bauer wrote:
> We compare socket cookies to ensure that insertion into a sockmap worked.
> Pull this out into a helper function for use in other tests.
>
> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
> ---
>  .../selftests/bpf/prog_tests/sockmap_basic.c  | 51 ++++++++++++++-----
>  1 file changed, 37 insertions(+), 14 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> index 0b79d78b98db..b989f8760f1a 100644
> --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> @@ -47,6 +47,38 @@ static int connected_socket_v4(void)
>  	return -1;
>  }
>  
> +static void compare_cookies(struct bpf_map *src, struct bpf_map *dst)
> +{
> +	__u32 i, max_entries = bpf_map__max_entries(src);
> +	int err, duration, src_fd, dst_fd;
> +
> +	src_fd = bpf_map__fd(src);
> +	dst_fd = bpf_map__fd(src);
                             ^^^
That looks like a typo. We're comparing src map to src map.

> +
> +	for (i = 0; i < max_entries; i++) {
> +		__u64 src_cookie, dst_cookie;
> +
> +		err = bpf_map_lookup_elem(src_fd, &i, &src_cookie);
> +		if (err && errno == ENOENT) {
> +			err = bpf_map_lookup_elem(dst_fd, &i, &dst_cookie);
> +			if (err && errno == ENOENT)
> +				continue;
> +
> +			CHECK(err, "map_lookup_elem(dst)", "element not deleted\n");
> +			continue;
> +		}
> +		if (CHECK(err, "lookup_elem(src, cookie)", "%s\n", strerror(errno)))
> +			continue;
> +
> +		err = bpf_map_lookup_elem(dst_fd, &i, &dst_cookie);
> +		if (CHECK(err, "lookup_elem(dst, cookie)", "%s\n", strerror(errno)))
> +			continue;
> +
> +		CHECK(dst_cookie != src_cookie, "cookie mismatch",
> +		      "%llu != %llu (pos %u)\n", dst_cookie, src_cookie, i);
> +	}
> +}
> +
>  /* Create a map, populate it with one socket, and free the map. */
>  static void test_sockmap_create_update_free(enum bpf_map_type map_type)
>  {
> @@ -106,9 +138,9 @@ static void test_skmsg_helpers(enum bpf_map_type map_type)
>  static void test_sockmap_update(enum bpf_map_type map_type)
>  {
>  	struct bpf_prog_test_run_attr tattr;
> -	int err, prog, src, dst, duration = 0;
> +	int err, prog, src, duration = 0;
>  	struct test_sockmap_update *skel;
> -	__u64 src_cookie, dst_cookie;
> +	struct bpf_map *dst_map;
>  	const __u32 zero = 0;
>  	char dummy[14] = {0};
>  	__s64 sk;
> @@ -124,18 +156,14 @@ static void test_sockmap_update(enum bpf_map_type map_type)
>  	prog = bpf_program__fd(skel->progs.copy_sock_map);
>  	src = bpf_map__fd(skel->maps.src);
>  	if (map_type == BPF_MAP_TYPE_SOCKMAP)
> -		dst = bpf_map__fd(skel->maps.dst_sock_map);
> +		dst_map = skel->maps.dst_sock_map;
>  	else
> -		dst = bpf_map__fd(skel->maps.dst_sock_hash);
> +		dst_map = skel->maps.dst_sock_hash;
>  
>  	err = bpf_map_update_elem(src, &zero, &sk, BPF_NOEXIST);
>  	if (CHECK(err, "update_elem(src)", "errno=%u\n", errno))
>  		goto out;
>  
> -	err = bpf_map_lookup_elem(src, &zero, &src_cookie);
> -	if (CHECK(err, "lookup_elem(src, cookie)", "errno=%u\n", errno))
> -		goto out;
> -
>  	tattr = (struct bpf_prog_test_run_attr){
>  		.prog_fd = prog,
>  		.repeat = 1,
> @@ -148,12 +176,7 @@ static void test_sockmap_update(enum bpf_map_type map_type)
>  		       "errno=%u retval=%u\n", errno, tattr.retval))
>  		goto out;
>  
> -	err = bpf_map_lookup_elem(dst, &zero, &dst_cookie);
> -	if (CHECK(err, "lookup_elem(dst, cookie)", "errno=%u\n", errno))
> -		goto out;
> -
> -	CHECK(dst_cookie != src_cookie, "cookie mismatch", "%llu != %llu\n",
> -	      dst_cookie, src_cookie);
> +	compare_cookies(skel->maps.src, dst_map);
>  
>  out:
>  	test_sockmap_update__destroy(skel);



* Re: [PATCH bpf-next 2/3] selftests: bpf: Add helper to compare socket cookies
  2020-08-28  9:48 ` [PATCH bpf-next 2/3] selftests: bpf: Add helper to compare socket cookies Lorenz Bauer
  2020-08-28 10:50   ` Jakub Sitnicki
@ 2020-08-28 11:28   ` Jakub Sitnicki
  1 sibling, 0 replies; 11+ messages in thread
From: Jakub Sitnicki @ 2020-08-28 11:28 UTC (permalink / raw)
  To: Lorenz Bauer; +Cc: ast, yhs, daniel, john.fastabend, bpf, kernel-team

Sorry for the fragmented review. I should have been more thorough during
the first pass.

On Fri, Aug 28, 2020 at 11:48 AM CEST, Lorenz Bauer wrote:
> We compare socket cookies to ensure that insertion into a sockmap worked.
> Pull this out into a helper function for use in other tests.
>
> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
> ---
>  .../selftests/bpf/prog_tests/sockmap_basic.c  | 51 ++++++++++++++-----
>  1 file changed, 37 insertions(+), 14 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> index 0b79d78b98db..b989f8760f1a 100644
> --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> @@ -47,6 +47,38 @@ static int connected_socket_v4(void)
>  	return -1;
>  }
>
> +static void compare_cookies(struct bpf_map *src, struct bpf_map *dst)
> +{
> +	__u32 i, max_entries = bpf_map__max_entries(src);
> +	int err, duration, src_fd, dst_fd;
> +
> +	src_fd = bpf_map__fd(src);
> +	dst_fd = bpf_map__fd(src);
> +
> +	for (i = 0; i < max_entries; i++) {
> +		__u64 src_cookie, dst_cookie;
> +
> +		err = bpf_map_lookup_elem(src_fd, &i, &src_cookie);
> +		if (err && errno == ENOENT) {
> +			err = bpf_map_lookup_elem(dst_fd, &i, &dst_cookie);
> +			if (err && errno == ENOENT)
> +				continue;
> +
> +			CHECK(err, "map_lookup_elem(dst)", "element not deleted\n");
                              ^^^
Here we want to fail if there was no error, i.e. the lookup in dst
succeeded, or, in the unlikely case, if there was an error other than
ENOENT.

> +			continue;
> +		}
> +		if (CHECK(err, "lookup_elem(src, cookie)", "%s\n", strerror(errno)))

Nit: "lookup_elem(src)" as log tag would probably do. I don't see how
including the info that we're looking up a cookie helps.

> +			continue;
> +
> +		err = bpf_map_lookup_elem(dst_fd, &i, &dst_cookie);
> +		if (CHECK(err, "lookup_elem(dst, cookie)", "%s\n", strerror(errno)))
> +			continue;
> +
> +		CHECK(dst_cookie != src_cookie, "cookie mismatch",
> +		      "%llu != %llu (pos %u)\n", dst_cookie, src_cookie, i);

Does it actually make sense to continue comparing the entries after the
first mismatch (or lookup error)? We could just fail fast.

> +	}
> +}
> +

[...]


* Re: [PATCH bpf-next 2/3] selftests: bpf: Add helper to compare socket cookies
  2020-08-28 10:50   ` Jakub Sitnicki
@ 2020-08-28 15:48     ` Lorenz Bauer
  0 siblings, 0 replies; 11+ messages in thread
From: Lorenz Bauer @ 2020-08-28 15:48 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: Alexei Starovoitov, Yonghong Song, Daniel Borkmann,
	John Fastabend, bpf, kernel-team

On Fri, 28 Aug 2020 at 11:50, Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Fri, Aug 28, 2020 at 11:48 AM CEST, Lorenz Bauer wrote:
> > We compare socket cookies to ensure that insertion into a sockmap worked.
> > Pull this out into a helper function for use in other tests.
> >
> > Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
> > ---
> >  .../selftests/bpf/prog_tests/sockmap_basic.c  | 51 ++++++++++++++-----
> >  1 file changed, 37 insertions(+), 14 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> > index 0b79d78b98db..b989f8760f1a 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> > @@ -47,6 +47,38 @@ static int connected_socket_v4(void)
> >       return -1;
> >  }
> >
> > +static void compare_cookies(struct bpf_map *src, struct bpf_map *dst)
> > +{
> > +     __u32 i, max_entries = bpf_map__max_entries(src);
> > +     int err, duration, src_fd, dst_fd;
> > +
> > +     src_fd = bpf_map__fd(src);
> > +     dst_fd = bpf_map__fd(src);
>                              ^^^
> That looks like a typo. We're comparing src map to src map.

Oops, that's awkward! Luckily the tests still pass after fixing this.

Thanks for your other comments as well, I'll send a v2 once I have
some more reviews.

>
> > +
> > +     for (i = 0; i < max_entries; i++) {
> > +             __u64 src_cookie, dst_cookie;
> > +
> > +             err = bpf_map_lookup_elem(src_fd, &i, &src_cookie);
> > +             if (err && errno == ENOENT) {
> > +                     err = bpf_map_lookup_elem(dst_fd, &i, &dst_cookie);
> > +                     if (err && errno == ENOENT)
> > +                             continue;
> > +
> > +                     CHECK(err, "map_lookup_elem(dst)", "element not deleted\n");
> > +                     continue;
> > +             }
> > +             if (CHECK(err, "lookup_elem(src, cookie)", "%s\n", strerror(errno)))
> > +                     continue;
> > +
> > +             err = bpf_map_lookup_elem(dst_fd, &i, &dst_cookie);
> > +             if (CHECK(err, "lookup_elem(dst, cookie)", "%s\n", strerror(errno)))
> > +                     continue;
> > +
> > +             CHECK(dst_cookie != src_cookie, "cookie mismatch",
> > +                   "%llu != %llu (pos %u)\n", dst_cookie, src_cookie, i);
> > +     }
> > +}
> > +
> >  /* Create a map, populate it with one socket, and free the map. */
> >  static void test_sockmap_create_update_free(enum bpf_map_type map_type)
> >  {
> > @@ -106,9 +138,9 @@ static void test_skmsg_helpers(enum bpf_map_type map_type)
> >  static void test_sockmap_update(enum bpf_map_type map_type)
> >  {
> >       struct bpf_prog_test_run_attr tattr;
> > -     int err, prog, src, dst, duration = 0;
> > +     int err, prog, src, duration = 0;
> >       struct test_sockmap_update *skel;
> > -     __u64 src_cookie, dst_cookie;
> > +     struct bpf_map *dst_map;
> >       const __u32 zero = 0;
> >       char dummy[14] = {0};
> >       __s64 sk;
> > @@ -124,18 +156,14 @@ static void test_sockmap_update(enum bpf_map_type map_type)
> >       prog = bpf_program__fd(skel->progs.copy_sock_map);
> >       src = bpf_map__fd(skel->maps.src);
> >       if (map_type == BPF_MAP_TYPE_SOCKMAP)
> > -             dst = bpf_map__fd(skel->maps.dst_sock_map);
> > +             dst_map = skel->maps.dst_sock_map;
> >       else
> > -             dst = bpf_map__fd(skel->maps.dst_sock_hash);
> > +             dst_map = skel->maps.dst_sock_hash;
> >
> >       err = bpf_map_update_elem(src, &zero, &sk, BPF_NOEXIST);
> >       if (CHECK(err, "update_elem(src)", "errno=%u\n", errno))
> >               goto out;
> >
> > -     err = bpf_map_lookup_elem(src, &zero, &src_cookie);
> > -     if (CHECK(err, "lookup_elem(src, cookie)", "errno=%u\n", errno))
> > -             goto out;
> > -
> >       tattr = (struct bpf_prog_test_run_attr){
> >               .prog_fd = prog,
> >               .repeat = 1,
> > @@ -148,12 +176,7 @@ static void test_sockmap_update(enum bpf_map_type map_type)
> >                      "errno=%u retval=%u\n", errno, tattr.retval))
> >               goto out;
> >
> > -     err = bpf_map_lookup_elem(dst, &zero, &dst_cookie);
> > -     if (CHECK(err, "lookup_elem(dst, cookie)", "errno=%u\n", errno))
> > -             goto out;
> > -
> > -     CHECK(dst_cookie != src_cookie, "cookie mismatch", "%llu != %llu\n",
> > -           dst_cookie, src_cookie);
> > +     compare_cookies(skel->maps.src, dst_map);
> >
> >  out:
> >       test_sockmap_update__destroy(skel);
>


-- 
Lorenz Bauer  |  Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 1/3] net: Allow iterating sockmap and sockhash
  2020-08-28  9:48 ` [PATCH bpf-next 1/3] net: Allow iterating sockmap and sockhash Lorenz Bauer
@ 2020-08-31 10:03   ` Jakub Sitnicki
  2020-09-01  8:59     ` Lorenz Bauer
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Sitnicki @ 2020-08-31 10:03 UTC (permalink / raw)
  To: Lorenz Bauer; +Cc: ast, yhs, daniel, john.fastabend, bpf, kernel-team

On Fri, Aug 28, 2020 at 11:48 AM CEST, Lorenz Bauer wrote:
> Add bpf_iter support for sockmap / sockhash, based on the bpf_sk_storage and
> hashtable implementation. sockmap and sockhash share the same iteration
> context: a pointer to an arbitrary key and a pointer to a socket. Both
> pointers may be NULL, and so BPF has to perform a NULL check before accessing
> them. Technically it's not possible for sockhash iteration to yield a NULL
> socket, but we ignore this to be able to use a single iteration point.
>
> Iteration will visit all keys that remain unmodified during the lifetime of
> the iterator. It may or may not visit newly added ones.
>
> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
> ---
>  net/core/sock_map.c | 283 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 283 insertions(+)
>
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index d6c6e1e312fc..31c4332f06e4 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -703,6 +703,116 @@ const struct bpf_func_proto bpf_msg_redirect_map_proto = {
>  	.arg4_type      = ARG_ANYTHING,
>  };
>
> +struct sock_map_seq_info {
> +	struct bpf_map *map;
> +	struct sock *sk;
> +	u32 index;
> +};
> +
> +struct bpf_iter__sockmap {
> +	__bpf_md_ptr(struct bpf_iter_meta *, meta);
> +	__bpf_md_ptr(struct bpf_map *, map);
> +	__bpf_md_ptr(void *, key);
> +	__bpf_md_ptr(struct bpf_sock *, sk);
> +};
> +
> +DEFINE_BPF_ITER_FUNC(sockmap, struct bpf_iter_meta *meta,
> +		     struct bpf_map *map, void *key,
> +		     struct sock *sk)
> +
> +static void *sock_map_seq_lookup_elem(struct sock_map_seq_info *info)
> +{
> +	if (unlikely(info->index >= info->map->max_entries))
> +		return NULL;
> +
> +	info->sk = __sock_map_lookup_elem(info->map, info->index);
> +	if (!info->sk || !sk_fullsock(info->sk))

As we've talked off-line, we expect neither timewait nor request
sockets in sockmap, so the sk_fullsock() check is likely not needed.
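
If so, the lookup helper could probably shrink to something like this
(sketch; __sock_map_lookup_elem() already returns NULL for empty slots,
so no explicit NULL-ing is needed once the fullsock check goes):

```diff
 	info->sk = __sock_map_lookup_elem(info->map, info->index);
-	if (!info->sk || !sk_fullsock(info->sk))
-		info->sk = NULL;
```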

> +		info->sk = NULL;
> +
> +	/* continue iterating */
> +	return info;
> +}
> +
> +static void *sock_map_seq_start(struct seq_file *seq, loff_t *pos)
> +{
> +	struct sock_map_seq_info *info = seq->private;
> +
> +	if (*pos == 0)
> +		++*pos;
> +
> +	/* pairs with sock_map_seq_stop */
> +	rcu_read_lock();
> +	return sock_map_seq_lookup_elem(info);
> +}
> +
> +static void *sock_map_seq_next(struct seq_file *seq, void *v, loff_t *pos)
> +{
> +	struct sock_map_seq_info *info = seq->private;
> +
> +	++*pos;
> +	++info->index;
> +
> +	return sock_map_seq_lookup_elem(info);
> +}
> +
> +static int __sock_map_seq_show(struct seq_file *seq, void *v)
> +{
> +	struct sock_map_seq_info *info = seq->private;
> +	struct bpf_iter__sockmap ctx = {};
> +	struct bpf_iter_meta meta;
> +	struct bpf_prog *prog;
> +
> +	meta.seq = seq;
> +	prog = bpf_iter_get_info(&meta, !v);
> +	if (!prog)
> +		return 0;
> +
> +	ctx.meta = &meta;
> +	ctx.map = info->map;
> +	if (v) {
> +		ctx.key = &info->index;
> +		ctx.sk = (struct bpf_sock *)info->sk;
> +	}
> +
> +	return bpf_iter_run_prog(prog, &ctx);
> +}
> +
> +static int sock_map_seq_show(struct seq_file *seq, void *v)
> +{
> +	return __sock_map_seq_show(seq, v);
> +}
> +
> +static void sock_map_seq_stop(struct seq_file *seq, void *v)
> +{
> +	if (!v)
> +		(void)__sock_map_seq_show(seq, NULL);
> +
> +	/* pairs with sock_map_seq_start */
> +	rcu_read_unlock();
> +}
> +
> +static const struct seq_operations sock_map_seq_ops = {
> +	.start	= sock_map_seq_start,
> +	.next	= sock_map_seq_next,
> +	.stop	= sock_map_seq_stop,
> +	.show	= sock_map_seq_show,
> +};
> +
> +static int sock_map_init_seq_private(void *priv_data,
> +				     struct bpf_iter_aux_info *aux)
> +{
> +	struct sock_map_seq_info *info = priv_data;
> +
> +	info->map = aux->map;
> +	return 0;
> +}
> +
> +static const struct bpf_iter_seq_info sock_map_iter_seq_info = {
> +	.seq_ops		= &sock_map_seq_ops,
> +	.init_seq_private	= sock_map_init_seq_private,
> +	.seq_priv_size		= sizeof(struct sock_map_seq_info),
> +};
> +
>  static int sock_map_btf_id;
>  const struct bpf_map_ops sock_map_ops = {
>  	.map_alloc		= sock_map_alloc,

[...]

> @@ -1198,6 +1309,120 @@ const struct bpf_func_proto bpf_msg_redirect_hash_proto = {
>  	.arg4_type      = ARG_ANYTHING,
>  };
>
> +struct sock_hash_seq_info {
> +	struct bpf_map *map;
> +	struct bpf_shtab *htab;
> +	u32 bucket_id;
> +};
> +
> +static void *sock_hash_seq_find_next(struct sock_hash_seq_info *info,
> +				     struct bpf_shtab_elem *prev_elem)
> +{
> +	const struct bpf_shtab *htab = info->htab;
> +	struct bpf_shtab_bucket *bucket;
> +	struct bpf_shtab_elem *elem;
> +	struct hlist_node *node;
> +
> +	/* try to find next elem in the same bucket */
> +	if (prev_elem) {
> +		node = rcu_dereference_raw(hlist_next_rcu(&prev_elem->node));

I'm not convinced we need to go for the rcu_dereference_raw()
variant. Access happens inside a read-side critical section, which we
entered with rcu_read_lock() in sock_hash_seq_start().

That's typical and rcu_dereference() seems appropriate. Basing this on
what I read in Documentation/RCU/rcu_dereference.rst.
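
Concretely, the change being discussed would be along these lines
(sketch), and likewise for the bucket-head dereference below:

```diff
-		node = rcu_dereference_raw(hlist_next_rcu(&prev_elem->node));
+		node = rcu_dereference(hlist_next_rcu(&prev_elem->node));
```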

> +		elem = hlist_entry_safe(node, struct bpf_shtab_elem, node);
> +		if (elem)
> +			return elem;
> +
> +		/* no more elements, continue in the next bucket */
> +		info->bucket_id++;
> +	}
> +
> +	for (; info->bucket_id < htab->buckets_num; info->bucket_id++) {
> +		bucket = &htab->buckets[info->bucket_id];
> +		node = rcu_dereference_raw(hlist_first_rcu(&bucket->head));
> +		elem = hlist_entry_safe(node, struct bpf_shtab_elem, node);
> +		if (elem)
> +			return elem;
> +	}
> +
> +	return NULL;
> +}
> +
> +static void *sock_hash_seq_start(struct seq_file *seq, loff_t *pos)
> +{
> +	struct sock_hash_seq_info *info = seq->private;
> +
> +	if (*pos == 0)
> +		++*pos;
> +
> +	/* pairs with sock_hash_seq_stop */
> +	rcu_read_lock();
> +	return sock_hash_seq_find_next(info, NULL);
> +}
> +
> +static void *sock_hash_seq_next(struct seq_file *seq, void *v, loff_t *pos)
> +{
> +	struct sock_hash_seq_info *info = seq->private;
> +
> +	++*pos;
> +	return sock_hash_seq_find_next(info, v);
> +}
> +
> +static int __sock_hash_seq_show(struct seq_file *seq, struct bpf_shtab_elem *elem)
> +{
> +	struct sock_hash_seq_info *info = seq->private;
> +	struct bpf_iter__sockmap ctx = {};
> +	struct bpf_iter_meta meta;
> +	struct bpf_prog *prog;
> +
> +	meta.seq = seq;
> +	prog = bpf_iter_get_info(&meta, !elem);
> +	if (!prog)
> +		return 0;
> +
> +	ctx.meta = &meta;
> +	ctx.map = info->map;
> +	if (elem) {
> +		ctx.key = elem->key;
> +		ctx.sk = (struct bpf_sock *)elem->sk;
> +	}
> +
> +	return bpf_iter_run_prog(prog, &ctx);
> +}
> +
> +static int sock_hash_seq_show(struct seq_file *seq, void *v)
> +{
> +	return __sock_hash_seq_show(seq, v);
> +}
> +
> +static void sock_hash_seq_stop(struct seq_file *seq, void *v)
> +{
> +	if (!v)
> +		(void)__sock_hash_seq_show(seq, NULL);
> +
> +	/* pairs with sock_hash_seq_start */
> +	rcu_read_unlock();
> +}
> +
> +static const struct seq_operations sock_hash_seq_ops = {
> +	.start	= sock_hash_seq_start,
> +	.next	= sock_hash_seq_next,
> +	.stop	= sock_hash_seq_stop,
> +	.show	= sock_hash_seq_show,
> +};
> +
> +static int sock_hash_init_seq_private(void *priv_data,
> +				     struct bpf_iter_aux_info *aux)
> +{
> +	struct sock_hash_seq_info *info = priv_data;
> +
> +	info->map = aux->map;
> +	return 0;
> +}
> +
> +static const struct bpf_iter_seq_info sock_hash_iter_seq_info = {
> +	.seq_ops		= &sock_hash_seq_ops,
> +	.init_seq_private	= sock_hash_init_seq_private,
> +	.seq_priv_size		= sizeof(struct sock_hash_seq_info),
> +};
> +
>  static int sock_hash_map_btf_id;
>  const struct bpf_map_ops sock_hash_ops = {
>  	.map_alloc		= sock_hash_alloc,

[...]


* Re: [PATCH bpf-next 3/3] selftests: bpf: Test copying a sockmap via bpf_iter
  2020-08-28  9:48 ` [PATCH bpf-next 3/3] selftests: bpf: Test copying a sockmap via bpf_iter Lorenz Bauer
@ 2020-08-31 10:58   ` Jakub Sitnicki
  2020-09-01  9:20     ` Lorenz Bauer
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Sitnicki @ 2020-08-31 10:58 UTC (permalink / raw)
  To: Lorenz Bauer; +Cc: ast, yhs, daniel, john.fastabend, bpf, kernel-team

On Fri, Aug 28, 2020 at 11:48 AM CEST, Lorenz Bauer wrote:
> Add a test that exercises a basic sockmap / sockhash copy using bpf_iter.
>
> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
> ---
>  .../selftests/bpf/prog_tests/sockmap_basic.c  | 78 +++++++++++++++++++
>  tools/testing/selftests/bpf/progs/bpf_iter.h  |  9 +++
>  .../selftests/bpf/progs/bpf_iter_sockmap.c    | 50 ++++++++++++
>  3 files changed, 137 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> index b989f8760f1a..386aecf1f7ff 100644
> --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> @@ -6,6 +6,7 @@
>  #include "test_skmsg_load_helpers.skel.h"
>  #include "test_sockmap_update.skel.h"
>  #include "test_sockmap_invalid_update.skel.h"
> +#include "bpf_iter_sockmap.skel.h"
>
>  #define TCP_REPAIR		19	/* TCP sock is under repair right now */
>
> @@ -194,6 +195,79 @@ static void test_sockmap_invalid_update(void)
>  		test_sockmap_invalid_update__destroy(skel);
>  }
>
> +static void test_sockmap_copy(enum bpf_map_type map_type)
> +{
> +	DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts);
> +	int err, i, len, src_fd, iter_fd, num_sockets, duration;
> +	struct bpf_iter_sockmap *skel;
> +	struct bpf_map *src, *dst;
> +	union bpf_iter_link_info linfo = {0};
> +	__s64 sock_fd[2] = {-1, -1};

With just two sockets and sockhash max_entries set to 3 (which means 4
buckets), we're likely not exercising walking the bucket chain in the
iterator code.

How about a more generous value?

> +	struct bpf_link *link;
> +	char buf[64];
> +	__u32 max_elems;
> +
> +	skel = bpf_iter_sockmap__open_and_load();
> +	if (CHECK(!skel, "bpf_iter_sockmap__open_and_load",
> +		  "skeleton open_and_load failed\n"))
> +		return;
> +
> +	if (map_type == BPF_MAP_TYPE_SOCKMAP)
> +		src = skel->maps.sockmap;
> +	else
> +		src = skel->maps.sockhash;
> +
> +	dst = skel->maps.dst;
> +	src_fd = bpf_map__fd(src);
> +	max_elems = bpf_map__max_entries(src);
> +
> +	num_sockets = ARRAY_SIZE(sock_fd);
> +	for (i = 0; i < num_sockets; i++) {
> +		sock_fd[i] = connected_socket_v4();
> +		if (CHECK(sock_fd[i] == -1, "connected_socket_v4", "cannot connect\n"))
> +			goto out;
> +
> +		err = bpf_map_update_elem(src_fd, &i, &sock_fd[i], BPF_NOEXIST);
> +		if (CHECK(err, "map_update", "map_update failed\n"))

Nit: No need to repeat what failed in the message when the tag already
says it. In this case the message will look like:

test_sockmap_copy:FAIL:map_update map_update failed

What would be useful is to include the errno in the message. CHECK()
doesn't print it by default.

> +			goto out;
> +	}
> +
> +	linfo.map.map_fd = src_fd;
> +	opts.link_info = &linfo;
> +	opts.link_info_len = sizeof(linfo);
> +	link = bpf_program__attach_iter(skel->progs.copy_sockmap, &opts);
> +	if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n"))
> +		goto out;
> +
> +	iter_fd = bpf_iter_create(bpf_link__fd(link));
> +	if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n"))
> +		goto free_link;
> +
> +	/* do some tests */
> +	while ((len = read(iter_fd, buf, sizeof(buf))) > 0)
> +		;
> +	if (CHECK(len < 0, "read", "failed: %s\n", strerror(errno)))
> +		goto close_iter;
> +
> +	/* test results */
> +	if (CHECK(skel->bss->elems != max_elems, "elems", "got %u expected %u\n",
> +		  skel->bss->elems, max_elems))
> +		goto close_iter;
> +
> +	compare_cookies(src, dst);
> +
> +close_iter:
> +	close(iter_fd);
> +free_link:
> +	bpf_link__destroy(link);
> +out:
> +	for (i = 0; i < num_sockets; i++) {
> +		if (sock_fd[i] >= 0)
> +			close(sock_fd[i]);
> +	}
> +	bpf_iter_sockmap__destroy(skel);
> +}
> +
>  void test_sockmap_basic(void)
>  {
>  	if (test__start_subtest("sockmap create_update_free"))
> @@ -210,4 +284,8 @@ void test_sockmap_basic(void)
>  		test_sockmap_update(BPF_MAP_TYPE_SOCKHASH);
>  	if (test__start_subtest("sockmap update in unsafe context"))
>  		test_sockmap_invalid_update();
> +	if (test__start_subtest("sockmap copy"))
> +		test_sockmap_copy(BPF_MAP_TYPE_SOCKMAP);
> +	if (test__start_subtest("sockhash copy"))
> +		test_sockmap_copy(BPF_MAP_TYPE_SOCKHASH);
>  }
> diff --git a/tools/testing/selftests/bpf/progs/bpf_iter.h b/tools/testing/selftests/bpf/progs/bpf_iter.h
> index c196280df90d..ac32a29f5153 100644
> --- a/tools/testing/selftests/bpf/progs/bpf_iter.h
> +++ b/tools/testing/selftests/bpf/progs/bpf_iter.h
> @@ -13,6 +13,7 @@
>  #define udp6_sock udp6_sock___not_used
>  #define bpf_iter__bpf_map_elem bpf_iter__bpf_map_elem___not_used
>  #define bpf_iter__bpf_sk_storage_map bpf_iter__bpf_sk_storage_map___not_used
> +#define bpf_iter__sockmap bpf_iter__sockmap___not_used
>  #include "vmlinux.h"
>  #undef bpf_iter_meta
>  #undef bpf_iter__bpf_map
> @@ -26,6 +27,7 @@
>  #undef udp6_sock
>  #undef bpf_iter__bpf_map_elem
>  #undef bpf_iter__bpf_sk_storage_map
> +#undef bpf_iter__sockmap
>
>  struct bpf_iter_meta {
>  	struct seq_file *seq;
> @@ -96,3 +98,10 @@ struct bpf_iter__bpf_sk_storage_map {
>  	struct sock *sk;
>  	void *value;
>  };
> +
> +struct bpf_iter__sockmap {
> +	struct bpf_iter_meta *meta;
> +	struct bpf_map *map;
> +	void *key;
> +	struct bpf_sock *sk;
> +};
> diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c b/tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c
> new file mode 100644
> index 000000000000..1b4268c9cd31
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c
> @@ -0,0 +1,50 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2020 Cloudflare */
> +#include "bpf_iter.h"
> +#include "bpf_tracing_net.h"
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +
> +char _license[] SEC("license") = "GPL";
> +
> +struct {
> +	__uint(type, BPF_MAP_TYPE_SOCKMAP);
> +	__uint(max_entries, 3);
> +	__type(key, __u32);
> +	__type(value, __u64);
> +} sockmap SEC(".maps");
> +
> +struct {
> +	__uint(type, BPF_MAP_TYPE_SOCKMAP);
> +	__uint(max_entries, 3);
> +	__type(key, __u32);
> +	__type(value, __u64);
> +} sockhash SEC(".maps");
> +
> +struct {
> +	__uint(type, BPF_MAP_TYPE_SOCKHASH);
> +	__uint(max_entries, 3);
> +	__type(key, __u32);
> +	__type(value, __u64);
> +} dst SEC(".maps");
> +
> +__u32 elems = 0;
> +
> +SEC("iter/sockmap")
> +int copy_sockmap(struct bpf_iter__sockmap *ctx)
> +{
> +	__u32 tmp, *key = ctx->key;
> +	struct bpf_sock *sk = ctx->sk;
> +
> +	if (key == (void *)0)
> +		return 0;
> +
> +	elems++;
> +	tmp = *key;

Is the tmp variable needed? We never inspect its value directly.
Or does it illustrate that the key can be modified on copy?

> +
> +	if (sk != (void *)0)
> +		return bpf_map_update_elem(&dst, &tmp, sk, 0) != 0;
> +
> +	bpf_map_delete_elem(&dst, &tmp);

map_delete_elem() can in theory fail too. Not sure why we're ignoring
the error here.
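
If we want to propagate it, the tail of the program could become
something like this (sketch; bpf_map_delete_elem() returns a negative
errno on failure, mirroring how the update path above reports errors):

```diff
-	bpf_map_delete_elem(&dst, &tmp);
-	return 0;
+	return bpf_map_delete_elem(&dst, &tmp) != 0;
```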

> +	return 0;
> +}


* Re: [PATCH bpf-next 1/3] net: Allow iterating sockmap and sockhash
  2020-08-31 10:03   ` Jakub Sitnicki
@ 2020-09-01  8:59     ` Lorenz Bauer
  0 siblings, 0 replies; 11+ messages in thread
From: Lorenz Bauer @ 2020-09-01  8:59 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: Alexei Starovoitov, Yonghong Song, Daniel Borkmann,
	John Fastabend, bpf, kernel-team

On Mon, 31 Aug 2020 at 11:04, Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Fri, Aug 28, 2020 at 11:48 AM CEST, Lorenz Bauer wrote:
> > Add bpf_iter support for sockmap / sockhash, based on the bpf_sk_storage and
> > hashtable implementation. sockmap and sockhash share the same iteration
> > context: a pointer to an arbitrary key and a pointer to a socket. Both
> > pointers may be NULL, and so BPF has to perform a NULL check before accessing
> > them. Technically it's not possible for sockhash iteration to yield a NULL
> > socket, but we ignore this to be able to use a single iteration point.
> >
> > Iteration will visit all keys that remain unmodified during the lifetime of
> > the iterator. It may or may not visit newly added ones.
> >
> > Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
> > ---
> >  net/core/sock_map.c | 283 ++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 283 insertions(+)
> >
> > diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> > index d6c6e1e312fc..31c4332f06e4 100644
> > --- a/net/core/sock_map.c
> > +++ b/net/core/sock_map.c
> > @@ -703,6 +703,116 @@ const struct bpf_func_proto bpf_msg_redirect_map_proto = {
> >       .arg4_type      = ARG_ANYTHING,
> >  };
> >
> > +struct sock_map_seq_info {
> > +     struct bpf_map *map;
> > +     struct sock *sk;
> > +     u32 index;
> > +};
> > +
> > +struct bpf_iter__sockmap {
> > +     __bpf_md_ptr(struct bpf_iter_meta *, meta);
> > +     __bpf_md_ptr(struct bpf_map *, map);
> > +     __bpf_md_ptr(void *, key);
> > +     __bpf_md_ptr(struct bpf_sock *, sk);
> > +};
> > +
> > +DEFINE_BPF_ITER_FUNC(sockmap, struct bpf_iter_meta *meta,
> > +                  struct bpf_map *map, void *key,
> > +                  struct sock *sk)
> > +
> > +static void *sock_map_seq_lookup_elem(struct sock_map_seq_info *info)
> > +{
> > +     if (unlikely(info->index >= info->map->max_entries))
> > +             return NULL;
> > +
> > +     info->sk = __sock_map_lookup_elem(info->map, info->index);
> > +     if (!info->sk || !sk_fullsock(info->sk))
>
> As we've talked off-line, we expect neither timewait nor request
> sockets in sockmap, so the sk_fullsock() check is likely not needed.

Ack.

>
> > +             info->sk = NULL;
> > +
> > +     /* continue iterating */
> > +     return info;
> > +}
> > +
> > +static void *sock_map_seq_start(struct seq_file *seq, loff_t *pos)
> > +{
> > +     struct sock_map_seq_info *info = seq->private;
> > +
> > +     if (*pos == 0)
> > +             ++*pos;
> > +
> > +     /* pairs with sock_map_seq_stop */
> > +     rcu_read_lock();
> > +     return sock_map_seq_lookup_elem(info);
> > +}
> > +
> > +static void *sock_map_seq_next(struct seq_file *seq, void *v, loff_t *pos)
> > +{
> > +     struct sock_map_seq_info *info = seq->private;
> > +
> > +     ++*pos;
> > +     ++info->index;
> > +
> > +     return sock_map_seq_lookup_elem(info);
> > +}
> > +
> > +static int __sock_map_seq_show(struct seq_file *seq, void *v)
> > +{
> > +     struct sock_map_seq_info *info = seq->private;
> > +     struct bpf_iter__sockmap ctx = {};
> > +     struct bpf_iter_meta meta;
> > +     struct bpf_prog *prog;
> > +
> > +     meta.seq = seq;
> > +     prog = bpf_iter_get_info(&meta, !v);
> > +     if (!prog)
> > +             return 0;
> > +
> > +     ctx.meta = &meta;
> > +     ctx.map = info->map;
> > +     if (v) {
> > +             ctx.key = &info->index;
> > +             ctx.sk = (struct bpf_sock *)info->sk;
> > +     }
> > +
> > +     return bpf_iter_run_prog(prog, &ctx);
> > +}
> > +
> > +static int sock_map_seq_show(struct seq_file *seq, void *v)
> > +{
> > +     return __sock_map_seq_show(seq, v);
> > +}
> > +
> > +static void sock_map_seq_stop(struct seq_file *seq, void *v)
> > +{
> > +     if (!v)
> > +             (void)__sock_map_seq_show(seq, NULL);
> > +
> > +     /* pairs with sock_map_seq_start */
> > +     rcu_read_unlock();
> > +}
> > +
> > +static const struct seq_operations sock_map_seq_ops = {
> > +     .start  = sock_map_seq_start,
> > +     .next   = sock_map_seq_next,
> > +     .stop   = sock_map_seq_stop,
> > +     .show   = sock_map_seq_show,
> > +};
> > +
> > +static int sock_map_init_seq_private(void *priv_data,
> > +                                  struct bpf_iter_aux_info *aux)
> > +{
> > +     struct sock_map_seq_info *info = priv_data;
> > +
> > +     info->map = aux->map;
> > +     return 0;
> > +}
> > +
> > +static const struct bpf_iter_seq_info sock_map_iter_seq_info = {
> > +     .seq_ops                = &sock_map_seq_ops,
> > +     .init_seq_private       = sock_map_init_seq_private,
> > +     .seq_priv_size          = sizeof(struct sock_map_seq_info),
> > +};
> > +
> >  static int sock_map_btf_id;
> >  const struct bpf_map_ops sock_map_ops = {
> >       .map_alloc              = sock_map_alloc,
>
> [...]
>
> > @@ -1198,6 +1309,120 @@ const struct bpf_func_proto bpf_msg_redirect_hash_proto = {
> >       .arg4_type      = ARG_ANYTHING,
> >  };
> >
> > +struct sock_hash_seq_info {
> > +     struct bpf_map *map;
> > +     struct bpf_shtab *htab;
> > +     u32 bucket_id;
> > +};
> > +
> > +static void *sock_hash_seq_find_next(struct sock_hash_seq_info *info,
> > +                                  struct bpf_shtab_elem *prev_elem)
> > +{
> > +     const struct bpf_shtab *htab = info->htab;
> > +     struct bpf_shtab_bucket *bucket;
> > +     struct bpf_shtab_elem *elem;
> > +     struct hlist_node *node;
> > +
> > +     /* try to find next elem in the same bucket */
> > +     if (prev_elem) {
> > +             node = rcu_dereference_raw(hlist_next_rcu(&prev_elem->node));
>
> I'm not convinced we need to go for the rcu_dereference_raw()
> variant. Access happens inside a read-side critical section, which we
> entered with rcu_read_lock() in sock_hash_seq_start().
>
> That's typical and rcu_dereference() seems appropriate. Basing this on
> what I read in Documentation/RCU/rcu_dereference.rst.

Yeah, that makes sense to me. However, sock_hash_get_next_key also
uses rcu_dereference_raw. John, can you shed some light on why that
is? Can we replace that with plain rcu_dereference as well?

>
> > +             elem = hlist_entry_safe(node, struct bpf_shtab_elem, node);
> > +             if (elem)
> > +                     return elem;
> > +
> > +             /* no more elements, continue in the next bucket */
> > +             info->bucket_id++;
> > +     }
> > +
> > +     for (; info->bucket_id < htab->buckets_num; info->bucket_id++) {
> > +             bucket = &htab->buckets[info->bucket_id];
> > +             node = rcu_dereference_raw(hlist_first_rcu(&bucket->head));
> > +             elem = hlist_entry_safe(node, struct bpf_shtab_elem, node);
> > +             if (elem)
> > +                     return elem;
> > +     }
> > +
> > +     return NULL;
> > +}
> > +
> > +static void *sock_hash_seq_start(struct seq_file *seq, loff_t *pos)
> > +{
> > +     struct sock_hash_seq_info *info = seq->private;
> > +
> > +     if (*pos == 0)
> > +             ++*pos;
> > +
> > +     /* pairs with sock_hash_seq_stop */
> > +     rcu_read_lock();
> > +     return sock_hash_seq_find_next(info, NULL);
> > +}
> > +
> > +static void *sock_hash_seq_next(struct seq_file *seq, void *v, loff_t *pos)
> > +{
> > +     struct sock_hash_seq_info *info = seq->private;
> > +
> > +     ++*pos;
> > +     return sock_hash_seq_find_next(info, v);
> > +}
> > +
> > +static int __sock_hash_seq_show(struct seq_file *seq, struct bpf_shtab_elem *elem)
> > +{
> > +     struct sock_hash_seq_info *info = seq->private;
> > +     struct bpf_iter__sockmap ctx = {};
> > +     struct bpf_iter_meta meta;
> > +     struct bpf_prog *prog;
> > +
> > +     meta.seq = seq;
> > +     prog = bpf_iter_get_info(&meta, !elem);
> > +     if (!prog)
> > +             return 0;
> > +
> > +     ctx.meta = &meta;
> > +     ctx.map = info->map;
> > +     if (elem) {
> > +             ctx.key = elem->key;
> > +             ctx.sk = (struct bpf_sock *)elem->sk;
> > +     }
> > +
> > +     return bpf_iter_run_prog(prog, &ctx);
> > +}
> > +
> > +static int sock_hash_seq_show(struct seq_file *seq, void *v)
> > +{
> > +     return __sock_hash_seq_show(seq, v);
> > +}
> > +
> > +static void sock_hash_seq_stop(struct seq_file *seq, void *v)
> > +{
> > +     if (!v)
> > +             (void)__sock_hash_seq_show(seq, NULL);
> > +
> > +     /* pairs with sock_hash_seq_start */
> > +     rcu_read_unlock();
> > +}
> > +
> > +static const struct seq_operations sock_hash_seq_ops = {
> > +     .start  = sock_hash_seq_start,
> > +     .next   = sock_hash_seq_next,
> > +     .stop   = sock_hash_seq_stop,
> > +     .show   = sock_hash_seq_show,
> > +};
> > +
> > +static int sock_hash_init_seq_private(void *priv_data,
> > +                                  struct bpf_iter_aux_info *aux)
> > +{
> > +     struct sock_hash_seq_info *info = priv_data;
> > +
> > +     info->map = aux->map;
> > +     return 0;
> > +}
> > +
> > +static const struct bpf_iter_seq_info sock_hash_iter_seq_info = {
> > +     .seq_ops                = &sock_hash_seq_ops,
> > +     .init_seq_private       = sock_hash_init_seq_private,
> > +     .seq_priv_size          = sizeof(struct sock_hash_seq_info),
> > +};
> > +
> >  static int sock_hash_map_btf_id;
> >  const struct bpf_map_ops sock_hash_ops = {
> >       .map_alloc              = sock_hash_alloc,
>
> [...]



-- 
Lorenz Bauer  |  Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com


* Re: [PATCH bpf-next 3/3] selftests: bpf: Test copying a sockmap via bpf_iter
  2020-08-31 10:58   ` Jakub Sitnicki
@ 2020-09-01  9:20     ` Lorenz Bauer
  0 siblings, 0 replies; 11+ messages in thread
From: Lorenz Bauer @ 2020-09-01  9:20 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: Alexei Starovoitov, Yonghong Song, Daniel Borkmann,
	John Fastabend, bpf, kernel-team

On Mon, 31 Aug 2020 at 11:58, Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Fri, Aug 28, 2020 at 11:48 AM CEST, Lorenz Bauer wrote:
> > Add a test that exercises a basic sockmap / sockhash copy using bpf_iter.
> >
> > Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
> > ---
> >  .../selftests/bpf/prog_tests/sockmap_basic.c  | 78 +++++++++++++++++++
> >  tools/testing/selftests/bpf/progs/bpf_iter.h  |  9 +++
> >  .../selftests/bpf/progs/bpf_iter_sockmap.c    | 50 ++++++++++++
> >  3 files changed, 137 insertions(+)
> >  create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> > index b989f8760f1a..386aecf1f7ff 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
> > @@ -6,6 +6,7 @@
> >  #include "test_skmsg_load_helpers.skel.h"
> >  #include "test_sockmap_update.skel.h"
> >  #include "test_sockmap_invalid_update.skel.h"
> > +#include "bpf_iter_sockmap.skel.h"
> >
> >  #define TCP_REPAIR           19      /* TCP sock is under repair right now */
> >
> > @@ -194,6 +195,79 @@ static void test_sockmap_invalid_update(void)
> >               test_sockmap_invalid_update__destroy(skel);
> >  }
> >
> > +static void test_sockmap_copy(enum bpf_map_type map_type)
> > +{
> > +     DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts);
> > +     int err, i, len, src_fd, iter_fd, num_sockets, duration;
> > +     struct bpf_iter_sockmap *skel;
> > +     struct bpf_map *src, *dst;
> > +     union bpf_iter_link_info linfo = {0};
> > +     __s64 sock_fd[2] = {-1, -1};
>
> With just two sockets and sockhash max_entries set to 3 (which means 4
> buckets), we're likely not exercising walking the bucket chain in the
> iterator code.
>
> How about a more generous value?
>
> > +     struct bpf_link *link;
> > +     char buf[64];
> > +     __u32 max_elems;
> > +
> > +     skel = bpf_iter_sockmap__open_and_load();
> > +     if (CHECK(!skel, "bpf_iter_sockmap__open_and_load",
> > +               "skeleton open_and_load failed\n"))
> > +             return;
> > +
> > +     if (map_type == BPF_MAP_TYPE_SOCKMAP)
> > +             src = skel->maps.sockmap;
> > +     else
> > +             src = skel->maps.sockhash;
> > +
> > +     dst = skel->maps.dst;
> > +     src_fd = bpf_map__fd(src);
> > +     max_elems = bpf_map__max_entries(src);
> > +
> > +     num_sockets = ARRAY_SIZE(sock_fd);
> > +     for (i = 0; i < num_sockets; i++) {
> > +             sock_fd[i] = connected_socket_v4();
> > +             if (CHECK(sock_fd[i] == -1, "connected_socket_v4", "cannot connect\n"))
> > +                     goto out;
> > +
> > +             err = bpf_map_update_elem(src_fd, &i, &sock_fd[i], BPF_NOEXIST);
> > +             if (CHECK(err, "map_update", "map_update failed\n"))
>
> Nit: No need to repeat what failed in the message when the tag already
> says it. In this case the message will look like:
>
> test_sockmap_copy:FAIL:map_update map_update failed
>
> What would be useful is to include the errno in the message. CHECK()
> doesn't print it by default.

Ack.

>
> > +                     goto out;
> > +     }
> > +
> > +     linfo.map.map_fd = src_fd;
> > +     opts.link_info = &linfo;
> > +     opts.link_info_len = sizeof(linfo);
> > +     link = bpf_program__attach_iter(skel->progs.copy_sockmap, &opts);
> > +     if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n"))
> > +             goto out;
> > +
> > +     iter_fd = bpf_iter_create(bpf_link__fd(link));
> > +     if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n"))
> > +             goto free_link;
> > +
> > +     /* drain the iterator so the program visits every element */
> > +     while ((len = read(iter_fd, buf, sizeof(buf))) > 0)
> > +             ;
> > +     if (CHECK(len < 0, "read", "failed: %s\n", strerror(errno)))
> > +             goto close_iter;
> > +
> > +     /* test results */
> > +     if (CHECK(skel->bss->elems != max_elems, "elems", "got %u expected %u\n",
> > +               skel->bss->elems, max_elems))
> > +             goto close_iter;
> > +
> > +     compare_cookies(src, dst);
> > +
> > +close_iter:
> > +     close(iter_fd);
> > +free_link:
> > +     bpf_link__destroy(link);
> > +out:
> > +     for (i = 0; i < num_sockets; i++) {
> > +             if (sock_fd[i] >= 0)
> > +                     close(sock_fd[i]);
> > +     }
> > +     bpf_iter_sockmap__destroy(skel);
> > +}
> > +
> >  void test_sockmap_basic(void)
> >  {
> >       if (test__start_subtest("sockmap create_update_free"))
> > @@ -210,4 +284,8 @@ void test_sockmap_basic(void)
> >               test_sockmap_update(BPF_MAP_TYPE_SOCKHASH);
> >       if (test__start_subtest("sockmap update in unsafe context"))
> >               test_sockmap_invalid_update();
> > +     if (test__start_subtest("sockmap copy"))
> > +             test_sockmap_copy(BPF_MAP_TYPE_SOCKMAP);
> > +     if (test__start_subtest("sockhash copy"))
> > +             test_sockmap_copy(BPF_MAP_TYPE_SOCKHASH);
> >  }
> > diff --git a/tools/testing/selftests/bpf/progs/bpf_iter.h b/tools/testing/selftests/bpf/progs/bpf_iter.h
> > index c196280df90d..ac32a29f5153 100644
> > --- a/tools/testing/selftests/bpf/progs/bpf_iter.h
> > +++ b/tools/testing/selftests/bpf/progs/bpf_iter.h
> > @@ -13,6 +13,7 @@
> >  #define udp6_sock udp6_sock___not_used
> >  #define bpf_iter__bpf_map_elem bpf_iter__bpf_map_elem___not_used
> >  #define bpf_iter__bpf_sk_storage_map bpf_iter__bpf_sk_storage_map___not_used
> > +#define bpf_iter__sockmap bpf_iter__sockmap___not_used
> >  #include "vmlinux.h"
> >  #undef bpf_iter_meta
> >  #undef bpf_iter__bpf_map
> > @@ -26,6 +27,7 @@
> >  #undef udp6_sock
> >  #undef bpf_iter__bpf_map_elem
> >  #undef bpf_iter__bpf_sk_storage_map
> > +#undef bpf_iter__sockmap
> >
> >  struct bpf_iter_meta {
> >       struct seq_file *seq;
> > @@ -96,3 +98,10 @@ struct bpf_iter__bpf_sk_storage_map {
> >       struct sock *sk;
> >       void *value;
> >  };
> > +
> > +struct bpf_iter__sockmap {
> > +     struct bpf_iter_meta *meta;
> > +     struct bpf_map *map;
> > +     void *key;
> > +     struct bpf_sock *sk;
> > +};
> > diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c b/tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c
> > new file mode 100644
> > index 000000000000..1b4268c9cd31
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/bpf_iter_sockmap.c
> > @@ -0,0 +1,50 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2020 Cloudflare */
> > +#include "bpf_iter.h"
> > +#include "bpf_tracing_net.h"
> > +#include <bpf/bpf_helpers.h>
> > +#include <bpf/bpf_tracing.h>
> > +
> > +char _license[] SEC("license") = "GPL";
> > +
> > +struct {
> > +     __uint(type, BPF_MAP_TYPE_SOCKMAP);
> > +     __uint(max_entries, 3);
> > +     __type(key, __u32);
> > +     __type(value, __u64);
> > +} sockmap SEC(".maps");
> > +
> > +struct {
> > +     __uint(type, BPF_MAP_TYPE_SOCKHASH);
> > +     __uint(max_entries, 3);
> > +     __type(key, __u32);
> > +     __type(value, __u64);
> > +} sockhash SEC(".maps");
> > +
> > +struct {
> > +     __uint(type, BPF_MAP_TYPE_SOCKHASH);
> > +     __uint(max_entries, 3);
> > +     __type(key, __u32);
> > +     __type(value, __u64);
> > +} dst SEC(".maps");
> > +
> > +__u32 elems = 0;
> > +
> > +SEC("iter/sockmap")
> > +int copy_sockmap(struct bpf_iter__sockmap *ctx)
> > +{
> > +     __u32 tmp, *key = ctx->key;
> > +     struct bpf_sock *sk = ctx->sk;
> > +
> > +     if (key == (void *)0)
> > +             return 0;
> > +
> > +     elems++;
> > +     tmp = *key;
>
> Is the tmp variable needed? We never inspect its value directly.
> Or it illustrates that the key can be modified on copy?

This is a limitation of the verifier: it doesn't let us pass key to
the helper directly, so I work around this using a temp variable on
the stack. I'll add a comment.
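
Roughly this, i.e. just documenting the workaround (exact comment wording
is a sketch):

```
	/* The verifier doesn't allow passing ctx->key to
	 * bpf_map_update_elem() directly, so make a copy on the stack.
	 */
	tmp = *key;
```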

>
> > +
> > +     if (sk != (void *)0)
> > +             return bpf_map_update_elem(&dst, &tmp, sk, 0) != 0;
> > +
> > +     bpf_map_delete_elem(&dst, &tmp);
>
> map_delete_elem in theory can fail too. Not sure why we're ignoring the
> error here.

This is because we want to ignore ENOENT. I'll update the code to take
this into account.

>
> > +     return 0;
> > +}



-- 
Lorenz Bauer  |  Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com



Thread overview: 11+ messages
2020-08-28  9:48 [PATCH bpf-next 0/3] Sockmap iterator Lorenz Bauer
2020-08-28  9:48 ` [PATCH bpf-next 1/3] net: Allow iterating sockmap and sockhash Lorenz Bauer
2020-08-31 10:03   ` Jakub Sitnicki
2020-09-01  8:59     ` Lorenz Bauer
2020-08-28  9:48 ` [PATCH bpf-next 2/3] selftests: bpf: Add helper to compare socket cookies Lorenz Bauer
2020-08-28 10:50   ` Jakub Sitnicki
2020-08-28 15:48     ` Lorenz Bauer
2020-08-28 11:28   ` Jakub Sitnicki
2020-08-28  9:48 ` [PATCH bpf-next 3/3] selftests: bpf: Test copying a sockmap via bpf_iter Lorenz Bauer
2020-08-31 10:58   ` Jakub Sitnicki
2020-09-01  9:20     ` Lorenz Bauer
