linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH bpf-next v2 0/6]  bpf: add BPF_MAP_DUMP command to
@ 2019-06-27 20:24 Brian Vazquez
  2019-06-27 20:24 ` [RFC PATCH bpf-next v2 1/6] bpf: add bpf_map_value_size and bp_map_copy_value helper functions Brian Vazquez
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Brian Vazquez @ 2019-06-27 20:24 UTC (permalink / raw)
  To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, David S . Miller
  Cc: Stanislav Fomichev, Willem de Bruijn, Petar Penkov, linux-kernel,
	netdev, bpf, Brian Vazquez

This introduces a new command to retrieve a variable number of entries
from a bpf map.

This new command can be executed from the existing BPF syscall as
follows:

err = bpf(BPF_MAP_DUMP, union bpf_attr *attr, u32 size)
using attr->dump.map_fd, attr->dump.prev_key, attr->dump.buf and
attr->dump.buf_len.
It returns zero or a negative error, and populates buf and buf_len on
success.
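
A minimal usage sketch from user space (not part of the patches; the
fields follow the dump struct added in patch 2/6, and map_fd, key_size
and value_size are placeholders for illustration):

  union bpf_attr attr = {};
  __u32 buf_len = 64 * (key_size + value_size); /* room for 64 entries */
  void *buf = malloc(buf_len);
  int err;

  attr.dump.map_fd   = map_fd;
  attr.dump.prev_key = 0;    /* NULL: start from the first entry */
  attr.dump.buf      = (__u64)(unsigned long)buf;
  attr.dump.buf_len  = (__u64)(unsigned long)&buf_len;

  err = syscall(__NR_bpf, BPF_MAP_DUMP, &attr, sizeof(attr));
  /* on success, buf_len is updated with the number of bytes copied,
   * laid out as consecutive (key, value) pairs */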

This implementation wraps the existing bpf methods map_get_next_key and
map_lookup_elem. The results show that even with a buffer of a single
elem_size, it runs ~40% faster than the current interface, and
improvements of ~85% are reported when the buffer size is increased.
Once the buffer holds around 5% of the total number of entries,
increasing it further makes little difference.

Tested:
Tried buffers of different sizes to cover the cases where the bulk is
bigger than, or smaller than, the number of existing elements; all runs
read a map of 100K entries. Below are the results (in ns) from the
different runs:

buf_len_1:       55528939  entry-by-entry: 97244981  improvement: 42.897887%
buf_len_2:       34425779  entry-by-entry: 88863122  improvement: 61.259769%
buf_len_230:     11700316  entry-by-entry: 88753301  improvement: 86.817036%
buf_len_5000:    11615290  entry-by-entry: 88362637  improvement: 86.854976%
buf_len_73000:   12083976  entry-by-entry: 89956483  improvement: 86.566865%
buf_len_100000:  12638913  entry-by-entry: 89642303  improvement: 85.900727%
buf_len_234567:  11873964  entry-by-entry: 89080077  improvement: 86.670461%

Changelog:

v2:
- use proper bpf-next tag

Brian Vazquez (6):
  bpf: add bpf_map_value_size and bp_map_copy_value helper functions
  bpf: add BPF_MAP_DUMP command to access more than one entry per call
  bpf: keep bpf.h in sync with tools/
  libbpf: support BPF_MAP_DUMP command
  selftests/bpf: test BPF_MAP_DUMP command on a bpf hashmap
  selftests/bpf: add test to measure performance of BPF_MAP_DUMP

 include/uapi/linux/bpf.h                |   9 +
 kernel/bpf/syscall.c                    | 242 ++++++++++++++++++------
 tools/include/uapi/linux/bpf.h          |   9 +
 tools/lib/bpf/bpf.c                     |  28 +++
 tools/lib/bpf/bpf.h                     |   4 +
 tools/lib/bpf/libbpf.map                |   2 +
 tools/testing/selftests/bpf/test_maps.c | 141 +++++++++++++-
 7 files changed, 372 insertions(+), 63 deletions(-)

-- 
2.22.0.410.gd8fdbe21b5-goog


* [RFC PATCH bpf-next v2 1/6] bpf: add bpf_map_value_size and bp_map_copy_value helper functions
  2019-06-27 20:24 [RFC PATCH bpf-next v2 0/6] bpf: add BPF_MAP_DUMP command to Brian Vazquez
@ 2019-06-27 20:24 ` Brian Vazquez
  2019-06-27 20:24 ` [RFC PATCH bpf-next v2 2/6] bpf: add BPF_MAP_DUMP command to access more than one entry per call Brian Vazquez
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Brian Vazquez @ 2019-06-27 20:24 UTC (permalink / raw)
  To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, David S . Miller
  Cc: Stanislav Fomichev, Willem de Bruijn, Petar Penkov, linux-kernel,
	netdev, bpf, Brian Vazquez

Move reusable code from map_lookup_elem into helper functions to avoid
code duplication in kernel/bpf/syscall.c.

Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
---
 kernel/bpf/syscall.c | 134 +++++++++++++++++++++++--------------------
 1 file changed, 73 insertions(+), 61 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 7713cf39795a4..a1823a50f9be0 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -126,6 +126,76 @@ static struct bpf_map *find_and_alloc_map(union bpf_attr *attr)
 	return map;
 }
 
+static u32 bpf_map_value_size(struct bpf_map *map)
+{
+	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
+	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH ||
+	    map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY ||
+	    map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE)
+		return round_up(map->value_size, 8) * num_possible_cpus();
+	else if (IS_FD_MAP(map))
+		return sizeof(u32);
+	else
+		return  map->value_size;
+}
+
+static int bpf_map_copy_value(struct bpf_map *map, void *key, void *value,
+			      __u64 flags)
+{
+	void *ptr;
+	int err;
+
+	if (bpf_map_is_dev_bound(map))
+		return  bpf_map_offload_lookup_elem(map, key, value);
+
+	preempt_disable();
+	this_cpu_inc(bpf_prog_active);
+	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
+	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
+		err = bpf_percpu_hash_copy(map, key, value);
+	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
+		err = bpf_percpu_array_copy(map, key, value);
+	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) {
+		err = bpf_percpu_cgroup_storage_copy(map, key, value);
+	} else if (map->map_type == BPF_MAP_TYPE_STACK_TRACE) {
+		err = bpf_stackmap_copy(map, key, value);
+	} else if (IS_FD_ARRAY(map)) {
+		err = bpf_fd_array_map_lookup_elem(map, key, value);
+	} else if (IS_FD_HASH(map)) {
+		err = bpf_fd_htab_map_lookup_elem(map, key, value);
+	} else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) {
+		err = bpf_fd_reuseport_array_lookup_elem(map, key, value);
+	} else if (map->map_type == BPF_MAP_TYPE_QUEUE ||
+		   map->map_type == BPF_MAP_TYPE_STACK) {
+		err = map->ops->map_peek_elem(map, value);
+	} else {
+		rcu_read_lock();
+		if (map->ops->map_lookup_elem_sys_only)
+			ptr = map->ops->map_lookup_elem_sys_only(map, key);
+		else
+			ptr = map->ops->map_lookup_elem(map, key);
+		if (IS_ERR(ptr)) {
+			err = PTR_ERR(ptr);
+		} else if (!ptr) {
+			err = -ENOENT;
+		} else {
+			err = 0;
+			if (flags & BPF_F_LOCK)
+				/* lock 'ptr' and copy everything but lock */
+				copy_map_value_locked(map, value, ptr, true);
+			else
+				copy_map_value(map, value, ptr);
+			/* mask lock, since value wasn't zero inited */
+			check_and_init_map_lock(map, value);
+		}
+		rcu_read_unlock();
+	}
+	this_cpu_dec(bpf_prog_active);
+	preempt_enable();
+
+	return err;
+}
+
 void *bpf_map_area_alloc(size_t size, int numa_node)
 {
 	/* We really just want to fail instead of triggering OOM killer
@@ -729,7 +799,7 @@ static int map_lookup_elem(union bpf_attr *attr)
 	void __user *uvalue = u64_to_user_ptr(attr->value);
 	int ufd = attr->map_fd;
 	struct bpf_map *map;
-	void *key, *value, *ptr;
+	void *key, *value;
 	u32 value_size;
 	struct fd f;
 	int err;
@@ -761,72 +831,14 @@ static int map_lookup_elem(union bpf_attr *attr)
 		goto err_put;
 	}
 
-	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
-	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH ||
-	    map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY ||
-	    map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE)
-		value_size = round_up(map->value_size, 8) * num_possible_cpus();
-	else if (IS_FD_MAP(map))
-		value_size = sizeof(u32);
-	else
-		value_size = map->value_size;
+	value_size = bpf_map_value_size(map);
 
 	err = -ENOMEM;
 	value = kmalloc(value_size, GFP_USER | __GFP_NOWARN);
 	if (!value)
 		goto free_key;
 
-	if (bpf_map_is_dev_bound(map)) {
-		err = bpf_map_offload_lookup_elem(map, key, value);
-		goto done;
-	}
-
-	preempt_disable();
-	this_cpu_inc(bpf_prog_active);
-	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
-	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
-		err = bpf_percpu_hash_copy(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
-		err = bpf_percpu_array_copy(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) {
-		err = bpf_percpu_cgroup_storage_copy(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_STACK_TRACE) {
-		err = bpf_stackmap_copy(map, key, value);
-	} else if (IS_FD_ARRAY(map)) {
-		err = bpf_fd_array_map_lookup_elem(map, key, value);
-	} else if (IS_FD_HASH(map)) {
-		err = bpf_fd_htab_map_lookup_elem(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) {
-		err = bpf_fd_reuseport_array_lookup_elem(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_QUEUE ||
-		   map->map_type == BPF_MAP_TYPE_STACK) {
-		err = map->ops->map_peek_elem(map, value);
-	} else {
-		rcu_read_lock();
-		if (map->ops->map_lookup_elem_sys_only)
-			ptr = map->ops->map_lookup_elem_sys_only(map, key);
-		else
-			ptr = map->ops->map_lookup_elem(map, key);
-		if (IS_ERR(ptr)) {
-			err = PTR_ERR(ptr);
-		} else if (!ptr) {
-			err = -ENOENT;
-		} else {
-			err = 0;
-			if (attr->flags & BPF_F_LOCK)
-				/* lock 'ptr' and copy everything but lock */
-				copy_map_value_locked(map, value, ptr, true);
-			else
-				copy_map_value(map, value, ptr);
-			/* mask lock, since value wasn't zero inited */
-			check_and_init_map_lock(map, value);
-		}
-		rcu_read_unlock();
-	}
-	this_cpu_dec(bpf_prog_active);
-	preempt_enable();
-
-done:
+	err = bpf_map_copy_value(map, key, value, attr->flags);
 	if (err)
 		goto free_value;
 
-- 
2.22.0.410.gd8fdbe21b5-goog


* [RFC PATCH bpf-next v2 2/6] bpf: add BPF_MAP_DUMP command to access more than one entry per call
  2019-06-27 20:24 [RFC PATCH bpf-next v2 0/6] bpf: add BPF_MAP_DUMP command to Brian Vazquez
  2019-06-27 20:24 ` [RFC PATCH bpf-next v2 1/6] bpf: add bpf_map_value_size and bp_map_copy_value helper functions Brian Vazquez
@ 2019-06-27 20:24 ` Brian Vazquez
  2019-06-27 22:12   ` Alexei Starovoitov
  2019-06-27 20:24 ` [RFC PATCH bpf-next v2 3/6] bpf: keep bpf.h in sync with tools/ Brian Vazquez
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Brian Vazquez @ 2019-06-27 20:24 UTC (permalink / raw)
  To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, David S . Miller
  Cc: Stanislav Fomichev, Willem de Bruijn, Petar Penkov, linux-kernel,
	netdev, bpf, Brian Vazquez

This introduces a new command to retrieve a variable number of entries
from a bpf map, wrapping the existing bpf methods map_get_next_key and
map_lookup_elem.

Note that map_dump doesn't guarantee that reading the entire table is
consistent, since this function is always racing with kernel and user
code, but the same behaviour is found when the entire table is walked
using the current interfaces: map_get_next_key + map_lookup_elem.
It is also important to note that when a locked map is provided it is
consistent only for one entry at a time, meaning that the returned buf
might or might not be consistent across entries.

Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
---
 include/uapi/linux/bpf.h |   9 ++++
 kernel/bpf/syscall.c     | 108 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 117 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index b077507efa3f3..1d753958874df 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -106,6 +106,7 @@ enum bpf_cmd {
 	BPF_TASK_FD_QUERY,
 	BPF_MAP_LOOKUP_AND_DELETE_ELEM,
 	BPF_MAP_FREEZE,
+	BPF_MAP_DUMP,
 };
 
 enum bpf_map_type {
@@ -385,6 +386,14 @@ union bpf_attr {
 		__u64		flags;
 	};
 
+	struct { /* struct used by BPF_MAP_DUMP command */
+		__u32		map_fd;
+		__aligned_u64	prev_key;
+		__aligned_u64	buf;
+		__aligned_u64	buf_len; /* input/output: len of buf */
+		__u64		flags;
+	} dump;
+
 	struct { /* anonymous struct used by BPF_PROG_LOAD command */
 		__u32		prog_type;	/* one of enum bpf_prog_type */
 		__u32		insn_cnt;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index a1823a50f9be0..7653346b5cfd1 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1097,6 +1097,111 @@ static int map_get_next_key(union bpf_attr *attr)
 	return err;
 }
 
+/* last field in 'union bpf_attr' used by this command */
+#define BPF_MAP_DUMP_LAST_FIELD dump.buf_len
+
+static int map_dump(union bpf_attr *attr)
+{
+	void __user *ukey = u64_to_user_ptr(attr->dump.prev_key);
+	void __user *ubuf = u64_to_user_ptr(attr->dump.buf);
+	u32 __user *ubuf_len = u64_to_user_ptr(attr->dump.buf_len);
+	int ufd = attr->dump.map_fd;
+	struct bpf_map *map;
+	void *buf, *prev_key, *key, *value;
+	u32 value_size, elem_size, buf_len, cp_len;
+	struct fd f;
+	int err;
+
+	if (CHECK_ATTR(BPF_MAP_DUMP))
+		return -EINVAL;
+
+	attr->flags = 0;
+	if (attr->dump.flags & ~BPF_F_LOCK)
+		return -EINVAL;
+
+	f = fdget(ufd);
+	map = __bpf_map_get(f);
+	if (IS_ERR(map))
+		return PTR_ERR(map);
+	if (!(map_get_sys_perms(map, f) & FMODE_CAN_READ)) {
+		err = -EPERM;
+		goto err_put;
+	}
+
+	if ((attr->dump.flags & BPF_F_LOCK) &&
+	    !map_value_has_spin_lock(map)) {
+		err = -EINVAL;
+		goto err_put;
+	}
+
+	if (map->map_type == BPF_MAP_TYPE_QUEUE ||
+	    map->map_type == BPF_MAP_TYPE_STACK) {
+		err = -ENOTSUPP;
+		goto err_put;
+	}
+
+	value_size = bpf_map_value_size(map);
+
+	err = get_user(buf_len, ubuf_len);
+	if (err)
+		goto err_put;
+
+	elem_size = map->key_size + value_size;
+	if (buf_len < elem_size) {
+		err = -EINVAL;
+		goto err_put;
+	}
+
+	if (ukey) {
+		prev_key = __bpf_copy_key(ukey, map->key_size);
+		if (IS_ERR(prev_key)) {
+			err = PTR_ERR(prev_key);
+			goto err_put;
+		}
+	} else {
+		prev_key = NULL;
+	}
+
+	err = -ENOMEM;
+	buf = kmalloc(elem_size, GFP_USER | __GFP_NOWARN);
+	if (!buf)
+		goto err_put;
+
+	key = buf;
+	value = key + map->key_size;
+	for (cp_len = 0;  cp_len + elem_size <= buf_len ; cp_len += elem_size) {
+next:
+		if (signal_pending(current)) {
+			err = -EINTR;
+			break;
+		}
+
+		rcu_read_lock();
+		err = map->ops->map_get_next_key(map, prev_key, key);
+		rcu_read_unlock();
+
+		if (err)
+			break;
+
+		if (bpf_map_copy_value(map, key, value, attr->dump.flags))
+			goto next;
+
+		if (copy_to_user(ubuf + cp_len, buf, elem_size))
+			break;
+
+		prev_key = key;
+	}
+
+	if (cp_len)
+		err = 0;
+	if (copy_to_user(ubuf_len, &cp_len, sizeof(cp_len)))
+		err = -EFAULT;
+	kfree(buf);
+err_put:
+	fdput(f);
+	return err;
+}
+
 #define BPF_MAP_LOOKUP_AND_DELETE_ELEM_LAST_FIELD value
 
 static int map_lookup_and_delete_elem(union bpf_attr *attr)
@@ -2891,6 +2996,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	case BPF_MAP_LOOKUP_AND_DELETE_ELEM:
 		err = map_lookup_and_delete_elem(&attr);
 		break;
+	case BPF_MAP_DUMP:
+		err = map_dump(&attr);
+		break;
 	default:
 		err = -EINVAL;
 		break;
-- 
2.22.0.410.gd8fdbe21b5-goog


* [RFC PATCH bpf-next v2 3/6] bpf: keep bpf.h in sync with tools/
  2019-06-27 20:24 [RFC PATCH bpf-next v2 0/6] bpf: add BPF_MAP_DUMP command to Brian Vazquez
  2019-06-27 20:24 ` [RFC PATCH bpf-next v2 1/6] bpf: add bpf_map_value_size and bp_map_copy_value helper functions Brian Vazquez
  2019-06-27 20:24 ` [RFC PATCH bpf-next v2 2/6] bpf: add BPF_MAP_DUMP command to access more than one entry per call Brian Vazquez
@ 2019-06-27 20:24 ` Brian Vazquez
  2019-06-27 20:24 ` [RFC PATCH bpf-next v2 4/6] libbpf: support BPF_MAP_DUMP command Brian Vazquez
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Brian Vazquez @ 2019-06-27 20:24 UTC (permalink / raw)
  To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, David S . Miller
  Cc: Stanislav Fomichev, Willem de Bruijn, Petar Penkov, linux-kernel,
	netdev, bpf, Brian Vazquez

Sync the new bpf_attr.dump struct and the BPF_MAP_DUMP command into the
tools/ copy of the uapi bpf.h so libbpf can use them.

Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
---
 tools/include/uapi/linux/bpf.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index b077507efa3f3..1d753958874df 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -106,6 +106,7 @@ enum bpf_cmd {
 	BPF_TASK_FD_QUERY,
 	BPF_MAP_LOOKUP_AND_DELETE_ELEM,
 	BPF_MAP_FREEZE,
+	BPF_MAP_DUMP,
 };
 
 enum bpf_map_type {
@@ -385,6 +386,14 @@ union bpf_attr {
 		__u64		flags;
 	};
 
+	struct { /* struct used by BPF_MAP_DUMP command */
+		__u32		map_fd;
+		__aligned_u64	prev_key;
+		__aligned_u64	buf;
+		__aligned_u64	buf_len; /* input/output: len of buf */
+		__u64		flags;
+	} dump;
+
 	struct { /* anonymous struct used by BPF_PROG_LOAD command */
 		__u32		prog_type;	/* one of enum bpf_prog_type */
 		__u32		insn_cnt;
-- 
2.22.0.410.gd8fdbe21b5-goog


* [RFC PATCH bpf-next v2 4/6] libbpf: support BPF_MAP_DUMP command
  2019-06-27 20:24 [RFC PATCH bpf-next v2 0/6] bpf: add BPF_MAP_DUMP command to Brian Vazquez
                   ` (2 preceding siblings ...)
  2019-06-27 20:24 ` [RFC PATCH bpf-next v2 3/6] bpf: keep bpf.h in sync with tools/ Brian Vazquez
@ 2019-06-27 20:24 ` Brian Vazquez
  2019-06-27 20:24 ` [RFC PATCH bpf-next v2 5/6] selftests/bpf: test BPF_MAP_DUMP command on a bpf hashmap Brian Vazquez
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Brian Vazquez @ 2019-06-27 20:24 UTC (permalink / raw)
  To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, David S . Miller
  Cc: Stanislav Fomichev, Willem de Bruijn, Petar Penkov, linux-kernel,
	netdev, bpf, Brian Vazquez

Make libbpf aware of the new BPF_MAP_DUMP command and add the
bpf_map_dump and bpf_map_dump_flags wrappers so it can be used from the
library.
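
A hedged sketch of how a caller might drain a whole hashmap with the
new helper (the 8-byte key/value sizes and the way prev_key is taken
from the last copied element mirror the selftest in patch 5/6 and are
illustrative only):

  const int elem_size = sizeof(__u64) + sizeof(__u64);
  __u32 buf_len = 32 * elem_size;     /* read up to 32 entries per call */
  void *buf = malloc(buf_len);
  void *prev_key = NULL;
  __u64 last_key;

  for (;;) {
          __u32 len = buf_len;

          /* returns -1 with errno == ENOENT once the map is exhausted */
          if (bpf_map_dump(map_fd, prev_key, buf, &len))
                  break;
          /* len holds the bytes copied: consecutive (key, value) pairs */
          last_key = *(__u64 *)(buf + len - elem_size);
          prev_key = &last_key;
  }
  free(buf);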

Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
---
 tools/lib/bpf/bpf.c      | 28 ++++++++++++++++++++++++++++
 tools/lib/bpf/bpf.h      |  4 ++++
 tools/lib/bpf/libbpf.map |  2 ++
 3 files changed, 34 insertions(+)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index c7d7993c44bb0..c1139b7db756a 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -368,6 +368,34 @@ int bpf_map_update_elem(int fd, const void *key, const void *value,
 	return sys_bpf(BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr));
 }
 
+int bpf_map_dump(int fd, const void *prev_key, void *buf, void *buf_len)
+{
+	union bpf_attr attr;
+
+	memset(&attr, 0, sizeof(attr));
+	attr.dump.map_fd = fd;
+	attr.dump.prev_key = ptr_to_u64(prev_key);
+	attr.dump.buf = ptr_to_u64(buf);
+	attr.dump.buf_len = ptr_to_u64(buf_len);
+
+	return sys_bpf(BPF_MAP_DUMP, &attr, sizeof(attr));
+}
+
+int bpf_map_dump_flags(int fd, const void *prev_key, void *buf, void *buf_len,
+		       __u64 flags)
+{
+	union bpf_attr attr;
+
+	memset(&attr, 0, sizeof(attr));
+	attr.dump.map_fd = fd;
+	attr.dump.prev_key = ptr_to_u64(prev_key);
+	attr.dump.buf = ptr_to_u64(buf);
+	attr.dump.buf_len = ptr_to_u64(buf_len);
+	attr.dump.flags = flags;
+
+	return sys_bpf(BPF_MAP_DUMP, &attr, sizeof(attr));
+}
+
 int bpf_map_lookup_elem(int fd, const void *key, void *value)
 {
 	union bpf_attr attr;
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index ff42ca043dc8f..86496443440e9 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -112,6 +112,10 @@ LIBBPF_API int bpf_verify_program(enum bpf_prog_type type,
 LIBBPF_API int bpf_map_update_elem(int fd, const void *key, const void *value,
 				   __u64 flags);
 
+LIBBPF_API int bpf_map_dump(int fd, const void *prev_key, void *buf,
+				void *buf_len);
+LIBBPF_API int bpf_map_dump_flags(int fd, const void *prev_key, void *buf,
+				void *buf_len, __u64 flags);
 LIBBPF_API int bpf_map_lookup_elem(int fd, const void *key, void *value);
 LIBBPF_API int bpf_map_lookup_elem_flags(int fd, const void *key, void *value,
 					 __u64 flags);
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 2c6d835620d25..e7641773cfb0f 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -173,4 +173,6 @@ LIBBPF_0.0.4 {
 		btf__parse_elf;
 		bpf_object__load_xattr;
 		libbpf_num_possible_cpus;
+		bpf_map_dump;
+		bpf_map_dump_flags;
 } LIBBPF_0.0.3;
-- 
2.22.0.410.gd8fdbe21b5-goog


* [RFC PATCH bpf-next v2 5/6] selftests/bpf: test BPF_MAP_DUMP command on a bpf hashmap
  2019-06-27 20:24 [RFC PATCH bpf-next v2 0/6] bpf: add BPF_MAP_DUMP command to Brian Vazquez
                   ` (3 preceding siblings ...)
  2019-06-27 20:24 ` [RFC PATCH bpf-next v2 4/6] libbpf: support BPF_MAP_DUMP command Brian Vazquez
@ 2019-06-27 20:24 ` Brian Vazquez
  2019-06-27 20:24 ` [RFC PATCH bpf-next v2 6/6] selftests/bpf: add test to measure performance of BPF_MAP_DUMP Brian Vazquez
  2019-06-27 22:14 ` [RFC PATCH bpf-next v2 0/6] bpf: add BPF_MAP_DUMP command to Alexei Starovoitov
  6 siblings, 0 replies; 11+ messages in thread
From: Brian Vazquez @ 2019-06-27 20:24 UTC (permalink / raw)
  To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, David S . Miller
  Cc: Stanislav Fomichev, Willem de Bruijn, Petar Penkov, linux-kernel,
	netdev, bpf, Brian Vazquez

This test exercises the new command on a bpf hashmap and makes sure it
works as expected.

Signed-off-by: Brian Vazquez <brianvv@google.com>
---
 tools/testing/selftests/bpf/test_maps.c | 70 ++++++++++++++++++++++++-
 1 file changed, 68 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
index a3fbc571280a9..3df72b46fd1d9 100644
--- a/tools/testing/selftests/bpf/test_maps.c
+++ b/tools/testing/selftests/bpf/test_maps.c
@@ -309,6 +309,73 @@ static void test_hashmap_walk(unsigned int task, void *data)
 	close(fd);
 }
 
+static void test_hashmap_dump(void)
+{
+	int fd, i, max_entries = 3;
+	uint64_t keys[max_entries], values[max_entries];
+	uint64_t key, value, next_key;
+	bool next_key_valid = true;
+	void *buf, *elem, *prev_key;
+	u32 buf_len;
+	const int elem_size = sizeof(key) + sizeof(value);
+
+	fd = helper_fill_hashmap(max_entries);
+
+	// Get the elements in the hashmap, and store them in that order
+	assert(bpf_map_get_next_key(fd, NULL, &key) == 0);
+	i = 0;
+	keys[i] = key;
+	for (i = 1; next_key_valid; i++) {
+		next_key_valid = bpf_map_get_next_key(fd, &key, &next_key) == 0;
+		assert(bpf_map_lookup_elem(fd, &key, &values[i - 1]) == 0);
+		keys[i-1] = key;
+		key = next_key;
+	}
+
+	// Alloc memory for the whole table
+	buf = malloc(elem_size * max_entries);
+	assert(buf != NULL);
+
+	// Check that buf_len < elem_size returns EINVAL
+	buf_len = elem_size-1;
+	errno = 0;
+	assert(bpf_map_dump(fd, NULL, buf, &buf_len) == -1 && errno == EINVAL);
+
+	// Check that it returns the first two elements
+	errno = 0;
+	buf_len = elem_size * 2;
+	prev_key = NULL;
+	i = 0;
+	assert(bpf_map_dump(fd, prev_key, buf, &buf_len) == 0 &&
+	       buf_len == 2*elem_size);
+	elem = buf;
+	assert((*(uint64_t *)elem) == keys[i] &&
+	       (*(uint64_t *)(elem + sizeof(key))) == values[i]);
+	elem = buf + elem_size;
+	i++;
+	assert((*(uint64_t *)elem) == keys[i] &&
+	       (*(uint64_t *)(elem + sizeof(key))) == values[i]);
+	i++;
+
+	/* Continue reading from map and verify buf_len only contains 1 element
+	 * even though buf_len is 2 elem_size.
+	 */
+	prev_key = elem;
+	assert(bpf_map_dump(fd, prev_key, buf, &buf_len) == 0 &&
+	       buf_len == elem_size);
+	elem = buf;
+	assert((*(uint64_t *)elem) == keys[i] &&
+	       (*(uint64_t *)(elem + sizeof(key))) == values[i]);
+
+	// Check that there are no more entries after last_key
+	prev_key = &keys[i];
+	assert(bpf_map_dump(fd, prev_key, buf, &buf_len) == -1 &&
+	       errno == ENOENT);
+
+	free(buf);
+	close(fd);
+}
+
 static void test_hashmap_zero_seed(void)
 {
 	int i, first, second, old_flags;
@@ -1668,6 +1735,7 @@ static void run_all_tests(void)
 	test_hashmap_percpu(0, NULL);
 	test_hashmap_walk(0, NULL);
 	test_hashmap_zero_seed();
+	test_hashmap_dump();
 
 	test_arraymap(0, NULL);
 	test_arraymap_percpu(0, NULL);
@@ -1705,11 +1773,9 @@ int main(void)
 
 	map_flags = BPF_F_NO_PREALLOC;
 	run_all_tests();
-
 #define CALL
 #include <map_tests/tests.h>
 #undef CALL
-
 	printf("test_maps: OK, %d SKIPPED\n", skips);
 	return 0;
 }
-- 
2.22.0.410.gd8fdbe21b5-goog


* [RFC PATCH bpf-next v2 6/6] selftests/bpf: add test to measure performance of BPF_MAP_DUMP
  2019-06-27 20:24 [RFC PATCH bpf-next v2 0/6] bpf: add BPF_MAP_DUMP command to Brian Vazquez
                   ` (4 preceding siblings ...)
  2019-06-27 20:24 ` [RFC PATCH bpf-next v2 5/6] selftests/bpf: test BPF_MAP_DUMP command on a bpf hashmap Brian Vazquez
@ 2019-06-27 20:24 ` Brian Vazquez
  2019-06-27 22:14 ` [RFC PATCH bpf-next v2 0/6] bpf: add BPF_MAP_DUMP command to Alexei Starovoitov
  6 siblings, 0 replies; 11+ messages in thread
From: Brian Vazquez @ 2019-06-27 20:24 UTC (permalink / raw)
  To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, David S . Miller
  Cc: Stanislav Fomichev, Willem de Bruijn, Petar Penkov, linux-kernel,
	netdev, bpf, Brian Vazquez

This test compares the amount of time it takes to read an entire table
of 100K elements from a bpf hashmap using both BPF_MAP_DUMP and
BPF_MAP_GET_NEXT_KEY + BPF_MAP_LOOKUP_ELEM.

Signed-off-by: Brian Vazquez <brianvv@google.com>
---
 tools/testing/selftests/bpf/test_maps.c | 71 +++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
index 3df72b46fd1d9..61050272c20ee 100644
--- a/tools/testing/selftests/bpf/test_maps.c
+++ b/tools/testing/selftests/bpf/test_maps.c
@@ -18,6 +18,7 @@
 #include <sys/socket.h>
 #include <netinet/in.h>
 #include <linux/bpf.h>
+#include <linux/time64.h>
 
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
@@ -376,6 +377,75 @@ static void test_hashmap_dump(void)
 	close(fd);
 }
 
+static void test_hashmap_dump_perf(void)
+{
+	int fd, i, max_entries = 100000;
+	uint64_t key, value, next_key;
+	bool next_key_valid = true;
+	void *buf;
+	u32 buf_len, entries;
+	int j, k = 0;
+	int num_ent, off;
+	int clk_id = CLOCK_MONOTONIC;
+	struct timespec begin, end;
+	long long time_spent, dump_time_spent;
+	double res;
+	int tests[] = {1, 2, 230, 5000, 73000, 100000, 234567};
+	int test_len = ARRAY_SIZE(tests);
+	const int elem_size = sizeof(key) + sizeof(value);
+
+	fd = helper_fill_hashmap(max_entries);
+	// Alloc memory considering the largest buffer
+	buf = malloc(elem_size * tests[test_len-1]);
+	assert(buf != NULL);
+
+test:
+	entries = tests[k];
+	buf_len = elem_size*tests[k];
+	k++;
+	clock_gettime(clk_id, &begin);
+	errno = 0;
+	i = 0;
+	while (errno == 0) {
+		bpf_map_dump(fd, !i ? NULL : &key,
+				  buf, &buf_len);
+		if (errno)
+			break;
+		num_ent = buf_len / elem_size;
+		for (j = 0, off = 0;  j < num_ent; j++) {
+			key = *((uint64_t *)(buf + off));
+			off += sizeof(key);
+			value = *((uint64_t *)(buf + off));
+			off += sizeof(value);
+		}
+		i += num_ent;
+	}
+	clock_gettime(clk_id, &end);
+	assert(i  == max_entries);
+	dump_time_spent = NSEC_PER_SEC * (end.tv_sec - begin.tv_sec) +
+			  end.tv_nsec - begin.tv_nsec;
+	next_key_valid = true;
+	clock_gettime(clk_id, &begin);
+	assert(bpf_map_get_next_key(fd, NULL, &key) == 0);
+	for (i = 0; next_key_valid; i++) {
+		next_key_valid = bpf_map_get_next_key(fd, &key, &next_key) == 0;
+		assert(bpf_map_lookup_elem(fd, &key, &value) == 0);
+		key = next_key;
+	}
+	clock_gettime(clk_id, &end);
+	time_spent = NSEC_PER_SEC * (end.tv_sec - begin.tv_sec) +
+		     end.tv_nsec - begin.tv_nsec;
+	res = (1-((double)dump_time_spent/time_spent))*100;
+	printf("buf_len_%u:\t %llu entry-by-entry: %llu improvement %lf\n",
+	       entries, dump_time_spent, time_spent, res);
+	assert(i  == max_entries);
+
+	if (k < test_len)
+		goto test;
+	free(buf);
+	close(fd);
+}
+
 static void test_hashmap_zero_seed(void)
 {
 	int i, first, second, old_flags;
@@ -1736,6 +1806,7 @@ static void run_all_tests(void)
 	test_hashmap_walk(0, NULL);
 	test_hashmap_zero_seed();
 	test_hashmap_dump();
+	test_hashmap_dump_perf();
 
 	test_arraymap(0, NULL);
 	test_arraymap_percpu(0, NULL);
-- 
2.22.0.410.gd8fdbe21b5-goog


* Re: [RFC PATCH bpf-next v2 2/6] bpf: add BPF_MAP_DUMP command to access more than one entry per call
  2019-06-27 20:24 ` [RFC PATCH bpf-next v2 2/6] bpf: add BPF_MAP_DUMP command to access more than one entry per call Brian Vazquez
@ 2019-06-27 22:12   ` Alexei Starovoitov
  2019-06-28 17:49     ` Brian Vazquez
  0 siblings, 1 reply; 11+ messages in thread
From: Alexei Starovoitov @ 2019-06-27 22:12 UTC (permalink / raw)
  To: Brian Vazquez
  Cc: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann,
	David S . Miller, Stanislav Fomichev, Willem de Bruijn,
	Petar Penkov, linux-kernel, netdev, bpf

On Thu, Jun 27, 2019 at 01:24:13PM -0700, Brian Vazquez wrote:
> This introduces a new command to retrieve a variable number of entries
> from a bpf map wrapping the existing bpf methods:
> map_get_next_key and map_lookup_elem
> 
> Note that map_dump doesn't guarantee that reading the entire table is
> consistent since this function is always racing with kernel and user code
> but the same behaviour is found when the entire table is walked using
> the current interfaces: map_get_next_key + map_lookup_elem.
> It is also important to note that when a locked map is provided it is
> consistent only for 1 entry at the time, meaning that the buf returned
> might or might not be consistent.

Please explain the api behavior and corner cases in the commit log
or in code comments.

Would it make sense to return last key back into prev_key,
so that next map_dump command doesn't need to copy it from the
buffer?
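
For illustration only (a sketch with placeholder names, not code from
the patch), the two calling patterns under discussion would look
roughly like:

  /* as proposed: the caller recovers the next prev_key from the buffer */
  memcpy(&cursor, buf + copied_len - elem_size, key_size);
  err = bpf_map_dump(fd, &cursor, buf, &buf_len);

  /* as suggested: the kernel writes the last dumped key back into the
   * user-supplied prev_key, so the same storage is passed on every call */
  err = bpf_map_dump(fd, &prev_key, buf, &buf_len);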

> Suggested-by: Stanislav Fomichev <sdf@google.com>
> Signed-off-by: Brian Vazquez <brianvv@google.com>
> ---
>  include/uapi/linux/bpf.h |   9 ++++
>  kernel/bpf/syscall.c     | 108 +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 117 insertions(+)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index b077507efa3f3..1d753958874df 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -106,6 +106,7 @@ enum bpf_cmd {
>  	BPF_TASK_FD_QUERY,
>  	BPF_MAP_LOOKUP_AND_DELETE_ELEM,
>  	BPF_MAP_FREEZE,
> +	BPF_MAP_DUMP,
>  };
>  
>  enum bpf_map_type {
> @@ -385,6 +386,14 @@ union bpf_attr {
>  		__u64		flags;
>  	};
>  
> +	struct { /* struct used by BPF_MAP_DUMP command */
> +		__u32		map_fd;
> +		__aligned_u64	prev_key;
> +		__aligned_u64	buf;
> +		__aligned_u64	buf_len; /* input/output: len of buf */
> +		__u64		flags;
> +	} dump;
> +
>  	struct { /* anonymous struct used by BPF_PROG_LOAD command */
>  		__u32		prog_type;	/* one of enum bpf_prog_type */
>  		__u32		insn_cnt;
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index a1823a50f9be0..7653346b5cfd1 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -1097,6 +1097,111 @@ static int map_get_next_key(union bpf_attr *attr)
>  	return err;
>  }
>  
> +/* last field in 'union bpf_attr' used by this command */
> +#define BPF_MAP_DUMP_LAST_FIELD dump.buf_len
> +
> +static int map_dump(union bpf_attr *attr)
> +{
> +	void __user *ukey = u64_to_user_ptr(attr->dump.prev_key);
> +	void __user *ubuf = u64_to_user_ptr(attr->dump.buf);
> +	u32 __user *ubuf_len = u64_to_user_ptr(attr->dump.buf_len);
> +	int ufd = attr->dump.map_fd;
> +	struct bpf_map *map;
> +	void *buf, *prev_key, *key, *value;
> +	u32 value_size, elem_size, buf_len, cp_len;
> +	struct fd f;
> +	int err;
> +
> +	if (CHECK_ATTR(BPF_MAP_DUMP))
> +		return -EINVAL;
> +
> +	attr->flags = 0;
> +	if (attr->dump.flags & ~BPF_F_LOCK)
> +		return -EINVAL;
> +
> +	f = fdget(ufd);
> +	map = __bpf_map_get(f);
> +	if (IS_ERR(map))
> +		return PTR_ERR(map);
> +	if (!(map_get_sys_perms(map, f) & FMODE_CAN_READ)) {
> +		err = -EPERM;
> +		goto err_put;
> +	}
> +
> +	if ((attr->dump.flags & BPF_F_LOCK) &&
> +	    !map_value_has_spin_lock(map)) {
> +		err = -EINVAL;
> +		goto err_put;
> +	}
> +
> +	if (map->map_type == BPF_MAP_TYPE_QUEUE ||
> +	    map->map_type == BPF_MAP_TYPE_STACK) {
> +		err = -ENOTSUPP;
> +		goto err_put;
> +	}
> +
> +	value_size = bpf_map_value_size(map);
> +
> +	err = get_user(buf_len, ubuf_len);
> +	if (err)
> +		goto err_put;
> +
> +	elem_size = map->key_size + value_size;
> +	if (buf_len < elem_size) {
> +		err = -EINVAL;
> +		goto err_put;
> +	}
> +
> +	if (ukey) {
> +		prev_key = __bpf_copy_key(ukey, map->key_size);
> +		if (IS_ERR(prev_key)) {
> +			err = PTR_ERR(prev_key);
> +			goto err_put;
> +		}
> +	} else {
> +		prev_key = NULL;
> +	}
> +
> +	err = -ENOMEM;
> +	buf = kmalloc(elem_size, GFP_USER | __GFP_NOWARN);
> +	if (!buf)
> +		goto err_put;
> +
> +	key = buf;
> +	value = key + map->key_size;
> +	for (cp_len = 0;  cp_len + elem_size <= buf_len ; cp_len += elem_size) {

checkpatch.pl please.

> +next:
> +		if (signal_pending(current)) {
> +			err = -EINTR;
> +			break;
> +		}
> +
> +		rcu_read_lock();
> +		err = map->ops->map_get_next_key(map, prev_key, key);
> +		rcu_read_unlock();
> +
> +		if (err)
> +			break;

should probably be only for ENOENT case?
and other errors should be returned to user ?

> +
> +		if (bpf_map_copy_value(map, key, value, attr->dump.flags))
> +			goto next;

only for ENOENT as well?
and instead of goto use continue and move cp_len+= to the end after prev_key=key?

> +
> +		if (copy_to_user(ubuf + cp_len, buf, elem_size))
> +			break;

return error to user?

> +
> +		prev_key = key;
> +	}
> +
> +	if (cp_len)
> +		err = 0;

this will mask any above errors if there was at least one element copied.

> +	if (copy_to_user(ubuf_len, &cp_len, sizeof(cp_len)))
> +		err = -EFAULT;
> +	kfree(buf);
> +err_put:
> +	fdput(f);
> +	return err;
> +}
> +
>  #define BPF_MAP_LOOKUP_AND_DELETE_ELEM_LAST_FIELD value
>  
>  static int map_lookup_and_delete_elem(union bpf_attr *attr)
> @@ -2891,6 +2996,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
>  	case BPF_MAP_LOOKUP_AND_DELETE_ELEM:
>  		err = map_lookup_and_delete_elem(&attr);
>  		break;
> +	case BPF_MAP_DUMP:
> +		err = map_dump(&attr);
> +		break;
>  	default:
>  		err = -EINVAL;
>  		break;
> -- 
> 2.22.0.410.gd8fdbe21b5-goog
> 

* Re: [RFC PATCH bpf-next v2 0/6]  bpf: add BPF_MAP_DUMP command to
  2019-06-27 20:24 [RFC PATCH bpf-next v2 0/6] bpf: add BPF_MAP_DUMP command to Brian Vazquez
                   ` (5 preceding siblings ...)
  2019-06-27 20:24 ` [RFC PATCH bpf-next v2 6/6] selftests/bpf: add test to measure performance of BPF_MAP_DUMP Brian Vazquez
@ 2019-06-27 22:14 ` Alexei Starovoitov
  2019-06-28 17:50   ` Brian Vazquez
  6 siblings, 1 reply; 11+ messages in thread
From: Alexei Starovoitov @ 2019-06-27 22:14 UTC (permalink / raw)
  To: Brian Vazquez
  Cc: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann,
	David S . Miller, Stanislav Fomichev, Willem de Bruijn,
	Petar Penkov, linux-kernel, netdev, bpf

On Thu, Jun 27, 2019 at 01:24:11PM -0700, Brian Vazquez wrote:
> This introduces a new command to retrieve a variable number of entries
> from a bpf map.
> 
> This new command can be executed from the existing BPF syscall as
> follows:
> 
> err =  bpf(BPF_MAP_DUMP, union bpf_attr *attr, u32 size)
> using attr->dump.map_fd, attr->dump.prev_key, attr->dump.buf,
> attr->dump.buf_len
> returns zero or negative error, and populates buf and buf_len on
> succees
> 
> This implementation is wrapping the existing bpf methods:
> map_get_next_key and map_lookup_elem
> the results show that even with a 1-elem_size buffer, it runs ~40 faster
> than the current implementation, improvements of ~85% are reported when
> the buffer size is increased, although, after the buffer size is around
> 5% of the total number of entries there's no huge difference in
> increasing
> it.

was it with kpti and retpoline mitigations?


* Re: [RFC PATCH bpf-next v2 2/6] bpf: add BPF_MAP_DUMP command to access more than one entry per call
  2019-06-27 22:12   ` Alexei Starovoitov
@ 2019-06-28 17:49     ` Brian Vazquez
  0 siblings, 0 replies; 11+ messages in thread
From: Brian Vazquez @ 2019-06-28 17:49 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann,
	David S . Miller, Stanislav Fomichev, Willem de Bruijn,
	Petar Penkov, linux-kernel, netdev, bpf

> Please explain the api behavior and corner cases in the commit log
> or in code comments.

Ack, will prepare a new version adding those.

> Would it make sense to return last key back into prev_key,
> so that next map_dump command doesn't need to copy it from the
> buffer?

Actually that's a good idea.


> checkpatch.pl please.

 I did use the script and it didn't complain; are you seeing something?

> > +next:
> > +             if (signal_pending(current)) {
> > +                     err = -EINTR;
> > +                     break;
> > +             }
> > +
> > +             rcu_read_lock();
> > +             err = map->ops->map_get_next_key(map, prev_key, key);
> > +             rcu_read_unlock();
> > +
> > +             if (err)
> > +                     break;
>
> should probably be only for ENOENT case?

Yes, this makes sense.

> and other errors should be returned to user ?

And what if the error happened when we had already copied some
entries? The current behavior masks the error to 0 if we copied at
least one element.

>
> > +
> > +             if (bpf_map_copy_value(map, key, value, attr->dump.flags))
> > +                     goto next;
>
> only for ENOENT as well?
> and instead of goto use continue and move cp_len+= to the end after prev_key=key?

Right

>
> > +
> > +             if (copy_to_user(ubuf + cp_len, buf, elem_size))
> > +                     break;
>
> return error to user?
>
> > +
> > +             prev_key = key;
> > +     }
> > +
> > +     if (cp_len)
> > +             err = 0;
>
> this will mask any above errors if there was at least one element copied.

So in general, if we copied elements and suddenly we find an error, we
should return that error and maybe set cp_len to 0 to 'invalidate' the
data that was already copied?
Yes, I think that sounds like the correct thing to do, what do you think?

* Re: [RFC PATCH bpf-next v2 0/6] bpf: add BPF_MAP_DUMP command to
  2019-06-27 22:14 ` [RFC PATCH bpf-next v2 0/6] bpf: add BPF_MAP_DUMP command to Alexei Starovoitov
@ 2019-06-28 17:50   ` Brian Vazquez
  0 siblings, 0 replies; 11+ messages in thread
From: Brian Vazquez @ 2019-06-28 17:50 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann,
	David S . Miller, Stanislav Fomichev, Willem de Bruijn,
	Petar Penkov, linux-kernel, netdev, bpf

> was it with kpti and retpoline mitigations?

No, it wasn't. Will get back with new numbers.
