* [RFC bpf-next 0/3] bpf: adding map batch processing support
@ 2019-11-07 21:20 Brian Vazquez
  2019-11-07 21:20 ` [RFC bpf-next 1/3] " Brian Vazquez
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Brian Vazquez @ 2019-11-07 21:20 UTC (permalink / raw)
  To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, David S . Miller
  Cc: Stanislav Fomichev, linux-kernel, netdev, bpf, Brian Vazquez,
	Yonghong Song, Petar Penkov, Willem de Bruijn

This is a follow-up in the effort to batch bpf map operations to reduce
the syscall overhead of map_ops. I initially proposed the idea, and the
discussion is here:
https://lore.kernel.org/bpf/20190724165803.87470-1-brianvv@google.com/

Yonghong talked about this at LPC and also proposed an idea that handles
the special case of hashtables by traversing them using buckets as the
reference instead of keys. The discussion is here:
https://lore.kernel.org/bpf/20190906225434.3635421-1-yhs@fb.com/

This RFC proposes a way to extend batch operations to more data
structures by creating generic batch functions that can be used instead
of implementing the operations for each individual data structure,
reducing the amount of code that needs to be maintained. The series
contains the patches used in Yonghong's RFC plus the patch that adds the
generic implementation of the operations, along with some testing with
pcpu hashmaps and arrays. Note that the pcpu hashmap shouldn't use the
generic implementation; it should either have its own implementation or
share the one introduced by Yonghong. I added it here just as an example
to show that the generic implementation can easily be added to a data
structure.
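
To give an idea of what this looks like, a map type opts into the
generic implementation simply by pointing its batch ops at the generic
helpers. The fragment below is condensed from the arraymap change in
the last patch of this series:

    const struct bpf_map_ops array_map_ops = {
            /* ...existing ops unchanged... */
            .map_lookup_batch = generic_map_lookup_batch,
            .map_lookup_and_delete_batch = generic_map_lookup_and_delete_batch,
            .map_update_batch = generic_map_update_batch,
            .map_delete_batch = generic_map_delete_batch,
    };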

What I want to achieve with this RFC is to collect early feedback and see
if there are any major concerns before I move forward. I do plan to split
this into properly separated patches and explain them in the commit
messages.

Current known issues that I would like to discuss are the following:

- Because Yonghong's UAPI definition was done specifically for
  iterating buckets, the batch field is a u64 and is treated as such
  instead of as an opaque pointer. This won't work for other data
  structures that use a key as the batch token when the key is larger
  than 64 bits. That said, at this point the only key that couldn't be
  treated as a u64 is a hashmap key, and the hashmap won't use the
  generic interface.
- Not all data structures support delete (it's not a valid operation
  for e.g. arrays), so maybe the lookup_and_delete_batch command is not
  needed and we could handle that operation with lookup_batch and a
  flag.
- For delete_batch (as opposed to lookup_and_delete_batch): is this
  operation really needed? If so, wouldn't it be better if the
  behaviour were to delete the keys provided? That is what my generic
  implementation does, but Yonghong's delete_batch for a hashmap
  deletes buckets.

Brian Vazquez (1):
  bpf: add generic batch support

Yonghong Song (2):
  bpf: adding map batch processing support
  tools/bpf: test bpf_map_lookup_and_delete_batch()

 include/linux/bpf.h                           |  21 +
 include/uapi/linux/bpf.h                      |  22 +
 kernel/bpf/arraymap.c                         |   4 +
 kernel/bpf/hashtab.c                          | 331 ++++++++++
 kernel/bpf/syscall.c                          | 573 ++++++++++++++----
 tools/include/uapi/linux/bpf.h                |  22 +
 tools/lib/bpf/bpf.c                           |  59 ++
 tools/lib/bpf/bpf.h                           |  13 +
 tools/lib/bpf/libbpf.map                      |   4 +
 .../map_tests/map_lookup_and_delete_batch.c   | 245 ++++++++
 .../map_lookup_and_delete_batch_array.c       | 118 ++++
 11 files changed, 1292 insertions(+), 120 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch.c
 create mode 100644 tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch_array.c

-- 
2.24.0.432.g9d3f5f5b63-goog



* [RFC bpf-next 1/3] bpf: adding map batch processing support
  2019-11-07 21:20 [RFC bpf-next 0/3] bpf: adding map batch processing support Brian Vazquez
@ 2019-11-07 21:20 ` Brian Vazquez
  2019-11-07 21:20 ` [RFC bpf-next 2/3] tools/bpf: test bpf_map_lookup_and_delete_batch() Brian Vazquez
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Brian Vazquez @ 2019-11-07 21:20 UTC (permalink / raw)
  To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, David S . Miller
  Cc: Stanislav Fomichev, linux-kernel, netdev, bpf, Yonghong Song,
	Petar Penkov, Willem de Bruijn

From: Yonghong Song <yhs@fb.com>

Brian Vazquez has proposed a BPF_MAP_DUMP command to look up more than
one map entry per syscall.
  https://lore.kernel.org/bpf/CABCgpaU3xxX6CMMxD+1knApivtc2jLBHysDXw-0E9bQEL0qC3A@mail.gmail.com/T/#t

During the discussion, we found that more use cases can be supported in a
similar map operation batching framework. For example, batched map lookup
and delete can be really helpful for bcc.
  https://github.com/iovisor/bcc/blob/master/tools/tcptop.py#L233-L243
  https://github.com/iovisor/bcc/blob/master/tools/slabratetop.py#L129-L138

Also, in bcc, we have an API to delete all entries in a map.
  https://github.com/iovisor/bcc/blob/master/src/cc/api/BPFTable.h#L257-L264

For map update, batched operations are also useful, as applications
sometimes need to populate initial maps with more than one entry. For
example, see this excerpt from kernel/samples/bpf/xdp_redirect_cpu_user.c:
  https://github.com/torvalds/linux/blob/master/samples/bpf/xdp_redirect_cpu_user.c#L543-L550

This patch addresses all of the above use cases. To keep the uapi stable,
it also covers other potential use cases. Four bpf syscall subcommands are
introduced:
        BPF_MAP_LOOKUP_BATCH
        BPF_MAP_LOOKUP_AND_DELETE_BATCH
        BPF_MAP_UPDATE_BATCH
        BPF_MAP_DELETE_BATCH

The UAPI attribute structure looks like:

    struct { /* struct used by BPF_MAP_*_BATCH commands */
            __u64           batch;  /* input/output:
                                     * input: start batch,
                                     *        0 to start from beginning.
                                     * output: next start batch,
                                     *         0 to end batching.
                                     */
            __aligned_u64   keys;
            __aligned_u64   values;
            __u32           count;  /* input/output:
                                     * input: # of elements keys/values.
                                     * output: # of filled elements.
                                     */
            __u32           map_fd;
            __u64           elem_flags;
            __u64           flags;
    } batch;

An opaque value 'batch' is used for user/kernel space communication to
indicate where in the map the lookup/lookup_and_delete/delete operation
should start.
  input 'batch' = 0: to start the operation from the beginning of the map.
  output 'batch': if not 0, the next input for batch operation.

For lookup/lookup_and_delete:
  operation: lookup/lookup_and_delete starting from a particular 'batch'.
  return:
     'batch'       'count'     return code     meaning
      0            0           0               Done. Nothing left
      0            0           -ENOSPC         no space to handle batch 0
      > 0          0           -ENOSPC         no space to handle 'batch'
      > 0          > 0         0               stopped right before 'batch'
Note that:
  (1). Even if the return code is 0 and the returned 'count' > 0, the returned
       'count' may not be equal to the input 'count'. This happens when there
       is not enough space to handle a batch.
  (2). If the return code is an error other than -EFAULT,
       'batch' indicates the batch that has issues and 'count' indicates the
       number of elements successfully processed.

For delete:
  operation: deletion starting from a particular 'batch'.
  return: 0 means everything from 'batch' onwards was deleted.
          An error code means some deletions did not happen.

For update:
  operation: update 'count' number of elements in 'keys'/'values'.
  return: 0 means successful updates for all elements.
          On an error code other than -EFAULT, 'count' is the number of
          successful updates.
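
As a usage illustration only (not part of this patch), here is a minimal
userspace sketch of the intended calling convention for
BPF_MAP_LOOKUP_AND_DELETE_BATCH, assuming the uapi above; drain_map()
and its buffers are hypothetical and error handling is reduced to the
essentials:

    #include <errno.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/bpf.h>

    /* Illustrative sketch: drain up to 'batch_sz' elements per syscall
     * until the output 'batch' token comes back as 0.  'keys'/'values'
     * must each hold at least batch_sz entries.
     */
    static int drain_map(int map_fd, void *keys, void *values, __u32 batch_sz)
    {
            union bpf_attr attr = {};
            int err;

            attr.batch.map_fd = map_fd;
            attr.batch.keys = (__u64)(unsigned long)keys;
            attr.batch.values = (__u64)(unsigned long)values;
            attr.batch.batch = 0;                 /* 0: start from the beginning */

            do {
                    attr.batch.count = batch_sz;  /* in: capacity, out: # filled */
                    err = syscall(__NR_bpf, BPF_MAP_LOOKUP_AND_DELETE_BATCH,
                                  &attr, sizeof(attr));
                    if (err)
                            return -errno;
                    /* consume attr.batch.count key/value pairs here */
            } while (attr.batch.batch);           /* output 0 means done */

            return 0;
    }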

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 include/linux/bpf.h      |   9 ++
 include/uapi/linux/bpf.h |  22 +++
 kernel/bpf/hashtab.c     | 327 +++++++++++++++++++++++++++++++++++++++
 kernel/bpf/syscall.c     |  67 ++++++++
 4 files changed, 425 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 7c7f518811a66..66df540ee2473 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -39,6 +39,15 @@ struct bpf_map_ops {
 	int (*map_get_next_key)(struct bpf_map *map, void *key, void *next_key);
 	void (*map_release_uref)(struct bpf_map *map);
 	void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key);
+	int (*map_lookup_batch)(struct bpf_map *map, const union bpf_attr *attr,
+				union bpf_attr __user *uattr);
+	int (*map_lookup_and_delete_batch)(struct bpf_map *map,
+					   const union bpf_attr *attr,
+					   union bpf_attr __user *uattr);
+	int (*map_update_batch)(struct bpf_map *map, const union bpf_attr *attr,
+				union bpf_attr __user *uattr);
+	int (*map_delete_batch)(struct bpf_map *map, const union bpf_attr *attr,
+				union bpf_attr __user *uattr);
 
 	/* funcs callable from userspace and from eBPF programs */
 	void *(*map_lookup_elem)(struct bpf_map *map, void *key);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index df6809a764046..2d647fa6476cb 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -107,6 +107,10 @@ enum bpf_cmd {
 	BPF_MAP_LOOKUP_AND_DELETE_ELEM,
 	BPF_MAP_FREEZE,
 	BPF_BTF_GET_NEXT_ID,
+	BPF_MAP_LOOKUP_BATCH,
+	BPF_MAP_LOOKUP_AND_DELETE_BATCH,
+	BPF_MAP_UPDATE_BATCH,
+	BPF_MAP_DELETE_BATCH,
 };
 
 enum bpf_map_type {
@@ -398,6 +402,24 @@ union bpf_attr {
 		__u64		flags;
 	};
 
+	struct { /* struct used by BPF_MAP_*_BATCH commands */
+		__u64		batch;	/* input/output:
+					 * input: start batch,
+					 *        0 to start from beginning.
+					 * output: next start batch,
+					 *         0 to end batching.
+					 */
+		__aligned_u64	keys;
+		__aligned_u64	values;
+		__u32		count;	/* input/output:
+					 * input: # of elements keys/values.
+					 * output: # of filled elements.
+					 */
+		__u32		map_fd;
+		__u64		elem_flags;
+		__u64		flags;
+	} batch;
+
 	struct { /* anonymous struct used by BPF_PROG_LOAD command */
 		__u32		prog_type;	/* one of enum bpf_prog_type */
 		__u32		insn_cnt;
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 22066a62c8c97..10977cb321862 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -1232,6 +1232,325 @@ static void htab_map_seq_show_elem(struct bpf_map *map, void *key,
 	rcu_read_unlock();
 }
 
+static int
+__htab_map_lookup_and_delete_batch(struct bpf_map *map,
+				   const union bpf_attr *attr,
+				   union bpf_attr __user *uattr,
+				   bool do_delete, bool is_lru_map)
+{
+	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
+	u32 bucket_cnt, total, key_size, value_size, roundup_key_size;
+	void *keys = NULL, *values = NULL, *value, *dst_key, *dst_val;
+	u64 elem_map_flags, map_flags;
+	struct hlist_nulls_head *head;
+	void __user *ukeys, *uvalues;
+	struct hlist_nulls_node *n;
+	u32 batch, max_count;
+	unsigned long flags;
+	struct htab_elem *l;
+	struct bucket *b;
+	int ret = 0;
+
+	max_count = attr->batch.count;
+	if (!max_count)
+		return 0;
+
+	elem_map_flags = attr->batch.elem_flags;
+	if ((elem_map_flags & ~BPF_F_LOCK) ||
+	    ((elem_map_flags & BPF_F_LOCK) && !map_value_has_spin_lock(map)))
+		return -EINVAL;
+
+	map_flags = attr->batch.flags;
+	if (map_flags)
+		return -EINVAL;
+
+	batch = (u32)attr->batch.batch;
+	if (batch >= htab->n_buckets)
+		return -EINVAL;
+
+	/* We cannot do copy_from_user or copy_to_user inside
+	 * the rcu_read_lock. Allocate enough space here.
+	 */
+	key_size = htab->map.key_size;
+	roundup_key_size = round_up(htab->map.key_size, 8);
+	value_size = htab->map.value_size;
+	keys = kvmalloc(key_size * max_count, GFP_USER | __GFP_NOWARN);
+	values = kvmalloc(value_size * max_count, GFP_USER | __GFP_NOWARN);
+	if (!keys || !values) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	dst_key = keys;
+	dst_val = values;
+	total = 0;
+
+	preempt_disable();
+	this_cpu_inc(bpf_prog_active);
+	rcu_read_lock();
+
+again:
+	b = &htab->buckets[batch];
+	head = &b->head;
+	raw_spin_lock_irqsave(&b->lock, flags);
+
+	bucket_cnt = 0;
+	hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
+		bucket_cnt++;
+
+	if (bucket_cnt > (max_count - total)) {
+		if (total == 0)
+			ret = -ENOSPC;
+		goto after_loop;
+	}
+
+	hlist_nulls_for_each_entry_rcu(l, n, head, hash_node) {
+		memcpy(dst_key, l->key, key_size);
+
+		value = l->key + roundup_key_size;
+		if (elem_map_flags & BPF_F_LOCK)
+			copy_map_value_locked(map, dst_val, value, true);
+		else
+			copy_map_value(map, dst_val, value);
+		check_and_init_map_lock(map, dst_val);
+
+		dst_key += key_size;
+		dst_val += value_size;
+		total++;
+	}
+
+	if (do_delete) {
+		hlist_nulls_for_each_entry_rcu(l, n, head, hash_node) {
+			hlist_nulls_del_rcu(&l->hash_node);
+			if (is_lru_map)
+				bpf_lru_push_free(&htab->lru, &l->lru_node);
+			else
+				free_htab_elem(htab, l);
+		}
+	}
+
+	batch++;
+	if (batch >= htab->n_buckets) {
+		batch = 0;
+		goto after_loop;
+	}
+
+	raw_spin_unlock_irqrestore(&b->lock, flags);
+	goto again;
+
+after_loop:
+	raw_spin_unlock_irqrestore(&b->lock, flags);
+
+	rcu_read_unlock();
+	this_cpu_dec(bpf_prog_active);
+	preempt_enable();
+
+	/* copy data back to user */
+	ukeys = u64_to_user_ptr(attr->batch.keys);
+	uvalues = u64_to_user_ptr(attr->batch.values);
+	if (put_user(batch, &uattr->batch.batch) ||
+	    copy_to_user(ukeys, keys, total * key_size) ||
+	    copy_to_user(uvalues, values, total * value_size) ||
+	    put_user(total, &uattr->batch.count))
+		ret = -EFAULT;
+
+out:
+	kvfree(keys);
+	kvfree(values);
+	return ret;
+}
+
+static int
+__htab_map_update_batch(struct bpf_map *map, const union bpf_attr *attr,
+			union bpf_attr __user *uattr, bool is_lru_map)
+{
+	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
+	u32 count, max_count, key_size, roundup_key_size, value_size;
+	u64 elem_map_flags, map_flags;
+	void __user *ukey, *uvalue;
+	void *key, *value;
+	int ret = 0;
+
+	max_count = attr->batch.count;
+	if (!max_count)
+		return 0;
+
+	elem_map_flags = attr->batch.elem_flags;
+	if ((elem_map_flags & BPF_F_LOCK) && !map_value_has_spin_lock(map))
+		return -EINVAL;
+
+	map_flags = attr->batch.flags;
+	if (map_flags)
+		return -EINVAL;
+
+	key_size = htab->map.key_size;
+	roundup_key_size = round_up(htab->map.key_size, 8);
+	value_size = htab->map.value_size;
+	key = kmalloc(key_size, GFP_USER | __GFP_NOWARN);
+	value = kmalloc(value_size, GFP_USER | __GFP_NOWARN);
+	if (!key || !value) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	ukey = u64_to_user_ptr(attr->batch.keys);
+	uvalue = u64_to_user_ptr(attr->batch.values);
+	for (count = 0; count < max_count; count++) {
+		if (copy_from_user(key, ukey + count * key_size, key_size) ||
+		    copy_from_user(value, uvalue + count * value_size,
+				   value_size)) {
+			ret = -EFAULT;
+			break;
+		}
+
+		preempt_disable();
+		__this_cpu_inc(bpf_prog_active);
+		rcu_read_lock();
+		if (is_lru_map)
+			ret = htab_lru_map_update_elem(map, key, value,
+						       elem_map_flags);
+		else
+			ret = htab_map_update_elem(map, key, value,
+						   elem_map_flags);
+		rcu_read_unlock();
+		__this_cpu_dec(bpf_prog_active);
+		preempt_enable();
+
+		if (ret) {
+			if (put_user(count, &uattr->batch.count))
+				ret = -EFAULT;
+			break;
+		}
+	}
+
+out:
+	kfree(key);
+	kfree(value);
+	return ret;
+}
+
+static int
+__htab_map_delete_batch(struct bpf_map *map,
+			const union bpf_attr *attr,
+			union bpf_attr __user *uattr,
+			bool is_lru_map)
+{
+	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
+	u64 elem_map_flags, map_flags;
+	struct hlist_nulls_head *head;
+	struct hlist_nulls_node *n;
+	u32 batch, max_count;
+	unsigned long flags;
+	struct htab_elem *l;
+	struct bucket *b;
+
+	elem_map_flags = attr->batch.elem_flags;
+	map_flags = attr->batch.flags;
+	if (elem_map_flags || map_flags)
+		return -EINVAL;
+
+	max_count = attr->batch.count;
+	batch = (u32)attr->batch.batch;
+	if (max_count || batch >= htab->n_buckets)
+		return -EINVAL;
+
+	preempt_disable();
+	__this_cpu_inc(bpf_prog_active);
+	rcu_read_lock();
+
+again:
+	b = &htab->buckets[batch];
+	head = &b->head;
+	raw_spin_lock_irqsave(&b->lock, flags);
+
+	hlist_nulls_for_each_entry_rcu(l, n, head, hash_node) {
+		hlist_nulls_del_rcu(&l->hash_node);
+		if (is_lru_map)
+			bpf_lru_push_free(&htab->lru, &l->lru_node);
+		else
+			free_htab_elem(htab, l);
+	}
+
+	batch++;
+	if (batch >= htab->n_buckets)
+		goto out;
+
+	raw_spin_unlock_irqrestore(&b->lock, flags);
+	goto again;
+
+out:
+	raw_spin_unlock_irqrestore(&b->lock, flags);
+	rcu_read_unlock();
+	__this_cpu_dec(bpf_prog_active);
+	preempt_enable();
+
+	return 0;
+}
+
+static int
+htab_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr,
+		      union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, false,
+						  false);
+}
+
+static int
+htab_map_lookup_and_delete_batch(struct bpf_map *map,
+				 const union bpf_attr *attr,
+				 union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, true,
+						  false);
+}
+
+static int
+htab_map_update_batch(struct bpf_map *map, const union bpf_attr *attr,
+		      union bpf_attr __user *uattr)
+{
+	return __htab_map_update_batch(map, attr, uattr, false);
+}
+
+static int
+htab_map_delete_batch(struct bpf_map *map,
+		      const union bpf_attr *attr,
+		      union bpf_attr __user *uattr)
+{
+	return __htab_map_delete_batch(map, attr, uattr, false);
+}
+
+static int
+htab_lru_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr,
+			  union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, false,
+						  true);
+}
+
+static int
+htab_lru_map_lookup_and_delete_batch(struct bpf_map *map,
+				     const union bpf_attr *attr,
+				     union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, true,
+						  true);
+}
+
+static int
+htab_lru_map_update_batch(struct bpf_map *map, const union bpf_attr *attr,
+			  union bpf_attr __user *uattr)
+{
+	return __htab_map_update_batch(map, attr, uattr, true);
+}
+
+static int
+htab_lru_map_delete_batch(struct bpf_map *map,
+			  const union bpf_attr *attr,
+			  union bpf_attr __user *uattr)
+{
+	return __htab_map_delete_batch(map, attr, uattr, true);
+}
+
 const struct bpf_map_ops htab_map_ops = {
 	.map_alloc_check = htab_map_alloc_check,
 	.map_alloc = htab_map_alloc,
@@ -1242,6 +1561,10 @@ const struct bpf_map_ops htab_map_ops = {
 	.map_delete_elem = htab_map_delete_elem,
 	.map_gen_lookup = htab_map_gen_lookup,
 	.map_seq_show_elem = htab_map_seq_show_elem,
+	.map_lookup_batch = htab_map_lookup_batch,
+	.map_lookup_and_delete_batch = htab_map_lookup_and_delete_batch,
+	.map_update_batch = htab_map_update_batch,
+	.map_delete_batch = htab_map_delete_batch,
 };
 
 const struct bpf_map_ops htab_lru_map_ops = {
@@ -1255,6 +1578,10 @@ const struct bpf_map_ops htab_lru_map_ops = {
 	.map_delete_elem = htab_lru_map_delete_elem,
 	.map_gen_lookup = htab_lru_map_gen_lookup,
 	.map_seq_show_elem = htab_map_seq_show_elem,
+	.map_lookup_batch = htab_lru_map_lookup_batch,
+	.map_lookup_and_delete_batch = htab_lru_map_lookup_and_delete_batch,
+	.map_update_batch = htab_lru_map_update_batch,
+	.map_delete_batch = htab_lru_map_delete_batch,
 };
 
 /* Called from eBPF program */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 6d9ce95e5a8da..c9e5f928d85b0 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2868,6 +2868,61 @@ static int bpf_task_fd_query(const union bpf_attr *attr,
 	return err;
 }
 
+#define BPF_MAP_BATCH_LAST_FIELD batch.flags
+
+#define BPF_DO_BATCH(fn)			\
+	do {					\
+		if (!fn) {			\
+			err = -ENOTSUPP;	\
+			goto err_put;		\
+		}				\
+		err = fn(map, attr, uattr);	\
+	} while (0)
+
+static int bpf_map_do_batch(const union bpf_attr *attr,
+			    union bpf_attr __user *uattr,
+			    int cmd)
+{
+	struct bpf_map *map;
+	int err, ufd;
+	struct fd f;
+
+	if (CHECK_ATTR(BPF_MAP_BATCH))
+		return -EINVAL;
+
+	ufd = attr->batch.map_fd;
+	f = fdget(ufd);
+	map = __bpf_map_get(f);
+	if (IS_ERR(map))
+		return PTR_ERR(map);
+
+	if ((cmd == BPF_MAP_LOOKUP_BATCH ||
+	     cmd == BPF_MAP_LOOKUP_AND_DELETE_BATCH) &&
+	    !(map_get_sys_perms(map, f) & FMODE_CAN_READ)) {
+		err = -EPERM;
+		goto err_put;
+	}
+
+	if (cmd != BPF_MAP_LOOKUP_BATCH &&
+	    !(map_get_sys_perms(map, f) & FMODE_CAN_WRITE)) {
+		err = -EPERM;
+		goto err_put;
+	}
+
+	if (cmd == BPF_MAP_LOOKUP_BATCH)
+		BPF_DO_BATCH(map->ops->map_lookup_batch);
+	else if (cmd == BPF_MAP_LOOKUP_AND_DELETE_BATCH)
+		BPF_DO_BATCH(map->ops->map_lookup_and_delete_batch);
+	else if (cmd == BPF_MAP_UPDATE_BATCH)
+		BPF_DO_BATCH(map->ops->map_update_batch);
+	else
+		BPF_DO_BATCH(map->ops->map_delete_batch);
+
+err_put:
+	fdput(f);
+	return err;
+}
+
 SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
 {
 	union bpf_attr attr = {};
@@ -2965,6 +3020,18 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	case BPF_MAP_LOOKUP_AND_DELETE_ELEM:
 		err = map_lookup_and_delete_elem(&attr);
 		break;
+	case BPF_MAP_LOOKUP_BATCH:
+		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_LOOKUP_BATCH);
+		break;
+	case BPF_MAP_LOOKUP_AND_DELETE_BATCH:
+		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_LOOKUP_AND_DELETE_BATCH);
+		break;
+	case BPF_MAP_UPDATE_BATCH:
+		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_UPDATE_BATCH);
+		break;
+	case BPF_MAP_DELETE_BATCH:
+		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_DELETE_BATCH);
+		break;
 	default:
 		err = -EINVAL;
 		break;
-- 
2.24.0.432.g9d3f5f5b63-goog



* [RFC bpf-next 2/3] tools/bpf: test bpf_map_lookup_and_delete_batch()
  2019-11-07 21:20 [RFC bpf-next 0/3] bpf: adding map batch processing support Brian Vazquez
  2019-11-07 21:20 ` [RFC bpf-next 1/3] " Brian Vazquez
@ 2019-11-07 21:20 ` Brian Vazquez
  2019-11-15 22:42   ` Andrii Nakryiko
  2019-11-07 21:20 ` [RFC bpf-next 3/3] bpf: add generic batch support Brian Vazquez
  2019-11-13  2:34 ` [RFC bpf-next 0/3] bpf: adding map batch processing support Yonghong Song
  3 siblings, 1 reply; 7+ messages in thread
From: Brian Vazquez @ 2019-11-07 21:20 UTC (permalink / raw)
  To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, David S . Miller
  Cc: Stanislav Fomichev, linux-kernel, netdev, bpf, Yonghong Song,
	Petar Penkov, Willem de Bruijn

From: Yonghong Song <yhs@fb.com>

Added four libbpf API functions to support map batch operations:
  . int bpf_map_delete_batch( ... )
  . int bpf_map_lookup_batch( ... )
  . int bpf_map_lookup_and_delete_batch( ... )
  . int bpf_map_update_batch( ... )

Tested bpf_map_lookup_and_delete_batch() and bpf_map_update_batch()
functionality.
  $ ./test_maps
  ...
  test_map_lookup_and_delete_batch:PASS
  ...

Note that I lumped the uapi header sync patch, the libbpf patch and the
tests patch together considering this is an RFC patch. I will do proper
formatting once it is out of the RFC stage.
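
For reference, a minimal usage sketch with the new wrappers (not part of
this patch; map_fd and the keys/values buffers are assumed to be set up
by the caller, and error handling is kept to the essentials):

    #include <errno.h>
    #include <bpf/bpf.h>

    /* Illustrative sketch: populate the map in one call, then read
     * everything back in batches.
     */
    static int update_then_dump(int map_fd, int *keys, int *values,
                                __u32 max_entries)
    {
            __u32 count, total = 0;
            __u64 batch = 0;
            int err;

            count = max_entries;
            err = bpf_map_update_batch(map_fd, keys, values, &count, 0, 0);
            if (err)
                    return err;

            do {
                    count = max_entries - total;
                    err = bpf_map_lookup_batch(map_fd, &batch, keys + total,
                                               values + total, &count, 0, 0);
                    if (err)      /* including ENOSPC: buffer too small */
                            return err;
                    total += count;
            } while (batch);

            return 0;
    }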

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/include/uapi/linux/bpf.h                |  22 +++
 tools/lib/bpf/bpf.c                           |  59 +++++++
 tools/lib/bpf/bpf.h                           |  13 ++
 tools/lib/bpf/libbpf.map                      |   4 +
 .../map_tests/map_lookup_and_delete_batch.c   | 155 ++++++++++++++++++
 5 files changed, 253 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch.c

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index df6809a764046..2d647fa6476cb 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -107,6 +107,10 @@ enum bpf_cmd {
 	BPF_MAP_LOOKUP_AND_DELETE_ELEM,
 	BPF_MAP_FREEZE,
 	BPF_BTF_GET_NEXT_ID,
+	BPF_MAP_LOOKUP_BATCH,
+	BPF_MAP_LOOKUP_AND_DELETE_BATCH,
+	BPF_MAP_UPDATE_BATCH,
+	BPF_MAP_DELETE_BATCH,
 };
 
 enum bpf_map_type {
@@ -398,6 +402,24 @@ union bpf_attr {
 		__u64		flags;
 	};
 
+	struct { /* struct used by BPF_MAP_*_BATCH commands */
+		__u64		batch;	/* input/output:
+					 * input: start batch,
+					 *        0 to start from beginning.
+					 * output: next start batch,
+					 *         0 to end batching.
+					 */
+		__aligned_u64	keys;
+		__aligned_u64	values;
+		__u32		count;	/* input/output:
+					 * input: # of elements keys/values.
+					 * output: # of filled elements.
+					 */
+		__u32		map_fd;
+		__u64		elem_flags;
+		__u64		flags;
+	} batch;
+
 	struct { /* anonymous struct used by BPF_PROG_LOAD command */
 		__u32		prog_type;	/* one of enum bpf_prog_type */
 		__u32		insn_cnt;
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index ca0d635b1d5ea..8b7e773003ddd 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -441,6 +441,65 @@ int bpf_map_freeze(int fd)
 	return sys_bpf(BPF_MAP_FREEZE, &attr, sizeof(attr));
 }
 
+static int bpf_map_batch_common(int cmd, int fd, __u64 *batch,
+				void *keys, void *values,
+				__u32 *count, __u64 elem_flags,
+				__u64 flags)
+{
+	union bpf_attr attr = {};
+	int ret;
+
+	attr.batch.map_fd = fd;
+	if (batch)
+		attr.batch.batch = *batch;
+	attr.batch.keys = ptr_to_u64(keys);
+	attr.batch.values = ptr_to_u64(values);
+	if (count)
+		attr.batch.count = *count;
+	attr.batch.elem_flags = elem_flags;
+	attr.batch.flags = flags;
+
+	ret = sys_bpf(cmd, &attr, sizeof(attr));
+	if (batch)
+		*batch = attr.batch.batch;
+	if (count)
+		*count = attr.batch.count;
+
+	return ret;
+}
+
+int bpf_map_delete_batch(int fd, __u64 *batch, __u32 *count, __u64 elem_flags,
+			 __u64 flags)
+{
+	return bpf_map_batch_common(BPF_MAP_DELETE_BATCH, fd, batch,
+				    NULL, NULL, count, elem_flags, flags);
+}
+
+int bpf_map_lookup_batch(int fd, __u64 *batch, void *keys, void *values,
+			 __u32 *count, __u64 elem_flags, __u64 flags)
+{
+	return bpf_map_batch_common(BPF_MAP_LOOKUP_BATCH, fd, batch,
+				    keys, values, count, elem_flags, flags);
+}
+
+int bpf_map_lookup_and_delete_batch(int fd, __u64 *batch,
+				    void *keys, void *values,
+				    __u32 *count, __u64 elem_flags,
+				    __u64 flags)
+{
+	return bpf_map_batch_common(BPF_MAP_LOOKUP_AND_DELETE_BATCH,
+				    fd, batch, keys, values,
+				    count, elem_flags, flags);
+}
+
+int bpf_map_update_batch(int fd, void *keys, void *values, __u32 *count,
+			 __u64 elem_flags, __u64 flags)
+{
+	return bpf_map_batch_common(BPF_MAP_UPDATE_BATCH,
+				    fd, NULL, keys, values,
+				    count, elem_flags, flags);
+}
+
 int bpf_obj_pin(int fd, const char *pathname)
 {
 	union bpf_attr attr;
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 1c53bc5b4b3c7..e61da7a92a414 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -123,6 +123,19 @@ LIBBPF_API int bpf_map_lookup_and_delete_elem(int fd, const void *key,
 LIBBPF_API int bpf_map_delete_elem(int fd, const void *key);
 LIBBPF_API int bpf_map_get_next_key(int fd, const void *key, void *next_key);
 LIBBPF_API int bpf_map_freeze(int fd);
+LIBBPF_API int bpf_map_delete_batch(int fd, __u64 *batch, __u32 *count,
+				    __u64 elem_flags, __u64 flags);
+LIBBPF_API int bpf_map_lookup_batch(int fd, __u64 *batch, void *keys,
+				    void *values, __u32 *count,
+				    __u64 elem_flags, __u64 flags);
+LIBBPF_API int bpf_map_lookup_and_delete_batch(int fd, __u64 *batch,
+					       void *keys, void *values,
+					       __u32 *count, __u64 elem_flags,
+					       __u64 flags);
+LIBBPF_API int bpf_map_update_batch(int fd, void *keys, void *values,
+				    __u32 *count, __u64 elem_flags,
+				    __u64 flags);
+
 LIBBPF_API int bpf_obj_pin(int fd, const char *pathname);
 LIBBPF_API int bpf_obj_get(const char *pathname);
 LIBBPF_API int bpf_prog_attach(int prog_fd, int attachable_fd,
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 86173cbb159d3..0529a770a04eb 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -189,6 +189,10 @@ LIBBPF_0.0.4 {
 LIBBPF_0.0.5 {
 	global:
 		bpf_btf_get_next_id;
+		bpf_map_delete_batch;
+		bpf_map_lookup_and_delete_batch;
+		bpf_map_lookup_batch;
+		bpf_map_update_batch;
 } LIBBPF_0.0.4;
 
 LIBBPF_0.0.6 {
diff --git a/tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch.c b/tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch.c
new file mode 100644
index 0000000000000..dd906b1de5950
--- /dev/null
+++ b/tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2019 Facebook  */
+#include <stdio.h>
+#include <errno.h>
+#include <string.h>
+
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+
+#include <test_maps.h>
+
+static void map_batch_update(int map_fd, __u32 max_entries, int *keys,
+			     int *values)
+{
+	int i, err;
+
+	for (i = 0; i < max_entries; i++) {
+		keys[i] = i + 1;
+		values[i] = i + 2;
+	}
+
+	err = bpf_map_update_batch(map_fd, keys, values, &max_entries, 0, 0);
+	CHECK(err, "bpf_map_update_batch()", "error:%s\n", strerror(errno));
+}
+
+static void map_batch_verify(int *visited, __u32 max_entries,
+			     int *keys, int *values)
+{
+	int i;
+
+	memset(visited, 0, max_entries * sizeof(*visited));
+	for (i = 0; i < max_entries; i++) {
+		CHECK(keys[i] + 1 != values[i], "key/value checking",
+		      "error: i %d key %d value %d\n", i, keys[i], values[i]);
+		visited[i] = 1;
+	}
+	for (i = 0; i < max_entries; i++) {
+		CHECK(visited[i] != 1, "visited checking",
+		      "error: keys array at index %d missing\n", i);
+	}
+}
+
+void test_map_lookup_and_delete_batch(void)
+{
+	struct bpf_create_map_attr xattr = {
+		.name = "hash_map",
+		.map_type = BPF_MAP_TYPE_HASH,
+		.key_size = sizeof(int),
+		.value_size = sizeof(int),
+	};
+	int map_fd, *keys, *values, *visited, key;
+	__u32 count, total, total_success;
+	const __u32 max_entries = 10;
+	int err, i, step;
+	bool nospace_err;
+	__u64 batch = 0;
+
+	xattr.max_entries = max_entries;
+	map_fd = bpf_create_map_xattr(&xattr);
+	CHECK(map_fd == -1,
+	      "bpf_create_map_xattr()", "error:%s\n", strerror(errno));
+
+	keys = malloc(max_entries * sizeof(int));
+	values = malloc(max_entries * sizeof(int));
+	visited = malloc(max_entries * sizeof(int));
+	CHECK(!keys || !values || !visited, "malloc()", "error:%s\n", strerror(errno));
+
+	/* test 1: lookup/delete an empty hash table, success */
+	count = max_entries;
+	err = bpf_map_lookup_and_delete_batch(map_fd, &batch, keys, values,
+					      &count, 0, 0);
+	CHECK(err, "empty map", "error: %s\n", strerror(errno));
+	CHECK(batch || count, "empty map", "batch = %lld, count = %u\n", batch, count);
+
+	/* populate elements to the map */
+	map_batch_update(map_fd, max_entries, keys, values);
+
+	/* test 2: lookup/delete with count = 0, success */
+	batch = 0;
+	count = 0;
+	err = bpf_map_lookup_and_delete_batch(map_fd, &batch, keys, values,
+					      &count, 0, 0);
+	CHECK(err, "count = 0", "error: %s\n", strerror(errno));
+
+	/* test 3: lookup/delete with count = max_entries, success */
+	memset(keys, 0, max_entries * sizeof(*keys));
+	memset(values, 0, max_entries * sizeof(*values));
+	count = max_entries;
+	batch = 0;
+	err = bpf_map_lookup_and_delete_batch(map_fd, &batch, keys,
+					      values, &count, 0, 0);
+	CHECK(err, "count = max_entries", "error: %s\n", strerror(errno));
+	CHECK(count != max_entries || batch != 0, "count = max_entries",
+	      "count = %u, max_entries = %u, batch = %lld\n",
+	      count, max_entries, batch);
+	map_batch_verify(visited, max_entries, keys, values);
+
+	/* bpf_map_get_next_key() should return -ENOENT for an empty map. */
+	err = bpf_map_get_next_key(map_fd, NULL, &key);
+	CHECK(!err, "bpf_map_get_next_key()", "error: %s\n", strerror(errno));
+
+	/* test 4: lookup/delete in a loop with various steps. */
+	total_success = 0;
+	for (step = 1; step < max_entries; step++) {
+		map_batch_update(map_fd, max_entries, keys, values);
+		memset(keys, 0, max_entries * sizeof(*keys));
+		memset(values, 0, max_entries * sizeof(*values));
+		batch = 0;
+		total = 0;
+		i = 0;
+		/* iteratively lookup/delete elements with 'step' elements each */
+		count = step;
+		nospace_err = false;
+		while (true) {
+			err = bpf_map_lookup_and_delete_batch(map_fd, &batch,
+							      keys + total,
+							      values + total,
+							      &count, 0, 0);
+			/* It is possible that we are failing because the buffer
+			 * is not big enough. In such cases, just exit and go
+			 * with larger steps. Note that a buffer sized for
+			 * max_entries should always work.
+			 */
+			if (err && errno == ENOSPC) {
+				nospace_err = true;
+				break;
+			}
+
+			CHECK(err, "lookup/delete with steps", "error: %s\n",
+			      strerror(errno));
+
+			total += count;
+			if (batch == 0)
+				break;
+
+			i++;
+		}
+
+		if (nospace_err == true)
+			continue;
+
+		CHECK(total != max_entries, "lookup/delete with steps",
+		      "total = %u, max_entries = %u\n", total, max_entries);
+
+		map_batch_verify(visited, max_entries, keys, values);
+		err = bpf_map_get_next_key(map_fd, NULL, &key);
+		CHECK(!err, "bpf_map_get_next_key()", "error: %s\n", strerror(errno));
+
+		total_success++;
+	}
+
+	CHECK(total_success == 0, "check total_success", "unexpected failure\n");
+
+	printf("%s:PASS\n", __func__);
+}
-- 
2.24.0.432.g9d3f5f5b63-goog



* [RFC bpf-next 3/3] bpf: add generic batch support
  2019-11-07 21:20 [RFC bpf-next 0/3] bpf: adding map batch processing support Brian Vazquez
  2019-11-07 21:20 ` [RFC bpf-next 1/3] " Brian Vazquez
  2019-11-07 21:20 ` [RFC bpf-next 2/3] tools/bpf: test bpf_map_lookup_and_delete_batch() Brian Vazquez
@ 2019-11-07 21:20 ` Brian Vazquez
  2019-11-13  2:34 ` [RFC bpf-next 0/3] bpf: adding map batch processing support Yonghong Song
  3 siblings, 0 replies; 7+ messages in thread
From: Brian Vazquez @ 2019-11-07 21:20 UTC (permalink / raw)
  To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, David S . Miller
  Cc: Stanislav Fomichev, linux-kernel, netdev, bpf, Brian Vazquez,
	Yonghong Song, Petar Penkov, Willem de Bruijn

This commit adds generic batch support, which can be used by data
structures that don't have specific constraints or inconsistencies like
the hashmap. This implementation also allows us to incrementally add
support for the remaining data structures while keeping the amount of
code to maintain small.

This commit also tests the generic batch operations with pcpu hashmaps
and arrays. Note that pcpu hashmap support is added in this commit only
for demonstration purposes; a map-specific implementation should be used
instead.
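
The core of the generic implementation is a get_next_key based walk.
Condensed from __generic_map_lookup_batch() below (allocation, locking,
retries and batch-token bookkeeping elided), the loop is roughly:

    for (cp = 0; cp < max_count; cp++) {
            if (map->ops->map_get_next_key(map, prev_key, key))
                    break;                /* no more elements */
            err = bpf_map_copy_value(map, key, value,
                                     attr->batch.elem_flags, do_delete);
            if (err)
                    break;
            if (copy_to_user(keys + cp * map->key_size, key, map->key_size) ||
                copy_to_user(values + cp * value_size, value, value_size))
                    return -EFAULT;
            prev_key = key;
    }
    /* the key following the last copied element is written back to
     * uattr->batch.batch as the next 'batch' token
     */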

Cc: Yonghong Song <yhs@fb.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Petar Penkov <ppenkov@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
---
 include/linux/bpf.h                           |  12 +
 kernel/bpf/arraymap.c                         |   4 +
 kernel/bpf/hashtab.c                          |   4 +
 kernel/bpf/syscall.c                          | 506 +++++++++++++-----
 .../map_tests/map_lookup_and_delete_batch.c   | 132 ++++-
 .../map_lookup_and_delete_batch_array.c       | 118 ++++
 6 files changed, 635 insertions(+), 141 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch_array.c

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 66df540ee2473..211f5d04748cc 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -700,6 +700,18 @@ void bpf_map_charge_move(struct bpf_map_memory *dst,
 void *bpf_map_area_alloc(size_t size, int numa_node);
 void bpf_map_area_free(void *base);
 void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr);
+int  generic_map_lookup_batch(struct bpf_map *map,
+			      const union bpf_attr *attr,
+			      union bpf_attr __user *uattr);
+int  generic_map_lookup_and_delete_batch(struct bpf_map *map,
+					 const union bpf_attr *attr,
+					 union bpf_attr __user *uattr);
+int  generic_map_update_batch(struct bpf_map *map,
+			      const union bpf_attr *attr,
+			      union bpf_attr __user *uattr);
+int  generic_map_delete_batch(struct bpf_map *map,
+			      const union bpf_attr *attr,
+			      union bpf_attr __user *uattr);
 
 extern int sysctl_unprivileged_bpf_disabled;
 
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 1c65ce0098a95..5afab2c36a3ba 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -457,6 +457,10 @@ const struct bpf_map_ops array_map_ops = {
 	.map_direct_value_meta = array_map_direct_value_meta,
 	.map_seq_show_elem = array_map_seq_show_elem,
 	.map_check_btf = array_map_check_btf,
+	.map_lookup_batch = generic_map_lookup_batch,
+	.map_lookup_and_delete_batch = generic_map_lookup_and_delete_batch,
+	.map_update_batch = generic_map_update_batch,
+	.map_delete_batch = generic_map_delete_batch,
 };
 
 const struct bpf_map_ops percpu_array_map_ops = {
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 10977cb321862..d68374b1cd83e 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -1695,6 +1695,10 @@ const struct bpf_map_ops htab_percpu_map_ops = {
 	.map_update_elem = htab_percpu_map_update_elem,
 	.map_delete_elem = htab_map_delete_elem,
 	.map_seq_show_elem = htab_percpu_map_seq_show_elem,
+	.map_lookup_batch = generic_map_lookup_batch,
+	.map_lookup_and_delete_batch = generic_map_lookup_and_delete_batch,
+	.map_update_batch = generic_map_update_batch,
+	.map_delete_batch = generic_map_delete_batch,
 };
 
 const struct bpf_map_ops htab_lru_percpu_map_ops = {
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index c9e5f928d85b0..1a42ccab32113 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -127,6 +127,153 @@ static struct bpf_map *find_and_alloc_map(union bpf_attr *attr)
 	return map;
 }
 
+static u32 bpf_map_value_size(struct bpf_map *map)
+{
+	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
+	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH ||
+	    map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY ||
+	    map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE)
+		return round_up(map->value_size, 8) * num_possible_cpus();
+	else if (IS_FD_MAP(map))
+		return sizeof(u32);
+	else
+		return map->value_size;
+}
+
+static void maybe_wait_bpf_programs(struct bpf_map *map)
+{
+	/* Wait for any running BPF programs to complete so that
+	 * userspace, when we return to it, knows that all programs
+	 * that could be running use the new map value.
+	 */
+	if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS ||
+	    map->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS)
+		synchronize_rcu();
+}
+
+static int bpf_map_update_value(struct bpf_map *map, struct fd f, void *key,
+				void *value, __u64 flags)
+{
+	int err;
+	/* Need to create a kthread, thus must support schedule */
+	if (bpf_map_is_dev_bound(map)) {
+		return bpf_map_offload_update_elem(map, key, value, flags);
+	} else if (map->map_type == BPF_MAP_TYPE_CPUMAP ||
+		   map->map_type == BPF_MAP_TYPE_SOCKHASH ||
+		   map->map_type == BPF_MAP_TYPE_SOCKMAP) {
+		return map->ops->map_update_elem(map, key, value, flags);
+	}
+
+	/* must increment bpf_prog_active to avoid kprobe+bpf triggering from
+	 * inside bpf map update or delete otherwise deadlocks are possible
+	 */
+	preempt_disable();
+	__this_cpu_inc(bpf_prog_active);
+	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
+	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
+		err = bpf_percpu_hash_update(map, key, value, flags);
+	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
+		err = bpf_percpu_array_update(map, key, value, flags);
+	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) {
+		err = bpf_percpu_cgroup_storage_update(map, key, value,
+						       flags);
+	} else if (IS_FD_ARRAY(map)) {
+		rcu_read_lock();
+		err = bpf_fd_array_map_update_elem(map, f.file, key, value,
+						   flags);
+		rcu_read_unlock();
+	} else if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS) {
+		rcu_read_lock();
+		err = bpf_fd_htab_map_update_elem(map, f.file, key, value,
+						  flags);
+		rcu_read_unlock();
+	} else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) {
+		/* rcu_read_lock() is not needed */
+		err = bpf_fd_reuseport_array_update_elem(map, key, value,
+							 flags);
+	} else if (map->map_type == BPF_MAP_TYPE_QUEUE ||
+		   map->map_type == BPF_MAP_TYPE_STACK) {
+		err = map->ops->map_push_elem(map, value, flags);
+	} else {
+		rcu_read_lock();
+		err = map->ops->map_update_elem(map, key, value, flags);
+		rcu_read_unlock();
+	}
+	__this_cpu_dec(bpf_prog_active);
+	preempt_enable();
+	maybe_wait_bpf_programs(map);
+
+	return err;
+}
+
+static int bpf_map_copy_value(struct bpf_map *map, void *key, void *value,
+			      __u64 flags, bool do_delete)
+{
+	void *ptr;
+	int err;
+
+
+	if (bpf_map_is_dev_bound(map)) {
+		err = bpf_map_offload_lookup_elem(map, key, value);
+
+		if (!err && do_delete)
+			err = bpf_map_offload_delete_elem(map, key);
+
+		return err;
+	}
+
+	preempt_disable();
+	this_cpu_inc(bpf_prog_active);
+	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
+	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
+		err = bpf_percpu_hash_copy(map, key, value);
+	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
+		err = bpf_percpu_array_copy(map, key, value);
+	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) {
+		err = bpf_percpu_cgroup_storage_copy(map, key, value);
+	} else if (map->map_type == BPF_MAP_TYPE_STACK_TRACE) {
+		err = bpf_stackmap_copy(map, key, value);
+	} else if (IS_FD_ARRAY(map)) {
+		err = bpf_fd_array_map_lookup_elem(map, key, value);
+	} else if (IS_FD_HASH(map)) {
+		err = bpf_fd_htab_map_lookup_elem(map, key, value);
+	} else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) {
+		err = bpf_fd_reuseport_array_lookup_elem(map, key, value);
+	} else if (map->map_type == BPF_MAP_TYPE_QUEUE ||
+		   map->map_type == BPF_MAP_TYPE_STACK) {
+		err = map->ops->map_peek_elem(map, value);
+	} else {
+		rcu_read_lock();
+		if (map->ops->map_lookup_elem_sys_only)
+			ptr = map->ops->map_lookup_elem_sys_only(map, key);
+		else
+			ptr = map->ops->map_lookup_elem(map, key);
+		if (IS_ERR(ptr)) {
+			err = PTR_ERR(ptr);
+		} else if (!ptr) {
+			err = -ENOENT;
+		} else {
+			err = 0;
+			if (flags & BPF_F_LOCK)
+				/* lock 'ptr' and copy everything but lock */
+				copy_map_value_locked(map, value, ptr, true);
+			else
+				copy_map_value(map, value, ptr);
+			/* mask lock, since value wasn't zero inited */
+			check_and_init_map_lock(map, value);
+		}
+		rcu_read_unlock();
+	}
+	if (do_delete)
+		err = err ? err : map->ops->map_delete_elem(map, key);
+
+	this_cpu_dec(bpf_prog_active);
+	preempt_enable();
+	maybe_wait_bpf_programs(map);
+
+	return err;
+}
+
 void *bpf_map_area_alloc(size_t size, int numa_node)
 {
 	/* We really just want to fail instead of triggering OOM killer
@@ -740,7 +887,7 @@ static int map_lookup_elem(union bpf_attr *attr)
 	void __user *uvalue = u64_to_user_ptr(attr->value);
 	int ufd = attr->map_fd;
 	struct bpf_map *map;
-	void *key, *value, *ptr;
+	void *key, *value;
 	u32 value_size;
 	struct fd f;
 	int err;
@@ -772,72 +919,14 @@ static int map_lookup_elem(union bpf_attr *attr)
 		goto err_put;
 	}
 
-	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
-	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH ||
-	    map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY ||
-	    map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE)
-		value_size = round_up(map->value_size, 8) * num_possible_cpus();
-	else if (IS_FD_MAP(map))
-		value_size = sizeof(u32);
-	else
-		value_size = map->value_size;
+	value_size = bpf_map_value_size(map);
 
 	err = -ENOMEM;
 	value = kmalloc(value_size, GFP_USER | __GFP_NOWARN);
 	if (!value)
 		goto free_key;
 
-	if (bpf_map_is_dev_bound(map)) {
-		err = bpf_map_offload_lookup_elem(map, key, value);
-		goto done;
-	}
-
-	preempt_disable();
-	this_cpu_inc(bpf_prog_active);
-	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
-	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
-		err = bpf_percpu_hash_copy(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
-		err = bpf_percpu_array_copy(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) {
-		err = bpf_percpu_cgroup_storage_copy(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_STACK_TRACE) {
-		err = bpf_stackmap_copy(map, key, value);
-	} else if (IS_FD_ARRAY(map)) {
-		err = bpf_fd_array_map_lookup_elem(map, key, value);
-	} else if (IS_FD_HASH(map)) {
-		err = bpf_fd_htab_map_lookup_elem(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) {
-		err = bpf_fd_reuseport_array_lookup_elem(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_QUEUE ||
-		   map->map_type == BPF_MAP_TYPE_STACK) {
-		err = map->ops->map_peek_elem(map, value);
-	} else {
-		rcu_read_lock();
-		if (map->ops->map_lookup_elem_sys_only)
-			ptr = map->ops->map_lookup_elem_sys_only(map, key);
-		else
-			ptr = map->ops->map_lookup_elem(map, key);
-		if (IS_ERR(ptr)) {
-			err = PTR_ERR(ptr);
-		} else if (!ptr) {
-			err = -ENOENT;
-		} else {
-			err = 0;
-			if (attr->flags & BPF_F_LOCK)
-				/* lock 'ptr' and copy everything but lock */
-				copy_map_value_locked(map, value, ptr, true);
-			else
-				copy_map_value(map, value, ptr);
-			/* mask lock, since value wasn't zero inited */
-			check_and_init_map_lock(map, value);
-		}
-		rcu_read_unlock();
-	}
-	this_cpu_dec(bpf_prog_active);
-	preempt_enable();
-
-done:
+	err = bpf_map_copy_value(map, key, value, attr->flags, false);
 	if (err)
 		goto free_value;
 
@@ -856,16 +945,6 @@ static int map_lookup_elem(union bpf_attr *attr)
 	return err;
 }
 
-static void maybe_wait_bpf_programs(struct bpf_map *map)
-{
-	/* Wait for any running BPF programs to complete so that
-	 * userspace, when we return to it, knows that all programs
-	 * that could be running use the new map value.
-	 */
-	if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS ||
-	    map->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS)
-		synchronize_rcu();
-}
 
 #define BPF_MAP_UPDATE_ELEM_LAST_FIELD flags
 
@@ -921,56 +1000,8 @@ static int map_update_elem(union bpf_attr *attr)
 	if (copy_from_user(value, uvalue, value_size) != 0)
 		goto free_value;
 
-	/* Need to create a kthread, thus must support schedule */
-	if (bpf_map_is_dev_bound(map)) {
-		err = bpf_map_offload_update_elem(map, key, value, attr->flags);
-		goto out;
-	} else if (map->map_type == BPF_MAP_TYPE_CPUMAP ||
-		   map->map_type == BPF_MAP_TYPE_SOCKHASH ||
-		   map->map_type == BPF_MAP_TYPE_SOCKMAP) {
-		err = map->ops->map_update_elem(map, key, value, attr->flags);
-		goto out;
-	}
+	err = bpf_map_update_value(map, f, key, value, attr->flags);
 
-	/* must increment bpf_prog_active to avoid kprobe+bpf triggering from
-	 * inside bpf map update or delete otherwise deadlocks are possible
-	 */
-	preempt_disable();
-	__this_cpu_inc(bpf_prog_active);
-	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
-	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
-		err = bpf_percpu_hash_update(map, key, value, attr->flags);
-	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
-		err = bpf_percpu_array_update(map, key, value, attr->flags);
-	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) {
-		err = bpf_percpu_cgroup_storage_update(map, key, value,
-						       attr->flags);
-	} else if (IS_FD_ARRAY(map)) {
-		rcu_read_lock();
-		err = bpf_fd_array_map_update_elem(map, f.file, key, value,
-						   attr->flags);
-		rcu_read_unlock();
-	} else if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS) {
-		rcu_read_lock();
-		err = bpf_fd_htab_map_update_elem(map, f.file, key, value,
-						  attr->flags);
-		rcu_read_unlock();
-	} else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) {
-		/* rcu_read_lock() is not needed */
-		err = bpf_fd_reuseport_array_update_elem(map, key, value,
-							 attr->flags);
-	} else if (map->map_type == BPF_MAP_TYPE_QUEUE ||
-		   map->map_type == BPF_MAP_TYPE_STACK) {
-		err = map->ops->map_push_elem(map, value, attr->flags);
-	} else {
-		rcu_read_lock();
-		err = map->ops->map_update_elem(map, key, value, attr->flags);
-		rcu_read_unlock();
-	}
-	__this_cpu_dec(bpf_prog_active);
-	preempt_enable();
-	maybe_wait_bpf_programs(map);
-out:
 free_value:
 	kfree(value);
 free_key:
@@ -1096,6 +1127,241 @@ static int map_get_next_key(union bpf_attr *attr)
 	return err;
 }
 
+int generic_map_delete_batch(struct bpf_map *map,
+			     const union bpf_attr *attr,
+			     union bpf_attr __user *uattr)
+{
+	void __user *keys = u64_to_user_ptr(attr->batch.keys);
+	int ufd = attr->map_fd;
+	u32 cp, max_count;
+	struct fd f;
+	void *key;
+	int err;
+
+	f = fdget(ufd);
+	if (attr->batch.elem_flags & ~BPF_F_LOCK)
+		return -EINVAL;
+
+	if ((attr->batch.elem_flags & BPF_F_LOCK) &&
+	    !map_value_has_spin_lock(map)) {
+		err = -EINVAL;
+		goto err_put;
+	}
+
+	max_count = attr->batch.count;
+	if (!max_count)
+		return 0;
+
+	err = -ENOMEM;
+	for (cp = 0; cp < max_count; cp++) {
+		key = __bpf_copy_key(keys + cp * map->key_size, map->key_size);
+		if (IS_ERR(key)) {
+			err = PTR_ERR(key);
+			break;
+		}
+
+		if (err)
+			break;
+		if (bpf_map_is_dev_bound(map)) {
+			err = bpf_map_offload_delete_elem(map, key);
+			break;
+		}
+
+		preempt_disable();
+		__this_cpu_inc(bpf_prog_active);
+		rcu_read_lock();
+		err = map->ops->map_delete_elem(map, key);
+		rcu_read_unlock();
+		__this_cpu_dec(bpf_prog_active);
+		preempt_enable();
+		maybe_wait_bpf_programs(map);
+		if (err)
+			break;
+	}
+	if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
+		err = -EFAULT;
+err_put:
+	return err;
+}
+int generic_map_update_batch(struct bpf_map *map,
+			     const union bpf_attr *attr,
+			     union bpf_attr __user *uattr)
+{
+	void __user *values = u64_to_user_ptr(attr->batch.values);
+	void __user *keys = u64_to_user_ptr(attr->batch.keys);
+	u32 value_size, cp, max_count;
+	int ufd = attr->map_fd;
+	void *key, *value;
+	struct fd f;
+	int err;
+
+	f = fdget(ufd);
+	if (attr->batch.elem_flags & ~BPF_F_LOCK)
+		return -EINVAL;
+
+	if ((attr->batch.elem_flags & BPF_F_LOCK) &&
+	    !map_value_has_spin_lock(map)) {
+		err = -EINVAL;
+		goto err_put;
+	}
+
+	value_size = bpf_map_value_size(map);
+
+	max_count = attr->batch.count;
+	if (!max_count)
+		return 0;
+
+	err = -ENOMEM;
+	value = kmalloc(value_size, GFP_USER | __GFP_NOWARN);
+	if (!value)
+		goto err_put;
+
+	for (cp = 0; cp < max_count; cp++) {
+		key = __bpf_copy_key(keys + cp * map->key_size, map->key_size);
+		if (IS_ERR(key)) {
+			err = PTR_ERR(key);
+			break;
+		}
+		err = -EFAULT;
+		if (copy_from_user(value, values + cp * value_size, value_size))
+			break;
+
+		err = bpf_map_update_value(map, f, key, value,
+					   attr->batch.elem_flags);
+
+		if (err)
+			break;
+	}
+
+	if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
+		err = -EFAULT;
+
+	kfree(value);
+err_put:
+	return err;
+}
+
+static int __generic_map_lookup_batch(struct bpf_map *map,
+				      const union bpf_attr *attr,
+				      union bpf_attr __user *uattr,
+				      bool do_delete)
+{
+	void __user *values = u64_to_user_ptr(attr->batch.values);
+	void __user *keys = u64_to_user_ptr(attr->batch.keys);
+	void *buf, *prev_key, *key, *value;
+	u32 value_size, cp, max_count;
+	bool first_key = false;
+	u64 batch;
+	int err, retry = 3;
+
+	if (attr->batch.elem_flags & ~BPF_F_LOCK)
+		return -EINVAL;
+
+	if ((attr->batch.elem_flags & BPF_F_LOCK) &&
+	    !map_value_has_spin_lock(map)) {
+		err = -EINVAL;
+		goto err_put;
+	}
+
+	if (map->map_type == BPF_MAP_TYPE_QUEUE ||
+	    map->map_type == BPF_MAP_TYPE_STACK) {
+		err = -ENOTSUPP;
+		goto err_put;
+	}
+
+	value_size = bpf_map_value_size(map);
+
+	max_count = attr->batch.count;
+	if (!max_count)
+		return 0;
+
+	batch = attr->batch.batch;
+
+	err = -ENOMEM;
+	buf = kmalloc(map->key_size + value_size, GFP_USER | __GFP_NOWARN);
+	if (!buf)
+		goto err_put;
+
+	key = buf;
+	value = key + map->key_size;
+	if (batch) {
+		memcpy(key, &batch, map->key_size);
+	} else {
+		prev_key = NULL;
+		first_key = true;
+	}
+
+
+	for (cp = 0; cp < max_count; cp++) {
+		if (cp || !batch) {
+			rcu_read_lock();
+			err = map->ops->map_get_next_key(map, prev_key, key);
+			rcu_read_unlock();
+			if (err)
+				break;
+		}
+		err = bpf_map_copy_value(map, key, value,
+					 attr->batch.elem_flags, do_delete);
+
+		if (err == -ENOENT) {
+			if (retry) {
+				retry--;
+				continue;
+			}
+			err = -EINTR;
+			break;
+		}
+
+		if (err)
+			goto free_buf;
+
+		if (copy_to_user(keys + cp * map->key_size, key,
+				 map->key_size)) {
+			err = -EFAULT;
+			goto free_buf;
+		}
+		if (copy_to_user(values + cp * value_size, value, value_size)) {
+			err = -EFAULT;
+			goto free_buf;
+		}
+
+		prev_key = key;
+		retry = 3;
+	}
+	if (!err) {
+		rcu_read_lock();
+		err = map->ops->map_get_next_key(map, prev_key, key);
+		rcu_read_unlock();
+	}
+
+	if (err == -ENOENT && (cp || do_delete)) {
+		err = 0;
+		memset(key, 0, map->key_size);
+	}
+	if (!err && (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)) ||
+		    (copy_to_user(&uattr->batch.batch, key, map->key_size))))
+		err = -EFAULT;
+
+free_buf:
+	kfree(buf);
+err_put:
+	return err;
+}
+
+int generic_map_lookup_batch(struct bpf_map *map,
+			     const union bpf_attr *attr,
+			     union bpf_attr __user *uattr)
+{
+	return __generic_map_lookup_batch(map, attr, uattr, false);
+}
+
+int generic_map_lookup_and_delete_batch(struct bpf_map *map,
+					const union bpf_attr *attr,
+					union bpf_attr __user *uattr)
+{
+	return __generic_map_lookup_batch(map, attr, uattr, true);
+}
+
 #define BPF_MAP_LOOKUP_AND_DELETE_ELEM_LAST_FIELD value
 
 static int map_lookup_and_delete_elem(union bpf_attr *attr)
diff --git a/tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch.c b/tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch.c
index dd906b1de5950..60fe30793fa22 100644
--- a/tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch.c
+++ b/tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch.c
@@ -7,16 +7,26 @@
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
 
+#include <bpf_util.h>
 #include <test_maps.h>
 
 static void map_batch_update(int map_fd, __u32 max_entries, int *keys,
-			     int *values)
+			     void *values, bool is_pcpu)
 {
-	int i, err;
+	typedef BPF_DECLARE_PERCPU(int, value);
+	int i, j, err;
+	value *v;
+
+	if (is_pcpu)
+		v = (value *)values;
 
 	for (i = 0; i < max_entries; i++) {
 		keys[i] = i + 1;
-		values[i] = i + 2;
+		if (is_pcpu)
+			for (j = 0; j < bpf_num_possible_cpus(); j++)
+				bpf_percpu(v[i], j) = i + 2 + j;
+		else
+			((int *)values)[i] = i + 2;
 	}
 
 	err = bpf_map_update_batch(map_fd, keys, values, &max_entries, 0, 0);
@@ -24,15 +34,32 @@ static void map_batch_update(int map_fd, __u32 max_entries, int *keys,
 }
 
 static void map_batch_verify(int *visited, __u32 max_entries,
-			     int *keys, int *values)
+			     int *keys, void *values, bool is_pcpu)
 {
-	int i;
+	typedef BPF_DECLARE_PERCPU(int, value);
+	value *v;
+	int i, j;
+
+	if (is_pcpu)
+		v = (value *)values;
 
 	memset(visited, 0, max_entries * sizeof(*visited));
 	for (i = 0; i < max_entries; i++) {
-		CHECK(keys[i] + 1 != values[i], "key/value checking",
-		      "error: i %d key %d value %d\n", i, keys[i], values[i]);
+
+		if (is_pcpu) {
+			for (j = 0; j < bpf_num_possible_cpus(); j++) {
+				CHECK(keys[i] + 1 + j != bpf_percpu(v[i], j),
+				      "key/value checking",
+				      "error: i %d j %d key %d value %d\n",
+				      i, j, keys[i], bpf_percpu(v[i],  j));
+			}
+		} else {
+			CHECK(keys[i] + 1 != ((int *)values)[i], "key/value checking",
+			      "error: i %d key %d value %d\n", i, keys[i], ((int *)values)[i]);
+		}
+
 		visited[i] = 1;
+
 	}
 	for (i = 0; i < max_entries; i++) {
 		CHECK(visited[i] != 1, "visited checking",
@@ -40,18 +67,21 @@ static void map_batch_verify(int *visited, __u32 max_entries,
 	}
 }
 
-void test_map_lookup_and_delete_batch(void)
-{
+void __test_map_lookup_and_delete_batch(bool is_pcpu){
+	int map_type = is_pcpu ? BPF_MAP_TYPE_PERCPU_HASH : BPF_MAP_TYPE_HASH;
 	struct bpf_create_map_attr xattr = {
 		.name = "hash_map",
-		.map_type = BPF_MAP_TYPE_HASH,
+		.map_type = map_type,
 		.key_size = sizeof(int),
 		.value_size = sizeof(int),
 	};
-	int map_fd, *keys, *values, *visited, key;
+	typedef BPF_DECLARE_PERCPU(int, value);
+	int map_fd, *keys, *visited, key;
+	void *values;
 	__u32 count, total, total_success;
 	const __u32 max_entries = 10;
-	int err, i, step;
+	int err, i, step, value_size;
+	value pcpu_values[10];
 	bool nospace_err;
 	__u64 batch = 0;
 
@@ -60,8 +90,12 @@ void test_map_lookup_and_delete_batch(void)
 	CHECK(map_fd == -1,
 	      "bpf_create_map_xattr()", "error:%s\n", strerror(errno));
 
-	keys = malloc(max_entries * sizeof(int));
-	values = malloc(max_entries * sizeof(int));
+	value_size = is_pcpu ? sizeof(value) : sizeof(int);
+	keys = malloc(max_entries * sizeof(int));
+	if (is_pcpu)
+		values = pcpu_values;
+	else
+		values = malloc(max_entries * sizeof(int));
 	visited = malloc(max_entries * sizeof(int));
 	CHECK(!keys || !values || !visited, "malloc()", "error:%s\n", strerror(errno));
 
@@ -73,7 +107,7 @@ void test_map_lookup_and_delete_batch(void)
 	CHECK(batch || count, "empty map", "batch = %lld, count = %u\n", batch, count);
 
 	/* populate elements to the map */
-	map_batch_update(map_fd, max_entries, keys, values);
+	map_batch_update(map_fd, max_entries, keys, values, is_pcpu);
 
 	/* test 2: lookup/delete with count = 0, success */
 	batch = 0;
@@ -84,7 +118,7 @@ void test_map_lookup_and_delete_batch(void)
 
 	/* test 3: lookup/delete with count = max_entries, success */
 	memset(keys, 0, max_entries * sizeof(*keys));
-	memset(values, 0, max_entries * sizeof(*values));
+	memset(values, 0, max_entries * value_size);
 	count = max_entries;
 	batch = 0;
 	err = bpf_map_lookup_and_delete_batch(map_fd, &batch, keys,
@@ -93,7 +127,7 @@ void test_map_lookup_and_delete_batch(void)
 	CHECK(count != max_entries || batch != 0, "count = max_entries",
 	      "count = %u, max_entries = %u, batch = %lld\n",
 	      count, max_entries, batch);
-	map_batch_verify(visited, max_entries, keys, values);
+	map_batch_verify(visited, max_entries, keys, values, is_pcpu);
 
 	/* bpf_map_get_next_key() should return -ENOENT for an empty map. */
 	err = bpf_map_get_next_key(map_fd, NULL, &key);
@@ -102,9 +136,48 @@ void test_map_lookup_and_delete_batch(void)
 	/* test 4: lookup/delete in a loop with various steps. */
 	total_success = 0;
 	for (step = 1; step < max_entries; step++) {
-		map_batch_update(map_fd, max_entries, keys, values);
+		map_batch_update(map_fd, max_entries, keys, values, is_pcpu);
+		memset(keys, 0, max_entries * sizeof(*keys));
+		memset(values, 0, max_entries * value_size);
+		batch = 0;
+		total = 0;
+		i = 0;
+		/* iteratively lookup elements with 'step' elements each */
+		count = step;
+		nospace_err = false;
+		while (true) {
+			err = bpf_map_lookup_batch(map_fd, &batch,
+						   keys + total,
+						   values + total * value_size,
+						   &count, 0, 0);
+			/* It is possible that we are failing due to buffer size
+			 * not big enough. In such cases, let us just exit and
+			 * go with large steps. Note that a buffer size with
+			 * max_entries should always work.
+			 */
+			if (err && errno == ENOSPC) {
+				nospace_err = true;
+				break;
+			}
+
+			CHECK(err, "lookup with steps", "error: %s\n",
+			      strerror(errno));
+
+			total += count;
+			if (err || batch == 0)
+				break;
+
+			i++;
+		}
+		if (nospace_err == true)
+			continue;
+
+		CHECK(total != max_entries, "lookup with steps",
+		      "total = %u, max_entries = %u\n", total, max_entries);
+		map_batch_verify(visited, max_entries, keys, values, is_pcpu);
+
 		memset(keys, 0, max_entries * sizeof(*keys));
-		memset(values, 0, max_entries * sizeof(*values));
+		memset(values, 0, max_entries * value_size);
 		batch = 0;
 		total = 0;
 		i = 0;
@@ -114,7 +187,7 @@ void test_map_lookup_and_delete_batch(void)
 		while (true) {
 			err = bpf_map_lookup_and_delete_batch(map_fd, &batch,
 							      keys + total,
-							      values + total,
+							      values + total * value_size,
 							      &count, 0, 0);
 			/* It is possible that we are failing due to buffer size
 			 * not big enough. In such cases, let us just exit and
@@ -142,7 +215,7 @@ void test_map_lookup_and_delete_batch(void)
 		CHECK(total != max_entries, "lookup/delete with steps",
 		      "total = %u, max_entries = %u\n", total, max_entries);
 
-		map_batch_verify(visited, max_entries, keys, values);
+		map_batch_verify(visited, max_entries, keys, values, is_pcpu);
 		err = bpf_map_get_next_key(map_fd, NULL, &key);
 		CHECK(!err, "bpf_map_get_next_key()", "error: %s\n", strerror(errno));
 
@@ -150,6 +223,23 @@ void test_map_lookup_and_delete_batch(void)
 	}
 
 	CHECK(total_success == 0, "check total_success", "unexpected failure\n");
+}
+
+void test_hmap_lookup_and_delete_batch(void)
+{
+	__test_map_lookup_and_delete_batch(false);
+	printf("%s:PASS\n", __func__);
+}
 
+void test_pcpu_hmap_lookup_and_delete_batch(void)
+{
+	__test_map_lookup_and_delete_batch(true);
 	printf("%s:PASS\n", __func__);
 }
+
+void test_map_lookup_and_delete_batch(void)
+{
+	test_hmap_lookup_and_delete_batch();
+	test_pcpu_hmap_lookup_and_delete_batch();
+}
+
diff --git a/tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch_array.c b/tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch_array.c
new file mode 100644
index 0000000000000..e4ea6e45f038c
--- /dev/null
+++ b/tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch_array.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <stdio.h>
+#include <errno.h>
+#include <string.h>
+
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+
+#include <test_maps.h>
+
+static void map_batch_update(int map_fd, __u32 max_entries, int *keys,
+			     int *values)
+{
+	int i, err;
+
+	for (i = 0; i < max_entries; i++) {
+		keys[i] = i;
+		values[i] = i + 1;
+	}
+
+	err = bpf_map_update_batch(map_fd, keys, values, &max_entries, 0, 0);
+	CHECK(err, "bpf_map_update_batch()", "error:%s\n", strerror(errno));
+}
+
+static void map_batch_verify(int *visited, __u32 max_entries,
+			     int *keys, int *values)
+{
+	int i;
+
+	memset(visited, 0, max_entries * sizeof(*visited));
+	for (i = 0; i < max_entries; i++) {
+		CHECK(keys[i] + 1 != values[i], "key/value checking",
+		      "error: i %d key %d value %d\n", i, keys[i], values[i]);
+		visited[i] = 1;
+	}
+	for (i = 0; i < max_entries; i++) {
+		CHECK(visited[i] != 1, "visited checking",
+		      "error: keys array at index %d missing\n", i);
+	}
+}
+
+void test_map_lookup_and_delete_batch_array(void)
+{
+	struct bpf_create_map_attr xattr = {
+		.name = "array_map",
+		.map_type = BPF_MAP_TYPE_ARRAY,
+		.key_size = sizeof(int),
+		.value_size = sizeof(int),
+	};
+	int map_fd, *keys, *values, *visited;
+	__u32 count, total, total_success;
+	const __u32 max_entries = 10;
+	int err, i, step;
+	bool nospace_err;
+	__u64 batch = 0;
+
+	xattr.max_entries = max_entries;
+	map_fd = bpf_create_map_xattr(&xattr);
+	CHECK(map_fd == -1,
+	      "bpf_create_map_xattr()", "error:%s\n", strerror(errno));
+
+	keys = malloc(max_entries * sizeof(int));
+	values = malloc(max_entries * sizeof(int));
+	visited = malloc(max_entries * sizeof(int));
+	CHECK(!keys || !values || !visited, "malloc()", "error:%s\n",
+	      strerror(errno));
+
+	/* populate elements to the map */
+	map_batch_update(map_fd, max_entries, keys, values);
+
+	/* test 1: lookup in a loop with various steps. */
+	total_success = 0;
+	for (step = 1; step < max_entries; step++) {
+		map_batch_update(map_fd, max_entries, keys, values);
+		memset(keys, 0, max_entries * sizeof(*keys));
+		memset(values, 0, max_entries * sizeof(*values));
+		batch = 0;
+		total = 0;
+		i = 0;
+		/* iteratively lookup elements with 'step'
+		 * elements each.
+		 */
+		count = step;
+		nospace_err = false;
+		while (true) {
+			err = bpf_map_lookup_batch(map_fd, &batch,
+						   keys + total,
+						   values + total,
+						   &count, 0, 0);
+
+			CHECK(err, "lookup with steps", "error: %s\n",
+			      strerror(errno));
+
+			total += count;
+
+			if (err || batch == 0)
+				break;
+
+			i++;
+		}
+
+		if (nospace_err == true)
+			continue;
+
+		CHECK(total != max_entries, "lookup with steps",
+		      "total = %u, max_entries = %u\n", total, max_entries);
+
+		map_batch_verify(visited, max_entries, keys, values);
+
+		total_success++;
+	}
+
+	CHECK(total_success == 0, "check total_success",
+	      "unexpected failure\n");
+
+	printf("%s:PASS\n", __func__);
+}
-- 
2.24.0.432.g9d3f5f5b63-goog


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC bpf-next 0/3] bpf: adding map batch processing support
  2019-11-07 21:20 [RFC bpf-next 0/3] bpf: adding map batch processing support Brian Vazquez
                   ` (2 preceding siblings ...)
  2019-11-07 21:20 ` [RFC bpf-next 3/3] bpf: add generic batch support Brian Vazquez
@ 2019-11-13  2:34 ` Yonghong Song
       [not found]   ` <CAMzD94TekSSCCAZD4jZiUpdfMKDqWcwdNf42b_heTGvv1K-=fg@mail.gmail.com>
  3 siblings, 1 reply; 7+ messages in thread
From: Yonghong Song @ 2019-11-13  2:34 UTC (permalink / raw)
  To: Brian Vazquez, Brian Vazquez, Alexei Starovoitov,
	Daniel Borkmann, David S . Miller
  Cc: Stanislav Fomichev, linux-kernel, netdev, bpf, Petar Penkov,
	Willem de Bruijn



On 11/7/19 1:20 PM, Brian Vazquez wrote:
> This is a follow up in the effort to batch bpf map operations to reduce
> the syscall overhead with the map_ops. I initially proposed the idea and
> the discussion is here:
> https://lore.kernel.org/bpf/20190724165803.87470-1-brianvv@google.com/
> 
> Yonghong talked at the LPC about this and also proposed and idea that
> handles the special weird case of hashtables by doing traversing using
> the buckets as a reference instead of a key. Discussion is here:
> https://lore.kernel.org/bpf/20190906225434.3635421-1-yhs@fb.com/
> 
> This RFC proposes a way to extend batch operations for more data
> structures by creating generic batch functions that can be used instead
> of implementing the operations for each individual data structure,
> reducing the code that needs to be maintained. The series contains the
> patches used in Yonghong's RFC and the patch that adds the generic
> implementation of the operations plus some testing with pcpu hashmaps
> and arrays. Note that pcpu hashmap shouldn't use the generic
> implementation and it either should have its own implementation or share
> the one introduced by Yonghong, I added that just as an example to show
> that the generic implementation can be easily added to a data structure.
> 
> What I want to achieve with this RFC is to collect early feedback and see if
> there's any major concern about this before I move forward. I do plan
> to better separate this into different patches and explain them properly
> in the commit messages.

Thanks Brian for working on this. The general approach described here
looks good to me. Having a generic implementation for batch operations
makes sense for maps (though not for the hash table, queue/stack, etc.).
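
For an array map, for example, I would expect this to boil down to just
pointing the new ops at the generic helpers, roughly like the fragment
below (the bpf_map_ops member names and generic_map_update_batch are my
guess at what the series adds; generic_map_lookup_batch is the helper
from the generic batch patch):

	const struct bpf_map_ops array_map_ops = {
		/* ... existing callbacks ... */
		.map_lookup_batch = generic_map_lookup_batch,
		.map_update_batch = generic_map_update_batch,
	};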

> 
> Current known issues where I would like to discuss are the followings:
> 
> - Because Yonghong's UAPI definition was done specifically for
>    iterating buckets, the batch field is u64 and is treated as an u64
>    instead of an opaque pointer, this won't work for other data structures
>    that are going to use a key as a batch token with a size greater than
>    64. Although I think at this point the only key that couldn't be
>    treated as a u64 is the key of a hashmap, and the hashmap won't use
>    the generic interface.

The u64 can be changed to a __aligned_u64 opaque value. This way,
it can represent either a pointer or a 64-bit value.
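
Concretely, just as a sketch of how the batch struct could look with that
change (the rest of the layout is kept from the current RFC):

	struct { /* struct used by BPF_MAP_*_BATCH commands */
		__aligned_u64	batch;	/* opaque: 64-bit value or user
					 * pointer, 0/NULL to start from
					 * the beginning
					 */
		__aligned_u64	keys;
		__aligned_u64	values;
		__u32		count;	/* input/output */
		__u32		map_fd;
		__u64		elem_flags;
		__u64		flags;
	} batch;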

> - Not all the data structures use delete (because it's not a valid
>    operation) i.e. arrays. So maybe lookup_and_delete_batch command is
>    not needed and we can handle that operation with a lookup_batch and a
>    flag.

This makes sense.

> - For delete_batch (not just the lookup_and_delete_batch). Is this
>    operation really needed? If so, shouldn't it be better if the
>    behaviour is delete the keys provided? I did that with my generic
>    implementation but Yonghong's delete_batch for a hashmap deletes
>    buckets.

We need batched delete in bcc. lookup_and_delete_batch is better as
it preserves more new map entries. Alternatively, deleting
all entries after the lookup is another option, but this may remove
more new map entries. Statistically this may or may not matter, though.

bcc does have a clear_table (clear_map) API, but it is not clear who is using it.

So, I did not have a concrete use case for delete_batch yet.
I tend to think we should have delete_batch for API completeness,
but maybe other people can comment on this as well.

Maybe in the initial patch we can skip it. But we should still ensure
the user interface data structure can handle batch delete if it is
needed later. The current data structure should handle this
as far as I know.

> 
> Brian Vazquez (1):
>    bpf: add generic batch support
> 
> Yonghong Song (2):
>    bpf: adding map batch processing support
>    tools/bpf: test bpf_map_lookup_and_delete_batch()
> 
>   include/linux/bpf.h                           |  21 +
>   include/uapi/linux/bpf.h                      |  22 +
>   kernel/bpf/arraymap.c                         |   4 +
>   kernel/bpf/hashtab.c                          | 331 ++++++++++
>   kernel/bpf/syscall.c                          | 573 ++++++++++++++----
>   tools/include/uapi/linux/bpf.h                |  22 +
>   tools/lib/bpf/bpf.c                           |  59 ++
>   tools/lib/bpf/bpf.h                           |  13 +
>   tools/lib/bpf/libbpf.map                      |   4 +
>   .../map_tests/map_lookup_and_delete_batch.c   | 245 ++++++++
>   .../map_lookup_and_delete_batch_array.c       | 118 ++++
>   11 files changed, 1292 insertions(+), 120 deletions(-)
>   create mode 100644 tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch.c
>   create mode 100644 tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch_array.c
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC bpf-next 0/3] bpf: adding map batch processing support
       [not found]   ` <CAMzD94TekSSCCAZD4jZiUpdfMKDqWcwdNf42b_heTGvv1K-=fg@mail.gmail.com>
@ 2019-11-13 22:26     ` Yonghong Song
  0 siblings, 0 replies; 7+ messages in thread
From: Yonghong Song @ 2019-11-13 22:26 UTC (permalink / raw)
  To: Brian Vazquez
  Cc: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann,
	David S . Miller, Stanislav Fomichev, linux-kernel, netdev, bpf,
	Petar Penkov, Willem de Bruijn



On 11/13/19 2:07 PM, Brian Vazquez wrote:
> Hi Yonghong,
> 
> Thanks for reviewing it! I'm preparing the changes and will submit them
> soon.
> 
> As for the right way to manage author rights, does anyone know what the 
> correct approach is? Should I use Yonghong's patch and apply the 
> extended support in different patches (i.e. support per_cpu maps, change 
> batch from u64 to __aligned_u64, etc.), or is it fine to apply the changes
> in place and write both sign-offs?

The logic flow of the patch set is most important.
You can add me as a co-signoff if you reuse a significant portion of my code.
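
For the tags themselves, something along these lines in the commit message
should work (with you as the author/submitter signing off last):

	Co-developed-by: Yonghong Song <yhs@fb.com>
	Signed-off-by: Yonghong Song <yhs@fb.com>
	Signed-off-by: Brian Vazquez <brianvv@google.com>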

> 
> Thanks,
> Brian
> 
> On Tue, Nov 12, 2019 at 6:34 PM Yonghong Song <yhs@fb.com> wrote:
> 
> 
> 
>     On 11/7/19 1:20 PM, Brian Vazquez wrote:
>      > This is a follow up in the effort to batch bpf map operations to
>     reduce
>      > the syscall overhead with the map_ops. I initially proposed the
>     idea and
>      > the discussion is here:
>      >
>     https://lore.kernel.org/bpf/20190724165803.87470-1-brianvv@google.com/
>      >
>      > Yonghong talked at the LPC about this and also proposed and idea that
>      > handles the special weird case of hashtables by doing traversing
>     using
>      > the buckets as a reference instead of a key. Discussion is here:
>      > https://lore.kernel.org/bpf/20190906225434.3635421-1-yhs@fb.com/
>      >
>      > This RFC proposes a way to extend batch operations for more data
>      > structures by creating generic batch functions that can be used
>     instead
>      > of implementing the operations for each individual data structure,
>      > reducing the code that needs to be maintained. The series
>     contains the
>      > patches used in Yonghong's RFC and the patch that adds the generic
>      > implementation of the operations plus some testing with pcpu hashmaps
>      > and arrays. Note that pcpu hashmap shouldn't use the generic
>      > implementation and it either should have its own implementation
>     or share
>      > the one introduced by Yonghong, I added that just as an example
>     to show
>      > that the generic implementation can be easily added to a data
>     structure.
>      >
>      > What I want to achieve with this RFC is to collect early feedback
>     and see if
>      > there's any major concern about this before I move forward. I do plan
>      > to better separate this into different patches and explain them
>     properly
>      > in the commit messages.
> 
>     Thanks Brian for working on this. The general approach described here
>     is good to me. Having a generic implementation for batch operations
>     looks good for maps (not hash table, queue/stack, etc.)
> 
>      >
>      > Current known issues where I would like to discuss are the
>     followings:
>      >
>      > - Because Yonghong's UAPI definition was done specifically for
>      >    iterating buckets, the batch field is u64 and is treated as an u64
>      >    instead of an opaque pointer, this won't work for other data
>     structures
>      >    that are going to use a key as a batch token with a size
>     greater than
>      >    64. Although I think at this point the only key that couldn't be
>      >    treated as a u64 is the key of a hashmap, and the hashmap
>     won't use
>      >    the generic interface.
> 
>     The u64 can be changed with a __aligned_u64 opaque value. This way,
>     it can represent a pointer or a 64bit value.
> 
>      > - Not all the data structures use delete (because it's not a valid
>      >    operation) i.e. arrays. So maybe lookup_and_delete_batch
>     command is
>      >    not needed and we can handle that operation with a
>     lookup_batch and a
>      >    flag.
> 
>     This make sense.
> 
>      > - For delete_batch (not just the lookup_and_delete_batch). Is this
>      >    operation really needed? If so, shouldn't it be better if the
>      >    behaviour is delete the keys provided? I did that with my generic
>      >    implementation but Yonghong's delete_batch for a hashmap deletes
>      >    buckets.
> 
>     We need batched delete in bcc. lookup_and_delete_batch is better as
>     it can preserves more new map entries. Alternatively, deleting
>     all entries after lookup is another option. But this may remove
>     more new map entries. Statistically this may or may not matter though.
> 
>     bcc does have a clear_table (clear_map) API, but not clear who is
>     using it.
> 
>     So, I did not have a concrete use case for delete_batch yet.
>     I tend to think we should have delete_batch for API completeness,
>     but maybe other people can comment on this as well.
> 
>     Maybe initial patch, we can skip it. But we should still ensure
>     user interface data structure can handle batch delete if it is
>     needed later. The current data structure should handle this
>     as far as I know.
> 
>      >
>      > Brian Vazquez (1):
>      >    bpf: add generic batch support
>      >
>      > Yonghong Song (2):
>      >    bpf: adding map batch processing support
>      >    tools/bpf: test bpf_map_lookup_and_delete_batch()
>      >
>      >   include/linux/bpf.h                           |  21 +
>      >   include/uapi/linux/bpf.h                      |  22 +
>      >   kernel/bpf/arraymap.c                         |   4 +
>      >   kernel/bpf/hashtab.c                          | 331 ++++++++++
>      >   kernel/bpf/syscall.c                          | 573
>     ++++++++++++++----
>      >   tools/include/uapi/linux/bpf.h                |  22 +
>      >   tools/lib/bpf/bpf.c                           |  59 ++
>      >   tools/lib/bpf/bpf.h                           |  13 +
>      >   tools/lib/bpf/libbpf.map                      |   4 +
>      >   .../map_tests/map_lookup_and_delete_batch.c   | 245 ++++++++
>      >   .../map_lookup_and_delete_batch_array.c       | 118 ++++
>      >   11 files changed, 1292 insertions(+), 120 deletions(-)
>      >   create mode 100644
>     tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch.c
>      >   create mode 100644
>     tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch_array.c
>      >
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC bpf-next 2/3] tools/bpf: test bpf_map_lookup_and_delete_batch()
  2019-11-07 21:20 ` [RFC bpf-next 2/3] tools/bpf: test bpf_map_lookup_and_delete_batch() Brian Vazquez
@ 2019-11-15 22:42   ` Andrii Nakryiko
  0 siblings, 0 replies; 7+ messages in thread
From: Andrii Nakryiko @ 2019-11-15 22:42 UTC (permalink / raw)
  To: Brian Vazquez
  Cc: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann,
	David S . Miller, Stanislav Fomichev, open list, Networking, bpf,
	Yonghong Song, Petar Penkov, Willem de Bruijn

On Thu, Nov 7, 2019 at 1:23 PM Brian Vazquez <brianvv@google.com> wrote:
>
> From: Yonghong Song <yhs@fb.com>
>
> Added four libbpf API functions to support map batch operations:
>   . int bpf_map_delete_batch( ... )
>   . int bpf_map_lookup_batch( ... )
>   . int bpf_map_lookup_and_delete_batch( ... )
>   . int bpf_map_update_batch( ... )
>
> Tested bpf_map_lookup_and_delete_batch() and bpf_map_update_batch()
> functionality.
>   $ ./test_maps
>   ...
>   test_map_lookup_and_delete_batch:PASS
>   ...
>
> Note that I clumped uapi header sync patch, libbpf patch
> and tests patch together considering this is a RFC patch.
> Will do proper formating once it is out of RFC stage.
>
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---

[...]

>
> +       struct { /* struct used by BPF_MAP_*_BATCH commands */
> +               __u64           batch;  /* input/output:
> +                                        * input: start batch,
> +                                        *        0 to start from beginning.
> +                                        * output: next start batch,
> +                                        *         0 to end batching.
> +                                        */
> +               __aligned_u64   keys;
> +               __aligned_u64   values;
> +               __u32           count;  /* input/output:
> +                                        * input: # of elements keys/values.
> +                                        * output: # of filled elements.
> +                                        */
> +               __u32           map_fd;
> +               __u64           elem_flags;
> +               __u64           flags;
> +       } batch;
> +

Describe what elem_flags and flags are here?

[...]

> +LIBBPF_API int bpf_map_delete_batch(int fd, __u64 *batch, __u32 *count,
> +                                   __u64 elem_flags, __u64 flags);
> +LIBBPF_API int bpf_map_lookup_batch(int fd, __u64 *batch, void *keys,
> +                                   void *values, __u32 *count,
> +                                   __u64 elem_flags, __u64 flags);
> +LIBBPF_API int bpf_map_lookup_and_delete_batch(int fd, __u64 *batch,
> +                                              void *keys, void *values,
> +                                              __u32 *count, __u64 elem_flags,
> +                                              __u64 flags);
> +LIBBPF_API int bpf_map_update_batch(int fd, void *keys, void *values,
> +                                   __u32 *count, __u64 elem_flags,
> +                                   __u64 flags);

Should we start using the same approach as with bpf_object__open_file
(see LIBBPF_OPTS), so that we can keep adding extra fields without
breaking ABI? The gist is: use function arguments for mandatory fields
(as of right now, at least), and put all the optional fields into a
xxx_opts struct, which can be NULL. Please see
bpf_object__open_{file,mem} for details.
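
E.g., something like the following, just to sketch the shape (the struct
and parameter names here are made up, not a final proposal):

	struct bpf_map_batch_opts {
		size_t sz;	/* size of this struct, for fwd/bwd compat */
		__u64 elem_flags;
		__u64 flags;
	};

	LIBBPF_API int bpf_map_lookup_batch(int fd, __u64 *batch, void *keys,
					    void *values, __u32 *count,
					    const struct bpf_map_batch_opts *opts);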

> +
>  LIBBPF_API int bpf_obj_pin(int fd, const char *pathname);
>  LIBBPF_API int bpf_obj_get(const char *pathname);
>  LIBBPF_API int bpf_prog_attach(int prog_fd, int attachable_fd,
> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index 86173cbb159d3..0529a770a04eb 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -189,6 +189,10 @@ LIBBPF_0.0.4 {
>  LIBBPF_0.0.5 {
>         global:
>                 bpf_btf_get_next_id;
> +               bpf_map_delete_batch;
> +               bpf_map_lookup_and_delete_batch;
> +               bpf_map_lookup_batch;
> +               bpf_map_update_batch;
>  } LIBBPF_0.0.4;

This should be in the 0.0.6 section now.
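
I.e., something like:

	LIBBPF_0.0.6 {
		global:
			bpf_map_delete_batch;
			bpf_map_lookup_and_delete_batch;
			bpf_map_lookup_batch;
			bpf_map_update_batch;
	} LIBBPF_0.0.5;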

>

[...]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-11-15 22:42 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-07 21:20 [RFC bpf-next 0/3] bpf: adding map batch processing support Brian Vazquez
2019-11-07 21:20 ` [RFC bpf-next 1/3] " Brian Vazquez
2019-11-07 21:20 ` [RFC bpf-next 2/3] tools/bpf: test bpf_map_lookup_and_delete_batch() Brian Vazquez
2019-11-15 22:42   ` Andrii Nakryiko
2019-11-07 21:20 ` [RFC bpf-next 3/3] bpf: add generic batch support Brian Vazquez
2019-11-13  2:34 ` [RFC bpf-next 0/3] bpf: adding map batch processing support Yonghong Song
     [not found]   ` <CAMzD94TekSSCCAZD4jZiUpdfMKDqWcwdNf42b_heTGvv1K-=fg@mail.gmail.com>
2019-11-13 22:26     ` Yonghong Song

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).