* [RFC v2 00/23] io_uring BPF requests
@ 2021-05-19 14:13 Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 01/23] io_uring: shuffle rarely used ctx fields Pavel Begunkov
                   ` (23 more replies)
  0 siblings, 24 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

The main problem solved is feeding completion information of other
requests back into BPF in the form of CQEs. I decided to wire up support
for multiple completion queues (aka CQs) and give BPF programs access to
them, leaving userspace in control of synchronisation, which should be
much more flexible than the link-based approach.

For instance, there can be a separate CQ for each BPF program, so no
extra sync is needed, and communication can be done by submitting a
request targeting a neighboring CQ or by submitting a CQE there directly
(see test3 below). The CQ is chosen via sqe->cq_idx, so everyone can
cross-fire if willing.
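
A rough sketch of how a request gets pointed at a particular CQ, using
plain liburing prep plus the new sqe->cq_idx field from this series
(given an already set up struct io_uring ring, error handling omitted):

    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);

    io_uring_prep_nop(sqe);
    sqe->user_data = 0xcafe;
    /* complete into CQ index 2; index 0 is the default CQ */
    sqe->cq_idx = 2;
    io_uring_submit(&ring);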

A bunch of other features were added to play around with (see the v1
changelog below or test1); some of them are purely experimental. The
interfaces are not even close to settled.
Note: there are known problems, e.g. it's possible to live-lock a task.
That is unlikely to happen, but better to be aware of it.

For convenience, a git branch with the kernel part is at [1], and
libbpf + examples are at [2]. The examples are written in restricted C
using libbpf and live under examples/bpf/, see [3], with 4 BPF programs
and 4 corresponding test cases in uring.c. It's already shaping up to be
interesting to play with.

test1:            just a set of usage examples for the features
test2/counting:   reacts to ticks N times using timeout reqs and CQ waiting
test3/pingpong:   two BPF reqs do message-based communication by
                  repeatedly writing a CQE to another program's CQ and
                  waiting for a response
test4/write_file: BPF writes N bytes to a file keeping QD>1

[1] https://github.com/isilence/linux/tree/ebpf_v2
[2] https://github.com/isilence/liburing/tree/ebpf_v2
[3] https://github.com/isilence/liburing/tree/ebpf_v2/examples/bpf

since v1:
- several bug fixes
- support multiple CQs
- allow BPF requests to wait on CQs
- BPF helpers for emit/reap CQE
- expose user_data to BPF program
- sleepable + let BPF read/write from userspace

Pavel Begunkov (23):
  io_uring: shuffle rarely used ctx fields
  io_uring: localise fixed resources fields
  io_uring: remove dependency on ring->sq/cq_entries
  io_uring: deduce cq_mask from cq_entries
  io_uring: kill cached_cq_overflow
  io_uring: rename io_get_cqring
  io_uring: extract struct for CQ
  io_uring: internally pass CQ indexes
  io_uring: extract cq size helper
  io_uring: add support for multiple CQs
  io_uring: enable mmap'ing additional CQs
  bpf: add IOURING program type
  io_uring: implement bpf prog registration
  io_uring: add support for bpf requests
  io_uring: enable BPF to submit SQEs
  io_uring: enable bpf to submit CQEs
  io_uring: enable bpf to reap CQEs
  libbpf: support io_uring
  io_uring: pass user_data to bpf executor
  bpf: Add bpf_copy_to_user() helper
  io_uring: wire bpf copy to user
  io_uring: don't wait on CQ exclusively
  io_uring: enable bpf reqs to wait for CQs

 fs/io_uring.c                  | 794 +++++++++++++++++++++++++++------
 include/linux/bpf.h            |   1 +
 include/linux/bpf_types.h      |   2 +
 include/uapi/linux/bpf.h       |  12 +
 include/uapi/linux/io_uring.h  |  15 +-
 kernel/bpf/helpers.c           |  17 +
 kernel/bpf/syscall.c           |   1 +
 kernel/bpf/verifier.c          |   5 +-
 tools/include/uapi/linux/bpf.h |   7 +
 tools/lib/bpf/libbpf.c         |   7 +
 10 files changed, 722 insertions(+), 139 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH 01/23] io_uring: shuffle rarely used ctx fields
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-20 21:46   ` Song Liu
  2021-05-19 14:13 ` [PATCH 02/23] io_uring: localise fixed resources fields Pavel Begunkov
                   ` (22 subsequent siblings)
  23 siblings, 1 reply; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

There is a bunch of ctx fields scattered around that are almost never
used, e.g. only on ring exit. Move them to the end of the struct for
better locality and better aesthetics.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 36 +++++++++++++++++-------------------
 1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 9ac5e278a91e..7e3410ce100a 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -367,9 +367,6 @@ struct io_ring_ctx {
 		unsigned		cached_cq_overflow;
 		unsigned long		sq_check_overflow;
 
-		/* hashed buffered write serialization */
-		struct io_wq_hash	*hash_map;
-
 		struct list_head	defer_list;
 		struct list_head	timeout_list;
 		struct list_head	cq_overflow_list;
@@ -386,9 +383,6 @@ struct io_ring_ctx {
 
 	struct io_rings	*rings;
 
-	/* Only used for accounting purposes */
-	struct mm_struct	*mm_account;
-
 	const struct cred	*sq_creds;	/* cred used for __io_sq_thread() */
 	struct io_sq_data	*sq_data;	/* if using sq thread polling */
 
@@ -409,14 +403,6 @@ struct io_ring_ctx {
 	unsigned		nr_user_bufs;
 	struct io_mapped_ubuf	**user_bufs;
 
-	struct user_struct	*user;
-
-	struct completion	ref_comp;
-
-#if defined(CONFIG_UNIX)
-	struct socket		*ring_sock;
-#endif
-
 	struct xarray		io_buffers;
 
 	struct xarray		personalities;
@@ -460,12 +446,24 @@ struct io_ring_ctx {
 
 	struct io_restriction		restrictions;
 
-	/* exit task_work */
-	struct callback_head		*exit_task_work;
-
 	/* Keep this last, we don't need it for the fast path */
-	struct work_struct		exit_work;
-	struct list_head		tctx_list;
+	struct {
+		#if defined(CONFIG_UNIX)
+			struct socket		*ring_sock;
+		#endif
+		/* hashed buffered write serialization */
+		struct io_wq_hash		*hash_map;
+
+		/* Only used for accounting purposes */
+		struct user_struct		*user;
+		struct mm_struct		*mm_account;
+
+		/* ctx exit and cancelation */
+		struct callback_head		*exit_task_work;
+		struct work_struct		exit_work;
+		struct list_head		tctx_list;
+		struct completion		ref_comp;
+	};
 };
 
 struct io_uring_task {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 02/23] io_uring: localise fixed resources fields
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 01/23] io_uring: shuffle rarely used ctx fields Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 03/23] io_uring: remove dependency on ring->sq/cq_entries Pavel Begunkov
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

The ring has two types of resource-related fields: those used for
request submission, and those needed for update/registration. Reshuffle
them into these two groups for better locality and readability. The
second group is not in the hot path, so it's natural to place it
somewhere towards the end. Also update an outdated comment.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 7e3410ce100a..31eca208f675 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -390,21 +390,17 @@ struct io_ring_ctx {
 	struct list_head	sqd_list;
 
 	/*
-	 * If used, fixed file set. Writers must ensure that ->refs is dead,
-	 * readers must ensure that ->refs is alive as long as the file* is
-	 * used. Only updated through io_uring_register(2).
+	 * Fixed resources fast path, should be accessed only under uring_lock,
+	 * and updated through io_uring_register(2)
 	 */
-	struct io_rsrc_data	*file_data;
+	struct io_rsrc_node	*rsrc_node;
+
 	struct io_file_table	file_table;
 	unsigned		nr_user_files;
-
-	/* if used, fixed mapped user buffers */
-	struct io_rsrc_data	*buf_data;
 	unsigned		nr_user_bufs;
 	struct io_mapped_ubuf	**user_bufs;
 
 	struct xarray		io_buffers;
-
 	struct xarray		personalities;
 	u32			pers_next;
 
@@ -436,16 +432,21 @@ struct io_ring_ctx {
 		bool			poll_multi_file;
 	} ____cacheline_aligned_in_smp;
 
-	struct delayed_work		rsrc_put_work;
-	struct llist_head		rsrc_put_llist;
-	struct list_head		rsrc_ref_list;
-	spinlock_t			rsrc_ref_lock;
-	struct io_rsrc_node		*rsrc_node;
-	struct io_rsrc_node		*rsrc_backup_node;
-	struct io_mapped_ubuf		*dummy_ubuf;
-
 	struct io_restriction		restrictions;
 
+	/* slow path rsrc auxilary data, used by update/register */
+	struct {
+		struct io_rsrc_node		*rsrc_backup_node;
+		struct io_mapped_ubuf		*dummy_ubuf;
+		struct io_rsrc_data		*file_data;
+		struct io_rsrc_data		*buf_data;
+
+		struct delayed_work		rsrc_put_work;
+		struct llist_head		rsrc_put_llist;
+		struct list_head		rsrc_ref_list;
+		spinlock_t			rsrc_ref_lock;
+	};
+
 	/* Keep this last, we don't need it for the fast path */
 	struct {
 		#if defined(CONFIG_UNIX)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 03/23] io_uring: remove dependency on ring->sq/cq_entries
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 01/23] io_uring: shuffle rarely used ctx fields Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 02/23] io_uring: localise fixed resources fields Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 04/23] io_uring: deduce cq_mask from cq_entries Pavel Begunkov
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

We have the numbers of {sq,cq} entries cached in ctx, so don't look
them up in the user-shared rings, as 1) it may fetch an additional
cacheline and 2) userspace may change the values, which makes it error
prone.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 31eca208f675..15dc5dad1f7d 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1352,7 +1352,7 @@ static inline bool io_sqring_full(struct io_ring_ctx *ctx)
 {
 	struct io_rings *r = ctx->rings;
 
-	return READ_ONCE(r->sq.tail) - ctx->cached_sq_head == r->sq_ring_entries;
+	return READ_ONCE(r->sq.tail) - ctx->cached_sq_head == ctx->sq_entries;
 }
 
 static inline unsigned int __io_cqring_events(struct io_ring_ctx *ctx)
@@ -1370,7 +1370,7 @@ static inline struct io_uring_cqe *io_get_cqring(struct io_ring_ctx *ctx)
 	 * control dependency is enough as we're using WRITE_ONCE to
 	 * fill the cq entry
 	 */
-	if (__io_cqring_events(ctx) == rings->cq_ring_entries)
+	if (__io_cqring_events(ctx) == ctx->cq_entries)
 		return NULL;
 
 	tail = ctx->cached_cq_tail++;
@@ -1423,11 +1423,10 @@ static void io_cqring_ev_posted_iopoll(struct io_ring_ctx *ctx)
 /* Returns true if there are no backlogged entries after the flush */
 static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
 {
-	struct io_rings *rings = ctx->rings;
 	unsigned long flags;
 	bool all_flushed, posted;
 
-	if (!force && __io_cqring_events(ctx) == rings->cq_ring_entries)
+	if (!force && __io_cqring_events(ctx) == ctx->cq_entries)
 		return false;
 
 	posted = false;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 04/23] io_uring: deduce cq_mask from cq_entries
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (2 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 03/23] io_uring: remove dependency on ring->sq/cq_entries Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 05/23] io_uring: kill cached_cq_overflow Pavel Begunkov
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

No need to cache cq_mask: it's exactly cq_entries - 1, so just derive
it on the spot instead of carrying it around.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 15dc5dad1f7d..067c89e63fea 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -361,7 +361,6 @@ struct io_ring_ctx {
 		u32			*sq_array;
 		unsigned		cached_sq_head;
 		unsigned		sq_entries;
-		unsigned		sq_mask;
 		unsigned		sq_thread_idle;
 		unsigned		cached_sq_dropped;
 		unsigned		cached_cq_overflow;
@@ -407,7 +406,6 @@ struct io_ring_ctx {
 	struct {
 		unsigned		cached_cq_tail;
 		unsigned		cq_entries;
-		unsigned		cq_mask;
 		atomic_t		cq_timeouts;
 		unsigned		cq_last_tm_flush;
 		unsigned		cq_extra;
@@ -1363,7 +1361,7 @@ static inline unsigned int __io_cqring_events(struct io_ring_ctx *ctx)
 static inline struct io_uring_cqe *io_get_cqring(struct io_ring_ctx *ctx)
 {
 	struct io_rings *rings = ctx->rings;
-	unsigned tail;
+	unsigned tail, mask = ctx->cq_entries - 1;
 
 	/*
 	 * writes to the cq entry need to come after reading head; the
@@ -1374,7 +1372,7 @@ static inline struct io_uring_cqe *io_get_cqring(struct io_ring_ctx *ctx)
 		return NULL;
 
 	tail = ctx->cached_cq_tail++;
-	return &rings->cqes[tail & ctx->cq_mask];
+	return &rings->cqes[tail & mask];
 }
 
 static inline bool io_should_trigger_evfd(struct io_ring_ctx *ctx)
@@ -6677,7 +6675,7 @@ static void io_commit_sqring(struct io_ring_ctx *ctx)
 static const struct io_uring_sqe *io_get_sqe(struct io_ring_ctx *ctx)
 {
 	u32 *sq_array = ctx->sq_array;
-	unsigned head;
+	unsigned head, mask = ctx->sq_entries - 1;
 
 	/*
 	 * The cached sq head (or cq tail) serves two purposes:
@@ -6687,7 +6685,7 @@ static const struct io_uring_sqe *io_get_sqe(struct io_ring_ctx *ctx)
 	 * 2) allows the kernel side to track the head on its own, even
 	 *    though the application is the one updating it.
 	 */
-	head = READ_ONCE(sq_array[ctx->cached_sq_head++ & ctx->sq_mask]);
+	head = READ_ONCE(sq_array[ctx->cached_sq_head++ & mask]);
 	if (likely(head < ctx->sq_entries))
 		return &ctx->sq_sqes[head];
 
@@ -9493,8 +9491,6 @@ static int io_allocate_scq_urings(struct io_ring_ctx *ctx,
 	rings->cq_ring_mask = p->cq_entries - 1;
 	rings->sq_ring_entries = p->sq_entries;
 	rings->cq_ring_entries = p->cq_entries;
-	ctx->sq_mask = rings->sq_ring_mask;
-	ctx->cq_mask = rings->cq_ring_mask;
 
 	size = array_size(sizeof(struct io_uring_sqe), p->sq_entries);
 	if (size == SIZE_MAX) {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 05/23] io_uring: kill cached_cq_overflow
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (3 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 04/23] io_uring: deduce cq_mask from cq_entries Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 06/23] io_uring: rename io_get_cqring Pavel Begunkov
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

There are two copies of cq_overflow: one shared with userspace and an
internal cached one. The latter was needed for DRAIN accounting, but now
we have yet another knob to tune the accounting, i.e. cq_extra, so we
can throw away the internal counter and just increment the one in the
shared ring.

If userspace modifies it and thus never gets the right overflow value
again, that's its own problem, even though before we would have restored
it on the next overflow.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 067c89e63fea..b89a781b3f33 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -363,7 +363,6 @@ struct io_ring_ctx {
 		unsigned		sq_entries;
 		unsigned		sq_thread_idle;
 		unsigned		cached_sq_dropped;
-		unsigned		cached_cq_overflow;
 		unsigned long		sq_check_overflow;
 
 		struct list_head	defer_list;
@@ -1195,13 +1194,20 @@ static struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 	return NULL;
 }
 
+static void io_account_cq_overflow(struct io_ring_ctx *ctx)
+{
+	struct io_rings *r = ctx->rings;
+
+	WRITE_ONCE(r->cq_overflow, READ_ONCE(r->cq_overflow) + 1);
+	ctx->cq_extra--;
+}
+
 static bool req_need_defer(struct io_kiocb *req, u32 seq)
 {
 	if (unlikely(req->flags & REQ_F_IO_DRAIN)) {
 		struct io_ring_ctx *ctx = req->ctx;
 
-		return seq + ctx->cq_extra != ctx->cached_cq_tail
-				+ READ_ONCE(ctx->cached_cq_overflow);
+		return seq + READ_ONCE(ctx->cq_extra) != ctx->cached_cq_tail;
 	}
 
 	return false;
@@ -1440,8 +1446,8 @@ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
 		if (cqe)
 			memcpy(cqe, &ocqe->cqe, sizeof(*cqe));
 		else
-			WRITE_ONCE(ctx->rings->cq_overflow,
-				   ++ctx->cached_cq_overflow);
+			io_account_cq_overflow(ctx);
+
 		posted = true;
 		list_del(&ocqe->list);
 		kfree(ocqe);
@@ -1525,7 +1531,7 @@ static bool io_cqring_event_overflow(struct io_ring_ctx *ctx, u64 user_data,
 		 * or cannot allocate an overflow entry, then we need to drop it
 		 * on the floor.
 		 */
-		WRITE_ONCE(ctx->rings->cq_overflow, ++ctx->cached_cq_overflow);
+		io_account_cq_overflow(ctx);
 		return false;
 	}
 	if (list_empty(&ctx->cq_overflow_list)) {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 06/23] io_uring: rename io_get_cqring
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (4 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 05/23] io_uring: kill cached_cq_overflow Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 07/23] io_uring: extract struct for CQ Pavel Begunkov
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

Rename io_get_cqring() to io_get_cqe() for consistency with the SQ
side, and because the old name is not as clear.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index b89a781b3f33..49a1b6b81d7d 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -11,7 +11,7 @@
  * before writing the tail (using smp_load_acquire to read the tail will
  * do). It also needs a smp_mb() before updating CQ head (ordering the
  * entry load(s) with the head store), pairing with an implicit barrier
- * through a control-dependency in io_get_cqring (smp_store_release to
+ * through a control-dependency in io_get_cqe (smp_store_release to
  * store head will do). Failure to do so could lead to reading invalid
  * CQ entries.
  *
@@ -1364,7 +1364,7 @@ static inline unsigned int __io_cqring_events(struct io_ring_ctx *ctx)
 	return ctx->cached_cq_tail - READ_ONCE(ctx->rings->cq.head);
 }
 
-static inline struct io_uring_cqe *io_get_cqring(struct io_ring_ctx *ctx)
+static inline struct io_uring_cqe *io_get_cqe(struct io_ring_ctx *ctx)
 {
 	struct io_rings *rings = ctx->rings;
 	unsigned tail, mask = ctx->cq_entries - 1;
@@ -1436,7 +1436,7 @@ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
 	posted = false;
 	spin_lock_irqsave(&ctx->completion_lock, flags);
 	while (!list_empty(&ctx->cq_overflow_list)) {
-		struct io_uring_cqe *cqe = io_get_cqring(ctx);
+		struct io_uring_cqe *cqe = io_get_cqe(ctx);
 		struct io_overflow_cqe *ocqe;
 
 		if (!cqe && !force)
@@ -1558,7 +1558,7 @@ static inline bool __io_cqring_fill_event(struct io_ring_ctx *ctx, u64 user_data
 	 * submission (by quite a lot). Increment the overflow count in
 	 * the ring.
 	 */
-	cqe = io_get_cqring(ctx);
+	cqe = io_get_cqe(ctx);
 	if (likely(cqe)) {
 		WRITE_ONCE(cqe->user_data, user_data);
 		WRITE_ONCE(cqe->res, res);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 07/23] io_uring: extract struct for CQ
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (5 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 06/23] io_uring: rename io_get_cqring Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 08/23] io_uring: internally pass CQ indexes Pavel Begunkov
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

Extract a structure describing the internal completion queue state,
called struct io_cqring. We need it to support multi-CQ rings.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 47 +++++++++++++++++++++++++----------------------
 1 file changed, 25 insertions(+), 22 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 49a1b6b81d7d..4fecd9da689e 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -335,6 +335,12 @@ struct io_submit_state {
 	unsigned int		ios_left;
 };
 
+struct io_cqring {
+	unsigned		cached_tail;
+	unsigned		entries;
+	struct io_rings		*rings;
+};
+
 struct io_ring_ctx {
 	struct {
 		struct percpu_ref	refs;
@@ -402,17 +408,14 @@ struct io_ring_ctx {
 	struct xarray		personalities;
 	u32			pers_next;
 
-	struct {
-		unsigned		cached_cq_tail;
-		unsigned		cq_entries;
-		atomic_t		cq_timeouts;
-		unsigned		cq_last_tm_flush;
-		unsigned		cq_extra;
-		unsigned long		cq_check_overflow;
-		struct wait_queue_head	cq_wait;
-		struct fasync_struct	*cq_fasync;
-		struct eventfd_ctx	*cq_ev_fd;
-	} ____cacheline_aligned_in_smp;
+	struct fasync_struct	*cq_fasync;
+	struct eventfd_ctx	*cq_ev_fd;
+	atomic_t		cq_timeouts;
+	unsigned		cq_last_tm_flush;
+	unsigned long		cq_check_overflow;
+	unsigned		cq_extra;
+	struct wait_queue_head	cq_wait;
+	struct io_cqring	cqs[1];
 
 	struct {
 		spinlock_t		completion_lock;
@@ -1207,7 +1210,7 @@ static bool req_need_defer(struct io_kiocb *req, u32 seq)
 	if (unlikely(req->flags & REQ_F_IO_DRAIN)) {
 		struct io_ring_ctx *ctx = req->ctx;
 
-		return seq + READ_ONCE(ctx->cq_extra) != ctx->cached_cq_tail;
+		return seq + READ_ONCE(ctx->cq_extra) != ctx->cqs[0].cached_tail;
 	}
 
 	return false;
@@ -1312,7 +1315,7 @@ static void io_flush_timeouts(struct io_ring_ctx *ctx)
 	if (list_empty(&ctx->timeout_list))
 		return;
 
-	seq = ctx->cached_cq_tail - atomic_read(&ctx->cq_timeouts);
+	seq = ctx->cqs[0].cached_tail - atomic_read(&ctx->cq_timeouts);
 
 	do {
 		u32 events_needed, events_got;
@@ -1346,7 +1349,7 @@ static void io_commit_cqring(struct io_ring_ctx *ctx)
 	io_flush_timeouts(ctx);
 
 	/* order cqe stores with ring update */
-	smp_store_release(&ctx->rings->cq.tail, ctx->cached_cq_tail);
+	smp_store_release(&ctx->rings->cq.tail, ctx->cqs[0].cached_tail);
 
 	if (unlikely(!list_empty(&ctx->defer_list)))
 		__io_queue_deferred(ctx);
@@ -1361,23 +1364,23 @@ static inline bool io_sqring_full(struct io_ring_ctx *ctx)
 
 static inline unsigned int __io_cqring_events(struct io_ring_ctx *ctx)
 {
-	return ctx->cached_cq_tail - READ_ONCE(ctx->rings->cq.head);
+	return ctx->cqs[0].cached_tail - READ_ONCE(ctx->rings->cq.head);
 }
 
 static inline struct io_uring_cqe *io_get_cqe(struct io_ring_ctx *ctx)
 {
 	struct io_rings *rings = ctx->rings;
-	unsigned tail, mask = ctx->cq_entries - 1;
+	unsigned tail, mask = ctx->cqs[0].entries - 1;
 
 	/*
 	 * writes to the cq entry need to come after reading head; the
 	 * control dependency is enough as we're using WRITE_ONCE to
 	 * fill the cq entry
 	 */
-	if (__io_cqring_events(ctx) == ctx->cq_entries)
+	if (__io_cqring_events(ctx) == ctx->cqs[0].entries)
 		return NULL;
 
-	tail = ctx->cached_cq_tail++;
+	tail = ctx->cqs[0].cached_tail++;
 	return &rings->cqes[tail & mask];
 }
 
@@ -1430,7 +1433,7 @@ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
 	unsigned long flags;
 	bool all_flushed, posted;
 
-	if (!force && __io_cqring_events(ctx) == ctx->cq_entries)
+	if (!force && __io_cqring_events(ctx) == ctx->cqs[0].entries)
 		return false;
 
 	posted = false;
@@ -5670,7 +5673,7 @@ static int io_timeout(struct io_kiocb *req, unsigned int issue_flags)
 		goto add;
 	}
 
-	tail = ctx->cached_cq_tail - atomic_read(&ctx->cq_timeouts);
+	tail = ctx->cqs[0].cached_tail - atomic_read(&ctx->cq_timeouts);
 	req->timeout.target_seq = tail + off;
 
 	/* Update the last seq here in case io_flush_timeouts() hasn't.
@@ -9331,7 +9334,7 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 		if (unlikely(ret))
 			goto out;
 
-		min_complete = min(min_complete, ctx->cq_entries);
+		min_complete = min(min_complete, ctx->cqs[0].entries);
 
 		/*
 		 * When SETUP_IOPOLL and SETUP_SQPOLL are both enabled, user
@@ -9481,7 +9484,7 @@ static int io_allocate_scq_urings(struct io_ring_ctx *ctx,
 
 	/* make sure these are sane, as we already accounted them */
 	ctx->sq_entries = p->sq_entries;
-	ctx->cq_entries = p->cq_entries;
+	ctx->cqs[0].entries = p->cq_entries;
 
 	size = rings_size(p->sq_entries, p->cq_entries, &sq_array_offset);
 	if (size == SIZE_MAX)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 08/23] io_uring: internally pass CQ indexes
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (6 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 07/23] io_uring: extract struct for CQ Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 09/23] io_uring: extract cq size helper Pavel Begunkov
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

Allow passing a CQ index from the SQE down to the CQE generators, but
support only one CQ for now.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c                 | 113 ++++++++++++++++++++++------------
 include/uapi/linux/io_uring.h |   1 +
 2 files changed, 75 insertions(+), 39 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 4fecd9da689e..356a5dc90f46 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -90,6 +90,8 @@
 #define IORING_MAX_ENTRIES	32768
 #define IORING_MAX_CQ_ENTRIES	(2 * IORING_MAX_ENTRIES)
 
+#define IO_DEFAULT_CQ		0
+
 /*
  * Shift of 9 is 512 entries, or exactly one page on 64-bit archs
  */
@@ -416,6 +418,7 @@ struct io_ring_ctx {
 	unsigned		cq_extra;
 	struct wait_queue_head	cq_wait;
 	struct io_cqring	cqs[1];
+	unsigned int		cq_nr;
 
 	struct {
 		spinlock_t		completion_lock;
@@ -832,6 +835,7 @@ struct io_kiocb {
 
 	struct io_kiocb			*link;
 	struct percpu_ref		*fixed_rsrc_refs;
+	u16				cq_idx;
 
 	/* used with ctx->iopoll_list with reads/writes */
 	struct list_head		inflight_entry;
@@ -1034,7 +1038,8 @@ static void io_uring_cancel_sqpoll(struct io_sq_data *sqd);
 static struct io_rsrc_node *io_rsrc_node_alloc(struct io_ring_ctx *ctx);
 
 static bool io_cqring_fill_event(struct io_ring_ctx *ctx, u64 user_data,
-				 long res, unsigned int cflags);
+				 long res, unsigned int cflags,
+				 unsigned int cq_idx);
 static void io_put_req(struct io_kiocb *req);
 static void io_put_req_deferred(struct io_kiocb *req, int nr);
 static void io_dismantle_req(struct io_kiocb *req);
@@ -1207,13 +1212,15 @@ static void io_account_cq_overflow(struct io_ring_ctx *ctx)
 
 static bool req_need_defer(struct io_kiocb *req, u32 seq)
 {
-	if (unlikely(req->flags & REQ_F_IO_DRAIN)) {
-		struct io_ring_ctx *ctx = req->ctx;
-
-		return seq + READ_ONCE(ctx->cq_extra) != ctx->cqs[0].cached_tail;
-	}
+	struct io_ring_ctx *ctx = req->ctx;
+	u32 cnt = 0;
+	int i;
 
-	return false;
+	if (!(req->flags & REQ_F_IO_DRAIN))
+		return false;
+	for (i = 0; i < ctx->cq_nr; i++)
+		cnt += ctx->cqs[i].cached_tail;
+	return seq + READ_ONCE(ctx->cq_extra) != cnt;
 }
 
 static void io_req_track_inflight(struct io_kiocb *req)
@@ -1289,7 +1296,8 @@ static void io_kill_timeout(struct io_kiocb *req, int status)
 		atomic_set(&req->ctx->cq_timeouts,
 			atomic_read(&req->ctx->cq_timeouts) + 1);
 		list_del_init(&req->timeout.list);
-		io_cqring_fill_event(req->ctx, req->user_data, status, 0);
+		io_cqring_fill_event(req->ctx, req->user_data, status, 0,
+				     req->cq_idx);
 		io_put_req_deferred(req, 1);
 	}
 }
@@ -1346,10 +1354,13 @@ static void io_flush_timeouts(struct io_ring_ctx *ctx)
 
 static void io_commit_cqring(struct io_ring_ctx *ctx)
 {
+	int i;
+
 	io_flush_timeouts(ctx);
 
 	/* order cqe stores with ring update */
-	smp_store_release(&ctx->rings->cq.tail, ctx->cqs[0].cached_tail);
+	for (i = 0; i < ctx->cq_nr; i++)
+		smp_store_release(&ctx->cqs[i].rings->cq.tail, ctx->cqs[i].cached_tail);
 
 	if (unlikely(!list_empty(&ctx->defer_list)))
 		__io_queue_deferred(ctx);
@@ -1362,25 +1373,27 @@ static inline bool io_sqring_full(struct io_ring_ctx *ctx)
 	return READ_ONCE(r->sq.tail) - ctx->cached_sq_head == ctx->sq_entries;
 }
 
-static inline unsigned int __io_cqring_events(struct io_ring_ctx *ctx)
+static inline unsigned int __io_cqring_events(struct io_cqring *cq)
 {
-	return ctx->cqs[0].cached_tail - READ_ONCE(ctx->rings->cq.head);
+	return cq->cached_tail - READ_ONCE(cq->rings->cq.head);
 }
 
-static inline struct io_uring_cqe *io_get_cqe(struct io_ring_ctx *ctx)
+static inline struct io_uring_cqe *io_get_cqe(struct io_ring_ctx *ctx,
+					      unsigned int idx)
 {
-	struct io_rings *rings = ctx->rings;
-	unsigned tail, mask = ctx->cqs[0].entries - 1;
+	struct io_cqring *cq = &ctx->cqs[idx];
+	struct io_rings *rings = cq->rings;
+	unsigned tail, mask = cq->entries - 1;
 
 	/*
 	 * writes to the cq entry need to come after reading head; the
 	 * control dependency is enough as we're using WRITE_ONCE to
 	 * fill the cq entry
 	 */
-	if (__io_cqring_events(ctx) == ctx->cqs[0].entries)
+	if (__io_cqring_events(cq) == cq->entries)
 		return NULL;
 
-	tail = ctx->cqs[0].cached_tail++;
+	tail = cq->cached_tail++;
 	return &rings->cqes[tail & mask];
 }
 
@@ -1432,16 +1445,18 @@ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
 {
 	unsigned long flags;
 	bool all_flushed, posted;
+	struct io_cqring *cq = &ctx->cqs[IO_DEFAULT_CQ];
 
-	if (!force && __io_cqring_events(ctx) == ctx->cqs[0].entries)
+	if (!force && __io_cqring_events(cq) == cq->entries)
 		return false;
 
 	posted = false;
 	spin_lock_irqsave(&ctx->completion_lock, flags);
 	while (!list_empty(&ctx->cq_overflow_list)) {
-		struct io_uring_cqe *cqe = io_get_cqe(ctx);
+		struct io_uring_cqe *cqe = io_get_cqe(ctx, IO_DEFAULT_CQ);
 		struct io_overflow_cqe *ocqe;
 
+
 		if (!cqe && !force)
 			break;
 		ocqe = list_first_entry(&ctx->cq_overflow_list,
@@ -1523,12 +1538,17 @@ static inline void req_ref_get(struct io_kiocb *req)
 }
 
 static bool io_cqring_event_overflow(struct io_ring_ctx *ctx, u64 user_data,
-				     long res, unsigned int cflags)
+				     long res, unsigned int cflags,
+				     unsigned int cq_idx)
 {
 	struct io_overflow_cqe *ocqe;
 
+	if (cq_idx != IO_DEFAULT_CQ)
+		goto overflow;
+
 	ocqe = kmalloc(sizeof(*ocqe), GFP_ATOMIC | __GFP_ACCOUNT);
 	if (!ocqe) {
+overflow:
 		/*
 		 * If we're in ring overflow flush mode, or in task cancel mode,
 		 * or cannot allocate an overflow entry, then we need to drop it
@@ -1550,7 +1570,8 @@ static bool io_cqring_event_overflow(struct io_ring_ctx *ctx, u64 user_data,
 }
 
 static inline bool __io_cqring_fill_event(struct io_ring_ctx *ctx, u64 user_data,
-					  long res, unsigned int cflags)
+					  long res, unsigned int cflags,
+					  unsigned int cq_idx)
 {
 	struct io_uring_cqe *cqe;
 
@@ -1561,21 +1582,22 @@ static inline bool __io_cqring_fill_event(struct io_ring_ctx *ctx, u64 user_data
 	 * submission (by quite a lot). Increment the overflow count in
 	 * the ring.
 	 */
-	cqe = io_get_cqe(ctx);
+	cqe = io_get_cqe(ctx, cq_idx);
 	if (likely(cqe)) {
 		WRITE_ONCE(cqe->user_data, user_data);
 		WRITE_ONCE(cqe->res, res);
 		WRITE_ONCE(cqe->flags, cflags);
 		return true;
 	}
-	return io_cqring_event_overflow(ctx, user_data, res, cflags);
+	return io_cqring_event_overflow(ctx, user_data, res, cflags, cq_idx);
 }
 
 /* not as hot to bloat with inlining */
 static noinline bool io_cqring_fill_event(struct io_ring_ctx *ctx, u64 user_data,
-					  long res, unsigned int cflags)
+					  long res, unsigned int cflags,
+					  unsigned int cq_idx)
 {
-	return __io_cqring_fill_event(ctx, user_data, res, cflags);
+	return __io_cqring_fill_event(ctx, user_data, res, cflags, cq_idx);
 }
 
 static void io_req_complete_post(struct io_kiocb *req, long res,
@@ -1585,7 +1607,7 @@ static void io_req_complete_post(struct io_kiocb *req, long res,
 	unsigned long flags;
 
 	spin_lock_irqsave(&ctx->completion_lock, flags);
-	__io_cqring_fill_event(ctx, req->user_data, res, cflags);
+	__io_cqring_fill_event(ctx, req->user_data, res, cflags, req->cq_idx);
 	/*
 	 * If we're the last reference to this request, add to our locked
 	 * free_list cache.
@@ -1797,7 +1819,7 @@ static bool io_kill_linked_timeout(struct io_kiocb *req)
 		link->timeout.head = NULL;
 		if (hrtimer_try_to_cancel(&io->timer) != -1) {
 			io_cqring_fill_event(link->ctx, link->user_data,
-					     -ECANCELED, 0);
+					     -ECANCELED, 0, link->cq_idx);
 			io_put_req_deferred(link, 1);
 			return true;
 		}
@@ -1816,7 +1838,8 @@ static void io_fail_links(struct io_kiocb *req)
 		link->link = NULL;
 
 		trace_io_uring_fail_link(req, link);
-		io_cqring_fill_event(link->ctx, link->user_data, -ECANCELED, 0);
+		io_cqring_fill_event(link->ctx, link->user_data, -ECANCELED, 0,
+				     link->cq_idx);
 		io_put_req_deferred(link, 2);
 		link = nxt;
 	}
@@ -2138,7 +2161,7 @@ static void io_submit_flush_completions(struct io_comp_state *cs,
 	for (i = 0; i < nr; i++) {
 		req = cs->reqs[i];
 		__io_cqring_fill_event(ctx, req->user_data, req->result,
-					req->compl.cflags);
+					req->compl.cflags, req->cq_idx);
 	}
 	io_commit_cqring(ctx);
 	spin_unlock_irq(&ctx->completion_lock);
@@ -2201,7 +2224,7 @@ static unsigned io_cqring_events(struct io_ring_ctx *ctx)
 {
 	/* See comment at the top of this file */
 	smp_rmb();
-	return __io_cqring_events(ctx);
+	return __io_cqring_events(&ctx->cqs[IO_DEFAULT_CQ]);
 }
 
 static inline unsigned int io_sqring_entries(struct io_ring_ctx *ctx)
@@ -2278,7 +2301,8 @@ static void io_iopoll_complete(struct io_ring_ctx *ctx, unsigned int *nr_events,
 		if (req->flags & REQ_F_BUFFER_SELECTED)
 			cflags = io_put_rw_kbuf(req);
 
-		__io_cqring_fill_event(ctx, req->user_data, req->result, cflags);
+		__io_cqring_fill_event(ctx, req->user_data, req->result, cflags,
+					req->cq_idx);
 		(*nr_events)++;
 
 		if (req_ref_put_and_test(req))
@@ -4911,7 +4935,7 @@ static bool io_poll_complete(struct io_kiocb *req, __poll_t mask)
 	}
 	if (req->poll.events & EPOLLONESHOT)
 		flags = 0;
-	if (!io_cqring_fill_event(ctx, req->user_data, error, flags)) {
+	if (!io_cqring_fill_event(ctx, req->user_data, error, flags, req->cq_idx)) {
 		io_poll_remove_waitqs(req);
 		req->poll.done = true;
 		flags = 0;
@@ -5242,7 +5266,8 @@ static bool io_poll_remove_one(struct io_kiocb *req)
 
 	do_complete = io_poll_remove_waitqs(req);
 	if (do_complete) {
-		io_cqring_fill_event(req->ctx, req->user_data, -ECANCELED, 0);
+		io_cqring_fill_event(req->ctx, req->user_data, -ECANCELED, 0,
+				     req->cq_idx);
 		io_commit_cqring(req->ctx);
 		req_set_fail_links(req);
 		io_put_req_deferred(req, 1);
@@ -5494,7 +5519,7 @@ static enum hrtimer_restart io_timeout_fn(struct hrtimer *timer)
 	atomic_set(&req->ctx->cq_timeouts,
 		atomic_read(&req->ctx->cq_timeouts) + 1);
 
-	io_cqring_fill_event(ctx, req->user_data, -ETIME, 0);
+	io_cqring_fill_event(ctx, req->user_data, -ETIME, 0, req->cq_idx);
 	io_commit_cqring(ctx);
 	spin_unlock_irqrestore(&ctx->completion_lock, flags);
 
@@ -5536,7 +5561,7 @@ static int io_timeout_cancel(struct io_ring_ctx *ctx, __u64 user_data)
 		return PTR_ERR(req);
 
 	req_set_fail_links(req);
-	io_cqring_fill_event(ctx, req->user_data, -ECANCELED, 0);
+	io_cqring_fill_event(ctx, req->user_data, -ECANCELED, 0, req->cq_idx);
 	io_put_req_deferred(req, 1);
 	return 0;
 }
@@ -5609,7 +5634,7 @@ static int io_timeout_remove(struct io_kiocb *req, unsigned int issue_flags)
 		ret = io_timeout_update(ctx, tr->addr, &tr->ts,
 					io_translate_timeout_mode(tr->flags));
 
-	io_cqring_fill_event(ctx, req->user_data, ret, 0);
+	io_cqring_fill_event(ctx, req->user_data, ret, 0, req->cq_idx);
 	io_commit_cqring(ctx);
 	spin_unlock_irq(&ctx->completion_lock);
 	io_cqring_ev_posted(ctx);
@@ -5761,7 +5786,7 @@ static void io_async_find_and_cancel(struct io_ring_ctx *ctx,
 done:
 	if (!ret)
 		ret = success_ret;
-	io_cqring_fill_event(ctx, req->user_data, ret, 0);
+	io_cqring_fill_event(ctx, req->user_data, ret, 0, req->cq_idx);
 	io_commit_cqring(ctx);
 	spin_unlock_irqrestore(&ctx->completion_lock, flags);
 	io_cqring_ev_posted(ctx);
@@ -5818,7 +5843,7 @@ static int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags)
 
 	spin_lock_irq(&ctx->completion_lock);
 done:
-	io_cqring_fill_event(ctx, req->user_data, ret, 0);
+	io_cqring_fill_event(ctx, req->user_data, ret, 0, req->cq_idx);
 	io_commit_cqring(ctx);
 	spin_unlock_irq(&ctx->completion_lock);
 	io_cqring_ev_posted(ctx);
@@ -6516,6 +6541,11 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
 	req->result = 0;
 	req->work.creds = NULL;
 
+	req->cq_idx = READ_ONCE(sqe->cq_idx);
+	if (unlikely(req->cq_idx >= ctx->cq_nr)) {
+		req->cq_idx = IO_DEFAULT_CQ;
+		return -EINVAL;
+	}
 	/* enforce forwards compatibility on users */
 	if (unlikely(sqe_flags & ~SQE_VALID_FLAGS))
 		return -EINVAL;
@@ -7548,7 +7578,7 @@ static void __io_rsrc_put_work(struct io_rsrc_node *ref_node)
 
 			io_ring_submit_lock(ctx, lock_ring);
 			spin_lock_irqsave(&ctx->completion_lock, flags);
-			io_cqring_fill_event(ctx, prsrc->tag, 0, 0);
+			io_cqring_fill_event(ctx, prsrc->tag, 0, 0, IO_DEFAULT_CQ);
 			ctx->cq_extra++;
 			io_commit_cqring(ctx);
 			spin_unlock_irqrestore(&ctx->completion_lock, flags);
@@ -9484,7 +9514,6 @@ static int io_allocate_scq_urings(struct io_ring_ctx *ctx,
 
 	/* make sure these are sane, as we already accounted them */
 	ctx->sq_entries = p->sq_entries;
-	ctx->cqs[0].entries = p->cq_entries;
 
 	size = rings_size(p->sq_entries, p->cq_entries, &sq_array_offset);
 	if (size == SIZE_MAX)
@@ -9501,6 +9530,11 @@ static int io_allocate_scq_urings(struct io_ring_ctx *ctx,
 	rings->sq_ring_entries = p->sq_entries;
 	rings->cq_ring_entries = p->cq_entries;
 
+	ctx->cqs[0].cached_tail = 0;
+	ctx->cqs[0].rings = rings;
+	ctx->cqs[0].entries = p->cq_entries;
+	ctx->cq_nr = 1;
+
 	size = array_size(sizeof(struct io_uring_sqe), p->sq_entries);
 	if (size == SIZE_MAX) {
 		io_mem_free(ctx->rings);
@@ -10164,6 +10198,7 @@ static int __init io_uring_init(void)
 	BUILD_BUG_SQE_ELEM(40, __u16,  buf_index);
 	BUILD_BUG_SQE_ELEM(42, __u16,  personality);
 	BUILD_BUG_SQE_ELEM(44, __s32,  splice_fd_in);
+	BUILD_BUG_SQE_ELEM(48, __u16,  cq_idx);
 
 	BUILD_BUG_ON(sizeof(struct io_uring_files_update) !=
 		     sizeof(struct io_uring_rsrc_update));
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index e1ae46683301..c2dfb179360a 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -58,6 +58,7 @@ struct io_uring_sqe {
 			/* personality to use, if used */
 			__u16	personality;
 			__s32	splice_fd_in;
+			__u16	cq_idx;
 		};
 		__u64	__pad2[3];
 	};
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 09/23] io_uring: extract cq size helper
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (7 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 08/23] io_uring: internally pass CQ indexes Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 10/23] io_uring: add support for multiple CQs Pavel Begunkov
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

Extract a helper calculating the CQ size from a userspace-specified
number of entries.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 38 ++++++++++++++++++++++++--------------
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 356a5dc90f46..f05592ae5f41 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1139,6 +1139,24 @@ static inline bool io_is_timeout_noseq(struct io_kiocb *req)
 	return !req->timeout.off;
 }
 
+static long io_get_cqring_size(struct io_uring_params *p, unsigned entries)
+{
+	/*
+	 * If IORING_SETUP_CQSIZE is set, we do the same roundup
+	 * to a power-of-two, if it isn't already. We do NOT impose
+	 * any cq vs sq ring sizing.
+	 */
+	if (!entries)
+		return -EINVAL;
+	if (entries > IORING_MAX_CQ_ENTRIES) {
+		if (!(p->flags & IORING_SETUP_CLAMP))
+			return -EINVAL;
+		entries = IORING_MAX_CQ_ENTRIES;
+	}
+	entries = roundup_pow_of_two(entries);
+	return entries;
+}
+
 static struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 {
 	struct io_ring_ctx *ctx;
@@ -9625,21 +9643,13 @@ static int io_uring_create(unsigned entries, struct io_uring_params *p,
 	 */
 	p->sq_entries = roundup_pow_of_two(entries);
 	if (p->flags & IORING_SETUP_CQSIZE) {
-		/*
-		 * If IORING_SETUP_CQSIZE is set, we do the same roundup
-		 * to a power-of-two, if it isn't already. We do NOT impose
-		 * any cq vs sq ring sizing.
-		 */
-		if (!p->cq_entries)
-			return -EINVAL;
-		if (p->cq_entries > IORING_MAX_CQ_ENTRIES) {
-			if (!(p->flags & IORING_SETUP_CLAMP))
-				return -EINVAL;
-			p->cq_entries = IORING_MAX_CQ_ENTRIES;
-		}
-		p->cq_entries = roundup_pow_of_two(p->cq_entries);
-		if (p->cq_entries < p->sq_entries)
+		long cq_entries = io_get_cqring_size(p, p->cq_entries);
+
+		if (cq_entries < 0)
+			return cq_entries;
+		if (cq_entries < p->sq_entries)
 			return -EINVAL;
+		p->cq_entries = cq_entries;
 	} else {
 		p->cq_entries = 2 * p->sq_entries;
 	}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 10/23] io_uring: add support for multiple CQs
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (8 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 09/23] io_uring: extract cq size helper Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 11/23] io_uring: enable mmap'ing additional CQs Pavel Begunkov
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

TODO: don't rob all bits from params, use pointer to a struct
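
For reference, a rough sketch of how userspace could ask for extra CQs
with this interface; nr_cq and cq_sizes are the new io_uring_params
fields from the uapi change below, and no liburing plumbing is assumed:

    #include <string.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/io_uring.h>

    /* Sketch only: set up a ring with two extra CQs of 128 entries each.
     * They get CQ indexes 1 and 2; index 0 is the default CQ.
     */
    static int setup_with_extra_cqs(void)
    {
            __u32 cq_sizes[2] = { 128, 128 };
            struct io_uring_params p;

            memset(&p, 0, sizeof(p));
            p.nr_cq = 2;
            p.cq_sizes = (__u64)(unsigned long)cq_sizes;
            return syscall(__NR_io_uring_setup, 64, &p);
    }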

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c                 | 89 +++++++++++++++++++++++++++--------
 include/uapi/linux/io_uring.h |  3 +-
 2 files changed, 71 insertions(+), 21 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index f05592ae5f41..067cfb3a6e4a 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -91,6 +91,7 @@
 #define IORING_MAX_CQ_ENTRIES	(2 * IORING_MAX_ENTRIES)
 
 #define IO_DEFAULT_CQ		0
+#define IO_MAX_CQRINGS		1024
 
 /*
  * Shift of 9 is 512 entries, or exactly one page on 64-bit archs
@@ -417,7 +418,7 @@ struct io_ring_ctx {
 	unsigned long		cq_check_overflow;
 	unsigned		cq_extra;
 	struct wait_queue_head	cq_wait;
-	struct io_cqring	cqs[1];
+	struct io_cqring	*cqs;
 	unsigned int		cq_nr;
 
 	struct {
@@ -1166,6 +1167,9 @@ static struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 	if (!ctx)
 		return NULL;
 
+	ctx->cqs = kmalloc_array(p->nr_cq + 1, sizeof(ctx->cqs[0]), GFP_KERNEL);
+	if (!ctx->cqs)
+		goto err;
 	/*
 	 * Use 5 bits less than the max cq entries, that should give us around
 	 * 32 entries per hash list if totally full and uniformly spread.
@@ -8634,6 +8638,8 @@ static bool io_wait_rsrc_data(struct io_rsrc_data *data)
 
 static void io_ring_ctx_free(struct io_ring_ctx *ctx)
 {
+	unsigned int i;
+
 	io_sq_thread_finish(ctx);
 
 	if (ctx->mm_account) {
@@ -8673,6 +8679,9 @@ static void io_ring_ctx_free(struct io_ring_ctx *ctx)
 
 	io_mem_free(ctx->rings);
 	io_mem_free(ctx->sq_sqes);
+	for (i = 1; i < ctx->cq_nr; i++)
+		io_mem_free(ctx->cqs[i].rings);
+	kfree(ctx->cqs);
 
 	percpu_ref_exit(&ctx->refs);
 	free_uid(ctx->user);
@@ -9524,11 +9533,39 @@ static const struct file_operations io_uring_fops = {
 #endif
 };
 
+static void __io_init_cqring(struct io_cqring *cq, struct io_rings *rings,
+			   unsigned int entries)
+{
+	WRITE_ONCE(rings->cq_ring_entries, entries);
+	WRITE_ONCE(rings->cq_ring_mask, entries - 1);
+
+	cq->cached_tail = 0;
+	cq->rings = rings;
+	cq->entries = entries;
+}
+
+static int io_init_cqring(struct io_cqring *cq, unsigned int entries)
+{
+	struct io_rings *rings;
+	size_t size;
+
+	size = rings_size(0, entries, NULL);
+	if (size == SIZE_MAX)
+		return -EOVERFLOW;
+	rings = io_mem_alloc(size);
+	if (!rings)
+		return -ENOMEM;
+	__io_init_cqring(cq, rings, entries);
+	return 0;
+}
+
 static int io_allocate_scq_urings(struct io_ring_ctx *ctx,
 				  struct io_uring_params *p)
 {
+	u32 __user *cq_sizes = u64_to_user_ptr(p->cq_sizes);
 	struct io_rings *rings;
 	size_t size, sq_array_offset;
+	int i, ret;
 
 	/* make sure these are sane, as we already accounted them */
 	ctx->sq_entries = p->sq_entries;
@@ -9544,30 +9581,43 @@ static int io_allocate_scq_urings(struct io_ring_ctx *ctx,
 	ctx->rings = rings;
 	ctx->sq_array = (u32 *)((char *)rings + sq_array_offset);
 	rings->sq_ring_mask = p->sq_entries - 1;
-	rings->cq_ring_mask = p->cq_entries - 1;
 	rings->sq_ring_entries = p->sq_entries;
-	rings->cq_ring_entries = p->cq_entries;
 
-	ctx->cqs[0].cached_tail = 0;
-	ctx->cqs[0].rings = rings;
-	ctx->cqs[0].entries = p->cq_entries;
+	__io_init_cqring(&ctx->cqs[0], rings, p->cq_entries);
 	ctx->cq_nr = 1;
 
 	size = array_size(sizeof(struct io_uring_sqe), p->sq_entries);
-	if (size == SIZE_MAX) {
-		io_mem_free(ctx->rings);
-		ctx->rings = NULL;
-		return -EOVERFLOW;
-	}
+	ret = -EOVERFLOW;
+	if (unlikely(size == SIZE_MAX))
+		goto err;
 
 	ctx->sq_sqes = io_mem_alloc(size);
-	if (!ctx->sq_sqes) {
-		io_mem_free(ctx->rings);
-		ctx->rings = NULL;
-		return -ENOMEM;
+	ret = -ENOMEM;
+	if (unlikely(!ctx->sq_sqes))
+		goto err;
+
+	for (i = 0; i < p->nr_cq; i++, ctx->cq_nr++) {
+		u32 sz;
+		long entries;
+
+		ret = -EFAULT;
+		if (copy_from_user(&sz, &cq_sizes[i], sizeof(sz)))
+			goto err;
+		entries = io_get_cqring_size(p, sz);
+		if (entries < 0) {
+			ret = entries;
+			goto err;
+		}
+		ret = io_init_cqring(&ctx->cqs[i + 1], entries);
+		if (ret)
+			goto err;
 	}
 
 	return 0;
+err:
+	io_mem_free(ctx->rings);
+	ctx->rings = NULL;
+	return ret;
 }
 
 static int io_uring_install_fd(struct io_ring_ctx *ctx, struct file *file)
@@ -9653,6 +9703,10 @@ static int io_uring_create(unsigned entries, struct io_uring_params *p,
 	} else {
 		p->cq_entries = 2 * p->sq_entries;
 	}
+	if (p->nr_cq > IO_MAX_CQRINGS)
+		return -EINVAL;
+	if (!p->nr_cq != !p->cq_sizes)
+		return -EINVAL;
 
 	ctx = io_ring_ctx_alloc(p);
 	if (!ctx)
@@ -9744,14 +9798,9 @@ static int io_uring_create(unsigned entries, struct io_uring_params *p,
 static long io_uring_setup(u32 entries, struct io_uring_params __user *params)
 {
 	struct io_uring_params p;
-	int i;
 
 	if (copy_from_user(&p, params, sizeof(p)))
 		return -EFAULT;
-	for (i = 0; i < ARRAY_SIZE(p.resv); i++) {
-		if (p.resv[i])
-			return -EINVAL;
-	}
 
 	if (p.flags & ~(IORING_SETUP_IOPOLL | IORING_SETUP_SQPOLL |
 			IORING_SETUP_SQ_AFF | IORING_SETUP_CQSIZE |
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index c2dfb179360a..92b61ca09ea5 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -263,7 +263,8 @@ struct io_uring_params {
 	__u32 sq_thread_idle;
 	__u32 features;
 	__u32 wq_fd;
-	__u32 resv[3];
+	__u32 nr_cq;
+	__u64 cq_sizes;
 	struct io_sqring_offsets sq_off;
 	struct io_cqring_offsets cq_off;
 };
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 11/23] io_uring: enable mmap'ing additional CQs
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (9 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 10/23] io_uring: add support for multiple CQs Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 12/23] bpf: add IOURING program type Pavel Begunkov
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

TODO: get rid of extra offset
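
A rough sketch of how an extra CQ ring could be mmap'ed with these
offsets (constants from the uapi change below; ring_sz is computed from
the CQ size the same way as for the main CQ ring):

    #include <sys/mman.h>
    #include <linux/io_uring.h>

    /* Sketch only: map the ring of the extra CQ with index cq_idx;
     * index 0 is the default CQ, still mapped at IORING_OFF_CQ_RING.
     */
    static void *map_extra_cq(int ring_fd, unsigned int cq_idx, size_t ring_sz)
    {
            off_t off = IORING_OFF_CQ_RING_EXTRA +
                        (off_t)cq_idx * IORING_STRIDE_CQ_RING;

            return mmap(NULL, ring_sz, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_POPULATE, ring_fd, off);
    }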

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c                 | 13 ++++++++++++-
 include/uapi/linux/io_uring.h |  2 ++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 067cfb3a6e4a..1a4c9e513ac9 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -9207,6 +9207,7 @@ static void *io_uring_validate_mmap_request(struct file *file,
 	struct io_ring_ctx *ctx = file->private_data;
 	loff_t offset = pgoff << PAGE_SHIFT;
 	struct page *page;
+	unsigned long cq_idx;
 	void *ptr;
 
 	switch (offset) {
@@ -9218,7 +9219,15 @@ static void *io_uring_validate_mmap_request(struct file *file,
 		ptr = ctx->sq_sqes;
 		break;
 	default:
-		return ERR_PTR(-EINVAL);
+		if (offset < IORING_OFF_CQ_RING_EXTRA)
+			return ERR_PTR(-EINVAL);
+		offset -= IORING_OFF_CQ_RING_EXTRA;
+		if (offset % IORING_STRIDE_CQ_RING)
+			return ERR_PTR(-EINVAL);
+		cq_idx = offset / IORING_STRIDE_CQ_RING;
+		if (cq_idx >= ctx->cq_nr)
+			return ERR_PTR(-EINVAL);
+		ptr = ctx->cqs[cq_idx].rings;
 	}
 
 	page = virt_to_head_page(ptr);
@@ -9615,6 +9624,8 @@ static int io_allocate_scq_urings(struct io_ring_ctx *ctx,
 
 	return 0;
 err:
+	while (ctx->cq_nr > 1)
+		io_mem_free(ctx->cqs[--ctx->cq_nr].rings);
 	io_mem_free(ctx->rings);
 	ctx->rings = NULL;
 	return ret;
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 92b61ca09ea5..67a97c793de7 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -203,6 +203,8 @@ enum {
 #define IORING_OFF_SQ_RING		0ULL
 #define IORING_OFF_CQ_RING		0x8000000ULL
 #define IORING_OFF_SQES			0x10000000ULL
+#define IORING_OFF_CQ_RING_EXTRA	0x1200000ULL
+#define IORING_STRIDE_CQ_RING		0x0100000ULL
 
 /*
  * Filled with the offset for mmap(2)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 12/23] bpf: add IOURING program type
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (10 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 11/23] io_uring: enable mmap'ing additional CQs Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-20 23:34   ` Song Liu
  2021-05-19 14:13 ` [PATCH 13/23] io_uring: implement bpf prog registration Pavel Begunkov
                   ` (11 subsequent siblings)
  23 siblings, 1 reply; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

Draft a new program type BPF_PROG_TYPE_IOURING, which will be used by
io_uring to execute BPF-based requests.
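
For illustration, a minimal restricted-C program of the new type could
look like the sketch below. The "iouring" section name is an assumption
(the real one is whatever the later libbpf patch wires up), and the
context is still opaque at this point:

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    /* Sketch only: an empty program of the new type. The verifier change
     * below only accepts a constant return value of 0.
     */
    SEC("iouring")
    int empty_io_uring_prog(void *ctx)
    {
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";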

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c             | 21 +++++++++++++++++++++
 include/linux/bpf_types.h |  2 ++
 include/uapi/linux/bpf.h  |  1 +
 kernel/bpf/syscall.c      |  1 +
 kernel/bpf/verifier.c     |  5 ++++-
 5 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 1a4c9e513ac9..882b16b5e5eb 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -10201,6 +10201,27 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 	return ret;
 }
 
+static const struct bpf_func_proto *
+io_bpf_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
+{
+	return bpf_base_func_proto(func_id);
+}
+
+static bool io_bpf_is_valid_access(int off, int size,
+				   enum bpf_access_type type,
+				   const struct bpf_prog *prog,
+				   struct bpf_insn_access_aux *info)
+{
+	return false;
+}
+
+const struct bpf_prog_ops bpf_io_uring_prog_ops = {};
+
+const struct bpf_verifier_ops bpf_io_uring_verifier_ops = {
+	.get_func_proto		= io_bpf_func_proto,
+	.is_valid_access	= io_bpf_is_valid_access,
+};
+
 SYSCALL_DEFINE4(io_uring_register, unsigned int, fd, unsigned int, opcode,
 		void __user *, arg, unsigned int, nr_args)
 {
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index 99f7fd657d87..d0b7954887bd 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -77,6 +77,8 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_LSM, lsm,
 	       void *, void *)
 #endif /* CONFIG_BPF_LSM */
 #endif
+BPF_PROG_TYPE(BPF_PROG_TYPE_IOURING, bpf_io_uring,
+	      void *, void *)
 
 BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY, array_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_PERCPU_ARRAY, percpu_array_map_ops)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4ba4ef0ff63a..de544f0fbeef 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -206,6 +206,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_EXT,
 	BPF_PROG_TYPE_LSM,
 	BPF_PROG_TYPE_SK_LOOKUP,
+	BPF_PROG_TYPE_IOURING,
 };
 
 enum bpf_attach_type {
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 250503482cda..6ef7a26f4dc3 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2041,6 +2041,7 @@ static bool is_net_admin_prog_type(enum bpf_prog_type prog_type)
 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
 	case BPF_PROG_TYPE_SOCK_OPS:
+	case BPF_PROG_TYPE_IOURING:
 	case BPF_PROG_TYPE_EXT: /* extends any prog */
 		return true;
 	case BPF_PROG_TYPE_CGROUP_SKB:
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 0399ac092b36..2a53f44618a7 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8558,6 +8558,9 @@ static int check_return_code(struct bpf_verifier_env *env)
 	case BPF_PROG_TYPE_SK_LOOKUP:
 		range = tnum_range(SK_DROP, SK_PASS);
 		break;
+	case BPF_PROG_TYPE_IOURING:
+		range = tnum_const(0);
+		break;
 	case BPF_PROG_TYPE_EXT:
 		/* freplace program can return anything as its return value
 		 * depends on the to-be-replaced kernel func or bpf program.
@@ -12560,7 +12563,7 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 	u64 key;
 
 	if (prog->aux->sleepable && prog->type != BPF_PROG_TYPE_TRACING &&
-	    prog->type != BPF_PROG_TYPE_LSM) {
+	    prog->type != BPF_PROG_TYPE_LSM && prog->type != BPF_PROG_TYPE_IOURING) {
 		verbose(env, "Only fentry/fexit/fmod_ret and lsm programs can be sleepable\n");
 		return -EINVAL;
 	}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 13/23] io_uring: implement bpf prog registration
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (11 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 12/23] bpf: add IOURING program type Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-20 23:45   ` Song Liu
  2021-05-19 14:13 ` [PATCH 14/23] io_uring: add support for bpf requests Pavel Begunkov
                   ` (10 subsequent siblings)
  23 siblings, 1 reply; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

[de]register BPF programs through io_uring_register() with the new
IORING_REGISTER_BPF and IORING_UNREGISTER_BPF commands.
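
A minimal userspace sketch of how registration is expected to be used (raw
syscall, since there is no liburing wrapper for these commands yet; error
handling omitted, constants come from the updated io_uring.h in this patch):

    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/io_uring.h>

    /* register a table of BPF program fds with the ring */
    static int ring_register_bpf(int ring_fd, __u32 *prog_fds, unsigned nr)
    {
            return syscall(__NR_io_uring_register, ring_fd,
                           IORING_REGISTER_BPF, prog_fds, nr);
    }

    /* drop the whole table */
    static int ring_unregister_bpf(int ring_fd)
    {
            return syscall(__NR_io_uring_register, ring_fd,
                           IORING_UNREGISTER_BPF, NULL, 0);
    }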

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c                 | 81 +++++++++++++++++++++++++++++++++++
 include/uapi/linux/io_uring.h |  2 +
 2 files changed, 83 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 882b16b5e5eb..b13cbcd5c47b 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -78,6 +78,7 @@
 #include <linux/task_work.h>
 #include <linux/pagemap.h>
 #include <linux/io_uring.h>
+#include <linux/bpf.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/io_uring.h>
@@ -103,6 +104,8 @@
 #define IORING_MAX_RESTRICTIONS	(IORING_RESTRICTION_LAST + \
 				 IORING_REGISTER_LAST + IORING_OP_LAST)
 
+#define IORING_MAX_BPF_PROGS	100
+
 #define SQE_VALID_FLAGS	(IOSQE_FIXED_FILE|IOSQE_IO_DRAIN|IOSQE_IO_LINK|	\
 				IOSQE_IO_HARDLINK | IOSQE_ASYNC | \
 				IOSQE_BUFFER_SELECT)
@@ -266,6 +269,10 @@ struct io_restriction {
 	bool registered;
 };
 
+struct io_bpf_prog {
+	struct bpf_prog *prog;
+};
+
 enum {
 	IO_SQ_THREAD_SHOULD_STOP = 0,
 	IO_SQ_THREAD_SHOULD_PARK,
@@ -411,6 +418,10 @@ struct io_ring_ctx {
 	struct xarray		personalities;
 	u32			pers_next;
 
+	/* bpf programs */
+	unsigned		nr_bpf_progs;
+	struct io_bpf_prog	*bpf_progs;
+
 	struct fasync_struct	*cq_fasync;
 	struct eventfd_ctx	*cq_ev_fd;
 	atomic_t		cq_timeouts;
@@ -8627,6 +8638,66 @@ static void io_req_caches_free(struct io_ring_ctx *ctx)
 	mutex_unlock(&ctx->uring_lock);
 }
 
+static int io_bpf_unregister(struct io_ring_ctx *ctx)
+{
+	int i;
+
+	if (!ctx->nr_bpf_progs)
+		return -ENXIO;
+
+	for (i = 0; i < ctx->nr_bpf_progs; ++i) {
+		struct bpf_prog *prog = ctx->bpf_progs[i].prog;
+
+		if (prog)
+			bpf_prog_put(prog);
+	}
+	kfree(ctx->bpf_progs);
+	ctx->bpf_progs = NULL;
+	ctx->nr_bpf_progs = 0;
+	return 0;
+}
+
+static int io_bpf_register(struct io_ring_ctx *ctx, void __user *arg,
+			   unsigned int nr_args)
+{
+	u32 __user *fds = arg;
+	int i, ret = 0;
+
+	if (!nr_args || nr_args > IORING_MAX_BPF_PROGS)
+		return -EINVAL;
+	if (ctx->nr_bpf_progs)
+		return -EBUSY;
+
+	ctx->bpf_progs = kcalloc(nr_args, sizeof(ctx->bpf_progs[0]),
+				 GFP_KERNEL);
+	if (!ctx->bpf_progs)
+		return -ENOMEM;
+
+	for (i = 0; i < nr_args; ++i) {
+		struct bpf_prog *prog;
+		u32 fd;
+
+		if (copy_from_user(&fd, &fds[i], sizeof(fd))) {
+			ret = -EFAULT;
+			break;
+		}
+		if (fd == -1)
+			continue;
+
+		prog = bpf_prog_get_type(fd, BPF_PROG_TYPE_IOURING);
+		if (IS_ERR(prog)) {
+			ret = PTR_ERR(prog);
+			break;
+		}
+		ctx->bpf_progs[i].prog = prog;
+	}
+
+	ctx->nr_bpf_progs = i;
+	if (ret)
+		io_bpf_unregister(ctx);
+	return ret;
+}
+
 static bool io_wait_rsrc_data(struct io_rsrc_data *data)
 {
 	if (!data)
@@ -8657,6 +8728,7 @@ static void io_ring_ctx_free(struct io_ring_ctx *ctx)
 	mutex_unlock(&ctx->uring_lock);
 	io_eventfd_unregister(ctx);
 	io_destroy_buffers(ctx);
+	io_bpf_unregister(ctx);
 	if (ctx->sq_creds)
 		put_cred(ctx->sq_creds);
 
@@ -10188,6 +10260,15 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 	case IORING_REGISTER_RSRC_UPDATE:
 		ret = io_register_rsrc_update(ctx, arg, nr_args);
 		break;
+	case IORING_REGISTER_BPF:
+		ret = io_bpf_register(ctx, arg, nr_args);
+		break;
+	case IORING_UNREGISTER_BPF:
+		ret = -EINVAL;
+		if (arg || nr_args)
+			break;
+		ret = io_bpf_unregister(ctx);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 67a97c793de7..b450f41d7389 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -304,6 +304,8 @@ enum {
 	IORING_REGISTER_ENABLE_RINGS		= 12,
 	IORING_REGISTER_RSRC			= 13,
 	IORING_REGISTER_RSRC_UPDATE		= 14,
+	IORING_REGISTER_BPF			= 15,
+	IORING_UNREGISTER_BPF			= 16,
 
 	/* this goes last */
 	IORING_REGISTER_LAST
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 14/23] io_uring: add support for bpf requests
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (12 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 13/23] io_uring: implement bpf prog registration Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-21  0:42   ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 15/23] io_uring: enable BPF to submit SQEs Pavel Begunkov
                   ` (9 subsequent siblings)
  23 siblings, 1 reply; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

Wire up a new io_uring operation type IORING_OP_BPF, which executes a
specified BPF program from the registered prog table. It doesn't allow
doing anything useful for now; no BPF helpers are available apart from
the basic ones.
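
For illustration, queueing such a request from userspace could look roughly
like the sketch below (liburing-style, assuming headers from the linked
branches; the registered program index goes into sqe->off and most other
fields must stay zero):

    #include <string.h>
    #include <liburing.h>

    static void queue_bpf_req(struct io_uring *ring, unsigned prog_idx,
                              __u64 user_data)
    {
            struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

            memset(sqe, 0, sizeof(*sqe));
            sqe->opcode = IORING_OP_BPF;
            sqe->off = prog_idx;    /* index into the registered prog table */
            sqe->user_data = user_data;
    }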

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c                 | 92 +++++++++++++++++++++++++++++++++++
 include/uapi/linux/io_uring.h |  1 +
 2 files changed, 93 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index b13cbcd5c47b..20fddc5945f2 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -682,6 +682,11 @@ struct io_unlink {
 	struct filename			*filename;
 };
 
+struct io_bpf {
+	struct file			*file;
+	struct bpf_prog			*prog;
+};
+
 struct io_completion {
 	struct file			*file;
 	struct list_head		list;
@@ -826,6 +831,7 @@ struct io_kiocb {
 		struct io_shutdown	shutdown;
 		struct io_rename	rename;
 		struct io_unlink	unlink;
+		struct io_bpf		bpf;
 		/* use only after cleaning per-op data, see io_clean_op() */
 		struct io_completion	compl;
 	};
@@ -875,6 +881,9 @@ struct io_defer_entry {
 	u32			seq;
 };
 
+struct io_bpf_ctx {
+};
+
 struct io_op_def {
 	/* needs req->file assigned */
 	unsigned		needs_file : 1;
@@ -1039,6 +1048,7 @@ static const struct io_op_def io_op_defs[] = {
 	},
 	[IORING_OP_RENAMEAT] = {},
 	[IORING_OP_UNLINKAT] = {},
+	[IORING_OP_BPF] = {},
 };
 
 static bool io_disarm_next(struct io_kiocb *req);
@@ -1070,6 +1080,7 @@ static void io_rsrc_put_work(struct work_struct *work);
 static void io_req_task_queue(struct io_kiocb *req);
 static void io_submit_flush_completions(struct io_comp_state *cs,
 					struct io_ring_ctx *ctx);
+static void io_bpf_run(struct io_kiocb *req, unsigned int issue_flags);
 static bool io_poll_remove_waitqs(struct io_kiocb *req);
 static int io_req_prep_async(struct io_kiocb *req);
 
@@ -3931,6 +3942,53 @@ static int io_openat(struct io_kiocb *req, unsigned int issue_flags)
 	return io_openat2(req, issue_flags);
 }
 
+static int io_bpf_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+	struct io_ring_ctx *ctx = req->ctx;
+	struct bpf_prog *prog;
+	unsigned int idx;
+
+	if (unlikely(ctx->flags & (IORING_SETUP_IOPOLL|IORING_SETUP_SQPOLL)))
+		return -EINVAL;
+	if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
+		return -EINVAL;
+	if (sqe->ioprio || sqe->len || sqe->cancel_flags)
+		return -EINVAL;
+	if (sqe->addr)
+		return -EINVAL;
+
+	idx = READ_ONCE(sqe->off);
+	if (unlikely(idx >= ctx->nr_bpf_progs))
+		return -EFAULT;
+	idx = array_index_nospec(idx, ctx->nr_bpf_progs);
+	prog = ctx->bpf_progs[idx].prog;
+	if (!prog)
+		return -EFAULT;
+
+	req->bpf.prog = prog;
+	return 0;
+}
+
+static void io_bpf_run_task_work(struct callback_head *cb)
+{
+	struct io_kiocb *req = container_of(cb, struct io_kiocb, task_work);
+	struct io_ring_ctx *ctx = req->ctx;
+
+	mutex_lock(&ctx->uring_lock);
+	io_bpf_run(req, 0);
+	mutex_unlock(&ctx->uring_lock);
+}
+
+static int io_bpf(struct io_kiocb *req, unsigned int issue_flags)
+{
+	init_task_work(&req->task_work, io_bpf_run_task_work);
+	if (unlikely(io_req_task_work_add(req))) {
+		req_ref_get(req);
+		io_req_task_queue_fail(req, -ECANCELED);
+	}
+	return 0;
+}
+
 static int io_remove_buffers_prep(struct io_kiocb *req,
 				  const struct io_uring_sqe *sqe)
 {
@@ -6002,6 +6060,8 @@ static int io_req_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 		return io_renameat_prep(req, sqe);
 	case IORING_OP_UNLINKAT:
 		return io_unlinkat_prep(req, sqe);
+	case IORING_OP_BPF:
+		return io_bpf_prep(req, sqe);
 	}
 
 	printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
@@ -6269,6 +6329,9 @@ static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)
 	case IORING_OP_UNLINKAT:
 		ret = io_unlinkat(req, issue_flags);
 		break;
+	case IORING_OP_BPF:
+		ret = io_bpf(req, issue_flags);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -10303,6 +10366,35 @@ const struct bpf_verifier_ops bpf_io_uring_verifier_ops = {
 	.is_valid_access	= io_bpf_is_valid_access,
 };
 
+static void io_bpf_run(struct io_kiocb *req, unsigned int issue_flags)
+{
+	struct io_ring_ctx *ctx = req->ctx;
+	struct io_bpf_ctx bpf_ctx;
+	struct bpf_prog *prog;
+	int ret = -EAGAIN;
+
+	lockdep_assert_held(&req->ctx->uring_lock);
+
+	if (unlikely(percpu_ref_is_dying(&ctx->refs) ||
+		     atomic_read(&req->task->io_uring->in_idle)))
+		goto done;
+
+	memset(&bpf_ctx, 0, sizeof(bpf_ctx));
+	prog = req->bpf.prog;
+
+	if (prog->aux->sleepable) {
+		rcu_read_lock();
+		bpf_prog_run_pin_on_cpu(req->bpf.prog, &bpf_ctx);
+		rcu_read_unlock();
+	} else {
+		bpf_prog_run_pin_on_cpu(req->bpf.prog, &bpf_ctx);
+	}
+
+	ret = 0;
+done:
+	__io_req_complete(req, issue_flags, ret, 0);
+}
+
 SYSCALL_DEFINE4(io_uring_register, unsigned int, fd, unsigned int, opcode,
 		void __user *, arg, unsigned int, nr_args)
 {
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index b450f41d7389..25ab804670e1 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -138,6 +138,7 @@ enum {
 	IORING_OP_SHUTDOWN,
 	IORING_OP_RENAMEAT,
 	IORING_OP_UNLINKAT,
+	IORING_OP_BPF,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 15/23] io_uring: enable BPF to submit SQEs
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (13 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 14/23] io_uring: add support for bpf requests Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-21  0:06   ` Song Liu
  2021-05-21  1:07   ` Alexei Starovoitov
  2021-05-19 14:13 ` [PATCH 16/23] io_uring: enable bpf to submit CQEs Pavel Begunkov
                   ` (8 subsequent siblings)
  23 siblings, 2 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

Add a BPF_FUNC_iouring_queue_sqe BPF function as a demonstration of
submitting a new request from a BPF request.
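
A rough sketch of what such a program could look like in restricted C. Note
the helper is only exposed to sleepable programs; the bpf_iouring_queue_sqe()
name assumes the usual bpf_helper_defs.h generation and the "iouring.s"
section name comes from the libbpf patch later in the series.

    #include <linux/bpf.h>
    #include <linux/io_uring.h>
    #include <bpf/bpf_helpers.h>

    SEC("iouring.s")
    int submit_nop(void *ctx)
    {
            struct io_uring_sqe sqe;

            /* queue a single NOP request from inside the BPF request */
            __builtin_memset(&sqe, 0, sizeof(sqe));
            sqe.opcode = IORING_OP_NOP;
            sqe.user_data = 42;
            bpf_iouring_queue_sqe(ctx, &sqe, sizeof(sqe));
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";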

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c            | 51 ++++++++++++++++++++++++++++++++++++----
 include/uapi/linux/bpf.h |  1 +
 2 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 20fddc5945f2..aae786291c57 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -882,6 +882,7 @@ struct io_defer_entry {
 };
 
 struct io_bpf_ctx {
+	struct io_ring_ctx	*ctx;
 };
 
 struct io_op_def {
@@ -6681,7 +6682,8 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
 			ret = -EBADF;
 	}
 
-	state->ios_left--;
+	if (state->ios_left > 1)
+		state->ios_left--;
 	return ret;
 }
 
@@ -10345,10 +10347,50 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 	return ret;
 }
 
+BPF_CALL_3(io_bpf_queue_sqe, struct io_bpf_ctx *,		bpf_ctx,
+			     const struct io_uring_sqe *,	sqe,
+			     u32,				sqe_len)
+{
+	struct io_ring_ctx *ctx = bpf_ctx->ctx;
+	struct io_kiocb *req;
+
+	if (sqe_len != sizeof(struct io_uring_sqe))
+		return -EINVAL;
+
+	req = io_alloc_req(ctx);
+	if (unlikely(!req))
+		return -ENOMEM;
+	if (!percpu_ref_tryget_many(&ctx->refs, 1)) {
+		kmem_cache_free(req_cachep, req);
+		return -EAGAIN;
+	}
+	percpu_counter_add(&current->io_uring->inflight, 1);
+	refcount_add(1, &current->usage);
+
+	/* returns number of submitted SQEs or an error */
+	return !io_submit_sqe(ctx, req, sqe);
+}
+
+const struct bpf_func_proto io_bpf_queue_sqe_proto = {
+	.func = io_bpf_queue_sqe,
+	.gpl_only = false,
+	.ret_type = RET_INTEGER,
+	.arg1_type = ARG_PTR_TO_CTX,
+	.arg2_type = ARG_PTR_TO_MEM,
+	.arg3_type = ARG_CONST_SIZE,
+};
+
 static const struct bpf_func_proto *
 io_bpf_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
-	return bpf_base_func_proto(func_id);
+	switch (func_id) {
+	case BPF_FUNC_copy_from_user:
+		return prog->aux->sleepable ? &bpf_copy_from_user_proto : NULL;
+	case BPF_FUNC_iouring_queue_sqe:
+		return prog->aux->sleepable ? &io_bpf_queue_sqe_proto : NULL;
+	default:
+		return bpf_base_func_proto(func_id);
+	}
 }
 
 static bool io_bpf_is_valid_access(int off, int size,
@@ -10379,9 +10421,10 @@ static void io_bpf_run(struct io_kiocb *req, unsigned int issue_flags)
 		     atomic_read(&req->task->io_uring->in_idle)))
 		goto done;
 
-	memset(&bpf_ctx, 0, sizeof(bpf_ctx));
+	bpf_ctx.ctx = ctx;
 	prog = req->bpf.prog;
 
+	io_submit_state_start(&ctx->submit_state, 1);
 	if (prog->aux->sleepable) {
 		rcu_read_lock();
 		bpf_prog_run_pin_on_cpu(req->bpf.prog, &bpf_ctx);
@@ -10389,7 +10432,7 @@ static void io_bpf_run(struct io_kiocb *req, unsigned int issue_flags)
 	} else {
 		bpf_prog_run_pin_on_cpu(req->bpf.prog, &bpf_ctx);
 	}
-
+	io_submit_state_end(&ctx->submit_state, ctx);
 	ret = 0;
 done:
 	__io_req_complete(req, issue_flags, ret, 0);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index de544f0fbeef..cc268f749a7d 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4082,6 +4082,7 @@ union bpf_attr {
 	FN(ima_inode_hash),		\
 	FN(sock_from_file),		\
 	FN(check_mtu),			\
+	FN(iouring_queue_sqe),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 16/23] io_uring: enable bpf to submit CQEs
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (14 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 15/23] io_uring: enable BPF to submit SQEs Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 17/23] io_uring: enable bpf to reap CQEs Pavel Begunkov
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c            | 36 ++++++++++++++++++++++++++++++++++++
 include/uapi/linux/bpf.h |  1 +
 2 files changed, 37 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index aae786291c57..464d630904e2 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -10371,6 +10371,29 @@ BPF_CALL_3(io_bpf_queue_sqe, struct io_bpf_ctx *,		bpf_ctx,
 	return !io_submit_sqe(ctx, req, sqe);
 }
 
+BPF_CALL_5(io_bpf_emit_cqe, struct io_bpf_ctx *,		bpf_ctx,
+			    u32,				cq_idx,
+			    u64,				user_data,
+			    s32,				res,
+			    u32,				flags)
+{
+	struct io_ring_ctx *ctx = bpf_ctx->ctx;
+	bool submitted;
+
+	if (unlikely(cq_idx >= ctx->cq_nr))
+		return -EINVAL;
+
+	spin_lock_irq(&ctx->completion_lock);
+	submitted = io_cqring_fill_event(ctx, user_data, res, flags, cq_idx);
+	io_commit_cqring(ctx);
+	ctx->cq_extra++;
+	spin_unlock_irq(&ctx->completion_lock);
+	if (submitted)
+		io_cqring_ev_posted(ctx);
+
+	return submitted ? 0 : -ENOMEM;
+}
+
 const struct bpf_func_proto io_bpf_queue_sqe_proto = {
 	.func = io_bpf_queue_sqe,
 	.gpl_only = false,
@@ -10380,6 +10403,17 @@ const struct bpf_func_proto io_bpf_queue_sqe_proto = {
 	.arg3_type = ARG_CONST_SIZE,
 };
 
+const struct bpf_func_proto io_bpf_emit_cqe_proto = {
+	.func = io_bpf_emit_cqe,
+	.gpl_only = false,
+	.ret_type = RET_INTEGER,
+	.arg1_type = ARG_PTR_TO_CTX,
+	.arg2_type = ARG_ANYTHING,
+	.arg3_type = ARG_ANYTHING,
+	.arg4_type = ARG_ANYTHING,
+	.arg5_type = ARG_ANYTHING,
+};
+
 static const struct bpf_func_proto *
 io_bpf_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -10388,6 +10422,8 @@ io_bpf_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return prog->aux->sleepable ? &bpf_copy_from_user_proto : NULL;
 	case BPF_FUNC_iouring_queue_sqe:
 		return prog->aux->sleepable ? &io_bpf_queue_sqe_proto : NULL;
+	case BPF_FUNC_iouring_emit_cqe:
+		return &io_bpf_emit_cqe_proto;
 	default:
 		return bpf_base_func_proto(func_id);
 	}
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index cc268f749a7d..c6b023be7848 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4083,6 +4083,7 @@ union bpf_attr {
 	FN(sock_from_file),		\
 	FN(check_mtu),			\
 	FN(iouring_queue_sqe),		\
+	FN(iouring_emit_cqe),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 17/23] io_uring: enable bpf to reap CQEs
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (15 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 16/23] io_uring: enable bpf to submit CQEs Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 18/23] libbpf: support io_uring Pavel Begunkov
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c            | 48 ++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/bpf.h |  1 +
 2 files changed, 49 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 464d630904e2..7c165b2ce8e4 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -10394,6 +10394,42 @@ BPF_CALL_5(io_bpf_emit_cqe, struct io_bpf_ctx *,		bpf_ctx,
 	return submitted ? 0 : -ENOMEM;
 }
 
+BPF_CALL_4(io_bpf_reap_cqe, struct io_bpf_ctx *,		bpf_ctx,
+			    u32,				cq_idx,
+			    struct io_uring_cqe *,		cqe_out,
+			    u32,				cqe_len)
+{
+	struct io_ring_ctx *ctx = bpf_ctx->ctx;
+	struct io_uring_cqe *cqe;
+	struct io_cqring *cq;
+	struct io_rings *r;
+	unsigned tail, head, mask;
+	int ret = -EINVAL;
+
+	if (unlikely(cqe_len != sizeof(*cqe_out)))
+		goto err;
+	if (unlikely(cq_idx >= ctx->cq_nr))
+		goto err;
+
+	cq = &ctx->cqs[cq_idx];
+	r = cq->rings;
+	tail = READ_ONCE(r->cq.tail);
+	head = smp_load_acquire(&r->cq.head);
+
+	ret = -ENOENT;
+	if (unlikely(tail == head))
+		goto err;
+
+	mask = cq->entries - 1;
+	cqe = &r->cqes[head & mask];
+	memcpy(cqe_out, cqe, sizeof(*cqe_out));
+	WRITE_ONCE(r->cq.head, head + 1);
+	return 0;
+err:
+	memset(cqe_out, 0, sizeof(*cqe_out));
+	return ret;
+}
+
 const struct bpf_func_proto io_bpf_queue_sqe_proto = {
 	.func = io_bpf_queue_sqe,
 	.gpl_only = false,
@@ -10414,6 +10450,16 @@ const struct bpf_func_proto io_bpf_emit_cqe_proto = {
 	.arg5_type = ARG_ANYTHING,
 };
 
+const struct bpf_func_proto io_bpf_reap_cqe_proto = {
+	.func = io_bpf_reap_cqe,
+	.gpl_only = false,
+	.ret_type = RET_INTEGER,
+	.arg1_type = ARG_PTR_TO_CTX,
+	.arg2_type = ARG_ANYTHING,
+	.arg3_type = ARG_PTR_TO_UNINIT_MEM,
+	.arg4_type = ARG_CONST_SIZE,
+};
+
 static const struct bpf_func_proto *
 io_bpf_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -10424,6 +10470,8 @@ io_bpf_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return prog->aux->sleepable ? &io_bpf_queue_sqe_proto : NULL;
 	case BPF_FUNC_iouring_emit_cqe:
 		return &io_bpf_emit_cqe_proto;
+	case BPF_FUNC_iouring_reap_cqe:
+		return &io_bpf_reap_cqe_proto;
 	default:
 		return bpf_base_func_proto(func_id);
 	}
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c6b023be7848..7719ec4a33e7 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4084,6 +4084,7 @@ union bpf_attr {
 	FN(check_mtu),			\
 	FN(iouring_queue_sqe),		\
 	FN(iouring_emit_cqe),		\
+	FN(iouring_reap_cqe),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 18/23] libbpf: support io_uring
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (16 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 17/23] io_uring: enable bpf to reap CQEs Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 17:38   ` Andrii Nakryiko
  2021-05-19 14:13 ` [PATCH 19/23] io_uring: pass user_data to bpf executor Pavel Begunkov
                   ` (5 subsequent siblings)
  23 siblings, 1 reply; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 tools/lib/bpf/libbpf.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 4181d178ee7b..de5d1508f58e 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -13,6 +13,10 @@
 #ifndef _GNU_SOURCE
 #define _GNU_SOURCE
 #endif
+
+/* hack, use local headers instead of system-wide */
+#include "../../../include/uapi/linux/bpf.h"
+
 #include <stdlib.h>
 #include <stdio.h>
 #include <stdarg.h>
@@ -8630,6 +8634,9 @@ static const struct bpf_sec_def section_defs[] = {
 	BPF_PROG_SEC("struct_ops",		BPF_PROG_TYPE_STRUCT_OPS),
 	BPF_EAPROG_SEC("sk_lookup/",		BPF_PROG_TYPE_SK_LOOKUP,
 						BPF_SK_LOOKUP),
+	SEC_DEF("iouring/",			IOURING),
+	SEC_DEF("iouring.s/",			IOURING,
+		.is_sleepable = true),
 };
 
 #undef BPF_PROG_SEC_IMPL
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 19/23] io_uring: pass user_data to bpf executor
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (17 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 18/23] libbpf: support io_uring Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 20/23] bpf: Add bpf_copy_to_user() helper Pavel Begunkov
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c                 | 16 ++++++++++++++++
 include/uapi/linux/io_uring.h |  4 ++++
 2 files changed, 20 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 7c165b2ce8e4..c37846bca863 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -882,6 +882,7 @@ struct io_defer_entry {
 };
 
 struct io_bpf_ctx {
+	struct io_uring_bpf_ctx u;
 	struct io_ring_ctx	*ctx;
 };
 
@@ -10482,6 +10483,15 @@ static bool io_bpf_is_valid_access(int off, int size,
 				   const struct bpf_prog *prog,
 				   struct bpf_insn_access_aux *info)
 {
+	if (off < 0 || off >= sizeof(struct io_uring_bpf_ctx))
+		return false;
+	if (off % size != 0)
+		return false;
+
+	switch (off) {
+	case offsetof(struct io_uring_bpf_ctx, user_data):
+		return size == sizeof_field(struct io_uring_bpf_ctx, user_data);
+	}
 	return false;
 }
 
@@ -10505,6 +10515,8 @@ static void io_bpf_run(struct io_kiocb *req, unsigned int issue_flags)
 		     atomic_read(&req->task->io_uring->in_idle)))
 		goto done;
 
+	memset(&bpf_ctx.u, 0, sizeof(bpf_ctx.u));
+	bpf_ctx.u.user_data = req->user_data;
 	bpf_ctx.ctx = ctx;
 	prog = req->bpf.prog;
 
@@ -10591,6 +10603,10 @@ static int __init io_uring_init(void)
 	BUILD_BUG_SQE_ELEM(44, __s32,  splice_fd_in);
 	BUILD_BUG_SQE_ELEM(48, __u16,  cq_idx);
 
+	/* should be first, see io_bpf_is_valid_access() */
+	__BUILD_BUG_VERIFY_ELEMENT(struct io_bpf_ctx, 0,
+				   struct io_uring_bpf_ctx, u);
+
 	BUILD_BUG_ON(sizeof(struct io_uring_files_update) !=
 		     sizeof(struct io_uring_rsrc_update));
 	BUILD_BUG_ON(sizeof(struct io_uring_rsrc_update) >
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 25ab804670e1..d7b1713bcfb0 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -403,4 +403,8 @@ struct io_uring_getevents_arg {
 	__u64	ts;
 };
 
+struct io_uring_bpf_ctx {
+	__u64	user_data;
+};
+
 #endif
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 20/23] bpf: Add bpf_copy_to_user() helper
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (18 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 19/23] io_uring: pass user_data to bpf executor Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 21/23] io_uring: wire bpf copy to user Pavel Begunkov
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

Similarly to bpf_copy_from_user(), also allow sleepable BPF programs
to write to user memory.
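
Once wired into the io_uring program type (next patch), a sleepable program
could, for example, treat the request's user_data as a user-space pointer and
write a result back there. This is a sketch only; interpreting user_data as a
pointer is purely an application convention.

    #include <linux/bpf.h>
    #include <linux/io_uring.h>
    #include <bpf/bpf_helpers.h>

    SEC("iouring.s")
    int write_back(struct io_uring_bpf_ctx *ctx)
    {
            unsigned int value = 1;

            /* user_data is assumed to carry a user-space address here */
            if (ctx->user_data)
                    bpf_copy_to_user((void *)(unsigned long)ctx->user_data,
                                     &value, sizeof(value));
            return 0;
    }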

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/linux/bpf.h            |  1 +
 include/uapi/linux/bpf.h       |  8 ++++++++
 kernel/bpf/helpers.c           | 17 +++++++++++++++++
 tools/include/uapi/linux/bpf.h |  7 +++++++
 4 files changed, 33 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 00597b0c719c..9b775e2b2a01 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1899,6 +1899,7 @@ extern const struct bpf_func_proto bpf_skc_to_tcp_timewait_sock_proto;
 extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto;
 extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto;
 extern const struct bpf_func_proto bpf_copy_from_user_proto;
+extern const struct bpf_func_proto bpf_copy_to_user_proto;
 extern const struct bpf_func_proto bpf_snprintf_btf_proto;
 extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 7719ec4a33e7..6f19839d2b05 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3648,6 +3648,13 @@ union bpf_attr {
  * 	Return
  * 		0 on success, or a negative error in case of failure.
  *
+ * long bpf_copy_to_user(void *user_ptr, const void *src, u32 size)
+ * 	Description
+ * 		Read *size* bytes from *src* and store the data in user space
+ * 		address *user_ptr*. This is a wrapper of **copy_to_user**\ ().
+ * 	Return
+ * 		0 on success, or a negative error in case of failure.
+ *
  * long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags)
  *	Description
  *		Use BTF to store a string representation of *ptr*->ptr in *str*,
@@ -4085,6 +4092,7 @@ union bpf_attr {
 	FN(iouring_queue_sqe),		\
 	FN(iouring_emit_cqe),		\
 	FN(iouring_reap_cqe),		\
+	FN(copy_to_user),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 308427fe03a3..9d7814c564e5 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -634,6 +634,23 @@ const struct bpf_func_proto bpf_copy_from_user_proto = {
 	.arg3_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_3(bpf_copy_to_user, void __user *, user_ptr,
+	   const void *, src, u32, size)
+{
+	int ret = copy_to_user(user_ptr, src, size);
+
+	return ret ? -EFAULT : 0;
+}
+
+const struct bpf_func_proto bpf_copy_to_user_proto = {
+	.func		= bpf_copy_to_user,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_ANYTHING,
+	.arg2_type	= ARG_PTR_TO_MEM,
+	.arg3_type	= ARG_CONST_SIZE_OR_ZERO,
+};
+
 BPF_CALL_2(bpf_per_cpu_ptr, const void *, ptr, u32, cpu)
 {
 	if (cpu >= nr_cpu_ids)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 79c893310492..18d497247d69 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3647,6 +3647,13 @@ union bpf_attr {
  * 	Return
  * 		0 on success, or a negative error in case of failure.
  *
+ * long bpf_copy_to_user(void *user_ptr, const void *src, u32 size)
+ * 	Description
+ * 		Read *size* bytes from *src* and store the data in user space
+ * 		address *user_ptr*. This is a wrapper of **copy_to_user**\ ().
+ * 	Return
+ * 		0 on success, or a negative error in case of failure.
+ *
  * long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags)
  *	Description
  *		Use BTF to store a string representation of *ptr*->ptr in *str*,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 21/23] io_uring: wire bpf copy to user
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (19 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 20/23] bpf: Add bpf_copy_to_user() helper Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 22/23] io_uring: don't wait on CQ exclusively Pavel Begunkov
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

Enable io_uring BPF programs to write to userspace memory via the new
bpf_copy_to_user() helper.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index c37846bca863..c4682146afa4 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -10467,6 +10467,8 @@ io_bpf_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	switch (func_id) {
 	case BPF_FUNC_copy_from_user:
 		return prog->aux->sleepable ? &bpf_copy_from_user_proto : NULL;
+	case BPF_FUNC_copy_to_user:
+		return prog->aux->sleepable ? &bpf_copy_to_user_proto : NULL;
 	case BPF_FUNC_iouring_queue_sqe:
 		return prog->aux->sleepable ? &io_bpf_queue_sqe_proto : NULL;
 	case BPF_FUNC_iouring_emit_cqe:
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 22/23] io_uring: don't wait on CQ exclusively
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (20 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 21/23] io_uring: wire bpf copy to user Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-19 14:13 ` [PATCH 23/23] io_uring: enable bpf reqs to wait for CQs Pavel Begunkov
  2021-05-21  0:35 ` [RFC v2 00/23] io_uring BPF requests Song Liu
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

It doesn't make much sense for several tasks to wait on a single CQ:
it would be racy and involve rather strange code flow with extra
synchronisation, so we don't really care about optimising this case.

Don't do exclusive CQ waiting; wake up everyone in the queue instead.
This will be handy for implementing waiting on non-default CQs.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index c4682146afa4..805c10be7ea4 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7085,7 +7085,7 @@ static int io_wake_function(struct wait_queue_entry *curr, unsigned int mode,
 	 */
 	if (io_should_wake(iowq) || test_bit(0, &iowq->ctx->cq_check_overflow))
 		return autoremove_wake_function(curr, mode, wake_flags, key);
-	return -1;
+	return 0;
 }
 
 static int io_run_task_work_sig(void)
@@ -7176,8 +7176,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
 			ret = -EBUSY;
 			break;
 		}
-		prepare_to_wait_exclusive(&ctx->wait, &iowq.wq,
-						TASK_INTERRUPTIBLE);
+		prepare_to_wait(&ctx->wait, &iowq.wq, TASK_INTERRUPTIBLE);
 		ret = io_cqring_wait_schedule(ctx, &iowq, &timeout);
 		finish_wait(&ctx->wait, &iowq.wq);
 		cond_resched();
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 23/23] io_uring: enable bpf reqs to wait for CQs
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (21 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 22/23] io_uring: don't wait on CQ exclusively Pavel Begunkov
@ 2021-05-19 14:13 ` Pavel Begunkov
  2021-05-21  0:35 ` [RFC v2 00/23] io_uring BPF requests Song Liu
  23 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-19 14:13 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

Add experimental support for BPF requests waiting for a number of CQEs
in a specified CQ.
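
For illustration, a program could consume CQEs from a secondary CQ and, when
it finds none, ask to be re-run once a CQE arrives, roughly as below. A sketch
only: helper and section names are as in the previous patches, and the
re-arming behaviour follows the wait-queue callback added in this patch.

    #include <linux/bpf.h>
    #include <linux/io_uring.h>
    #include <bpf/bpf_helpers.h>

    SEC("iouring.s")
    int drain_cq(struct io_uring_bpf_ctx *ctx)
    {
            struct io_uring_cqe cqe;

            if (bpf_iouring_reap_cqe(ctx, 1, &cqe, sizeof(cqe))) {
                    /* nothing to reap from CQ 1 yet, wait for one CQE */
                    ctx->wait_nr = 1;
                    ctx->wait_idx = 1;
            }
            return 0;
    }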

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c                 | 80 +++++++++++++++++++++++++++++++++--
 include/uapi/linux/io_uring.h |  2 +
 2 files changed, 79 insertions(+), 3 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 805c10be7ea4..cf02389747b5 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -687,6 +687,12 @@ struct io_bpf {
 	struct bpf_prog			*prog;
 };
 
+struct io_async_bpf {
+	struct wait_queue_entry		wqe;
+	unsigned int 			wait_nr;
+	unsigned int 			wait_idx;
+};
+
 struct io_completion {
 	struct file			*file;
 	struct list_head		list;
@@ -1050,7 +1056,9 @@ static const struct io_op_def io_op_defs[] = {
 	},
 	[IORING_OP_RENAMEAT] = {},
 	[IORING_OP_UNLINKAT] = {},
-	[IORING_OP_BPF] = {},
+	[IORING_OP_BPF] = {
+		.async_size		= sizeof(struct io_async_bpf),
+	},
 };
 
 static bool io_disarm_next(struct io_kiocb *req);
@@ -9148,6 +9156,7 @@ static void io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
 			}
 		}
 
+		wake_up_all(&ctx->wait);
 		ret |= io_cancel_defer_files(ctx, task, files);
 		ret |= io_poll_remove_all(ctx, task, files);
 		ret |= io_kill_timeouts(ctx, task, files);
@@ -10492,6 +10501,10 @@ static bool io_bpf_is_valid_access(int off, int size,
 	switch (off) {
 	case offsetof(struct io_uring_bpf_ctx, user_data):
 		return size == sizeof_field(struct io_uring_bpf_ctx, user_data);
+	case offsetof(struct io_uring_bpf_ctx, wait_nr):
+		return size == sizeof_field(struct io_uring_bpf_ctx, wait_nr);
+	case offsetof(struct io_uring_bpf_ctx, wait_idx):
+		return size == sizeof_field(struct io_uring_bpf_ctx, wait_idx);
 	}
 	return false;
 }
@@ -10503,6 +10516,60 @@ const struct bpf_verifier_ops bpf_io_uring_verifier_ops = {
 	.is_valid_access	= io_bpf_is_valid_access,
 };
 
+static inline bool io_bpf_need_wake(struct io_async_bpf *abpf)
+{
+	struct io_kiocb *req = abpf->wqe.private;
+	struct io_ring_ctx *ctx = req->ctx;
+
+	if (unlikely(percpu_ref_is_dying(&ctx->refs)) ||
+		     atomic_read(&req->task->io_uring->in_idle))
+		return true;
+	return __io_cqring_events(&ctx->cqs[abpf->wait_idx]) >= abpf->wait_nr;
+}
+
+static int io_bpf_wait_func(struct wait_queue_entry *wqe, unsigned mode,
+			       int sync, void *key)
+{
+	struct io_async_bpf *abpf = container_of(wqe, struct io_async_bpf, wqe);
+	bool wake = io_bpf_need_wake(abpf);
+
+	if (wake) {
+		list_del_init_careful(&wqe->entry);
+		req_ref_get(wqe->private);
+		io_queue_async_work(wqe->private);
+	}
+	return wake;
+}
+
+static int io_bpf_wait_cq_async(struct io_kiocb *req, unsigned int nr,
+				unsigned int idx)
+{
+	struct io_ring_ctx *ctx = req->ctx;
+	struct wait_queue_head *wq;
+	struct wait_queue_entry *wqe;
+	struct io_async_bpf *abpf;
+
+	if (unlikely(idx >= ctx->cq_nr))
+		return -EINVAL;
+	if (!req->async_data && io_alloc_async_data(req))
+		return -ENOMEM;
+
+	abpf = req->async_data;
+	abpf->wait_nr = nr;
+	abpf->wait_idx = idx;
+	wqe = &abpf->wqe;
+	init_waitqueue_func_entry(wqe, io_bpf_wait_func);
+	wqe->private = req;
+	wq = &ctx->wait;
+
+	spin_lock_irq(&wq->lock);
+	__add_wait_queue(wq, wqe);
+	smp_mb();
+	io_bpf_wait_func(wqe, 0, 0, NULL);
+	spin_unlock_irq(&wq->lock);
+	return 0;
+}
+
 static void io_bpf_run(struct io_kiocb *req, unsigned int issue_flags)
 {
 	struct io_ring_ctx *ctx = req->ctx;
@@ -10512,8 +10579,8 @@ static void io_bpf_run(struct io_kiocb *req, unsigned int issue_flags)
 
 	lockdep_assert_held(&req->ctx->uring_lock);
 
-	if (unlikely(percpu_ref_is_dying(&ctx->refs) ||
-		     atomic_read(&req->task->io_uring->in_idle)))
+	if (unlikely(percpu_ref_is_dying(&ctx->refs)) ||
+		     atomic_read(&req->task->io_uring->in_idle))
 		goto done;
 
 	memset(&bpf_ctx.u, 0, sizeof(bpf_ctx.u));
@@ -10531,6 +10598,13 @@ static void io_bpf_run(struct io_kiocb *req, unsigned int issue_flags)
 	}
 	io_submit_state_end(&ctx->submit_state, ctx);
 	ret = 0;
+
+	if (bpf_ctx.u.wait_nr) {
+		ret = io_bpf_wait_cq_async(req, bpf_ctx.u.wait_nr,
+					   bpf_ctx.u.wait_idx);
+		if (!ret)
+			return;
+	}
 done:
 	__io_req_complete(req, issue_flags, ret, 0);
 }
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index d7b1713bcfb0..95c04af3afd4 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -405,6 +405,8 @@ struct io_uring_getevents_arg {
 
 struct io_uring_bpf_ctx {
 	__u64	user_data;
+	__u32	wait_nr;
+	__u32	wait_idx;
 };
 
 #endif
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 18/23] libbpf: support io_uring
  2021-05-19 14:13 ` [PATCH 18/23] libbpf: support io_uring Pavel Begunkov
@ 2021-05-19 17:38   ` Andrii Nakryiko
  2021-05-20  9:58     ` Pavel Begunkov
  0 siblings, 1 reply; 39+ messages in thread
From: Andrii Nakryiko @ 2021-05-19 17:38 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: io-uring, Networking, bpf, open list, Jens Axboe,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

On Wed, May 19, 2021 at 7:14 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  tools/lib/bpf/libbpf.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 4181d178ee7b..de5d1508f58e 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -13,6 +13,10 @@
>  #ifndef _GNU_SOURCE
>  #define _GNU_SOURCE
>  #endif
> +
> +/* hack, use local headers instead of system-wide */
> +#include "../../../include/uapi/linux/bpf.h"
> +

libbpf is already using the latest UAPI headers, so you don't need
this hack. You just haven't synced include/uapi/linux/bpf.h into
tools/include/uapi/linux/bpf.h

>  #include <stdlib.h>
>  #include <stdio.h>
>  #include <stdarg.h>
> @@ -8630,6 +8634,9 @@ static const struct bpf_sec_def section_defs[] = {
>         BPF_PROG_SEC("struct_ops",              BPF_PROG_TYPE_STRUCT_OPS),
>         BPF_EAPROG_SEC("sk_lookup/",            BPF_PROG_TYPE_SK_LOOKUP,
>                                                 BPF_SK_LOOKUP),
> +       SEC_DEF("iouring/",                     IOURING),
> +       SEC_DEF("iouring.s/",                   IOURING,
> +               .is_sleepable = true),
>  };
>
>  #undef BPF_PROG_SEC_IMPL
> --
> 2.31.1
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 18/23] libbpf: support io_uring
  2021-05-19 17:38   ` Andrii Nakryiko
@ 2021-05-20  9:58     ` Pavel Begunkov
  2021-05-20 17:23       ` Andrii Nakryiko
  0 siblings, 1 reply; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-20  9:58 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: io-uring, Networking, bpf, open list, Jens Axboe,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

On 5/19/21 6:38 PM, Andrii Nakryiko wrote:
> On Wed, May 19, 2021 at 7:14 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>> ---
>>  tools/lib/bpf/libbpf.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
>> index 4181d178ee7b..de5d1508f58e 100644
>> --- a/tools/lib/bpf/libbpf.c
>> +++ b/tools/lib/bpf/libbpf.c
>> @@ -13,6 +13,10 @@
>>  #ifndef _GNU_SOURCE
>>  #define _GNU_SOURCE
>>  #endif
>> +
>> +/* hack, use local headers instead of system-wide */
>> +#include "../../../include/uapi/linux/bpf.h"
>> +
> 
> libbpf is already using the latest UAPI headers, so you don't need
> this hack. You just haven't synced include/uapi/linux/bpf.h into
> tools/include/uapi/linux/bpf.h

It's more convenient for me to keep it local while this is an RFC,
I'll surely drop it later.

btw, I had a problem with find_sec_def() successfully matching the
"iouring.s" string against "iouring", because section_defs[i].len
doesn't include the final \0 and so it does a sort of prefix comparison.
That's why "iouring/". Can we fix it? Or are there compatibility
concerns?

> 
>>  #include <stdlib.h>
>>  #include <stdio.h>
>>  #include <stdarg.h>
>> @@ -8630,6 +8634,9 @@ static const struct bpf_sec_def section_defs[] = {
>>         BPF_PROG_SEC("struct_ops",              BPF_PROG_TYPE_STRUCT_OPS),
>>         BPF_EAPROG_SEC("sk_lookup/",            BPF_PROG_TYPE_SK_LOOKUP,
>>                                                 BPF_SK_LOOKUP),
>> +       SEC_DEF("iouring/",                     IOURING),
>> +       SEC_DEF("iouring.s/",                   IOURING,
>> +               .is_sleepable = true),
>>  };
>>
>>  #undef BPF_PROG_SEC_IMPL
>> --
>> 2.31.1
>>

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 18/23] libbpf: support io_uring
  2021-05-20  9:58     ` Pavel Begunkov
@ 2021-05-20 17:23       ` Andrii Nakryiko
  0 siblings, 0 replies; 39+ messages in thread
From: Andrii Nakryiko @ 2021-05-20 17:23 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: io-uring, Networking, bpf, open list, Jens Axboe,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

On Thu, May 20, 2021 at 2:58 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> On 5/19/21 6:38 PM, Andrii Nakryiko wrote:
> > On Wed, May 19, 2021 at 7:14 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
> >>
> >> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> >> ---
> >>  tools/lib/bpf/libbpf.c | 7 +++++++
> >>  1 file changed, 7 insertions(+)
> >>
> >> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> >> index 4181d178ee7b..de5d1508f58e 100644
> >> --- a/tools/lib/bpf/libbpf.c
> >> +++ b/tools/lib/bpf/libbpf.c
> >> @@ -13,6 +13,10 @@
> >>  #ifndef _GNU_SOURCE
> >>  #define _GNU_SOURCE
> >>  #endif
> >> +
> >> +/* hack, use local headers instead of system-wide */
> >> +#include "../../../include/uapi/linux/bpf.h"
> >> +
> >
> > libbpf is already using the latest UAPI headers, so you don't need
> > this hack. You just haven't synced include/uapi/linux/bpf.h into
> > tools/include/uapi/linux/bpf.h
>
> It's more convenient to keep it local to me while RFC, surely will
> drop it later.
>
> btw, I had a problem with find_sec_def() successfully matching
> "iouring.s" string with "iouring", because section_defs[i].len
> doesn't include final \0 and so does a sort of prefix comparison.
> That's why "iouring/". Can we fix it? Are compatibility concerns?

If you put "iouring.s" before "iouring" it will be matched first,
libbpf matches them in order, so more specific prefix should go first.
It is currently always treated as a prefix, not exact match,
unfortunately. I have a work planned to revamp this logic quite a bit
for libbpf 1.0, so this should be improved as part of that work.
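
I.e. the sleepable entry would just need to be listed first in
section_defs[], roughly (a fragment, not a full patch):

    SEC_DEF("iouring.s/",                   IOURING,
            .is_sleepable = true),
    SEC_DEF("iouring/",                     IOURING),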



>
> >
> >>  #include <stdlib.h>
> >>  #include <stdio.h>
> >>  #include <stdarg.h>
> >> @@ -8630,6 +8634,9 @@ static const struct bpf_sec_def section_defs[] = {
> >>         BPF_PROG_SEC("struct_ops",              BPF_PROG_TYPE_STRUCT_OPS),
> >>         BPF_EAPROG_SEC("sk_lookup/",            BPF_PROG_TYPE_SK_LOOKUP,
> >>                                                 BPF_SK_LOOKUP),
> >> +       SEC_DEF("iouring/",                     IOURING),
> >> +       SEC_DEF("iouring.s/",                   IOURING,
> >> +               .is_sleepable = true),
> >>  };
> >>
> >>  #undef BPF_PROG_SEC_IMPL
> >> --
> >> 2.31.1
> >>
>
> --
> Pavel Begunkov

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 01/23] io_uring: shuffle rarely used ctx fields
  2021-05-19 14:13 ` [PATCH 01/23] io_uring: shuffle rarely used ctx fields Pavel Begunkov
@ 2021-05-20 21:46   ` Song Liu
  2021-05-20 22:46     ` Pavel Begunkov
  0 siblings, 1 reply; 39+ messages in thread
From: Song Liu @ 2021-05-20 21:46 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: io-uring, Networking, bpf, linux-kernel, Jens Axboe,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin Lau,
	Yonghong Song, John Fastabend, KP Singh, Horst Schirmeier,
	Franz-B . Tuneke, Christian Dietrich



> On May 19, 2021, at 7:13 AM, Pavel Begunkov <asml.silence@gmail.com> wrote:
> 
> There is a bunch of scattered around ctx fields that are almost never
> used, e.g. only on ring exit, plunge them to the end, better locality,
> better aesthetically.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
> fs/io_uring.c | 36 +++++++++++++++++-------------------
> 1 file changed, 17 insertions(+), 19 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 9ac5e278a91e..7e3410ce100a 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -367,9 +367,6 @@ struct io_ring_ctx {
> 		unsigned		cached_cq_overflow;
> 		unsigned long		sq_check_overflow;
> 
> -		/* hashed buffered write serialization */
> -		struct io_wq_hash	*hash_map;
> -
> 		struct list_head	defer_list;
> 		struct list_head	timeout_list;
> 		struct list_head	cq_overflow_list;
> @@ -386,9 +383,6 @@ struct io_ring_ctx {
> 
> 	struct io_rings	*rings;
> 
> -	/* Only used for accounting purposes */
> -	struct mm_struct	*mm_account;
> -
> 	const struct cred	*sq_creds;	/* cred used for __io_sq_thread() */
> 	struct io_sq_data	*sq_data;	/* if using sq thread polling */
> 
> @@ -409,14 +403,6 @@ struct io_ring_ctx {
> 	unsigned		nr_user_bufs;
> 	struct io_mapped_ubuf	**user_bufs;
> 
> -	struct user_struct	*user;
> -
> -	struct completion	ref_comp;
> -
> -#if defined(CONFIG_UNIX)
> -	struct socket		*ring_sock;
> -#endif
> -
> 	struct xarray		io_buffers;
> 
> 	struct xarray		personalities;
> @@ -460,12 +446,24 @@ struct io_ring_ctx {
> 
> 	struct io_restriction		restrictions;
> 
> -	/* exit task_work */
> -	struct callback_head		*exit_task_work;
> -
> 	/* Keep this last, we don't need it for the fast path */
> -	struct work_struct		exit_work;
> -	struct list_head		tctx_list;
> +	struct {

Why do we need an anonymous struct here? For cache line alignment?
Do we need ____cacheline_aligned_in_smp?

> +		#if defined(CONFIG_UNIX)
> +			struct socket		*ring_sock;
> +		#endif
> +		/* hashed buffered write serialization */
> +		struct io_wq_hash		*hash_map;
> +
> +		/* Only used for accounting purposes */
> +		struct user_struct		*user;
> +		struct mm_struct		*mm_account;
> +
> +		/* ctx exit and cancelation */
> +		struct callback_head		*exit_task_work;
> +		struct work_struct		exit_work;
> +		struct list_head		tctx_list;
> +		struct completion		ref_comp;
> +	};
> };
> 
> struct io_uring_task {
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 01/23] io_uring: shuffle rarely used ctx fields
  2021-05-20 21:46   ` Song Liu
@ 2021-05-20 22:46     ` Pavel Begunkov
  0 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-20 22:46 UTC (permalink / raw)
  To: Song Liu
  Cc: io-uring, Networking, bpf, linux-kernel, Jens Axboe,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin Lau,
	Yonghong Song, John Fastabend, KP Singh, Horst Schirmeier,
	Franz-B . Tuneke, Christian Dietrich

On 5/20/21 10:46 PM, Song Liu wrote:
>> On May 19, 2021, at 7:13 AM, Pavel Begunkov <asml.silence@gmail.com> wrote:
>> There is a bunch of scattered around ctx fields that are almost never
>> used, e.g. only on ring exit, plunge them to the end, better locality,
>> better aesthetically.
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>> ---
>> fs/io_uring.c | 36 +++++++++++++++++-------------------
>> 1 file changed, 17 insertions(+), 19 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 9ac5e278a91e..7e3410ce100a 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -367,9 +367,6 @@ struct io_ring_ctx {
>> 		unsigned		cached_cq_overflow;
>> 		unsigned long		sq_check_overflow;
>>
>> -		/* hashed buffered write serialization */
>> -		struct io_wq_hash	*hash_map;
>> -
>> 		struct list_head	defer_list;
>> 		struct list_head	timeout_list;
>> 		struct list_head	cq_overflow_list;
>> @@ -386,9 +383,6 @@ struct io_ring_ctx {
>>
>> 	struct io_rings	*rings;
>>
>> -	/* Only used for accounting purposes */
>> -	struct mm_struct	*mm_account;
>> -
>> 	const struct cred	*sq_creds;	/* cred used for __io_sq_thread() */
>> 	struct io_sq_data	*sq_data;	/* if using sq thread polling */
>>
>> @@ -409,14 +403,6 @@ struct io_ring_ctx {
>> 	unsigned		nr_user_bufs;
>> 	struct io_mapped_ubuf	**user_bufs;
>>
>> -	struct user_struct	*user;
>> -
>> -	struct completion	ref_comp;
>> -
>> -#if defined(CONFIG_UNIX)
>> -	struct socket		*ring_sock;
>> -#endif
>> -
>> 	struct xarray		io_buffers;
>>
>> 	struct xarray		personalities;
>> @@ -460,12 +446,24 @@ struct io_ring_ctx {
>>
>> 	struct io_restriction		restrictions;
>>
>> -	/* exit task_work */
>> -	struct callback_head		*exit_task_work;
>> -
>> 	/* Keep this last, we don't need it for the fast path */
>> -	struct work_struct		exit_work;
>> -	struct list_head		tctx_list;
>> +	struct {
> 
> Why do we need an anonymous struct here? For cache line alignment?
> Do we need ____cacheline_aligned_in_smp?

Rather as a visual hint, considering that most of the fields historically
are in structs (____cacheline_aligned_in_smp). It's also preparation for
potentially splitting it out of the ctx struct as it grows big.

The first 2-3 patches are not strictly related to bpf and will go
separately earlier; the set was just based on top of them.

>> +		#if defined(CONFIG_UNIX)
>> +			struct socket		*ring_sock;
>> +		#endif
>> +		/* hashed buffered write serialization */
>> +		struct io_wq_hash		*hash_map;
>> +
>> +		/* Only used for accounting purposes */
>> +		struct user_struct		*user;
>> +		struct mm_struct		*mm_account;
>> +
>> +		/* ctx exit and cancelation */
>> +		struct callback_head		*exit_task_work;
>> +		struct work_struct		exit_work;
>> +		struct list_head		tctx_list;
>> +		struct completion		ref_comp;
>> +	};
>> };
>>
>> struct io_uring_task {

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 12/23] bpf: add IOURING program type
  2021-05-19 14:13 ` [PATCH 12/23] bpf: add IOURING program type Pavel Begunkov
@ 2021-05-20 23:34   ` Song Liu
  2021-05-21  0:56     ` Pavel Begunkov
  0 siblings, 1 reply; 39+ messages in thread
From: Song Liu @ 2021-05-20 23:34 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: io-uring, Networking,
	open list:BPF (Safe dynamic programs and tools),
	Linux Kernel Mailing List, Jens Axboe, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin Lau, Yonghong Song,
	John Fastabend, KP Singh, Horst Schirmeier, Franz-B . Tuneke,
	Christian Dietrich



> On May 19, 2021, at 7:13 AM, Pavel Begunkov <asml.silence@gmail.com> wrote:
> 
> Draft a new program type BPF_PROG_TYPE_IOURING, which will be used by
> io_uring to execute BPF-based requests.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
> fs/io_uring.c             | 21 +++++++++++++++++++++
> include/linux/bpf_types.h |  2 ++
> include/uapi/linux/bpf.h  |  1 +
> kernel/bpf/syscall.c      |  1 +
> kernel/bpf/verifier.c     |  5 ++++-
> 5 files changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 1a4c9e513ac9..882b16b5e5eb 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -10201,6 +10201,27 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
> 	return ret;
> }
> 
> +static const struct bpf_func_proto *
> +io_bpf_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> +{
> +	return bpf_base_func_proto(func_id);
> +}
> +
> +static bool io_bpf_is_valid_access(int off, int size,
> +				   enum bpf_access_type type,
> +				   const struct bpf_prog *prog,
> +				   struct bpf_insn_access_aux *info)
> +{
> +	return false;
> +}
> +
> +const struct bpf_prog_ops bpf_io_uring_prog_ops = {};
> +
> +const struct bpf_verifier_ops bpf_io_uring_verifier_ops = {
> +	.get_func_proto		= io_bpf_func_proto,
> +	.is_valid_access	= io_bpf_is_valid_access,
> +};
> +
> SYSCALL_DEFINE4(io_uring_register, unsigned int, fd, unsigned int, opcode,
> 		void __user *, arg, unsigned int, nr_args)
> {
> diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
> index 99f7fd657d87..d0b7954887bd 100644
> --- a/include/linux/bpf_types.h
> +++ b/include/linux/bpf_types.h
> @@ -77,6 +77,8 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_LSM, lsm,
> 	       void *, void *)
> #endif /* CONFIG_BPF_LSM */
> #endif
> +BPF_PROG_TYPE(BPF_PROG_TYPE_IOURING, bpf_io_uring,
> +	      void *, void *)
> 
> BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY, array_map_ops)
> BPF_MAP_TYPE(BPF_MAP_TYPE_PERCPU_ARRAY, percpu_array_map_ops)
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 4ba4ef0ff63a..de544f0fbeef 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -206,6 +206,7 @@ enum bpf_prog_type {
> 	BPF_PROG_TYPE_EXT,
> 	BPF_PROG_TYPE_LSM,
> 	BPF_PROG_TYPE_SK_LOOKUP,
> +	BPF_PROG_TYPE_IOURING,
> };
> 
> enum bpf_attach_type {
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 250503482cda..6ef7a26f4dc3 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -2041,6 +2041,7 @@ static bool is_net_admin_prog_type(enum bpf_prog_type prog_type)
> 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
> 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
> 	case BPF_PROG_TYPE_SOCK_OPS:
> +	case BPF_PROG_TYPE_IOURING:
> 	case BPF_PROG_TYPE_EXT: /* extends any prog */
> 		return true;
> 	case BPF_PROG_TYPE_CGROUP_SKB:
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 0399ac092b36..2a53f44618a7 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -8558,6 +8558,9 @@ static int check_return_code(struct bpf_verifier_env *env)
> 	case BPF_PROG_TYPE_SK_LOOKUP:
> 		range = tnum_range(SK_DROP, SK_PASS);
> 		break;
> +	case BPF_PROG_TYPE_IOURING:
> +		range = tnum_const(0);
> +		break;
> 	case BPF_PROG_TYPE_EXT:
> 		/* freplace program can return anything as its return value
> 		 * depends on the to-be-replaced kernel func or bpf program.
> @@ -12560,7 +12563,7 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
> 	u64 key;
> 
> 	if (prog->aux->sleepable && prog->type != BPF_PROG_TYPE_TRACING &&
> -	    prog->type != BPF_PROG_TYPE_LSM) {
> +	    prog->type != BPF_PROG_TYPE_LSM && prog->type != BPF_PROG_TYPE_IOURING) {

Is IOURING program sleepable? If so, please highlight that in the commit log 
and update the warning below. 

> 		verbose(env, "Only fentry/fexit/fmod_ret and lsm programs can be sleepable\n");
> 		return -EINVAL;
> 	}
> -- 
> 2.31.1
> 


* Re: [PATCH 13/23] io_uring: implement bpf prog registration
  2021-05-19 14:13 ` [PATCH 13/23] io_uring: implement bpf prog registration Pavel Begunkov
@ 2021-05-20 23:45   ` Song Liu
  2021-05-21  0:43     ` Pavel Begunkov
  0 siblings, 1 reply; 39+ messages in thread
From: Song Liu @ 2021-05-20 23:45 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: io-uring, Networking, bpf, linux-kernel, Jens Axboe,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin Lau,
	Yonghong Song, John Fastabend, KP Singh, Horst Schirmeier,
	Franz-B . Tuneke, Christian Dietrich



> On May 19, 2021, at 7:13 AM, Pavel Begunkov <asml.silence@gmail.com> wrote:
> 
> [de]register BPF programs through io_uring_register() with new
> IORING_ATTACH_BPF and IORING_DETACH_BPF commands.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
> fs/io_uring.c                 | 81 +++++++++++++++++++++++++++++++++++
> include/uapi/linux/io_uring.h |  2 +
> 2 files changed, 83 insertions(+)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 882b16b5e5eb..b13cbcd5c47b 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -78,6 +78,7 @@
> #include <linux/task_work.h>
> #include <linux/pagemap.h>
> #include <linux/io_uring.h>
> +#include <linux/bpf.h>
> 
> #define CREATE_TRACE_POINTS
> #include <trace/events/io_uring.h>
> @@ -103,6 +104,8 @@
> #define IORING_MAX_RESTRICTIONS	(IORING_RESTRICTION_LAST + \
> 				 IORING_REGISTER_LAST + IORING_OP_LAST)
> 
> +#define IORING_MAX_BPF_PROGS	100

Is 100 a realistic number here? 

> +
> #define SQE_VALID_FLAGS	(IOSQE_FIXED_FILE|IOSQE_IO_DRAIN|IOSQE_IO_LINK|	\
> 				IOSQE_IO_HARDLINK | IOSQE_ASYNC | \
> 				IOSQE_BUFFER_SELECT)
> @@ -266,6 +269,10 @@ struct io_restriction {
> 	bool registered;
> };
> 

[...]


* Re: [PATCH 15/23] io_uring: enable BPF to submit SQEs
  2021-05-19 14:13 ` [PATCH 15/23] io_uring: enable BPF to submit SQEs Pavel Begunkov
@ 2021-05-21  0:06   ` Song Liu
  2021-05-21  1:07   ` Alexei Starovoitov
  1 sibling, 0 replies; 39+ messages in thread
From: Song Liu @ 2021-05-21  0:06 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: io-uring, Networking, bpf, linux-kernel, Jens Axboe,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin Lau,
	Yonghong Song, John Fastabend, KP Singh, Horst Schirmeier,
	Franz-B . Tuneke, Christian Dietrich



> On May 19, 2021, at 7:13 AM, Pavel Begunkov <asml.silence@gmail.com> wrote:
> 
> Add a BPF_FUNC_iouring_queue_sqe BPF function as a demonstration of
> submitting a new request by a BPF request.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
> fs/io_uring.c            | 51 ++++++++++++++++++++++++++++++++++++----
> include/uapi/linux/bpf.h |  1 +
> 2 files changed, 48 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 20fddc5945f2..aae786291c57 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -882,6 +882,7 @@ struct io_defer_entry {
> };
> 
> struct io_bpf_ctx {
> +	struct io_ring_ctx	*ctx;
> };
> 
> struct io_op_def {
> @@ -6681,7 +6682,8 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
> 			ret = -EBADF;
> 	}
> 
> -	state->ios_left--;
> +	if (state->ios_left > 1)
> +		state->ios_left--;
> 	return ret;
> }
> 
> @@ -10345,10 +10347,50 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
> 	return ret;
> }
> 
> +BPF_CALL_3(io_bpf_queue_sqe, struct io_bpf_ctx *,		bpf_ctx,
> +			     const struct io_uring_sqe *,	sqe,
> +			     u32,				sqe_len)
> +{
> +	struct io_ring_ctx *ctx = bpf_ctx->ctx;
> +	struct io_kiocb *req;
> +
> +	if (sqe_len != sizeof(struct io_uring_sqe))
> +		return -EINVAL;
> +
> +	req = io_alloc_req(ctx);
> +	if (unlikely(!req))
> +		return -ENOMEM;
> +	if (!percpu_ref_tryget_many(&ctx->refs, 1)) {
> +		kmem_cache_free(req_cachep, req);
> +		return -EAGAIN;
> +	}
> +	percpu_counter_add(&current->io_uring->inflight, 1);
> +	refcount_add(1, &current->usage);
> +
> +	/* returns number of submitted SQEs or an error */
> +	return !io_submit_sqe(ctx, req, sqe);
> +}
> +
> +const struct bpf_func_proto io_bpf_queue_sqe_proto = {
> +	.func = io_bpf_queue_sqe,
> +	.gpl_only = false,
> +	.ret_type = RET_INTEGER,
> +	.arg1_type = ARG_PTR_TO_CTX,
> +	.arg2_type = ARG_PTR_TO_MEM,
> +	.arg3_type = ARG_CONST_SIZE,
> +};
> +
> static const struct bpf_func_proto *
> io_bpf_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> {
> -	return bpf_base_func_proto(func_id);
> +	switch (func_id) {
> +	case BPF_FUNC_copy_from_user:
> +		return prog->aux->sleepable ? &bpf_copy_from_user_proto : NULL;
> +	case BPF_FUNC_iouring_queue_sqe:
> +		return prog->aux->sleepable ? &io_bpf_queue_sqe_proto : NULL;
> +	default:
> +		return bpf_base_func_proto(func_id);
> +	}
> }
> 
> static bool io_bpf_is_valid_access(int off, int size,
> @@ -10379,9 +10421,10 @@ static void io_bpf_run(struct io_kiocb *req, unsigned int issue_flags)
> 		     atomic_read(&req->task->io_uring->in_idle)))
> 		goto done;
> 
> -	memset(&bpf_ctx, 0, sizeof(bpf_ctx));
> +	bpf_ctx.ctx = ctx;
> 	prog = req->bpf.prog;
> 
> +	io_submit_state_start(&ctx->submit_state, 1);
> 	if (prog->aux->sleepable) {
> 		rcu_read_lock();
> 		bpf_prog_run_pin_on_cpu(req->bpf.prog, &bpf_ctx);
> @@ -10389,7 +10432,7 @@ static void io_bpf_run(struct io_kiocb *req, unsigned int issue_flags)
> 	} else {
> 		bpf_prog_run_pin_on_cpu(req->bpf.prog, &bpf_ctx);
> 	}
> -
> +	io_submit_state_end(&ctx->submit_state, ctx);
> 	ret = 0;
> done:
> 	__io_req_complete(req, issue_flags, ret, 0);
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index de544f0fbeef..cc268f749a7d 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -4082,6 +4082,7 @@ union bpf_attr {
> 	FN(ima_inode_hash),		\
> 	FN(sock_from_file),		\
> 	FN(check_mtu),			\
> +	FN(iouring_queue_sqe),		\

We need to describe this function in the comment above, just like 20/23 does. 

> 	/* */
> 
> /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> -- 
> 2.31.1
> 


* Re: [RFC v2 00/23] io_uring BPF requests
  2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
                   ` (22 preceding siblings ...)
  2021-05-19 14:13 ` [PATCH 23/23] io_uring: enable bpf reqs to wait for CQs Pavel Begunkov
@ 2021-05-21  0:35 ` Song Liu
  2021-05-21  0:58   ` Pavel Begunkov
  23 siblings, 1 reply; 39+ messages in thread
From: Song Liu @ 2021-05-21  0:35 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: io-uring, Networking, bpf, linux-kernel, Jens Axboe,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin Lau,
	Yonghong Song, John Fastabend, KP Singh, Horst Schirmeier,
	Franz-B . Tuneke, Christian Dietrich



> On May 19, 2021, at 7:13 AM, Pavel Begunkov <asml.silence@gmail.com> wrote:
> 
> The main problem solved is feeding completion information of other
> requests in a form of CQEs back into BPF. I decided to wire up support
> for multiple completion queues (aka CQs) and give BPF programs access to
> them, so leaving userspace in control over synchronisation that should
> be much more flexible that the link-based approach.
> 
> For instance, there can be a separate CQ for each BPF program, so no
> extra sync is needed, and communication can be done by submitting a
> request targeting a neighboring CQ or submitting a CQE there directly
> (see test3 below). CQ is choosen by sqe->cq_idx, so everyone can
> cross-fire if willing.
> 

[...]

>  bpf: add IOURING program type
>  io_uring: implement bpf prog registration
>  io_uring: add support for bpf requests
>  io_uring: enable BPF to submit SQEs
>  io_uring: enable bpf to submit CQEs
>  io_uring: enable bpf to reap CQEs
>  libbpf: support io_uring
>  io_uring: pass user_data to bpf executor
>  bpf: Add bpf_copy_to_user() helper
>  io_uring: wire bpf copy to user
>  io_uring: don't wait on CQ exclusively
>  io_uring: enable bpf reqs to wait for CQs

Aside from a few comments, these BPF-related patches look sane to me.
Please consider adding some selftests (tools/testing/selftests/bpf).

Thanks,
Song 


* Re: [PATCH 14/23] io_uring: add support for bpf requests
  2021-05-19 14:13 ` [PATCH 14/23] io_uring: add support for bpf requests Pavel Begunkov
@ 2021-05-21  0:42   ` Pavel Begunkov
  0 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-21  0:42 UTC (permalink / raw)
  To: io-uring, netdev, bpf, linux-kernel
  Cc: Jens Axboe, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

On 5/19/21 3:13 PM, Pavel Begunkov wrote:
> Wire up a new io_uring operation type IORING_OP_BPF, which executes a
> specified BPF program from the registered prog table. It doesn't allow
> to do anything useful for now, no BPF functions are allowed apart from
> basic ones.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  fs/io_uring.c                 | 92 +++++++++++++++++++++++++++++++++++
>  include/uapi/linux/io_uring.h |  1 +
>  2 files changed, 93 insertions(+)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index b13cbcd5c47b..20fddc5945f2 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -682,6 +682,11 @@ struct io_unlink {
>  	struct filename			*filename;
>  };
>  
> +struct io_bpf {
> +	struct file			*file;
> +	struct bpf_prog			*prog;
> +};
> +
>  struct io_completion {
>  	struct file			*file;
>  	struct list_head		list;
> @@ -826,6 +831,7 @@ struct io_kiocb {
>  		struct io_shutdown	shutdown;
>  		struct io_rename	rename;
>  		struct io_unlink	unlink;
> +		struct io_bpf		bpf;
>  		/* use only after cleaning per-op data, see io_clean_op() */
>  		struct io_completion	compl;
>  	};
> @@ -875,6 +881,9 @@ struct io_defer_entry {
>  	u32			seq;
>  };
>  
> +struct io_bpf_ctx {
> +};
> +
>  struct io_op_def {
>  	/* needs req->file assigned */
>  	unsigned		needs_file : 1;
> @@ -1039,6 +1048,7 @@ static const struct io_op_def io_op_defs[] = {
>  	},
>  	[IORING_OP_RENAMEAT] = {},
>  	[IORING_OP_UNLINKAT] = {},
> +	[IORING_OP_BPF] = {},
>  };
>  
>  static bool io_disarm_next(struct io_kiocb *req);
> @@ -1070,6 +1080,7 @@ static void io_rsrc_put_work(struct work_struct *work);
>  static void io_req_task_queue(struct io_kiocb *req);
>  static void io_submit_flush_completions(struct io_comp_state *cs,
>  					struct io_ring_ctx *ctx);
> +static void io_bpf_run(struct io_kiocb *req, unsigned int issue_flags);
>  static bool io_poll_remove_waitqs(struct io_kiocb *req);
>  static int io_req_prep_async(struct io_kiocb *req);
>  
> @@ -3931,6 +3942,53 @@ static int io_openat(struct io_kiocb *req, unsigned int issue_flags)
>  	return io_openat2(req, issue_flags);
>  }
>  
> +static int io_bpf_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
> +{
> +	struct io_ring_ctx *ctx = req->ctx;
> +	struct bpf_prog *prog;
> +	unsigned int idx;
> +
> +	if (unlikely(ctx->flags & (IORING_SETUP_IOPOLL|IORING_SETUP_SQPOLL)))
> +		return -EINVAL;
> +	if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
> +		return -EINVAL;
> +	if (sqe->ioprio || sqe->len || sqe->cancel_flags)
> +		return -EINVAL;
> +	if (sqe->addr)
> +		return -EINVAL;
> +
> +	idx = READ_ONCE(sqe->off);
> +	if (unlikely(idx >= ctx->nr_bpf_progs))
> +		return -EFAULT;
> +	idx = array_index_nospec(idx, ctx->nr_bpf_progs);
> +	prog = ctx->bpf_progs[idx].prog;
> +	if (!prog)
> +		return -EFAULT;
> +
> +	req->bpf.prog = prog;
> +	return 0;
> +}
> +
> +static void io_bpf_run_task_work(struct callback_head *cb)
> +{
> +	struct io_kiocb *req = container_of(cb, struct io_kiocb, task_work);
> +	struct io_ring_ctx *ctx = req->ctx;
> +
> +	mutex_lock(&ctx->uring_lock);
> +	io_bpf_run(req, 0);
> +	mutex_unlock(&ctx->uring_lock);
> +}
> +
> +static int io_bpf(struct io_kiocb *req, unsigned int issue_flags)
> +{
> +	init_task_work(&req->task_work, io_bpf_run_task_work);
> +	if (unlikely(io_req_task_work_add(req))) {
> +		req_ref_get(req);
> +		io_req_task_queue_fail(req, -ECANCELED);
> +	}
> +	return 0;
> +}
> +
>  static int io_remove_buffers_prep(struct io_kiocb *req,
>  				  const struct io_uring_sqe *sqe)
>  {
> @@ -6002,6 +6060,8 @@ static int io_req_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  		return io_renameat_prep(req, sqe);
>  	case IORING_OP_UNLINKAT:
>  		return io_unlinkat_prep(req, sqe);
> +	case IORING_OP_BPF:
> +		return io_bpf_prep(req, sqe);
>  	}
>  
>  	printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
> @@ -6269,6 +6329,9 @@ static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)
>  	case IORING_OP_UNLINKAT:
>  		ret = io_unlinkat(req, issue_flags);
>  		break;
> +	case IORING_OP_BPF:
> +		ret = io_bpf(req, issue_flags);
> +		break;
>  	default:
>  		ret = -EINVAL;
>  		break;
> @@ -10303,6 +10366,35 @@ const struct bpf_verifier_ops bpf_io_uring_verifier_ops = {
>  	.is_valid_access	= io_bpf_is_valid_access,
>  };
>  
> +static void io_bpf_run(struct io_kiocb *req, unsigned int issue_flags)
> +{
> +	struct io_ring_ctx *ctx = req->ctx;
> +	struct io_bpf_ctx bpf_ctx;
> +	struct bpf_prog *prog;
> +	int ret = -EAGAIN;
> +
> +	lockdep_assert_held(&req->ctx->uring_lock);
> +
> +	if (unlikely(percpu_ref_is_dying(&ctx->refs) ||
> +		     atomic_read(&req->task->io_uring->in_idle)))
> +		goto done;
> +
> +	memset(&bpf_ctx, 0, sizeof(bpf_ctx));
> +	prog = req->bpf.prog;
> +
> +	if (prog->aux->sleepable) {

Looks like I forgot to amend; the condition should be inverted.
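
I.e., presumably something along these lines (just a sketch, assuming
the RCU section is only needed for the non-sleepable case):

	if (!prog->aux->sleepable) {
		/* non-sleepable progs must run under RCU */
		rcu_read_lock();
		bpf_prog_run_pin_on_cpu(req->bpf.prog, &bpf_ctx);
		rcu_read_unlock();
	} else {
		bpf_prog_run_pin_on_cpu(req->bpf.prog, &bpf_ctx);
	}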

> +		rcu_read_lock();
> +		bpf_prog_run_pin_on_cpu(req->bpf.prog, &bpf_ctx);
> +		rcu_read_unlock();
> +	} else {
> +		bpf_prog_run_pin_on_cpu(req->bpf.prog, &bpf_ctx);
> +	}
> +
> +	ret = 0;
> +done:
> +	__io_req_complete(req, issue_flags, ret, 0);
> +}
> +
>  SYSCALL_DEFINE4(io_uring_register, unsigned int, fd, unsigned int, opcode,
>  		void __user *, arg, unsigned int, nr_args)
>  {
> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> index b450f41d7389..25ab804670e1 100644
> --- a/include/uapi/linux/io_uring.h
> +++ b/include/uapi/linux/io_uring.h
> @@ -138,6 +138,7 @@ enum {
>  	IORING_OP_SHUTDOWN,
>  	IORING_OP_RENAMEAT,
>  	IORING_OP_UNLINKAT,
> +	IORING_OP_BPF,
>  
>  	/* this goes last, obviously */
>  	IORING_OP_LAST,
> 

-- 
Pavel Begunkov

* Re: [PATCH 13/23] io_uring: implement bpf prog registration
  2021-05-20 23:45   ` Song Liu
@ 2021-05-21  0:43     ` Pavel Begunkov
  0 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-21  0:43 UTC (permalink / raw)
  To: Song Liu
  Cc: io-uring, Networking, bpf, linux-kernel, Jens Axboe,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin Lau,
	Yonghong Song, John Fastabend, KP Singh, Horst Schirmeier,
	Franz-B . Tuneke, Christian Dietrich

On 5/21/21 12:45 AM, Song Liu wrote:
>> On May 19, 2021, at 7:13 AM, Pavel Begunkov <asml.silence@gmail.com> wrote:
>>
>> [de]register BPF programs through io_uring_register() with new
>> IORING_ATTACH_BPF and IORING_DETACH_BPF commands.
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>> ---
>> fs/io_uring.c                 | 81 +++++++++++++++++++++++++++++++++++
>> include/uapi/linux/io_uring.h |  2 +
>> 2 files changed, 83 insertions(+)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 882b16b5e5eb..b13cbcd5c47b 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -78,6 +78,7 @@
>> #include <linux/task_work.h>
>> #include <linux/pagemap.h>
>> #include <linux/io_uring.h>
>> +#include <linux/bpf.h>
>>
>> #define CREATE_TRACE_POINTS
>> #include <trace/events/io_uring.h>
>> @@ -103,6 +104,8 @@
>> #define IORING_MAX_RESTRICTIONS	(IORING_RESTRICTION_LAST + \
>> 				 IORING_REGISTER_LAST + IORING_OP_LAST)
>>
>> +#define IORING_MAX_BPF_PROGS	100
> 
> Is 100 a realistic number here? 

Arbitrary test value, will update

> 
>> +
>> #define SQE_VALID_FLAGS	(IOSQE_FIXED_FILE|IOSQE_IO_DRAIN|IOSQE_IO_LINK|	\
>> 				IOSQE_IO_HARDLINK | IOSQE_ASYNC | \
>> 				IOSQE_BUFFER_SELECT)
>> @@ -266,6 +269,10 @@ struct io_restriction {
>> 	bool registered;
>> };
>>
> 
> [...]
> 

-- 
Pavel Begunkov

* Re: [PATCH 12/23] bpf: add IOURING program type
  2021-05-20 23:34   ` Song Liu
@ 2021-05-21  0:56     ` Pavel Begunkov
  0 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-21  0:56 UTC (permalink / raw)
  To: Song Liu
  Cc: io-uring, Networking,
	open list:BPF (Safe dynamic programs and tools),
	Linux Kernel Mailing List, Jens Axboe, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin Lau, Yonghong Song,
	John Fastabend, KP Singh, Horst Schirmeier, Franz-B . Tuneke,
	Christian Dietrich

On 5/21/21 12:34 AM, Song Liu wrote:
>> On May 19, 2021, at 7:13 AM, Pavel Begunkov <asml.silence@gmail.com> wrote:
>>
>> Draft a new program type BPF_PROG_TYPE_IOURING, which will be used by
>> io_uring to execute BPF-based requests.
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>> ---
>> fs/io_uring.c             | 21 +++++++++++++++++++++
>> include/linux/bpf_types.h |  2 ++
>> include/uapi/linux/bpf.h  |  1 +
>> kernel/bpf/syscall.c      |  1 +
>> kernel/bpf/verifier.c     |  5 ++++-
>> 5 files changed, 29 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 1a4c9e513ac9..882b16b5e5eb 100644
[...]
>> +BPF_PROG_TYPE(BPF_PROG_TYPE_IOURING, bpf_io_uring,
>> +	      void *, void *)
>>
>> BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY, array_map_ops)
>> BPF_MAP_TYPE(BPF_MAP_TYPE_PERCPU_ARRAY, percpu_array_map_ops)
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 4ba4ef0ff63a..de544f0fbeef 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -206,6 +206,7 @@ enum bpf_prog_type {
>> 	BPF_PROG_TYPE_EXT,
>> 	BPF_PROG_TYPE_LSM,
>> 	BPF_PROG_TYPE_SK_LOOKUP,
>> +	BPF_PROG_TYPE_IOURING,
>> };
>>
>> enum bpf_attach_type {
>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
>> index 250503482cda..6ef7a26f4dc3 100644
>> --- a/kernel/bpf/syscall.c
>> +++ b/kernel/bpf/syscall.c
>> @@ -2041,6 +2041,7 @@ static bool is_net_admin_prog_type(enum bpf_prog_type prog_type)
>> 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
>> 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
>> 	case BPF_PROG_TYPE_SOCK_OPS:
>> +	case BPF_PROG_TYPE_IOURING:
>> 	case BPF_PROG_TYPE_EXT: /* extends any prog */
>> 		return true;
>> 	case BPF_PROG_TYPE_CGROUP_SKB:
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 0399ac092b36..2a53f44618a7 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -8558,6 +8558,9 @@ static int check_return_code(struct bpf_verifier_env *env)
>> 	case BPF_PROG_TYPE_SK_LOOKUP:
>> 		range = tnum_range(SK_DROP, SK_PASS);
>> 		break;
>> +	case BPF_PROG_TYPE_IOURING:
>> +		range = tnum_const(0);
>> +		break;
>> 	case BPF_PROG_TYPE_EXT:
>> 		/* freplace program can return anything as its return value
>> 		 * depends on the to-be-replaced kernel func or bpf program.
>> @@ -12560,7 +12563,7 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
>> 	u64 key;
>>
>> 	if (prog->aux->sleepable && prog->type != BPF_PROG_TYPE_TRACING &&
>> -	    prog->type != BPF_PROG_TYPE_LSM) {
>> +	    prog->type != BPF_PROG_TYPE_LSM && prog->type != BPF_PROG_TYPE_IOURING) {
> 
> Is IOURING program sleepable? If so, please highlight that in the commit log 

It's supposed to work with both, sleepable and not, but with different
sets of helpers, e.g. it can't submit requests nor do
bpf_copy_from_user() if it can't sleep. The only other difference in
handling is rcu around non-sleepable, but please shout out if I forgot
anything.

> and update the warning below. 

Sure

> 
>> 		verbose(env, "Only fentry/fexit/fmod_ret and lsm programs can be sleepable\n");
>> 		return -EINVAL;
>> 	}
>> -- 
>> 2.31.1
>>
> 

-- 
Pavel Begunkov

* Re: [RFC v2 00/23] io_uring BPF requests
  2021-05-21  0:35 ` [RFC v2 00/23] io_uring BPF requests Song Liu
@ 2021-05-21  0:58   ` Pavel Begunkov
  0 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-21  0:58 UTC (permalink / raw)
  To: Song Liu
  Cc: io-uring, Networking, bpf, linux-kernel, Jens Axboe,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin Lau,
	Yonghong Song, John Fastabend, KP Singh, Horst Schirmeier,
	Franz-B . Tuneke, Christian Dietrich

On 5/21/21 1:35 AM, Song Liu wrote:
>> On May 19, 2021, at 7:13 AM, Pavel Begunkov <asml.silence@gmail.com> wrote:
>> The main problem solved is feeding completion information of other
>> requests in a form of CQEs back into BPF. I decided to wire up support
>> for multiple completion queues (aka CQs) and give BPF programs access to
>> them, so leaving userspace in control over synchronisation that should
>> be much more flexible that the link-based approach.
>>
>> For instance, there can be a separate CQ for each BPF program, so no
>> extra sync is needed, and communication can be done by submitting a
>> request targeting a neighboring CQ or submitting a CQE there directly
>> (see test3 below). CQ is choosen by sqe->cq_idx, so everyone can
>> cross-fire if willing.
>>
> 
> [...]
> 
>>  bpf: add IOURING program type
>>  io_uring: implement bpf prog registration
>>  io_uring: add support for bpf requests
>>  io_uring: enable BPF to submit SQEs
>>  io_uring: enable bpf to submit CQEs
>>  io_uring: enable bpf to reap CQEs
>>  libbpf: support io_uring
>>  io_uring: pass user_data to bpf executor
>>  bpf: Add bpf_copy_to_user() helper
>>  io_uring: wire bpf copy to user
>>  io_uring: don't wait on CQ exclusively
>>  io_uring: enable bpf reqs to wait for CQs
> 
> Aside from a few comments, these BPF-related patches look sane to me.
> Please consider adding some selftests (tools/testing/selftests/bpf).

The comments are noted. Thanks, Song

-- 
Pavel Begunkov

* Re: [PATCH 15/23] io_uring: enable BPF to submit SQEs
  2021-05-19 14:13 ` [PATCH 15/23] io_uring: enable BPF to submit SQEs Pavel Begunkov
  2021-05-21  0:06   ` Song Liu
@ 2021-05-21  1:07   ` Alexei Starovoitov
  2021-05-21  9:33     ` Pavel Begunkov
  1 sibling, 1 reply; 39+ messages in thread
From: Alexei Starovoitov @ 2021-05-21  1:07 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: io-uring, netdev, bpf, linux-kernel, Jens Axboe,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

On Wed, May 19, 2021 at 03:13:26PM +0100, Pavel Begunkov wrote:
>  
> +BPF_CALL_3(io_bpf_queue_sqe, struct io_bpf_ctx *,		bpf_ctx,
> +			     const struct io_uring_sqe *,	sqe,
> +			     u32,				sqe_len)
> +{
> +	struct io_ring_ctx *ctx = bpf_ctx->ctx;
> +	struct io_kiocb *req;
> +
> +	if (sqe_len != sizeof(struct io_uring_sqe))
> +		return -EINVAL;
> +
> +	req = io_alloc_req(ctx);

that is GFP_KERNEL allocation.
It's only allowed from sleepable bpf progs and further down
there is a correct check for it, so all good.
But submitting an sqe is a fundamental io_uring operation,
so what is the use case for non-sleepable?
In other words why bother? Allow sleepable only and simplify the code?

> +	if (unlikely(!req))
> +		return -ENOMEM;
> +	if (!percpu_ref_tryget_many(&ctx->refs, 1)) {
> +		kmem_cache_free(req_cachep, req);
> +		return -EAGAIN;
> +	}
> +	percpu_counter_add(&current->io_uring->inflight, 1);
> +	refcount_add(1, &current->usage);
> +
> +	/* returns number of submitted SQEs or an error */
> +	return !io_submit_sqe(ctx, req, sqe);

A buggy bpf prog will be able to pass junk sizeof(struct io_uring_sqe)
as 'sqe' here.
What kind of validation io_submit_sqe() does to avoid crashing the kernel?

General comments that apply to all patches:
- commit logs are way too terse. Pls expand with details.
- describe new bpf helpers in comments in bpf.h (see the sketch after this list). Just adding them to an enum is not enough.
- selftest/bpf are mandatory for all new bpf features.
- consider bpf_link style of attaching bpf progs. We had enough issues with progs
  that get stuck due to application bugs. Auto-detach saves the day more often than not.
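
On the bpf.h point: a sketch of what a description entry in the big
comment block of include/uapi/linux/bpf.h could look like (wording is
made up, only the signature follows io_bpf_queue_sqe_proto):

 * long bpf_iouring_queue_sqe(void *ctx, const struct io_uring_sqe *sqe, u32 sqe_len)
 *	Description
 *		Submit *sqe* as a new request to the io_uring instance the
 *		program is attached to. *sqe_len* must be
 *		sizeof(struct io_uring_sqe).
 *	Return
 *		The number of submitted requests (currently 0 or 1), or a
 *		negative error in case of failure.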

* Re: [PATCH 15/23] io_uring: enable BPF to submit SQEs
  2021-05-21  1:07   ` Alexei Starovoitov
@ 2021-05-21  9:33     ` Pavel Begunkov
  0 siblings, 0 replies; 39+ messages in thread
From: Pavel Begunkov @ 2021-05-21  9:33 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: io-uring, netdev, bpf, linux-kernel, Jens Axboe,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Horst Schirmeier, Franz-B . Tuneke, Christian Dietrich

On 5/21/21 2:07 AM, Alexei Starovoitov wrote:
> On Wed, May 19, 2021 at 03:13:26PM +0100, Pavel Begunkov wrote:
>>  
>> +BPF_CALL_3(io_bpf_queue_sqe, struct io_bpf_ctx *,		bpf_ctx,
>> +			     const struct io_uring_sqe *,	sqe,
>> +			     u32,				sqe_len)
>> +{
>> +	struct io_ring_ctx *ctx = bpf_ctx->ctx;
>> +	struct io_kiocb *req;
>> +
>> +	if (sqe_len != sizeof(struct io_uring_sqe))
>> +		return -EINVAL;
>> +
>> +	req = io_alloc_req(ctx);
> 
> that is GFP_KERNEL allocation.
> It's only allowed from sleepable bpf progs and further down
> there is a correct check for it, so all good.
> But submitting an sqe is a fundamental io_uring operation,
> so what is the use case for non-sleepable?
> In other words why bother? Allow sleepable only and simplify the code?

Actual submission may be moved out of BPF, hence it's enabled for both,
but the question I wonder about is what the plans are for sleepable
programs. E.g. if it stays a marginal feature much limited in
functionality (iirc it's not allowed to use some BPF data types), it
may not be worth doing.

> 
>> +	if (unlikely(!req))
>> +		return -ENOMEM;
>> +	if (!percpu_ref_tryget_many(&ctx->refs, 1)) {
>> +		kmem_cache_free(req_cachep, req);
>> +		return -EAGAIN;
>> +	}
>> +	percpu_counter_add(&current->io_uring->inflight, 1);
>> +	refcount_add(1, &current->usage);
>> +
>> +	/* returns number of submitted SQEs or an error */
>> +	return !io_submit_sqe(ctx, req, sqe);
> 
> A buggy bpf prog will be able to pass junk sizeof(struct io_uring_sqe)
> as 'sqe' here.
> What kind of validation io_submit_sqe() does to avoid crashing the kernel?

It works on memory rw shared with userspace, so it already assumes
the worst
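
For reference, a rough sketch of the kind of checks the normal
submission path already applies to an untrusted sqe (simplified, the
real validation is spread across io_init_req() and the per-opcode
->prep() handlers):

	/* sqe points into memory shared with userspace */
	u8 opcode = READ_ONCE(sqe->opcode);

	if (unlikely(opcode >= IORING_OP_LAST))
		return -EINVAL;
	/* flags/personality are also read with READ_ONCE() and checked,
	 * and ->prep() validates the remaining per-opcode fields before
	 * anything is executed */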
 
> General comments that apply to all patches:
> - commit logs are way too terse. Pls expand with details.
> - describe new bpf helpers in comments in bpf.h. Just adding them to an enum is not enough.
> - selftest/bpf are mandatory for all new bpf features.
> - consider bpf_link style of attaching bpf progs. We had enough issues with progs
>   that get stuck due to application bugs. Auto-detach saves the day more often than not.

Thanks for taking a look! I have no idea what bpf_link is, need
to check it out

-- 
Pavel Begunkov

end of thread

Thread overview: 39+ messages
2021-05-19 14:13 [RFC v2 00/23] io_uring BPF requests Pavel Begunkov
2021-05-19 14:13 ` [PATCH 01/23] io_uring: shuffle rarely used ctx fields Pavel Begunkov
2021-05-20 21:46   ` Song Liu
2021-05-20 22:46     ` Pavel Begunkov
2021-05-19 14:13 ` [PATCH 02/23] io_uring: localise fixed resources fields Pavel Begunkov
2021-05-19 14:13 ` [PATCH 03/23] io_uring: remove dependency on ring->sq/cq_entries Pavel Begunkov
2021-05-19 14:13 ` [PATCH 04/23] io_uring: deduce cq_mask from cq_entries Pavel Begunkov
2021-05-19 14:13 ` [PATCH 05/23] io_uring: kill cached_cq_overflow Pavel Begunkov
2021-05-19 14:13 ` [PATCH 06/23] io_uring: rename io_get_cqring Pavel Begunkov
2021-05-19 14:13 ` [PATCH 07/23] io_uring: extract struct for CQ Pavel Begunkov
2021-05-19 14:13 ` [PATCH 08/23] io_uring: internally pass CQ indexes Pavel Begunkov
2021-05-19 14:13 ` [PATCH 09/23] io_uring: extract cq size helper Pavel Begunkov
2021-05-19 14:13 ` [PATCH 10/23] io_uring: add support for multiple CQs Pavel Begunkov
2021-05-19 14:13 ` [PATCH 11/23] io_uring: enable mmap'ing additional CQs Pavel Begunkov
2021-05-19 14:13 ` [PATCH 12/23] bpf: add IOURING program type Pavel Begunkov
2021-05-20 23:34   ` Song Liu
2021-05-21  0:56     ` Pavel Begunkov
2021-05-19 14:13 ` [PATCH 13/23] io_uring: implement bpf prog registration Pavel Begunkov
2021-05-20 23:45   ` Song Liu
2021-05-21  0:43     ` Pavel Begunkov
2021-05-19 14:13 ` [PATCH 14/23] io_uring: add support for bpf requests Pavel Begunkov
2021-05-21  0:42   ` Pavel Begunkov
2021-05-19 14:13 ` [PATCH 15/23] io_uring: enable BPF to submit SQEs Pavel Begunkov
2021-05-21  0:06   ` Song Liu
2021-05-21  1:07   ` Alexei Starovoitov
2021-05-21  9:33     ` Pavel Begunkov
2021-05-19 14:13 ` [PATCH 16/23] io_uring: enable bpf to submit CQEs Pavel Begunkov
2021-05-19 14:13 ` [PATCH 17/23] io_uring: enable bpf to reap CQEs Pavel Begunkov
2021-05-19 14:13 ` [PATCH 18/23] libbpf: support io_uring Pavel Begunkov
2021-05-19 17:38   ` Andrii Nakryiko
2021-05-20  9:58     ` Pavel Begunkov
2021-05-20 17:23       ` Andrii Nakryiko
2021-05-19 14:13 ` [PATCH 19/23] io_uring: pass user_data to bpf executor Pavel Begunkov
2021-05-19 14:13 ` [PATCH 20/23] bpf: Add bpf_copy_to_user() helper Pavel Begunkov
2021-05-19 14:13 ` [PATCH 21/23] io_uring: wire bpf copy to user Pavel Begunkov
2021-05-19 14:13 ` [PATCH 22/23] io_uring: don't wait on CQ exclusively Pavel Begunkov
2021-05-19 14:13 ` [PATCH 23/23] io_uring: enable bpf reqs to wait for CQs Pavel Begunkov
2021-05-21  0:35 ` [RFC v2 00/23] io_uring BPF requests Song Liu
2021-05-21  0:58   ` Pavel Begunkov
