linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] bpf: sk_lookup: Add user documentation
@ 2020-08-21 10:02 Jakub Sitnicki
  2020-08-24 21:47 ` Alexei Starovoitov
  0 siblings, 1 reply; 2+ messages in thread
From: Jakub Sitnicki @ 2020-08-21 10:02 UTC (permalink / raw)
  To: linux-doc
  Cc: netdev, bpf, linux-kernel, kernel-team, Jonathan Corbet,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, Andrii Nakryiko, John Fastabend, KP Singh

Describe the purpose of BPF sk_lookup program, how it can be attached, when
it gets invoked, and what information gets passed to it. Point the reader
to examples and further documentation.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 Documentation/bpf/index.rst          |  1 +
 Documentation/bpf/prog_sk_lookup.rst | 98 ++++++++++++++++++++++++++++
 2 files changed, 99 insertions(+)
 create mode 100644 Documentation/bpf/prog_sk_lookup.rst

diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst
index 7df2465fd108..4f2874b729c3 100644
--- a/Documentation/bpf/index.rst
+++ b/Documentation/bpf/index.rst
@@ -52,6 +52,7 @@ Program types
    prog_cgroup_sysctl
    prog_flow_dissector
    bpf_lsm
+   prog_sk_lookup
 
 
 Map types
diff --git a/Documentation/bpf/prog_sk_lookup.rst b/Documentation/bpf/prog_sk_lookup.rst
new file mode 100644
index 000000000000..85a305c19bcd
--- /dev/null
+++ b/Documentation/bpf/prog_sk_lookup.rst
@@ -0,0 +1,98 @@
+.. SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+
+=====================
+BPF sk_lookup program
+=====================
+
+BPF sk_lookup program type (``BPF_PROG_TYPE_SK_LOOKUP``) introduces programmability
+into the socket lookup performed by the transport layer when a packet is to be
+delivered locally.
+
+When invoked BPF sk_lookup program can select a socket that will receive the
+incoming packet by calling the ``bpf_sk_assign()`` BPF helper function.
+
+Hooks for a common attach point (``BPF_SK_LOOKUP``) exist for both TCP and UDP.
+
+Motivation
+==========
+
+BPF sk_lookup program type was introduced to address setup scenarios where
+binding sockets to an address with ``bind()`` socket call is impractical, such
+as:
+
+1. receiving connections on a range of IP addresses, e.g. 192.0.2.0/24, when
+   binding to a wildcard address ``INADRR_ANY`` is not possible due to a port
+   conflict,
+2. receiving connections on all or a wide range of ports, i.e. an L7 proxy use
+   case.
+
+Such setups would require creating and ``bind()``'ing one socket to each of the
+IP address/port in the range, leading to resource consumption and potential
+latency spikes during socket lookup.
+
+Attachment
+==========
+
+BPF sk_lookup program can be attached to a network namespace with
+``bpf(BPF_LINK_CREATE, ...)`` syscall using the ``BPF_SK_LOOKUP`` attach type and a
+netns FD as attachment ``target_fd``.
+
+Multiple programs can be attached to one network namespace. Programs will be
+invoked in the same order as they were attached.
+
+Hooks
+=====
+
+The attached BPF sk_lookup programs run whenever the transport layer needs to
+find a listening (TCP) or an unconnected (UDP) socket for an incoming packet.
+
+Incoming traffic to established (TCP) and connected (UDP) sockets is delivered
+as usual without triggering the BPF sk_lookup hook.
+
+The attached BPF programs must return with either ``SK_PASS`` or ``SK_DROP``
+verdict code. As for other BPF program types that are network filters,
+``SK_PASS`` signifies that the socket lookup should continue on to regular
+hashtable-based lookup, while ``SK_DROP`` causes the transport layer to drop the
+packet.
+
+A BPF sk_lookup program can also select a socket to receive the packet by
+calling ``bpf_sk_assign()`` BPF helper. Typically, the program looks up a socket
+in a map holding sockets, such as ``SOCKMAP`` or ``SOCKHASH``, and passes a
+``struct bpf_sock *`` to ``bpf_sk_assign()`` helper to record the
+selection. Selecting a socket only takes effect if the program has terminated
+with ``SK_PASS`` code.
+
+When multiple programs are attached, the end result is determined from return
+codes of all the programs according to the following rules:
+
+1. If any program returned ``SK_PASS`` and selected a valid socket, the socket
+   is used as the result of the socket lookup.
+2. If more than one program returned ``SK_PASS`` and selected a socket, the last
+   selection takes effect.
+3. If any program returned ``SK_DROP``, and no program returned ``SK_PASS`` and
+   selected a socket, socket lookup fails.
+4. If all programs returned ``SK_PASS`` and none of them selected a socket,
+   socket lookup continues on.
+
+API
+===
+
+In its context, an instance of ``struct bpf_sk_lookup``, BPF sk_lookup program
+receives information about the packet that triggered the socket lookup. Namely:
+
+* IP version (``AF_INET`` or ``AF_INET6``),
+* L4 protocol identifier (``IPPROTO_TCP`` or ``IPPROTO_UDP``),
+* source and destination IP address,
+* source and destination L4 port,
+* the socket that has been selected with ``bpf_sk_assign()``.
+
+Refer to ``struct bpf_sk_lookup`` declaration in ``linux/bpf.h`` user API
+header, and `bpf-helpers(7)
+<https://man7.org/linux/man-pages/man7/bpf-helpers.7.html>`_ man-page section
+for ``bpf_sk_assign()`` for details.
+
+Example
+=======
+
+See ``tools/testing/selftests/bpf/prog_tests/sk_lookup.c`` for the reference
+implementation.
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] bpf: sk_lookup: Add user documentation
  2020-08-21 10:02 [PATCH] bpf: sk_lookup: Add user documentation Jakub Sitnicki
@ 2020-08-24 21:47 ` Alexei Starovoitov
  0 siblings, 0 replies; 2+ messages in thread
From: Alexei Starovoitov @ 2020-08-24 21:47 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: open list:DOCUMENTATION, Network Development, bpf, LKML,
	kernel-team, Jonathan Corbet, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	Andrii Nakryiko, John Fastabend, KP Singh

On Fri, Aug 21, 2020 at 3:02 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> Describe the purpose of BPF sk_lookup program, how it can be attached, when
> it gets invoked, and what information gets passed to it. Point the reader
> to examples and further documentation.
>
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>

Applied to bpf-next. Thanks

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2020-08-24 21:47 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-21 10:02 [PATCH] bpf: sk_lookup: Add user documentation Jakub Sitnicki
2020-08-24 21:47 ` Alexei Starovoitov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).