From: Alexei Starovoitov <ast@plumgrid.com>
To: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Michael Holzheu <holzheu@linux.vnet.ibm.com>,
Zi Shen Lim <zlim.lnx@gmail.com>,
linux-api@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: [PATCH net-next 0/4] bpf: introduce bpf_tail_call() helper
Date: Tue, 19 May 2015 16:59:02 -0700
Message-ID: <1432079946-9878-1-git-send-email-ast@plumgrid.com>
Hi All,

This series introduces the bpf_tail_call(ctx, &jmp_table, index) helper
function, which can be used from BPF programs like:
int bpf_prog(struct pt_regs *ctx)
{
	...
	bpf_tail_call(ctx, &jmp_table, index);
	...
}
that is roughly equivalent to:
int bpf_prog(struct pt_regs *ctx)
{
	...
	if (jmp_table[index])
		return (*jmp_table[index])(ctx);
	...
}
The important detail is that it's not a normal call, but a tail call.
The kernel stack is precious, so this helper reuses the current
stack frame and jumps into another BPF program without adding
an extra call frame.
It's trivially done in the interpreter and a bit trickier in JITs.
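
To make the stack-frame point concrete, here is a minimal user-space sketch. It is not kernel code and not what the patches implement; `depth_stats`, `chain_nested()` and `chain_tail()` are hypothetical names invented for illustration. It only counts how many model frames are live when a chain of programs is run through normal nested calls versus through a jump that reuses the one existing frame.

```c
#include <assert.h>

/* Tracks how many model "frames" are live and the deepest point reached. */
struct depth_stats {
	int cur;
	int max;
};

void frame_enter(struct depth_stats *d)
{
	d->cur++;
	if (d->cur > d->max)
		d->max = d->cur;
}

void frame_leave(struct depth_stats *d)
{
	assert(d->cur > 0);
	d->cur--;
}

/* Normal call: every hop to the next program nests inside its caller. */
int chain_nested(int hops, struct depth_stats *d)
{
	int executed = 1;

	frame_enter(d);
	if (hops > 0)
		executed += chain_nested(hops - 1, d);
	frame_leave(d);
	return executed;
}

/* Tail call: the next program replaces the current one in the same frame. */
int chain_tail(int hops, struct depth_stats *d)
{
	int executed = 0;

	frame_enter(d);		/* the single frame shared by the whole chain */
	for (;;) {
		executed++;	/* body of the current program runs here */
		if (hops == 0)
			break;
		hops--;		/* "jump" to the next program, same frame */
	}
	frame_leave(d);
	return executed;
}
```

In this model, a chain of five hops run with chain_nested() reaches a depth of six frames, while chain_tail() runs the same six program bodies at a depth of one.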
Use cases:
- simplify complex programs
- dispatch into other programs
(for example: index in jump table can be syscall number or network protocol)
- build dynamic chains of programs
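
As an illustration of the dispatch use case, the following user-space sketch indexes a table of handlers by IP protocol number. It is a plain C model and assumes nothing about the real map API; `struct pkt`, `proto_handler`, `handle_tcp()`, `handle_udp()` and `dispatch()` are hypothetical names. An empty slot makes the caller fall through instead of failing, mirroring the `if (jmp_table[index])` check in the equivalent code above.

```c
#include <assert.h>
#include <stddef.h>

#define PROTO_TABLE_SIZE 256	/* one slot per 8-bit IP protocol number */
#define PROTO_TCP 6		/* IANA protocol number for TCP */
#define PROTO_UDP 17		/* IANA protocol number for UDP */
#define NOT_DISPATCHED (-1)	/* hypothetical fall-through result */

/* Hypothetical stand-in for the context handed to each program. */
struct pkt {
	unsigned char proto;
	int len;
};

typedef int (*proto_handler)(const struct pkt *p);

int handle_tcp(const struct pkt *p)
{
	return 100 + p->len;
}

int handle_udp(const struct pkt *p)
{
	return 200 + p->len;
}

/* Jump to the handler for this protocol, or fall through if none is set. */
int dispatch(const proto_handler *table, size_t size, const struct pkt *p)
{
	assert(table != NULL && p != NULL);

	if ((size_t)p->proto < size && table[p->proto])
		return table[p->proto](p);

	return NOT_DISPATCHED;
}
```

Populating only the TCP and UDP slots is enough: any other protocol number hits an empty slot and dispatch() returns NOT_DISPATCHED, so handlers can be added or removed at run time without changing the dispatching code.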
The chain of tail calls can form unpredictable dynamic loops, therefore
tail_call_cnt is used to limit the number of calls; it is currently set to 32.
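
The effect of the limit can be sketched in user space as follows. This is a simplified model under stated assumptions, not the interpreter code from patch 1: `model_prog`, `prog_loop_forever()` and `run_with_limit()` are hypothetical, a program requests a tail call by writing a slot index to `*next`, and the exact comparison against the limit is this model's choice. Only the value 32 and the name tail_call_cnt come from the text above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TAIL_CALL_CNT 32	/* limit quoted in the cover letter */
#define NO_TAIL_CALL (-1)	/* hypothetical "no jump requested" marker */

struct model_ctx {
	int runs;	/* how many program bodies have executed */
};

/* A model program: returns a value and may request a jump via *next. */
typedef int (*model_prog)(struct model_ctx *ctx, int *next);

/* Always asks to tail-call slot 0, i.e. itself when installed there. */
int prog_loop_forever(struct model_ctx *ctx, int *next)
{
	ctx->runs++;
	*next = 0;
	return ctx->runs;
}

/*
 * Run entry and follow tail-call requests in a loop. A request is
 * ignored (the current program's result is returned) when the index
 * is out of range, the budget is used up, or the slot is empty.
 */
int run_with_limit(model_prog entry, const model_prog *table, size_t n,
		   struct model_ctx *ctx)
{
	model_prog cur = entry;
	unsigned int tail_call_cnt = 0;

	assert(entry != NULL && ctx != NULL);

	for (;;) {
		int next = NO_TAIL_CALL;
		int ret = cur(ctx, &next);

		if (next < 0 || (size_t)next >= n)
			return ret;
		if (tail_call_cnt >= MAX_TAIL_CALL_CNT)
			return ret;
		if (table[next] == NULL)
			return ret;

		tail_call_cnt++;
		cur = table[next];
	}
}
```

With prog_loop_forever() installed in slot 0 and also used as the entry program, this model stops after the entry run plus 32 tail calls instead of spinning forever.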
patch 1 - support bpf_tail_call() in interpreter
patch 2 - support in x64 JIT
We've discussed what's necessary to support it in arm64/s390 JITs
and it looks fine.
patch 3 - sample example for tracing
patch 4 - sample example for networking
More details in every patch.
This set went through several iterations of reviews/fixes, and older
attempts can be seen at:
https://git.kernel.org/cgit/linux/kernel/git/ast/bpf.git/log/?h=tail_call_v[123456]
- tail_call_v1 does it without touching JITs but introduces overhead
for all programs that don't use this helper function.
- tail_call_v2 still has some overhead and the x64 JIT does a full stack
unwind (the prologue-skipping optimization wasn't there yet)
- tail_call_v3 reuses the 'call' instruction encoding and has interpreter
overhead for every normal call
- tail_call_v4 fixes the above architectural shortcomings, and v5 and v6
fix a few more bugs
This last tail_call_v6 approach seems to be the best.
Alexei Starovoitov (4):
  bpf: allow bpf programs to tail-call other bpf programs
  x86: bpf_jit: implement bpf_tail_call() helper
  samples/bpf: bpf_tail_call example for tracing
  samples/bpf: bpf_tail_call example for networking
 arch/x86/net/bpf_jit_comp.c |  150 +++++++++++++++++----
 include/linux/bpf.h         |   22 ++++
 include/linux/filter.h      |    2 +-
 include/uapi/linux/bpf.h    |   10 ++
 kernel/bpf/arraymap.c       |  113 +++++++++++++++-
 kernel/bpf/core.c           |   73 ++++++++++-
 kernel/bpf/syscall.c        |   23 +++-
 kernel/bpf/verifier.c       |   17 +++
 kernel/trace/bpf_trace.c    |    2 +
 net/core/filter.c           |    2 +
 samples/bpf/Makefile        |    8 ++
 samples/bpf/bpf_helpers.h   |    4 +
 samples/bpf/bpf_load.c      |   57 ++++++--
 samples/bpf/sockex3_kern.c  |  303 +++++++++++++++++++++++++++++++++++++++++++
 samples/bpf/sockex3_user.c  |   66 ++++++++++
 samples/bpf/tracex5_kern.c  |   75 +++++++++++
 samples/bpf/tracex5_user.c  |   46 +++++++
 17 files changed, 928 insertions(+), 45 deletions(-)
create mode 100644 samples/bpf/sockex3_kern.c
create mode 100644 samples/bpf/sockex3_user.c
create mode 100644 samples/bpf/tracex5_kern.c
create mode 100644 samples/bpf/tracex5_user.c
--
1.7.9.5