From: Florian Westphal <fw@strlen.de>
To: <netfilter-devel@vger.kernel.org>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org, me@ubique.spb.ru,
Florian Westphal <fw@strlen.de>
Subject: [PATCH RFC nf-next 0/9] netfilter: bpf base hook program generator
Date: Thu, 14 Oct 2021 14:10:36 +0200
Message-ID: <20211014121046.29329-1-fw@strlen.de>

This series adds a bpf program generator for netfilter base hooks.
Currently netfilter hooks are invoked via nf_hook_slow, which walks
an array of function_addr:arg pairs:
    for i in hooks[]; do
        verdict = hooks[i].addr(hooks[i].arg, skb, state);
        switch (verdict) { ....
The autogenerator unrolls this loop and builds a bpf program
that does:
    state->priv = hooks[0].hook_arg;
    v = firstfunction(state);
    if (v != ACCEPT) goto out;
    state->priv = hooks[1].hook_arg;
    v = secondfunction(state); ...
    if (v != ACCEPT) goto out;
    ... and so on.
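
To make the difference concrete, here is a minimal userspace sketch of both
shapes. It is not code from the series: struct pkt, struct hook_state,
struct hook_entry and the two toy hooks are hypothetical stand-ins, and the
hooks use the single-argument form from the unrolled pseudocode above. Only
the NF_DROP/NF_ACCEPT values are taken from the uapi header.

```c
#include <stddef.h>

/* Verdict values as defined in include/uapi/linux/netfilter.h. */
enum { NF_DROP = 0, NF_ACCEPT = 1 };

/* Hypothetical stand-ins for sk_buff and nf_hook_state. */
struct pkt { int len; };
struct hook_state { void *priv; struct pkt *skb; };

/* Single-argument hook signature, as in the unrolled pseudocode. */
typedef unsigned int hookfn(struct hook_state *state);

struct hook_entry { hookfn *addr; void *hook_arg; };

/* Toy hook: accepts everything and ignores its argument. */
unsigned int accept_all(struct hook_state *state)
{
	(void)state;
	return NF_ACCEPT;
}

/* Toy hook: drops packets shorter than the int that priv points to. */
unsigned int drop_short(struct hook_state *state)
{
	int min_len = *(int *)state->priv;

	return state->skb->len < min_len ? NF_DROP : NF_ACCEPT;
}

/* Interpreter shape: one indirect call per array entry. */
unsigned int run_loop(const struct hook_entry *hooks, size_t n,
		      struct hook_state *state)
{
	for (size_t i = 0; i < n; i++) {
		unsigned int v;

		state->priv = hooks[i].hook_arg;
		v = hooks[i].addr(state);
		if (v != NF_ACCEPT)
			return v;
	}
	return NF_ACCEPT;
}

/*
 * Unrolled shape for the fixed sequence { accept_all, drop_short }:
 * the arguments still come from the array, but the calls are direct.
 */
unsigned int run_unrolled(const struct hook_entry *hooks,
			  struct hook_state *state)
{
	unsigned int v;

	state->priv = hooks[0].hook_arg;
	v = accept_all(state);
	if (v != NF_ACCEPT)
		goto out;
	state->priv = hooks[1].hook_arg;
	v = drop_short(state);
	if (v != NF_ACCEPT)
		goto out;
out:
	return v;
}
```

For the same hook array, run_loop() and run_unrolled() return the same
verdict; only the call mechanism differs.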
Indirections are converted to direct calls. Invocation of the
autogenerated programs is done via bpf dispatcher from nf_hook().
As long as NF_QUEUE is not used, the normal data path no longer calls
the nf_hook_slow "interpreter".
The purpose of this is to eventually add a 'netfilter prog type' to bpf and
permit attachment of (userspace generated) bpf programs to the netfilter
machinery, e.g. 'attach bpf prog id 1234 to ipv6 PREROUTING at prio -300'.
The autogenerator would be adjusted so that these userspace-bpf programs
are invoked just like native c functions.
This will require exposing the context structure (the program argument,
'struct __nf_hook_state *'), rewriting read accesses to it to match the
internal nf_hook_state layout, and adding new verifier checks on permitted
return values (e.g. a plain 'return NF_STOLEN' results in an skb leak).
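
The NF_STOLEN problem can be illustrated with a small userspace model. This
is only a sketch under assumptions: struct pkt, live_pkts, pkt_alloc(),
pkt_free() and deliver() are hypothetical names, not kernel APIs. The one
property carried over from netfilter is that after NF_STOLEN the caller no
longer touches the packet, so a hook that returns NF_STOLEN without
consuming the packet leaves nobody responsible for freeing it.

```c
#include <stdlib.h>

/* Verdict values as defined in include/uapi/linux/netfilter.h. */
enum { NF_DROP = 0, NF_ACCEPT = 1, NF_STOLEN = 2 };

/* Hypothetical stand-in for sk_buff, with a live-object counter. */
struct pkt { int len; };
int live_pkts;

struct pkt *pkt_alloc(int len)
{
	struct pkt *p = malloc(sizeof(*p));

	if (p) {
		p->len = len;
		live_pkts++;
	}
	return p;
}

void pkt_free(struct pkt *p)
{
	if (!p)
		return;
	free(p);
	live_pkts--;
}

typedef unsigned int hookfn(struct pkt *p);

/* Caller side: owns the packet unless the hook says it was stolen. */
int deliver(struct pkt *p, hookfn *hook)
{
	switch (hook(p)) {
	case NF_ACCEPT:
		pkt_free(p);	/* stands in for normal delivery */
		return 1;
	case NF_DROP:
		pkt_free(p);
		return 0;
	default:		/* NF_STOLEN: the hook owns p now */
		return 0;
	}
}

/* Correct use: the hook consumes the packet before reporting STOLEN. */
unsigned int steal_and_consume(struct pkt *p)
{
	pkt_free(p);
	return NF_STOLEN;
}

/* The problematic case: a plain return, nobody frees the packet. */
unsigned int steal_plain(struct pkt *p)
{
	(void)p;
	return NF_STOLEN;
}
```

With steal_and_consume() the counter returns to zero after deliver(); with
steal_plain() it stays at one, which is the leak a verifier check on return
values would have to rule out.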
Known problems:
- I did not convert all hooks to the new scheme, e.g. ILA won't compile ATM.
- checkpatch complains about a few indentation issues, line lengths etc.
Future work:
- Add support for NAT hooks. They still use indirect calls, but those
  are less of a problem because they get called only once per connection.
- Could annotate the ops struct with the kinds of verdicts the
  C function can return. This would allow eliding the retval
  check when a hook can only return NF_ACCEPT.
- Could add extra support for the INGRESS hook to move more code from
  inline functions to the autogenerated program.
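
The verdict annotation idea from the list above could look roughly like the
following sketch. Everything here is hypothetical: struct hook_desc, its
verdicts bitmask and both helper functions are invented for illustration
and do not exist in this series. An unannotated hook (mask 0) is treated
conservatively and keeps its check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Verdict values as defined in include/uapi/linux/netfilter.h. */
enum { NF_DROP = 0, NF_ACCEPT = 1, NF_STOLEN = 2 };

#define VERDICT_BIT(v) (1u << (v))

/* Hypothetical annotation: which verdicts a hook function may return. */
struct hook_desc {
	const char *name;
	unsigned int verdicts;	/* bitmask of VERDICT_BIT() values */
};

/* The check is redundant only if NF_ACCEPT is the sole possible verdict. */
bool needs_retval_check(const struct hook_desc *h)
{
	return h->verdicts != VERDICT_BIT(NF_ACCEPT);
}

/* Count how many 'if (v != ACCEPT) goto out' checks would be emitted. */
size_t count_retval_checks(const struct hook_desc *hooks, size_t n)
{
	size_t checks = 0;

	for (size_t i = 0; i < n; i++)
		if (needs_retval_check(&hooks[i]))
			checks++;
	return checks;
}
```

A generator using such a mask would skip the branch after every
accept-only hook and keep it everywhere else.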
Initial tests show roughly 8% performance improvement in a netns-to-netns
UDP forward test with conntrack enabled in the 'forward' net namespace.
I'm looking for feedback on the chosen approach.
Thanks,
Florian
Florian Westphal (9):
netfilter: nf_queue: carry index in hook state
netfilter: nat: split nat hook iteration into a helper
netfilter: remove hook index from nf_hook_slow arguments
netfilter: make hook functions accept only one argument
netfilter: reduce allowed hook count to 32
netfilter: add bpf base hook program generator
netfilter: core: do not rebuild bpf program on dying netns
netfilter: ingress: switch to invocation via bpf
netfilter: hook_jit: add prog cache
drivers/net/ipvlan/ipvlan_l3s.c | 4 +-
include/linux/netfilter.h | 72 ++-
include/linux/netfilter_ingress.h | 17 +-
include/net/netfilter/br_netfilter.h | 7 +-
include/net/netfilter/nf_flow_table.h | 6 +-
include/net/netfilter/nf_hook_bpf.h | 14 +
include/net/netfilter/nf_queue.h | 3 +-
include/net/netfilter/nf_synproxy.h | 6 +-
net/bridge/br_input.c | 3 +-
net/bridge/br_netfilter_hooks.c | 30 +-
net/bridge/br_netfilter_ipv6.c | 5 +-
net/bridge/netfilter/ebtable_broute.c | 8 +-
net/bridge/netfilter/ebtable_filter.c | 5 +-
net/bridge/netfilter/ebtable_nat.c | 5 +-
net/bridge/netfilter/nf_conntrack_bridge.c | 8 +-
net/ipv4/netfilter/arptable_filter.c | 5 +-
net/ipv4/netfilter/ipt_CLUSTERIP.c | 6 +-
net/ipv4/netfilter/iptable_filter.c | 5 +-
net/ipv4/netfilter/iptable_mangle.c | 7 +-
net/ipv4/netfilter/iptable_nat.c | 6 +-
net/ipv4/netfilter/iptable_raw.c | 5 +-
net/ipv4/netfilter/iptable_security.c | 5 +-
net/ipv4/netfilter/nf_defrag_ipv4.c | 5 +-
net/ipv6/netfilter/ip6table_filter.c | 5 +-
net/ipv6/netfilter/ip6table_mangle.c | 6 +-
net/ipv6/netfilter/ip6table_nat.c | 6 +-
net/ipv6/netfilter/ip6table_raw.c | 5 +-
net/ipv6/netfilter/ip6table_security.c | 5 +-
net/ipv6/netfilter/nf_defrag_ipv6_hooks.c | 5 +-
net/netfilter/Kconfig | 10 +
net/netfilter/Makefile | 1 +
net/netfilter/core.c | 103 +++-
net/netfilter/ipvs/ip_vs_core.c | 48 +-
net/netfilter/nf_conntrack_proto.c | 34 +-
net/netfilter/nf_flow_table_inet.c | 9 +-
net/netfilter/nf_flow_table_ip.c | 12 +-
net/netfilter/nf_hook_bpf.c | 569 +++++++++++++++++++++
net/netfilter/nf_nat_core.c | 50 +-
net/netfilter/nf_nat_proto.c | 56 +-
net/netfilter/nf_queue.c | 12 +-
net/netfilter/nf_synproxy_core.c | 8 +-
net/netfilter/nft_chain_filter.c | 48 +-
net/netfilter/nft_chain_nat.c | 7 +-
net/netfilter/nft_chain_route.c | 22 +-
security/selinux/hooks.c | 58 +--
45 files changed, 1001 insertions(+), 315 deletions(-)
create mode 100644 include/net/netfilter/nf_hook_bpf.h
create mode 100644 net/netfilter/nf_hook_bpf.c
--
2.32.0