[RFC PATCH 0/3] sk_buff: add skb extension infrastructure

* [RFC PATCH 0/3] sk_buff: add skb extension infrastructure
@ 2018-11-26 11:38 Florian Westphal
  2018-11-26 11:38 ` [RFC PATCH 1/3] netfilter: avoid using skb->nf_bridge directly Florian Westphal
From: Florian Westphal @ 2018-11-26 11:38 UTC
  To: netdev

The (out-of-tree) Multipath-TCP implementation needs a significant amount
of extra space in the skb control buffer.

Increasing skb->cb[] size in mainline is a non-starter for memory and
performance reasons (e.g. a larger cb also moves several
frequently-accessed fields to other cache lines).

One approach that might work for MPTCP is to extend skb_shared_info instead
of sk_buff.  However, this has drawbacks of its own: it either needs
special skb allocation to make sure there is enough space for such
'extended shinfo' at the end of the data buffer (which makes it usable
only on the tx path), or it increases the size of skb_shared_info.

This adds an extension infrastructure for sk_buff instead:
1. extension memory is released when the sk_buff is freed.
2. data is shared after cloning an skb.
3. adding an extension to an skb will COW the extension
   buffer if needed.

This is also how xfrm and bridge_nf extra data (skb->sp, skb->nf_bridge)
are handled.
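
The three rules above can be modelled in userspace roughly as follows.
This is only a sketch for discussion, not the code in the patches: all
names (toy_skb, ext_area, toy_ext_add, ...) are made up for illustration,
each extension gets a fixed 64-byte slot, and a plain int stands in for
the kernel's refcount type.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace model; none of these names come from the patches. */
enum { EXT_BRIDGE_NF, EXT_EXTRA_CB, EXT_NUM };

struct ext_area {
	int refcnt;				/* skbs sharing this area */
	unsigned char data[EXT_NUM][64];	/* one fixed slot per extension */
};

struct toy_skb {
	unsigned char active_extensions;	/* bit i set: extension i enabled */
	struct ext_area *ext;			/* undefined if active_extensions == 0 */
};

/* Rule 1: drop our reference when the skb is freed; last user frees the area. */
static void toy_ext_put(struct toy_skb *skb)
{
	if (!skb->active_extensions)
		return;
	if (--skb->ext->refcnt == 0)
		free(skb->ext);
	skb->active_extensions = 0;
}

/* Rule 2: a clone shares the extension area and only bumps the refcount. */
static void toy_skb_clone(struct toy_skb *dst, const struct toy_skb *src)
{
	*dst = *src;
	if (dst->active_extensions)
		dst->ext->refcnt++;
}

/* Rule 3: adding an extension copies the area first if it is shared (COW). */
static void *toy_ext_add(struct toy_skb *skb, int id)
{
	struct ext_area *area;

	assert(id >= 0 && id < EXT_NUM);

	if (!skb->active_extensions) {
		area = calloc(1, sizeof(*area));	/* first extension */
		if (!area)
			return NULL;
		area->refcnt = 1;
	} else if (skb->ext->refcnt > 1) {
		area = malloc(sizeof(*area));		/* shared: private copy */
		if (!area)
			return NULL;
		memcpy(area, skb->ext, sizeof(*area));
		area->refcnt = 1;
		skb->ext->refcnt--;			/* leave the shared area */
	} else {
		area = skb->ext;			/* sole owner: reuse */
	}

	skb->ext = area;
	skb->active_extensions |= 1u << id;
	return area->data[id];
}

/* Lookup returns NULL unless the extension bit is set for this skb. */
static void *toy_ext_find(const struct toy_skb *skb, int id)
{
	if (!(skb->active_extensions & (1u << id)))
		return NULL;
	return skb->ext->data[id];
}
```

In this model, a clone that later calls toy_ext_add() ends up with its own
copy of the area, so the original skb never sees extensions added to the
clone.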

In the future, protocols that need to store more than 48 bytes in skb->cb[]
could add a 'SKB_EXT_EXTRA_CB' or similar to allocate extra space.

Two new members are added to sk_buff:
1. an 'active_extensions' byte (filling a hole), telling which extensions
   have been enabled for this skb.
2. an extension pointer, located at the end of the sk_buff.
   If the active_extensions byte is 0, the pointer value is undefined.

The last patch converts nf_bridge to use the extension infrastructure:
the 'nf_bridge' pointer is removed, i.e. sk_buff size remains the same.
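
Why the size does not change can be illustrated with two heavily reduced,
hypothetical layouts (the real sk_buff has many more fields, and the field
names 'flags' and 'tail_field' are placeholders): one pointer is dropped,
one pointer is added at the end, and the new byte lands in padding that
common ABIs insert anyway.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced layout before the conversion. */
struct toy_skb_before {
	unsigned char flags;	/* stand-in for existing byte-sized fields */
				/* padding hole here on common ABIs */
	void *nf_bridge;	/* dedicated pointer, removed by the conversion */
	void *tail_field;	/* stand-in for the remaining members */
};

/* Hypothetical reduced layout after the conversion. */
struct toy_skb_after {
	unsigned char flags;
	unsigned char active_extensions;	/* placed in the former hole */
	void *tail_field;
	void *extensions;			/* at the end of the struct */
};

/* Compile-time check; assumes pointers are aligned to more than one byte. */
static_assert(sizeof(struct toy_skb_before) == sizeof(struct toy_skb_after),
	      "toy layouts are expected to have the same size");

/* Run-time variant of the same check, handy for printing or testing. */
static int toy_same_size(void)
{
	return sizeof(struct toy_skb_before) == sizeof(struct toy_skb_after);
}
```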

The extra code added to the skb clone and free paths (to deal with
refcount/free of the extension area) replaces the existing code that
deals with skb->nf_bridge.

Conversion of skb->sp (ipsec/xfrm secpath) to an skb extension could be
done as a followup, but I'm reluctant to work on this before there is
agreement that this is the right direction.

Comments welcome.

 include/linux/netfilter_bridge.h     |   33 +++++---
 include/linux/skbuff.h               |  142 +++++++++++++++++++++++++++++------
 include/net/netfilter/br_netfilter.h |   14 ---
 net/Kconfig                          |    4 
 net/bridge/br_netfilter_hooks.c      |   39 +++------
 net/bridge/br_netfilter_ipv6.c       |    4 
 net/core/skbuff.c                    |  134 ++++++++++++++++++++++++++++++++-
 net/ipv4/ip_output.c                 |    1 
 net/ipv4/netfilter/nf_reject_ipv4.c  |    6 -
 net/ipv6/ip6_output.c                |    1 
 net/ipv6/netfilter/nf_reject_ipv6.c  |   10 +-
 net/netfilter/nf_log_common.c        |   20 ++--
 net/netfilter/nf_queue.c             |   50 ++++++++----
 net/netfilter/nfnetlink_queue.c      |   23 ++---
 net/netfilter/xt_physdev.c           |    2 
 15 files changed, 368 insertions(+), 115 deletions(-)
