* [PATCH 00/18] Netfilter updates for net-next
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter updates for your net-next
tree. They are:

1) Remove obsolete nf_log tracing from nf_tables, from Florian Westphal.

2) Add support for map lookups to numgen, random and hash expressions,
   from Laura Garcia.

3) Allow registering NAT hooks for iptables and nftables at the same
   time. Patchset from Florian Westphal.

4) Timeout support for rbtree sets.

5) ip6t_rpfilter needs the input interface for route lookups on
   link-local addresses, from Vincent Bernat.

6) Add nf_ct_hook and nf_nat_hook structures and use them.

7) Do not drop packets that race to insert conntrack entries into the
   hashes; this is particularly a problem in nfqueue setups.

8) Address fallout from xt_osf separation to nf_osf, patches
   from Florian Westphal and Fernando Mancera.

9) Remove reference to struct nft_af_info, which doesn't exist anymore.
   From Taehee Yoo.

This batch comes with a conflict between 25fd386e0bc0 ("netfilter:
core: add missing __rcu annotation") in your tree and 2c205dd3981f
("netfilter: add struct nf_nat_hook and use it") coming in this batch.
This conflict can be solved by keeping the __rcu tag on
__netfilter_net_init() - added by 25fd386e0bc0 - and removing all code
related to nf_nat_decode_session_hook - which is gone after
2c205dd3981f - as described by:

diff --cc net/netfilter/core.c
index e0ae4aae96f5,206fb2c4c319..168af54db975
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@@ -611,7 -580,13 +611,8 @@@ const struct nf_conntrack_zone nf_ct_zo
  EXPORT_SYMBOL_GPL(nf_ct_zone_dflt);
  #endif /* CONFIG_NF_CONNTRACK */
  
- static void __net_init __netfilter_net_init(struct nf_hook_entries **e, int max)
 -#ifdef CONFIG_NF_NAT_NEEDED
 -void (*nf_nat_decode_session_hook)(struct sk_buff *, struct flowi *);
 -EXPORT_SYMBOL(nf_nat_decode_session_hook);
 -#endif
 -
+ static void __net_init
+ __netfilter_net_init(struct nf_hook_entries __rcu **e, int max)
  {
  	int h;
  

I can also merge your net-next tree into nf-next, solve the conflict and
resend the pull request if you prefer.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git

Thanks.

----------------------------------------------------------------

The following changes since commit 289e1f4e9e4a09c73a1c0152bb93855ea351ccda:

  net: ipv4: ipconfig: fix unused variable (2018-05-13 20:27:25 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git HEAD

for you to fetch changes up to 0c6bca747111dee19aa48c8f73d77fc85fcb8dd0:

  netfilter: nf_tables: remove nft_af_info. (2018-05-23 12:16:25 +0200)

----------------------------------------------------------------
Fernando Fernandez Mancera (1):
      netfilter: make NF_OSF non-visible symbol

Florian Westphal (9):
      netfilter: fix fallout from xt/nf osf separation
      netfilter: nf_tables: remove old nf_log based tracing
      netfilter: nf_nat: move common nat code to nat core
      netfilter: xtables: allow table definitions not backed by hook_ops
      netfilter: nf_tables: allow chain type to override hook register
      netfilter: core: export raw versions of add/delete hook functions
      netfilter: nf_nat: add nat hook register functions to nf_nat
      netfilter: nf_nat: add nat type hooks to nat core
      netfilter: lift one-nat-hook-only restriction

Laura Garcia Liebana (2):
      netfilter: nft_numgen: add map lookups for numgen random operations
      netfilter: nft_hash: add map lookups for hashing operations

Pablo Neira Ayuso (4):
      netfilter: nft_set_rbtree: add timeout support
      netfilter: add struct nf_ct_hook and use it
      netfilter: add struct nf_nat_hook and use it
      netfilter: nfnetlink_queue: resolve clash for unconfirmed conntracks

Taehee Yoo (1):
      netfilter: nf_tables: remove nft_af_info.

Vincent Bernat (1):
      netfilter: ip6t_rpfilter: provide input interface for route lookup

 include/linux/netfilter.h                |  34 +++-
 include/linux/netfilter/nf_osf.h         |   6 +
 include/net/netfilter/nf_nat.h           |   4 +
 include/net/netfilter/nf_nat_core.h      |  11 +-
 include/net/netfilter/nf_nat_l3proto.h   |  52 +-----
 include/net/netfilter/nf_tables.h        |   8 +-
 include/net/netns/nftables.h             |   2 -
 include/uapi/linux/netfilter/nf_osf.h    |   8 +-
 include/uapi/linux/netfilter/nf_tables.h |   4 +
 net/ipv4/netfilter/ip_tables.c           |   5 +-
 net/ipv4/netfilter/iptable_nat.c         |  85 ++++-----
 net/ipv4/netfilter/nf_nat_l3proto_ipv4.c | 135 ++++++--------
 net/ipv4/netfilter/nft_chain_nat_ipv4.c  |  52 ++----
 net/ipv6/netfilter/ip6_tables.c          |   5 +-
 net/ipv6/netfilter/ip6t_rpfilter.c       |   2 +
 net/ipv6/netfilter/ip6table_nat.c        |  84 ++++-----
 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c | 129 ++++++--------
 net/ipv6/netfilter/nft_chain_nat_ipv6.c  |  48 ++---
 net/netfilter/Kconfig                    |   2 +-
 net/netfilter/core.c                     | 102 +++++++----
 net/netfilter/nf_conntrack_core.c        |  91 +++++++++-
 net/netfilter/nf_conntrack_netlink.c     |  10 +-
 net/netfilter/nf_internals.h             |   5 +
 net/netfilter/nf_nat_core.c              | 294 ++++++++++++++++++++++++++++---
 net/netfilter/nf_tables_api.c            |  87 ++-------
 net/netfilter/nf_tables_core.c           |  29 +--
 net/netfilter/nfnetlink_queue.c          |  28 ++-
 net/netfilter/nft_hash.c                 | 131 +++++++++++++-
 net/netfilter/nft_numgen.c               |  76 +++++++-
 net/netfilter/nft_set_rbtree.c           |  75 +++++++-
 30 files changed, 1033 insertions(+), 571 deletions(-)

* [PATCH 01/18] netfilter: fix fallout from xt/nf osf separation
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC
  To: netfilter-devel; +Cc: davem, netdev

From: Florian Westphal <fw@strlen.de>

Stephen Rothwell says:
  today's linux-next build (x86_64 allmodconfig) produced this warning:
  ./usr/include/linux/netfilter/nf_osf.h:25: found __[us]{8,16,32,64} type without #include <linux/types.h>

Fix that up and also move the kernel-private struct out of uapi (it was
not exposed in any released kernel version).

Tested via allmodconfig build + make headers_check.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: bfb15f2a95cb ("netfilter: extract Passive OS fingerprint infrastructure from xt_osf")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter/nf_osf.h      | 6 ++++++
 include/uapi/linux/netfilter/nf_osf.h | 8 ++------
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/include/linux/netfilter/nf_osf.h b/include/linux/netfilter/nf_osf.h
index a2b39602e87d..0e114c492fb8 100644
--- a/include/linux/netfilter/nf_osf.h
+++ b/include/linux/netfilter/nf_osf.h
@@ -21,6 +21,12 @@ enum osf_fmatch_states {
 	FMATCH_OPT_WRONG,
 };
 
+struct nf_osf_finger {
+	struct rcu_head			rcu_head;
+	struct list_head		finger_entry;
+	struct nf_osf_user_finger	finger;
+};
+
 bool nf_osf_match(const struct sk_buff *skb, u_int8_t family,
 		  int hooknum, struct net_device *in, struct net_device *out,
 		  const struct nf_osf_info *info, struct net *net,
diff --git a/include/uapi/linux/netfilter/nf_osf.h b/include/uapi/linux/netfilter/nf_osf.h
index 45376eae31ef..8f2f2f403183 100644
--- a/include/uapi/linux/netfilter/nf_osf.h
+++ b/include/uapi/linux/netfilter/nf_osf.h
@@ -1,6 +1,8 @@
 #ifndef _NF_OSF_H
 #define _NF_OSF_H
 
+#include <linux/types.h>
+
 #define MAXGENRELEN	32
 
 #define NF_OSF_GENRE	(1 << 0)
@@ -57,12 +59,6 @@ struct nf_osf_user_finger {
 	struct nf_osf_opt	opt[MAX_IPOPTLEN];
 };
 
-struct nf_osf_finger {
-	struct rcu_head			rcu_head;
-	struct list_head		finger_entry;
-	struct nf_osf_user_finger	finger;
-};
-
 struct nf_osf_nlmsg {
 	struct nf_osf_user_finger	f;
 	struct iphdr			ip;
-- 
2.11.0

* [PATCH 02/18] netfilter: nf_tables: remove old nf_log based tracing
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC
  To: netfilter-devel; +Cc: davem, netdev

From: Florian Westphal <fw@strlen.de>

nfnetlink tracing has been available since nft 0.6 (June 2016).
Remove the old nf_log based tracing to avoid maintaining a rule counter
in the main loop.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_tables_core.c | 29 +++++++----------------------
 1 file changed, 7 insertions(+), 22 deletions(-)

diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c
index 9cf47c4cb9d5..d457d854fcae 100644
--- a/net/netfilter/nf_tables_core.c
+++ b/net/netfilter/nf_tables_core.c
@@ -41,7 +41,7 @@ static const struct nf_loginfo trace_loginfo = {
 
 static noinline void __nft_trace_packet(struct nft_traceinfo *info,
 					const struct nft_chain *chain,
-					int rulenum, enum nft_trace_types type)
+					enum nft_trace_types type)
 {
 	const struct nft_pktinfo *pkt = info->pkt;
 
@@ -52,22 +52,16 @@ static noinline void __nft_trace_packet(struct nft_traceinfo *info,
 	info->type = type;
 
 	nft_trace_notify(info);
-
-	nf_log_trace(nft_net(pkt), nft_pf(pkt), nft_hook(pkt), pkt->skb,
-		     nft_in(pkt), nft_out(pkt), &trace_loginfo,
-		     "TRACE: %s:%s:%s:%u ",
-		     chain->table->name, chain->name, comments[type], rulenum);
 }
 
 static inline void nft_trace_packet(struct nft_traceinfo *info,
 				    const struct nft_chain *chain,
 				    const struct nft_rule *rule,
-				    int rulenum,
 				    enum nft_trace_types type)
 {
 	if (static_branch_unlikely(&nft_trace_enabled)) {
 		info->rule = rule;
-		__nft_trace_packet(info, chain, rulenum, type);
+		__nft_trace_packet(info, chain, type);
 	}
 }
 
@@ -133,7 +127,6 @@ static noinline void nft_update_chain_stats(const struct nft_chain *chain,
 struct nft_jumpstack {
 	const struct nft_chain	*chain;
 	const struct nft_rule	*rule;
-	int			rulenum;
 };
 
 unsigned int
@@ -146,7 +139,6 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 	struct nft_regs regs;
 	unsigned int stackptr = 0;
 	struct nft_jumpstack jumpstack[NFT_JUMP_STACK_SIZE];
-	int rulenum;
 	unsigned int gencursor = nft_genmask_cur(net);
 	struct nft_traceinfo info;
 
@@ -154,7 +146,6 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 	if (static_branch_unlikely(&nft_trace_enabled))
 		nft_trace_init(&info, pkt, &regs.verdict, basechain);
 do_chain:
-	rulenum = 0;
 	rule = list_entry(&chain->rules, struct nft_rule, list);
 next_rule:
 	regs.verdict.code = NFT_CONTINUE;
@@ -164,8 +155,6 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 		if (unlikely(rule->genmask & gencursor))
 			continue;
 
-		rulenum++;
-
 		nft_rule_for_each_expr(expr, last, rule) {
 			if (expr->ops == &nft_cmp_fast_ops)
 				nft_cmp_fast_eval(expr, &regs);
@@ -183,7 +172,7 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 			continue;
 		case NFT_CONTINUE:
 			nft_trace_packet(&info, chain, rule,
-					 rulenum, NFT_TRACETYPE_RULE);
+					 NFT_TRACETYPE_RULE);
 			continue;
 		}
 		break;
@@ -195,7 +184,7 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 	case NF_QUEUE:
 	case NF_STOLEN:
 		nft_trace_packet(&info, chain, rule,
-				 rulenum, NFT_TRACETYPE_RULE);
+				 NFT_TRACETYPE_RULE);
 		return regs.verdict.code;
 	}
 
@@ -204,21 +193,19 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 		BUG_ON(stackptr >= NFT_JUMP_STACK_SIZE);
 		jumpstack[stackptr].chain = chain;
 		jumpstack[stackptr].rule  = rule;
-		jumpstack[stackptr].rulenum = rulenum;
 		stackptr++;
 		/* fall through */
 	case NFT_GOTO:
 		nft_trace_packet(&info, chain, rule,
-				 rulenum, NFT_TRACETYPE_RULE);
+				 NFT_TRACETYPE_RULE);
 
 		chain = regs.verdict.chain;
 		goto do_chain;
 	case NFT_CONTINUE:
-		rulenum++;
 		/* fall through */
 	case NFT_RETURN:
 		nft_trace_packet(&info, chain, rule,
-				 rulenum, NFT_TRACETYPE_RETURN);
+				 NFT_TRACETYPE_RETURN);
 		break;
 	default:
 		WARN_ON(1);
@@ -228,12 +215,10 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 		stackptr--;
 		chain = jumpstack[stackptr].chain;
 		rule  = jumpstack[stackptr].rule;
-		rulenum = jumpstack[stackptr].rulenum;
 		goto next_rule;
 	}
 
-	nft_trace_packet(&info, basechain, NULL, -1,
-			 NFT_TRACETYPE_POLICY);
+	nft_trace_packet(&info, basechain, NULL, NFT_TRACETYPE_POLICY);
 
 	if (static_branch_unlikely(&nft_counters_enabled))
 		nft_update_chain_stats(basechain, pkt);
-- 
2.11.0
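
With rulenum gone, each jump stack entry only has to remember the chain
and the rule to resume from. The following is a minimal userspace sketch
of that push/pop pattern, not the kernel code: all demo_* names are
hypothetical, the stack depth of 16 is an assumed value, and the sketch
reports overflow through a return value where nft_do_chain() uses
BUG_ON().

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_JUMP_STACK_SIZE 16	/* assumed depth, for illustration only */

/* Hypothetical stand-ins for struct nft_chain and struct nft_rule. */
struct demo_chain { const char *name; };
struct demo_rule { int id; };

/* One saved position: which chain we came from and which rule to resume at. */
struct demo_jumpstack {
	const struct demo_chain *chain;
	const struct demo_rule *rule;
};

struct demo_stack {
	struct demo_jumpstack entry[DEMO_JUMP_STACK_SIZE];
	unsigned int ptr;
};

/* Save the current position before jumping; returns false when full. */
static bool demo_jump_push(struct demo_stack *s,
			   const struct demo_chain *chain,
			   const struct demo_rule *rule)
{
	if (s->ptr >= DEMO_JUMP_STACK_SIZE)
		return false;

	s->entry[s->ptr].chain = chain;
	s->entry[s->ptr].rule = rule;
	s->ptr++;
	return true;
}

/* Restore the most recently saved position; returns false when empty. */
static bool demo_jump_pop(struct demo_stack *s,
			  const struct demo_chain **chain,
			  const struct demo_rule **rule)
{
	if (s->ptr == 0)
		return false;

	s->ptr--;
	*chain = s->entry[s->ptr].chain;
	*rule = s->entry[s->ptr].rule;
	return true;
}
```

Positions come back in last-in, first-out order, which is what lets a
return from a nested chain continue with the rule after the jump.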

* [PATCH 03/18] netfilter: nft_numgen: add map lookups for numgen random operations
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC
  To: netfilter-devel; +Cc: davem, netdev

From: Laura Garcia Liebana <nevola@gmail.com>

This patch reuses the existing map lookup support so that it can also
be applied to random number generation.

Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_numgen.c | 76 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 72 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nft_numgen.c b/net/netfilter/nft_numgen.c
index 8a64db8f2e69..cdbc62a53933 100644
--- a/net/netfilter/nft_numgen.c
+++ b/net/netfilter/nft_numgen.c
@@ -166,18 +166,43 @@ struct nft_ng_random {
 	enum nft_registers      dreg:8;
 	u32			modulus;
 	u32			offset;
+	struct nft_set		*map;
 };
 
+static u32 nft_ng_random_gen(struct nft_ng_random *priv)
+{
+	struct rnd_state *state = this_cpu_ptr(&nft_numgen_prandom_state);
+
+	return reciprocal_scale(prandom_u32_state(state), priv->modulus) +
+	       priv->offset;
+}
+
 static void nft_ng_random_eval(const struct nft_expr *expr,
 			       struct nft_regs *regs,
 			       const struct nft_pktinfo *pkt)
 {
 	struct nft_ng_random *priv = nft_expr_priv(expr);
-	struct rnd_state *state = this_cpu_ptr(&nft_numgen_prandom_state);
-	u32 val;
 
-	val = reciprocal_scale(prandom_u32_state(state), priv->modulus);
-	regs->data[priv->dreg] = val + priv->offset;
+	regs->data[priv->dreg] = nft_ng_random_gen(priv);
+}
+
+static void nft_ng_random_map_eval(const struct nft_expr *expr,
+				   struct nft_regs *regs,
+				   const struct nft_pktinfo *pkt)
+{
+	struct nft_ng_random *priv = nft_expr_priv(expr);
+	const struct nft_set *map = priv->map;
+	const struct nft_set_ext *ext;
+	u32 result;
+	bool found;
+
+	result = nft_ng_random_gen(priv);
+	found = map->ops->lookup(nft_net(pkt), map, &result, &ext);
+	if (!found)
+		return;
+
+	nft_data_copy(&regs->data[priv->dreg],
+		      nft_set_ext_data(ext), map->dlen);
 }
 
 static int nft_ng_random_init(const struct nft_ctx *ctx,
@@ -204,6 +229,23 @@ static int nft_ng_random_init(const struct nft_ctx *ctx,
 					   NFT_DATA_VALUE, sizeof(u32));
 }
 
+static int nft_ng_random_map_init(const struct nft_ctx *ctx,
+				  const struct nft_expr *expr,
+				  const struct nlattr * const tb[])
+{
+	struct nft_ng_random *priv = nft_expr_priv(expr);
+	u8 genmask = nft_genmask_next(ctx->net);
+
+	nft_ng_random_init(ctx, expr, tb);
+	priv->map = nft_set_lookup_global(ctx->net, ctx->table,
+					  tb[NFTA_NG_SET_NAME],
+					  tb[NFTA_NG_SET_ID], genmask);
+	if (IS_ERR(priv->map))
+		return PTR_ERR(priv->map);
+
+	return 0;
+}
+
 static int nft_ng_random_dump(struct sk_buff *skb, const struct nft_expr *expr)
 {
 	const struct nft_ng_random *priv = nft_expr_priv(expr);
@@ -212,6 +254,22 @@ static int nft_ng_random_dump(struct sk_buff *skb, const struct nft_expr *expr)
 			   priv->offset);
 }
 
+static int nft_ng_random_map_dump(struct sk_buff *skb,
+				  const struct nft_expr *expr)
+{
+	const struct nft_ng_random *priv = nft_expr_priv(expr);
+
+	if (nft_ng_dump(skb, priv->dreg, priv->modulus,
+			NFT_NG_RANDOM, priv->offset) ||
+	    nla_put_string(skb, NFTA_NG_SET_NAME, priv->map->name))
+		goto nla_put_failure;
+
+	return 0;
+
+nla_put_failure:
+	return -1;
+}
+
 static struct nft_expr_type nft_ng_type;
 static const struct nft_expr_ops nft_ng_inc_ops = {
 	.type		= &nft_ng_type,
@@ -237,6 +295,14 @@ static const struct nft_expr_ops nft_ng_random_ops = {
 	.dump		= nft_ng_random_dump,
 };
 
+static const struct nft_expr_ops nft_ng_random_map_ops = {
+	.type		= &nft_ng_type,
+	.size		= NFT_EXPR_SIZE(sizeof(struct nft_ng_random)),
+	.eval		= nft_ng_random_map_eval,
+	.init		= nft_ng_random_map_init,
+	.dump		= nft_ng_random_map_dump,
+};
+
 static const struct nft_expr_ops *
 nft_ng_select_ops(const struct nft_ctx *ctx, const struct nlattr * const tb[])
 {
@@ -255,6 +321,8 @@ nft_ng_select_ops(const struct nft_ctx *ctx, const struct nlattr * const tb[])
 			return &nft_ng_inc_map_ops;
 		return &nft_ng_inc_ops;
 	case NFT_NG_RANDOM:
+		if (tb[NFTA_NG_SET_NAME])
+			return &nft_ng_random_map_ops;
 		return &nft_ng_random_ops;
 	}
 
-- 
2.11.0
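
The evaluation path added above scales a raw random value into
[0, modulus), adds the offset, and uses the result as the key for a map
lookup. Below is a minimal userspace sketch of that flow, under stated
assumptions: reciprocal_scale() mirrors the arithmetic of the kernel
helper of the same name, while the demo_* types and the linear lookup
are hypothetical stand-ins for a real set backend and its ->lookup()
callback.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Map a 32-bit value into [0, ep_ro) with a multiply and shift,
 * avoiding a division. */
static uint32_t reciprocal_scale(uint32_t val, uint32_t ep_ro)
{
	return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}

/* Hypothetical stand-in for a map element: key -> data. */
struct demo_map_elem {
	uint32_t key;
	uint32_t data;
};

/* Hypothetical linear lookup, for illustration only. */
static bool demo_map_lookup(const struct demo_map_elem *map, size_t n,
			    uint32_t key, uint32_t *data)
{
	for (size_t i = 0; i < n; i++) {
		if (map[i].key == key) {
			*data = map[i].data;
			return true;
		}
	}
	return false;
}

/* Scale the raw value, add the offset, then resolve the key through
 * the map. On a miss *dreg is left untouched, matching the early
 * return in nft_ng_random_map_eval(). */
static bool demo_ng_random_map_eval(uint32_t raw, uint32_t modulus,
				    uint32_t offset,
				    const struct demo_map_elem *map, size_t n,
				    uint32_t *dreg)
{
	uint32_t key = reciprocal_scale(raw, modulus) + offset;

	return demo_map_lookup(map, n, key, dreg);
}
```

With modulus 2 and offset 10, every raw value resolves to key 10 or 11,
so a two-element map keyed on those values always hits.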

* [PATCH 04/18] netfilter: nft_hash: add map lookups for hashing operations
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC
  To: netfilter-devel; +Cc: davem, netdev

From: Laura Garcia Liebana <nevola@gmail.com>

This patch adds new attributes that accept a map as an argument and
then performs the lookup using the generated hash as the key.

Both current hash functions are supported: Jenkins and Symmetric Hash.

Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/uapi/linux/netfilter/nf_tables.h |   4 +
 net/netfilter/nft_hash.c                 | 131 ++++++++++++++++++++++++++++++-
 2 files changed, 134 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index ce031cf72288..9c71f024f9cc 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -856,6 +856,8 @@ enum nft_hash_types {
  * @NFTA_HASH_SEED: seed value (NLA_U32)
  * @NFTA_HASH_OFFSET: add this offset value to hash result (NLA_U32)
  * @NFTA_HASH_TYPE: hash operation (NLA_U32: nft_hash_types)
+ * @NFTA_HASH_SET_NAME: name of the map to lookup (NLA_STRING)
+ * @NFTA_HASH_SET_ID: id of the map (NLA_U32)
  */
 enum nft_hash_attributes {
 	NFTA_HASH_UNSPEC,
@@ -866,6 +868,8 @@ enum nft_hash_attributes {
 	NFTA_HASH_SEED,
 	NFTA_HASH_OFFSET,
 	NFTA_HASH_TYPE,
+	NFTA_HASH_SET_NAME,
+	NFTA_HASH_SET_ID,
 	__NFTA_HASH_MAX,
 };
 #define NFTA_HASH_MAX	(__NFTA_HASH_MAX - 1)
diff --git a/net/netfilter/nft_hash.c b/net/netfilter/nft_hash.c
index e235c17f1b8b..f0fc21f88775 100644
--- a/net/netfilter/nft_hash.c
+++ b/net/netfilter/nft_hash.c
@@ -25,6 +25,7 @@ struct nft_jhash {
 	u32			modulus;
 	u32			seed;
 	u32			offset;
+	struct nft_set		*map;
 };
 
 static void nft_jhash_eval(const struct nft_expr *expr,
@@ -35,14 +36,39 @@ static void nft_jhash_eval(const struct nft_expr *expr,
 	const void *data = &regs->data[priv->sreg];
 	u32 h;
 
-	h = reciprocal_scale(jhash(data, priv->len, priv->seed), priv->modulus);
+	h = reciprocal_scale(jhash(data, priv->len, priv->seed),
+			     priv->modulus);
+
 	regs->data[priv->dreg] = h + priv->offset;
 }
 
+static void nft_jhash_map_eval(const struct nft_expr *expr,
+			       struct nft_regs *regs,
+			       const struct nft_pktinfo *pkt)
+{
+	struct nft_jhash *priv = nft_expr_priv(expr);
+	const void *data = &regs->data[priv->sreg];
+	const struct nft_set *map = priv->map;
+	const struct nft_set_ext *ext;
+	u32 result;
+	bool found;
+
+	result = reciprocal_scale(jhash(data, priv->len, priv->seed),
+					priv->modulus) + priv->offset;
+
+	found = map->ops->lookup(nft_net(pkt), map, &result, &ext);
+	if (!found)
+		return;
+
+	nft_data_copy(&regs->data[priv->dreg],
+		      nft_set_ext_data(ext), map->dlen);
+}
+
 struct nft_symhash {
 	enum nft_registers      dreg:8;
 	u32			modulus;
 	u32			offset;
+	struct nft_set		*map;
 };
 
 static void nft_symhash_eval(const struct nft_expr *expr,
@@ -58,6 +84,28 @@ static void nft_symhash_eval(const struct nft_expr *expr,
 	regs->data[priv->dreg] = h + priv->offset;
 }
 
+static void nft_symhash_map_eval(const struct nft_expr *expr,
+				 struct nft_regs *regs,
+				 const struct nft_pktinfo *pkt)
+{
+	struct nft_symhash *priv = nft_expr_priv(expr);
+	struct sk_buff *skb = pkt->skb;
+	const struct nft_set *map = priv->map;
+	const struct nft_set_ext *ext;
+	u32 result;
+	bool found;
+
+	result = reciprocal_scale(__skb_get_hash_symmetric(skb),
+				  priv->modulus) + priv->offset;
+
+	found = map->ops->lookup(nft_net(pkt), map, &result, &ext);
+	if (!found)
+		return;
+
+	nft_data_copy(&regs->data[priv->dreg],
+		      nft_set_ext_data(ext), map->dlen);
+}
+
 static const struct nla_policy nft_hash_policy[NFTA_HASH_MAX + 1] = {
 	[NFTA_HASH_SREG]	= { .type = NLA_U32 },
 	[NFTA_HASH_DREG]	= { .type = NLA_U32 },
@@ -66,6 +114,9 @@ static const struct nla_policy nft_hash_policy[NFTA_HASH_MAX + 1] = {
 	[NFTA_HASH_SEED]	= { .type = NLA_U32 },
 	[NFTA_HASH_OFFSET]	= { .type = NLA_U32 },
 	[NFTA_HASH_TYPE]	= { .type = NLA_U32 },
+	[NFTA_HASH_SET_NAME]	= { .type = NLA_STRING,
+				    .len = NFT_SET_MAXNAMELEN - 1 },
+	[NFTA_HASH_SET_ID]	= { .type = NLA_U32 },
 };
 
 static int nft_jhash_init(const struct nft_ctx *ctx,
@@ -115,6 +166,23 @@ static int nft_jhash_init(const struct nft_ctx *ctx,
 					   NFT_DATA_VALUE, sizeof(u32));
 }
 
+static int nft_jhash_map_init(const struct nft_ctx *ctx,
+			      const struct nft_expr *expr,
+			      const struct nlattr * const tb[])
+{
+	struct nft_jhash *priv = nft_expr_priv(expr);
+	u8 genmask = nft_genmask_next(ctx->net);
+
+	nft_jhash_init(ctx, expr, tb);
+	priv->map = nft_set_lookup_global(ctx->net, ctx->table,
+					  tb[NFTA_HASH_SET_NAME],
+					  tb[NFTA_HASH_SET_ID], genmask);
+	if (IS_ERR(priv->map))
+		return PTR_ERR(priv->map);
+
+	return 0;
+}
+
 static int nft_symhash_init(const struct nft_ctx *ctx,
 			    const struct nft_expr *expr,
 			    const struct nlattr * const tb[])
@@ -141,6 +209,23 @@ static int nft_symhash_init(const struct nft_ctx *ctx,
 					   NFT_DATA_VALUE, sizeof(u32));
 }
 
+static int nft_symhash_map_init(const struct nft_ctx *ctx,
+				const struct nft_expr *expr,
+				const struct nlattr * const tb[])
+{
+	struct nft_symhash *priv = nft_expr_priv(expr);
+	u8 genmask = nft_genmask_next(ctx->net);
+
+	nft_symhash_init(ctx, expr, tb);
+	priv->map = nft_set_lookup_global(ctx->net, ctx->table,
+					  tb[NFTA_HASH_SET_NAME],
+					  tb[NFTA_HASH_SET_ID], genmask);
+	if (IS_ERR(priv->map))
+		return PTR_ERR(priv->map);
+
+	return 0;
+}
+
 static int nft_jhash_dump(struct sk_buff *skb,
 			  const struct nft_expr *expr)
 {
@@ -168,6 +253,18 @@ static int nft_jhash_dump(struct sk_buff *skb,
 	return -1;
 }
 
+static int nft_jhash_map_dump(struct sk_buff *skb,
+			       const struct nft_expr *expr)
+{
+	const struct nft_jhash *priv = nft_expr_priv(expr);
+
+	if (nft_jhash_dump(skb, expr) ||
+	    nla_put_string(skb, NFTA_HASH_SET_NAME, priv->map->name))
+		return -1;
+
+	return 0;
+}
+
 static int nft_symhash_dump(struct sk_buff *skb,
 			    const struct nft_expr *expr)
 {
@@ -188,6 +285,18 @@ static int nft_symhash_dump(struct sk_buff *skb,
 	return -1;
 }
 
+static int nft_symhash_map_dump(struct sk_buff *skb,
+				const struct nft_expr *expr)
+{
+	const struct nft_symhash *priv = nft_expr_priv(expr);
+
+	if (nft_symhash_dump(skb, expr) ||
+	    nla_put_string(skb, NFTA_HASH_SET_NAME, priv->map->name))
+		return -1;
+
+	return 0;
+}
+
 static struct nft_expr_type nft_hash_type;
 static const struct nft_expr_ops nft_jhash_ops = {
 	.type		= &nft_hash_type,
@@ -197,6 +306,14 @@ static const struct nft_expr_ops nft_jhash_ops = {
 	.dump		= nft_jhash_dump,
 };
 
+static const struct nft_expr_ops nft_jhash_map_ops = {
+	.type		= &nft_hash_type,
+	.size		= NFT_EXPR_SIZE(sizeof(struct nft_jhash)),
+	.eval		= nft_jhash_map_eval,
+	.init		= nft_jhash_map_init,
+	.dump		= nft_jhash_map_dump,
+};
+
 static const struct nft_expr_ops nft_symhash_ops = {
 	.type		= &nft_hash_type,
 	.size		= NFT_EXPR_SIZE(sizeof(struct nft_symhash)),
@@ -205,6 +322,14 @@ static const struct nft_expr_ops nft_symhash_ops = {
 	.dump		= nft_symhash_dump,
 };
 
+static const struct nft_expr_ops nft_symhash_map_ops = {
+	.type		= &nft_hash_type,
+	.size		= NFT_EXPR_SIZE(sizeof(struct nft_symhash)),
+	.eval		= nft_symhash_map_eval,
+	.init		= nft_symhash_map_init,
+	.dump		= nft_symhash_map_dump,
+};
+
 static const struct nft_expr_ops *
 nft_hash_select_ops(const struct nft_ctx *ctx,
 		    const struct nlattr * const tb[])
@@ -217,8 +342,12 @@ nft_hash_select_ops(const struct nft_ctx *ctx,
 	type = ntohl(nla_get_be32(tb[NFTA_HASH_TYPE]));
 	switch (type) {
 	case NFT_HASH_SYM:
+		if (tb[NFTA_HASH_SET_NAME])
+			return &nft_symhash_map_ops;
 		return &nft_symhash_ops;
 	case NFT_HASH_JENKINS:
+		if (tb[NFTA_HASH_SET_NAME])
+			return &nft_jhash_map_ops;
 		return &nft_jhash_ops;
 	default:
 		break;
-- 
2.11.0
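
The select_ops change above picks the map-aware ops whenever a set name
attribute is present, and falls back to the plain ops otherwise. A
minimal userspace sketch of that dispatch follows. Every demo_* name is
hypothetical, and returning NULL for an unknown type is an assumption of
the sketch, since the kernel's handling of that case lies outside the
quoted hunk.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counterparts of NFT_HASH_JENKINS and NFT_HASH_SYM. */
enum demo_hash_type {
	DEMO_HASH_JENKINS,
	DEMO_HASH_SYM,
};

/* Hypothetical, reduced stand-in for struct nft_expr_ops. */
struct demo_ops {
	const char *name;
};

static const struct demo_ops demo_jhash_ops       = { "jhash" };
static const struct demo_ops demo_jhash_map_ops   = { "jhash_map" };
static const struct demo_ops demo_symhash_ops     = { "symhash" };
static const struct demo_ops demo_symhash_map_ops = { "symhash_map" };

/* Choose the ops for a hash type; has_set_name models the check
 * "if (tb[NFTA_HASH_SET_NAME])" from the patch. */
static const struct demo_ops *
demo_hash_select_ops(int type, bool has_set_name)
{
	switch (type) {
	case DEMO_HASH_SYM:
		if (has_set_name)
			return &demo_symhash_map_ops;
		return &demo_symhash_ops;
	case DEMO_HASH_JENKINS:
		if (has_set_name)
			return &demo_jhash_map_ops;
		return &demo_jhash_ops;
	default:
		break;
	}

	return NULL;	/* assumption: unknown type is reported as NULL */
}
```

Keeping the map variants as separate ops means the non-map evaluation
path carries no extra branch per packet.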

* [PATCH 05/18] netfilter: nf_nat: move common nat code to nat core
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC
  To: netfilter-devel; +Cc: davem, netdev

From: Florian Westphal <fw@strlen.de>

Both l3 helpers use almost the same copy-pasted code here.
Split out the common part into an 'inet' helper.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_nat_core.h      |  7 ++++
 net/ipv4/netfilter/nf_nat_l3proto_ipv4.c | 55 +------------------------
 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c | 48 +---------------------
 net/netfilter/nf_nat_core.c              | 70 ++++++++++++++++++++++++++++++++
 4 files changed, 81 insertions(+), 99 deletions(-)

diff --git a/include/net/netfilter/nf_nat_core.h b/include/net/netfilter/nf_nat_core.h
index 235bd0e9a5aa..0d84dd29108d 100644
--- a/include/net/netfilter/nf_nat_core.h
+++ b/include/net/netfilter/nf_nat_core.h
@@ -11,6 +11,13 @@
 unsigned int nf_nat_packet(struct nf_conn *ct, enum ip_conntrack_info ctinfo,
 			   unsigned int hooknum, struct sk_buff *skb);
 
+unsigned int
+nf_nat_inet_fn(void *priv, struct sk_buff *skb,
+	       const struct nf_hook_state *state,
+	       unsigned int (*do_chain)(void *priv,
+					struct sk_buff *skb,
+					const struct nf_hook_state *state));
+
 int nf_xfrm_me_harder(struct net *net, struct sk_buff *skb, unsigned int family);
 
 static inline int nf_nat_initialized(struct nf_conn *ct,
diff --git a/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c b/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
index 325e02956bf5..29b5aceac66d 100644
--- a/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
@@ -250,24 +250,12 @@ nf_nat_ipv4_fn(void *priv, struct sk_buff *skb,
 {
 	struct nf_conn *ct;
 	enum ip_conntrack_info ctinfo;
-	struct nf_conn_nat *nat;
-	/* maniptype == SRC for postrouting. */
-	enum nf_nat_manip_type maniptype = HOOK2MANIP(state->hook);
 
 	ct = nf_ct_get(skb, &ctinfo);
-	/* Can't track?  It's not due to stress, or conntrack would
-	 * have dropped it.  Hence it's the user's responsibilty to
-	 * packet filter it out, or implement conntrack/NAT for that
-	 * protocol. 8) --RR
-	 */
 	if (!ct)
 		return NF_ACCEPT;
 
-	nat = nfct_nat(ct);
-
-	switch (ctinfo) {
-	case IP_CT_RELATED:
-	case IP_CT_RELATED_REPLY:
+	if (ctinfo == IP_CT_RELATED || ctinfo == IP_CT_RELATED_REPLY) {
 		if (ip_hdr(skb)->protocol == IPPROTO_ICMP) {
 			if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo,
 							   state->hook))
@@ -275,48 +263,9 @@ nf_nat_ipv4_fn(void *priv, struct sk_buff *skb,
 			else
 				return NF_ACCEPT;
 		}
-		/* Only ICMPs can be IP_CT_IS_REPLY: */
-		/* fall through */
-	case IP_CT_NEW:
-		/* Seen it before?  This can happen for loopback, retrans,
-		 * or local packets.
-		 */
-		if (!nf_nat_initialized(ct, maniptype)) {
-			unsigned int ret;
-
-			ret = do_chain(priv, skb, state);
-			if (ret != NF_ACCEPT)
-				return ret;
-
-			if (nf_nat_initialized(ct, HOOK2MANIP(state->hook)))
-				break;
-
-			ret = nf_nat_alloc_null_binding(ct, state->hook);
-			if (ret != NF_ACCEPT)
-				return ret;
-		} else {
-			pr_debug("Already setup manip %s for ct %p\n",
-				 maniptype == NF_NAT_MANIP_SRC ? "SRC" : "DST",
-				 ct);
-			if (nf_nat_oif_changed(state->hook, ctinfo, nat,
-					       state->out))
-				goto oif_changed;
-		}
-		break;
-
-	default:
-		/* ESTABLISHED */
-		WARN_ON(ctinfo != IP_CT_ESTABLISHED &&
-			ctinfo != IP_CT_ESTABLISHED_REPLY);
-		if (nf_nat_oif_changed(state->hook, ctinfo, nat, state->out))
-			goto oif_changed;
 	}
 
-	return nf_nat_packet(ct, ctinfo, state->hook, skb);
-
-oif_changed:
-	nf_ct_kill_acct(ct, ctinfo, skb);
-	return NF_DROP;
+	return nf_nat_inet_fn(priv, skb, state, do_chain);
 }
 EXPORT_SYMBOL_GPL(nf_nat_ipv4_fn);
 
diff --git a/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c b/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
index f1582b6f9588..3ec228984f82 100644
--- a/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
@@ -261,8 +261,6 @@ nf_nat_ipv6_fn(void *priv, struct sk_buff *skb,
 {
 	struct nf_conn *ct;
 	enum ip_conntrack_info ctinfo;
-	struct nf_conn_nat *nat;
-	enum nf_nat_manip_type maniptype = HOOK2MANIP(state->hook);
 	__be16 frag_off;
 	int hdrlen;
 	u8 nexthdr;
@@ -276,11 +274,7 @@ nf_nat_ipv6_fn(void *priv, struct sk_buff *skb,
 	if (!ct)
 		return NF_ACCEPT;
 
-	nat = nfct_nat(ct);
-
-	switch (ctinfo) {
-	case IP_CT_RELATED:
-	case IP_CT_RELATED_REPLY:
+	if (ctinfo == IP_CT_RELATED || ctinfo == IP_CT_RELATED_REPLY) {
 		nexthdr = ipv6_hdr(skb)->nexthdr;
 		hdrlen = ipv6_skip_exthdr(skb, sizeof(struct ipv6hdr),
 					  &nexthdr, &frag_off);
@@ -293,47 +287,9 @@ nf_nat_ipv6_fn(void *priv, struct sk_buff *skb,
 			else
 				return NF_ACCEPT;
 		}
-		/* Only ICMPs can be IP_CT_IS_REPLY: */
-		/* fall through */
-	case IP_CT_NEW:
-		/* Seen it before?  This can happen for loopback, retrans,
-		 * or local packets.
-		 */
-		if (!nf_nat_initialized(ct, maniptype)) {
-			unsigned int ret;
-
-			ret = do_chain(priv, skb, state);
-			if (ret != NF_ACCEPT)
-				return ret;
-
-			if (nf_nat_initialized(ct, HOOK2MANIP(state->hook)))
-				break;
-
-			ret = nf_nat_alloc_null_binding(ct, state->hook);
-			if (ret != NF_ACCEPT)
-				return ret;
-		} else {
-			pr_debug("Already setup manip %s for ct %p\n",
-				 maniptype == NF_NAT_MANIP_SRC ? "SRC" : "DST",
-				 ct);
-			if (nf_nat_oif_changed(state->hook, ctinfo, nat, state->out))
-				goto oif_changed;
-		}
-		break;
-
-	default:
-		/* ESTABLISHED */
-		WARN_ON(ctinfo != IP_CT_ESTABLISHED &&
-			ctinfo != IP_CT_ESTABLISHED_REPLY);
-		if (nf_nat_oif_changed(state->hook, ctinfo, nat, state->out))
-			goto oif_changed;
 	}
 
-	return nf_nat_packet(ct, ctinfo, state->hook, skb);
-
-oif_changed:
-	nf_ct_kill_acct(ct, ctinfo, skb);
-	return NF_DROP;
+	return nf_nat_inet_fn(priv, skb, state, do_chain);
 }
 EXPORT_SYMBOL_GPL(nf_nat_ipv6_fn);
 
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index 37b3c9913b08..0cd503aacbf0 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -513,6 +513,76 @@ unsigned int nf_nat_packet(struct nf_conn *ct,
 }
 EXPORT_SYMBOL_GPL(nf_nat_packet);
 
+unsigned int
+nf_nat_inet_fn(void *priv, struct sk_buff *skb,
+	       const struct nf_hook_state *state,
+	       unsigned int (*do_chain)(void *priv,
+					struct sk_buff *skb,
+					const struct nf_hook_state *state))
+{
+	struct nf_conn *ct;
+	enum ip_conntrack_info ctinfo;
+	struct nf_conn_nat *nat;
+	/* maniptype == SRC for postrouting. */
+	enum nf_nat_manip_type maniptype = HOOK2MANIP(state->hook);
+
+	ct = nf_ct_get(skb, &ctinfo);
+	/* Can't track?  It's not due to stress, or conntrack would
+	 * have dropped it.  Hence it's the user's responsibility to
+	 * packet filter it out, or implement conntrack/NAT for that
+	 * protocol. 8) --RR
+	 */
+	if (!ct)
+		return NF_ACCEPT;
+
+	nat = nfct_nat(ct);
+
+	switch (ctinfo) {
+	case IP_CT_RELATED:
+	case IP_CT_RELATED_REPLY:
+		/* Only ICMPs can be IP_CT_IS_REPLY.  Fallthrough */
+	case IP_CT_NEW:
+		/* Seen it before?  This can happen for loopback, retrans,
+		 * or local packets.
+		 */
+		if (!nf_nat_initialized(ct, maniptype)) {
+			unsigned int ret;
+
+			ret = do_chain(priv, skb, state);
+			if (ret != NF_ACCEPT)
+				return ret;
+
+			if (nf_nat_initialized(ct, HOOK2MANIP(state->hook)))
+				break;
+
+			ret = nf_nat_alloc_null_binding(ct, state->hook);
+			if (ret != NF_ACCEPT)
+				return ret;
+		} else {
+			pr_debug("Already setup manip %s for ct %p (status bits 0x%lx)\n",
+				 maniptype == NF_NAT_MANIP_SRC ? "SRC" : "DST",
+				 ct, ct->status);
+			if (nf_nat_oif_changed(state->hook, ctinfo, nat,
+					       state->out))
+				goto oif_changed;
+		}
+		break;
+	default:
+		/* ESTABLISHED */
+		WARN_ON(ctinfo != IP_CT_ESTABLISHED &&
+			ctinfo != IP_CT_ESTABLISHED_REPLY);
+		if (nf_nat_oif_changed(state->hook, ctinfo, nat, state->out))
+			goto oif_changed;
+	}
+
+	return nf_nat_packet(ct, ctinfo, state->hook, skb);
+
+oif_changed:
+	nf_ct_kill_acct(ct, ctinfo, skb);
+	return NF_DROP;
+}
+EXPORT_SYMBOL_GPL(nf_nat_inet_fn);
+
 struct nf_nat_proto_clean {
 	u8	l3proto;
 	u8	l4proto;
-- 
2.11.0

* [PATCH 06/18] netfilter: xtables: allow table definitions not backed by hook_ops
  2018-05-23 18:42 [PATCH 00/18] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (4 preceding siblings ...)
  2018-05-23 18:42 ` [PATCH 05/18] netfilter: nf_nat: move common nat code to nat core Pablo Neira Ayuso
@ 2018-05-23 18:42 ` Pablo Neira Ayuso
  2018-05-23 18:42 ` [PATCH 07/18] netfilter: nf_tables: allow chain type to override hook register Pablo Neira Ayuso
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Florian Westphal <fw@strlen.de>

The ip(6)tables nat table currently receives skbs from the netfilter
core.  After a followup patch, skbs will come from the netfilter nat
core instead, so the table is no longer backed by normal hook_ops.
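
As a rough illustration of the "optional hook_ops" idea, here is a
userspace sketch.  All model_* names are hypothetical stand-ins and not
kernel APIs; the only point is that registration publishes the table and
skips hook registration when no ops array is passed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel types. */
struct model_hook_ops { int hooknum; };

struct model_table {
	int published;        /* table is visible to packet lookups */
	int hooks_registered; /* hooks handed to the (modelled) core */
};

/* Publish the table, then register hooks only if the caller supplied
 * an ops array.  A NULL ops means someone else feeds packets in.
 */
static int model_register_table(struct model_table *t,
				const struct model_hook_ops *ops,
				int nhooks)
{
	assert(t != NULL);

	t->published = 1;
	if (!ops)
		return 0;

	t->hooks_registered = nhooks;
	return 0;
}

/* Mirror image: drop hooks only if they were registered via ops. */
static void model_unregister_table(struct model_table *t,
				   const struct model_hook_ops *ops)
{
	assert(t != NULL);

	if (ops)
		t->hooks_registered = 0;
	t->published = 0;
}
```

In this model, a caller passing NULL for ops still gets a published
table, which is the case the nat table is expected to use later on.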

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv4/netfilter/ip_tables.c  | 5 ++++-
 net/ipv6/netfilter/ip6_tables.c | 5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 444f125f3974..ddcc56c37d4d 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -1782,6 +1782,8 @@ int ipt_register_table(struct net *net, const struct xt_table *table,
 
 	/* set res now, will see skbs right after nf_register_net_hooks */
 	WRITE_ONCE(*res, new_table);
+	if (!ops)
+		return 0;
 
 	ret = nf_register_net_hooks(net, ops, hweight32(table->valid_hooks));
 	if (ret != 0) {
@@ -1799,7 +1801,8 @@ int ipt_register_table(struct net *net, const struct xt_table *table,
 void ipt_unregister_table(struct net *net, struct xt_table *table,
 			  const struct nf_hook_ops *ops)
 {
-	nf_unregister_net_hooks(net, ops, hweight32(table->valid_hooks));
+	if (ops)
+		nf_unregister_net_hooks(net, ops, hweight32(table->valid_hooks));
 	__ipt_unregister_table(net, table);
 }
 
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 7097bbf95843..e18b14b2e019 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -1792,6 +1792,8 @@ int ip6t_register_table(struct net *net, const struct xt_table *table,
 
 	/* set res now, will see skbs right after nf_register_net_hooks */
 	WRITE_ONCE(*res, new_table);
+	if (!ops)
+		return 0;
 
 	ret = nf_register_net_hooks(net, ops, hweight32(table->valid_hooks));
 	if (ret != 0) {
@@ -1809,7 +1811,8 @@ int ip6t_register_table(struct net *net, const struct xt_table *table,
 void ip6t_unregister_table(struct net *net, struct xt_table *table,
 			   const struct nf_hook_ops *ops)
 {
-	nf_unregister_net_hooks(net, ops, hweight32(table->valid_hooks));
+	if (ops)
+		nf_unregister_net_hooks(net, ops, hweight32(table->valid_hooks));
 	__ip6t_unregister_table(net, table);
 }
 
-- 
2.11.0

* [PATCH 07/18] netfilter: nf_tables: allow chain type to override hook register
  2018-05-23 18:42 [PATCH 00/18] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (5 preceding siblings ...)
  2018-05-23 18:42 ` [PATCH 06/18] netfilter: xtables: allow table definitions not backed by hook_ops Pablo Neira Ayuso
@ 2018-05-23 18:42 ` Pablo Neira Ayuso
  2018-05-23 18:42 ` [PATCH 08/18] netfilter: core: export raw versions of add/delete hook functions Pablo Neira Ayuso
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Florian Westphal <fw@strlen.de>

This will be used in a followup patch, when nat chain types no longer
use nf_register_net_hook() but instead register with the nat core.
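
The override can be pictured with a small userspace sketch.  The names
below (model_chain_type, default_register, nat_register and the
counters) are made up for illustration; only the dispatch pattern is
taken from this patch: use the chain type's callback if it has one,
otherwise fall back to the default registration path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nf_hook_ops. */
struct model_ops { int hooknum; };

struct model_chain_type {
	/* Optional override; NULL means "use the default path". */
	int (*ops_register)(const struct model_ops *ops);
};

static int default_registered;  /* hooks registered via default path */
static int nat_core_registered; /* hooks registered via the override */

static int default_register(const struct model_ops *ops)
{
	(void)ops;
	default_registered++;
	return 0;
}

/* Example override, standing in for a nat chain type's callback. */
static int nat_register(const struct model_ops *ops)
{
	(void)ops;
	nat_core_registered++;
	return 0;
}

/* Dispatch: the chain type decides how its base chain is registered. */
static int model_register_hook(const struct model_chain_type *type,
			       const struct model_ops *ops)
{
	assert(type != NULL && ops != NULL);

	if (type->ops_register)
		return type->ops_register(ops);

	return default_register(ops);
}
```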

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_tables.h       |  8 ++++----
 net/ipv4/netfilter/nft_chain_nat_ipv4.c | 19 +++++++++++++------
 net/ipv6/netfilter/nft_chain_nat_ipv6.c | 20 ++++++++++++++------
 net/netfilter/nf_tables_api.c           | 23 ++++++++++++++++-------
 4 files changed, 47 insertions(+), 23 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 435c9e3b9181..a94fd0c730d6 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -880,8 +880,8 @@ enum nft_chain_types {
  * 	@owner: module owner
  * 	@hook_mask: mask of valid hooks
  * 	@hooks: array of hook functions
- *	@init: chain initialization function
- *	@free: chain release function
+ *	@ops_register: base chain register function
+ *	@ops_unregister: base chain unregister function
  */
 struct nft_chain_type {
 	const char			*name;
@@ -890,8 +890,8 @@ struct nft_chain_type {
 	struct module			*owner;
 	unsigned int			hook_mask;
 	nf_hookfn			*hooks[NF_MAX_HOOKS];
-	int				(*init)(struct nft_ctx *ctx);
-	void				(*free)(struct nft_ctx *ctx);
+	int				(*ops_register)(struct net *net, const struct nf_hook_ops *ops);
+	void				(*ops_unregister)(struct net *net, const struct nf_hook_ops *ops);
 };
 
 int nft_chain_validate_dependency(const struct nft_chain *chain,
diff --git a/net/ipv4/netfilter/nft_chain_nat_ipv4.c b/net/ipv4/netfilter/nft_chain_nat_ipv4.c
index 285baccfbdea..bbcb624b6b81 100644
--- a/net/ipv4/netfilter/nft_chain_nat_ipv4.c
+++ b/net/ipv4/netfilter/nft_chain_nat_ipv4.c
@@ -66,14 +66,21 @@ static unsigned int nft_nat_ipv4_local_fn(void *priv,
 	return nf_nat_ipv4_local_fn(priv, skb, state, nft_nat_do_chain);
 }
 
-static int nft_nat_ipv4_init(struct nft_ctx *ctx)
+static int nft_nat_ipv4_reg(struct net *net, const struct nf_hook_ops *ops)
 {
-	return nf_ct_netns_get(ctx->net, ctx->family);
+	int ret = nf_register_net_hook(net, ops);
+	if (ret == 0) {
+		ret = nf_ct_netns_get(net, NFPROTO_IPV4);
+		if (ret)
+			 nf_unregister_net_hook(net, ops);
+	}
+	return ret;
 }
 
-static void nft_nat_ipv4_free(struct nft_ctx *ctx)
+static void nft_nat_ipv4_unreg(struct net *net, const struct nf_hook_ops *ops)
 {
-	nf_ct_netns_put(ctx->net, ctx->family);
+	nf_unregister_net_hook(net, ops);
+	nf_ct_netns_put(net, NFPROTO_IPV4);
 }
 
 static const struct nft_chain_type nft_chain_nat_ipv4 = {
@@ -91,8 +98,8 @@ static const struct nft_chain_type nft_chain_nat_ipv4 = {
 		[NF_INET_LOCAL_OUT]	= nft_nat_ipv4_local_fn,
 		[NF_INET_LOCAL_IN]	= nft_nat_ipv4_fn,
 	},
-	.init		= nft_nat_ipv4_init,
-	.free		= nft_nat_ipv4_free,
+	.ops_register = nft_nat_ipv4_reg,
+	.ops_unregister = nft_nat_ipv4_unreg,
 };
 
 static int __init nft_chain_nat_init(void)
diff --git a/net/ipv6/netfilter/nft_chain_nat_ipv6.c b/net/ipv6/netfilter/nft_chain_nat_ipv6.c
index 100a6bd1046a..05bcb2c23125 100644
--- a/net/ipv6/netfilter/nft_chain_nat_ipv6.c
+++ b/net/ipv6/netfilter/nft_chain_nat_ipv6.c
@@ -64,14 +64,22 @@ static unsigned int nft_nat_ipv6_local_fn(void *priv,
 	return nf_nat_ipv6_local_fn(priv, skb, state, nft_nat_do_chain);
 }
 
-static int nft_nat_ipv6_init(struct nft_ctx *ctx)
+static int nft_nat_ipv6_reg(struct net *net, const struct nf_hook_ops *ops)
 {
-	return nf_ct_netns_get(ctx->net, ctx->family);
+	int ret = nf_register_net_hook(net, ops);
+	if (ret == 0) {
+		ret = nf_ct_netns_get(net, NFPROTO_IPV6);
+		if (ret)
+			 nf_unregister_net_hook(net, ops);
+	}
+
+	return ret;
 }
 
-static void nft_nat_ipv6_free(struct nft_ctx *ctx)
+static void nft_nat_ipv6_unreg(struct net *net, const struct nf_hook_ops *ops)
 {
-	nf_ct_netns_put(ctx->net, ctx->family);
+	nf_unregister_net_hook(net, ops);
+	nf_ct_netns_put(net, NFPROTO_IPV6);
 }
 
 static const struct nft_chain_type nft_chain_nat_ipv6 = {
@@ -89,8 +97,8 @@ static const struct nft_chain_type nft_chain_nat_ipv6 = {
 		[NF_INET_LOCAL_OUT]	= nft_nat_ipv6_local_fn,
 		[NF_INET_LOCAL_IN]	= nft_nat_ipv6_fn,
 	},
-	.init		= nft_nat_ipv6_init,
-	.free		= nft_nat_ipv6_free,
+	.ops_register		= nft_nat_ipv6_reg,
+	.ops_unregister		= nft_nat_ipv6_unreg,
 };
 
 static int __init nft_chain_nat_ipv6_init(void)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 18bd584fadda..ded54b2abfbc 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -129,6 +129,7 @@ static int nf_tables_register_hook(struct net *net,
 				   const struct nft_table *table,
 				   struct nft_chain *chain)
 {
+	const struct nft_base_chain *basechain;
 	struct nf_hook_ops *ops;
 	int ret;
 
@@ -136,7 +137,12 @@ static int nf_tables_register_hook(struct net *net,
 	    !nft_is_base_chain(chain))
 		return 0;
 
-	ops = &nft_base_chain(chain)->ops;
+	basechain = nft_base_chain(chain);
+	ops = &basechain->ops;
+
+	if (basechain->type->ops_register)
+		return basechain->type->ops_register(net, ops);
+
 	ret = nf_register_net_hook(net, ops);
 	if (ret == -EBUSY && nf_tables_allow_nat_conflict(net, ops)) {
 		ops->nat_hook = false;
@@ -151,11 +157,19 @@ static void nf_tables_unregister_hook(struct net *net,
 				      const struct nft_table *table,
 				      struct nft_chain *chain)
 {
+	const struct nft_base_chain *basechain;
+	const struct nf_hook_ops *ops;
+
 	if (table->flags & NFT_TABLE_F_DORMANT ||
 	    !nft_is_base_chain(chain))
 		return;
+	basechain = nft_base_chain(chain);
+	ops = &basechain->ops;
+
+	if (basechain->type->ops_unregister)
+		return basechain->type->ops_unregister(net, ops);
 
-	nf_unregister_net_hook(net, &nft_base_chain(chain)->ops);
+	nf_unregister_net_hook(net, ops);
 }
 
 static int nft_trans_table_add(struct nft_ctx *ctx, int msg_type)
@@ -1262,8 +1276,6 @@ static void nf_tables_chain_destroy(struct nft_ctx *ctx)
 	if (nft_is_base_chain(chain)) {
 		struct nft_base_chain *basechain = nft_base_chain(chain);
 
-		if (basechain->type->free)
-			basechain->type->free(ctx);
 		module_put(basechain->type->owner);
 		free_percpu(basechain->stats);
 		if (basechain->stats)
@@ -1396,9 +1408,6 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
 		}
 
 		basechain->type = hook.type;
-		if (basechain->type->init)
-			basechain->type->init(ctx);
-
 		chain = &basechain->chain;
 
 		ops		= &basechain->ops;
-- 
2.11.0

* [PATCH 08/18] netfilter: core: export raw versions of add/delete hook functions
  2018-05-23 18:42 [PATCH 00/18] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (6 preceding siblings ...)
  2018-05-23 18:42 ` [PATCH 07/18] netfilter: nf_tables: allow chain type to override hook register Pablo Neira Ayuso
@ 2018-05-23 18:42 ` Pablo Neira Ayuso
  2018-05-23 18:42 ` [PATCH 09/18] netfilter: nf_nat: add nat hook register functions to nf_nat Pablo Neira Ayuso
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Florian Westphal <fw@strlen.de>

This will allow the nat core to reuse the nf_hook infrastructure
to maintain nat lookup functions.

The raw versions don't assume a particular hook location.  Instead, the
functions are added to or deleted from the hook blob that the caller
passes in.
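
A simplified userspace model of "operate on whatever blob the caller
passes in" follows.  The model_* types are assumptions made for this
sketch.  Unlike the kernel code, it does no RCU handling and it shrinks
the array in place on delete instead of first swapping in a dummy hook.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for a hook entry and a hook blob. */
struct model_entry { int priority; int id; };

struct model_blob {
	unsigned int num;
	struct model_entry e[];
};

/* Insert reg into the blob at *pp, keeping entries sorted by priority.
 * A new blob is built and published through *pp; the old one is freed.
 */
static int model_insert_raw(struct model_blob **pp,
			    const struct model_entry *reg)
{
	struct model_blob *old = *pp, *nb;
	unsigned int n = old ? old->num : 0, i, j = 0;
	int placed = 0;

	nb = malloc(sizeof(*nb) + (n + 1) * sizeof(nb->e[0]));
	if (!nb)
		return -1;

	for (i = 0; i < n; i++) {
		if (!placed && reg->priority < old->e[i].priority) {
			nb->e[j++] = *reg;
			placed = 1;
		}
		nb->e[j++] = old->e[i];
	}
	if (!placed)
		nb->e[j++] = *reg;

	nb->num = j;
	*pp = nb;
	free(old);
	return 0;
}

/* Remove the entry with the given id; free the blob once it is empty. */
static void model_delete_raw(struct model_blob **pp, int id)
{
	struct model_blob *p = *pp;
	unsigned int i;

	if (!p)
		return;

	for (i = 0; i < p->num; i++) {
		if (p->e[i].id != id)
			continue;
		memmove(&p->e[i], &p->e[i + 1],
			(p->num - i - 1) * sizeof(p->e[0]));
		p->num--;
		break;
	}

	if (p->num == 0) {
		free(p);
		*pp = NULL;
	}
}
```

Because the blob location is a parameter, the same two helpers can serve
any owner of a hook array, which is what lets the nat core keep its own
private list of lookup functions.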

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/core.c         | 75 +++++++++++++++++++++++++++++++-------------
 net/netfilter/nf_internals.h |  5 +++
 2 files changed, 59 insertions(+), 21 deletions(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 0f6b8172fb9a..5f0ebf9a8d5b 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -186,9 +186,31 @@ static void hooks_validate(const struct nf_hook_entries *hooks)
 #endif
 }
 
+int nf_hook_entries_insert_raw(struct nf_hook_entries __rcu **pp,
+				const struct nf_hook_ops *reg)
+{
+	struct nf_hook_entries *new_hooks;
+	struct nf_hook_entries *p;
+
+	p = rcu_dereference_raw(*pp);
+	new_hooks = nf_hook_entries_grow(p, reg);
+	if (IS_ERR(new_hooks))
+		return PTR_ERR(new_hooks);
+
+	hooks_validate(new_hooks);
+
+	rcu_assign_pointer(*pp, new_hooks);
+
+	BUG_ON(p == new_hooks);
+	nf_hook_entries_free(p);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nf_hook_entries_insert_raw);
+
 /*
  * __nf_hook_entries_try_shrink - try to shrink hook array
  *
+ * @old -- current hook blob at @pp
  * @pp -- location of hook blob
  *
  * Hook unregistration must always succeed, so to-be-removed hooks
@@ -201,14 +223,14 @@ static void hooks_validate(const struct nf_hook_entries *hooks)
  *
  * Returns address to free, or NULL.
  */
-static void *__nf_hook_entries_try_shrink(struct nf_hook_entries __rcu **pp)
+static void *__nf_hook_entries_try_shrink(struct nf_hook_entries *old,
+					  struct nf_hook_entries __rcu **pp)
 {
-	struct nf_hook_entries *old, *new = NULL;
 	unsigned int i, j, skip = 0, hook_entries;
+	struct nf_hook_entries *new = NULL;
 	struct nf_hook_ops **orig_ops;
 	struct nf_hook_ops **new_ops;
 
-	old = nf_entry_dereference(*pp);
 	if (WARN_ON_ONCE(!old))
 		return NULL;
 
@@ -347,11 +369,10 @@ static int __nf_register_net_hook(struct net *net, int pf,
  * This cannot fail, hook unregistration must always succeed.
  * Therefore replace the to-be-removed hook with a dummy hook.
  */
-static void nf_remove_net_hook(struct nf_hook_entries *old,
-			       const struct nf_hook_ops *unreg, int pf)
+static bool nf_remove_net_hook(struct nf_hook_entries *old,
+			       const struct nf_hook_ops *unreg)
 {
 	struct nf_hook_ops **orig_ops;
-	bool found = false;
 	unsigned int i;
 
 	orig_ops = nf_hook_entries_get_hook_ops(old);
@@ -360,21 +381,10 @@ static void nf_remove_net_hook(struct nf_hook_entries *old,
 			continue;
 		WRITE_ONCE(old->hooks[i].hook, accept_all);
 		WRITE_ONCE(orig_ops[i], &dummy_ops);
-		found = true;
-		break;
+		return true;
 	}
 
-	if (found) {
-#ifdef CONFIG_NETFILTER_INGRESS
-		if (pf == NFPROTO_NETDEV && unreg->hooknum == NF_NETDEV_INGRESS)
-			net_dec_ingress_queue();
-#endif
-#ifdef HAVE_JUMP_LABEL
-		static_key_slow_dec(&nf_hooks_needed[pf][unreg->hooknum]);
-#endif
-	} else {
-		WARN_ONCE(1, "hook not found, pf %d num %d", pf, unreg->hooknum);
-	}
+	return false;
 }
 
 static void __nf_unregister_net_hook(struct net *net, int pf,
@@ -395,9 +405,19 @@ static void __nf_unregister_net_hook(struct net *net, int pf,
 		return;
 	}
 
-	nf_remove_net_hook(p, reg, pf);
+	if (nf_remove_net_hook(p, reg)) {
+#ifdef CONFIG_NETFILTER_INGRESS
+		if (pf == NFPROTO_NETDEV && reg->hooknum == NF_NETDEV_INGRESS)
+			net_dec_ingress_queue();
+#endif
+#ifdef HAVE_JUMP_LABEL
+		static_key_slow_dec(&nf_hooks_needed[pf][reg->hooknum]);
+#endif
+	} else {
+		WARN_ONCE(1, "hook not found, pf %d num %d", pf, reg->hooknum);
+	}
 
-	p = __nf_hook_entries_try_shrink(pp);
+	p = __nf_hook_entries_try_shrink(p, pp);
 	mutex_unlock(&nf_hook_mutex);
 	if (!p)
 		return;
@@ -417,6 +437,19 @@ void nf_unregister_net_hook(struct net *net, const struct nf_hook_ops *reg)
 }
 EXPORT_SYMBOL(nf_unregister_net_hook);
 
+void nf_hook_entries_delete_raw(struct nf_hook_entries __rcu **pp,
+				const struct nf_hook_ops *reg)
+{
+	struct nf_hook_entries *p;
+
+	p = rcu_dereference_raw(*pp);
+	if (nf_remove_net_hook(p, reg)) {
+		p = __nf_hook_entries_try_shrink(p, pp);
+		nf_hook_entries_free(p);
+	}
+}
+EXPORT_SYMBOL_GPL(nf_hook_entries_delete_raw);
+
 int nf_register_net_hook(struct net *net, const struct nf_hook_ops *reg)
 {
 	int err;
diff --git a/net/netfilter/nf_internals.h b/net/netfilter/nf_internals.h
index 18f6d7ae995b..e15779fd58e3 100644
--- a/net/netfilter/nf_internals.h
+++ b/net/netfilter/nf_internals.h
@@ -15,4 +15,9 @@ void nf_queue_nf_hook_drop(struct net *net);
 /* nf_log.c */
 int __init netfilter_log_init(void);
 
+/* core.c */
+void nf_hook_entries_delete_raw(struct nf_hook_entries __rcu **pp,
+				const struct nf_hook_ops *reg);
+int nf_hook_entries_insert_raw(struct nf_hook_entries __rcu **pp,
+				const struct nf_hook_ops *reg);
 #endif
-- 
2.11.0

* [PATCH 09/18] netfilter: nf_nat: add nat hook register functions to nf_nat
  2018-05-23 18:42 [PATCH 00/18] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (7 preceding siblings ...)
  2018-05-23 18:42 ` [PATCH 08/18] netfilter: core: export raw versions of add/delete hook functions Pablo Neira Ayuso
@ 2018-05-23 18:42 ` Pablo Neira Ayuso
  2018-05-23 18:42 ` [PATCH 10/18] netfilter: nf_nat: add nat type hooks to nat core Pablo Neira Ayuso
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Florian Westphal <fw@strlen.de>

This adds the infrastructure to register nat hooks with the nat core
instead of the netfilter core.

nat hooks are used to configure nat bindings.  Such hooks are registered
from ip(6)table_nat or by the nftables core when a nat chain is added.

After the next patch, nat hooks will be registered with nf_nat instead
of the netfilter core.  This allows many nat lookup functions to be used
at the same time, while the real packet rewrite (nat transformation) is
done in one place.

To ease review, this change does not convert the intended users yet.
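
The user counting done by the register/unregister pair can be sketched
in userspace as below.  This is a deliberately reduced model with
hypothetical names: it leaves out locking, memory allocation and the
per-hook lookup lists, and keeps only the rule that the first user sets
up the base hooks and the last user tears them down.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-family state, loosely modelled on nf_nat_hooks_net. */
struct model_nat_net {
	int base_hooks_active; /* stands in for nat_hook_ops != NULL */
	unsigned int users;    /* number of registered lookup functions */
};

/* The first user brings up the base hooks; every user bumps the count. */
static int model_nat_register(struct model_nat_net *n)
{
	assert(n != NULL);

	if (!n->base_hooks_active)
		n->base_hooks_active = 1;

	n->users++;
	return 0;
}

/* The last user going away tears the base hooks down again. */
static void model_nat_unregister(struct model_nat_net *n)
{
	assert(n != NULL);

	if (n->users == 0)
		return; /* unbalanced call, nothing to undo */

	if (--n->users == 0)
		n->base_hooks_active = 0;
}
```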

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_nat.h |   4 ++
 net/netfilter/nf_nat_core.c    | 157 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 161 insertions(+)

diff --git a/include/net/netfilter/nf_nat.h b/include/net/netfilter/nf_nat.h
index da3d601cadee..a17eb2f8d40e 100644
--- a/include/net/netfilter/nf_nat.h
+++ b/include/net/netfilter/nf_nat.h
@@ -75,4 +75,8 @@ static inline bool nf_nat_oif_changed(unsigned int hooknum,
 #endif
 }
 
+int nf_nat_register_fn(struct net *net, const struct nf_hook_ops *ops,
+		       const struct nf_hook_ops *nat_ops, unsigned int ops_count);
+void nf_nat_unregister_fn(struct net *net, const struct nf_hook_ops *ops,
+			  unsigned int ops_count);
 #endif
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index 0cd503aacbf0..f531d77dd684 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -32,6 +32,8 @@
 #include <net/netfilter/nf_conntrack_zones.h>
 #include <linux/netfilter/nf_nat.h>
 
+#include "nf_internals.h"
+
 static spinlock_t nf_nat_locks[CONNTRACK_LOCKS];
 
 static DEFINE_MUTEX(nf_nat_proto_mutex);
@@ -39,11 +41,27 @@ static const struct nf_nat_l3proto __rcu *nf_nat_l3protos[NFPROTO_NUMPROTO]
 						__read_mostly;
 static const struct nf_nat_l4proto __rcu **nf_nat_l4protos[NFPROTO_NUMPROTO]
 						__read_mostly;
+static unsigned int nat_net_id __read_mostly;
 
 static struct hlist_head *nf_nat_bysource __read_mostly;
 static unsigned int nf_nat_htable_size __read_mostly;
 static unsigned int nf_nat_hash_rnd __read_mostly;
 
+struct nf_nat_lookup_hook_priv {
+	struct nf_hook_entries __rcu *entries;
+
+	struct rcu_head rcu_head;
+};
+
+struct nf_nat_hooks_net {
+	struct nf_hook_ops *nat_hook_ops;
+	unsigned int users;
+};
+
+struct nat_net {
+	struct nf_nat_hooks_net nat_proto_net[NFPROTO_NUMPROTO];
+};
+
 inline const struct nf_nat_l3proto *
 __nf_nat_l3proto_find(u8 family)
 {
@@ -871,6 +889,138 @@ static struct nf_ct_helper_expectfn follow_master_nat = {
 	.expectfn	= nf_nat_follow_master,
 };
 
+int nf_nat_register_fn(struct net *net, const struct nf_hook_ops *ops,
+		       const struct nf_hook_ops *orig_nat_ops, unsigned int ops_count)
+{
+	struct nat_net *nat_net = net_generic(net, nat_net_id);
+	struct nf_nat_hooks_net *nat_proto_net;
+	struct nf_nat_lookup_hook_priv *priv;
+	unsigned int hooknum = ops->hooknum;
+	struct nf_hook_ops *nat_ops;
+	int i, ret;
+
+	if (WARN_ON_ONCE(ops->pf >= ARRAY_SIZE(nat_net->nat_proto_net)))
+		return -EINVAL;
+
+	nat_proto_net = &nat_net->nat_proto_net[ops->pf];
+
+	for (i = 0; i < ops_count; i++) {
+		if (WARN_ON(orig_nat_ops[i].pf != ops->pf))
+			return -EINVAL;
+		if (orig_nat_ops[i].hooknum == hooknum) {
+			hooknum = i;
+			break;
+		}
+	}
+
+	if (WARN_ON_ONCE(i == ops_count))
+		return -EINVAL;
+
+	mutex_lock(&nf_nat_proto_mutex);
+	if (!nat_proto_net->nat_hook_ops) {
+		WARN_ON(nat_proto_net->users != 0);
+
+		nat_ops = kmemdup(orig_nat_ops, sizeof(*orig_nat_ops) * ops_count, GFP_KERNEL);
+		if (!nat_ops) {
+			mutex_unlock(&nf_nat_proto_mutex);
+			return -ENOMEM;
+		}
+
+		for (i = 0; i < ops_count; i++) {
+			priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+			if (priv) {
+				nat_ops[i].priv = priv;
+				continue;
+			}
+			mutex_unlock(&nf_nat_proto_mutex);
+			while (i)
+				kfree(nat_ops[--i].priv);
+			kfree(nat_ops);
+			return -ENOMEM;
+		}
+
+		ret = nf_register_net_hooks(net, nat_ops, ops_count);
+		if (ret < 0) {
+			mutex_unlock(&nf_nat_proto_mutex);
+			for (i = 0; i < ops_count; i++)
+				kfree(nat_ops[i].priv);
+			kfree(nat_ops);
+			return ret;
+		}
+
+		nat_proto_net->nat_hook_ops = nat_ops;
+	}
+
+	nat_ops = nat_proto_net->nat_hook_ops;
+	priv = nat_ops[hooknum].priv;
+	if (WARN_ON_ONCE(!priv)) {
+		mutex_unlock(&nf_nat_proto_mutex);
+		return -EOPNOTSUPP;
+	}
+
+	ret = nf_hook_entries_insert_raw(&priv->entries, ops);
+	if (ret == 0)
+		nat_proto_net->users++;
+
+	mutex_unlock(&nf_nat_proto_mutex);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(nf_nat_register_fn);
+
+void nf_nat_unregister_fn(struct net *net, const struct nf_hook_ops *ops,
+		          unsigned int ops_count)
+{
+	struct nat_net *nat_net = net_generic(net, nat_net_id);
+	struct nf_nat_hooks_net *nat_proto_net;
+	struct nf_nat_lookup_hook_priv *priv;
+	struct nf_hook_ops *nat_ops;
+	int hooknum = ops->hooknum;
+	int i;
+
+	if (ops->pf >= ARRAY_SIZE(nat_net->nat_proto_net))
+		return;
+
+	nat_proto_net = &nat_net->nat_proto_net[ops->pf];
+
+	mutex_lock(&nf_nat_proto_mutex);
+	if (WARN_ON(nat_proto_net->users == 0))
+		goto unlock;
+
+	nat_proto_net->users--;
+
+	nat_ops = nat_proto_net->nat_hook_ops;
+	for (i = 0; i < ops_count; i++) {
+		if (nat_ops[i].hooknum == hooknum) {
+			hooknum = i;
+			break;
+		}
+	}
+	if (WARN_ON_ONCE(i == ops_count))
+		goto unlock;
+	priv = nat_ops[hooknum].priv;
+	nf_hook_entries_delete_raw(&priv->entries, ops);
+
+	if (nat_proto_net->users == 0) {
+		nf_unregister_net_hooks(net, nat_ops, ops_count);
+
+		for (i = 0; i < ops_count; i++) {
+			priv = nat_ops[i].priv;
+			kfree_rcu(priv, rcu_head);
+		}
+
+		nat_proto_net->nat_hook_ops = NULL;
+		kfree(nat_ops);
+	}
+unlock:
+	mutex_unlock(&nf_nat_proto_mutex);
+}
+EXPORT_SYMBOL_GPL(nf_nat_unregister_fn);
+
+static struct pernet_operations nat_net_ops = {
+	.id = &nat_net_id,
+	.size = sizeof(struct nat_net),
+};
+
 static int __init nf_nat_init(void)
 {
 	int ret, i;
@@ -894,6 +1044,12 @@ static int __init nf_nat_init(void)
 	for (i = 0; i < CONNTRACK_LOCKS; i++)
 		spin_lock_init(&nf_nat_locks[i]);
 
+	ret = register_pernet_subsys(&nat_net_ops);
+	if (ret < 0) {
+		nf_ct_extend_unregister(&nat_extend);
+		return ret;
+	}
+
 	nf_ct_helper_expectfn_register(&follow_master_nat);
 
 	BUG_ON(nfnetlink_parse_nat_setup_hook != NULL);
@@ -925,6 +1081,7 @@ static void __exit nf_nat_cleanup(void)
 		kfree(nf_nat_l4protos[i]);
 	synchronize_net();
 	nf_ct_free_hashtable(nf_nat_bysource, nf_nat_htable_size);
+	unregister_pernet_subsys(&nat_net_ops);
 }
 
 MODULE_LICENSE("GPL");
-- 
2.11.0

* [PATCH 10/18] netfilter: nf_nat: add nat type hooks to nat core
  2018-05-23 18:42 [PATCH 00/18] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (8 preceding siblings ...)
  2018-05-23 18:42 ` [PATCH 09/18] netfilter: nf_nat: add nat hook register functions to nf_nat Pablo Neira Ayuso
@ 2018-05-23 18:42 ` Pablo Neira Ayuso
  2018-05-23 18:42 ` [PATCH 11/18] netfilter: lift one-nat-hook-only restriction Pablo Neira Ayuso
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Florian Westphal <fw@strlen.de>

Currently the packet rewrite and instantiation of nat NULL bindings
happen from the protocol specific nat backend.

Invocation occurs either via ip(6)table_nat or the nf_tables nat chain type.

Invocation looks like this (simplified):
NF_HOOK()
   |
   `---iptable_nat
	 |
	 `---> nf_nat_l3proto_ipv4 -> nf_nat_packet
	               |
          new packet? pass skb through iptables nat chain
                       |
		       `---> iptable_nat: ipt_do_table

In nft case, this looks the same (nft_chain_nat_ipv4 instead of
iptable_nat).

This is a problem for two reasons:
1. Can't use iptables nat and nf_tables nat at the same time,
   as the first user adds a nat binding (nf_nat_l3proto_ipv4 adds a
   NULL binding if do_table() did not find a matching nat rule so we
   can detect post-nat tuple collisions).
2. If you use e.g. nft_masq, snat, redir, etc., you must also register
   an empty base chain so that the nat core gets called from NF_HOOK()
   to do the reverse translation, which is neither obvious nor user
   friendly.

After this change, the base hook gets registered not from iptable_nat or
nftables nat hooks, but from the l3 nat core.

iptables/nft nat base hooks get registered with the nat core instead:

NF_HOOK()
   |
   `---> nf_nat_l3proto_ipv4 -> nf_nat_packet
		|
         new packet? pass skb through iptables/nftables nat chains
                |
		+-> iptables_nat: ipt_do_table
	        +-> nft nat chain x
	        `-> nft nat chain y

The nat core deals with null bindings and reverse translation.
When no mapping exists, it calls the registered nat lookup hooks until
one creates a new mapping.
If both iptables and nftables nat hooks exist, the first matching
one is used (i.e., higher priority wins).

Also, nft users do not need to create empty nat hooks anymore,
nat core always registers the base hooks that take care of reverse/reply
translation.
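
The lookup order described above can be sketched as a small userspace
model.  Everything named model_* or lookup_* here is hypothetical; the
sketch only assumes what the text states: lookup hooks run in priority
order until one creates a mapping, and a null binding is set up if none
of them does.

```c
#include <assert.h>
#include <stddef.h>

enum { MODEL_DROP = 0, MODEL_ACCEPT = 1 };

/* Hypothetical connection state; mapped_by 0 means "null binding". */
struct model_conn {
	int nat_initialized;
	int mapped_by;
};

typedef int (*model_lookup_fn)(struct model_conn *ct);

/* Run the lookup hooks in order until one sets up a mapping.  If none
 * does, fall back to a null binding so the connection is still tracked.
 */
static int model_nat_core(struct model_conn *ct,
			  const model_lookup_fn *hooks, size_t n)
{
	size_t i;

	assert(ct != NULL);

	if (!ct->nat_initialized) {
		for (i = 0; i < n; i++) {
			int ret = hooks[i](ct);

			if (ret != MODEL_ACCEPT)
				return ret;
			if (ct->nat_initialized)
				break;
		}
		if (!ct->nat_initialized) {
			ct->nat_initialized = 1;
			ct->mapped_by = 0;
		}
	}

	/* The actual packet rewrite would happen here, in one place. */
	return MODEL_ACCEPT;
}

/* A ruleset with no matching nat rule. */
static int lookup_no_match(struct model_conn *ct)
{
	(void)ct;
	return MODEL_ACCEPT;
}

/* Stand-in for an iptables nat table that matches. */
static int lookup_iptables(struct model_conn *ct)
{
	ct->nat_initialized = 1;
	ct->mapped_by = 1;
	return MODEL_ACCEPT;
}

/* Stand-in for an nft nat chain that matches. */
static int lookup_nft(struct model_conn *ct)
{
	ct->nat_initialized = 1;
	ct->mapped_by = 2;
	return MODEL_ACCEPT;
}
```

With the hooks ordered as { lookup_no_match, lookup_iptables,
lookup_nft }, the model stops at lookup_iptables and never calls
lookup_nft, which matches the "first matching one is used" rule.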

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_nat_core.h      |  5 +-
 include/net/netfilter/nf_nat_l3proto.h   | 52 ++-----------------
 net/ipv4/netfilter/iptable_nat.c         | 85 ++++++++++++++++----------------
 net/ipv4/netfilter/nf_nat_l3proto_ipv4.c | 82 ++++++++++++++++++++----------
 net/ipv4/netfilter/nft_chain_nat_ipv4.c  | 51 +++----------------
 net/ipv6/netfilter/ip6table_nat.c        | 84 +++++++++++++++----------------
 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c | 83 ++++++++++++++++++++-----------
 net/ipv6/netfilter/nft_chain_nat_ipv6.c  | 48 +++---------------
 net/netfilter/nf_nat_core.c              | 31 +++++++-----
 9 files changed, 232 insertions(+), 289 deletions(-)

diff --git a/include/net/netfilter/nf_nat_core.h b/include/net/netfilter/nf_nat_core.h
index 0d84dd29108d..c78e9be14b3d 100644
--- a/include/net/netfilter/nf_nat_core.h
+++ b/include/net/netfilter/nf_nat_core.h
@@ -13,10 +13,7 @@ unsigned int nf_nat_packet(struct nf_conn *ct, enum ip_conntrack_info ctinfo,
 
 unsigned int
 nf_nat_inet_fn(void *priv, struct sk_buff *skb,
-	       const struct nf_hook_state *state,
-	       unsigned int (*do_chain)(void *priv,
-					struct sk_buff *skb,
-					const struct nf_hook_state *state));
+	       const struct nf_hook_state *state);
 
 int nf_xfrm_me_harder(struct net *net, struct sk_buff *skb, unsigned int family);
 
diff --git a/include/net/netfilter/nf_nat_l3proto.h b/include/net/netfilter/nf_nat_l3proto.h
index 8bad2560576f..d300b8f03972 100644
--- a/include/net/netfilter/nf_nat_l3proto.h
+++ b/include/net/netfilter/nf_nat_l3proto.h
@@ -44,58 +44,14 @@ int nf_nat_icmp_reply_translation(struct sk_buff *skb, struct nf_conn *ct,
 				  enum ip_conntrack_info ctinfo,
 				  unsigned int hooknum);
 
-unsigned int nf_nat_ipv4_in(void *priv, struct sk_buff *skb,
-			    const struct nf_hook_state *state,
-			    unsigned int (*do_chain)(void *priv,
-						     struct sk_buff *skb,
-						     const struct nf_hook_state *state));
-
-unsigned int nf_nat_ipv4_out(void *priv, struct sk_buff *skb,
-			     const struct nf_hook_state *state,
-			     unsigned int (*do_chain)(void *priv,
-						      struct sk_buff *skb,
-						      const struct nf_hook_state *state));
-
-unsigned int nf_nat_ipv4_local_fn(void *priv,
-				  struct sk_buff *skb,
-				  const struct nf_hook_state *state,
-				  unsigned int (*do_chain)(void *priv,
-							   struct sk_buff *skb,
-							   const struct nf_hook_state *state));
-
-unsigned int nf_nat_ipv4_fn(void *priv, struct sk_buff *skb,
-			    const struct nf_hook_state *state,
-			    unsigned int (*do_chain)(void *priv,
-						     struct sk_buff *skb,
-						     const struct nf_hook_state *state));
-
 int nf_nat_icmpv6_reply_translation(struct sk_buff *skb, struct nf_conn *ct,
 				    enum ip_conntrack_info ctinfo,
 				    unsigned int hooknum, unsigned int hdrlen);
 
-unsigned int nf_nat_ipv6_in(void *priv, struct sk_buff *skb,
-			    const struct nf_hook_state *state,
-			    unsigned int (*do_chain)(void *priv,
-						     struct sk_buff *skb,
-						     const struct nf_hook_state *state));
-
-unsigned int nf_nat_ipv6_out(void *priv, struct sk_buff *skb,
-			     const struct nf_hook_state *state,
-			     unsigned int (*do_chain)(void *priv,
-						      struct sk_buff *skb,
-						      const struct nf_hook_state *state));
-
-unsigned int nf_nat_ipv6_local_fn(void *priv,
-				  struct sk_buff *skb,
-				  const struct nf_hook_state *state,
-				  unsigned int (*do_chain)(void *priv,
-							   struct sk_buff *skb,
-							   const struct nf_hook_state *state));
+int nf_nat_l3proto_ipv4_register_fn(struct net *net, const struct nf_hook_ops *ops);
+void nf_nat_l3proto_ipv4_unregister_fn(struct net *net, const struct nf_hook_ops *ops);
 
-unsigned int nf_nat_ipv6_fn(void *priv, struct sk_buff *skb,
-			    const struct nf_hook_state *state,
-			    unsigned int (*do_chain)(void *priv,
-						     struct sk_buff *skb,
-						     const struct nf_hook_state *state));
+int nf_nat_l3proto_ipv6_register_fn(struct net *net, const struct nf_hook_ops *ops);
+void nf_nat_l3proto_ipv6_unregister_fn(struct net *net, const struct nf_hook_ops *ops);
 
 #endif /* _NF_NAT_L3PROTO_H */
diff --git a/net/ipv4/netfilter/iptable_nat.c b/net/ipv4/netfilter/iptable_nat.c
index 529d89ec31e8..a317445448bf 100644
--- a/net/ipv4/netfilter/iptable_nat.c
+++ b/net/ipv4/netfilter/iptable_nat.c
@@ -38,69 +38,58 @@ static unsigned int iptable_nat_do_chain(void *priv,
 	return ipt_do_table(skb, state, state->net->ipv4.nat_table);
 }
 
-static unsigned int iptable_nat_ipv4_fn(void *priv,
-					struct sk_buff *skb,
-					const struct nf_hook_state *state)
-{
-	return nf_nat_ipv4_fn(priv, skb, state, iptable_nat_do_chain);
-}
-
-static unsigned int iptable_nat_ipv4_in(void *priv,
-					struct sk_buff *skb,
-					const struct nf_hook_state *state)
-{
-	return nf_nat_ipv4_in(priv, skb, state, iptable_nat_do_chain);
-}
-
-static unsigned int iptable_nat_ipv4_out(void *priv,
-					 struct sk_buff *skb,
-					 const struct nf_hook_state *state)
-{
-	return nf_nat_ipv4_out(priv, skb, state, iptable_nat_do_chain);
-}
-
-static unsigned int iptable_nat_ipv4_local_fn(void *priv,
-					      struct sk_buff *skb,
-					      const struct nf_hook_state *state)
-{
-	return nf_nat_ipv4_local_fn(priv, skb, state, iptable_nat_do_chain);
-}
-
 static const struct nf_hook_ops nf_nat_ipv4_ops[] = {
-	/* Before packet filtering, change destination */
 	{
-		.hook		= iptable_nat_ipv4_in,
+		.hook		= iptable_nat_do_chain,
 		.pf		= NFPROTO_IPV4,
-		.nat_hook	= true,
 		.hooknum	= NF_INET_PRE_ROUTING,
 		.priority	= NF_IP_PRI_NAT_DST,
 	},
-	/* After packet filtering, change source */
 	{
-		.hook		= iptable_nat_ipv4_out,
+		.hook		= iptable_nat_do_chain,
 		.pf		= NFPROTO_IPV4,
-		.nat_hook	= true,
 		.hooknum	= NF_INET_POST_ROUTING,
 		.priority	= NF_IP_PRI_NAT_SRC,
 	},
-	/* Before packet filtering, change destination */
 	{
-		.hook		= iptable_nat_ipv4_local_fn,
+		.hook		= iptable_nat_do_chain,
 		.pf		= NFPROTO_IPV4,
-		.nat_hook	= true,
 		.hooknum	= NF_INET_LOCAL_OUT,
 		.priority	= NF_IP_PRI_NAT_DST,
 	},
-	/* After packet filtering, change source */
 	{
-		.hook		= iptable_nat_ipv4_fn,
+		.hook		= iptable_nat_do_chain,
 		.pf		= NFPROTO_IPV4,
-		.nat_hook	= true,
 		.hooknum	= NF_INET_LOCAL_IN,
 		.priority	= NF_IP_PRI_NAT_SRC,
 	},
 };
 
+static int ipt_nat_register_lookups(struct net *net)
+{
+	int i, ret;
+
+	for (i = 0; i < ARRAY_SIZE(nf_nat_ipv4_ops); i++) {
+		ret = nf_nat_l3proto_ipv4_register_fn(net, &nf_nat_ipv4_ops[i]);
+		if (ret) {
+			while (i)
+				nf_nat_l3proto_ipv4_unregister_fn(net, &nf_nat_ipv4_ops[--i]);
+
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+static void ipt_nat_unregister_lookups(struct net *net)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(nf_nat_ipv4_ops); i++)
+		nf_nat_l3proto_ipv4_unregister_fn(net, &nf_nat_ipv4_ops[i]);
+}
+
 static int __net_init iptable_nat_table_init(struct net *net)
 {
 	struct ipt_replace *repl;
@@ -113,7 +102,18 @@ static int __net_init iptable_nat_table_init(struct net *net)
 	if (repl == NULL)
 		return -ENOMEM;
 	ret = ipt_register_table(net, &nf_nat_ipv4_table, repl,
-				 nf_nat_ipv4_ops, &net->ipv4.nat_table);
+				 NULL, &net->ipv4.nat_table);
+	if (ret < 0) {
+		kfree(repl);
+		return ret;
+	}
+
+	ret = ipt_nat_register_lookups(net);
+	if (ret < 0) {
+		ipt_unregister_table(net, net->ipv4.nat_table, NULL);
+		net->ipv4.nat_table = NULL;
+	}
+
 	kfree(repl);
 	return ret;
 }
@@ -122,7 +122,8 @@ static void __net_exit iptable_nat_net_exit(struct net *net)
 {
 	if (!net->ipv4.nat_table)
 		return;
-	ipt_unregister_table(net, net->ipv4.nat_table, nf_nat_ipv4_ops);
+	ipt_nat_unregister_lookups(net);
+	ipt_unregister_table(net, net->ipv4.nat_table, NULL);
 	net->ipv4.nat_table = NULL;
 }
 
diff --git a/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c b/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
index 29b5aceac66d..6115bf1ff6f0 100644
--- a/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
@@ -241,12 +241,9 @@ int nf_nat_icmp_reply_translation(struct sk_buff *skb,
 }
 EXPORT_SYMBOL_GPL(nf_nat_icmp_reply_translation);
 
-unsigned int
+static unsigned int
 nf_nat_ipv4_fn(void *priv, struct sk_buff *skb,
-	       const struct nf_hook_state *state,
-	       unsigned int (*do_chain)(void *priv,
-					struct sk_buff *skb,
-					const struct nf_hook_state *state))
+	       const struct nf_hook_state *state)
 {
 	struct nf_conn *ct;
 	enum ip_conntrack_info ctinfo;
@@ -265,35 +262,28 @@ nf_nat_ipv4_fn(void *priv, struct sk_buff *skb,
 		}
 	}
 
-	return nf_nat_inet_fn(priv, skb, state, do_chain);
+	return nf_nat_inet_fn(priv, skb, state);
 }
 EXPORT_SYMBOL_GPL(nf_nat_ipv4_fn);
 
-unsigned int
+static unsigned int
 nf_nat_ipv4_in(void *priv, struct sk_buff *skb,
-	       const struct nf_hook_state *state,
-	       unsigned int (*do_chain)(void *priv,
-					 struct sk_buff *skb,
-					 const struct nf_hook_state *state))
+	       const struct nf_hook_state *state)
 {
 	unsigned int ret;
 	__be32 daddr = ip_hdr(skb)->daddr;
 
-	ret = nf_nat_ipv4_fn(priv, skb, state, do_chain);
+	ret = nf_nat_ipv4_fn(priv, skb, state);
 	if (ret != NF_DROP && ret != NF_STOLEN &&
 	    daddr != ip_hdr(skb)->daddr)
 		skb_dst_drop(skb);
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(nf_nat_ipv4_in);
 
-unsigned int
+static unsigned int
 nf_nat_ipv4_out(void *priv, struct sk_buff *skb,
-		const struct nf_hook_state *state,
-		unsigned int (*do_chain)(void *priv,
-					  struct sk_buff *skb,
-					  const struct nf_hook_state *state))
+		const struct nf_hook_state *state)
 {
 #ifdef CONFIG_XFRM
 	const struct nf_conn *ct;
@@ -302,7 +292,7 @@ nf_nat_ipv4_out(void *priv, struct sk_buff *skb,
 #endif
 	unsigned int ret;
 
-	ret = nf_nat_ipv4_fn(priv, skb, state, do_chain);
+	ret = nf_nat_ipv4_fn(priv, skb, state);
 #ifdef CONFIG_XFRM
 	if (ret != NF_DROP && ret != NF_STOLEN &&
 	    !(IPCB(skb)->flags & IPSKB_XFRM_TRANSFORMED) &&
@@ -322,21 +312,17 @@ nf_nat_ipv4_out(void *priv, struct sk_buff *skb,
 #endif
 	return ret;
 }
-EXPORT_SYMBOL_GPL(nf_nat_ipv4_out);
 
-unsigned int
+static unsigned int
 nf_nat_ipv4_local_fn(void *priv, struct sk_buff *skb,
-		     const struct nf_hook_state *state,
-		     unsigned int (*do_chain)(void *priv,
-					       struct sk_buff *skb,
-					       const struct nf_hook_state *state))
+		     const struct nf_hook_state *state)
 {
 	const struct nf_conn *ct;
 	enum ip_conntrack_info ctinfo;
 	unsigned int ret;
 	int err;
 
-	ret = nf_nat_ipv4_fn(priv, skb, state, do_chain);
+	ret = nf_nat_ipv4_fn(priv, skb, state);
 	if (ret != NF_DROP && ret != NF_STOLEN &&
 	    (ct = nf_ct_get(skb, &ctinfo)) != NULL) {
 		enum ip_conntrack_dir dir = CTINFO2DIR(ctinfo);
@@ -360,7 +346,49 @@ nf_nat_ipv4_local_fn(void *priv, struct sk_buff *skb,
 	}
 	return ret;
 }
-EXPORT_SYMBOL_GPL(nf_nat_ipv4_local_fn);
+
+static const struct nf_hook_ops nf_nat_ipv4_ops[] = {
+	/* Before packet filtering, change destination */
+	{
+		.hook		= nf_nat_ipv4_in,
+		.pf		= NFPROTO_IPV4,
+		.hooknum	= NF_INET_PRE_ROUTING,
+		.priority	= NF_IP_PRI_NAT_DST,
+	},
+	/* After packet filtering, change source */
+	{
+		.hook		= nf_nat_ipv4_out,
+		.pf		= NFPROTO_IPV4,
+		.hooknum	= NF_INET_POST_ROUTING,
+		.priority	= NF_IP_PRI_NAT_SRC,
+	},
+	/* Before packet filtering, change destination */
+	{
+		.hook		= nf_nat_ipv4_local_fn,
+		.pf		= NFPROTO_IPV4,
+		.hooknum	= NF_INET_LOCAL_OUT,
+		.priority	= NF_IP_PRI_NAT_DST,
+	},
+	/* After packet filtering, change source */
+	{
+		.hook		= nf_nat_ipv4_fn,
+		.pf		= NFPROTO_IPV4,
+		.hooknum	= NF_INET_LOCAL_IN,
+		.priority	= NF_IP_PRI_NAT_SRC,
+	},
+};
+
+int nf_nat_l3proto_ipv4_register_fn(struct net *net, const struct nf_hook_ops *ops)
+{
+	return nf_nat_register_fn(net, ops, nf_nat_ipv4_ops, ARRAY_SIZE(nf_nat_ipv4_ops));
+}
+EXPORT_SYMBOL_GPL(nf_nat_l3proto_ipv4_register_fn);
+
+void nf_nat_l3proto_ipv4_unregister_fn(struct net *net, const struct nf_hook_ops *ops)
+{
+	nf_nat_unregister_fn(net, ops, ARRAY_SIZE(nf_nat_ipv4_ops));
+}
+EXPORT_SYMBOL_GPL(nf_nat_l3proto_ipv4_unregister_fn);
 
 static int __init nf_nat_l3proto_ipv4_init(void)
 {
diff --git a/net/ipv4/netfilter/nft_chain_nat_ipv4.c b/net/ipv4/netfilter/nft_chain_nat_ipv4.c
index bbcb624b6b81..a3c4ea303e3e 100644
--- a/net/ipv4/netfilter/nft_chain_nat_ipv4.c
+++ b/net/ipv4/netfilter/nft_chain_nat_ipv4.c
@@ -27,8 +27,8 @@
 #include <net/ip.h>
 
 static unsigned int nft_nat_do_chain(void *priv,
-				      struct sk_buff *skb,
-				      const struct nf_hook_state *state)
+				     struct sk_buff *skb,
+				     const struct nf_hook_state *state)
 {
 	struct nft_pktinfo pkt;
 
@@ -38,49 +38,14 @@ static unsigned int nft_nat_do_chain(void *priv,
 	return nft_do_chain(&pkt, priv);
 }
 
-static unsigned int nft_nat_ipv4_fn(void *priv,
-				    struct sk_buff *skb,
-				    const struct nf_hook_state *state)
-{
-	return nf_nat_ipv4_fn(priv, skb, state, nft_nat_do_chain);
-}
-
-static unsigned int nft_nat_ipv4_in(void *priv,
-				    struct sk_buff *skb,
-				    const struct nf_hook_state *state)
-{
-	return nf_nat_ipv4_in(priv, skb, state, nft_nat_do_chain);
-}
-
-static unsigned int nft_nat_ipv4_out(void *priv,
-				     struct sk_buff *skb,
-				     const struct nf_hook_state *state)
-{
-	return nf_nat_ipv4_out(priv, skb, state, nft_nat_do_chain);
-}
-
-static unsigned int nft_nat_ipv4_local_fn(void *priv,
-					  struct sk_buff *skb,
-					  const struct nf_hook_state *state)
-{
-	return nf_nat_ipv4_local_fn(priv, skb, state, nft_nat_do_chain);
-}
-
 static int nft_nat_ipv4_reg(struct net *net, const struct nf_hook_ops *ops)
 {
-	int ret = nf_register_net_hook(net, ops);
-	if (ret == 0) {
-		ret = nf_ct_netns_get(net, NFPROTO_IPV4);
-		if (ret)
-			 nf_unregister_net_hook(net, ops);
-	}
-	return ret;
+	return nf_nat_l3proto_ipv4_register_fn(net, ops);
 }
 
 static void nft_nat_ipv4_unreg(struct net *net, const struct nf_hook_ops *ops)
 {
-	nf_unregister_net_hook(net, ops);
-	nf_ct_netns_put(net, NFPROTO_IPV4);
+	nf_nat_l3proto_ipv4_unregister_fn(net, ops);
 }
 
 static const struct nft_chain_type nft_chain_nat_ipv4 = {
@@ -93,10 +58,10 @@ static const struct nft_chain_type nft_chain_nat_ipv4 = {
 			  (1 << NF_INET_LOCAL_OUT) |
 			  (1 << NF_INET_LOCAL_IN),
 	.hooks		= {
-		[NF_INET_PRE_ROUTING]	= nft_nat_ipv4_in,
-		[NF_INET_POST_ROUTING]	= nft_nat_ipv4_out,
-		[NF_INET_LOCAL_OUT]	= nft_nat_ipv4_local_fn,
-		[NF_INET_LOCAL_IN]	= nft_nat_ipv4_fn,
+		[NF_INET_PRE_ROUTING]	= nft_nat_do_chain,
+		[NF_INET_POST_ROUTING]	= nft_nat_do_chain,
+		[NF_INET_LOCAL_OUT]	= nft_nat_do_chain,
+		[NF_INET_LOCAL_IN]	= nft_nat_do_chain,
 	},
 	.ops_register = nft_nat_ipv4_reg,
 	.ops_unregister = nft_nat_ipv4_unreg,
diff --git a/net/ipv6/netfilter/ip6table_nat.c b/net/ipv6/netfilter/ip6table_nat.c
index 2bf554e18af8..67ba70ab9f5c 100644
--- a/net/ipv6/netfilter/ip6table_nat.c
+++ b/net/ipv6/netfilter/ip6table_nat.c
@@ -40,69 +40,58 @@ static unsigned int ip6table_nat_do_chain(void *priv,
 	return ip6t_do_table(skb, state, state->net->ipv6.ip6table_nat);
 }
 
-static unsigned int ip6table_nat_fn(void *priv,
-				    struct sk_buff *skb,
-				    const struct nf_hook_state *state)
-{
-	return nf_nat_ipv6_fn(priv, skb, state, ip6table_nat_do_chain);
-}
-
-static unsigned int ip6table_nat_in(void *priv,
-				    struct sk_buff *skb,
-				    const struct nf_hook_state *state)
-{
-	return nf_nat_ipv6_in(priv, skb, state, ip6table_nat_do_chain);
-}
-
-static unsigned int ip6table_nat_out(void *priv,
-				     struct sk_buff *skb,
-				     const struct nf_hook_state *state)
-{
-	return nf_nat_ipv6_out(priv, skb, state, ip6table_nat_do_chain);
-}
-
-static unsigned int ip6table_nat_local_fn(void *priv,
-					  struct sk_buff *skb,
-					  const struct nf_hook_state *state)
-{
-	return nf_nat_ipv6_local_fn(priv, skb, state, ip6table_nat_do_chain);
-}
-
 static const struct nf_hook_ops nf_nat_ipv6_ops[] = {
-	/* Before packet filtering, change destination */
 	{
-		.hook		= ip6table_nat_in,
+		.hook		= ip6table_nat_do_chain,
 		.pf		= NFPROTO_IPV6,
-		.nat_hook	= true,
 		.hooknum	= NF_INET_PRE_ROUTING,
 		.priority	= NF_IP6_PRI_NAT_DST,
 	},
-	/* After packet filtering, change source */
 	{
-		.hook		= ip6table_nat_out,
+		.hook		= ip6table_nat_do_chain,
 		.pf		= NFPROTO_IPV6,
-		.nat_hook	= true,
 		.hooknum	= NF_INET_POST_ROUTING,
 		.priority	= NF_IP6_PRI_NAT_SRC,
 	},
-	/* Before packet filtering, change destination */
 	{
-		.hook		= ip6table_nat_local_fn,
+		.hook		= ip6table_nat_do_chain,
 		.pf		= NFPROTO_IPV6,
-		.nat_hook	= true,
 		.hooknum	= NF_INET_LOCAL_OUT,
 		.priority	= NF_IP6_PRI_NAT_DST,
 	},
-	/* After packet filtering, change source */
 	{
-		.hook		= ip6table_nat_fn,
-		.nat_hook	= true,
+		.hook		= ip6table_nat_do_chain,
 		.pf		= NFPROTO_IPV6,
 		.hooknum	= NF_INET_LOCAL_IN,
 		.priority	= NF_IP6_PRI_NAT_SRC,
 	},
 };
 
+static int ip6t_nat_register_lookups(struct net *net)
+{
+	int i, ret;
+
+	for (i = 0; i < ARRAY_SIZE(nf_nat_ipv6_ops); i++) {
+		ret = nf_nat_l3proto_ipv6_register_fn(net, &nf_nat_ipv6_ops[i]);
+		if (ret) {
+			while (i)
+				nf_nat_l3proto_ipv6_unregister_fn(net, &nf_nat_ipv6_ops[--i]);
+
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+static void ip6t_nat_unregister_lookups(struct net *net)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(nf_nat_ipv6_ops); i++)
+		nf_nat_l3proto_ipv6_unregister_fn(net, &nf_nat_ipv6_ops[i]);
+}
+
 static int __net_init ip6table_nat_table_init(struct net *net)
 {
 	struct ip6t_replace *repl;
@@ -115,7 +104,17 @@ static int __net_init ip6table_nat_table_init(struct net *net)
 	if (repl == NULL)
 		return -ENOMEM;
 	ret = ip6t_register_table(net, &nf_nat_ipv6_table, repl,
-				  nf_nat_ipv6_ops, &net->ipv6.ip6table_nat);
+				  NULL, &net->ipv6.ip6table_nat);
+	if (ret < 0) {
+		kfree(repl);
+		return ret;
+	}
+
+	ret = ip6t_nat_register_lookups(net);
+	if (ret < 0) {
+		ip6t_unregister_table(net, net->ipv6.ip6table_nat, NULL);
+		net->ipv6.ip6table_nat = NULL;
+	}
 	kfree(repl);
 	return ret;
 }
@@ -124,7 +123,8 @@ static void __net_exit ip6table_nat_net_exit(struct net *net)
 {
 	if (!net->ipv6.ip6table_nat)
 		return;
-	ip6t_unregister_table(net, net->ipv6.ip6table_nat, nf_nat_ipv6_ops);
+	ip6t_nat_unregister_lookups(net);
+	ip6t_unregister_table(net, net->ipv6.ip6table_nat, NULL);
 	net->ipv6.ip6table_nat = NULL;
 }
 
diff --git a/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c b/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
index 3ec228984f82..ca6d38698b1a 100644
--- a/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
@@ -252,12 +252,9 @@ int nf_nat_icmpv6_reply_translation(struct sk_buff *skb,
 }
 EXPORT_SYMBOL_GPL(nf_nat_icmpv6_reply_translation);
 
-unsigned int
+static unsigned int
 nf_nat_ipv6_fn(void *priv, struct sk_buff *skb,
-	       const struct nf_hook_state *state,
-	       unsigned int (*do_chain)(void *priv,
-					struct sk_buff *skb,
-					const struct nf_hook_state *state))
+	       const struct nf_hook_state *state)
 {
 	struct nf_conn *ct;
 	enum ip_conntrack_info ctinfo;
@@ -289,35 +286,27 @@ nf_nat_ipv6_fn(void *priv, struct sk_buff *skb,
 		}
 	}
 
-	return nf_nat_inet_fn(priv, skb, state, do_chain);
+	return nf_nat_inet_fn(priv, skb, state);
 }
-EXPORT_SYMBOL_GPL(nf_nat_ipv6_fn);
 
-unsigned int
+static unsigned int
 nf_nat_ipv6_in(void *priv, struct sk_buff *skb,
-	       const struct nf_hook_state *state,
-	       unsigned int (*do_chain)(void *priv,
-					struct sk_buff *skb,
-					const struct nf_hook_state *state))
+	       const struct nf_hook_state *state)
 {
 	unsigned int ret;
 	struct in6_addr daddr = ipv6_hdr(skb)->daddr;
 
-	ret = nf_nat_ipv6_fn(priv, skb, state, do_chain);
+	ret = nf_nat_ipv6_fn(priv, skb, state);
 	if (ret != NF_DROP && ret != NF_STOLEN &&
 	    ipv6_addr_cmp(&daddr, &ipv6_hdr(skb)->daddr))
 		skb_dst_drop(skb);
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(nf_nat_ipv6_in);
 
-unsigned int
+static unsigned int
 nf_nat_ipv6_out(void *priv, struct sk_buff *skb,
-		const struct nf_hook_state *state,
-		unsigned int (*do_chain)(void *priv,
-					 struct sk_buff *skb,
-					 const struct nf_hook_state *state))
+		const struct nf_hook_state *state)
 {
 #ifdef CONFIG_XFRM
 	const struct nf_conn *ct;
@@ -326,7 +315,7 @@ nf_nat_ipv6_out(void *priv, struct sk_buff *skb,
 #endif
 	unsigned int ret;
 
-	ret = nf_nat_ipv6_fn(priv, skb, state, do_chain);
+	ret = nf_nat_ipv6_fn(priv, skb, state);
 #ifdef CONFIG_XFRM
 	if (ret != NF_DROP && ret != NF_STOLEN &&
 	    !(IP6CB(skb)->flags & IP6SKB_XFRM_TRANSFORMED) &&
@@ -346,21 +335,17 @@ nf_nat_ipv6_out(void *priv, struct sk_buff *skb,
 #endif
 	return ret;
 }
-EXPORT_SYMBOL_GPL(nf_nat_ipv6_out);
 
-unsigned int
+static unsigned int
 nf_nat_ipv6_local_fn(void *priv, struct sk_buff *skb,
-		     const struct nf_hook_state *state,
-		     unsigned int (*do_chain)(void *priv,
-					      struct sk_buff *skb,
-					      const struct nf_hook_state *state))
+		     const struct nf_hook_state *state)
 {
 	const struct nf_conn *ct;
 	enum ip_conntrack_info ctinfo;
 	unsigned int ret;
 	int err;
 
-	ret = nf_nat_ipv6_fn(priv, skb, state, do_chain);
+	ret = nf_nat_ipv6_fn(priv, skb, state);
 	if (ret != NF_DROP && ret != NF_STOLEN &&
 	    (ct = nf_ct_get(skb, &ctinfo)) != NULL) {
 		enum ip_conntrack_dir dir = CTINFO2DIR(ctinfo);
@@ -384,7 +369,49 @@ nf_nat_ipv6_local_fn(void *priv, struct sk_buff *skb,
 	}
 	return ret;
 }
-EXPORT_SYMBOL_GPL(nf_nat_ipv6_local_fn);
+
+static const struct nf_hook_ops nf_nat_ipv6_ops[] = {
+	/* Before packet filtering, change destination */
+	{
+		.hook		= nf_nat_ipv6_in,
+		.pf		= NFPROTO_IPV6,
+		.hooknum	= NF_INET_PRE_ROUTING,
+		.priority	= NF_IP6_PRI_NAT_DST,
+	},
+	/* After packet filtering, change source */
+	{
+		.hook		= nf_nat_ipv6_out,
+		.pf		= NFPROTO_IPV6,
+		.hooknum	= NF_INET_POST_ROUTING,
+		.priority	= NF_IP6_PRI_NAT_SRC,
+	},
+	/* Before packet filtering, change destination */
+	{
+		.hook		= nf_nat_ipv6_local_fn,
+		.pf		= NFPROTO_IPV6,
+		.hooknum	= NF_INET_LOCAL_OUT,
+		.priority	= NF_IP6_PRI_NAT_DST,
+	},
+	/* After packet filtering, change source */
+	{
+		.hook		= nf_nat_ipv6_fn,
+		.pf		= NFPROTO_IPV6,
+		.hooknum	= NF_INET_LOCAL_IN,
+		.priority	= NF_IP6_PRI_NAT_SRC,
+	},
+};
+
+int nf_nat_l3proto_ipv6_register_fn(struct net *net, const struct nf_hook_ops *ops)
+{
+	return nf_nat_register_fn(net, ops, nf_nat_ipv6_ops, ARRAY_SIZE(nf_nat_ipv6_ops));
+}
+EXPORT_SYMBOL_GPL(nf_nat_l3proto_ipv6_register_fn);
+
+void nf_nat_l3proto_ipv6_unregister_fn(struct net *net, const struct nf_hook_ops *ops)
+{
+	nf_nat_unregister_fn(net, ops, ARRAY_SIZE(nf_nat_ipv6_ops));
+}
+EXPORT_SYMBOL_GPL(nf_nat_l3proto_ipv6_unregister_fn);
 
 static int __init nf_nat_l3proto_ipv6_init(void)
 {
diff --git a/net/ipv6/netfilter/nft_chain_nat_ipv6.c b/net/ipv6/netfilter/nft_chain_nat_ipv6.c
index 05bcb2c23125..8a081ad7d5db 100644
--- a/net/ipv6/netfilter/nft_chain_nat_ipv6.c
+++ b/net/ipv6/netfilter/nft_chain_nat_ipv6.c
@@ -36,50 +36,14 @@ static unsigned int nft_nat_do_chain(void *priv,
 	return nft_do_chain(&pkt, priv);
 }
 
-static unsigned int nft_nat_ipv6_fn(void *priv,
-				    struct sk_buff *skb,
-				    const struct nf_hook_state *state)
-{
-	return nf_nat_ipv6_fn(priv, skb, state, nft_nat_do_chain);
-}
-
-static unsigned int nft_nat_ipv6_in(void *priv,
-				    struct sk_buff *skb,
-				    const struct nf_hook_state *state)
-{
-	return nf_nat_ipv6_in(priv, skb, state, nft_nat_do_chain);
-}
-
-static unsigned int nft_nat_ipv6_out(void *priv,
-				     struct sk_buff *skb,
-				     const struct nf_hook_state *state)
-{
-	return nf_nat_ipv6_out(priv, skb, state, nft_nat_do_chain);
-}
-
-static unsigned int nft_nat_ipv6_local_fn(void *priv,
-					  struct sk_buff *skb,
-					  const struct nf_hook_state *state)
-{
-	return nf_nat_ipv6_local_fn(priv, skb, state, nft_nat_do_chain);
-}
-
 static int nft_nat_ipv6_reg(struct net *net, const struct nf_hook_ops *ops)
 {
-	int ret = nf_register_net_hook(net, ops);
-	if (ret == 0) {
-		ret = nf_ct_netns_get(net, NFPROTO_IPV6);
-		if (ret)
-			 nf_unregister_net_hook(net, ops);
-	}
-
-	return ret;
+	return nf_nat_l3proto_ipv6_register_fn(net, ops);
 }
 
 static void nft_nat_ipv6_unreg(struct net *net, const struct nf_hook_ops *ops)
 {
-	nf_unregister_net_hook(net, ops);
-	nf_ct_netns_put(net, NFPROTO_IPV6);
+	nf_nat_l3proto_ipv6_unregister_fn(net, ops);
 }
 
 static const struct nft_chain_type nft_chain_nat_ipv6 = {
@@ -92,10 +56,10 @@ static const struct nft_chain_type nft_chain_nat_ipv6 = {
 			  (1 << NF_INET_LOCAL_OUT) |
 			  (1 << NF_INET_LOCAL_IN),
 	.hooks		= {
-		[NF_INET_PRE_ROUTING]	= nft_nat_ipv6_in,
-		[NF_INET_POST_ROUTING]	= nft_nat_ipv6_out,
-		[NF_INET_LOCAL_OUT]	= nft_nat_ipv6_local_fn,
-		[NF_INET_LOCAL_IN]	= nft_nat_ipv6_fn,
+		[NF_INET_PRE_ROUTING]	= nft_nat_do_chain,
+		[NF_INET_POST_ROUTING]	= nft_nat_do_chain,
+		[NF_INET_LOCAL_OUT]	= nft_nat_do_chain,
+		[NF_INET_LOCAL_IN]	= nft_nat_do_chain,
 	},
 	.ops_register		= nft_nat_ipv6_reg,
 	.ops_unregister		= nft_nat_ipv6_unreg,
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index f531d77dd684..489599b549cf 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -533,10 +533,7 @@ EXPORT_SYMBOL_GPL(nf_nat_packet);
 
 unsigned int
 nf_nat_inet_fn(void *priv, struct sk_buff *skb,
-	       const struct nf_hook_state *state,
-	       unsigned int (*do_chain)(void *priv,
-					struct sk_buff *skb,
-					const struct nf_hook_state *state))
+	       const struct nf_hook_state *state)
 {
 	struct nf_conn *ct;
 	enum ip_conntrack_info ctinfo;
@@ -564,15 +561,23 @@ nf_nat_inet_fn(void *priv, struct sk_buff *skb,
 		 * or local packets.
 		 */
 		if (!nf_nat_initialized(ct, maniptype)) {
+			struct nf_nat_lookup_hook_priv *lpriv = priv;
+			struct nf_hook_entries *e = rcu_dereference(lpriv->entries);
 			unsigned int ret;
-
-			ret = do_chain(priv, skb, state);
-			if (ret != NF_ACCEPT)
-				return ret;
-
-			if (nf_nat_initialized(ct, HOOK2MANIP(state->hook)))
-				break;
-
+			int i;
+
+			if (!e)
+				goto null_bind;
+
+			for (i = 0; i < e->num_hook_entries; i++) {
+				ret = e->hooks[i].hook(e->hooks[i].priv, skb,
+						       state);
+				if (ret != NF_ACCEPT)
+					return ret;
+				if (nf_nat_initialized(ct, maniptype))
+					goto do_nat;
+			}
+null_bind:
 			ret = nf_nat_alloc_null_binding(ct, state->hook);
 			if (ret != NF_ACCEPT)
 				return ret;
@@ -592,7 +597,7 @@ nf_nat_inet_fn(void *priv, struct sk_buff *skb,
 		if (nf_nat_oif_changed(state->hook, ctinfo, nat, state->out))
 			goto oif_changed;
 	}
-
+do_nat:
 	return nf_nat_packet(ct, ctinfo, state->hook, skb);
 
 oif_changed:
-- 
2.11.0


* [PATCH 11/18] netfilter: lift one-nat-hook-only restriction
  2018-05-23 18:42 [PATCH 00/18] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (9 preceding siblings ...)
  2018-05-23 18:42 ` [PATCH 10/18] netfilter: nf_nat: add nat type hooks to nat core Pablo Neira Ayuso
@ 2018-05-23 18:42 ` Pablo Neira Ayuso
  2018-05-23 18:42 ` [PATCH 12/18] netfilter: make NF_OSF non-visible symbol Pablo Neira Ayuso
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Florian Westphal <fw@strlen.de>

This reverts commit f92b40a8b2645
("netfilter: core: only allow one nat hook per hook point"); the
limitation is no longer needed.  The nat core now invokes the registered
nat hook functions itself and makes sure that hook evaluation stops once
a mapping has been created.  If no hook creates a mapping, a null binding
is set up instead.
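
For illustration only, the evaluation order described above can be
sketched in plain userspace C.  This is not the kernel code: every name
below (toy_conn, toy_lookup, toy_nat_lookup, the V_* verdicts) is
hypothetical, and the array of lookups is assumed to be already sorted
by priority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical verdicts; they only loosely mirror NF_DROP / NF_ACCEPT. */
enum verdict { V_DROP = 0, V_ACCEPT = 1 };

/* Toy stand-in for a conntrack entry. */
struct toy_conn {
	bool nat_initialized;	/* true once a mapping or null binding exists */
	int  bound_by;		/* id of the lookup that bound it, -1 = null binding */
};

typedef enum verdict (*lookup_fn)(void *priv, struct toy_conn *ct);

struct toy_lookup {
	lookup_fn hook;
	void *priv;
};

/* Lookup that matches nothing and lets evaluation continue. */
enum verdict toy_no_match(void *priv, struct toy_conn *ct)
{
	(void)priv;
	(void)ct;
	return V_ACCEPT;
}

/* Lookup that creates a mapping; priv points to an int id. */
enum verdict toy_bind(void *priv, struct toy_conn *ct)
{
	ct->nat_initialized = true;
	ct->bound_by = *(int *)priv;
	return V_ACCEPT;
}

/* Lookup that rejects the packet outright. */
enum verdict toy_drop(void *priv, struct toy_conn *ct)
{
	(void)priv;
	(void)ct;
	return V_DROP;
}

/*
 * Run the lookups in order.  A non-accept verdict is returned at once,
 * evaluation stops after the first lookup that creates a mapping, and a
 * null binding is installed when no lookup created one.
 */
enum verdict toy_nat_lookup(const struct toy_lookup *e, size_t n,
			    struct toy_conn *ct)
{
	size_t i;

	assert(ct != NULL);

	for (i = 0; i < n; i++) {
		enum verdict ret = e[i].hook(e[i].priv, ct);

		if (ret != V_ACCEPT)
			return ret;
		if (ct->nat_initialized)
			return V_ACCEPT;	/* mapping exists, skip the rest */
	}

	/* no lookup created a mapping: fall back to a null binding */
	ct->nat_initialized = true;
	ct->bound_by = -1;
	return V_ACCEPT;
}
```

With the lookups { toy_no_match, toy_bind (id 2), toy_bind (id 3) } the
connection ends up bound by id 2 and the third lookup is never called.
With an empty list the connection gets the null binding.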

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter.h     |  1 -
 net/netfilter/core.c          |  5 ----
 net/netfilter/nf_tables_api.c | 66 ++-----------------------------------------
 3 files changed, 2 insertions(+), 70 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 85a1a0b32c66..72f5871b9a0a 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -67,7 +67,6 @@ struct nf_hook_ops {
 	struct net_device	*dev;
 	void			*priv;
 	u_int8_t		pf;
-	bool			nat_hook;
 	unsigned int		hooknum;
 	/* Hooks are ordered in ascending priority. */
 	int			priority;
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 5f0ebf9a8d5b..907d6ef8f3c1 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -138,11 +138,6 @@ nf_hook_entries_grow(const struct nf_hook_entries *old,
 			continue;
 		}
 
-		if (reg->nat_hook && orig_ops[i]->nat_hook) {
-			kvfree(new);
-			return ERR_PTR(-EBUSY);
-		}
-
 		if (inserted || reg->priority > orig_ops[i]->priority) {
 			new_ops[nhooks] = (void *)orig_ops[i];
 			new->hooks[nhooks] = old->hooks[i];
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index ded54b2abfbc..a2bb31472aa1 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -74,64 +74,12 @@ static void nft_trans_destroy(struct nft_trans *trans)
 	kfree(trans);
 }
 
-/* removal requests are queued in the commit_list, but not acted upon
- * until after all new rules are in place.
- *
- * Therefore, nf_register_net_hook(net, &nat_hook) runs before pending
- * nf_unregister_net_hook().
- *
- * nf_register_net_hook thus fails if a nat hook is already in place
- * even if the conflicting hook is about to be removed.
- *
- * If collision is detected, search commit_log for DELCHAIN matching
- * the new nat hooknum; if we find one collision is temporary:
- *
- * Either transaction is aborted (new/colliding hook is removed), or
- * transaction is committed (old hook is removed).
- */
-static bool nf_tables_allow_nat_conflict(const struct net *net,
-					 const struct nf_hook_ops *ops)
-{
-	const struct nft_trans *trans;
-	bool ret = false;
-
-	if (!ops->nat_hook)
-		return false;
-
-	list_for_each_entry(trans, &net->nft.commit_list, list) {
-		const struct nf_hook_ops *pending_ops;
-		const struct nft_chain *pending;
-
-		if (trans->msg_type != NFT_MSG_NEWCHAIN &&
-		    trans->msg_type != NFT_MSG_DELCHAIN)
-			continue;
-
-		pending = trans->ctx.chain;
-		if (!nft_is_base_chain(pending))
-			continue;
-
-		pending_ops = &nft_base_chain(pending)->ops;
-		if (pending_ops->nat_hook &&
-		    pending_ops->pf == ops->pf &&
-		    pending_ops->hooknum == ops->hooknum) {
-			/* other hook registration already pending? */
-			if (trans->msg_type == NFT_MSG_NEWCHAIN)
-				return false;
-
-			ret = true;
-		}
-	}
-
-	return ret;
-}
-
 static int nf_tables_register_hook(struct net *net,
 				   const struct nft_table *table,
 				   struct nft_chain *chain)
 {
 	const struct nft_base_chain *basechain;
-	struct nf_hook_ops *ops;
-	int ret;
+	const struct nf_hook_ops *ops;
 
 	if (table->flags & NFT_TABLE_F_DORMANT ||
 	    !nft_is_base_chain(chain))
@@ -143,14 +91,7 @@ static int nf_tables_register_hook(struct net *net,
 	if (basechain->type->ops_register)
 		return basechain->type->ops_register(net, ops);
 
-	ret = nf_register_net_hook(net, ops);
-	if (ret == -EBUSY && nf_tables_allow_nat_conflict(net, ops)) {
-		ops->nat_hook = false;
-		ret = nf_register_net_hook(net, ops);
-		ops->nat_hook = true;
-	}
-
-	return ret;
+	return nf_register_net_hook(net, ops);
 }
 
 static void nf_tables_unregister_hook(struct net *net,
@@ -1418,9 +1359,6 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
 		ops->hook	= hook.type->hooks[ops->hooknum];
 		ops->dev	= hook.dev;
 
-		if (basechain->type->type == NFT_CHAIN_T_NAT)
-			ops->nat_hook = true;
-
 		chain->flags |= NFT_BASE_CHAIN;
 		basechain->policy = policy;
 	} else {
-- 
2.11.0


* [PATCH 12/18] netfilter: make NF_OSF non-visible symbol
  2018-05-23 18:42 [PATCH 00/18] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (10 preceding siblings ...)
  2018-05-23 18:42 ` [PATCH 11/18] netfilter: lift one-nat-hook-only restriction Pablo Neira Ayuso
@ 2018-05-23 18:42 ` Pablo Neira Ayuso
  2018-05-23 18:42 ` [PATCH 13/18] netfilter: nft_set_rbtree: add timeout support Pablo Neira Ayuso
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Fernando Fernandez Mancera <ffmancera@riseup.net>

Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index e57c9d479503..a5b60e6a983e 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -445,7 +445,7 @@ config NETFILTER_SYNPROXY
 endif # NF_CONNTRACK
 
 config NF_OSF
-	tristate 'Passive OS fingerprint infrastructure'
+	tristate
 
 config NF_TABLES
 	select NETFILTER_NETLINK
-- 
2.11.0


* [PATCH 13/18] netfilter: nft_set_rbtree: add timeout support
  2018-05-23 18:42 [PATCH 00/18] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (11 preceding siblings ...)
  2018-05-23 18:42 ` [PATCH 12/18] netfilter: make NF_OSF non-visible symbol Pablo Neira Ayuso
@ 2018-05-23 18:42 ` Pablo Neira Ayuso
  2018-05-23 18:42 ` [PATCH 14/18] netfilter: ip6t_rpfilter: provide input interface for route lookup Pablo Neira Ayuso
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Add garbage collection logic to expire elements stored in the rb-tree
representation.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_set_rbtree.c | 75 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 72 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index 22c57d7612c4..d260ce2d6671 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -22,6 +22,7 @@ struct nft_rbtree {
 	struct rb_root		root;
 	rwlock_t		lock;
 	seqcount_t		count;
+	struct delayed_work	gc_work;
 };
 
 struct nft_rbtree_elem {
@@ -265,6 +266,7 @@ static void nft_rbtree_activate(const struct net *net,
 	struct nft_rbtree_elem *rbe = elem->priv;
 
 	nft_set_elem_change_active(net, set, &rbe->ext);
+	nft_set_elem_clear_busy(&rbe->ext);
 }
 
 static bool nft_rbtree_flush(const struct net *net,
@@ -272,8 +274,12 @@ static bool nft_rbtree_flush(const struct net *net,
 {
 	struct nft_rbtree_elem *rbe = priv;
 
-	nft_set_elem_change_active(net, set, &rbe->ext);
-	return true;
+	if (!nft_set_elem_mark_busy(&rbe->ext) ||
+	    !nft_is_active(net, &rbe->ext)) {
+		nft_set_elem_change_active(net, set, &rbe->ext);
+		return true;
+	}
+	return false;
 }
 
 static void *nft_rbtree_deactivate(const struct net *net,
@@ -347,6 +353,62 @@ static void nft_rbtree_walk(const struct nft_ctx *ctx,
 	read_unlock_bh(&priv->lock);
 }
 
+static void nft_rbtree_gc(struct work_struct *work)
+{
+	struct nft_set_gc_batch *gcb = NULL;
+	struct rb_node *node, *prev = NULL;
+	struct nft_rbtree_elem *rbe;
+	struct nft_rbtree *priv;
+	struct nft_set *set;
+	int i;
+
+	priv = container_of(work, struct nft_rbtree, gc_work.work);
+	set  = nft_set_container_of(priv);
+
+	write_lock_bh(&priv->lock);
+	write_seqcount_begin(&priv->count);
+	for (node = rb_first(&priv->root); node != NULL; node = rb_next(node)) {
+		rbe = rb_entry(node, struct nft_rbtree_elem, node);
+
+		if (nft_rbtree_interval_end(rbe)) {
+			prev = node;
+			continue;
+		}
+		if (!nft_set_elem_expired(&rbe->ext))
+			continue;
+		if (nft_set_elem_mark_busy(&rbe->ext))
+			continue;
+
+		gcb = nft_set_gc_batch_check(set, gcb, GFP_ATOMIC);
+		if (!gcb)
+			goto out;
+
+		atomic_dec(&set->nelems);
+		nft_set_gc_batch_add(gcb, rbe);
+
+		if (prev) {
+			rbe = rb_entry(prev, struct nft_rbtree_elem, node);
+			atomic_dec(&set->nelems);
+			nft_set_gc_batch_add(gcb, rbe);
+		}
+		node = rb_next(node);
+	}
+out:
+	if (gcb) {
+		for (i = 0; i < gcb->head.cnt; i++) {
+			rbe = gcb->elems[i];
+			rb_erase(&rbe->node, &priv->root);
+		}
+	}
+	write_seqcount_end(&priv->count);
+	write_unlock_bh(&priv->lock);
+
+	nft_set_gc_batch_complete(gcb);
+
+	queue_delayed_work(system_power_efficient_wq, &priv->gc_work,
+			   nft_set_gc_interval(set));
+}
+
 static unsigned int nft_rbtree_privsize(const struct nlattr * const nla[],
 					const struct nft_set_desc *desc)
 {
@@ -362,6 +424,12 @@ static int nft_rbtree_init(const struct nft_set *set,
 	rwlock_init(&priv->lock);
 	seqcount_init(&priv->count);
 	priv->root = RB_ROOT;
+
+	INIT_DEFERRABLE_WORK(&priv->gc_work, nft_rbtree_gc);
+	if (set->flags & NFT_SET_TIMEOUT)
+		queue_delayed_work(system_power_efficient_wq, &priv->gc_work,
+				   nft_set_gc_interval(set));
+
 	return 0;
 }
 
@@ -371,6 +439,7 @@ static void nft_rbtree_destroy(const struct nft_set *set)
 	struct nft_rbtree_elem *rbe;
 	struct rb_node *node;
 
+	cancel_delayed_work_sync(&priv->gc_work);
 	while ((node = priv->root.rb_node) != NULL) {
 		rb_erase(node, &priv->root);
 		rbe = rb_entry(node, struct nft_rbtree_elem, node);
@@ -395,7 +464,7 @@ static bool nft_rbtree_estimate(const struct nft_set_desc *desc, u32 features,
 
 static struct nft_set_type nft_rbtree_type __read_mostly = {
 	.owner		= THIS_MODULE,
-	.features	= NFT_SET_INTERVAL | NFT_SET_MAP | NFT_SET_OBJECT,
+	.features	= NFT_SET_INTERVAL | NFT_SET_MAP | NFT_SET_OBJECT | NFT_SET_TIMEOUT,
 	.ops		= {
 		.privsize	= nft_rbtree_privsize,
 		.elemsize	= offsetof(struct nft_rbtree_elem, ext),
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread
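
The garbage collection added in patch 13/18 works in two phases: expired elements are first collected into a batch while walking the tree, and only afterwards erased. A minimal userspace sketch of that pattern follows. It assumes a flat array instead of an rb-tree, ignores interval-end elements and locking, and uses hypothetical names (struct elem, struct gc_batch, GC_BATCH_MAX, gc_run) that do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GC_BATCH_MAX 4	/* hypothetical batch capacity */

struct elem {
	int key;
	unsigned long expires;	/* absolute expiry time, 0 means no timeout */
	bool busy;		/* already claimed by another operation */
};

struct gc_batch {
	size_t cnt;
	size_t idx[GC_BATCH_MAX];	/* positions of elements to drop */
};

static bool elem_expired(const struct elem *e, unsigned long now)
{
	return e->expires != 0 && now >= e->expires;
}

/* Phase 1: walk the set and remember expired, non-busy elements. */
static void gc_collect(const struct elem *elems, size_t n,
		       unsigned long now, struct gc_batch *batch)
{
	size_t i;

	assert(batch != NULL);
	batch->cnt = 0;
	for (i = 0; i < n && batch->cnt < GC_BATCH_MAX; i++) {
		if (!elem_expired(&elems[i], now) || elems[i].busy)
			continue;
		batch->idx[batch->cnt++] = i;
	}
}

/* Phase 2: compact the array, skipping batched positions. */
static size_t gc_erase(struct elem *elems, size_t n,
		       const struct gc_batch *batch)
{
	size_t i, out = 0, next = 0;

	for (i = 0; i < n; i++) {
		if (next < batch->cnt && batch->idx[next] == i) {
			next++;		/* drop this element */
			continue;
		}
		elems[out++] = elems[i];
	}
	return out;	/* new element count */
}

static size_t gc_run(struct elem *elems, size_t n, unsigned long now)
{
	struct gc_batch batch;

	gc_collect(elems, n, now, &batch);
	return gc_erase(elems, n, &batch);
}
```

Separating collection from erasure keeps the walk simple, since nothing is unlinked while the iterator is still in use. The batch in this sketch is bounded, so a single pass may leave expired elements for the next run.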

* [PATCH 14/18] netfilter: ip6t_rpfilter: provide input interface for route lookup
  2018-05-23 18:42 [PATCH 00/18] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (12 preceding siblings ...)
  2018-05-23 18:42 ` [PATCH 13/18] netfilter: nft_set_rbtree: add timeout support Pablo Neira Ayuso
@ 2018-05-23 18:42 ` Pablo Neira Ayuso
  2018-05-23 18:42 ` [PATCH 15/18] netfilter: add struct nf_ct_hook and use it Pablo Neira Ayuso
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Vincent Bernat <vincent@bernat.im>

In commit 47b7e7f82802, the flowi6_oif assignment was removed at the
same time as the RT6_LOOKUP_F_IFACE flag. However, it is needed when
link-local addresses are used, which is a very common case: when
packets are routed, neighbor solicitations are done using link-local
addresses. For example, the following neighbor solicitation is not
matched by "-m rpfilter":

    IP6 fe80::5254:33ff:fe00:1 > ff02::1:ff00:3: ICMP6, neighbor
    solicitation, who has 2001:db8::5254:33ff:fe00:3, length 32

Commit 47b7e7f82802 doesn't quite explain why we shouldn't use
RT6_LOOKUP_F_IFACE in the rpfilter case. I suppose the interface check
later in the function would make it redundant. However, the rest of
the routing code uses RT6_LOOKUP_F_IFACE when there is no source
address (which matches rpfilter's case with a non-unicast destination,
like with neighbor solicitation).

Signed-off-by: Vincent Bernat <vincent@bernat.im>
Fixes: 47b7e7f82802 ("netfilter: don't set F_IFACE on ipv6 fib lookups")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv6/netfilter/ip6t_rpfilter.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv6/netfilter/ip6t_rpfilter.c b/net/ipv6/netfilter/ip6t_rpfilter.c
index d12f511929f5..0fe61ede77c6 100644
--- a/net/ipv6/netfilter/ip6t_rpfilter.c
+++ b/net/ipv6/netfilter/ip6t_rpfilter.c
@@ -48,6 +48,8 @@ static bool rpfilter_lookup_reverse6(struct net *net, const struct sk_buff *skb,
 	}
 
 	fl6.flowi6_mark = flags & XT_RPFILTER_VALID_MARK ? skb->mark : 0;
+	if ((flags & XT_RPFILTER_LOOSE) == 0)
+		fl6.flowi6_oif = dev->ifindex;
 
 	rt = (void *)ip6_route_lookup(net, &fl6, skb, lookup_flags);
 	if (rt->dst.error)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread
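
The two lines added in patch 14/18 amount to a small decision: in strict mode the reverse lookup is pinned to the interface the packet arrived on, while in loose mode no interface is given. A minimal sketch of that decision, assuming the field is otherwise left at zero. RPF_LOOSE and rpf_lookup_oif() are hypothetical stand-ins for XT_RPFILTER_LOOSE and the inline assignment in rpfilter_lookup_reverse6().

```c
#include <assert.h>

#define RPF_LOOSE 0x1	/* hypothetical stand-in for XT_RPFILTER_LOOSE */

/*
 * Return the interface index to hand to the route lookup:
 * the ingress interface in strict mode, 0 (unset) in loose mode.
 */
static int rpf_lookup_oif(unsigned int flags, int in_ifindex)
{
	assert(in_ifindex > 0);
	return (flags & RPF_LOOSE) ? 0 : in_ifindex;
}
```

With a link-local source such as fe80::5254:33ff:fe00:1, the same prefix exists on every interface, so a strict lookup needs the interface to find the matching route.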

* [PATCH 15/18] netfilter: add struct nf_ct_hook and use it
  2018-05-23 18:42 [PATCH 00/18] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (13 preceding siblings ...)
  2018-05-23 18:42 ` [PATCH 14/18] netfilter: ip6t_rpfilter: provide input interface for route lookup Pablo Neira Ayuso
@ 2018-05-23 18:42 ` Pablo Neira Ayuso
  2018-05-23 18:42 ` [PATCH 16/18] netfilter: add struct nf_nat_hook " Pablo Neira Ayuso
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Move the nf_ct_destroy indirection to the struct nf_ct_hook.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter.h         |  7 ++++++-
 net/netfilter/core.c              | 14 +++++++-------
 net/netfilter/nf_conntrack_core.c |  9 ++++++---
 3 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 72f5871b9a0a..75ded6f6eebe 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -373,13 +373,18 @@ nf_nat_decode_session(struct sk_buff *skb, struct flowi *fl, u_int8_t family)
 
 extern void (*ip_ct_attach)(struct sk_buff *, const struct sk_buff *) __rcu;
 void nf_ct_attach(struct sk_buff *, const struct sk_buff *);
-extern void (*nf_ct_destroy)(struct nf_conntrack *) __rcu;
 #else
 static inline void nf_ct_attach(struct sk_buff *new, struct sk_buff *skb) {}
 #endif
 
 struct nf_conn;
 enum ip_conntrack_info;
+
+struct nf_ct_hook {
+	void (*destroy)(struct nf_conntrack *);
+};
+extern struct nf_ct_hook __rcu *nf_ct_hook;
+
 struct nlattr;
 
 struct nfnl_ct_hook {
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 907d6ef8f3c1..1bd844ea1d7c 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -563,6 +563,9 @@ EXPORT_SYMBOL(skb_make_writable);
 struct nfnl_ct_hook __rcu *nfnl_ct_hook __read_mostly;
 EXPORT_SYMBOL_GPL(nfnl_ct_hook);
 
+struct nf_ct_hook __rcu *nf_ct_hook __read_mostly;
+EXPORT_SYMBOL_GPL(nf_ct_hook);
+
 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
 /* This does not belong here, but locally generated errors need it if connection
    tracking in use: without this, connection may not be in hash table, and hence
@@ -585,17 +588,14 @@ void nf_ct_attach(struct sk_buff *new, const struct sk_buff *skb)
 }
 EXPORT_SYMBOL(nf_ct_attach);
 
-void (*nf_ct_destroy)(struct nf_conntrack *) __rcu __read_mostly;
-EXPORT_SYMBOL(nf_ct_destroy);
-
 void nf_conntrack_destroy(struct nf_conntrack *nfct)
 {
-	void (*destroy)(struct nf_conntrack *);
+	struct nf_ct_hook *ct_hook;
 
 	rcu_read_lock();
-	destroy = rcu_dereference(nf_ct_destroy);
-	BUG_ON(destroy == NULL);
-	destroy(nfct);
+	ct_hook = rcu_dereference(nf_ct_hook);
+	BUG_ON(ct_hook == NULL);
+	ct_hook->destroy(nfct);
 	rcu_read_unlock();
 }
 EXPORT_SYMBOL(nf_conntrack_destroy);
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 605441727008..8b2a8644d955 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1813,8 +1813,7 @@ void nf_conntrack_cleanup_start(void)
 
 void nf_conntrack_cleanup_end(void)
 {
-	RCU_INIT_POINTER(nf_ct_destroy, NULL);
-
+	RCU_INIT_POINTER(nf_ct_hook, NULL);
 	cancel_delayed_work_sync(&conntrack_gc_work.dwork);
 	nf_ct_free_hashtable(nf_conntrack_hash, nf_conntrack_htable_size);
 
@@ -2131,11 +2130,15 @@ int nf_conntrack_init_start(void)
 	return ret;
 }
 
+static struct nf_ct_hook nf_conntrack_hook = {
+	.destroy	= destroy_conntrack,
+};
+
 void nf_conntrack_init_end(void)
 {
 	/* For use by REJECT target */
 	RCU_INIT_POINTER(ip_ct_attach, nf_conntrack_attach);
-	RCU_INIT_POINTER(nf_ct_destroy, destroy_conntrack);
+	RCU_INIT_POINTER(nf_ct_hook, &nf_conntrack_hook);
 }
 
 /*
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 16/18] netfilter: add struct nf_nat_hook and use it
  2018-05-23 18:42 [PATCH 00/18] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (14 preceding siblings ...)
  2018-05-23 18:42 ` [PATCH 15/18] netfilter: add struct nf_ct_hook and use it Pablo Neira Ayuso
@ 2018-05-23 18:42 ` Pablo Neira Ayuso
  2018-05-23 18:42 ` [PATCH 17/18] netfilter: nfnetlink_queue: resolve clash for unconfirmed conntracks Pablo Neira Ayuso
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Move the nf_nat_decode_session_hook and nfnetlink_parse_nat_setup_hook
indirections into the new struct nf_nat_hook.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter.h            | 21 ++++++++++++++++-----
 include/net/netfilter/nf_nat_core.h  |  7 -------
 net/netfilter/core.c                 |  8 +++-----
 net/netfilter/nf_conntrack_core.c    |  5 -----
 net/netfilter/nf_conntrack_netlink.c | 10 +++++-----
 net/netfilter/nf_nat_core.c          | 23 ++++++++++++-----------
 6 files changed, 36 insertions(+), 38 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 75ded6f6eebe..e8d09dc028f6 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -320,18 +320,29 @@ int nf_route(struct net *net, struct dst_entry **dst, struct flowi *fl,
 int nf_reroute(struct sk_buff *skb, struct nf_queue_entry *entry);
 
 #include <net/flow.h>
-extern void (*nf_nat_decode_session_hook)(struct sk_buff *, struct flowi *);
+
+struct nf_conn;
+enum nf_nat_manip_type;
+struct nlattr;
+
+struct nf_nat_hook {
+	int (*parse_nat_setup)(struct nf_conn *ct, enum nf_nat_manip_type manip,
+			       const struct nlattr *attr);
+	void (*decode_session)(struct sk_buff *skb, struct flowi *fl);
+};
+
+extern struct nf_nat_hook __rcu *nf_nat_hook;
 
 static inline void
 nf_nat_decode_session(struct sk_buff *skb, struct flowi *fl, u_int8_t family)
 {
 #ifdef CONFIG_NF_NAT_NEEDED
-	void (*decodefn)(struct sk_buff *, struct flowi *);
+	struct nf_nat_hook *nat_hook;
 
 	rcu_read_lock();
-	decodefn = rcu_dereference(nf_nat_decode_session_hook);
-	if (decodefn)
-		decodefn(skb, fl);
+	nat_hook = rcu_dereference(nf_nat_hook);
+	if (nat_hook->decode_session)
+		nat_hook->decode_session(skb, fl);
 	rcu_read_unlock();
 #endif
 }
diff --git a/include/net/netfilter/nf_nat_core.h b/include/net/netfilter/nf_nat_core.h
index c78e9be14b3d..dc7cd0440229 100644
--- a/include/net/netfilter/nf_nat_core.h
+++ b/include/net/netfilter/nf_nat_core.h
@@ -26,11 +26,4 @@ static inline int nf_nat_initialized(struct nf_conn *ct,
 		return ct->status & IPS_DST_NAT_DONE;
 }
 
-struct nlattr;
-
-extern int
-(*nfnetlink_parse_nat_setup_hook)(struct nf_conn *ct,
-				  enum nf_nat_manip_type manip,
-				  const struct nlattr *attr);
-
 #endif /* _NF_NAT_CORE_H */
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 1bd844ea1d7c..e0ae4aae96f5 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -574,6 +574,9 @@ void (*ip_ct_attach)(struct sk_buff *, const struct sk_buff *)
 		__rcu __read_mostly;
 EXPORT_SYMBOL(ip_ct_attach);
 
+struct nf_nat_hook __rcu *nf_nat_hook __read_mostly;
+EXPORT_SYMBOL_GPL(nf_nat_hook);
+
 void nf_ct_attach(struct sk_buff *new, const struct sk_buff *skb)
 {
 	void (*attach)(struct sk_buff *, const struct sk_buff *);
@@ -608,11 +611,6 @@ const struct nf_conntrack_zone nf_ct_zone_dflt = {
 EXPORT_SYMBOL_GPL(nf_ct_zone_dflt);
 #endif /* CONFIG_NF_CONNTRACK */
 
-#ifdef CONFIG_NF_NAT_NEEDED
-void (*nf_nat_decode_session_hook)(struct sk_buff *, struct flowi *);
-EXPORT_SYMBOL(nf_nat_decode_session_hook);
-#endif
-
 static void __net_init __netfilter_net_init(struct nf_hook_entries **e, int max)
 {
 	int h;
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 8b2a8644d955..8d109d750073 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -58,11 +58,6 @@
 
 #include "nf_internals.h"
 
-int (*nfnetlink_parse_nat_setup_hook)(struct nf_conn *ct,
-				      enum nf_nat_manip_type manip,
-				      const struct nlattr *attr) __read_mostly;
-EXPORT_SYMBOL_GPL(nfnetlink_parse_nat_setup_hook);
-
 __cacheline_aligned_in_smp spinlock_t nf_conntrack_locks[CONNTRACK_LOCKS];
 EXPORT_SYMBOL_GPL(nf_conntrack_locks);
 
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index d807b8770be3..39327a42879f 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1431,11 +1431,11 @@ ctnetlink_parse_nat_setup(struct nf_conn *ct,
 			  enum nf_nat_manip_type manip,
 			  const struct nlattr *attr)
 {
-	typeof(nfnetlink_parse_nat_setup_hook) parse_nat_setup;
+	struct nf_nat_hook *nat_hook;
 	int err;
 
-	parse_nat_setup = rcu_dereference(nfnetlink_parse_nat_setup_hook);
-	if (!parse_nat_setup) {
+	nat_hook = rcu_dereference(nf_nat_hook);
+	if (!nat_hook) {
 #ifdef CONFIG_MODULES
 		rcu_read_unlock();
 		nfnl_unlock(NFNL_SUBSYS_CTNETLINK);
@@ -1446,13 +1446,13 @@ ctnetlink_parse_nat_setup(struct nf_conn *ct,
 		}
 		nfnl_lock(NFNL_SUBSYS_CTNETLINK);
 		rcu_read_lock();
-		if (nfnetlink_parse_nat_setup_hook)
+		if (nat_hook->parse_nat_setup)
 			return -EAGAIN;
 #endif
 		return -EOPNOTSUPP;
 	}
 
-	err = parse_nat_setup(ct, manip, attr);
+	err = nat_hook->parse_nat_setup(ct, manip, attr);
 	if (err == -EAGAIN) {
 #ifdef CONFIG_MODULES
 		rcu_read_unlock();
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index 489599b549cf..f4d264676cfe 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -1026,6 +1026,13 @@ static struct pernet_operations nat_net_ops = {
 	.size = sizeof(struct nat_net),
 };
 
+struct nf_nat_hook nat_hook = {
+	.parse_nat_setup	= nfnetlink_parse_nat_setup,
+#ifdef CONFIG_XFRM
+	.decode_session		= __nf_nat_decode_session,
+#endif
+};
+
 static int __init nf_nat_init(void)
 {
 	int ret, i;
@@ -1057,13 +1064,9 @@ static int __init nf_nat_init(void)
 
 	nf_ct_helper_expectfn_register(&follow_master_nat);
 
-	BUG_ON(nfnetlink_parse_nat_setup_hook != NULL);
-	RCU_INIT_POINTER(nfnetlink_parse_nat_setup_hook,
-			   nfnetlink_parse_nat_setup);
-#ifdef CONFIG_XFRM
-	BUG_ON(nf_nat_decode_session_hook != NULL);
-	RCU_INIT_POINTER(nf_nat_decode_session_hook, __nf_nat_decode_session);
-#endif
+	WARN_ON(nf_nat_hook != NULL);
+	RCU_INIT_POINTER(nf_nat_hook, &nat_hook);
+
 	return 0;
 }
 
@@ -1076,10 +1079,8 @@ static void __exit nf_nat_cleanup(void)
 
 	nf_ct_extend_unregister(&nat_extend);
 	nf_ct_helper_expectfn_unregister(&follow_master_nat);
-	RCU_INIT_POINTER(nfnetlink_parse_nat_setup_hook, NULL);
-#ifdef CONFIG_XFRM
-	RCU_INIT_POINTER(nf_nat_decode_session_hook, NULL);
-#endif
+	RCU_INIT_POINTER(nf_nat_hook, NULL);
+
 	synchronize_rcu();
 
 	for (i = 0; i < NFPROTO_NUMPROTO; i++)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread
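
Patches 15/18 and 16/18 replace several standalone function pointers with one structure of callbacks published through a single pointer. A minimal userspace sketch of that pattern follows. It uses a C11 atomic pointer as a stand-in for RCU, and all names (struct nat_ops, nat_ops_register, call_decode, demo_parse, demo_decode) are hypothetical. The sketch checks the structure pointer before testing the optional member; that guard is a choice made here, not something every call site in the patch does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* One structure groups the callbacks a module exports. */
struct nat_ops {
	int (*parse_setup)(int arg);
	void (*decode)(int *val);	/* optional, may be NULL */
};

/* Single published pointer; NULL while no provider is registered. */
static _Atomic(struct nat_ops *) nat_ops_ptr;

static void nat_ops_register(struct nat_ops *ops)
{
	atomic_store(&nat_ops_ptr, ops);	/* pass NULL to unregister */
}

/* Returns 1 if the decode callback ran, 0 if it was unavailable. */
static int call_decode(int *val)
{
	struct nat_ops *ops = atomic_load(&nat_ops_ptr);

	assert(val != NULL);
	if (!ops || !ops->decode)
		return 0;
	ops->decode(val);
	return 1;
}

/* Hypothetical callbacks used only to exercise the sketch. */
static int demo_parse(int arg)
{
	return arg;
}

static void demo_decode(int *val)
{
	(*val)++;
}
```

Grouping the callbacks means a consumer dereferences one pointer and sees either no provider at all or a complete set of operations, and new callbacks can be added without exporting another symbol.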

* [PATCH 17/18] netfilter: nfnetlink_queue: resolve clash for unconfirmed conntracks
  2018-05-23 18:42 [PATCH 00/18] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (15 preceding siblings ...)
  2018-05-23 18:42 ` [PATCH 16/18] netfilter: add struct nf_nat_hook " Pablo Neira Ayuso
@ 2018-05-23 18:42 ` Pablo Neira Ayuso
  2018-05-23 18:42 ` [PATCH 18/18] netfilter: nf_tables: remove nft_af_info Pablo Neira Ayuso
  2018-05-23 20:37 ` [PATCH 00/18] Netfilter updates for net-next David Miller
  18 siblings, 0 replies; 20+ messages in thread
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

In nfqueue, two consecutive skbuffs may race to create the conntrack
entry. The one that loses the race gets dropped because of the clash
on insertion into the hashes from the nf_conntrack_confirm() path.

This patch adds a new nf_conntrack_update() function which searches for
possible clashes and resolves them. NAT mangling for the packet that
loses the race is corrected using the conntrack entry that won the race.

In order to avoid direct module dependencies on conntrack and NAT, this
is done through the nf_ct_hook and nf_nat_hook structures.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter.h         |  5 +++
 net/netfilter/nf_conntrack_core.c | 77 +++++++++++++++++++++++++++++++++++++++
 net/netfilter/nf_nat_core.c       | 41 +++++++++++++--------
 net/netfilter/nfnetlink_queue.c   | 28 ++++++++++++--
 4 files changed, 132 insertions(+), 19 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index e8d09dc028f6..04551af2ff23 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -324,11 +324,15 @@ int nf_reroute(struct sk_buff *skb, struct nf_queue_entry *entry);
 struct nf_conn;
 enum nf_nat_manip_type;
 struct nlattr;
+enum ip_conntrack_dir;
 
 struct nf_nat_hook {
 	int (*parse_nat_setup)(struct nf_conn *ct, enum nf_nat_manip_type manip,
 			       const struct nlattr *attr);
 	void (*decode_session)(struct sk_buff *skb, struct flowi *fl);
+	unsigned int (*manip_pkt)(struct sk_buff *skb, struct nf_conn *ct,
+				  enum nf_nat_manip_type mtype,
+				  enum ip_conntrack_dir dir);
 };
 
 extern struct nf_nat_hook __rcu *nf_nat_hook;
@@ -392,6 +396,7 @@ struct nf_conn;
 enum ip_conntrack_info;
 
 struct nf_ct_hook {
+	int (*update)(struct net *net, struct sk_buff *skb);
 	void (*destroy)(struct nf_conntrack *);
 };
 extern struct nf_ct_hook __rcu *nf_ct_hook;
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 8d109d750073..3465da2a98bd 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1607,6 +1607,82 @@ static void nf_conntrack_attach(struct sk_buff *nskb, const struct sk_buff *skb)
 	nf_conntrack_get(skb_nfct(nskb));
 }
 
+static int nf_conntrack_update(struct net *net, struct sk_buff *skb)
+{
+	const struct nf_conntrack_l3proto *l3proto;
+	const struct nf_conntrack_l4proto *l4proto;
+	struct nf_conntrack_tuple_hash *h;
+	struct nf_conntrack_tuple tuple;
+	enum ip_conntrack_info ctinfo;
+	struct nf_nat_hook *nat_hook;
+	unsigned int dataoff, status;
+	struct nf_conn *ct;
+	u16 l3num;
+	u8 l4num;
+
+	ct = nf_ct_get(skb, &ctinfo);
+	if (!ct || nf_ct_is_confirmed(ct))
+		return 0;
+
+	l3num = nf_ct_l3num(ct);
+	l3proto = nf_ct_l3proto_find_get(l3num);
+
+	if (l3proto->get_l4proto(skb, skb_network_offset(skb), &dataoff,
+				 &l4num) <= 0)
+		return -1;
+
+	l4proto = nf_ct_l4proto_find_get(l3num, l4num);
+
+	if (!nf_ct_get_tuple(skb, skb_network_offset(skb), dataoff, l3num,
+			     l4num, net, &tuple, l3proto, l4proto))
+		return -1;
+
+	if (ct->status & IPS_SRC_NAT) {
+		memcpy(tuple.src.u3.all,
+		       ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u3.all,
+		       sizeof(tuple.src.u3.all));
+		tuple.src.u.all =
+			ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u.all;
+	}
+
+	if (ct->status & IPS_DST_NAT) {
+		memcpy(tuple.dst.u3.all,
+		       ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u3.all,
+		       sizeof(tuple.dst.u3.all));
+		tuple.dst.u.all =
+			ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u.all;
+	}
+
+	h = nf_conntrack_find_get(net, nf_ct_zone(ct), &tuple);
+	if (!h)
+		return 0;
+
+	/* Store status bits of the conntrack that is clashing to re-do NAT
+	 * mangling according to what it has been done already to this packet.
+	 */
+	status = ct->status;
+
+	nf_ct_put(ct);
+	ct = nf_ct_tuplehash_to_ctrack(h);
+	nf_ct_set(skb, ct, ctinfo);
+
+	nat_hook = rcu_dereference(nf_nat_hook);
+	if (!nat_hook)
+		return 0;
+
+	if (status & IPS_SRC_NAT &&
+	    nat_hook->manip_pkt(skb, ct, NF_NAT_MANIP_SRC,
+				IP_CT_DIR_ORIGINAL) == NF_DROP)
+		return -1;
+
+	if (status & IPS_DST_NAT &&
+	    nat_hook->manip_pkt(skb, ct, NF_NAT_MANIP_DST,
+				IP_CT_DIR_ORIGINAL) == NF_DROP)
+		return -1;
+
+	return 0;
+}
+
 /* Bring out ya dead! */
 static struct nf_conn *
 get_next_corpse(int (*iter)(struct nf_conn *i, void *data),
@@ -2126,6 +2202,7 @@ int nf_conntrack_init_start(void)
 }
 
 static struct nf_ct_hook nf_conntrack_hook = {
+	.update		= nf_conntrack_update,
 	.destroy	= destroy_conntrack,
 };
 
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index f4d264676cfe..821f8d835f7a 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -493,17 +493,36 @@ nf_nat_alloc_null_binding(struct nf_conn *ct, unsigned int hooknum)
 }
 EXPORT_SYMBOL_GPL(nf_nat_alloc_null_binding);
 
+static unsigned int nf_nat_manip_pkt(struct sk_buff *skb, struct nf_conn *ct,
+				     enum nf_nat_manip_type mtype,
+				     enum ip_conntrack_dir dir)
+{
+	const struct nf_nat_l3proto *l3proto;
+	const struct nf_nat_l4proto *l4proto;
+	struct nf_conntrack_tuple target;
+
+	/* We are aiming to look like inverse of other direction. */
+	nf_ct_invert_tuplepr(&target, &ct->tuplehash[!dir].tuple);
+
+	l3proto = __nf_nat_l3proto_find(target.src.l3num);
+	l4proto = __nf_nat_l4proto_find(target.src.l3num,
+					target.dst.protonum);
+	if (!l3proto->manip_pkt(skb, 0, l4proto, &target, mtype))
+		return NF_DROP;
+
+	return NF_ACCEPT;
+}
+
 /* Do packet manipulations according to nf_nat_setup_info. */
 unsigned int nf_nat_packet(struct nf_conn *ct,
 			   enum ip_conntrack_info ctinfo,
 			   unsigned int hooknum,
 			   struct sk_buff *skb)
 {
-	const struct nf_nat_l3proto *l3proto;
-	const struct nf_nat_l4proto *l4proto;
+	enum nf_nat_manip_type mtype = HOOK2MANIP(hooknum);
 	enum ip_conntrack_dir dir = CTINFO2DIR(ctinfo);
+	unsigned int verdict = NF_ACCEPT;
 	unsigned long statusbit;
-	enum nf_nat_manip_type mtype = HOOK2MANIP(hooknum);
 
 	if (mtype == NF_NAT_MANIP_SRC)
 		statusbit = IPS_SRC_NAT;
@@ -515,19 +534,10 @@ unsigned int nf_nat_packet(struct nf_conn *ct,
 		statusbit ^= IPS_NAT_MASK;
 
 	/* Non-atomic: these bits don't change. */
-	if (ct->status & statusbit) {
-		struct nf_conntrack_tuple target;
-
-		/* We are aiming to look like inverse of other direction. */
-		nf_ct_invert_tuplepr(&target, &ct->tuplehash[!dir].tuple);
+	if (ct->status & statusbit)
+		verdict = nf_nat_manip_pkt(skb, ct, mtype, dir);
 
-		l3proto = __nf_nat_l3proto_find(target.src.l3num);
-		l4proto = __nf_nat_l4proto_find(target.src.l3num,
-						target.dst.protonum);
-		if (!l3proto->manip_pkt(skb, 0, l4proto, &target, mtype))
-			return NF_DROP;
-	}
-	return NF_ACCEPT;
+	return verdict;
 }
 EXPORT_SYMBOL_GPL(nf_nat_packet);
 
@@ -1031,6 +1041,7 @@ struct nf_nat_hook nat_hook = {
 #ifdef CONFIG_XFRM
 	.decode_session		= __nf_nat_decode_session,
 #endif
+	.manip_pkt		= nf_nat_manip_pkt,
 };
 
 static int __init nf_nat_init(void)
diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
index 74a04638ef03..2c173042ac0e 100644
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -227,6 +227,25 @@ find_dequeue_entry(struct nfqnl_instance *queue, unsigned int id)
 	return entry;
 }
 
+static void nfqnl_reinject(struct nf_queue_entry *entry, unsigned int verdict)
+{
+	struct nf_ct_hook *ct_hook;
+	int err;
+
+	if (verdict == NF_ACCEPT ||
+	    verdict == NF_STOP) {
+		rcu_read_lock();
+		ct_hook = rcu_dereference(nf_ct_hook);
+		if (ct_hook) {
+			err = ct_hook->update(entry->state.net, entry->skb);
+			if (err < 0)
+				verdict = NF_DROP;
+		}
+		rcu_read_unlock();
+	}
+	nf_reinject(entry, verdict);
+}
+
 static void
 nfqnl_flush(struct nfqnl_instance *queue, nfqnl_cmpfn cmpfn, unsigned long data)
 {
@@ -237,7 +256,7 @@ nfqnl_flush(struct nfqnl_instance *queue, nfqnl_cmpfn cmpfn, unsigned long data)
 		if (!cmpfn || cmpfn(entry, data)) {
 			list_del(&entry->list);
 			queue->queue_total--;
-			nf_reinject(entry, NF_DROP);
+			nfqnl_reinject(entry, NF_DROP);
 		}
 	}
 	spin_unlock_bh(&queue->lock);
@@ -686,7 +705,7 @@ __nfqnl_enqueue_packet(struct net *net, struct nfqnl_instance *queue,
 err_out_unlock:
 	spin_unlock_bh(&queue->lock);
 	if (failopen)
-		nf_reinject(entry, NF_ACCEPT);
+		nfqnl_reinject(entry, NF_ACCEPT);
 err_out:
 	return err;
 }
@@ -1085,7 +1104,8 @@ static int nfqnl_recv_verdict_batch(struct net *net, struct sock *ctnl,
 	list_for_each_entry_safe(entry, tmp, &batch_list, list) {
 		if (nfqa[NFQA_MARK])
 			entry->skb->mark = ntohl(nla_get_be32(nfqa[NFQA_MARK]));
-		nf_reinject(entry, verdict);
+
+		nfqnl_reinject(entry, verdict);
 	}
 	return 0;
 }
@@ -1208,7 +1228,7 @@ static int nfqnl_recv_verdict(struct net *net, struct sock *ctnl,
 	if (nfqa[NFQA_MARK])
 		entry->skb->mark = ntohl(nla_get_be32(nfqa[NFQA_MARK]));
 
-	nf_reinject(entry, verdict);
+	nfqnl_reinject(entry, verdict);
 	return 0;
 }
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread
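
The nfqnl_reinject() wrapper in patch 17/18 only consults conntrack when the packet is about to continue through the stack, and downgrades the verdict to a drop if the update fails. A minimal sketch of that decision follows. The verdict constants are local stand-ins for NF_DROP, NF_ACCEPT and NF_STOP, and struct ct_ops, final_verdict(), update_ok() and update_fail() are hypothetical names. Unlike the patch, the sketch also tolerates a missing update callback.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the netfilter verdicts used by the patch. */
enum verdict { V_DROP = 0, V_ACCEPT = 1, V_STOP = 5 };

struct ct_ops {
	int (*update)(void *pkt);	/* < 0 means clash resolution failed */
};

static enum verdict final_verdict(const struct ct_ops *ops, void *pkt,
				  enum verdict v)
{
	assert(v == V_DROP || v == V_ACCEPT || v == V_STOP);

	/* Packets that are dropped anyway need no conntrack fixup. */
	if (v != V_ACCEPT && v != V_STOP)
		return v;

	/* No provider registered: keep the verdict as given. */
	if (ops && ops->update && ops->update(pkt) < 0)
		return V_DROP;

	return v;
}

/* Hypothetical callbacks used only to exercise the sketch. */
static int update_ok(void *pkt)
{
	(void)pkt;
	return 0;
}

static int update_fail(void *pkt)
{
	(void)pkt;
	return -1;
}
```

Routing every reinjection through one helper keeps the conntrack fixup in a single place instead of repeating it in each verdict path.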

* [PATCH 18/18] netfilter: nf_tables: remove nft_af_info.
  2018-05-23 18:42 [PATCH 00/18] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (16 preceding siblings ...)
  2018-05-23 18:42 ` [PATCH 17/18] netfilter: nfnetlink_queue: resolve clash for unconfirmed conntracks Pablo Neira Ayuso
@ 2018-05-23 18:42 ` Pablo Neira Ayuso
  2018-05-23 20:37 ` [PATCH 00/18] Netfilter updates for net-next David Miller
  18 siblings, 0 replies; 20+ messages in thread
From: Pablo Neira Ayuso @ 2018-05-23 18:42 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Taehee Yoo <ap420073@gmail.com>

struct nft_af_info no longer exists, so remove its leftover forward
declaration.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netns/nftables.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/net/netns/nftables.h b/include/net/netns/nftables.h
index 48134353411d..29c3851b486a 100644
--- a/include/net/netns/nftables.h
+++ b/include/net/netns/nftables.h
@@ -4,8 +4,6 @@
 
 #include <linux/list.h>
 
-struct nft_af_info;
-
 struct netns_nftables {
 	struct list_head	tables;
 	struct list_head	commit_list;
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 00/18] Netfilter updates for net-next
  2018-05-23 18:42 [PATCH 00/18] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (17 preceding siblings ...)
  2018-05-23 18:42 ` [PATCH 18/18] netfilter: nf_tables: remove nft_af_info Pablo Neira Ayuso
@ 2018-05-23 20:37 ` David Miller
  18 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2018-05-23 20:37 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed, 23 May 2018 20:42:36 +0200

> The following patchset contains Netfilter updates for your net-next
> tree, they are:
 ...
> This batch comes with is a conflict between 25fd386e0bc0 ("netfilter:
> core: add missing __rcu annotation") in your tree and 2c205dd3981f
> ("netfilter: add struct nf_nat_hook and use it") coming in this batch.
> This conflict can be solved by leaving the __rcu tag on
> __netfilter_net_init() - added by 25fd386e0bc0 - and remove all code
> related to nf_nat_decode_session_hook - which is gone after
> 2c205dd3981f, as described by:
> 
> diff --cc net/netfilter/core.c
> index e0ae4aae96f5,206fb2c4c319..168af54db975
> --- a/net/netfilter/core.c
> +++ b/net/netfilter/core.c
 ...
> I can also merge your net-next tree into nf-next, solve the conflict and
> resend the pull request if you prefer so.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git

Thanks for the merge conflict resolution guide.

Pulled, thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

Thread overview: 20+ messages
2018-05-23 18:42 [PATCH 00/18] Netfilter updates for net-next Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 01/18] netfilter: fix fallout from xt/nf osf separation Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 02/18] netfilter: nf_tables: remove old nf_log based tracing Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 03/18] netfilter: nft_numgen: add map lookups for numgen random operations Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 04/18] netfilter: nft_hash: add map lookups for hashing operations Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 05/18] netfilter: nf_nat: move common nat code to nat core Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 06/18] netfilter: xtables: allow table definitions not backed by hook_ops Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 07/18] netfilter: nf_tables: allow chain type to override hook register Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 08/18] netfilter: core: export raw versions of add/delete hook functions Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 09/18] netfilter: nf_nat: add nat hook register functions to nf_nat Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 10/18] netfilter: nf_nat: add nat type hooks to nat core Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 11/18] netfilter: lift one-nat-hook-only restriction Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 12/18] netfilter: make NF_OSF non-visible symbol Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 13/18] netfilter: nft_set_rbtree: add timeout support Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 14/18] netfilter: ip6t_rpfilter: provide input interface for route lookup Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 15/18] netfilter: add struct nf_ct_hook and use it Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 16/18] netfilter: add struct nf_nat_hook " Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 17/18] netfilter: nfnetlink_queue: resolve clash for unconfirmed conntracks Pablo Neira Ayuso
2018-05-23 18:42 ` [PATCH 18/18] netfilter: nf_tables: remove nft_af_info Pablo Neira Ayuso
2018-05-23 20:37 ` [PATCH 00/18] Netfilter updates for net-next David Miller
