netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8] Netfilter fixes for net
@ 2018-10-22 20:07 Pablo Neira Ayuso
  2018-10-22 20:07 ` [PATCH 1/8] netfilter: nft_set_rbtree: allow loose matching of closing element in interval Pablo Neira Ayuso
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Pablo Neira Ayuso @ 2018-10-22 20:07 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter fixes for your net tree:

1) rbtree lookup from control plane returns the left-hand side element
   of the range when the interval end flag is set on.

2) osf extension is not supported from the input path, reject this from
   the control plane, from Fernando Fernandez Mancera.

3) xt_TEE is leaving output interface unset due to a recent incorrect
   netns rework, from Taehee Yoo.

4) xt_TEE allows to select an interface which does not belong to this
   netnamespace, from Taehee Yoo.

5) Zero private extension area in nft_compat, just like we do in x_tables,
   otherwise we leak kernel memory to userspace.

6) Missing .checkentry and .destroy entries in new DNAT extensions breaks
   it since we never load nf_conntrack dependencies, from Paolo Abeni.

7) Do not remove flowtable hook from netns exit path, the netdevice handler
   already deals with this, also from Taehee Yoo.

8) Only cleanup flowtable entries that reside in this netnamespace, also
   from Taehee Yoo.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks.

----------------------------------------------------------------

The following changes since commit 9a4890bd6d6325a1c88564a20ab310b2d56f6094:

  rds: RDS (tcp) hangs on sendto() to unresponding address (2018-10-10 22:19:52 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to a3fb3698cadf27dc142b24394c401625e14d80d0:

  netfilter: nf_flow_table: do not remove offload when other netns's interface is down (2018-10-19 13:30:48 +0200)

----------------------------------------------------------------
Fernando Fernandez Mancera (1):
      netfilter: nft_osf: usage from output path is not valid

Pablo Neira Ayuso (2):
      netfilter: nft_set_rbtree: allow loose matching of closing element in interval
      netfilter: nft_compat: do not dump private area

Paolo Abeni (1):
      netfilter: xt_nat: fix DNAT target for shifted portmap ranges

Taehee Yoo (4):
      netfilter: xt_TEE: fix wrong interface selection
      netfilter: xt_TEE: add missing code to get interface index in checkentry.
      netfilter: nf_flow_table: remove flowtable hook flush routine in netns exit routine
      netfilter: nf_flow_table: do not remove offload when other netns's interface is down

 net/netfilter/nf_flow_table_core.c |  9 +++--
 net/netfilter/nf_tables_api.c      |  3 --
 net/netfilter/nft_compat.c         | 24 +++++++++++-
 net/netfilter/nft_osf.c            | 10 +++++
 net/netfilter/nft_set_rbtree.c     | 10 ++++-
 net/netfilter/xt_TEE.c             | 76 +++++++++++++++++++++++++++++---------
 net/netfilter/xt_nat.c             |  2 +
 7 files changed, 107 insertions(+), 27 deletions(-)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/8] netfilter: nft_set_rbtree: allow loose matching of closing element in interval
  2018-10-22 20:07 [PATCH 0/8] Netfilter fixes for net Pablo Neira Ayuso
@ 2018-10-22 20:07 ` Pablo Neira Ayuso
  2018-10-22 20:07 ` [PATCH 2/8] netfilter: nft_osf: usage from output path is not valid Pablo Neira Ayuso
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Pablo Neira Ayuso @ 2018-10-22 20:07 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Allow to find closest matching for the right side of an interval (end
flag set on) so we allow lookups in inner ranges, eg. 10-20 in 5-25.

Fixes: ba0e4d9917b4 ("netfilter: nf_tables: get set elements via netlink")
Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_set_rbtree.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index 0e5ec126f6ad..fa61208371f8 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -135,9 +135,12 @@ static bool __nft_rbtree_get(const struct net *net, const struct nft_set *set,
 		d = memcmp(this, key, set->klen);
 		if (d < 0) {
 			parent = rcu_dereference_raw(parent->rb_left);
-			interval = rbe;
+			if (!(flags & NFT_SET_ELEM_INTERVAL_END))
+				interval = rbe;
 		} else if (d > 0) {
 			parent = rcu_dereference_raw(parent->rb_right);
+			if (flags & NFT_SET_ELEM_INTERVAL_END)
+				interval = rbe;
 		} else {
 			if (!nft_set_elem_active(&rbe->ext, genmask))
 				parent = rcu_dereference_raw(parent->rb_left);
@@ -154,7 +157,10 @@ static bool __nft_rbtree_get(const struct net *net, const struct nft_set *set,
 
 	if (set->flags & NFT_SET_INTERVAL && interval != NULL &&
 	    nft_set_elem_active(&interval->ext, genmask) &&
-	    !nft_rbtree_interval_end(interval)) {
+	    ((!nft_rbtree_interval_end(interval) &&
+	      !(flags & NFT_SET_ELEM_INTERVAL_END)) ||
+	     (nft_rbtree_interval_end(interval) &&
+	      (flags & NFT_SET_ELEM_INTERVAL_END)))) {
 		*elem = interval;
 		return true;
 	}
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/8] netfilter: nft_osf: usage from output path is not valid
  2018-10-22 20:07 [PATCH 0/8] Netfilter fixes for net Pablo Neira Ayuso
  2018-10-22 20:07 ` [PATCH 1/8] netfilter: nft_set_rbtree: allow loose matching of closing element in interval Pablo Neira Ayuso
@ 2018-10-22 20:07 ` Pablo Neira Ayuso
  2018-10-22 20:07 ` [PATCH 3/8] netfilter: xt_TEE: fix wrong interface selection Pablo Neira Ayuso
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Pablo Neira Ayuso @ 2018-10-22 20:07 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Fernando Fernandez Mancera <ffmancera@riseup.net>

The nft_osf extension, like xt_osf, is not supported from the output
path.

Fixes: b96af92d6eaf ("netfilter: nf_tables: implement Passive OS fingerprint module in nft_osf")
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_osf.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/net/netfilter/nft_osf.c b/net/netfilter/nft_osf.c
index a35fb59ace73..df4e3e0412ed 100644
--- a/net/netfilter/nft_osf.c
+++ b/net/netfilter/nft_osf.c
@@ -69,6 +69,15 @@ static int nft_osf_dump(struct sk_buff *skb, const struct nft_expr *expr)
 	return -1;
 }
 
+static int nft_osf_validate(const struct nft_ctx *ctx,
+			    const struct nft_expr *expr,
+			    const struct nft_data **data)
+{
+	return nft_chain_validate_hooks(ctx->chain, (1 << NF_INET_LOCAL_IN) |
+						    (1 << NF_INET_PRE_ROUTING) |
+						    (1 << NF_INET_FORWARD));
+}
+
 static struct nft_expr_type nft_osf_type;
 static const struct nft_expr_ops nft_osf_op = {
 	.eval		= nft_osf_eval,
@@ -76,6 +85,7 @@ static const struct nft_expr_ops nft_osf_op = {
 	.init		= nft_osf_init,
 	.dump		= nft_osf_dump,
 	.type		= &nft_osf_type,
+	.validate	= nft_osf_validate,
 };
 
 static struct nft_expr_type nft_osf_type __read_mostly = {
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/8] netfilter: xt_TEE: fix wrong interface selection
  2018-10-22 20:07 [PATCH 0/8] Netfilter fixes for net Pablo Neira Ayuso
  2018-10-22 20:07 ` [PATCH 1/8] netfilter: nft_set_rbtree: allow loose matching of closing element in interval Pablo Neira Ayuso
  2018-10-22 20:07 ` [PATCH 2/8] netfilter: nft_osf: usage from output path is not valid Pablo Neira Ayuso
@ 2018-10-22 20:07 ` Pablo Neira Ayuso
  2018-10-22 20:07 ` [PATCH 4/8] netfilter: xt_TEE: add missing code to get interface index in checkentry Pablo Neira Ayuso
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Pablo Neira Ayuso @ 2018-10-22 20:07 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Taehee Yoo <ap420073@gmail.com>

TEE netdevice notifier handler checks only interface name. however
each netns can have same interface name. hence other netns's interface
could be selected.

test commands:
   %ip netns add vm1
   %iptables -I INPUT -p icmp -j TEE --gateway 192.168.1.1 --oif enp2s0
   %ip link set enp2s0 netns vm1

Above rule is in the root netns. but that rule could get enp2s0
ifindex of vm1 by notifier handler.

After this patch, TEE rule is added to the per-netns list.

Fixes: 9e2f6c5d78db ("netfilter: Rework xt_TEE netdevice notifier")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/xt_TEE.c | 69 +++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 52 insertions(+), 17 deletions(-)

diff --git a/net/netfilter/xt_TEE.c b/net/netfilter/xt_TEE.c
index 0d0d68c989df..673ad2099f97 100644
--- a/net/netfilter/xt_TEE.c
+++ b/net/netfilter/xt_TEE.c
@@ -14,6 +14,8 @@
 #include <linux/skbuff.h>
 #include <linux/route.h>
 #include <linux/netfilter/x_tables.h>
+#include <net/net_namespace.h>
+#include <net/netns/generic.h>
 #include <net/route.h>
 #include <net/netfilter/ipv4/nf_dup_ipv4.h>
 #include <net/netfilter/ipv6/nf_dup_ipv6.h>
@@ -25,8 +27,15 @@ struct xt_tee_priv {
 	int			oif;
 };
 
+static unsigned int tee_net_id __read_mostly;
 static const union nf_inet_addr tee_zero_address;
 
+struct tee_net {
+	struct list_head priv_list;
+	/* lock protects the priv_list */
+	struct mutex lock;
+};
+
 static unsigned int
 tee_tg4(struct sk_buff *skb, const struct xt_action_param *par)
 {
@@ -51,17 +60,16 @@ tee_tg6(struct sk_buff *skb, const struct xt_action_param *par)
 }
 #endif
 
-static DEFINE_MUTEX(priv_list_mutex);
-static LIST_HEAD(priv_list);
-
 static int tee_netdev_event(struct notifier_block *this, unsigned long event,
 			    void *ptr)
 {
 	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+	struct net *net = dev_net(dev);
+	struct tee_net *tn = net_generic(net, tee_net_id);
 	struct xt_tee_priv *priv;
 
-	mutex_lock(&priv_list_mutex);
-	list_for_each_entry(priv, &priv_list, list) {
+	mutex_lock(&tn->lock);
+	list_for_each_entry(priv, &tn->priv_list, list) {
 		switch (event) {
 		case NETDEV_REGISTER:
 			if (!strcmp(dev->name, priv->tginfo->oif))
@@ -79,13 +87,14 @@ static int tee_netdev_event(struct notifier_block *this, unsigned long event,
 			break;
 		}
 	}
-	mutex_unlock(&priv_list_mutex);
+	mutex_unlock(&tn->lock);
 
 	return NOTIFY_DONE;
 }
 
 static int tee_tg_check(const struct xt_tgchk_param *par)
 {
+	struct tee_net *tn = net_generic(par->net, tee_net_id);
 	struct xt_tee_tginfo *info = par->targinfo;
 	struct xt_tee_priv *priv;
 
@@ -106,9 +115,9 @@ static int tee_tg_check(const struct xt_tgchk_param *par)
 		priv->oif     = -1;
 		info->priv    = priv;
 
-		mutex_lock(&priv_list_mutex);
-		list_add(&priv->list, &priv_list);
-		mutex_unlock(&priv_list_mutex);
+		mutex_lock(&tn->lock);
+		list_add(&priv->list, &tn->priv_list);
+		mutex_unlock(&tn->lock);
 	} else
 		info->priv = NULL;
 
@@ -118,12 +127,13 @@ static int tee_tg_check(const struct xt_tgchk_param *par)
 
 static void tee_tg_destroy(const struct xt_tgdtor_param *par)
 {
+	struct tee_net *tn = net_generic(par->net, tee_net_id);
 	struct xt_tee_tginfo *info = par->targinfo;
 
 	if (info->priv) {
-		mutex_lock(&priv_list_mutex);
+		mutex_lock(&tn->lock);
 		list_del(&info->priv->list);
-		mutex_unlock(&priv_list_mutex);
+		mutex_unlock(&tn->lock);
 		kfree(info->priv);
 	}
 	static_key_slow_dec(&xt_tee_enabled);
@@ -156,6 +166,21 @@ static struct xt_target tee_tg_reg[] __read_mostly = {
 #endif
 };
 
+static int __net_init tee_net_init(struct net *net)
+{
+	struct tee_net *tn = net_generic(net, tee_net_id);
+
+	INIT_LIST_HEAD(&tn->priv_list);
+	mutex_init(&tn->lock);
+	return 0;
+}
+
+static struct pernet_operations tee_net_ops = {
+	.init = tee_net_init,
+	.id   = &tee_net_id,
+	.size = sizeof(struct tee_net),
+};
+
 static struct notifier_block tee_netdev_notifier = {
 	.notifier_call = tee_netdev_event,
 };
@@ -164,22 +189,32 @@ static int __init tee_tg_init(void)
 {
 	int ret;
 
-	ret = xt_register_targets(tee_tg_reg, ARRAY_SIZE(tee_tg_reg));
-	if (ret)
+	ret = register_pernet_subsys(&tee_net_ops);
+	if (ret < 0)
 		return ret;
+
+	ret = xt_register_targets(tee_tg_reg, ARRAY_SIZE(tee_tg_reg));
+	if (ret < 0)
+		goto cleanup_subsys;
+
 	ret = register_netdevice_notifier(&tee_netdev_notifier);
-	if (ret) {
-		xt_unregister_targets(tee_tg_reg, ARRAY_SIZE(tee_tg_reg));
-		return ret;
-	}
+	if (ret < 0)
+		goto unregister_targets;
 
 	return 0;
+
+unregister_targets:
+	xt_unregister_targets(tee_tg_reg, ARRAY_SIZE(tee_tg_reg));
+cleanup_subsys:
+	unregister_pernet_subsys(&tee_net_ops);
+	return ret;
 }
 
 static void __exit tee_tg_exit(void)
 {
 	unregister_netdevice_notifier(&tee_netdev_notifier);
 	xt_unregister_targets(tee_tg_reg, ARRAY_SIZE(tee_tg_reg));
+	unregister_pernet_subsys(&tee_net_ops);
 }
 
 module_init(tee_tg_init);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/8] netfilter: xt_TEE: add missing code to get interface index in checkentry.
  2018-10-22 20:07 [PATCH 0/8] Netfilter fixes for net Pablo Neira Ayuso
                   ` (2 preceding siblings ...)
  2018-10-22 20:07 ` [PATCH 3/8] netfilter: xt_TEE: fix wrong interface selection Pablo Neira Ayuso
@ 2018-10-22 20:07 ` Pablo Neira Ayuso
  2018-10-22 20:07 ` [PATCH 5/8] netfilter: nft_compat: do not dump private area Pablo Neira Ayuso
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Pablo Neira Ayuso @ 2018-10-22 20:07 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Taehee Yoo <ap420073@gmail.com>

checkentry(tee_tg_check) should initialize priv->oif from dev if possible.
But only netdevice notifier handler can set that.
Hence priv->oif is always -1 until notifier handler is called.

Fixes: 9e2f6c5d78db ("netfilter: Rework xt_TEE netdevice notifier")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/xt_TEE.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/netfilter/xt_TEE.c b/net/netfilter/xt_TEE.c
index 673ad2099f97..1dae02a97ee3 100644
--- a/net/netfilter/xt_TEE.c
+++ b/net/netfilter/xt_TEE.c
@@ -104,6 +104,8 @@ static int tee_tg_check(const struct xt_tgchk_param *par)
 		return -EINVAL;
 
 	if (info->oif[0]) {
+		struct net_device *dev;
+
 		if (info->oif[sizeof(info->oif)-1] != '\0')
 			return -EINVAL;
 
@@ -115,6 +117,11 @@ static int tee_tg_check(const struct xt_tgchk_param *par)
 		priv->oif     = -1;
 		info->priv    = priv;
 
+		dev = dev_get_by_name(par->net, info->oif);
+		if (dev) {
+			priv->oif = dev->ifindex;
+			dev_put(dev);
+		}
 		mutex_lock(&tn->lock);
 		list_add(&priv->list, &tn->priv_list);
 		mutex_unlock(&tn->lock);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/8] netfilter: nft_compat: do not dump private area
  2018-10-22 20:07 [PATCH 0/8] Netfilter fixes for net Pablo Neira Ayuso
                   ` (3 preceding siblings ...)
  2018-10-22 20:07 ` [PATCH 4/8] netfilter: xt_TEE: add missing code to get interface index in checkentry Pablo Neira Ayuso
@ 2018-10-22 20:07 ` Pablo Neira Ayuso
  2018-10-22 20:07 ` [PATCH 6/8] netfilter: xt_nat: fix DNAT target for shifted portmap ranges Pablo Neira Ayuso
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Pablo Neira Ayuso @ 2018-10-22 20:07 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Zero pad private area, otherwise we expose private kernel pointer to
userspace. This patch also zeroes the tail area after the ->matchsize
and ->targetsize that results from XT_ALIGN().

Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_compat.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c
index 32535eea51b2..768292eac2a4 100644
--- a/net/netfilter/nft_compat.c
+++ b/net/netfilter/nft_compat.c
@@ -290,6 +290,24 @@ nft_target_destroy(const struct nft_ctx *ctx, const struct nft_expr *expr)
 		module_put(target->me);
 }
 
+static int nft_extension_dump_info(struct sk_buff *skb, int attr,
+				   const void *info,
+				   unsigned int size, unsigned int user_size)
+{
+	unsigned int info_size, aligned_size = XT_ALIGN(size);
+	struct nlattr *nla;
+
+	nla = nla_reserve(skb, attr, aligned_size);
+	if (!nla)
+		return -1;
+
+	info_size = user_size ? : size;
+	memcpy(nla_data(nla), info, info_size);
+	memset(nla_data(nla) + info_size, 0, aligned_size - info_size);
+
+	return 0;
+}
+
 static int nft_target_dump(struct sk_buff *skb, const struct nft_expr *expr)
 {
 	const struct xt_target *target = expr->ops->data;
@@ -297,7 +315,8 @@ static int nft_target_dump(struct sk_buff *skb, const struct nft_expr *expr)
 
 	if (nla_put_string(skb, NFTA_TARGET_NAME, target->name) ||
 	    nla_put_be32(skb, NFTA_TARGET_REV, htonl(target->revision)) ||
-	    nla_put(skb, NFTA_TARGET_INFO, XT_ALIGN(target->targetsize), info))
+	    nft_extension_dump_info(skb, NFTA_TARGET_INFO, info,
+				    target->targetsize, target->usersize))
 		goto nla_put_failure;
 
 	return 0;
@@ -532,7 +551,8 @@ static int __nft_match_dump(struct sk_buff *skb, const struct nft_expr *expr,
 
 	if (nla_put_string(skb, NFTA_MATCH_NAME, match->name) ||
 	    nla_put_be32(skb, NFTA_MATCH_REV, htonl(match->revision)) ||
-	    nla_put(skb, NFTA_MATCH_INFO, XT_ALIGN(match->matchsize), info))
+	    nft_extension_dump_info(skb, NFTA_MATCH_INFO, info,
+				    match->matchsize, match->usersize))
 		goto nla_put_failure;
 
 	return 0;
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 6/8] netfilter: xt_nat: fix DNAT target for shifted portmap ranges
  2018-10-22 20:07 [PATCH 0/8] Netfilter fixes for net Pablo Neira Ayuso
                   ` (4 preceding siblings ...)
  2018-10-22 20:07 ` [PATCH 5/8] netfilter: nft_compat: do not dump private area Pablo Neira Ayuso
@ 2018-10-22 20:07 ` Pablo Neira Ayuso
  2018-10-22 20:07 ` [PATCH 7/8] netfilter: nf_flow_table: remove flowtable hook flush routine in netns exit routine Pablo Neira Ayuso
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Pablo Neira Ayuso @ 2018-10-22 20:07 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Paolo Abeni <pabeni@redhat.com>

The commit 2eb0f624b709 ("netfilter: add NAT support for shifted
portmap ranges") did not set the checkentry/destroy callbacks for
the newly added DNAT target. As a result, rulesets using only
such nat targets are not effective, as the relevant conntrack hooks
are not enabled.
The above affect also nft_compat rulesets.
Fix the issue adding the missing initializers.

Fixes: 2eb0f624b709 ("netfilter: add NAT support for shifted portmap ranges")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/xt_nat.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/netfilter/xt_nat.c b/net/netfilter/xt_nat.c
index 8af9707f8789..ac91170fc8c8 100644
--- a/net/netfilter/xt_nat.c
+++ b/net/netfilter/xt_nat.c
@@ -216,6 +216,8 @@ static struct xt_target xt_nat_target_reg[] __read_mostly = {
 	{
 		.name		= "DNAT",
 		.revision	= 2,
+		.checkentry	= xt_nat_checkentry,
+		.destroy	= xt_nat_destroy,
 		.target		= xt_dnat_target_v2,
 		.targetsize	= sizeof(struct nf_nat_range2),
 		.table		= "nat",
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 7/8] netfilter: nf_flow_table: remove flowtable hook flush routine in netns exit routine
  2018-10-22 20:07 [PATCH 0/8] Netfilter fixes for net Pablo Neira Ayuso
                   ` (5 preceding siblings ...)
  2018-10-22 20:07 ` [PATCH 6/8] netfilter: xt_nat: fix DNAT target for shifted portmap ranges Pablo Neira Ayuso
@ 2018-10-22 20:07 ` Pablo Neira Ayuso
  2018-10-22 20:07 ` [PATCH 8/8] netfilter: nf_flow_table: do not remove offload when other netns's interface is down Pablo Neira Ayuso
  2018-10-23  3:21 ` [PATCH 0/8] Netfilter fixes for net David Miller
  8 siblings, 0 replies; 10+ messages in thread
From: Pablo Neira Ayuso @ 2018-10-22 20:07 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Taehee Yoo <ap420073@gmail.com>

When device is unregistered, flowtable flush routine is called
by notifier_call(nf_tables_flowtable_event). and exit callback of
nftables pernet_operation(nf_tables_exit_net) also has flowtable flush
routine. but when network namespace is destroyed, both notifier_call
and pernet_operation are called. hence flowtable flush routine in
pernet_operation is unnecessary.

test commands:
   %ip netns add vm1
   %ip netns exec vm1 nft add table ip filter
   %ip netns exec vm1 nft add flowtable ip filter w \
	{ hook ingress priority 0\; devices = { lo }\; }
   %ip netns del vm1

splat looks like:
[  265.187019] WARNING: CPU: 0 PID: 87 at net/netfilter/core.c:309 nf_hook_entry_head+0xc7/0xf0
[  265.187112] Modules linked in: nf_flow_table_ipv4 nf_flow_table nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink ip_tables x_tables
[  265.187390] CPU: 0 PID: 87 Comm: kworker/u4:2 Not tainted 4.19.0-rc3+ #5
[  265.187453] Workqueue: netns cleanup_net
[  265.187514] RIP: 0010:nf_hook_entry_head+0xc7/0xf0
[  265.187546] Code: 8d 81 68 03 00 00 5b c3 89 d0 83 fa 04 48 8d 84 c7 e8 11 00 00 76 81 0f 0b 31 c0 e9 78 ff ff ff 0f 0b 48 83 c4 08 31 c0 5b c3 <0f> 0b 31 c0 e9 65 ff ff ff 0f 0b 31 c0 e9 5c ff ff ff 48 89 0c 24
[  265.187573] RSP: 0018:ffff88011546f098 EFLAGS: 00010246
[  265.187624] RAX: ffffffff8d90e135 RBX: 1ffff10022a8de1c RCX: 0000000000000000
[  265.187645] RDX: 0000000000000000 RSI: 0000000000000005 RDI: ffff880116298040
[  265.187645] RBP: ffff88010ea4c1a8 R08: 0000000000000000 R09: 0000000000000000
[  265.187645] R10: ffff88011546f1d8 R11: ffffed0022c532c1 R12: ffff88010ea4c1d0
[  265.187645] R13: 0000000000000005 R14: dffffc0000000000 R15: ffff88010ea4c1c4
[  265.187645] FS:  0000000000000000(0000) GS:ffff88011b200000(0000) knlGS:0000000000000000
[  265.187645] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  265.187645] CR2: 00007fdfb8d00000 CR3: 0000000057a16000 CR4: 00000000001006f0
[  265.187645] Call Trace:
[  265.187645]  __nf_unregister_net_hook+0xca/0x5d0
[  265.187645]  ? nf_hook_entries_free.part.3+0x80/0x80
[  265.187645]  ? save_trace+0x300/0x300
[  265.187645]  nf_unregister_net_hooks+0x2e/0x40
[  265.187645]  nf_tables_exit_net+0x479/0x1340 [nf_tables]
[  265.187645]  ? find_held_lock+0x39/0x1c0
[  265.187645]  ? nf_tables_abort+0x30/0x30 [nf_tables]
[  265.187645]  ? inet_frag_destroy_rcu+0xd0/0xd0
[  265.187645]  ? trace_hardirqs_on+0x93/0x210
[  265.187645]  ? __bpf_trace_preemptirq_template+0x10/0x10
[  265.187645]  ? inet_frag_destroy_rcu+0xd0/0xd0
[  265.187645]  ? inet_frag_destroy_rcu+0xd0/0xd0
[  265.187645]  ? __mutex_unlock_slowpath+0x17f/0x740
[  265.187645]  ? wait_for_completion+0x710/0x710
[  265.187645]  ? bucket_table_free+0xb2/0x1f0
[  265.187645]  ? nested_table_free+0x130/0x130
[  265.187645]  ? __lock_is_held+0xb4/0x140
[  265.187645]  ops_exit_list.isra.10+0x94/0x140
[  265.187645]  cleanup_net+0x45b/0x900
[ ... ]

This WARNING means that hook unregisteration is failed because
all flowtables hooks are already unregistered by notifier_call.

Network namespace exit routine guarantees that all devices will be
unregistered first. then, other exit callbacks of pernet_operations
are called. so that removing flowtable flush routine in exit callback of
pernet_operation(nf_tables_exit_net) doesn't make flowtable leak.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_tables_api.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 2cfb173cd0b2..d83c0d01a266 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -7202,9 +7202,6 @@ static void __nft_release_tables(struct net *net)
 
 		list_for_each_entry(chain, &table->chains, list)
 			nf_tables_unregister_hook(net, table, chain);
-		list_for_each_entry(flowtable, &table->flowtables, list)
-			nf_unregister_net_hooks(net, flowtable->ops,
-						flowtable->ops_len);
 		/* No packets are walking on these chains anymore. */
 		ctx.table = table;
 		list_for_each_entry(chain, &table->chains, list) {
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 8/8] netfilter: nf_flow_table: do not remove offload when other netns's interface is down
  2018-10-22 20:07 [PATCH 0/8] Netfilter fixes for net Pablo Neira Ayuso
                   ` (6 preceding siblings ...)
  2018-10-22 20:07 ` [PATCH 7/8] netfilter: nf_flow_table: remove flowtable hook flush routine in netns exit routine Pablo Neira Ayuso
@ 2018-10-22 20:07 ` Pablo Neira Ayuso
  2018-10-23  3:21 ` [PATCH 0/8] Netfilter fixes for net David Miller
  8 siblings, 0 replies; 10+ messages in thread
From: Pablo Neira Ayuso @ 2018-10-22 20:07 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Taehee Yoo <ap420073@gmail.com>

When interface is down, offload cleanup function(nf_flow_table_do_cleanup)
is called and that checks whether interface index of offload and
index of link down interface is same. but only interface index checking
is not enough because flowtable is not pernet list.
So that, if other netns's interface that has index is same with offload
is down, that offload will be removed.
This patch adds netns checking code to the offload cleanup routine.

Fixes: 59c466dd68e7 ("netfilter: nf_flow_table: add a new flow state for tearing down offloading")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_flow_table_core.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index d8125616edc7..c188e27972c7 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -478,14 +478,17 @@ EXPORT_SYMBOL_GPL(nf_flow_table_init);
 static void nf_flow_table_do_cleanup(struct flow_offload *flow, void *data)
 {
 	struct net_device *dev = data;
+	struct flow_offload_entry *e;
+
+	e = container_of(flow, struct flow_offload_entry, flow);
 
 	if (!dev) {
 		flow_offload_teardown(flow);
 		return;
 	}
-
-	if (flow->tuplehash[0].tuple.iifidx == dev->ifindex ||
-	    flow->tuplehash[1].tuple.iifidx == dev->ifindex)
+	if (net_eq(nf_ct_net(e->ct), dev_net(dev)) &&
+	    (flow->tuplehash[0].tuple.iifidx == dev->ifindex ||
+	     flow->tuplehash[1].tuple.iifidx == dev->ifindex))
 		flow_offload_dead(flow);
 }
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/8] Netfilter fixes for net
  2018-10-22 20:07 [PATCH 0/8] Netfilter fixes for net Pablo Neira Ayuso
                   ` (7 preceding siblings ...)
  2018-10-22 20:07 ` [PATCH 8/8] netfilter: nf_flow_table: do not remove offload when other netns's interface is down Pablo Neira Ayuso
@ 2018-10-23  3:21 ` David Miller
  8 siblings, 0 replies; 10+ messages in thread
From: David Miller @ 2018-10-23  3:21 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon, 22 Oct 2018 22:07:16 +0200

> The following patchset contains Netfilter fixes for your net tree:
> 
> 1) rbtree lookup from control plane returns the left-hand side element
>    of the range when the interval end flag is set on.
> 
> 2) osf extension is not supported from the input path, reject this from
>    the control plane, from Fernando Fernandez Mancera.
> 
> 3) xt_TEE is leaving output interface unset due to a recent incorrect
>    netns rework, from Taehee Yoo.
> 
> 4) xt_TEE allows to select an interface which does not belong to this
>    netnamespace, from Taehee Yoo.
> 
> 5) Zero private extension area in nft_compat, just like we do in x_tables,
>    otherwise we leak kernel memory to userspace.
> 
> 6) Missing .checkentry and .destroy entries in new DNAT extensions breaks
>    it since we never load nf_conntrack dependencies, from Paolo Abeni.
> 
> 7) Do not remove flowtable hook from netns exit path, the netdevice handler
>    already deals with this, also from Taehee Yoo.
> 
> 8) Only cleanup flowtable entries that reside in this netnamespace, also
>    from Taehee Yoo.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-10-23 11:43 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-22 20:07 [PATCH 0/8] Netfilter fixes for net Pablo Neira Ayuso
2018-10-22 20:07 ` [PATCH 1/8] netfilter: nft_set_rbtree: allow loose matching of closing element in interval Pablo Neira Ayuso
2018-10-22 20:07 ` [PATCH 2/8] netfilter: nft_osf: usage from output path is not valid Pablo Neira Ayuso
2018-10-22 20:07 ` [PATCH 3/8] netfilter: xt_TEE: fix wrong interface selection Pablo Neira Ayuso
2018-10-22 20:07 ` [PATCH 4/8] netfilter: xt_TEE: add missing code to get interface index in checkentry Pablo Neira Ayuso
2018-10-22 20:07 ` [PATCH 5/8] netfilter: nft_compat: do not dump private area Pablo Neira Ayuso
2018-10-22 20:07 ` [PATCH 6/8] netfilter: xt_nat: fix DNAT target for shifted portmap ranges Pablo Neira Ayuso
2018-10-22 20:07 ` [PATCH 7/8] netfilter: nf_flow_table: remove flowtable hook flush routine in netns exit routine Pablo Neira Ayuso
2018-10-22 20:07 ` [PATCH 8/8] netfilter: nf_flow_table: do not remove offload when other netns's interface is down Pablo Neira Ayuso
2018-10-23  3:21 ` [PATCH 0/8] Netfilter fixes for net David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).