All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH nf-next 0/2] netfilter: untracked object removal
@ 2017-03-08 12:49 Florian Westphal
  2017-03-08 12:49 ` [PATCH nf-next 1/2] netfilter: kill the fake untracked conntrack objects Florian Westphal
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Florian Westphal @ 2017-03-08 12:49 UTC (permalink / raw)
  To: netfilter-devel

These patches remove the percpu untracked objects, they get replaced
with a new (kernel internal) ctinfo state.

This avoids reference counter operations for untracked packets and
removes the need to check a conntrack for the UNTRACKED status bit
before setting connmark, labels, etc.

I checked with following rule set and things appear to work as
expected (i.e., ssh connections don't show up in conntrack -L):

*raw
:PREROUTING ACCEPT [455:34825]
:OUTPUT ACCEPT [251:29555]
[775:63699] -A PREROUTING -p tcp -m tcp --dport 22 -j NOTRACK
[251:29555] -A OUTPUT -p tcp -m tcp --sport 22 -j NOTRACK
COMMIT
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
[0:0] -A INPUT -m conntrack --ctstate INVALID -j DROP
[337:26377] -A INPUT -p tcp -m conntrack --ctstate UNTRACKED -m tcp --dport 22 -j ACCEPT
[0:0] -A INPUT -m conntrack --ctstate UNTRACKED
[102:13883] -A OUTPUT -p tcp -m conntrack --ctstate UNTRACKED -m tcp --sport 22 -j ACCEPT
[0:0] -A OUTPUT -m conntrack --ctstate UNTRACKED
COMMIT

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH nf-next 1/2] netfilter: kill the fake untracked conntrack objects
  2017-03-08 12:49 [PATCH nf-next 0/2] netfilter: untracked object removal Florian Westphal
@ 2017-03-08 12:49 ` Florian Westphal
  2017-03-08 12:49 ` [PATCH nf-next 2/2] netfilter: remove nf_ct_is_untracked Florian Westphal
  2017-03-08 16:29 ` [PATCH nf-next 0/2] netfilter: untracked object removal Eric Dumazet
  2 siblings, 0 replies; 9+ messages in thread
From: Florian Westphal @ 2017-03-08 12:49 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

resurrect an old patch from Pablo Neira to remove the untracked objects.

Currently, there are four possible states of an skb wrt. conntrack.

1. No conntrack attached, ct is NULL.
2. Normal (kmem cache allocated) ct attached.
3. a template (kmalloc'd), not in any hash tables at any point in time
4. the 'untracked' conntrack, a percpu nf_conn object, tagged via
   IPS_UNTRACKED_BIT in ct->status.

Untracked is supposed to be identical to case 1.  It exists only
so users can check

-m conntrack --ctstate UNTRACKED vs.
-m conntrack --ctstate INVALID

e.g. attempts to set connmark on INVALID or UNTRACKED conntracks is
supposed to be a no-op.

Thus currently we need to check
 ct == NULL || nf_ct_is_untracked(ct)

in a lot of places in order to avoid altering untracked objects.

The other consequence of the percpu untracked object is that all
-j NOTRACK (and, later, kfree_skb of such skbs) result in an atomic op
(inc/dec the untracked conntracks refcount).

This adds a new kernel-private ctinfo state, IP_CT_UNTRACKED, to
make the distinction instead.

The (few) places that care about packet invalid (ct is NULL) vs.
packet untracked now need to test ct == NULL vs. ctinfo == IP_CT_UNTRACKED,
but all other places can omit the nf_ct_is_untracked() check.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/ip_vs.h                                |  6 +--
 include/net/netfilter/nf_conntrack.h               | 10 +----
 include/uapi/linux/netfilter/nf_conntrack_common.h |  6 ++-
 net/ipv4/netfilter/nf_dup_ipv4.c                   |  3 +-
 net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c     |  3 +-
 net/ipv6/netfilter/nf_dup_ipv6.c                   |  3 +-
 net/netfilter/nf_conntrack_core.c                  | 43 +++-------------------
 net/netfilter/nf_nat_core.c                        |  3 --
 net/netfilter/nft_ct.c                             | 14 +++----
 net/netfilter/xt_CT.c                              | 16 ++++----
 net/netfilter/xt_conntrack.c                       | 11 +++---
 net/netfilter/xt_state.c                           | 13 +++----
 12 files changed, 39 insertions(+), 92 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 7bdfa7d78363..4a8ec999b62e 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1554,12 +1554,8 @@ static inline void ip_vs_notrack(struct sk_buff *skb)
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
 	if (!ct || !nf_ct_is_untracked(ct)) {
-		struct nf_conn *untracked;
-
 		nf_conntrack_put(&ct->ct_general);
-		untracked = nf_ct_untracked_get();
-		nf_conntrack_get(&untracked->ct_general);
-		nf_ct_set(skb, untracked, IP_CT_NEW);
+		nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
 	}
 #endif
 }
diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index f540f9ad2af4..012b99f563e5 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -243,14 +243,6 @@ extern s32 (*nf_ct_nat_offset)(const struct nf_conn *ct,
 			       enum ip_conntrack_dir dir,
 			       u32 seq);
 
-/* Fake conntrack entry for untracked connections */
-DECLARE_PER_CPU(struct nf_conn, nf_conntrack_untracked);
-static inline struct nf_conn *nf_ct_untracked_get(void)
-{
-	return raw_cpu_ptr(&nf_conntrack_untracked);
-}
-void nf_ct_untracked_status_or(unsigned long bits);
-
 /* Iterate over all conntracks: if iter returns true, it's deleted. */
 void nf_ct_iterate_cleanup(struct net *net,
 			   int (*iter)(struct nf_conn *i, void *data),
@@ -283,7 +275,7 @@ static inline int nf_ct_is_dying(const struct nf_conn *ct)
 
 static inline int nf_ct_is_untracked(const struct nf_conn *ct)
 {
-	return test_bit(IPS_UNTRACKED_BIT, &ct->status);
+	return false;
 }
 
 /* Packet is received from loopback */
diff --git a/include/uapi/linux/netfilter/nf_conntrack_common.h b/include/uapi/linux/netfilter/nf_conntrack_common.h
index 6a8e33dd4ecb..b4a0a1940118 100644
--- a/include/uapi/linux/netfilter/nf_conntrack_common.h
+++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
@@ -28,12 +28,14 @@ enum ip_conntrack_info {
 	/* only for userspace compatibility */
 #ifndef __KERNEL__
 	IP_CT_NEW_REPLY = IP_CT_NUMBER,
+#else
+	IP_CT_UNTRACKED = 7,
 #endif
 };
 
 #define NF_CT_STATE_INVALID_BIT			(1 << 0)
 #define NF_CT_STATE_BIT(ctinfo)			(1 << ((ctinfo) % IP_CT_IS_REPLY + 1))
-#define NF_CT_STATE_UNTRACKED_BIT		(1 << (IP_CT_NUMBER + 1))
+#define NF_CT_STATE_UNTRACKED_BIT		(1 << (IP_CT_UNTRACKED + 1))
 
 /* Bitset representing status of connection. */
 enum ip_conntrack_status {
@@ -94,7 +96,7 @@ enum ip_conntrack_status {
 	IPS_TEMPLATE_BIT = 11,
 	IPS_TEMPLATE = (1 << IPS_TEMPLATE_BIT),
 
-	/* Conntrack is a fake untracked entry */
+	/* Conntrack is a fake untracked entry.  Obsolete and not used anymore */
 	IPS_UNTRACKED_BIT = 12,
 	IPS_UNTRACKED = (1 << IPS_UNTRACKED_BIT),
 
diff --git a/net/ipv4/netfilter/nf_dup_ipv4.c b/net/ipv4/netfilter/nf_dup_ipv4.c
index f0dbff05fc28..39895b9ddeb9 100644
--- a/net/ipv4/netfilter/nf_dup_ipv4.c
+++ b/net/ipv4/netfilter/nf_dup_ipv4.c
@@ -69,8 +69,7 @@ void nf_dup_ipv4(struct net *net, struct sk_buff *skb, unsigned int hooknum,
 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
 	/* Avoid counting cloned packets towards the original connection. */
 	nf_reset(skb);
-	nf_ct_set(skb, nf_ct_untracked_get(), IP_CT_NEW);
-	nf_conntrack_get(skb_nfct(skb));
+	nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
 #endif
 	/*
 	 * If we are in PREROUTING/INPUT, decrease the TTL to mitigate potential
diff --git a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
index d2c2ccbfbe72..d5f028e33f65 100644
--- a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
@@ -221,8 +221,7 @@ icmpv6_error(struct net *net, struct nf_conn *tmpl,
 	type = icmp6h->icmp6_type - 130;
 	if (type >= 0 && type < sizeof(noct_valid_new) &&
 	    noct_valid_new[type]) {
-		nf_ct_set(skb, nf_ct_untracked_get(), IP_CT_NEW);
-		nf_conntrack_get(skb_nfct(skb));
+		nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
 		return NF_ACCEPT;
 	}
 
diff --git a/net/ipv6/netfilter/nf_dup_ipv6.c b/net/ipv6/netfilter/nf_dup_ipv6.c
index 888ecd106e5f..4a7ddeddbaab 100644
--- a/net/ipv6/netfilter/nf_dup_ipv6.c
+++ b/net/ipv6/netfilter/nf_dup_ipv6.c
@@ -58,8 +58,7 @@ void nf_dup_ipv6(struct net *net, struct sk_buff *skb, unsigned int hooknum,
 
 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
 	nf_reset(skb);
-	nf_ct_set(skb, nf_ct_untracked_get(), IP_CT_NEW);
-	nf_conntrack_get(skb_nfct(skb));
+	nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
 #endif
 	if (hooknum == NF_INET_PRE_ROUTING ||
 	    hooknum == NF_INET_LOCAL_IN) {
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 071b97fcbefb..3297b14b6b85 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -181,9 +181,6 @@ EXPORT_SYMBOL_GPL(nf_conntrack_htable_size);
 unsigned int nf_conntrack_max __read_mostly;
 seqcount_t nf_conntrack_generation __read_mostly;
 
-DEFINE_PER_CPU(struct nf_conn, nf_conntrack_untracked);
-EXPORT_PER_CPU_SYMBOL(nf_conntrack_untracked);
-
 static unsigned int nf_conntrack_hash_rnd __read_mostly;
 
 static u32 hash_conntrack_raw(const struct nf_conntrack_tuple *tuple,
@@ -1315,9 +1312,10 @@ nf_conntrack_in(struct net *net, u_int8_t pf, unsigned int hooknum,
 	int ret;
 
 	tmpl = nf_ct_get(skb, &ctinfo);
-	if (tmpl) {
+	if (tmpl || ctinfo == IP_CT_UNTRACKED) {
 		/* Previously seen (loopback or untracked)?  Ignore. */
-		if (!nf_ct_is_template(tmpl)) {
+		if ((tmpl && !nf_ct_is_template(tmpl)) ||
+		     ctinfo == IP_CT_UNTRACKED) {
 			NF_CT_STAT_INC_ATOMIC(net, ignore);
 			return NF_ACCEPT;
 		}
@@ -1630,18 +1628,6 @@ void nf_ct_free_hashtable(void *hash, unsigned int size)
 }
 EXPORT_SYMBOL_GPL(nf_ct_free_hashtable);
 
-static int untrack_refs(void)
-{
-	int cnt = 0, cpu;
-
-	for_each_possible_cpu(cpu) {
-		struct nf_conn *ct = &per_cpu(nf_conntrack_untracked, cpu);
-
-		cnt += atomic_read(&ct->ct_general.use) - 1;
-	}
-	return cnt;
-}
-
 void nf_conntrack_cleanup_start(void)
 {
 	conntrack_gc_work.exiting = true;
@@ -1651,8 +1637,6 @@ void nf_conntrack_cleanup_start(void)
 void nf_conntrack_cleanup_end(void)
 {
 	RCU_INIT_POINTER(nf_ct_destroy, NULL);
-	while (untrack_refs() > 0)
-		schedule();
 
 	cancel_delayed_work_sync(&conntrack_gc_work.dwork);
 	nf_ct_free_hashtable(nf_conntrack_hash, nf_conntrack_htable_size);
@@ -1826,20 +1810,11 @@ EXPORT_SYMBOL_GPL(nf_conntrack_set_hashsize);
 module_param_call(hashsize, nf_conntrack_set_hashsize, param_get_uint,
 		  &nf_conntrack_htable_size, 0600);
 
-void nf_ct_untracked_status_or(unsigned long bits)
-{
-	int cpu;
-
-	for_each_possible_cpu(cpu)
-		per_cpu(nf_conntrack_untracked, cpu).status |= bits;
-}
-EXPORT_SYMBOL_GPL(nf_ct_untracked_status_or);
-
 int nf_conntrack_init_start(void)
 {
 	int max_factor = 8;
 	int ret = -ENOMEM;
-	int i, cpu;
+	int i;
 
 	seqcount_init(&nf_conntrack_generation);
 
@@ -1922,15 +1897,6 @@ int nf_conntrack_init_start(void)
 	if (ret < 0)
 		goto err_proto;
 
-	/* Set up fake conntrack: to never be deleted, not in any hashes */
-	for_each_possible_cpu(cpu) {
-		struct nf_conn *ct = &per_cpu(nf_conntrack_untracked, cpu);
-		write_pnet(&ct->ct_net, &init_net);
-		atomic_set(&ct->ct_general.use, 1);
-	}
-	/*  - and look it like as a confirmed connection */
-	nf_ct_untracked_status_or(IPS_CONFIRMED | IPS_UNTRACKED);
-
 	conntrack_gc_work_init(&conntrack_gc_work);
 	queue_delayed_work(system_long_wq, &conntrack_gc_work.dwork, HZ);
 
@@ -1978,6 +1944,7 @@ int nf_conntrack_init_net(struct net *net)
 	int ret = -ENOMEM;
 	int cpu;
 
+	BUILD_BUG_ON(IP_CT_UNTRACKED == IP_CT_NUMBER);
 	atomic_set(&net->ct.count, 0);
 
 	net->ct.pcpu_lists = alloc_percpu(struct ct_pcpu);
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index 94b14c5a8b17..508799f7dc1c 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -874,9 +874,6 @@ static int __init nf_nat_init(void)
 
 	nf_ct_helper_expectfn_register(&follow_master_nat);
 
-	/* Initialize fake conntrack so that NAT will skip it */
-	nf_ct_untracked_status_or(IPS_NAT_DONE_MASK);
-
 	BUG_ON(nfnetlink_parse_nat_setup_hook != NULL);
 	RCU_INIT_POINTER(nfnetlink_parse_nat_setup_hook,
 			   nfnetlink_parse_nat_setup);
diff --git a/net/netfilter/nft_ct.c b/net/netfilter/nft_ct.c
index c6b8022c0e47..1f25ebadde09 100644
--- a/net/netfilter/nft_ct.c
+++ b/net/netfilter/nft_ct.c
@@ -66,12 +66,12 @@ static void nft_ct_get_eval(const struct nft_expr *expr,
 
 	switch (priv->key) {
 	case NFT_CT_STATE:
-		if (ct == NULL)
-			state = NF_CT_STATE_INVALID_BIT;
-		else if (nf_ct_is_untracked(ct))
+		if (ct)
+			state = NF_CT_STATE_BIT(ctinfo);
+		else if (ctinfo == IP_CT_UNTRACKED)
 			state = NF_CT_STATE_UNTRACKED_BIT;
 		else
-			state = NF_CT_STATE_BIT(ctinfo);
+			state = NF_CT_STATE_INVALID_BIT;
 		*dest = state;
 		return;
 	default:
@@ -708,12 +708,10 @@ static void nft_notrack_eval(const struct nft_expr *expr,
 
 	ct = nf_ct_get(pkt->skb, &ctinfo);
 	/* Previously seen (loopback or untracked)?  Ignore. */
-	if (ct)
+	if (ct || ctinfo == IP_CT_UNTRACKED)
 		return;
 
-	ct = nf_ct_untracked_get();
-	atomic_inc(&ct->ct_general.use);
-	nf_ct_set(skb, ct, IP_CT_NEW);
+	nf_ct_set(skb, ct, IP_CT_UNTRACKED);
 }
 
 static struct nft_expr_type nft_notrack_type;
diff --git a/net/netfilter/xt_CT.c b/net/netfilter/xt_CT.c
index b008db0184b8..3cbe1bcf6a74 100644
--- a/net/netfilter/xt_CT.c
+++ b/net/netfilter/xt_CT.c
@@ -26,11 +26,12 @@ static inline int xt_ct_target(struct sk_buff *skb, struct nf_conn *ct)
 	if (skb->_nfct != 0)
 		return XT_CONTINUE;
 
-	/* special case the untracked ct : we want the percpu object */
-	if (!ct)
-		ct = nf_ct_untracked_get();
-	atomic_inc(&ct->ct_general.use);
-	nf_ct_set(skb, ct, IP_CT_NEW);
+	if (ct) {
+		atomic_inc(&ct->ct_general.use);
+		nf_ct_set(skb, ct, IP_CT_NEW);
+	} else {
+		nf_ct_set(skb, ct, IP_CT_UNTRACKED);
+	}
 
 	return XT_CONTINUE;
 }
@@ -335,7 +336,7 @@ static void xt_ct_tg_destroy(const struct xt_tgdtor_param *par,
 	struct nf_conn *ct = info->ct;
 	struct nf_conn_help *help;
 
-	if (ct && !nf_ct_is_untracked(ct)) {
+	if (ct) {
 		help = nfct_help(ct);
 		if (help)
 			module_put(help->helper->me);
@@ -412,8 +413,7 @@ notrack_tg(struct sk_buff *skb, const struct xt_action_param *par)
 	if (skb->_nfct != 0)
 		return XT_CONTINUE;
 
-	nf_ct_set(skb, nf_ct_untracked_get(), IP_CT_NEW);
-	nf_conntrack_get(skb_nfct(skb));
+	nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
 
 	return XT_CONTINUE;
 }
diff --git a/net/netfilter/xt_conntrack.c b/net/netfilter/xt_conntrack.c
index c0fb217bc649..39cf1d019240 100644
--- a/net/netfilter/xt_conntrack.c
+++ b/net/netfilter/xt_conntrack.c
@@ -172,12 +172,11 @@ conntrack_mt(const struct sk_buff *skb, struct xt_action_param *par,
 
 	ct = nf_ct_get(skb, &ctinfo);
 
-	if (ct) {
-		if (nf_ct_is_untracked(ct))
-			statebit = XT_CONNTRACK_STATE_UNTRACKED;
-		else
-			statebit = XT_CONNTRACK_STATE_BIT(ctinfo);
-	} else
+	if (ct)
+		statebit = XT_CONNTRACK_STATE_BIT(ctinfo);
+	else if (ctinfo == IP_CT_UNTRACKED)
+		statebit = XT_CONNTRACK_STATE_UNTRACKED;
+	else
 		statebit = XT_CONNTRACK_STATE_INVALID;
 
 	if (info->match_flags & XT_CONNTRACK_STATE) {
diff --git a/net/netfilter/xt_state.c b/net/netfilter/xt_state.c
index 5746a33789a5..5fbd79194d21 100644
--- a/net/netfilter/xt_state.c
+++ b/net/netfilter/xt_state.c
@@ -28,14 +28,13 @@ state_mt(const struct sk_buff *skb, struct xt_action_param *par)
 	unsigned int statebit;
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
-	if (!ct)
+	if (ct)
+		statebit = XT_STATE_BIT(ctinfo);
+	else if (ctinfo == IP_CT_UNTRACKED)
+		statebit = XT_STATE_UNTRACKED;
+	else
 		statebit = XT_STATE_INVALID;
-	else {
-		if (nf_ct_is_untracked(ct))
-			statebit = XT_STATE_UNTRACKED;
-		else
-			statebit = XT_STATE_BIT(ctinfo);
-	}
+
 	return (sinfo->statemask & statebit);
 }
 
-- 
2.10.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH nf-next 2/2] netfilter: remove nf_ct_is_untracked
  2017-03-08 12:49 [PATCH nf-next 0/2] netfilter: untracked object removal Florian Westphal
  2017-03-08 12:49 ` [PATCH nf-next 1/2] netfilter: kill the fake untracked conntrack objects Florian Westphal
@ 2017-03-08 12:49 ` Florian Westphal
  2017-03-08 16:29 ` [PATCH nf-next 0/2] netfilter: untracked object removal Eric Dumazet
  2 siblings, 0 replies; 9+ messages in thread
From: Florian Westphal @ 2017-03-08 12:49 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

This function is now obsolete and always returns false,
change has no effect on generated code.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/ip_vs.h                       |  4 ++--
 include/net/netfilter/nf_conntrack.h      |  5 -----
 include/net/netfilter/nf_conntrack_core.h |  2 +-
 net/ipv4/netfilter/nf_nat_l3proto_ipv4.c  |  4 ----
 net/ipv4/netfilter/nf_socket_ipv4.c       |  2 +-
 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c  |  4 ----
 net/netfilter/ipvs/ip_vs_ftp.c            |  2 +-
 net/netfilter/ipvs/ip_vs_nfct.c           |  4 ++--
 net/netfilter/ipvs/ip_vs_xmit.c           |  8 ++++----
 net/netfilter/nf_conntrack_netlink.c      | 12 +-----------
 net/netfilter/xt_HMARK.c                  |  2 +-
 net/netfilter/xt_cluster.c                |  3 ---
 net/netfilter/xt_connlabel.c              |  2 +-
 net/netfilter/xt_connmark.c               |  4 ++--
 net/netfilter/xt_ipvs.c                   |  2 +-
 net/openvswitch/conntrack.c               |  5 -----
 16 files changed, 17 insertions(+), 48 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 4a8ec999b62e..ef4874af1d48 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1553,7 +1553,7 @@ static inline void ip_vs_notrack(struct sk_buff *skb)
 	enum ip_conntrack_info ctinfo;
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
-	if (!ct || !nf_ct_is_untracked(ct)) {
+	if (ct) {
 		nf_conntrack_put(&ct->ct_general);
 		nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
 	}
@@ -1614,7 +1614,7 @@ static inline bool ip_vs_conn_uses_conntrack(struct ip_vs_conn *cp,
 	if (!(cp->flags & IP_VS_CONN_F_NFCT))
 		return false;
 	ct = nf_ct_get(skb, &ctinfo);
-	if (ct && !nf_ct_is_untracked(ct))
+	if (ct)
 		return true;
 #endif
 	return false;
diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index 012b99f563e5..4978a82b75fa 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -273,11 +273,6 @@ static inline int nf_ct_is_dying(const struct nf_conn *ct)
 	return test_bit(IPS_DYING_BIT, &ct->status);
 }
 
-static inline int nf_ct_is_untracked(const struct nf_conn *ct)
-{
-	return false;
-}
-
 /* Packet is received from loopback */
 static inline bool nf_is_loopback_packet(const struct sk_buff *skb)
 {
diff --git a/include/net/netfilter/nf_conntrack_core.h b/include/net/netfilter/nf_conntrack_core.h
index 84ec7ca5f195..81d7f8a30945 100644
--- a/include/net/netfilter/nf_conntrack_core.h
+++ b/include/net/netfilter/nf_conntrack_core.h
@@ -65,7 +65,7 @@ static inline int nf_conntrack_confirm(struct sk_buff *skb)
 	struct nf_conn *ct = (struct nf_conn *)skb_nfct(skb);
 	int ret = NF_ACCEPT;
 
-	if (ct && !nf_ct_is_untracked(ct)) {
+	if (ct) {
 		if (!nf_ct_is_confirmed(ct))
 			ret = __nf_conntrack_confirm(skb);
 		if (likely(ret == NF_ACCEPT))
diff --git a/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c b/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
index f8aad03d674b..27bb5cf70131 100644
--- a/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
@@ -269,10 +269,6 @@ nf_nat_ipv4_fn(void *priv, struct sk_buff *skb,
 	if (!ct)
 		return NF_ACCEPT;
 
-	/* Don't try to NAT if this packet is not conntracked */
-	if (nf_ct_is_untracked(ct))
-		return NF_ACCEPT;
-
 	nat = nf_ct_nat_ext_add(ct);
 	if (nat == NULL)
 		return NF_ACCEPT;
diff --git a/net/ipv4/netfilter/nf_socket_ipv4.c b/net/ipv4/netfilter/nf_socket_ipv4.c
index a83d558e1aae..e9293bdebba0 100644
--- a/net/ipv4/netfilter/nf_socket_ipv4.c
+++ b/net/ipv4/netfilter/nf_socket_ipv4.c
@@ -139,7 +139,7 @@ struct sock *nf_sk_lookup_slow_v4(struct net *net, const struct sk_buff *skb,
 	 * SNAT-ted connection.
 	 */
 	ct = nf_ct_get(skb, &ctinfo);
-	if (ct && !nf_ct_is_untracked(ct) &&
+	if (ct &&
 	    ((iph->protocol != IPPROTO_ICMP &&
 	      ctinfo == IP_CT_ESTABLISHED_REPLY) ||
 	     (iph->protocol == IPPROTO_ICMP &&
diff --git a/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c b/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
index e0be97e636a4..922b5aef273c 100644
--- a/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
@@ -273,10 +273,6 @@ nf_nat_ipv6_fn(void *priv, struct sk_buff *skb,
 	if (!ct)
 		return NF_ACCEPT;
 
-	/* Don't try to NAT if this packet is not conntracked */
-	if (nf_ct_is_untracked(ct))
-		return NF_ACCEPT;
-
 	nat = nf_ct_nat_ext_add(ct);
 	if (nat == NULL)
 		return NF_ACCEPT;
diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
index d30c327bb578..ad539e57eef1 100644
--- a/net/netfilter/ipvs/ip_vs_ftp.c
+++ b/net/netfilter/ipvs/ip_vs_ftp.c
@@ -260,7 +260,7 @@ static int ip_vs_ftp_out(struct ip_vs_app *app, struct ip_vs_conn *cp,
 		buf_len = strlen(buf);
 
 		ct = nf_ct_get(skb, &ctinfo);
-		if (ct && !nf_ct_is_untracked(ct) && nfct_nat(ct)) {
+		if (ct && nfct_nat(ct)) {
 			/* If mangling fails this function will return 0
 			 * which will cause the packet to be dropped.
 			 * Mangling can only fail under memory pressure,
diff --git a/net/netfilter/ipvs/ip_vs_nfct.c b/net/netfilter/ipvs/ip_vs_nfct.c
index fc230d99aa3b..6cf3fd81a5ec 100644
--- a/net/netfilter/ipvs/ip_vs_nfct.c
+++ b/net/netfilter/ipvs/ip_vs_nfct.c
@@ -85,7 +85,7 @@ ip_vs_update_conntrack(struct sk_buff *skb, struct ip_vs_conn *cp, int outin)
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 	struct nf_conntrack_tuple new_tuple;
 
-	if (ct == NULL || nf_ct_is_confirmed(ct) || nf_ct_is_untracked(ct) ||
+	if (ct == NULL || nf_ct_is_confirmed(ct) ||
 	    nf_ct_is_dying(ct))
 		return;
 
@@ -232,7 +232,7 @@ void ip_vs_nfct_expect_related(struct sk_buff *skb, struct nf_conn *ct,
 {
 	struct nf_conntrack_expect *exp;
 
-	if (ct == NULL || nf_ct_is_untracked(ct))
+	if (ct == NULL)
 		return;
 
 	exp = nf_ct_expect_alloc(ct);
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 4e1a98fcc8c3..2eab1e0400f4 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -775,7 +775,7 @@ ip_vs_nat_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
 		enum ip_conntrack_info ctinfo;
 		struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
-		if (ct && !nf_ct_is_untracked(ct)) {
+		if (ct) {
 			IP_VS_DBG_RL_PKT(10, AF_INET, pp, skb, ipvsh->off,
 					 "ip_vs_nat_xmit(): "
 					 "stopping DNAT to local address");
@@ -866,7 +866,7 @@ ip_vs_nat_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
 		enum ip_conntrack_info ctinfo;
 		struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
-		if (ct && !nf_ct_is_untracked(ct)) {
+		if (ct) {
 			IP_VS_DBG_RL_PKT(10, AF_INET6, pp, skb, ipvsh->off,
 					 "ip_vs_nat_xmit_v6(): "
 					 "stopping DNAT to local address");
@@ -1338,7 +1338,7 @@ ip_vs_icmp_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
 		enum ip_conntrack_info ctinfo;
 		struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
-		if (ct && !nf_ct_is_untracked(ct)) {
+		if (ct) {
 			IP_VS_DBG(10, "%s(): "
 				  "stopping DNAT to local address %pI4\n",
 				  __func__, &cp->daddr.ip);
@@ -1429,7 +1429,7 @@ ip_vs_icmp_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
 		enum ip_conntrack_info ctinfo;
 		struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
-		if (ct && !nf_ct_is_untracked(ct)) {
+		if (ct) {
 			IP_VS_DBG(10, "%s(): "
 				  "stopping DNAT to local address %pI6\n",
 				  __func__, &cp->daddr.in6);
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 6806b5e73567..29265f412039 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -627,10 +627,6 @@ ctnetlink_conntrack_event(unsigned int events, struct nf_ct_event *item)
 	unsigned int flags = 0, group;
 	int err;
 
-	/* ignore our fake conntrack entry */
-	if (nf_ct_is_untracked(ct))
-		return 0;
-
 	if (events & (1 << IPCT_DESTROY)) {
 		type = IPCTNL_MSG_CT_DELETE;
 		group = NFNLGRP_CONNTRACK_DESTROY;
@@ -2172,13 +2168,7 @@ ctnetlink_glue_build_size(const struct nf_conn *ct)
 static struct nf_conn *ctnetlink_glue_get_ct(const struct sk_buff *skb,
 					     enum ip_conntrack_info *ctinfo)
 {
-	struct nf_conn *ct;
-
-	ct = nf_ct_get(skb, ctinfo);
-	if (ct && nf_ct_is_untracked(ct))
-		ct = NULL;
-
-	return ct;
+	return nf_ct_get(skb, ctinfo);
 }
 
 static int __ctnetlink_glue_build(struct sk_buff *skb, struct nf_conn *ct)
diff --git a/net/netfilter/xt_HMARK.c b/net/netfilter/xt_HMARK.c
index 02afaf48a729..60e6dbe12460 100644
--- a/net/netfilter/xt_HMARK.c
+++ b/net/netfilter/xt_HMARK.c
@@ -84,7 +84,7 @@ hmark_ct_set_htuple(const struct sk_buff *skb, struct hmark_tuple *t,
 	struct nf_conntrack_tuple *otuple;
 	struct nf_conntrack_tuple *rtuple;
 
-	if (ct == NULL || nf_ct_is_untracked(ct))
+	if (ct == NULL)
 		return -1;
 
 	otuple = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
diff --git a/net/netfilter/xt_cluster.c b/net/netfilter/xt_cluster.c
index 9a9884a39c0e..57ef175dfbfa 100644
--- a/net/netfilter/xt_cluster.c
+++ b/net/netfilter/xt_cluster.c
@@ -121,9 +121,6 @@ xt_cluster_mt(const struct sk_buff *skb, struct xt_action_param *par)
 	if (ct == NULL)
 		return false;
 
-	if (nf_ct_is_untracked(ct))
-		return false;
-
 	if (ct->master)
 		hash = xt_cluster_hash(ct->master, info);
 	else
diff --git a/net/netfilter/xt_connlabel.c b/net/netfilter/xt_connlabel.c
index 7827128d5a95..23372879e6e3 100644
--- a/net/netfilter/xt_connlabel.c
+++ b/net/netfilter/xt_connlabel.c
@@ -29,7 +29,7 @@ connlabel_mt(const struct sk_buff *skb, struct xt_action_param *par)
 	bool invert = info->options & XT_CONNLABEL_OP_INVERT;
 
 	ct = nf_ct_get(skb, &ctinfo);
-	if (ct == NULL || nf_ct_is_untracked(ct))
+	if (ct == NULL)
 		return invert;
 
 	labels = nf_ct_labels_find(ct);
diff --git a/net/netfilter/xt_connmark.c b/net/netfilter/xt_connmark.c
index 9935d5029b0e..ec377cc6a369 100644
--- a/net/netfilter/xt_connmark.c
+++ b/net/netfilter/xt_connmark.c
@@ -44,7 +44,7 @@ connmark_tg(struct sk_buff *skb, const struct xt_action_param *par)
 	u_int32_t newmark;
 
 	ct = nf_ct_get(skb, &ctinfo);
-	if (ct == NULL || nf_ct_is_untracked(ct))
+	if (ct == NULL)
 		return XT_CONTINUE;
 
 	switch (info->mode) {
@@ -97,7 +97,7 @@ connmark_mt(const struct sk_buff *skb, struct xt_action_param *par)
 	const struct nf_conn *ct;
 
 	ct = nf_ct_get(skb, &ctinfo);
-	if (ct == NULL || nf_ct_is_untracked(ct))
+	if (ct == NULL)
 		return false;
 
 	return ((ct->mark & info->mask) == info->mark) ^ info->invert;
diff --git a/net/netfilter/xt_ipvs.c b/net/netfilter/xt_ipvs.c
index 0fdc89064488..42540d26c2b8 100644
--- a/net/netfilter/xt_ipvs.c
+++ b/net/netfilter/xt_ipvs.c
@@ -116,7 +116,7 @@ ipvs_mt(const struct sk_buff *skb, struct xt_action_param *par)
 		enum ip_conntrack_info ctinfo;
 		struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
-		if (ct == NULL || nf_ct_is_untracked(ct)) {
+		if (ct == NULL) {
 			match = false;
 			goto out_put_cp;
 		}
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 85cd59526670..23aa2a185b95 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -796,11 +796,6 @@ static int ovs_ct_nat(struct net *net, struct sw_flow_key *key,
 	enum nf_nat_manip_type maniptype;
 	int err;
 
-	if (nf_ct_is_untracked(ct)) {
-		/* A NAT action may only be performed on tracked packets. */
-		return NF_ACCEPT;
-	}
-
 	/* Add NAT extension if not confirmed yet. */
 	if (!nf_ct_is_confirmed(ct) && !nf_ct_nat_ext_add(ct))
 		return NF_ACCEPT;   /* Can't NAT. */
-- 
2.10.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH nf-next 0/2] netfilter: untracked object removal
  2017-03-08 12:49 [PATCH nf-next 0/2] netfilter: untracked object removal Florian Westphal
  2017-03-08 12:49 ` [PATCH nf-next 1/2] netfilter: kill the fake untracked conntrack objects Florian Westphal
  2017-03-08 12:49 ` [PATCH nf-next 2/2] netfilter: remove nf_ct_is_untracked Florian Westphal
@ 2017-03-08 16:29 ` Eric Dumazet
  2 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2017-03-08 16:29 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On Wed, 2017-03-08 at 13:49 +0100, Florian Westphal wrote:
> These patches remove the percpu untracked objects, they get replaced
> with a new (kernel internal) ctinfo state.
> 
> This avoids reference counter operations for untracked packets and
> removes the need to check a conntrack for the UNTRACKED status bit
> before setting connmark, labels, etc.
> 
> I checked with following rule set and things appear to work as
> expected (i.e., ssh connections don't show up in conntrack -L):
> 
> *raw
> :PREROUTING ACCEPT [455:34825]
> :OUTPUT ACCEPT [251:29555]
> [775:63699] -A PREROUTING -p tcp -m tcp --dport 22 -j NOTRACK
> [251:29555] -A OUTPUT -p tcp -m tcp --sport 22 -j NOTRACK
> COMMIT
> *filter
> :INPUT ACCEPT [0:0]
> :FORWARD ACCEPT [0:0]
> :OUTPUT ACCEPT [0:0]
> [0:0] -A INPUT -m conntrack --ctstate INVALID -j DROP
> [337:26377] -A INPUT -p tcp -m conntrack --ctstate UNTRACKED -m tcp --dport 22 -j ACCEPT
> [0:0] -A INPUT -m conntrack --ctstate UNTRACKED
> [102:13883] -A OUTPUT -p tcp -m conntrack --ctstate UNTRACKED -m tcp --sport 22 -j ACCEPT
> [0:0] -A OUTPUT -m conntrack --ctstate UNTRACKED
> COMMIT


Very nice series Florian, thanks !




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH nf-next 1/2] netfilter: kill the fake untracked conntrack objects
  2017-04-15  9:43   ` Florian Westphal
@ 2017-04-15  9:49     ` Pablo Neira Ayuso
  0 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2017-04-15  9:49 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On Sat, Apr 15, 2017 at 11:43:15AM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > Hi Florian,
> > 
> > On Fri, Apr 14, 2017 at 08:31:08PM +0200, Florian Westphal wrote:
> > [...]
> > > resurrect an old patch from Pablo Neira to remove the untracked objects.
> > 
> > Thanks for doing so :)
> > 
> > > diff --git a/include/uapi/linux/netfilter/nf_conntrack_common.h b/include/uapi/linux/netfilter/nf_conntrack_common.h
> > > index 6a8e33dd4ecb..b4a0a1940118 100644
> > > --- a/include/uapi/linux/netfilter/nf_conntrack_common.h
> > > +++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
> > > @@ -28,12 +28,14 @@ enum ip_conntrack_info {
> > >  	/* only for userspace compatibility */
> > >  #ifndef __KERNEL__
> > >  	IP_CT_NEW_REPLY = IP_CT_NUMBER,
> > > +#else
> > > +	IP_CT_UNTRACKED = 7,
> > 
> > This seems to be exposed via nfnetlink_queue conntrack support.
> 
> Yet another argument for removing this, its a bug.
> (missing !nf_ct_is_untracked() check).
> 
> Probably should also check !nf_ct_is_template() there.

We already do there. So no problem :)

> > > @@ -94,7 +96,7 @@ enum ip_conntrack_status {
> > >  	IPS_TEMPLATE_BIT = 11,
> > >  	IPS_TEMPLATE = (1 << IPS_TEMPLATE_BIT),
> > >  
> > > -	/* Conntrack is a fake untracked entry */
> > > +	/* Conntrack is a fake untracked entry.  Obsolete and not used anymore */
> > >  	IPS_UNTRACKED_BIT = 12,
> > >  	IPS_UNTRACKED = (1 << IPS_UNTRACKED_BIT),
> > 
> > I wonder if we should just set this via ctnetlink, so the interface
> > keeps stable. Even if we don't use this bit anymore. These are also
> > exposed via ctnetlink CTA_STATUS.
> 
> Not following, sorry (still jetlagged).
> ctnetlink_new_conntrack() always places the ct into the hashes, so its
> not 'untracked'.
> 
> Setting and/or changing IPS_UNTRACKED_BIT via ctnetlink should be
> disallowed.
> 
> Is that what you meant?

Just got confused in this nfnetlink_queue path, this looks good! :)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH nf-next 1/2] netfilter: kill the fake untracked conntrack objects
  2017-04-15  9:21 ` Pablo Neira Ayuso
  2017-04-15  9:43   ` Florian Westphal
@ 2017-04-15  9:47   ` Pablo Neira Ayuso
  1 sibling, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2017-04-15  9:47 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On Sat, Apr 15, 2017 at 11:21:31AM +0200, Pablo Neira Ayuso wrote:
> Hi Florian,
> 
> On Fri, Apr 14, 2017 at 08:31:08PM +0200, Florian Westphal wrote:
> [...]
> > resurrect an old patch from Pablo Neira to remove the untracked objects.
> 
> Thanks for doing so :)
> 
> > diff --git a/include/uapi/linux/netfilter/nf_conntrack_common.h b/include/uapi/linux/netfilter/nf_conntrack_common.h
> > index 6a8e33dd4ecb..b4a0a1940118 100644
> > --- a/include/uapi/linux/netfilter/nf_conntrack_common.h
> > +++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
> > @@ -28,12 +28,14 @@ enum ip_conntrack_info {
> >  	/* only for userspace compatibility */
> >  #ifndef __KERNEL__
> >  	IP_CT_NEW_REPLY = IP_CT_NUMBER,
> > +#else
> > +	IP_CT_UNTRACKED = 7,
> 
> This seems to be exposed via nfnetlink_queue conntrack support.
> 
> > @@ -94,7 +96,7 @@ enum ip_conntrack_status {
> >  	IPS_TEMPLATE_BIT = 11,
> >  	IPS_TEMPLATE = (1 << IPS_TEMPLATE_BIT),
> >  
> > -	/* Conntrack is a fake untracked entry */
> > +	/* Conntrack is a fake untracked entry.  Obsolete and not used anymore */
> >  	IPS_UNTRACKED_BIT = 12,
> >  	IPS_UNTRACKED = (1 << IPS_UNTRACKED_BIT),
> 
> I wonder if we should just set this via ctnetlink, so the interface
> keeps stable. Even if we don't use this bit anymore. These are also
> exposed via ctnetlink CTA_STATUS.

Forget this.

This patch is perfectly fine.

Applied, thanks Florian!

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH nf-next 1/2] netfilter: kill the fake untracked conntrack objects
  2017-04-15  9:21 ` Pablo Neira Ayuso
@ 2017-04-15  9:43   ` Florian Westphal
  2017-04-15  9:49     ` Pablo Neira Ayuso
  2017-04-15  9:47   ` Pablo Neira Ayuso
  1 sibling, 1 reply; 9+ messages in thread
From: Florian Westphal @ 2017-04-15  9:43 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> Hi Florian,
> 
> On Fri, Apr 14, 2017 at 08:31:08PM +0200, Florian Westphal wrote:
> [...]
> > resurrect an old patch from Pablo Neira to remove the untracked objects.
> 
> Thanks for doing so :)
> 
> > diff --git a/include/uapi/linux/netfilter/nf_conntrack_common.h b/include/uapi/linux/netfilter/nf_conntrack_common.h
> > index 6a8e33dd4ecb..b4a0a1940118 100644
> > --- a/include/uapi/linux/netfilter/nf_conntrack_common.h
> > +++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
> > @@ -28,12 +28,14 @@ enum ip_conntrack_info {
> >  	/* only for userspace compatibility */
> >  #ifndef __KERNEL__
> >  	IP_CT_NEW_REPLY = IP_CT_NUMBER,
> > +#else
> > +	IP_CT_UNTRACKED = 7,
> 
> This seems to be exposed via nfnetlink_queue conntrack support.

Yet another argument for removing this, its a bug.
(missing !nf_ct_is_untracked() check).

Probably should also check !nf_ct_is_template() there.

> > @@ -94,7 +96,7 @@ enum ip_conntrack_status {
> >  	IPS_TEMPLATE_BIT = 11,
> >  	IPS_TEMPLATE = (1 << IPS_TEMPLATE_BIT),
> >  
> > -	/* Conntrack is a fake untracked entry */
> > +	/* Conntrack is a fake untracked entry.  Obsolete and not used anymore */
> >  	IPS_UNTRACKED_BIT = 12,
> >  	IPS_UNTRACKED = (1 << IPS_UNTRACKED_BIT),
> 
> I wonder if we should just set this via ctnetlink, so the interface
> keeps stable. Even if we don't use this bit anymore. These are also
> exposed via ctnetlink CTA_STATUS.

Not following, sorry (still jetlagged).
ctnetlink_new_conntrack() always places the ct into the hashes, so its
not 'untracked'.

Setting and/or changing IPS_UNTRACKED_BIT via ctnetlink should be
disallowed.

Is that what you meant?


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH nf-next 1/2] netfilter: kill the fake untracked conntrack objects
  2017-04-14 18:31 [PATCH nf-next 1/2] netfilter: kill the fake untracked conntrack objects Florian Westphal
@ 2017-04-15  9:21 ` Pablo Neira Ayuso
  2017-04-15  9:43   ` Florian Westphal
  2017-04-15  9:47   ` Pablo Neira Ayuso
  0 siblings, 2 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2017-04-15  9:21 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

Hi Florian,

On Fri, Apr 14, 2017 at 08:31:08PM +0200, Florian Westphal wrote:
[...]
> resurrect an old patch from Pablo Neira to remove the untracked objects.

Thanks for doing so :)

> diff --git a/include/uapi/linux/netfilter/nf_conntrack_common.h b/include/uapi/linux/netfilter/nf_conntrack_common.h
> index 6a8e33dd4ecb..b4a0a1940118 100644
> --- a/include/uapi/linux/netfilter/nf_conntrack_common.h
> +++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
> @@ -28,12 +28,14 @@ enum ip_conntrack_info {
>  	/* only for userspace compatibility */
>  #ifndef __KERNEL__
>  	IP_CT_NEW_REPLY = IP_CT_NUMBER,
> +#else
> +	IP_CT_UNTRACKED = 7,

This seems to be exposed via nfnetlink_queue conntrack support.

> @@ -94,7 +96,7 @@ enum ip_conntrack_status {
>  	IPS_TEMPLATE_BIT = 11,
>  	IPS_TEMPLATE = (1 << IPS_TEMPLATE_BIT),
>  
> -	/* Conntrack is a fake untracked entry */
> +	/* Conntrack is a fake untracked entry.  Obsolete and not used anymore */
>  	IPS_UNTRACKED_BIT = 12,
>  	IPS_UNTRACKED = (1 << IPS_UNTRACKED_BIT),

I wonder if we should just set this via ctnetlink, so the interface
keeps stable. Even if we don't use this bit anymore. These are also
exposed via ctnetlink CTA_STATUS.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH nf-next 1/2] netfilter: kill the fake untracked conntrack objects
@ 2017-04-14 18:31 Florian Westphal
  2017-04-15  9:21 ` Pablo Neira Ayuso
  0 siblings, 1 reply; 9+ messages in thread
From: Florian Westphal @ 2017-04-14 18:31 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

resurrect an old patch from Pablo Neira to remove the untracked objects.

Currently, there are four possible states of an skb wrt. conntrack.

1. No conntrack attached, ct is NULL.
2. Normal (kmem cache allocated) ct attached.
3. a template (kmalloc'd), not in any hash tables at any point in time
4. the 'untracked' conntrack, a percpu nf_conn object, tagged via
   IPS_UNTRACKED_BIT in ct->status.

Untracked is supposed to be identical to case 1.  It exists only
so users can check

-m conntrack --ctstate UNTRACKED vs.
-m conntrack --ctstate INVALID

e.g. attempts to set connmark on INVALID or UNTRACKED conntracks is
supposed to be a no-op.

Thus currently we need to check
 ct == NULL || nf_ct_is_untracked(ct)

in a lot of places in order to avoid altering untracked objects.

The other consequence of the percpu untracked object is that all
-j NOTRACK (and, later, kfree_skb of such skbs) result in an atomic op
(inc/dec the untracked conntracks refcount).

This adds a new kernel-private ctinfo state, IP_CT_UNTRACKED, to
make the distinction instead.

The (few) places that care about packet invalid (ct is NULL) vs.
packet untracked now need to test ct == NULL vs. ctinfo == IP_CT_UNTRACKED,
but all other places can omit the nf_ct_is_untracked() check.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/ip_vs.h                                |  6 +--
 include/net/netfilter/nf_conntrack.h               | 10 +----
 include/uapi/linux/netfilter/nf_conntrack_common.h |  6 ++-
 net/ipv4/netfilter/nf_dup_ipv4.c                   |  3 +-
 net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c     |  3 +-
 net/ipv6/netfilter/nf_dup_ipv6.c                   |  3 +-
 net/netfilter/nf_conntrack_core.c                  | 48 +++-------------------
 net/netfilter/nf_nat_core.c                        |  3 --
 net/netfilter/nft_ct.c                             | 14 +++----
 net/netfilter/xt_CT.c                              | 16 ++++----
 net/netfilter/xt_conntrack.c                       | 11 +++--
 net/netfilter/xt_state.c                           | 13 +++---
 12 files changed, 39 insertions(+), 97 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 8a4a57b887fb..9a75d9933e63 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1556,12 +1556,8 @@ static inline void ip_vs_notrack(struct sk_buff *skb)
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
 	if (!ct || !nf_ct_is_untracked(ct)) {
-		struct nf_conn *untracked;
-
 		nf_conntrack_put(&ct->ct_general);
-		untracked = nf_ct_untracked_get();
-		nf_conntrack_get(&untracked->ct_general);
-		nf_ct_set(skb, untracked, IP_CT_NEW);
+		nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
 	}
 #endif
 }
diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index 19605878da47..012b99f563e5 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -243,14 +243,6 @@ extern s32 (*nf_ct_nat_offset)(const struct nf_conn *ct,
 			       enum ip_conntrack_dir dir,
 			       u32 seq);
 
-/* Fake conntrack entry for untracked connections */
-DECLARE_PER_CPU_ALIGNED(struct nf_conn, nf_conntrack_untracked);
-static inline struct nf_conn *nf_ct_untracked_get(void)
-{
-	return raw_cpu_ptr(&nf_conntrack_untracked);
-}
-void nf_ct_untracked_status_or(unsigned long bits);
-
 /* Iterate over all conntracks: if iter returns true, it's deleted. */
 void nf_ct_iterate_cleanup(struct net *net,
 			   int (*iter)(struct nf_conn *i, void *data),
@@ -283,7 +275,7 @@ static inline int nf_ct_is_dying(const struct nf_conn *ct)
 
 static inline int nf_ct_is_untracked(const struct nf_conn *ct)
 {
-	return test_bit(IPS_UNTRACKED_BIT, &ct->status);
+	return false;
 }
 
 /* Packet is received from loopback */
diff --git a/include/uapi/linux/netfilter/nf_conntrack_common.h b/include/uapi/linux/netfilter/nf_conntrack_common.h
index 6a8e33dd4ecb..b4a0a1940118 100644
--- a/include/uapi/linux/netfilter/nf_conntrack_common.h
+++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
@@ -28,12 +28,14 @@ enum ip_conntrack_info {
 	/* only for userspace compatibility */
 #ifndef __KERNEL__
 	IP_CT_NEW_REPLY = IP_CT_NUMBER,
+#else
+	IP_CT_UNTRACKED = 7,
 #endif
 };
 
 #define NF_CT_STATE_INVALID_BIT			(1 << 0)
 #define NF_CT_STATE_BIT(ctinfo)			(1 << ((ctinfo) % IP_CT_IS_REPLY + 1))
-#define NF_CT_STATE_UNTRACKED_BIT		(1 << (IP_CT_NUMBER + 1))
+#define NF_CT_STATE_UNTRACKED_BIT		(1 << (IP_CT_UNTRACKED + 1))
 
 /* Bitset representing status of connection. */
 enum ip_conntrack_status {
@@ -94,7 +96,7 @@ enum ip_conntrack_status {
 	IPS_TEMPLATE_BIT = 11,
 	IPS_TEMPLATE = (1 << IPS_TEMPLATE_BIT),
 
-	/* Conntrack is a fake untracked entry */
+	/* Conntrack is a fake untracked entry.  Obsolete and not used anymore */
 	IPS_UNTRACKED_BIT = 12,
 	IPS_UNTRACKED = (1 << IPS_UNTRACKED_BIT),
 
diff --git a/net/ipv4/netfilter/nf_dup_ipv4.c b/net/ipv4/netfilter/nf_dup_ipv4.c
index f0dbff05fc28..39895b9ddeb9 100644
--- a/net/ipv4/netfilter/nf_dup_ipv4.c
+++ b/net/ipv4/netfilter/nf_dup_ipv4.c
@@ -69,8 +69,7 @@ void nf_dup_ipv4(struct net *net, struct sk_buff *skb, unsigned int hooknum,
 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
 	/* Avoid counting cloned packets towards the original connection. */
 	nf_reset(skb);
-	nf_ct_set(skb, nf_ct_untracked_get(), IP_CT_NEW);
-	nf_conntrack_get(skb_nfct(skb));
+	nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
 #endif
 	/*
 	 * If we are in PREROUTING/INPUT, decrease the TTL to mitigate potential
diff --git a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
index d2c2ccbfbe72..d5f028e33f65 100644
--- a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
@@ -221,8 +221,7 @@ icmpv6_error(struct net *net, struct nf_conn *tmpl,
 	type = icmp6h->icmp6_type - 130;
 	if (type >= 0 && type < sizeof(noct_valid_new) &&
 	    noct_valid_new[type]) {
-		nf_ct_set(skb, nf_ct_untracked_get(), IP_CT_NEW);
-		nf_conntrack_get(skb_nfct(skb));
+		nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
 		return NF_ACCEPT;
 	}
 
diff --git a/net/ipv6/netfilter/nf_dup_ipv6.c b/net/ipv6/netfilter/nf_dup_ipv6.c
index 888ecd106e5f..4a7ddeddbaab 100644
--- a/net/ipv6/netfilter/nf_dup_ipv6.c
+++ b/net/ipv6/netfilter/nf_dup_ipv6.c
@@ -58,8 +58,7 @@ void nf_dup_ipv6(struct net *net, struct sk_buff *skb, unsigned int hooknum,
 
 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
 	nf_reset(skb);
-	nf_ct_set(skb, nf_ct_untracked_get(), IP_CT_NEW);
-	nf_conntrack_get(skb_nfct(skb));
+	nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
 #endif
 	if (hooknum == NF_INET_PRE_ROUTING ||
 	    hooknum == NF_INET_LOCAL_IN) {
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index bcf1d2a6539e..03150f60714d 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -180,14 +180,6 @@ EXPORT_SYMBOL_GPL(nf_conntrack_htable_size);
 
 unsigned int nf_conntrack_max __read_mostly;
 seqcount_t nf_conntrack_generation __read_mostly;
-
-/* nf_conn must be 8 bytes aligned, as the 3 LSB bits are used
- * for the nfctinfo. We cheat by (ab)using the PER CPU cache line
- * alignment to enforce this.
- */
-DEFINE_PER_CPU_ALIGNED(struct nf_conn, nf_conntrack_untracked);
-EXPORT_PER_CPU_SYMBOL(nf_conntrack_untracked);
-
 static unsigned int nf_conntrack_hash_rnd __read_mostly;
 
 static u32 hash_conntrack_raw(const struct nf_conntrack_tuple *tuple,
@@ -1314,9 +1306,10 @@ nf_conntrack_in(struct net *net, u_int8_t pf, unsigned int hooknum,
 	int ret;
 
 	tmpl = nf_ct_get(skb, &ctinfo);
-	if (tmpl) {
+	if (tmpl || ctinfo == IP_CT_UNTRACKED) {
 		/* Previously seen (loopback or untracked)?  Ignore. */
-		if (!nf_ct_is_template(tmpl)) {
+		if ((tmpl && !nf_ct_is_template(tmpl)) ||
+		     ctinfo == IP_CT_UNTRACKED) {
 			NF_CT_STAT_INC_ATOMIC(net, ignore);
 			return NF_ACCEPT;
 		}
@@ -1629,18 +1622,6 @@ void nf_ct_free_hashtable(void *hash, unsigned int size)
 }
 EXPORT_SYMBOL_GPL(nf_ct_free_hashtable);
 
-static int untrack_refs(void)
-{
-	int cnt = 0, cpu;
-
-	for_each_possible_cpu(cpu) {
-		struct nf_conn *ct = &per_cpu(nf_conntrack_untracked, cpu);
-
-		cnt += atomic_read(&ct->ct_general.use) - 1;
-	}
-	return cnt;
-}
-
 void nf_conntrack_cleanup_start(void)
 {
 	conntrack_gc_work.exiting = true;
@@ -1650,8 +1631,6 @@ void nf_conntrack_cleanup_start(void)
 void nf_conntrack_cleanup_end(void)
 {
 	RCU_INIT_POINTER(nf_ct_destroy, NULL);
-	while (untrack_refs() > 0)
-		schedule();
 
 	cancel_delayed_work_sync(&conntrack_gc_work.dwork);
 	nf_ct_free_hashtable(nf_conntrack_hash, nf_conntrack_htable_size);
@@ -1825,20 +1804,11 @@ EXPORT_SYMBOL_GPL(nf_conntrack_set_hashsize);
 module_param_call(hashsize, nf_conntrack_set_hashsize, param_get_uint,
 		  &nf_conntrack_htable_size, 0600);
 
-void nf_ct_untracked_status_or(unsigned long bits)
-{
-	int cpu;
-
-	for_each_possible_cpu(cpu)
-		per_cpu(nf_conntrack_untracked, cpu).status |= bits;
-}
-EXPORT_SYMBOL_GPL(nf_ct_untracked_status_or);
-
 int nf_conntrack_init_start(void)
 {
 	int max_factor = 8;
 	int ret = -ENOMEM;
-	int i, cpu;
+	int i;
 
 	seqcount_init(&nf_conntrack_generation);
 
@@ -1921,15 +1891,6 @@ int nf_conntrack_init_start(void)
 	if (ret < 0)
 		goto err_proto;
 
-	/* Set up fake conntrack: to never be deleted, not in any hashes */
-	for_each_possible_cpu(cpu) {
-		struct nf_conn *ct = &per_cpu(nf_conntrack_untracked, cpu);
-		write_pnet(&ct->ct_net, &init_net);
-		atomic_set(&ct->ct_general.use, 1);
-	}
-	/*  - and look it like as a confirmed connection */
-	nf_ct_untracked_status_or(IPS_CONFIRMED | IPS_UNTRACKED);
-
 	conntrack_gc_work_init(&conntrack_gc_work);
 	queue_delayed_work(system_long_wq, &conntrack_gc_work.dwork, HZ);
 
@@ -1977,6 +1938,7 @@ int nf_conntrack_init_net(struct net *net)
 	int ret = -ENOMEM;
 	int cpu;
 
+	BUILD_BUG_ON(IP_CT_UNTRACKED == IP_CT_NUMBER);
 	atomic_set(&net->ct.count, 0);
 
 	net->ct.pcpu_lists = alloc_percpu(struct ct_pcpu);
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index fb0e65411785..3602f9ee769b 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -861,9 +861,6 @@ static int __init nf_nat_init(void)
 
 	nf_ct_helper_expectfn_register(&follow_master_nat);
 
-	/* Initialize fake conntrack so that NAT will skip it */
-	nf_ct_untracked_status_or(IPS_NAT_DONE_MASK);
-
 	BUG_ON(nfnetlink_parse_nat_setup_hook != NULL);
 	RCU_INIT_POINTER(nfnetlink_parse_nat_setup_hook,
 			   nfnetlink_parse_nat_setup);
diff --git a/net/netfilter/nft_ct.c b/net/netfilter/nft_ct.c
index 6e23dbbedd7f..6c6fd48b024c 100644
--- a/net/netfilter/nft_ct.c
+++ b/net/netfilter/nft_ct.c
@@ -72,12 +72,12 @@ static void nft_ct_get_eval(const struct nft_expr *expr,
 
 	switch (priv->key) {
 	case NFT_CT_STATE:
-		if (ct == NULL)
-			state = NF_CT_STATE_INVALID_BIT;
-		else if (nf_ct_is_untracked(ct))
+		if (ct)
+			state = NF_CT_STATE_BIT(ctinfo);
+		else if (ctinfo == IP_CT_UNTRACKED)
 			state = NF_CT_STATE_UNTRACKED_BIT;
 		else
-			state = NF_CT_STATE_BIT(ctinfo);
+			state = NF_CT_STATE_INVALID_BIT;
 		*dest = state;
 		return;
 	default:
@@ -718,12 +718,10 @@ static void nft_notrack_eval(const struct nft_expr *expr,
 
 	ct = nf_ct_get(pkt->skb, &ctinfo);
 	/* Previously seen (loopback or untracked)?  Ignore. */
-	if (ct)
+	if (ct || ctinfo == IP_CT_UNTRACKED)
 		return;
 
-	ct = nf_ct_untracked_get();
-	atomic_inc(&ct->ct_general.use);
-	nf_ct_set(skb, ct, IP_CT_NEW);
+	nf_ct_set(skb, ct, IP_CT_UNTRACKED);
 }
 
 static struct nft_expr_type nft_notrack_type;
diff --git a/net/netfilter/xt_CT.c b/net/netfilter/xt_CT.c
index b008db0184b8..3cbe1bcf6a74 100644
--- a/net/netfilter/xt_CT.c
+++ b/net/netfilter/xt_CT.c
@@ -26,11 +26,12 @@ static inline int xt_ct_target(struct sk_buff *skb, struct nf_conn *ct)
 	if (skb->_nfct != 0)
 		return XT_CONTINUE;
 
-	/* special case the untracked ct : we want the percpu object */
-	if (!ct)
-		ct = nf_ct_untracked_get();
-	atomic_inc(&ct->ct_general.use);
-	nf_ct_set(skb, ct, IP_CT_NEW);
+	if (ct) {
+		atomic_inc(&ct->ct_general.use);
+		nf_ct_set(skb, ct, IP_CT_NEW);
+	} else {
+		nf_ct_set(skb, ct, IP_CT_UNTRACKED);
+	}
 
 	return XT_CONTINUE;
 }
@@ -335,7 +336,7 @@ static void xt_ct_tg_destroy(const struct xt_tgdtor_param *par,
 	struct nf_conn *ct = info->ct;
 	struct nf_conn_help *help;
 
-	if (ct && !nf_ct_is_untracked(ct)) {
+	if (ct) {
 		help = nfct_help(ct);
 		if (help)
 			module_put(help->helper->me);
@@ -412,8 +413,7 @@ notrack_tg(struct sk_buff *skb, const struct xt_action_param *par)
 	if (skb->_nfct != 0)
 		return XT_CONTINUE;
 
-	nf_ct_set(skb, nf_ct_untracked_get(), IP_CT_NEW);
-	nf_conntrack_get(skb_nfct(skb));
+	nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
 
 	return XT_CONTINUE;
 }
diff --git a/net/netfilter/xt_conntrack.c b/net/netfilter/xt_conntrack.c
index c0fb217bc649..39cf1d019240 100644
--- a/net/netfilter/xt_conntrack.c
+++ b/net/netfilter/xt_conntrack.c
@@ -172,12 +172,11 @@ conntrack_mt(const struct sk_buff *skb, struct xt_action_param *par,
 
 	ct = nf_ct_get(skb, &ctinfo);
 
-	if (ct) {
-		if (nf_ct_is_untracked(ct))
-			statebit = XT_CONNTRACK_STATE_UNTRACKED;
-		else
-			statebit = XT_CONNTRACK_STATE_BIT(ctinfo);
-	} else
+	if (ct)
+		statebit = XT_CONNTRACK_STATE_BIT(ctinfo);
+	else if (ctinfo == IP_CT_UNTRACKED)
+		statebit = XT_CONNTRACK_STATE_UNTRACKED;
+	else
 		statebit = XT_CONNTRACK_STATE_INVALID;
 
 	if (info->match_flags & XT_CONNTRACK_STATE) {
diff --git a/net/netfilter/xt_state.c b/net/netfilter/xt_state.c
index 5746a33789a5..5fbd79194d21 100644
--- a/net/netfilter/xt_state.c
+++ b/net/netfilter/xt_state.c
@@ -28,14 +28,13 @@ state_mt(const struct sk_buff *skb, struct xt_action_param *par)
 	unsigned int statebit;
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
-	if (!ct)
+	if (ct)
+		statebit = XT_STATE_BIT(ctinfo);
+	else if (ctinfo == IP_CT_UNTRACKED)
+		statebit = XT_STATE_UNTRACKED;
+	else
 		statebit = XT_STATE_INVALID;
-	else {
-		if (nf_ct_is_untracked(ct))
-			statebit = XT_STATE_UNTRACKED;
-		else
-			statebit = XT_STATE_BIT(ctinfo);
-	}
+
 	return (sinfo->statemask & statebit);
 }
 
-- 
2.10.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-04-15  9:49 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-08 12:49 [PATCH nf-next 0/2] netfilter: untracked object removal Florian Westphal
2017-03-08 12:49 ` [PATCH nf-next 1/2] netfilter: kill the fake untracked conntrack objects Florian Westphal
2017-03-08 12:49 ` [PATCH nf-next 2/2] netfilter: remove nf_ct_is_untracked Florian Westphal
2017-03-08 16:29 ` [PATCH nf-next 0/2] netfilter: untracked object removal Eric Dumazet
2017-04-14 18:31 [PATCH nf-next 1/2] netfilter: kill the fake untracked conntrack objects Florian Westphal
2017-04-15  9:21 ` Pablo Neira Ayuso
2017-04-15  9:43   ` Florian Westphal
2017-04-15  9:49     ` Pablo Neira Ayuso
2017-04-15  9:47   ` Pablo Neira Ayuso

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.