All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7 -next] rm tproxy_core, ct event redelivery via workqueue
@ 2013-07-29 13:41 Florian Westphal
  2013-07-29 13:41 ` [PATCH 1/7] netfilter: connlabels: remove unneeded includes Florian Westphal
                   ` (6 more replies)
  0 siblings, 7 replies; 16+ messages in thread
From: Florian Westphal @ 2013-07-29 13:41 UTC (permalink / raw)
  To: netfilter-devel

Hi Pablo,

The following contains a bunch of patches that id'like to see in -next tree.

The two tproxy patches remove the nf_tproxy_core module - the TPROXY
target is changed to use the generic sock_edemux destructor
(this is one reason for the negative diffstat).

The three conntrack patches are yet another attempt at removing
the extra ecache timer:  It implements redelivery via delayed work item
- the advantage is that redelivery is now under scheduler control and
thus competes fairly with the userspace event consumers.

I got slightly better results than current master branch, and a
lot better results compared to the old "single timer" based patch.

Because nf_conntrack_netlink.c contains a bit of redundant code copied
from nf_conntrack_core I rebased the "cleanup" parts of your patch titled

"netfilter: nf_conntrack: fix race in timer handling with reliable events"

which is sitting in patchwork: http://patchwork.ozlabs.org/patch/180436/

If you prefer to forward-port the cleanup bits yourself jsut let me know
when you're finished an I will rebase my changes.

Patches will also be sent in reply to this email.

The following changes since commit 496e4ae7dc944faa1721bfda7e9d834d5611a874:

  netfilter: nf_queue: add NFQA_SKB_CSUM_NOTVERIFIED info flag (2013-06-30 18:15:48 +0200)

are available in the git repository at:
  git://chamillionaire.breakpoint.cc/fw/nf-next.git pull-20130729

Florian Westphal (7):
      netfilter: connlabels: remove unneeded includes
      netfilter: nf_queue: relax NFQA_CT attribute check
      netfilter: tproxy: remove nf_tproxy_core module, keep tw sock assigned to skb
      netfilter: tproxy: remove nf_tproxy_core.h
      netfilter: conntrack: remove duplicate code in conntrack_netlink
      netfilter: conntrack: don't send destroy events from iterator
      netfilter: conntrack: remove timer from ecache extension

 Documentation/networking/tproxy.txt                |    5 +-
 include/net/netfilter/nf_conntrack.h               |   14 +-
 include/net/netfilter/nf_conntrack_ecache.h        |    9 +-
 include/net/netfilter/nf_tproxy_core.h             |  210 --------------------
 include/net/netns/conntrack.h                      |    5 +-
 include/uapi/linux/netfilter/nf_conntrack_common.h |    8 +-
 net/ipv4/netfilter/ipt_MASQUERADE.c                |    2 +-
 net/ipv6/netfilter/ip6t_MASQUERADE.c               |    2 +-
 net/netfilter/Kconfig                              |   22 +--
 net/netfilter/Makefile                             |    3 -
 net/netfilter/nf_conntrack_core.c                  |  131 +++----------
 net/netfilter/nf_conntrack_ecache.c                |   63 +++++-
 net/netfilter/nf_conntrack_labels.c                |    4 -
 net/netfilter/nf_conntrack_netlink.c               |   18 +--
 net/netfilter/nf_conntrack_proto.c                 |    4 +-
 net/netfilter/nf_nat_core.c                        |    6 +-
 net/netfilter/nf_tproxy_core.c                     |   62 ------
 net/netfilter/nfnetlink_queue_core.c               |    4 +-
 net/netfilter/xt_TPROXY.c                          |  167 ++++++++++++++++-
 net/netfilter/xt_socket.c                          |   66 ++++++-
 20 files changed, 353 insertions(+), 452 deletions(-)
 delete mode 100644 include/net/netfilter/nf_tproxy_core.h
 delete mode 100644 net/netfilter/nf_tproxy_core.c

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/7] netfilter: connlabels: remove unneeded includes
  2013-07-29 13:41 [PATCH 0/7 -next] rm tproxy_core, ct event redelivery via workqueue Florian Westphal
@ 2013-07-29 13:41 ` Florian Westphal
  2013-07-31 16:56   ` Pablo Neira Ayuso
  2013-07-29 13:41 ` [PATCH 2/7] netfilter: nf_queue: relax NFQA_CT attribute check Florian Westphal
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Florian Westphal @ 2013-07-29 13:41 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

leftovers from the (never merged) v1 patch.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_conntrack_labels.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nf_conntrack_labels.c b/net/netfilter/nf_conntrack_labels.c
index 8fe2e99..d3c352b 100644
--- a/net/netfilter/nf_conntrack_labels.c
+++ b/net/netfilter/nf_conntrack_labels.c
@@ -8,12 +8,8 @@
  * published by the Free Software Foundation.
  */
 
-#include <linux/ctype.h>
 #include <linux/export.h>
-#include <linux/jhash.h>
-#include <linux/spinlock.h>
 #include <linux/types.h>
-#include <linux/slab.h>
 
 #include <net/netfilter/nf_conntrack_ecache.h>
 #include <net/netfilter/nf_conntrack_labels.h>
-- 
1.7.8.6


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/7] netfilter: nf_queue: relax NFQA_CT attribute check
  2013-07-29 13:41 [PATCH 0/7 -next] rm tproxy_core, ct event redelivery via workqueue Florian Westphal
  2013-07-29 13:41 ` [PATCH 1/7] netfilter: connlabels: remove unneeded includes Florian Westphal
@ 2013-07-29 13:41 ` Florian Westphal
  2013-07-31 16:56   ` Pablo Neira Ayuso
  2013-07-29 13:41 ` [PATCH 3/7] netfilter: tproxy: remove nf_tproxy_core, keep tw sk assigned to skb Florian Westphal
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Florian Westphal @ 2013-07-29 13:41 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

Allow modifying attributes of the conntrack associated with a packet
without first requesting ct data via CFG_F_CONNTRACK or extra
nfnetlink_conntrack socket.

Also remove unneded rcu_read_lock; the entire function is already
protected by rcu.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nfnetlink_queue_core.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nfnetlink_queue_core.c b/net/netfilter/nfnetlink_queue_core.c
index 971ea14..ec9de12 100644
--- a/net/netfilter/nfnetlink_queue_core.c
+++ b/net/netfilter/nfnetlink_queue_core.c
@@ -987,8 +987,7 @@ nfqnl_recv_verdict(struct sock *ctnl, struct sk_buff *skb,
 	if (entry == NULL)
 		return -ENOENT;
 
-	rcu_read_lock();
-	if (nfqa[NFQA_CT] && (queue->flags & NFQA_CFG_F_CONNTRACK))
+	if (nfqa[NFQA_CT])
 		ct = nfqnl_ct_parse(entry->skb, nfqa[NFQA_CT], &ctinfo);
 
 	if (nfqa[NFQA_PAYLOAD]) {
@@ -1002,7 +1001,6 @@ nfqnl_recv_verdict(struct sock *ctnl, struct sk_buff *skb,
 		if (ct)
 			nfqnl_ct_seq_adjust(skb, ct, ctinfo, diff);
 	}
-	rcu_read_unlock();
 
 	if (nfqa[NFQA_MARK])
 		entry->skb->mark = ntohl(nla_get_be32(nfqa[NFQA_MARK]));
-- 
1.7.8.6


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/7] netfilter: tproxy: remove nf_tproxy_core, keep tw sk assigned to skb
  2013-07-29 13:41 [PATCH 0/7 -next] rm tproxy_core, ct event redelivery via workqueue Florian Westphal
  2013-07-29 13:41 ` [PATCH 1/7] netfilter: connlabels: remove unneeded includes Florian Westphal
  2013-07-29 13:41 ` [PATCH 2/7] netfilter: nf_queue: relax NFQA_CT attribute check Florian Westphal
@ 2013-07-29 13:41 ` Florian Westphal
  2013-07-31 16:57   ` Pablo Neira Ayuso
  2013-07-29 13:41 ` [PATCH 4/7] netfilter: tproxy: remove nf_tproxy_core.h Florian Westphal
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Florian Westphal @ 2013-07-29 13:41 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

The module was "permanent", due to the special tproxy skb->destructor.
Nowadays we have tcp early demux and its sock_edemux destructor in
networking core which can be used instead.

Thanks to early demux changes the input path now also handles
"skb->sk is tw socket" correctly, so this no longer needs the special
handling introduced with commit d503b30bd648b3cb4e5f50b65d27e389960cc6d9
(netfilter: tproxy: do not assign timewait sockets to skb->sk).

Thus:
- move assign_sock function to where its needed
- don't prevent timewait sockets from being assigned to the skb
- remove nf_tproxy_core.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 Documentation/networking/tproxy.txt    |    5 +--
 include/net/netfilter/nf_tproxy_core.h |    4 --
 net/netfilter/Kconfig                  |   22 +++---------
 net/netfilter/Makefile                 |    3 --
 net/netfilter/nf_tproxy_core.c         |   62 --------------------------------
 net/netfilter/xt_TPROXY.c              |    9 +++++
 6 files changed, 16 insertions(+), 89 deletions(-)
 delete mode 100644 net/netfilter/nf_tproxy_core.c

diff --git a/Documentation/networking/tproxy.txt b/Documentation/networking/tproxy.txt
index 7b5996d..ec11429 100644
--- a/Documentation/networking/tproxy.txt
+++ b/Documentation/networking/tproxy.txt
@@ -2,9 +2,8 @@ Transparent proxy support
 =========================
 
 This feature adds Linux 2.2-like transparent proxy support to current kernels.
-To use it, enable NETFILTER_TPROXY, the socket match and the TPROXY target in
-your kernel config. You will need policy routing too, so be sure to enable that
-as well.
+To use it, enable the socket match and the TPROXY target in your kernel config.
+You will need policy routing too, so be sure to enable that as well.
 
 
 1. Making non-local sockets work
diff --git a/include/net/netfilter/nf_tproxy_core.h b/include/net/netfilter/nf_tproxy_core.h
index 36d9379..975ffa4 100644
--- a/include/net/netfilter/nf_tproxy_core.h
+++ b/include/net/netfilter/nf_tproxy_core.h
@@ -203,8 +203,4 @@ nf_tproxy_get_sock_v6(struct net *net, const u8 protocol,
 }
 #endif
 
-/* assign a socket to the skb -- consumes sk */
-void
-nf_tproxy_assign_sock(struct sk_buff *skb, struct sock *sk);
-
 #endif
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 56d22ca..c45fc1a 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -410,20 +410,6 @@ config NF_NAT_TFTP
 
 endif # NF_CONNTRACK
 
-# transparent proxy support
-config NETFILTER_TPROXY
-	tristate "Transparent proxying support"
-	depends on IP_NF_MANGLE
-	depends on NETFILTER_ADVANCED
-	help
-	  This option enables transparent proxying support, that is,
-	  support for handling non-locally bound IPv4 TCP and UDP sockets.
-	  For it to work you will have to configure certain iptables rules
-	  and use policy routing. For more information on how to set it up
-	  see Documentation/networking/tproxy.txt.
-
-	  To compile it as a module, choose M here.  If unsure, say N.
-
 config NETFILTER_XTABLES
 	tristate "Netfilter Xtables support (required for ip_tables)"
 	default m if NETFILTER_ADVANCED=n
@@ -720,10 +706,10 @@ config NETFILTER_XT_TARGET_TEE
 	this clone be rerouted to another nexthop.
 
 config NETFILTER_XT_TARGET_TPROXY
-	tristate '"TPROXY" target support'
-	depends on NETFILTER_TPROXY
+	tristate '"TPROXY" target transparent proxying support'
 	depends on NETFILTER_XTABLES
 	depends on NETFILTER_ADVANCED
+	depends on IP_NF_MANGLE
 	select NF_DEFRAG_IPV4
 	select NF_DEFRAG_IPV6 if IP6_NF_IPTABLES
 	help
@@ -731,6 +717,9 @@ config NETFILTER_XT_TARGET_TPROXY
 	  REDIRECT.  It can only be used in the mangle table and is useful
 	  to redirect traffic to a transparent proxy.  It does _not_ depend
 	  on Netfilter connection tracking and NAT, unlike REDIRECT.
+	  For it to work you will have to configure certain iptables rules
+	  and use policy routing. For more information on how to set it up
+	  see Documentation/networking/tproxy.txt.
 
 	  To compile it as a module, choose M here.  If unsure, say N.
 
@@ -1180,7 +1169,6 @@ config NETFILTER_XT_MATCH_SCTP
 
 config NETFILTER_XT_MATCH_SOCKET
 	tristate '"socket" match support'
-	depends on NETFILTER_TPROXY
 	depends on NETFILTER_XTABLES
 	depends on NETFILTER_ADVANCED
 	depends on !NF_CONNTRACK || NF_CONNTRACK
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index a1abf87..ebfa7dc 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -61,9 +61,6 @@ obj-$(CONFIG_NF_NAT_IRC) += nf_nat_irc.o
 obj-$(CONFIG_NF_NAT_SIP) += nf_nat_sip.o
 obj-$(CONFIG_NF_NAT_TFTP) += nf_nat_tftp.o
 
-# transparent proxy support
-obj-$(CONFIG_NETFILTER_TPROXY) += nf_tproxy_core.o
-
 # generic X tables 
 obj-$(CONFIG_NETFILTER_XTABLES) += x_tables.o xt_tcpudp.o
 
diff --git a/net/netfilter/nf_tproxy_core.c b/net/netfilter/nf_tproxy_core.c
deleted file mode 100644
index 474d621..0000000
--- a/net/netfilter/nf_tproxy_core.c
+++ /dev/null
@@ -1,62 +0,0 @@
-/*
- * Transparent proxy support for Linux/iptables
- *
- * Copyright (c) 2006-2007 BalaBit IT Ltd.
- * Author: Balazs Scheidler, Krisztian Kovacs
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- */
-
-#include <linux/module.h>
-
-#include <linux/net.h>
-#include <linux/if.h>
-#include <linux/netdevice.h>
-#include <net/udp.h>
-#include <net/netfilter/nf_tproxy_core.h>
-
-
-static void
-nf_tproxy_destructor(struct sk_buff *skb)
-{
-	struct sock *sk = skb->sk;
-
-	skb->sk = NULL;
-	skb->destructor = NULL;
-
-	if (sk)
-		sock_put(sk);
-}
-
-/* consumes sk */
-void
-nf_tproxy_assign_sock(struct sk_buff *skb, struct sock *sk)
-{
-	/* assigning tw sockets complicates things; most
-	 * skb->sk->X checks would have to test sk->sk_state first */
-	if (sk->sk_state == TCP_TIME_WAIT) {
-		inet_twsk_put(inet_twsk(sk));
-		return;
-	}
-
-	skb_orphan(skb);
-	skb->sk = sk;
-	skb->destructor = nf_tproxy_destructor;
-}
-EXPORT_SYMBOL_GPL(nf_tproxy_assign_sock);
-
-static int __init nf_tproxy_init(void)
-{
-	pr_info("NF_TPROXY: Transparent proxy support initialized, version 4.1.0\n");
-	pr_info("NF_TPROXY: Copyright (c) 2006-2007 BalaBit IT Ltd.\n");
-	return 0;
-}
-
-module_init(nf_tproxy_init);
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Krisztian Kovacs");
-MODULE_DESCRIPTION("Transparent proxy support core routines");
diff --git a/net/netfilter/xt_TPROXY.c b/net/netfilter/xt_TPROXY.c
index d7f1953..17c40de 100644
--- a/net/netfilter/xt_TPROXY.c
+++ b/net/netfilter/xt_TPROXY.c
@@ -117,6 +117,15 @@ tproxy_handle_time_wait4(struct sk_buff *skb, __be32 laddr, __be16 lport,
 	return sk;
 }
 
+/* assign a socket to the skb -- consumes sk */
+static void
+nf_tproxy_assign_sock(struct sk_buff *skb, struct sock *sk)
+{
+	skb_orphan(skb);
+	skb->sk = sk;
+	skb->destructor = sock_edemux;
+}
+
 static unsigned int
 tproxy_tg4(struct sk_buff *skb, __be32 laddr, __be16 lport,
 	   u_int32_t mark_mask, u_int32_t mark_value)
-- 
1.7.8.6


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/7] netfilter: tproxy: remove nf_tproxy_core.h
  2013-07-29 13:41 [PATCH 0/7 -next] rm tproxy_core, ct event redelivery via workqueue Florian Westphal
                   ` (2 preceding siblings ...)
  2013-07-29 13:41 ` [PATCH 3/7] netfilter: tproxy: remove nf_tproxy_core, keep tw sk assigned to skb Florian Westphal
@ 2013-07-29 13:41 ` Florian Westphal
  2013-07-31 16:58   ` Pablo Neira Ayuso
  2013-07-29 13:41 ` [PATCH 5/7] netfilter: conntrack: remove duplicate code in conntrack_netlink Florian Westphal
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Florian Westphal @ 2013-07-29 13:41 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

We've removed nf_tproxy_core.ko, so also remove its header.
The lookup helpers are split and then moved to tproxy target/socket match.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/nf_tproxy_core.h |  206 --------------------------------
 net/netfilter/xt_TPROXY.c              |  158 ++++++++++++++++++++++++-
 net/netfilter/xt_socket.c              |   66 ++++++++++-
 3 files changed, 218 insertions(+), 212 deletions(-)
 delete mode 100644 include/net/netfilter/nf_tproxy_core.h

diff --git a/include/net/netfilter/nf_tproxy_core.h b/include/net/netfilter/nf_tproxy_core.h
deleted file mode 100644
index 975ffa4..0000000
--- a/include/net/netfilter/nf_tproxy_core.h
+++ /dev/null
@@ -1,206 +0,0 @@
-#ifndef _NF_TPROXY_CORE_H
-#define _NF_TPROXY_CORE_H
-
-#include <linux/types.h>
-#include <linux/in.h>
-#include <linux/skbuff.h>
-#include <net/sock.h>
-#include <net/inet_hashtables.h>
-#include <net/inet6_hashtables.h>
-#include <net/tcp.h>
-
-#define NFT_LOOKUP_ANY         0
-#define NFT_LOOKUP_LISTENER    1
-#define NFT_LOOKUP_ESTABLISHED 2
-
-/* look up and get a reference to a matching socket */
-
-
-/* This function is used by the 'TPROXY' target and the 'socket'
- * match. The following lookups are supported:
- *
- * Explicit TProxy target rule
- * ===========================
- *
- * This is used when the user wants to intercept a connection matching
- * an explicit iptables rule. In this case the sockets are assumed
- * matching in preference order:
- *
- *   - match: if there's a fully established connection matching the
- *     _packet_ tuple, it is returned, assuming the redirection
- *     already took place and we process a packet belonging to an
- *     established connection
- *
- *   - match: if there's a listening socket matching the redirection
- *     (e.g. on-port & on-ip of the connection), it is returned,
- *     regardless if it was bound to 0.0.0.0 or an explicit
- *     address. The reasoning is that if there's an explicit rule, it
- *     does not really matter if the listener is bound to an interface
- *     or to 0. The user already stated that he wants redirection
- *     (since he added the rule).
- *
- * "socket" match based redirection (no specific rule)
- * ===================================================
- *
- * There are connections with dynamic endpoints (e.g. FTP data
- * connection) that the user is unable to add explicit rules
- * for. These are taken care of by a generic "socket" rule. It is
- * assumed that the proxy application is trusted to open such
- * connections without explicit iptables rule (except of course the
- * generic 'socket' rule). In this case the following sockets are
- * matched in preference order:
- *
- *   - match: if there's a fully established connection matching the
- *     _packet_ tuple
- *
- *   - match: if there's a non-zero bound listener (possibly with a
- *     non-local address) We don't accept zero-bound listeners, since
- *     then local services could intercept traffic going through the
- *     box.
- *
- * Please note that there's an overlap between what a TPROXY target
- * and a socket match will match. Normally if you have both rules the
- * "socket" match will be the first one, effectively all packets
- * belonging to established connections going through that one.
- */
-static inline struct sock *
-nf_tproxy_get_sock_v4(struct net *net, const u8 protocol,
-		      const __be32 saddr, const __be32 daddr,
-		      const __be16 sport, const __be16 dport,
-		      const struct net_device *in, int lookup_type)
-{
-	struct sock *sk;
-
-	/* look up socket */
-	switch (protocol) {
-	case IPPROTO_TCP:
-		switch (lookup_type) {
-		case NFT_LOOKUP_ANY:
-			sk = __inet_lookup(net, &tcp_hashinfo,
-					   saddr, sport, daddr, dport,
-					   in->ifindex);
-			break;
-		case NFT_LOOKUP_LISTENER:
-			sk = inet_lookup_listener(net, &tcp_hashinfo,
-						    saddr, sport,
-						    daddr, dport,
-						    in->ifindex);
-
-			/* NOTE: we return listeners even if bound to
-			 * 0.0.0.0, those are filtered out in
-			 * xt_socket, since xt_TPROXY needs 0 bound
-			 * listeners too */
-
-			break;
-		case NFT_LOOKUP_ESTABLISHED:
-			sk = inet_lookup_established(net, &tcp_hashinfo,
-						    saddr, sport, daddr, dport,
-						    in->ifindex);
-			break;
-		default:
-			WARN_ON(1);
-			sk = NULL;
-			break;
-		}
-		break;
-	case IPPROTO_UDP:
-		sk = udp4_lib_lookup(net, saddr, sport, daddr, dport,
-				     in->ifindex);
-		if (sk && lookup_type != NFT_LOOKUP_ANY) {
-			int connected = (sk->sk_state == TCP_ESTABLISHED);
-			int wildcard = (inet_sk(sk)->inet_rcv_saddr == 0);
-
-			/* NOTE: we return listeners even if bound to
-			 * 0.0.0.0, those are filtered out in
-			 * xt_socket, since xt_TPROXY needs 0 bound
-			 * listeners too */
-			if ((lookup_type == NFT_LOOKUP_ESTABLISHED && (!connected || wildcard)) ||
-			    (lookup_type == NFT_LOOKUP_LISTENER && connected)) {
-				sock_put(sk);
-				sk = NULL;
-			}
-		}
-		break;
-	default:
-		WARN_ON(1);
-		sk = NULL;
-	}
-
-	pr_debug("tproxy socket lookup: proto %u %08x:%u -> %08x:%u, lookup type: %d, sock %p\n",
-		 protocol, ntohl(saddr), ntohs(sport), ntohl(daddr), ntohs(dport), lookup_type, sk);
-
-	return sk;
-}
-
-#if IS_ENABLED(CONFIG_IPV6)
-static inline struct sock *
-nf_tproxy_get_sock_v6(struct net *net, const u8 protocol,
-		      const struct in6_addr *saddr, const struct in6_addr *daddr,
-		      const __be16 sport, const __be16 dport,
-		      const struct net_device *in, int lookup_type)
-{
-	struct sock *sk;
-
-	/* look up socket */
-	switch (protocol) {
-	case IPPROTO_TCP:
-		switch (lookup_type) {
-		case NFT_LOOKUP_ANY:
-			sk = inet6_lookup(net, &tcp_hashinfo,
-					  saddr, sport, daddr, dport,
-					  in->ifindex);
-			break;
-		case NFT_LOOKUP_LISTENER:
-			sk = inet6_lookup_listener(net, &tcp_hashinfo,
-						   saddr, sport,
-						   daddr, ntohs(dport),
-						   in->ifindex);
-
-			/* NOTE: we return listeners even if bound to
-			 * 0.0.0.0, those are filtered out in
-			 * xt_socket, since xt_TPROXY needs 0 bound
-			 * listeners too */
-
-			break;
-		case NFT_LOOKUP_ESTABLISHED:
-			sk = __inet6_lookup_established(net, &tcp_hashinfo,
-							saddr, sport, daddr, ntohs(dport),
-							in->ifindex);
-			break;
-		default:
-			WARN_ON(1);
-			sk = NULL;
-			break;
-		}
-		break;
-	case IPPROTO_UDP:
-		sk = udp6_lib_lookup(net, saddr, sport, daddr, dport,
-				     in->ifindex);
-		if (sk && lookup_type != NFT_LOOKUP_ANY) {
-			int connected = (sk->sk_state == TCP_ESTABLISHED);
-			int wildcard = ipv6_addr_any(&inet6_sk(sk)->rcv_saddr);
-
-			/* NOTE: we return listeners even if bound to
-			 * 0.0.0.0, those are filtered out in
-			 * xt_socket, since xt_TPROXY needs 0 bound
-			 * listeners too */
-			if ((lookup_type == NFT_LOOKUP_ESTABLISHED && (!connected || wildcard)) ||
-			    (lookup_type == NFT_LOOKUP_LISTENER && connected)) {
-				sock_put(sk);
-				sk = NULL;
-			}
-		}
-		break;
-	default:
-		WARN_ON(1);
-		sk = NULL;
-	}
-
-	pr_debug("tproxy socket lookup: proto %u %pI6:%u -> %pI6:%u, lookup type: %d, sock %p\n",
-		 protocol, saddr, ntohs(sport), daddr, ntohs(dport), lookup_type, sk);
-
-	return sk;
-}
-#endif
-
-#endif
diff --git a/net/netfilter/xt_TPROXY.c b/net/netfilter/xt_TPROXY.c
index 17c40de..a6f844e 100644
--- a/net/netfilter/xt_TPROXY.c
+++ b/net/netfilter/xt_TPROXY.c
@@ -15,7 +15,9 @@
 #include <linux/ip.h>
 #include <net/checksum.h>
 #include <net/udp.h>
+#include <net/tcp.h>
 #include <net/inet_sock.h>
+#include <net/inet_hashtables.h>
 #include <linux/inetdevice.h>
 #include <linux/netfilter/x_tables.h>
 #include <linux/netfilter_ipv4/ip_tables.h>
@@ -26,13 +28,18 @@
 #define XT_TPROXY_HAVE_IPV6 1
 #include <net/if_inet6.h>
 #include <net/addrconf.h>
+#include <net/inet6_hashtables.h>
 #include <linux/netfilter_ipv6/ip6_tables.h>
 #include <net/netfilter/ipv6/nf_defrag_ipv6.h>
 #endif
 
-#include <net/netfilter/nf_tproxy_core.h>
 #include <linux/netfilter/xt_TPROXY.h>
 
+enum nf_tproxy_lookup_t {
+	 NFT_LOOKUP_LISTENER,
+	 NFT_LOOKUP_ESTABLISHED,
+};
+
 static bool tproxy_sk_is_transparent(struct sock *sk)
 {
 	if (sk->sk_state != TCP_TIME_WAIT) {
@@ -68,6 +75,155 @@ tproxy_laddr4(struct sk_buff *skb, __be32 user_laddr, __be32 daddr)
 	return laddr ? laddr : daddr;
 }
 
+/*
+ * This is used when the user wants to intercept a connection matching
+ * an explicit iptables rule. In this case the sockets are assumed
+ * matching in preference order:
+ *
+ *   - match: if there's a fully established connection matching the
+ *     _packet_ tuple, it is returned, assuming the redirection
+ *     already took place and we process a packet belonging to an
+ *     established connection
+ *
+ *   - match: if there's a listening socket matching the redirection
+ *     (e.g. on-port & on-ip of the connection), it is returned,
+ *     regardless if it was bound to 0.0.0.0 or an explicit
+ *     address. The reasoning is that if there's an explicit rule, it
+ *     does not really matter if the listener is bound to an interface
+ *     or to 0. The user already stated that he wants redirection
+ *     (since he added the rule).
+ *
+ * Please note that there's an overlap between what a TPROXY target
+ * and a socket match will match. Normally if you have both rules the
+ * "socket" match will be the first one, effectively all packets
+ * belonging to established connections going through that one.
+ */
+static inline struct sock *
+nf_tproxy_get_sock_v4(struct net *net, const u8 protocol,
+		      const __be32 saddr, const __be32 daddr,
+		      const __be16 sport, const __be16 dport,
+		      const struct net_device *in,
+		      const enum nf_tproxy_lookup_t lookup_type)
+{
+	struct sock *sk;
+
+	switch (protocol) {
+	case IPPROTO_TCP:
+		switch (lookup_type) {
+		case NFT_LOOKUP_LISTENER:
+			sk = inet_lookup_listener(net, &tcp_hashinfo,
+						    saddr, sport,
+						    daddr, dport,
+						    in->ifindex);
+
+			/* NOTE: we return listeners even if bound to
+			 * 0.0.0.0, those are filtered out in
+			 * xt_socket, since xt_TPROXY needs 0 bound
+			 * listeners too */
+
+			break;
+		case NFT_LOOKUP_ESTABLISHED:
+			sk = inet_lookup_established(net, &tcp_hashinfo,
+						    saddr, sport, daddr, dport,
+						    in->ifindex);
+			break;
+		default:
+			BUG();
+		}
+		break;
+	case IPPROTO_UDP:
+		sk = udp4_lib_lookup(net, saddr, sport, daddr, dport,
+				     in->ifindex);
+		if (sk) {
+			int connected = (sk->sk_state == TCP_ESTABLISHED);
+			int wildcard = (inet_sk(sk)->inet_rcv_saddr == 0);
+
+			/* NOTE: we return listeners even if bound to
+			 * 0.0.0.0, those are filtered out in
+			 * xt_socket, since xt_TPROXY needs 0 bound
+			 * listeners too */
+			if ((lookup_type == NFT_LOOKUP_ESTABLISHED && (!connected || wildcard)) ||
+			    (lookup_type == NFT_LOOKUP_LISTENER && connected)) {
+				sock_put(sk);
+				sk = NULL;
+			}
+		}
+		break;
+	default:
+		WARN_ON(1);
+		sk = NULL;
+	}
+
+	pr_debug("tproxy socket lookup: proto %u %08x:%u -> %08x:%u, lookup type: %d, sock %p\n",
+		 protocol, ntohl(saddr), ntohs(sport), ntohl(daddr), ntohs(dport), lookup_type, sk);
+
+	return sk;
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+static inline struct sock *
+nf_tproxy_get_sock_v6(struct net *net, const u8 protocol,
+		      const struct in6_addr *saddr, const struct in6_addr *daddr,
+		      const __be16 sport, const __be16 dport,
+		      const struct net_device *in,
+		      const enum nf_tproxy_lookup_t lookup_type)
+{
+	struct sock *sk;
+
+	switch (protocol) {
+	case IPPROTO_TCP:
+		switch (lookup_type) {
+		case NFT_LOOKUP_LISTENER:
+			sk = inet6_lookup_listener(net, &tcp_hashinfo,
+						   saddr, sport,
+						   daddr, ntohs(dport),
+						   in->ifindex);
+
+			/* NOTE: we return listeners even if bound to
+			 * 0.0.0.0, those are filtered out in
+			 * xt_socket, since xt_TPROXY needs 0 bound
+			 * listeners too */
+
+			break;
+		case NFT_LOOKUP_ESTABLISHED:
+			sk = __inet6_lookup_established(net, &tcp_hashinfo,
+							saddr, sport, daddr, ntohs(dport),
+							in->ifindex);
+			break;
+		default:
+			BUG();
+		}
+		break;
+	case IPPROTO_UDP:
+		sk = udp6_lib_lookup(net, saddr, sport, daddr, dport,
+				     in->ifindex);
+		if (sk) {
+			int connected = (sk->sk_state == TCP_ESTABLISHED);
+			int wildcard = ipv6_addr_any(&inet6_sk(sk)->rcv_saddr);
+
+			/* NOTE: we return listeners even if bound to
+			 * 0.0.0.0, those are filtered out in
+			 * xt_socket, since xt_TPROXY needs 0 bound
+			 * listeners too */
+			if ((lookup_type == NFT_LOOKUP_ESTABLISHED && (!connected || wildcard)) ||
+			    (lookup_type == NFT_LOOKUP_LISTENER && connected)) {
+				sock_put(sk);
+				sk = NULL;
+			}
+		}
+		break;
+	default:
+		WARN_ON(1);
+		sk = NULL;
+	}
+
+	pr_debug("tproxy socket lookup: proto %u %pI6:%u -> %pI6:%u, lookup type: %d, sock %p\n",
+		 protocol, saddr, ntohs(sport), daddr, ntohs(dport), lookup_type, sk);
+
+	return sk;
+}
+#endif
+
 /**
  * tproxy_handle_time_wait4 - handle IPv4 TCP TIME_WAIT reopen redirections
  * @skb:	The skb being processed.
diff --git a/net/netfilter/xt_socket.c b/net/netfilter/xt_socket.c
index f8b7191..a7dd108 100644
--- a/net/netfilter/xt_socket.c
+++ b/net/netfilter/xt_socket.c
@@ -19,12 +19,12 @@
 #include <net/icmp.h>
 #include <net/sock.h>
 #include <net/inet_sock.h>
-#include <net/netfilter/nf_tproxy_core.h>
 #include <net/netfilter/ipv4/nf_defrag_ipv4.h>
 
 #if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
 #define XT_SOCKET_HAVE_IPV6 1
 #include <linux/netfilter_ipv6/ip6_tables.h>
+#include <net/inet6_hashtables.h>
 #include <net/netfilter/ipv6/nf_defrag_ipv6.h>
 #endif
 
@@ -101,6 +101,43 @@ extract_icmp4_fields(const struct sk_buff *skb,
 	return 0;
 }
 
+/* "socket" match based redirection (no specific rule)
+ * ===================================================
+ *
+ * There are connections with dynamic endpoints (e.g. FTP data
+ * connection) that the user is unable to add explicit rules
+ * for. These are taken care of by a generic "socket" rule. It is
+ * assumed that the proxy application is trusted to open such
+ * connections without explicit iptables rule (except of course the
+ * generic 'socket' rule). In this case the following sockets are
+ * matched in preference order:
+ *
+ *   - match: if there's a fully established connection matching the
+ *     _packet_ tuple
+ *
+ *   - match: if there's a non-zero bound listener (possibly with a
+ *     non-local address) We don't accept zero-bound listeners, since
+ *     then local services could intercept traffic going through the
+ *     box.
+ */
+static struct sock *
+xt_socket_get_sock_v4(struct net *net, const u8 protocol,
+		      const __be32 saddr, const __be32 daddr,
+		      const __be16 sport, const __be16 dport,
+		      const struct net_device *in)
+{
+	switch (protocol) {
+	case IPPROTO_TCP:
+		return __inet_lookup(net, &tcp_hashinfo,
+				     saddr, sport, daddr, dport,
+				     in->ifindex);
+	case IPPROTO_UDP:
+		return udp4_lib_lookup(net, saddr, sport, daddr, dport,
+				       in->ifindex);
+	}
+	return NULL;
+}
+
 static bool
 socket_match(const struct sk_buff *skb, struct xt_action_param *par,
 	     const struct xt_socket_mtinfo1 *info)
@@ -156,9 +193,9 @@ socket_match(const struct sk_buff *skb, struct xt_action_param *par,
 #endif
 
 	if (!sk)
-		sk = nf_tproxy_get_sock_v4(dev_net(skb->dev), protocol,
+		sk = xt_socket_get_sock_v4(dev_net(skb->dev), protocol,
 					   saddr, daddr, sport, dport,
-					   par->in, NFT_LOOKUP_ANY);
+					   par->in);
 	if (sk) {
 		bool wildcard;
 		bool transparent = true;
@@ -261,6 +298,25 @@ extract_icmp6_fields(const struct sk_buff *skb,
 	return 0;
 }
 
+static struct sock *
+xt_socket_get_sock_v6(struct net *net, const u8 protocol,
+		      const struct in6_addr *saddr, const struct in6_addr *daddr,
+		      const __be16 sport, const __be16 dport,
+		      const struct net_device *in)
+{
+	switch (protocol) {
+	case IPPROTO_TCP:
+		return inet6_lookup(net, &tcp_hashinfo,
+				    saddr, sport, daddr, dport,
+				    in->ifindex);
+	case IPPROTO_UDP:
+		return udp6_lib_lookup(net, saddr, sport, daddr, dport,
+				       in->ifindex);
+	}
+
+	return NULL;
+}
+
 static bool
 socket_mt6_v1_v2(const struct sk_buff *skb, struct xt_action_param *par)
 {
@@ -298,9 +354,9 @@ socket_mt6_v1_v2(const struct sk_buff *skb, struct xt_action_param *par)
 	}
 
 	if (!sk)
-		sk = nf_tproxy_get_sock_v6(dev_net(skb->dev), tproto,
+		sk = xt_socket_get_sock_v6(dev_net(skb->dev), tproto,
 					   saddr, daddr, sport, dport,
-					   par->in, NFT_LOOKUP_ANY);
+					   par->in);
 	if (sk) {
 		bool wildcard;
 		bool transparent = true;
-- 
1.7.8.6


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 5/7] netfilter: conntrack: remove duplicate code in conntrack_netlink
  2013-07-29 13:41 [PATCH 0/7 -next] rm tproxy_core, ct event redelivery via workqueue Florian Westphal
                   ` (3 preceding siblings ...)
  2013-07-29 13:41 ` [PATCH 4/7] netfilter: tproxy: remove nf_tproxy_core.h Florian Westphal
@ 2013-07-29 13:41 ` Florian Westphal
  2013-07-31 16:58   ` Pablo Neira Ayuso
  2013-07-29 13:41 ` [PATCH 6/7] netfilter: conntrack: don't send destroy events from iterator Florian Westphal
  2013-07-29 13:41 ` [PATCH 7/7] netfilter: conntrack: remove timer from ecache extension Florian Westphal
  6 siblings, 1 reply; 16+ messages in thread
From: Florian Westphal @ 2013-07-29 13:41 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

ctnetlink contains copy-paste code from death_by_timeout.  In order to
avoid changing both places in upcoming event delivery patch,
export death_by_timeout functionality and use it in the ctnetlink code.

Based on earlier patch from Pablo Neira.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/nf_conntrack.h |    3 +--
 net/netfilter/nf_conntrack_core.c    |   29 ++++++++++++++++-------------
 net/netfilter/nf_conntrack_netlink.c |   18 +++---------------
 3 files changed, 20 insertions(+), 30 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index 644d9c2..939aced 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -181,8 +181,7 @@ __nf_conntrack_find(struct net *net, u16 zone,
 		    const struct nf_conntrack_tuple *tuple);
 
 extern int nf_conntrack_hash_check_insert(struct nf_conn *ct);
-extern void nf_ct_delete_from_lists(struct nf_conn *ct);
-extern void nf_ct_dying_timeout(struct nf_conn *ct);
+bool nf_ct_delete(struct nf_conn *ct, u32 pid, int report);
 
 extern void nf_conntrack_flush_report(struct net *net, u32 portid, int report);
 
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 0283bae..5759358 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -238,7 +238,7 @@ destroy_conntrack(struct nf_conntrack *nfct)
 	nf_conntrack_free(ct);
 }
 
-void nf_ct_delete_from_lists(struct nf_conn *ct)
+static void nf_ct_delete_from_lists(struct nf_conn *ct)
 {
 	struct net *net = nf_ct_net(ct);
 
@@ -253,7 +253,6 @@ void nf_ct_delete_from_lists(struct nf_conn *ct)
 			     &net->ct.dying);
 	spin_unlock_bh(&nf_conntrack_lock);
 }
-EXPORT_SYMBOL_GPL(nf_ct_delete_from_lists);
 
 static void death_by_event(unsigned long ul_conntrack)
 {
@@ -275,7 +274,7 @@ static void death_by_event(unsigned long ul_conntrack)
 	nf_ct_put(ct);
 }
 
-void nf_ct_dying_timeout(struct nf_conn *ct)
+static void nf_ct_dying_timeout(struct nf_conn *ct)
 {
 	struct net *net = nf_ct_net(ct);
 	struct nf_conntrack_ecache *ecache = nf_ct_ecache_find(ct);
@@ -288,27 +287,33 @@ void nf_ct_dying_timeout(struct nf_conn *ct)
 		(prandom_u32() % net->ct.sysctl_events_retry_timeout);
 	add_timer(&ecache->timeout);
 }
-EXPORT_SYMBOL_GPL(nf_ct_dying_timeout);
 
-static void death_by_timeout(unsigned long ul_conntrack)
+bool nf_ct_delete(struct nf_conn *ct, u32 portid, int report)
 {
-	struct nf_conn *ct = (void *)ul_conntrack;
 	struct nf_conn_tstamp *tstamp;
 
 	tstamp = nf_conn_tstamp_find(ct);
 	if (tstamp && tstamp->stop == 0)
 		tstamp->stop = ktime_to_ns(ktime_get_real());
 
-	if (!test_bit(IPS_DYING_BIT, &ct->status) &&
-	    unlikely(nf_conntrack_event(IPCT_DESTROY, ct) < 0)) {
+	if (!nf_ct_is_dying(ct) &&
+	    unlikely(nf_conntrack_event_report(IPCT_DESTROY, ct,
+	    portid, report) < 0)) {
 		/* destroy event was not delivered */
 		nf_ct_delete_from_lists(ct);
 		nf_ct_dying_timeout(ct);
-		return;
+		return false;
 	}
 	set_bit(IPS_DYING_BIT, &ct->status);
 	nf_ct_delete_from_lists(ct);
 	nf_ct_put(ct);
+	return true;
+}
+EXPORT_SYMBOL_GPL(nf_ct_delete);
+
+static void death_by_timeout(unsigned long ul_conntrack)
+{
+	nf_ct_delete((struct nf_conn *)ul_conntrack, 0, 0);
 }
 
 /*
@@ -643,10 +648,7 @@ static noinline int early_drop(struct net *net, unsigned int hash)
 		return dropped;
 
 	if (del_timer(&ct->timeout)) {
-		death_by_timeout((unsigned long)ct);
-		/* Check if we indeed killed this entry. Reliable event
-		   delivery may have inserted it into the dying list. */
-		if (test_bit(IPS_DYING_BIT, &ct->status)) {
+		if (nf_ct_delete(ct, 0, 0)) {
 			dropped = 1;
 			NF_CT_STAT_INC_ATOMIC(net, early_drop);
 		}
@@ -1253,6 +1255,7 @@ void nf_ct_iterate_cleanup(struct net *net,
 		/* Time to push up daises... */
 		if (del_timer(&ct->timeout))
 			death_by_timeout((unsigned long)ct);
+
 		/* ... else the timer will get him soon. */
 
 		nf_ct_put(ct);
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index f83a522..d26f306 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1038,21 +1038,9 @@ ctnetlink_del_conntrack(struct sock *ctnl, struct sk_buff *skb,
 		}
 	}
 
-	if (del_timer(&ct->timeout)) {
-		if (nf_conntrack_event_report(IPCT_DESTROY, ct,
-					      NETLINK_CB(skb).portid,
-					      nlmsg_report(nlh)) < 0) {
-			nf_ct_delete_from_lists(ct);
-			/* we failed to report the event, try later */
-			nf_ct_dying_timeout(ct);
-			nf_ct_put(ct);
-			return 0;
-		}
-		/* death_by_timeout would report the event again */
-		set_bit(IPS_DYING_BIT, &ct->status);
-		nf_ct_delete_from_lists(ct);
-		nf_ct_put(ct);
-	}
+	if (del_timer(&ct->timeout))
+		nf_ct_delete(ct, NETLINK_CB(skb).portid, nlmsg_report(nlh));
+
 	nf_ct_put(ct);
 
 	return 0;
-- 
1.7.8.6


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 6/7] netfilter: conntrack: don't send destroy events from iterator
  2013-07-29 13:41 [PATCH 0/7 -next] rm tproxy_core, ct event redelivery via workqueue Florian Westphal
                   ` (4 preceding siblings ...)
  2013-07-29 13:41 ` [PATCH 5/7] netfilter: conntrack: remove duplicate code in conntrack_netlink Florian Westphal
@ 2013-07-29 13:41 ` Florian Westphal
  2013-07-31 17:04   ` Pablo Neira Ayuso
  2013-07-29 13:41 ` [PATCH 7/7] netfilter: conntrack: remove timer from ecache extension Florian Westphal
  6 siblings, 1 reply; 16+ messages in thread
From: Florian Westphal @ 2013-07-29 13:41 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

Let nf_ct_delete handle delivery of the DESTROY event.

This means we now also no longer send such events for conntracks that
are still unconfirmed.

Based on earlier patch from Pablo Neira.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/nf_conntrack.h |    4 ++-
 net/ipv4/netfilter/ipt_MASQUERADE.c  |    2 +-
 net/ipv6/netfilter/ip6t_MASQUERADE.c |    2 +-
 net/netfilter/nf_conntrack_core.c    |   36 +++------------------------------
 net/netfilter/nf_conntrack_proto.c   |    4 +-
 net/netfilter/nf_nat_core.c          |    6 ++--
 6 files changed, 14 insertions(+), 40 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index 939aced..61767dc 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -248,7 +248,9 @@ extern void nf_ct_untracked_status_or(unsigned long bits);
 
 /* Iterate over all conntracks: if iter returns true, it's deleted. */
 extern void
-nf_ct_iterate_cleanup(struct net *net, int (*iter)(struct nf_conn *i, void *data), void *data);
+nf_ct_iterate_cleanup(struct net *net,
+		      int (*iter)(struct nf_conn *i, void *data),
+		      void *data, u32 portid, int report);
 extern void nf_conntrack_free(struct nf_conn *ct);
 extern struct nf_conn *
 nf_conntrack_alloc(struct net *net, u16 zone,
diff --git a/net/ipv4/netfilter/ipt_MASQUERADE.c b/net/ipv4/netfilter/ipt_MASQUERADE.c
index 30e4de9..00352ce 100644
--- a/net/ipv4/netfilter/ipt_MASQUERADE.c
+++ b/net/ipv4/netfilter/ipt_MASQUERADE.c
@@ -118,7 +118,7 @@ static int masq_device_event(struct notifier_block *this,
 		NF_CT_ASSERT(dev->ifindex != 0);
 
 		nf_ct_iterate_cleanup(net, device_cmp,
-				      (void *)(long)dev->ifindex);
+				      (void *)(long)dev->ifindex, 0, 0);
 	}
 
 	return NOTIFY_DONE;
diff --git a/net/ipv6/netfilter/ip6t_MASQUERADE.c b/net/ipv6/netfilter/ip6t_MASQUERADE.c
index 47bff61..3e4e92d 100644
--- a/net/ipv6/netfilter/ip6t_MASQUERADE.c
+++ b/net/ipv6/netfilter/ip6t_MASQUERADE.c
@@ -76,7 +76,7 @@ static int masq_device_event(struct notifier_block *this,
 
 	if (event == NETDEV_DOWN)
 		nf_ct_iterate_cleanup(net, device_cmp,
-				      (void *)(long)dev->ifindex);
+				      (void *)(long)dev->ifindex, 0, 0);
 
 	return NOTIFY_DONE;
 }
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 5759358..0161f83 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1246,7 +1246,7 @@ found:
 
 void nf_ct_iterate_cleanup(struct net *net,
 			   int (*iter)(struct nf_conn *i, void *data),
-			   void *data)
+			   void *data, u32 portid, int report)
 {
 	struct nf_conn *ct;
 	unsigned int bucket = 0;
@@ -1254,7 +1254,7 @@ void nf_ct_iterate_cleanup(struct net *net,
 	while ((ct = get_next_corpse(net, iter, data, &bucket)) != NULL) {
 		/* Time to push up daises... */
 		if (del_timer(&ct->timeout))
-			death_by_timeout((unsigned long)ct);
+			nf_ct_delete(ct, portid, report);
 
 		/* ... else the timer will get him soon. */
 
@@ -1263,30 +1263,6 @@ void nf_ct_iterate_cleanup(struct net *net,
 }
 EXPORT_SYMBOL_GPL(nf_ct_iterate_cleanup);
 
-struct __nf_ct_flush_report {
-	u32 portid;
-	int report;
-};
-
-static int kill_report(struct nf_conn *i, void *data)
-{
-	struct __nf_ct_flush_report *fr = (struct __nf_ct_flush_report *)data;
-	struct nf_conn_tstamp *tstamp;
-
-	tstamp = nf_conn_tstamp_find(i);
-	if (tstamp && tstamp->stop == 0)
-		tstamp->stop = ktime_to_ns(ktime_get_real());
-
-	/* If we fail to deliver the event, death_by_timeout() will retry */
-	if (nf_conntrack_event_report(IPCT_DESTROY, i,
-				      fr->portid, fr->report) < 0)
-		return 1;
-
-	/* Avoid the delivery of the destroy event in death_by_timeout(). */
-	set_bit(IPS_DYING_BIT, &i->status);
-	return 1;
-}
-
 static int kill_all(struct nf_conn *i, void *data)
 {
 	return 1;
@@ -1304,11 +1280,7 @@ EXPORT_SYMBOL_GPL(nf_ct_free_hashtable);
 
 void nf_conntrack_flush_report(struct net *net, u32 portid, int report)
 {
-	struct __nf_ct_flush_report fr = {
-		.portid	= portid,
-		.report = report,
-	};
-	nf_ct_iterate_cleanup(net, kill_report, &fr);
+	nf_ct_iterate_cleanup(net, kill_all, NULL, portid, report);
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_flush_report);
 
@@ -1389,7 +1361,7 @@ void nf_conntrack_cleanup_net_list(struct list_head *net_exit_list)
 i_see_dead_people:
 	busy = 0;
 	list_for_each_entry(net, net_exit_list, exit_list) {
-		nf_ct_iterate_cleanup(net, kill_all, NULL);
+		nf_ct_iterate_cleanup(net, kill_all, NULL, 0, 0);
 		nf_ct_release_dying_list(net);
 		if (atomic_read(&net->ct.count) != 0)
 			busy = 1;
diff --git a/net/netfilter/nf_conntrack_proto.c b/net/netfilter/nf_conntrack_proto.c
index 0ab9636..ce30041 100644
--- a/net/netfilter/nf_conntrack_proto.c
+++ b/net/netfilter/nf_conntrack_proto.c
@@ -281,7 +281,7 @@ void nf_ct_l3proto_pernet_unregister(struct net *net,
 	nf_ct_l3proto_unregister_sysctl(net, proto);
 
 	/* Remove all contrack entries for this protocol */
-	nf_ct_iterate_cleanup(net, kill_l3proto, proto);
+	nf_ct_iterate_cleanup(net, kill_l3proto, proto, 0, 0);
 }
 EXPORT_SYMBOL_GPL(nf_ct_l3proto_pernet_unregister);
 
@@ -476,7 +476,7 @@ void nf_ct_l4proto_pernet_unregister(struct net *net,
 	nf_ct_l4proto_unregister_sysctl(net, pn, l4proto);
 
 	/* Remove all contrack entries for this protocol */
-	nf_ct_iterate_cleanup(net, kill_l4proto, l4proto);
+	nf_ct_iterate_cleanup(net, kill_l4proto, l4proto, 0, 0);
 }
 EXPORT_SYMBOL_GPL(nf_ct_l4proto_pernet_unregister);
 
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index 038eee5..6ff8083 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -497,7 +497,7 @@ static void nf_nat_l4proto_clean(u8 l3proto, u8 l4proto)
 
 	rtnl_lock();
 	for_each_net(net)
-		nf_ct_iterate_cleanup(net, nf_nat_proto_remove, &clean);
+		nf_ct_iterate_cleanup(net, nf_nat_proto_remove, &clean, 0, 0);
 	rtnl_unlock();
 }
 
@@ -511,7 +511,7 @@ static void nf_nat_l3proto_clean(u8 l3proto)
 	rtnl_lock();
 
 	for_each_net(net)
-		nf_ct_iterate_cleanup(net, nf_nat_proto_remove, &clean);
+		nf_ct_iterate_cleanup(net, nf_nat_proto_remove, &clean, 0, 0);
 	rtnl_unlock();
 }
 
@@ -749,7 +749,7 @@ static void __net_exit nf_nat_net_exit(struct net *net)
 {
 	struct nf_nat_proto_clean clean = {};
 
-	nf_ct_iterate_cleanup(net, &nf_nat_proto_remove, &clean);
+	nf_ct_iterate_cleanup(net, &nf_nat_proto_remove, &clean, 0, 0);
 	synchronize_rcu();
 	nf_ct_free_hashtable(net->ct.nat_bysource, net->ct.nat_htable_size);
 }
-- 
1.7.8.6


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 7/7] netfilter: conntrack: remove timer from ecache extension
  2013-07-29 13:41 [PATCH 0/7 -next] rm tproxy_core, ct event redelivery via workqueue Florian Westphal
                   ` (5 preceding siblings ...)
  2013-07-29 13:41 ` [PATCH 6/7] netfilter: conntrack: don't send destroy events from iterator Florian Westphal
@ 2013-07-29 13:41 ` Florian Westphal
  6 siblings, 0 replies; 16+ messages in thread
From: Florian Westphal @ 2013-07-29 13:41 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

This brings the (per-conntrack) ecache extension back to 24 bytes in size
(was 112 byte on x86_64 with lockdep on).

When event delivery fails, re-delivery is attempted via work queue.
As long as the work queue has events to deliver, and at least one
delivery succeeded, it is rescheduled without delay,  if no
pending event was delivered after 0.1 seconds to avoid hogging cpu.

As the dying list also contains entries that do not need event
redelivery, a new status bit is added to identify these conntracks.

We cannot use !IPS_DYING_BIT, as entries whose event was already
sent can be recycled at any time due to SLAB_DESTROY_BY_RCU.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/nf_conntrack.h               |    7 ++
 include/net/netfilter/nf_conntrack_ecache.h        |    9 ++-
 include/net/netns/conntrack.h                      |    5 +-
 include/uapi/linux/netfilter/nf_conntrack_common.h |    8 ++-
 net/netfilter/nf_conntrack_core.c                  |   68 ++------------------
 net/netfilter/nf_conntrack_ecache.c                |   63 +++++++++++++++---
 6 files changed, 85 insertions(+), 75 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index 61767dc..9e56299 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -71,6 +71,13 @@ struct nf_conn_help {
 #include <net/netfilter/ipv4/nf_conntrack_ipv4.h>
 #include <net/netfilter/ipv6/nf_conntrack_ipv6.h>
 
+/*
+ * We need to use special "null" values, not used in hash table
+ */
+#define NFCT_UNCONFIRMED_NULLS_VAL	((1<<30)+0)
+#define NFCT_DYING_NULLS_VAL		((1<<30)+1)
+#define NFCT_TEMPLATE_NULLS_VAL		((1<<30)+2)
+
 struct nf_conn {
 	/* Usage count in here is 1 for hash table/destruct timer, 1 per skb,
            plus 1 for any connection(s) we are `master' for */
diff --git a/include/net/netfilter/nf_conntrack_ecache.h b/include/net/netfilter/nf_conntrack_ecache.h
index 092dc65..1435245 100644
--- a/include/net/netfilter/nf_conntrack_ecache.h
+++ b/include/net/netfilter/nf_conntrack_ecache.h
@@ -18,7 +18,6 @@ struct nf_conntrack_ecache {
 	u16 ctmask;		/* bitmask of ct events to be delivered */
 	u16 expmask;		/* bitmask of expect events to be delivered */
 	u32 portid;		/* netlink portid of destroyer */
-	struct timer_list timeout;
 };
 
 static inline struct nf_conntrack_ecache *
@@ -212,6 +211,12 @@ extern void nf_conntrack_ecache_pernet_fini(struct net *net);
 
 extern int nf_conntrack_ecache_init(void);
 extern void nf_conntrack_ecache_fini(void);
+
+static inline void nf_conntrack_ecache_work(struct net *net)
+{
+	if (!delayed_work_pending(&net->ct.ecache_dwork))
+		schedule_delayed_work(&net->ct.ecache_dwork, HZ);
+}
 #else /* CONFIG_NF_CONNTRACK_EVENTS */
 
 static inline void nf_conntrack_event_cache(enum ip_conntrack_events event,
@@ -251,6 +256,8 @@ static inline int nf_conntrack_ecache_init(void)
 static inline void nf_conntrack_ecache_fini(void)
 {
 }
+
+static inline void nf_conntrack_ecache_work(struct net *net) { }
 #endif /* CONFIG_NF_CONNTRACK_EVENTS */
 
 #endif /*_NF_CONNTRACK_ECACHE_H*/
diff --git a/include/net/netns/conntrack.h b/include/net/netns/conntrack.h
index c9c0c53..ae58be0 100644
--- a/include/net/netns/conntrack.h
+++ b/include/net/netns/conntrack.h
@@ -4,6 +4,7 @@
 #include <linux/list.h>
 #include <linux/list_nulls.h>
 #include <linux/atomic.h>
+#include <linux/workqueue.h>
 #include <linux/netfilter/nf_conntrack_tcp.h>
 
 struct ctl_table_header;
@@ -72,11 +73,13 @@ struct netns_ct {
 	struct hlist_nulls_head	unconfirmed;
 	struct hlist_nulls_head	dying;
 	struct hlist_nulls_head tmpl;
+#ifdef CONFIG_NF_CONNTRACK_EVENTS
+	struct delayed_work ecache_dwork;
+#endif
 	struct ip_conntrack_stat __percpu *stat;
 	struct nf_ct_event_notifier __rcu *nf_conntrack_event_cb;
 	struct nf_exp_event_notifier __rcu *nf_expect_event_cb;
 	int			sysctl_events;
-	unsigned int		sysctl_events_retry_timeout;
 	int			sysctl_acct;
 	int			sysctl_tstamp;
 	int			sysctl_checksum;
diff --git a/include/uapi/linux/netfilter/nf_conntrack_common.h b/include/uapi/linux/netfilter/nf_conntrack_common.h
index d69483f..4269a8b 100644
--- a/include/uapi/linux/netfilter/nf_conntrack_common.h
+++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
@@ -68,7 +68,9 @@ enum ip_conntrack_status {
 	/* Both together */
 	IPS_NAT_DONE_MASK = (IPS_DST_NAT_DONE | IPS_SRC_NAT_DONE),
 
-	/* Connection is dying (removed from lists), can not be unset. */
+	/* Connection is dying (removed from hash), DESTROY event delivered.
+	 * cannot be unset.
+	 */
 	IPS_DYING_BIT = 9,
 	IPS_DYING = (1 << IPS_DYING_BIT),
 
@@ -87,6 +89,10 @@ enum ip_conntrack_status {
 	/* Conntrack got a helper explicitly attached via CT target. */
 	IPS_HELPER_BIT = 13,
 	IPS_HELPER = (1 << IPS_HELPER_BIT),
+
+	/* Conntrack removed from hash, but ecache must re-deliver destroy event */
+	IPS_ECACHE_REDELIVER_BIT = 14,
+	IPS_ECACHE_REDELIVER = (1 << IPS_ECACHE_REDELIVER_BIT),
 };
 
 /* Connection tracking event types */
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 0161f83..4f3d496 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -249,45 +249,11 @@ static void nf_ct_delete_from_lists(struct nf_conn *ct)
 	NF_CT_STAT_INC(net, delete_list);
 	clean_from_lists(ct);
 	/* add this conntrack to the dying list */
-	hlist_nulls_add_head(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode,
+	hlist_nulls_add_head_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode,
 			     &net->ct.dying);
 	spin_unlock_bh(&nf_conntrack_lock);
 }
 
-static void death_by_event(unsigned long ul_conntrack)
-{
-	struct nf_conn *ct = (void *)ul_conntrack;
-	struct net *net = nf_ct_net(ct);
-	struct nf_conntrack_ecache *ecache = nf_ct_ecache_find(ct);
-
-	BUG_ON(ecache == NULL);
-
-	if (nf_conntrack_event(IPCT_DESTROY, ct) < 0) {
-		/* bad luck, let's retry again */
-		ecache->timeout.expires = jiffies +
-			(prandom_u32() % net->ct.sysctl_events_retry_timeout);
-		add_timer(&ecache->timeout);
-		return;
-	}
-	/* we've got the event delivered, now it's dying */
-	set_bit(IPS_DYING_BIT, &ct->status);
-	nf_ct_put(ct);
-}
-
-static void nf_ct_dying_timeout(struct nf_conn *ct)
-{
-	struct net *net = nf_ct_net(ct);
-	struct nf_conntrack_ecache *ecache = nf_ct_ecache_find(ct);
-
-	BUG_ON(ecache == NULL);
-
-	/* set a new timer to retry event delivery */
-	setup_timer(&ecache->timeout, death_by_event, (unsigned long)ct);
-	ecache->timeout.expires = jiffies +
-		(prandom_u32() % net->ct.sysctl_events_retry_timeout);
-	add_timer(&ecache->timeout);
-}
-
 bool nf_ct_delete(struct nf_conn *ct, u32 portid, int report)
 {
 	struct nf_conn_tstamp *tstamp;
@@ -301,7 +267,8 @@ bool nf_ct_delete(struct nf_conn *ct, u32 portid, int report)
 	    portid, report) < 0)) {
 		/* destroy event was not delivered */
 		nf_ct_delete_from_lists(ct);
-		nf_ct_dying_timeout(ct);
+		set_bit(IPS_ECACHE_REDELIVER_BIT, &ct->status);
+		nf_conntrack_ecache_work(nf_ct_net(ct));
 		return false;
 	}
 	set_bit(IPS_DYING_BIT, &ct->status);
@@ -1284,21 +1251,6 @@ void nf_conntrack_flush_report(struct net *net, u32 portid, int report)
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_flush_report);
 
-static void nf_ct_release_dying_list(struct net *net)
-{
-	struct nf_conntrack_tuple_hash *h;
-	struct nf_conn *ct;
-	struct hlist_nulls_node *n;
-
-	spin_lock_bh(&nf_conntrack_lock);
-	hlist_nulls_for_each_entry(h, n, &net->ct.dying, hnnode) {
-		ct = nf_ct_tuplehash_to_ctrack(h);
-		/* never fails to remove them, no listeners at this point */
-		nf_ct_kill(ct);
-	}
-	spin_unlock_bh(&nf_conntrack_lock);
-}
-
 static int untrack_refs(void)
 {
 	int cnt = 0, cpu;
@@ -1362,7 +1314,6 @@ i_see_dead_people:
 	busy = 0;
 	list_for_each_entry(net, net_exit_list, exit_list) {
 		nf_ct_iterate_cleanup(net, kill_all, NULL, 0, 0);
-		nf_ct_release_dying_list(net);
 		if (atomic_read(&net->ct.count) != 0)
 			busy = 1;
 	}
@@ -1582,21 +1533,14 @@ void nf_conntrack_init_end(void)
 	RCU_INIT_POINTER(nf_ct_nat_offset, NULL);
 }
 
-/*
- * We need to use special "null" values, not used in hash table
- */
-#define UNCONFIRMED_NULLS_VAL	((1<<30)+0)
-#define DYING_NULLS_VAL		((1<<30)+1)
-#define TEMPLATE_NULLS_VAL	((1<<30)+2)
-
 int nf_conntrack_init_net(struct net *net)
 {
 	int ret;
 
 	atomic_set(&net->ct.count, 0);
-	INIT_HLIST_NULLS_HEAD(&net->ct.unconfirmed, UNCONFIRMED_NULLS_VAL);
-	INIT_HLIST_NULLS_HEAD(&net->ct.dying, DYING_NULLS_VAL);
-	INIT_HLIST_NULLS_HEAD(&net->ct.tmpl, TEMPLATE_NULLS_VAL);
+	INIT_HLIST_NULLS_HEAD(&net->ct.unconfirmed, NFCT_UNCONFIRMED_NULLS_VAL);
+	INIT_HLIST_NULLS_HEAD(&net->ct.dying, NFCT_DYING_NULLS_VAL);
+	INIT_HLIST_NULLS_HEAD(&net->ct.tmpl, NFCT_TEMPLATE_NULLS_VAL);
 	net->ct.stat = alloc_percpu(struct ip_conntrack_stat);
 	if (!net->ct.stat) {
 		ret = -ENOMEM;
diff --git a/net/netfilter/nf_conntrack_ecache.c b/net/netfilter/nf_conntrack_ecache.c
index 1df1761..bdb9491 100644
--- a/net/netfilter/nf_conntrack_ecache.c
+++ b/net/netfilter/nf_conntrack_ecache.c
@@ -22,6 +22,7 @@
 #include <linux/netdevice.h>
 #include <linux/slab.h>
 #include <linux/export.h>
+#include <linux/rculist_nulls.h>
 
 #include <net/netfilter/nf_conntrack.h>
 #include <net/netfilter/nf_conntrack_core.h>
@@ -29,6 +30,56 @@
 
 static DEFINE_MUTEX(nf_ct_ecache_mutex);
 
+#define ECACHE_MAX_EVICTS 1000
+#define ECACHE_RETRY_WAIT (HZ/10) /* at most 10 retries/s when congested */
+
+static void ecache_work(struct work_struct *work)
+{
+	struct netns_ct *ctnet =
+		container_of(work, struct netns_ct, ecache_dwork.work);
+	struct nf_conntrack_tuple_hash *h;
+	struct hlist_nulls_node *n;
+	unsigned int evicted = 0, delay;
+
+	mutex_lock(&nf_ct_ecache_mutex);
+	rcu_read_lock();
+
+	hlist_nulls_for_each_entry_rcu(h, n, &ctnet->dying, hnnode) {
+		struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
+
+		if (!test_bit(IPS_ECACHE_REDELIVER_BIT, &ct->status) ||
+		    nf_ct_is_dying(ct))
+			continue;
+
+		if (nf_conntrack_event(IPCT_DESTROY, ct))
+			break;
+
+		/* we've got the event delivered, now it's dying */
+		set_bit(IPS_DYING_BIT, &ct->status);
+		nf_ct_put(ct);
+
+		if (++evicted >= ECACHE_MAX_EVICTS || need_resched())
+			break;
+	}
+
+	rcu_read_unlock();
+	mutex_unlock(&nf_ct_ecache_mutex);
+
+	if (is_a_nulls(n)) {
+		if (get_nulls_value(n) == NFCT_DYING_NULLS_VAL)
+			return; /* done, all events delivered */
+		/* else, found recycled element, restart */
+		delay = 0;
+	} else if (evicted) {
+		/* made some progress, restart */
+		delay = 0;
+	} else {
+		/* userspace is congested, back off */
+		delay = ECACHE_RETRY_WAIT;
+	}
+	schedule_delayed_work(&ctnet->ecache_dwork, delay);
+}
+
 /* deliver cached events and clear cache entry - must be called with locally
  * disabled softirqs */
 void nf_ct_deliver_cached_events(struct nf_conn *ct)
@@ -157,7 +208,6 @@ EXPORT_SYMBOL_GPL(nf_ct_expect_unregister_notifier);
 
 #define NF_CT_EVENTS_DEFAULT 1
 static int nf_ct_events __read_mostly = NF_CT_EVENTS_DEFAULT;
-static int nf_ct_events_retry_timeout __read_mostly = 15*HZ;
 
 #ifdef CONFIG_SYSCTL
 static struct ctl_table event_sysctl_table[] = {
@@ -168,13 +218,6 @@ static struct ctl_table event_sysctl_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
-	{
-		.procname	= "nf_conntrack_events_retry_timeout",
-		.data		= &init_net.ct.sysctl_events_retry_timeout,
-		.maxlen		= sizeof(unsigned int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_jiffies,
-	},
 	{}
 };
 #endif /* CONFIG_SYSCTL */
@@ -196,7 +239,6 @@ static int nf_conntrack_event_init_sysctl(struct net *net)
 		goto out;
 
 	table[0].data = &net->ct.sysctl_events;
-	table[1].data = &net->ct.sysctl_events_retry_timeout;
 
 	/* Don't export sysctls to unprivileged users */
 	if (net->user_ns != &init_user_ns)
@@ -238,12 +280,13 @@ static void nf_conntrack_event_fini_sysctl(struct net *net)
 int nf_conntrack_ecache_pernet_init(struct net *net)
 {
 	net->ct.sysctl_events = nf_ct_events;
-	net->ct.sysctl_events_retry_timeout = nf_ct_events_retry_timeout;
+	INIT_DELAYED_WORK(&net->ct.ecache_dwork, ecache_work);
 	return nf_conntrack_event_init_sysctl(net);
 }
 
 void nf_conntrack_ecache_pernet_fini(struct net *net)
 {
+	cancel_delayed_work_sync(&net->ct.ecache_dwork);
 	nf_conntrack_event_fini_sysctl(net);
 }
 
-- 
1.7.8.6


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/7] netfilter: connlabels: remove unneeded includes
  2013-07-29 13:41 ` [PATCH 1/7] netfilter: connlabels: remove unneeded includes Florian Westphal
@ 2013-07-31 16:56   ` Pablo Neira Ayuso
  0 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2013-07-31 16:56 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On Mon, Jul 29, 2013 at 03:41:50PM +0200, Florian Westphal wrote:
> leftovers from the (never merged) v1 patch.

Applied, thanks Florian.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/7] netfilter: nf_queue: relax NFQA_CT attribute check
  2013-07-29 13:41 ` [PATCH 2/7] netfilter: nf_queue: relax NFQA_CT attribute check Florian Westphal
@ 2013-07-31 16:56   ` Pablo Neira Ayuso
  0 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2013-07-31 16:56 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On Mon, Jul 29, 2013 at 03:41:51PM +0200, Florian Westphal wrote:
> Allow modifying attributes of the conntrack associated with a packet
> without first requesting ct data via CFG_F_CONNTRACK or extra
> nfnetlink_conntrack socket.

Applied, thanks.

> Also remove unneded rcu_read_lock; the entire function is already
> protected by rcu.

Indeed, that code runs on call_rcu from nfnetlink.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/7] netfilter: tproxy: remove nf_tproxy_core, keep tw sk assigned to skb
  2013-07-29 13:41 ` [PATCH 3/7] netfilter: tproxy: remove nf_tproxy_core, keep tw sk assigned to skb Florian Westphal
@ 2013-07-31 16:57   ` Pablo Neira Ayuso
  0 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2013-07-31 16:57 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On Mon, Jul 29, 2013 at 03:41:52PM +0200, Florian Westphal wrote:
> The module was "permanent", due to the special tproxy skb->destructor.
> Nowadays we have tcp early demux and its sock_edemux destructor in
> networking core which can be used instead.
> 
> Thanks to early demux changes the input path now also handles
> "skb->sk is tw socket" correctly, so this no longer needs the special
> handling introduced with commit d503b30bd648b3cb4e5f50b65d27e389960cc6d9
> (netfilter: tproxy: do not assign timewait sockets to skb->sk).
> 
> Thus:
> - move assign_sock function to where its needed
> - don't prevent timewait sockets from being assigned to the skb
> - remove nf_tproxy_core.

Applied, thanks Florian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/7] netfilter: tproxy: remove nf_tproxy_core.h
  2013-07-29 13:41 ` [PATCH 4/7] netfilter: tproxy: remove nf_tproxy_core.h Florian Westphal
@ 2013-07-31 16:58   ` Pablo Neira Ayuso
  0 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2013-07-31 16:58 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On Mon, Jul 29, 2013 at 03:41:53PM +0200, Florian Westphal wrote:
> We've removed nf_tproxy_core.ko, so also remove its header.
> The lookup helpers are split and then moved to tproxy target/socket match.

Applied, thanks. Minor comestic change: I have fixed some comments
from:

/*
 * ... */

to:

/*
 * ...
 */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/7] netfilter: conntrack: remove duplicate code in conntrack_netlink
  2013-07-29 13:41 ` [PATCH 5/7] netfilter: conntrack: remove duplicate code in conntrack_netlink Florian Westphal
@ 2013-07-31 16:58   ` Pablo Neira Ayuso
  0 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2013-07-31 16:58 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On Mon, Jul 29, 2013 at 03:41:54PM +0200, Florian Westphal wrote:
> ctnetlink contains copy-paste code from death_by_timeout.  In order to
> avoid changing both places in upcoming event delivery patch,
> export death_by_timeout functionality and use it in the ctnetlink code.
> 
> Based on earlier patch from Pablo Neira.

I like it. Applied, thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 6/7] netfilter: conntrack: don't send destroy events from iterator
  2013-07-29 13:41 ` [PATCH 6/7] netfilter: conntrack: don't send destroy events from iterator Florian Westphal
@ 2013-07-31 17:04   ` Pablo Neira Ayuso
  2013-07-31 20:43     ` Florian Westphal
  0 siblings, 1 reply; 16+ messages in thread
From: Pablo Neira Ayuso @ 2013-07-31 17:04 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On Mon, Jul 29, 2013 at 03:41:55PM +0200, Florian Westphal wrote:
> Let nf_ct_delete handle delivery of the DESTROY event.
> 
> This means we now also no longer send such events for conntracks that
> are still unconfirmed.

Not sure why this happens by looking at the patch. Are you refering to
conntrack with IPS_CONFIRMED unset?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 6/7] netfilter: conntrack: don't send destroy events from iterator
  2013-07-31 17:04   ` Pablo Neira Ayuso
@ 2013-07-31 20:43     ` Florian Westphal
  2013-08-09 10:08       ` Pablo Neira Ayuso
  0 siblings, 1 reply; 16+ messages in thread
From: Florian Westphal @ 2013-07-31 20:43 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Mon, Jul 29, 2013 at 03:41:55PM +0200, Florian Westphal wrote:
> > Let nf_ct_delete handle delivery of the DESTROY event.
> > 
> > This means we now also no longer send such events for conntracks that
> > are still unconfirmed.
> 
> Not sure why this happens by looking at the patch. Are you refering to
> conntrack with IPS_CONFIRMED unset?

Doh.  You are right of course.

get_next_corpse also iterates over the unconfirmed list, and ivokes
iter() for those (and iter is kill_report() which calls
nf_conntrack_event_report()).

But nf_conntrack_event_report() just returns in !IPS_CONFIRMED case.

Thanks for pointing it out.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 6/7] netfilter: conntrack: don't send destroy events from iterator
  2013-07-31 20:43     ` Florian Westphal
@ 2013-08-09 10:08       ` Pablo Neira Ayuso
  0 siblings, 0 replies; 16+ messages in thread
From: Pablo Neira Ayuso @ 2013-08-09 10:08 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On Wed, Jul 31, 2013 at 10:43:59PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > On Mon, Jul 29, 2013 at 03:41:55PM +0200, Florian Westphal wrote:
> > > Let nf_ct_delete handle delivery of the DESTROY event.
> > > 
> > > This means we now also no longer send such events for conntracks that
> > > are still unconfirmed.
> > 
> > Not sure why this happens by looking at the patch. Are you refering to
> > conntrack with IPS_CONFIRMED unset?
> 
> Doh.  You are right of course.
> 
> get_next_corpse also iterates over the unconfirmed list, and ivokes
> iter() for those (and iter is kill_report() which calls
> nf_conntrack_event_report()).
> 
> But nf_conntrack_event_report() just returns in !IPS_CONFIRMED case.
> 
> Thanks for pointing it out.

Removed that line from the description and applied this patch. Thanks Florian.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2013-08-09 10:08 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-29 13:41 [PATCH 0/7 -next] rm tproxy_core, ct event redelivery via workqueue Florian Westphal
2013-07-29 13:41 ` [PATCH 1/7] netfilter: connlabels: remove unneeded includes Florian Westphal
2013-07-31 16:56   ` Pablo Neira Ayuso
2013-07-29 13:41 ` [PATCH 2/7] netfilter: nf_queue: relax NFQA_CT attribute check Florian Westphal
2013-07-31 16:56   ` Pablo Neira Ayuso
2013-07-29 13:41 ` [PATCH 3/7] netfilter: tproxy: remove nf_tproxy_core, keep tw sk assigned to skb Florian Westphal
2013-07-31 16:57   ` Pablo Neira Ayuso
2013-07-29 13:41 ` [PATCH 4/7] netfilter: tproxy: remove nf_tproxy_core.h Florian Westphal
2013-07-31 16:58   ` Pablo Neira Ayuso
2013-07-29 13:41 ` [PATCH 5/7] netfilter: conntrack: remove duplicate code in conntrack_netlink Florian Westphal
2013-07-31 16:58   ` Pablo Neira Ayuso
2013-07-29 13:41 ` [PATCH 6/7] netfilter: conntrack: don't send destroy events from iterator Florian Westphal
2013-07-31 17:04   ` Pablo Neira Ayuso
2013-07-31 20:43     ` Florian Westphal
2013-08-09 10:08       ` Pablo Neira Ayuso
2013-07-29 13:41 ` [PATCH 7/7] netfilter: conntrack: remove timer from ecache extension Florian Westphal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.