* [RFC v3] Add TCP encap_rcv hook
@ 2012-04-12  7:42 Simon Horman
       [not found] ` <20120412074159.GA10866-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Horman @ 2012-04-12  7:42 UTC (permalink / raw)
  To: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

This hook is based on a hook of the same name provided by UDP.  It provides
a way to receive packets that have a TCP header and treat them in some
alternate way.

It is intended to be used by an implementation of the STT tunneling
protocol within Open vSwitch's datapath. A prototype of such an
implementation has been made.

The STT draft is available at
http://tools.ietf.org/html/draft-davie-stt-01

My prototype STT implementation has been posted to the
dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org list.
The first version can be found at:
http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg08877.html

Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>

---
 include/linux/tcp.h |    3 +++
 net/ipv4/tcp_ipv4.c |   23 ++++++++++++++++++++++-
 2 files changed, 25 insertions(+), 1 deletion(-)

v3
* First post to netdev
* Replace more UDP references with TCP
* Move socket accesses to inside socket lock
  and release lock on return.

v2
* Fix comment to refer to TCP rather than UDP
* Allow skb to continue traversing the stack if
  the encap_rcv callback returns a positive value.
  This is the same behaviour as the UDP hook.

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index b6c62d2..7210b23 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -472,6 +472,9 @@ struct tcp_sock {
 	 * contains related tcp_cookie_transactions fields.
 	 */
 	struct tcp_cookie_values  *cookie_values;
+
+	/* For encapsulation sockets. */
+	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
 };
 
 static inline struct tcp_sock *tcp_sk(const struct sock *sk)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 3a25cf7..9898f71 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1666,8 +1666,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	const struct iphdr *iph;
 	const struct tcphdr *th;
 	struct sock *sk;
+	struct tcp_sock *tp;
 	int ret;
 	struct net *net = dev_net(skb->dev);
+	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
 
 	if (skb->pkt_type != PACKET_HOST)
 		goto discard_it;
@@ -1726,9 +1728,27 @@ process:
 
 	bh_lock_sock_nested(sk);
 	ret = 0;
+
+	tp = tcp_sk(sk);
+	encap_rcv = ACCESS_ONCE(tp->encap_rcv);
+	if (encap_rcv != NULL) {
+		/*
+		 * This is an encapsulation socket so pass the skb to
+		 * the socket's encap_rcv() hook. Otherwise, just
+		 * fall through and pass this up to the TCP socket.
+		 * tp->encap_rcv() returns the following value:
+		 * <=0 if skb was successfully passed to the encap
+		 *     handler or was discarded by it.
+		 * >0 if skb should be passed on to TCP.
+		 */
+		if (encap_rcv(sk, skb) <= 0) {
+			ret = 0;
+			goto unlock_sock;
+		}
+	}
+
 	if (!sock_owned_by_user(sk)) {
 #ifdef CONFIG_NET_DMA
-		struct tcp_sock *tp = tcp_sk(sk);
 		if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
 			tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
 		if (tp->ucopy.dma_chan)
@@ -1744,6 +1764,7 @@ process:
 		NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
 		goto discard_and_relse;
 	}
+unlock_sock:
 	bh_unlock_sock(sk);
 
 	sock_put(sk);
-- 
1.7.9.5


* Re: [RFC v3] Add TCP encap_rcv hook
       [not found] ` <20120412074159.GA10866-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
@ 2012-04-12  8:20   ` Eric Dumazet
  2012-04-12  9:05     ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
  2012-04-12 13:10     ` [RFC v3] Add TCP encap_rcv hook Simon Horman
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2012-04-12  8:20 UTC (permalink / raw)
  To: Simon Horman; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

On Thu, 2012-04-12 at 16:42 +0900, Simon Horman wrote:
> This hook is based on a hook of the same name provided by UDP.  It provides
> a way to receive packets that have a TCP header and treat them in some
> alternate way.
> 
> It is intended to be used by an implementation of the STT tunneling
> protocol within Open vSwitch's datapath. A prototype of such an
> implementation has been made.
> 
> The STT draft is available at
> http://tools.ietf.org/html/draft-davie-stt-01
> 
> My prototype STT implementation has been posted to the dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org
> The first version can be found at:
> http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg08877.html
> 
> Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
> 

Hi Simon

Oh well, this is insane :(

> ---
>  include/linux/tcp.h |    3 +++
>  net/ipv4/tcp_ipv4.c |   23 ++++++++++++++++++++++-
>  2 files changed, 25 insertions(+), 1 deletion(-)
> 
> v3
> * First post to netdev
> * Replace more UDP references with TCP
> * Move socket accesses to inside socket lock
>   and release lock on return.
> 
> v2
> * Fix comment to refer to TCP rather than UDP
> * Allow skb to continue traversing the stack if
>   the encap_rcv callback returns a positive value.
>   This is the same behaviour as the UDP hook.
> 
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index b6c62d2..7210b23 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -472,6 +472,9 @@ struct tcp_sock {
>  	 * contains related tcp_cookie_transactions fields.
>  	 */
>  	struct tcp_cookie_values  *cookie_values;
> +
> +	/* For encapsulation sockets. */
> +	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
>  };
>  

This adds a new cache miss for all incoming tcp frames...

>  static inline struct tcp_sock *tcp_sk(const struct sock *sk)
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 3a25cf7..9898f71 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1666,8 +1666,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  	const struct iphdr *iph;
>  	const struct tcphdr *th;
>  	struct sock *sk;
> +	struct tcp_sock *tp;
>  	int ret;
>  	struct net *net = dev_net(skb->dev);
> +	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
>  
>  	if (skb->pkt_type != PACKET_HOST)
>  		goto discard_it;
> @@ -1726,9 +1728,27 @@ process:
>  
>  	bh_lock_sock_nested(sk);
>  	ret = 0;
> +
> +	tp = tcp_sk(sk);
> +	encap_rcv = ACCESS_ONCE(tp->encap_rcv);
> +	if (encap_rcv != NULL) {

and a new conditional...

> +		/*
> +		 * This is an encapsulation socket so pass the skb to
> +		 * the socket's encap_rcv() hook. Otherwise, just
> +		 * fall through and pass this up to the TCP socket.
> +		 * tp->encap_rcv() returns the following value:
> +		 * <=0 if skb was successfully passed to the encap
> +		 *     handler or was discarded by it.
> +		 * >0 if skb should be passed on to TCP.
> +		 */
> +		if (encap_rcv(sk, skb) <= 0) {
> +			ret = 0;
> +			goto unlock_sock;
> +		}
> +	}
> +
>  	if (!sock_owned_by_user(sk)) {
>  #ifdef CONFIG_NET_DMA
> -		struct tcp_sock *tp = tcp_sk(sk);
>  		if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
>  			tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
>  		if (tp->ucopy.dma_chan)
> @@ -1744,6 +1764,7 @@ process:
>  		NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
>  		goto discard_and_relse;
>  	}
> +unlock_sock:
>  	bh_unlock_sock(sk);
>  
>  	sock_put(sk);

I don't know, this sounds like a hack. Since you obviously spent a lot of
time on this stuff, let's be constructive.

I really suggest you take a look at <linux/static_key.h>

That way, on machines without any need for this encap_rcv, we don't even
need to fetch tp->encap_rcv:

if (static_key_false(&stt_active)) {
	/* stt might be used on this socket */
	encap_rcv = ACCESS_ONCE(tp->encap_rcv);
	if (encap_rcv) {
		...
	}
}

This way, if STT is not used/loaded, we have a single NOP.

If STT is used, the NOP is patched to a JMP stt_code.

I will probably implement this idea on UDP shortly so that you can have a
reference for your implementation.
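
As an illustration only, the gating idea above can be modeled in user space.
In the sketch below a plain global flag stands in for the static key (the
real mechanism patches code and does no run-time load), and every name
(fake_sock, fake_skb, encap_needed, deliver and the two hooks) is made up
for the example rather than taken from the kernel. The return convention
follows the comment in the patch: <= 0 means the hook consumed or dropped
the packet, > 0 means the packet should continue on to TCP.

```c
/* User-space model of the gating idea; this is not kernel code. */
#include <stdbool.h>
#include <stddef.h>

struct fake_skb { int len; };

struct fake_sock {
	/* Per-socket hook; NULL for ordinary sockets. */
	int (*encap_rcv)(struct fake_sock *sk, struct fake_skb *skb);
	int tcp_delivered;	/* packets that reached the normal path */
};

/* Hypothetical stand-in for the static key: a flag, not patched code. */
static bool encap_needed;

/* Returns 1 if the packet went to the normal path, 0 if the hook took it. */
static int deliver(struct fake_sock *sk, struct fake_skb *skb)
{
	if (encap_needed) {
		/* Only now is the per-socket field touched at all. */
		int (*encap_rcv)(struct fake_sock *, struct fake_skb *) =
			sk->encap_rcv;

		/* <= 0: consumed or dropped by the hook; > 0: carry on. */
		if (encap_rcv != NULL && encap_rcv(sk, skb) <= 0)
			return 0;
	}
	sk->tcp_delivered++;
	return 1;
}

/* Example hook that consumes every packet. */
static int hook_consume(struct fake_sock *sk, struct fake_skb *skb)
{
	(void)sk;
	(void)skb;
	return 0;
}

/* Example hook that hands every packet back to the normal path. */
static int hook_pass(struct fake_sock *sk, struct fake_skb *skb)
{
	(void)sk;
	(void)skb;
	return 1;
}
```

With the flag clear, deliver() never looks at sk->encap_rcv, which is the
property the static key is meant to give the fast path.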


* [PATCH net-next] udp: intoduce udp_encap_needed static_key
  2012-04-12  8:20   ` Eric Dumazet
@ 2012-04-12  9:05     ` Eric Dumazet
  2012-04-12  9:10       ` Eric Dumazet
                         ` (2 more replies)
  2012-04-12 13:10     ` [RFC v3] Add TCP encap_rcv hook Simon Horman
  1 sibling, 3 replies; 9+ messages in thread
From: Eric Dumazet @ 2012-04-12  9:05 UTC (permalink / raw)
  To: Simon Horman, David Miller
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

Most machines don't use UDP encapsulation (L2TP).

Add a static_key so that udp_queue_rcv_skb() doesn't have to perform a
test if L2TP never set up encap_rcv on a socket.

The idea for this patch came after Simon Horman's proposal to add a hook
on TCP as well.

If the static_key is not yet enabled, the fast path does a single JMP.

When the static_key is enabled, the JMP destination is patched to reach the
real encap_type/encap_rcv logic, possibly adding cache misses.

Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org
---
 include/net/udp.h    |    1 +
 net/ipv4/udp.c       |   12 +++++++++++-
 net/l2tp/l2tp_core.c |    1 +
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/include/net/udp.h b/include/net/udp.h
index 5d606d9..9671f5f 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -267,4 +267,5 @@ extern void udp_init(void);
 extern int udp4_ufo_send_check(struct sk_buff *skb);
 extern struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb,
 	netdev_features_t features);
+extern void udp_encap_enable(void);
 #endif	/* _UDP_H */
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index fe14105..ad1e0dd 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -107,6 +107,7 @@
 #include <net/checksum.h>
 #include <net/xfrm.h>
 #include <trace/events/udp.h>
+#include <linux/static_key.h>
 #include "udp_impl.h"
 
 struct udp_table udp_table __read_mostly;
@@ -1379,6 +1380,14 @@ static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 
 }
 
+static struct static_key udp_encap_needed __read_mostly;
+void udp_encap_enable(void)
+{
+	if (!static_key_enabled(&udp_encap_needed))
+		static_key_slow_inc(&udp_encap_needed);
+}
+EXPORT_SYMBOL(udp_encap_enable);
+
 /* returns:
  *  -1: error
  *   0: success
@@ -1400,7 +1409,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 		goto drop;
 	nf_reset(skb);
 
-	if (up->encap_type) {
+	if (static_key_false(&udp_encap_needed) && up->encap_type) {
 		int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
 
 		/*
@@ -1760,6 +1769,7 @@ int udp_lib_setsockopt(struct sock *sk, int level, int optname,
 			/* FALLTHROUGH */
 		case UDP_ENCAP_L2TPINUDP:
 			up->encap_type = val;
+			udp_encap_enable();
 			break;
 		default:
 			err = -ENOPROTOOPT;
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 89ff8c6..f6732b6 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1424,6 +1424,7 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32
 		/* Mark socket as an encapsulation socket. See net/ipv4/udp.c */
 		udp_sk(sk)->encap_type = UDP_ENCAP_L2TPINUDP;
 		udp_sk(sk)->encap_rcv = l2tp_udp_encap_recv;
+		udp_encap_enable();
 	}
 
 	sk->sk_user_data = tunnel;


* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
  2012-04-12  9:05     ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
@ 2012-04-12  9:10       ` Eric Dumazet
  2012-04-12 14:35       ` Simon Horman
  2012-04-13 17:41       ` [PATCH net-next] udp: intoduce udp_encap_needed static_key David Miller
  2 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2012-04-12  9:10 UTC (permalink / raw)
  To: Simon Horman
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA, David Miller

On Thu, 2012-04-12 at 11:05 +0200, Eric Dumazet wrote:

> If the static_key is not yet enabled, the fast path does a single JMP.
> 
> When the static_key is enabled, the JMP destination is patched to reach the
> real encap_type/encap_rcv logic, possibly adding cache misses.

Small note, Simon:

The jump trick is effective on x86 (and maybe some other arches) when

CONFIG_JUMP_LABEL=y

Otherwise, it's replaced by atomic_read(...) > 0, which is still a
conditional jump, but one that reads a read_mostly/shared variable instead
of a per-socket field.
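
A rough user-space picture of that fallback, assuming C11 atomics in place
of the kernel's atomic_t. The soft_key names are hypothetical and only
mimic the shape of the generic code; the point is that the check reads one
shared, rarely written counter rather than a field in each socket.

```c
/* User-space sketch of a counter-based key; this is not kernel code. */
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct static_key without jump labels. */
struct soft_key {
	atomic_int enabled;	/* use count; the branch is taken when > 0 */
};

/* Analogue of static_key_false(): one shared load, one conditional jump. */
static inline bool soft_key_false(struct soft_key *key)
{
	return atomic_load_explicit(&key->enabled, memory_order_relaxed) > 0;
}

/* Analogue of static_key_slow_inc(): take a reference on the key. */
static inline void soft_key_inc(struct soft_key *key)
{
	atomic_fetch_add_explicit(&key->enabled, 1, memory_order_relaxed);
}

/* Analogue of static_key_slow_dec(): drop a reference on the key. */
static inline void soft_key_dec(struct soft_key *key)
{
	atomic_fetch_sub_explicit(&key->enabled, 1, memory_order_relaxed);
}
```

The key stays "true" for as long as at least one user holds a reference,
so the check costs the same however many sockets use encapsulation.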


* Re: [RFC v3] Add TCP encap_rcv hook
  2012-04-12  8:20   ` Eric Dumazet
  2012-04-12  9:05     ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
@ 2012-04-12 13:10     ` Simon Horman
  1 sibling, 0 replies; 9+ messages in thread
From: Simon Horman @ 2012-04-12 13:10 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

On Thu, Apr 12, 2012 at 10:20:29AM +0200, Eric Dumazet wrote:
> On Thu, 2012-04-12 at 16:42 +0900, Simon Horman wrote:
> > This hook is based on a hook of the same name provided by UDP.  It provides
> > a way to receive packets that have a TCP header and treat them in some
> > alternate way.
> > 
> > It is intended to be used by an implementation of the STT tunneling
> > protocol within Open vSwitch's datapath. A prototype of such an
> > implementation has been made.
> > 
> > The STT draft is available at
> > http://tools.ietf.org/html/draft-davie-stt-01
> > 
> > My prototype STT implementation has been posted to the dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org
> > The first version can be found at:
> > http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg08877.html
> > 
> > Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
> > 
> 
> Hi Simon
> 
> Oh well, this is insane :(
> 
> > ---
> >  include/linux/tcp.h |    3 +++
> >  net/ipv4/tcp_ipv4.c |   23 ++++++++++++++++++++++-
> >  2 files changed, 25 insertions(+), 1 deletion(-)
> > 
> > v3
> > * First post to netdev
> > * Replace more UDP references with TCP
> > * Move socket accesses to inside socket lock
> >   and release lock on return.
> > 
> > v2
> > * Fix comment to refer to TCP rather than UDP
> > * Allow skb to continue traversing the stack if
> >   the encap_rcv callback returns a positive value.
> >   This is the same behaviour as the UDP hook.
> > 
> > diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> > index b6c62d2..7210b23 100644
> > --- a/include/linux/tcp.h
> > +++ b/include/linux/tcp.h
> > @@ -472,6 +472,9 @@ struct tcp_sock {
> >  	 * contains related tcp_cookie_transactions fields.
> >  	 */
> >  	struct tcp_cookie_values  *cookie_values;
> > +
> > +	/* For encapsulation sockets. */
> > +	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
> >  };
> >  
> 
> This adds a new cache miss for all incoming tcp frames...
> 
> >  static inline struct tcp_sock *tcp_sk(const struct sock *sk)
> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > index 3a25cf7..9898f71 100644
> > --- a/net/ipv4/tcp_ipv4.c
> > +++ b/net/ipv4/tcp_ipv4.c
> > @@ -1666,8 +1666,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
> >  	const struct iphdr *iph;
> >  	const struct tcphdr *th;
> >  	struct sock *sk;
> > +	struct tcp_sock *tp;
> >  	int ret;
> >  	struct net *net = dev_net(skb->dev);
> > +	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
> >  
> >  	if (skb->pkt_type != PACKET_HOST)
> >  		goto discard_it;
> > @@ -1726,9 +1728,27 @@ process:
> >  
> >  	bh_lock_sock_nested(sk);
> >  	ret = 0;
> > +
> > +	tp = tcp_sk(sk);
> > +	encap_rcv = ACCESS_ONCE(tp->encap_rcv);
> > +	if (encap_rcv != NULL) {
> 
> and a new conditional...
> 
> > +		/*
> > +		 * This is an encapsulation socket so pass the skb to
> > +		 * the socket's encap_rcv() hook. Otherwise, just
> > +		 * fall through and pass this up to the TCP socket.
> > +		 * tp->encap_rcv() returns the following value:
> > +		 * <=0 if skb was successfully passed to the encap
> > +		 *     handler or was discarded by it.
> > +		 * >0 if skb should be passed on to TCP.
> > +		 */
> > +		if (encap_rcv(sk, skb) <= 0) {
> > +			ret = 0;
> > +			goto unlock_sock;
> > +		}
> > +	}
> > +
> >  	if (!sock_owned_by_user(sk)) {
> >  #ifdef CONFIG_NET_DMA
> > -		struct tcp_sock *tp = tcp_sk(sk);
> >  		if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
> >  			tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
> >  		if (tp->ucopy.dma_chan)
> > @@ -1744,6 +1764,7 @@ process:
> >  		NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
> >  		goto discard_and_relse;
> >  	}
> > +unlock_sock:
> >  	bh_unlock_sock(sk);
> >  
> >  	sock_put(sk);
> 
> I don't know, this sounds like a hack. Since you obviously spent a lot of
> time on this stuff, let's be constructive.

Hi Eric,

Thanks, I didn't really expect my patch to go in smoothly as is.
Though it may well be my first brush with insanity.

> 
> I really suggest you take a look at <linux/static_key.h>
> 
> That way, on machines without any need for this encap_rcv, we don't even
> need to fetch tp->encap_rcv:
> 
> if (static_key_false(&stt_active)) {
> 	/* stt might be used on this socket */
> 	encap_rcv = ACCESS_ONCE(tp->encap_rcv);
> 	if (encap_rcv) {
> 		...
> 	}
> }
> 
> This way, if STT is not used/loaded, we have a single NOP.
> 
> If STT is used, the NOP is patched to a JMP stt_code.
> 
> I will probably implement this idea on UDP shortly so that you can have a
> reference for your implementation.

Thanks, I see your UDP code now. I'll see about getting the same thing
working for TCP.


* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
  2012-04-12  9:05     ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
  2012-04-12  9:10       ` Eric Dumazet
@ 2012-04-12 14:35       ` Simon Horman
       [not found]         ` <20120412143552.GA8730-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
  2012-04-13 17:41       ` [PATCH net-next] udp: intoduce udp_encap_needed static_key David Miller
  2 siblings, 1 reply; 9+ messages in thread
From: Simon Horman @ 2012-04-12 14:35 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA, David Miller

On Thu, Apr 12, 2012 at 11:05:28AM +0200, Eric Dumazet wrote:
> Most machines don't use UDP encapsulation (L2TP).
> 
> Add a static_key so that udp_queue_rcv_skb() doesn't have to perform a
> test if L2TP never set up encap_rcv on a socket.
> 
> The idea for this patch came after Simon Horman's proposal to add a hook
> on TCP as well.
> 
> If the static_key is not yet enabled, the fast path does a single JMP.
> 
> When the static_key is enabled, the JMP destination is patched to reach the
> real encap_type/encap_rcv logic, possibly adding cache misses.

Thanks Eric,

I have not had a chance to test your code, though it should be easy enough
to do so in the context of Open vSwitch, as its CAPWAP implementation makes
use of UDP's encap_rcv (which is how I arrived at adding a hook to TCP to
implement STT for Open vSwitch).

I have incorporated your static_key code into a new version of my TCP
encap_rcv patch and that does appear to work. I will post it ASAP.
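
For readers following along, the registration side of the scheme in the
patch quoted below can be pictured with a small user-space sketch. It
assumes a plain integer in place of the static key, and all names
(fake_sock, fake_encap_enable, fake_tunnel_attach, fake_tunnel_rcv) are
hypothetical. It only mirrors the ordering used in the L2TP hunk: set the
per-socket hook first, then enable the global key, which is bumped at most
once however many sockets attach.

```c
/* User-space sketch of the registration side; this is not kernel code. */
#include <stddef.h>

struct fake_skb;

struct fake_sock {
	int (*encap_rcv)(struct fake_sock *sk, struct fake_skb *skb);
};

/* Hypothetical stand-in for the static key's use count. */
static int encap_key_count;

/* Same shape as udp_encap_enable(): bump the key only on first use. */
static void fake_encap_enable(void)
{
	if (encap_key_count == 0)
		encap_key_count++;
}

/* What a tunnel implementation might do when it takes over a socket. */
static void fake_tunnel_attach(struct fake_sock *sk,
			       int (*hook)(struct fake_sock *,
					   struct fake_skb *))
{
	sk->encap_rcv = hook;	/* set the per-socket hook first */
	fake_encap_enable();	/* then open the global gate */
}

/* Example hook that consumes every packet it is given. */
static int fake_tunnel_rcv(struct fake_sock *sk, struct fake_skb *skb)
{
	(void)sk;
	(void)skb;
	return 0;
}
```

Note that the sketch is single-threaded; it says nothing about how
concurrent callers of the real enable function interact.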

> 
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
> Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org
> ---
>  include/net/udp.h    |    1 +
>  net/ipv4/udp.c       |   12 +++++++++++-
>  net/l2tp/l2tp_core.c |    1 +
>  3 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/include/net/udp.h b/include/net/udp.h
> index 5d606d9..9671f5f 100644
> --- a/include/net/udp.h
> +++ b/include/net/udp.h
> @@ -267,4 +267,5 @@ extern void udp_init(void);
>  extern int udp4_ufo_send_check(struct sk_buff *skb);
>  extern struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb,
>  	netdev_features_t features);
> +extern void udp_encap_enable(void);
>  #endif	/* _UDP_H */
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index fe14105..ad1e0dd 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -107,6 +107,7 @@
>  #include <net/checksum.h>
>  #include <net/xfrm.h>
>  #include <trace/events/udp.h>
> +#include <linux/static_key.h>
>  #include "udp_impl.h"
>  
>  struct udp_table udp_table __read_mostly;
> @@ -1379,6 +1380,14 @@ static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
>  
>  }
>  
> +static struct static_key udp_encap_needed __read_mostly;
> +void udp_encap_enable(void)
> +{
> +	if (!static_key_enabled(&udp_encap_needed))
> +		static_key_slow_inc(&udp_encap_needed);
> +}
> +EXPORT_SYMBOL(udp_encap_enable);
> +
>  /* returns:
>   *  -1: error
>   *   0: success
> @@ -1400,7 +1409,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
>  		goto drop;
>  	nf_reset(skb);
>  
> -	if (up->encap_type) {
> +	if (static_key_false(&udp_encap_needed) && up->encap_type) {
>  		int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
>  
>  		/*
> @@ -1760,6 +1769,7 @@ int udp_lib_setsockopt(struct sock *sk, int level, int optname,
>  			/* FALLTHROUGH */
>  		case UDP_ENCAP_L2TPINUDP:
>  			up->encap_type = val;
> +			udp_encap_enable();
>  			break;
>  		default:
>  			err = -ENOPROTOOPT;
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 89ff8c6..f6732b6 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1424,6 +1424,7 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32
>  		/* Mark socket as an encapsulation socket. See net/ipv4/udp.c */
>  		udp_sk(sk)->encap_type = UDP_ENCAP_L2TPINUDP;
>  		udp_sk(sk)->encap_rcv = l2tp_udp_encap_recv;
> +		udp_encap_enable();
>  	}
>  
>  	sk->sk_user_data = tunnel;
> 
> 


* [RFC v4] Add TCP encap_rcv hook
       [not found]         ` <20120412143552.GA8730-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
@ 2012-04-12 14:40           ` Simon Horman
  0 siblings, 0 replies; 9+ messages in thread
From: Simon Horman @ 2012-04-12 14:40 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA, David Miller

This hook is based on a hook of the same name provided by UDP.  It provides
a way to receive packets that have a TCP header and treat them in some
alternate way.

It is intended to be used by an implementation of the STT tunneling
protocol within Open vSwitch's datapath. A prototype of such an
implementation has been made.

The STT draft is available at
http://tools.ietf.org/html/draft-davie-stt-01

My prototype STT implementation has been posted to the
dev-UOEtcQmXneFl884UGnbwIQ@public.gmane.org list.
The second version can be found at:
http://www.mail-archive.com/dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org/msg09001.html
It needs to be updated to call tcp_encap_enable().

Cc: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>

---
v4
* Make use of static_key,
  a tonic for insanity suggested by Eric Dumazet

v3
* Replace more UDP references with TCP
* Move socket accesses to inside socket lock
  and release lock on return.

v2
* Fix comment to refer to TCP rather than UDP
* Allow skb to continue traversing the stack if
  the encap_rcv callback returns a positive value.
  This is the same behaviour as the UDP hook.
---
 include/linux/tcp.h |    3 +++
 include/net/tcp.h   |    1 +
 net/ipv4/tcp_ipv4.c |   34 +++++++++++++++++++++++++++++++++-
 3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index b6c62d2..7210b23 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -472,6 +472,9 @@ struct tcp_sock {
 	 * contains related tcp_cookie_transactions fields.
 	 */
 	struct tcp_cookie_values  *cookie_values;
+
+	/* For encapsulation sockets. */
+	int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
 };
 
 static inline struct tcp_sock *tcp_sk(const struct sock *sk)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index f75a04d..f2c4ac0 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1575,5 +1575,6 @@ static inline struct tcp_extend_values *tcp_xv(struct request_values *rvp)
 
 extern void tcp_v4_init(void);
 extern void tcp_init(void);
+extern void tcp_encap_enable(void);
 
 #endif	/* _TCP_H */
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 3a25cf7..dadcec6 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -62,6 +62,7 @@
 #include <linux/init.h>
 #include <linux/times.h>
 #include <linux/slab.h>
+#include <linux/static_key.h>
 
 #include <net/net_namespace.h>
 #include <net/icmp.h>
@@ -1657,6 +1658,14 @@ csum_err:
 }
 EXPORT_SYMBOL(tcp_v4_do_rcv);
 
+static struct static_key tcp_encap_needed __read_mostly;
+void tcp_encap_enable(void)
+{
+	if (!static_key_enabled(&tcp_encap_needed))
+		static_key_slow_inc(&tcp_encap_needed);
+}
+EXPORT_SYMBOL(tcp_encap_enable);
+
 /*
  *	From tcp_input.c
  */
@@ -1666,6 +1675,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	const struct iphdr *iph;
 	const struct tcphdr *th;
 	struct sock *sk;
+	struct tcp_sock *tp;
 	int ret;
 	struct net *net = dev_net(skb->dev);
 
@@ -1726,9 +1736,30 @@ process:
 
 	bh_lock_sock_nested(sk);
 	ret = 0;
+
+	tp = tcp_sk(sk);
+	if (static_key_false(&tcp_encap_needed)) {
+		int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
+		encap_rcv = ACCESS_ONCE(tp->encap_rcv);
+		if (encap_rcv != NULL) {
+			/*
+			 * This is an encapsulation socket so pass the skb to
+			 * the socket's encap_rcv() hook. Otherwise, just
+			 * fall through and pass this up to the TCP socket.
+			 * tp->encap_rcv() returns the following value:
+			 * <=0 if skb was successfully passed to the encap
+			 *     handler or was discarded by it.
+			 * >0 if skb should be passed on to TCP.
+			 */
+			if (encap_rcv(sk, skb) <= 0) {
+				ret = 0;
+				goto unlock_sock;
+			}
+		}
+	}
+
 	if (!sock_owned_by_user(sk)) {
 #ifdef CONFIG_NET_DMA
-		struct tcp_sock *tp = tcp_sk(sk);
 		if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
 			tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
 		if (tp->ucopy.dma_chan)
@@ -1744,6 +1775,7 @@ process:
 		NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
 		goto discard_and_relse;
 	}
+unlock_sock:
 	bh_unlock_sock(sk);
 
 	sock_put(sk);
-- 
1.7.9.5


* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
  2012-04-12  9:05     ` [PATCH net-next] udp: intoduce udp_encap_needed static_key Eric Dumazet
  2012-04-12  9:10       ` Eric Dumazet
  2012-04-12 14:35       ` Simon Horman
@ 2012-04-13 17:41       ` David Miller
       [not found]         ` <20120413.134108.1844473866612154303.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
  2 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2012-04-13 17:41 UTC (permalink / raw)
  To: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	bcrl-Bw31MaZKKs3YtjvyW6yDsg

From: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Thu, 12 Apr 2012 11:05:28 +0200

> Most machines don't use UDP encapsulation (L2TP).
> 
> Add a static_key so that udp_queue_rcv_skb() doesn't have to perform a
> test if L2TP never set up encap_rcv on a socket.
> 
> The idea for this patch came after Simon Horman's proposal to add a hook
> on TCP as well.
> 
> If the static_key is not yet enabled, the fast path does a single JMP.
> 
> When the static_key is enabled, the JMP destination is patched to reach the
> real encap_type/encap_rcv logic, possibly adding cache misses.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Applied to net-next, thanks Eric.

Ben, please incorporate this scheme when you respin your
fixed ipv6 encap/l2tp patches.

Thanks.


* Re: [PATCH net-next] udp: intoduce udp_encap_needed static_key
       [not found]         ` <20120413.134108.1844473866612154303.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
@ 2012-04-13 17:45           ` Benjamin LaHaise
  0 siblings, 0 replies; 9+ messages in thread
From: Benjamin LaHaise @ 2012-04-13 17:45 UTC (permalink / raw)
  To: David Miller
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w

Hi folks,

On Fri, Apr 13, 2012 at 01:41:08PM -0400, David Miller wrote:
> Ben, please incorporate this scheme when you respin your
> fixed ipv6 encap/l2tp patches.
> 
> Thanks.

Okay, will do.  Thanks for the feedback and heads up.

		-ben
-- 
"Thought is the essence of where you are now."

