[PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family
@ 2013-07-29 14:50 Hannes Frederic Sowa
  2013-07-30  8:21 ` Steffen Klassert
  0 siblings, 1 reply; 16+ messages in thread
From: Hannes Frederic Sowa @ 2013-07-29 14:50 UTC (permalink / raw)
  To: netdev; +Cc: vi0oss, steffen.klassert

xfrm6_local_error/xfrm4_tunnel_check_size report mtu errors back to a
socket in case it is locally generated. If the packet first traversed
a 6in4/4in6 tunnel before passing the xfrm layer, we could get a panic
because of address family type mismatch in the error reporting functions.

This bug fixes only the panic part of the excellent bug report
<https://bugzilla.kernel.org/show_bug.cgi?id=58691> by vi0oss.

It is unclear to me how to let these mtu change events travel up the
stack. A match on (skb->sk && (skb->dev | tunnel_types)) could be made
first and we forcefully decrease the mtu of that interface with a warning?
This seems to be fragile to me.

Reported-by: <vi0oss@gmail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/ipv4/xfrm4_output.c | 4 ++--
 net/ipv6/xfrm6_output.c | 3 +++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
index 327a617..61f6cae 100644
--- a/net/ipv4/xfrm4_output.c
+++ b/net/ipv4/xfrm4_output.c
@@ -32,10 +32,10 @@ static int xfrm4_tunnel_check_size(struct sk_buff *skb)
 	dst = skb_dst(skb);
 	mtu = dst_mtu(dst);
 	if (skb->len > mtu) {
-		if (skb->sk)
+		if (skb->sk && skb->sk->sk_family == AF_INET)
 			ip_local_error(skb->sk, EMSGSIZE, ip_hdr(skb)->daddr,
 				       inet_sk(skb->sk)->inet_dport, mtu);
-		else
+		else if (!skb->sk)
 			icmp_send(skb, ICMP_DEST_UNREACH,
 				  ICMP_FRAG_NEEDED, htonl(mtu));
 		ret = -EMSGSIZE;
diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
index 8755a30..29f6db7 100644
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -59,6 +59,9 @@ static void xfrm6_local_error(struct sk_buff *skb, u32 mtu)
 	struct flowi6 fl6;
 	struct sock *sk = skb->sk;
 
+	if (sk->sk_family != AF_INET6)
+		return;
+
 	fl6.fl6_dport = inet_sk(sk)->inet_dport;
 	fl6.daddr = ipv6_hdr(skb)->daddr;
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family
  2013-07-29 14:50 [PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family Hannes Frederic Sowa
@ 2013-07-30  8:21 ` Steffen Klassert
  2013-07-30  8:30   ` Hannes Frederic Sowa
  0 siblings, 1 reply; 16+ messages in thread
From: Steffen Klassert @ 2013-07-30  8:21 UTC (permalink / raw)
  To: netdev, vi0oss, hannes

On Mon, Jul 29, 2013 at 04:50:17PM +0200, Hannes Frederic Sowa wrote:
> xfrm6_local_error/xfrm4_tunnel_check_size report mtu errors back to a
> socket in case it is locally generated. If the packet first traversed
> a 6in4/4in6 tunnel before passing the xfrm layer, we could get a panic
> because of address family type mismatch in the error reporting functions.
> 

So the skb is still owned by a socket of the inner address family.
Is this intentional? Maybe the ndo_start_xmit() function of the
tunnel device should orphan the skb if we tunnel the packet
through a different address family.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family
  2013-07-30  8:21 ` Steffen Klassert
@ 2013-07-30  8:30   ` Hannes Frederic Sowa
  2013-07-30 10:26     ` Steffen Klassert
  0 siblings, 1 reply; 16+ messages in thread
From: Hannes Frederic Sowa @ 2013-07-30  8:30 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: netdev, vi0oss

On Tue, Jul 30, 2013 at 10:21:18AM +0200, Steffen Klassert wrote:
> On Mon, Jul 29, 2013 at 04:50:17PM +0200, Hannes Frederic Sowa wrote:
> > xfrm6_local_error/xfrm4_tunnel_check_size report mtu errors back to a
> > socket in case it is locally generated. If the packet first traversed
> > a 6in4/4in6 tunnel before passing the xfrm layer, we could get a panic
> > because of address family type mismatch in the error reporting functions.
> > 
> 
> So the skb is still owned by a socket of the inner address family.
> Is this intentional? Maybe the ndo_start_xmit() function of the
> tunnel device should orphan the skb if we tunnel the packet
> through a different address family.

I thought about this, too. But then we would stop accounting the data
to the socket while it is travelling the stack. I don't know about the
possible problems resulting from this.

Greetings,

  Hannes

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family
  2013-07-30  8:30   ` Hannes Frederic Sowa
@ 2013-07-30 10:26     ` Steffen Klassert
  2013-07-30 10:40       ` Hannes Frederic Sowa
  2013-08-01  8:11       ` Hannes Frederic Sowa
  0 siblings, 2 replies; 16+ messages in thread
From: Steffen Klassert @ 2013-07-30 10:26 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, vi0oss

On Tue, Jul 30, 2013 at 10:30:17AM +0200, Hannes Frederic Sowa wrote:
> On Tue, Jul 30, 2013 at 10:21:18AM +0200, Steffen Klassert wrote:
> > On Mon, Jul 29, 2013 at 04:50:17PM +0200, Hannes Frederic Sowa wrote:
> > > xfrm6_local_error/xfrm4_tunnel_check_size report mtu errors back to a
> > > socket in case it is locally generated. If the packet first traversed
> > > a 6in4/4in6 tunnel before passing the xfrm layer, we could get a panic
> > > because of address family type mismatch in the error reporting functions.
> > > 
> > 
> > So the skb is still owned by a socket of the inner address family.
> > Is this intentional? Maybe the ndo_start_xmit() function of the
> > tunnel device should orphan the skb if we tunnel the packet
> > through a different address family.
> 
> I thought about this, too. But then we would stop accounting the data
> to the socket while it is travelling the stack. I don't know about the
> possible problems resulting from this.
> 

I'm also not absolutely sure, but we reinsert the packet to
the ipv4/ipv6 output path which is also used to output forwarded
packets. So the code should be prepared for handling a skb without
socket context.

There are already situations where we orphan the skb in some
tunnel xmit functions. For example if we tunnel through
another namespace.

Lets see if anyone else has an opinion to that. I'll go
and look what we can do if we have to fix it in the IPsec
code in the meantime.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family
  2013-07-30 10:26     ` Steffen Klassert
@ 2013-07-30 10:40       ` Hannes Frederic Sowa
  2013-07-31  0:01         ` David Miller
  2013-08-01  8:11       ` Hannes Frederic Sowa
  1 sibling, 1 reply; 16+ messages in thread
From: Hannes Frederic Sowa @ 2013-07-30 10:40 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: netdev, vi0oss

On Tue, Jul 30, 2013 at 12:26:11PM +0200, Steffen Klassert wrote:
> On Tue, Jul 30, 2013 at 10:30:17AM +0200, Hannes Frederic Sowa wrote:
> > On Tue, Jul 30, 2013 at 10:21:18AM +0200, Steffen Klassert wrote:
> > > On Mon, Jul 29, 2013 at 04:50:17PM +0200, Hannes Frederic Sowa wrote:
> > > > xfrm6_local_error/xfrm4_tunnel_check_size report mtu errors back to a
> > > > socket in case it is locally generated. If the packet first traversed
> > > > a 6in4/4in6 tunnel before passing the xfrm layer, we could get a panic
> > > > because of address family type mismatch in the error reporting functions.
> > > > 
> > > 
> > > So the skb is still owned by a socket of the inner address family.
> > > Is this intentional? Maybe the ndo_start_xmit() function of the
> > > tunnel device should orphan the skb if we tunnel the packet
> > > through a different address family.
> > 
> > I thought about this, too. But then we would stop accounting the data
> > to the socket while it is travelling the stack. I don't know about the
> > possible problems resulting from this.
> > 
> 
> I'm also not absolutely sure, but we reinsert the packet to
> the ipv4/ipv6 output path which is also used to output forwarded
> packets. So the code should be prepared for handling a skb without
> socket context.
> 
> There are already situations where we orphan the skb in some
> tunnel xmit functions. For example if we tunnel through
> another namespace.

Somehow this seems the way to go.

Even if we get a matching address family socket we would still call
ipv6_local_error with the wrong fl6.daddr for that socket. I do think
IPv4 has the same issue, but I have not checked.

Because skbs could circulate multiple times through devices, a check as
(skb->dev->type | tunnel_types) seems to be not enough, too.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family
  2013-07-30 10:40       ` Hannes Frederic Sowa
@ 2013-07-31  0:01         ` David Miller
  0 siblings, 0 replies; 16+ messages in thread
From: David Miller @ 2013-07-31  0:01 UTC (permalink / raw)
  To: hannes; +Cc: steffen.klassert, netdev, vi0oss

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Tue, 30 Jul 2013 12:40:38 +0200

> On Tue, Jul 30, 2013 at 12:26:11PM +0200, Steffen Klassert wrote:
>> On Tue, Jul 30, 2013 at 10:30:17AM +0200, Hannes Frederic Sowa wrote:
>> > On Tue, Jul 30, 2013 at 10:21:18AM +0200, Steffen Klassert wrote:
>> > > On Mon, Jul 29, 2013 at 04:50:17PM +0200, Hannes Frederic Sowa wrote:
>> > > > xfrm6_local_error/xfrm4_tunnel_check_size report mtu errors back to a
>> > > > socket in case it is locally generated. If the packet first traversed
>> > > > a 6in4/4in6 tunnel before passing the xfrm layer, we could get a panic
>> > > > because of address family type mismatch in the error reporting functions.
>> > > > 
>> > > 
>> > > So the skb is still owned by a socket of the inner address family.
>> > > Is this intentional? Maybe the ndo_start_xmit() function of the
>> > > tunnel device should orphan the skb if we tunnel the packet
>> > > through a different address family.
>> > 
>> > I thought about this, too. But then we would stop accounting the data
>> > to the socket while it is travelling the stack. I don't know about the
>> > possible problems resulting from this.
>> > 
>> 
>> I'm also not absolutely sure, but we reinsert the packet to
>> the ipv4/ipv6 output path which is also used to output forwarded
>> packets. So the code should be prepared for handling a skb without
>> socket context.
>> 
>> There are already situations where we orphan the skb in some
>> tunnel xmit functions. For example if we tunnel through
>> another namespace.
> 
> Somehow this seems the way to go.

Agreed, I think we should just orphan the SKB.

There was talk about this when the skb_scrub_packet() interface was
added, maybe we should do it unconditionally after all.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family
  2013-07-30 10:26     ` Steffen Klassert
  2013-07-30 10:40       ` Hannes Frederic Sowa
@ 2013-08-01  8:11       ` Hannes Frederic Sowa
  2013-08-01 10:05         ` Steffen Klassert
  1 sibling, 1 reply; 16+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-01  8:11 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: netdev, vi0oss

On Tue, Jul 30, 2013 at 12:26:11PM +0200, Steffen Klassert wrote:
> Lets see if anyone else has an opinion to that. I'll go
> and look what we can do if we have to fix it in the IPsec
> code in the meantime.

If you have not yet done so, I would try to find a solution over the weekend.

Thanks,

  Hannes

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family
  2013-08-01  8:11       ` Hannes Frederic Sowa
@ 2013-08-01 10:05         ` Steffen Klassert
  2013-08-08 22:44           ` Hannes Frederic Sowa
  0 siblings, 1 reply; 16+ messages in thread
From: Steffen Klassert @ 2013-08-01 10:05 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, vi0oss

On Thu, Aug 01, 2013 at 10:11:50AM +0200, Hannes Frederic Sowa wrote:
> 
> If you have not yet done so, I would try to find a solution over the weekend.
> 

I have not yet done so, please go ahead.

Thanks a lot!

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family
  2013-08-01 10:05         ` Steffen Klassert
@ 2013-08-08 22:44           ` Hannes Frederic Sowa
  2013-08-08 22:57             ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-08 22:44 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: netdev, vi0oss

On Thu, Aug 01, 2013 at 12:05:22PM +0200, Steffen Klassert wrote:
> On Thu, Aug 01, 2013 at 10:11:50AM +0200, Hannes Frederic Sowa wrote:
> > 
> > If you have not yet done so, I would try to find a solution over the weekend.
> > 
> 
> I have not yet done so, please go ahead.

Ok, this patch should do the trick. In xfrm6 error path we now generate
a plain icmpv6 packet back to us in this specific case vi0oss reported. I
wonder if we should suppress these.

Btw. is the memset(IPCB, 0) actually needed in the ipv4 output path?

[PATCH RFC] net: orphan socket when skb traverses tunnel interface

When a local generated packet traverses a tunnel the socket has to be
orphaned from the skb. Otherwise lower layer, like xfrm, try to report
errors directly to this socket. If, because of a transitioning tunnel,
the address family changes we deliver an invalid error or panic in the
error reporting functions.

Also add a call to secpath_reset() in the ipv6 tunnel xmit path. It
seems like it was just forgotten.

Reported-by: <vi0oss@gmail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/net/ip6_tunnel.h  | 5 +++++
 net/ipv4/ip_tunnel_core.c | 1 +
 2 files changed, 6 insertions(+)

diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h
index 4da5de1..92770de 100644
--- a/include/net/ip6_tunnel.h
+++ b/include/net/ip6_tunnel.h
@@ -6,6 +6,8 @@
 #include <linux/if_tunnel.h>
 #include <linux/ip6_tunnel.h>
 
+#include <net/xfrm.h>
+
 #define IP6TUNNEL_ERR_TIMEO (30*HZ)
 
 /* capable of sending packets */
@@ -74,7 +76,10 @@ static inline void ip6tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct net_device_stats *stats = &dev->stats;
 	int pkt_len, err;
 
+	skb_orphan(skb);
 	nf_reset(skb);
+	secpath_reset(skb);
+	skb->rxhash = 0;
 	pkt_len = skb->len;
 	err = ip6_local_out(skb);
 
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 7167b08..8a1a378 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -55,6 +55,7 @@ int iptunnel_xmit(struct net *net, struct rtable *rt,
 	struct iphdr *iph;
 	int err;
 
+	skb_orphan(skb);
 	nf_reset(skb);
 	secpath_reset(skb);
 	skb->rxhash = 0;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family
  2013-08-08 22:44           ` Hannes Frederic Sowa
@ 2013-08-08 22:57             ` Eric Dumazet
  2013-08-08 23:06               ` Hannes Frederic Sowa
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2013-08-08 22:57 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: Steffen Klassert, netdev, vi0oss

On Fri, 2013-08-09 at 00:44 +0200, Hannes Frederic Sowa wrote:
> On Thu, Aug 01, 2013 at 12:05:22PM +0200, Steffen Klassert wrote:
> > On Thu, Aug 01, 2013 at 10:11:50AM +0200, Hannes Frederic Sowa wrote:
> > > 
> > > If you have not yet done so, I would try to find a solution over the weekend.
> > > 
> > 
> > I have not yet done so, please go ahead.
> 
> Ok, this patch should do the trick. In xfrm6 error path we now generate
> a plain icmpv6 packet back to us in this specific case vi0oss reported. I
> wonder if we should suppress these.
> 
> Btw. is the memset(IPCB, 0) actually needed in the ipv4 output path?
> 
> [PATCH RFC] net: orphan socket when skb traverses tunnel interface
> 
> When a local generated packet traverses a tunnel the socket has to be
> orphaned from the skb. Otherwise lower layer, like xfrm, try to report
> errors directly to this socket. If, because of a transitioning tunnel,
> the address family changes we deliver an invalid error or panic in the
> error reporting functions.
> 
> Also add a call to secpath_reset() in the ipv6 tunnel xmit path. It
> seems like it was just forgotten.
> 
> Reported-by: <vi0oss@gmail.com>
> Cc: Steffen Klassert <steffen.klassert@secunet.com>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---

Doing so breaks flow control.

A single socket can flood and fill the Qdisc, even a friendly one like a
local TCP flow (see TCP Small Queues)

Can't we make the error reporting more robust instead ?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family
  2013-08-08 22:57             ` Eric Dumazet
@ 2013-08-08 23:06               ` Hannes Frederic Sowa
  2013-08-10 16:16                 ` Hannes Frederic Sowa
  0 siblings, 1 reply; 16+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-08 23:06 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Steffen Klassert, netdev, vi0oss

On Thu, Aug 08, 2013 at 03:57:57PM -0700, Eric Dumazet wrote:
> On Fri, 2013-08-09 at 00:44 +0200, Hannes Frederic Sowa wrote:
> > On Thu, Aug 01, 2013 at 12:05:22PM +0200, Steffen Klassert wrote:
> > > On Thu, Aug 01, 2013 at 10:11:50AM +0200, Hannes Frederic Sowa wrote:
> > > > 
> > > > If you have not yet done so, I would try to find a solution over the weekend.
> > > > 
> > > 
> > > I have not yet done so, please go ahead.
> > 
> > Ok, this patch should do the trick. In xfrm6 error path we now generate
> > a plain icmpv6 packet back to us in this specific case vi0oss reported. I
> > wonder if we should suppress these.
> > 
> > Btw. is the memset(IPCB, 0) actually needed in the ipv4 output path?
> > 
> > [PATCH RFC] net: orphan socket when skb traverses tunnel interface
> > 
> > When a local generated packet traverses a tunnel the socket has to be
> > orphaned from the skb. Otherwise lower layer, like xfrm, try to report
> > errors directly to this socket. If, because of a transitioning tunnel,
> > the address family changes we deliver an invalid error or panic in the
> > error reporting functions.
> > 
> > Also add a call to secpath_reset() in the ipv6 tunnel xmit path. It
> > seems like it was just forgotten.
> > 
> > Reported-by: <vi0oss@gmail.com>
> > Cc: Steffen Klassert <steffen.klassert@secunet.com>
> > Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > ---
> 
> Doing so breaks flow control.
> 
> A single socket can flood and fill the Qdisc, even a friendly one like a
> local TCP flow (see TCP Small Queues)
> 
> Can't we make the error reporting more robust instead ?

Hm, I thought so. My other patch (some mails above in this thread) checks for
a switch in address family and does prohibit the panic, but generates wrong
error reports back to the socket if the address family does not switch (which
maybe get ignored).

I will check if the skb->encapsulated bit could help. Actually I don't
know what the correct behaviour for error reporting should be in the end,
maybe: dispatch packet back to tunnel interface, let tunnel interface
decapsulate the inner packet and use this inner header to inform the
original socket about the error.

Thanks,

  Hannes

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family
  2013-08-08 23:06               ` Hannes Frederic Sowa
@ 2013-08-10 16:16                 ` Hannes Frederic Sowa
  2013-08-12  5:54                   ` Steffen Klassert
  0 siblings, 1 reply; 16+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-10 16:16 UTC (permalink / raw)
  To: Eric Dumazet, Steffen Klassert, netdev, vi0oss

On Fri, Aug 09, 2013 at 01:06:20AM +0200, Hannes Frederic Sowa wrote:
> I will check if the skb->encapsulated bit could help. Actually I don't
> know what the correct behaviour for error reporting should be in the end,
> maybe: dispatch packet back to tunnel interface, let tunnel interface
> decapsulate the inner packet and use this inner header to inform the
> original socket about the error.

Seems skb->encapsulated helps, but I still have to wire it up for the ipv6
tunnels.

I just prototyped this patch, but I fear I now introduced a dependency
from core xfrm to ipv6, which I would like to have prevented (this would
even happen if I put xfrm_local_error in a header file). Is this actually
a problem? I fear so. The other way would be to put the local_error
handler as function pointers somewhere reachable from struct sock.

[PATCH RFC] xfrm: make local error reporting more robust

In xfrm4 and xfrm6 we need to take care about sockets of the other
address family. This could happen because a 6in4 or 4in6 tunnel could
get protected by ipsec.

If the skb->encapsulation bit is set, use the addresses of the inner ip
header instead of the outer one. This is currently working for IPv4 and
will be wired up for IPv6 in a future patch.

(intentionally removed signed-off)

Reported-by: <vi0oss@gmail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
---
 include/net/xfrm.h      |  1 +
 net/ipv4/xfrm4_output.c |  3 +--
 net/ipv6/xfrm6_output.c | 18 +++++-------------
 net/xfrm/xfrm_output.c  | 26 ++++++++++++++++++++++++++
 4 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 94ce082..11b73cb 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1476,6 +1476,7 @@ extern int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi,
 extern int xfrm_input_resume(struct sk_buff *skb, int nexthdr);
 extern int xfrm_output_resume(struct sk_buff *skb, int err);
 extern int xfrm_output(struct sk_buff *skb);
+extern void xfrm_local_error(struct sk_buff *skb, unsigned int mtu);
 extern int xfrm_inner_extract_output(struct xfrm_state *x, struct sk_buff *skb);
 extern int xfrm4_extract_header(struct sk_buff *skb);
 extern int xfrm4_extract_input(struct xfrm_state *x, struct sk_buff *skb);
diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
index 327a617..cfc386f 100644
--- a/net/ipv4/xfrm4_output.c
+++ b/net/ipv4/xfrm4_output.c
@@ -33,8 +33,7 @@ static int xfrm4_tunnel_check_size(struct sk_buff *skb)
 	mtu = dst_mtu(dst);
 	if (skb->len > mtu) {
 		if (skb->sk)
-			ip_local_error(skb->sk, EMSGSIZE, ip_hdr(skb)->daddr,
-				       inet_sk(skb->sk)->inet_dport, mtu);
+			xfrm_local_error(skb, mtu);
 		else
 			icmp_send(skb, ICMP_DEST_UNREACH,
 				  ICMP_FRAG_NEEDED, htonl(mtu));
diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
index 8755a30..7970965 100644
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -48,23 +48,15 @@ static void xfrm6_local_rxpmtu(struct sk_buff *skb, u32 mtu)
 	struct flowi6 fl6;
 	struct sock *sk = skb->sk;
 
+	if (sk->sk_family != AF_INET6)
+		return;
+
 	fl6.flowi6_oif = sk->sk_bound_dev_if;
 	fl6.daddr = ipv6_hdr(skb)->daddr;
 
 	ipv6_local_rxpmtu(sk, &fl6, mtu);
 }
 
-static void xfrm6_local_error(struct sk_buff *skb, u32 mtu)
-{
-	struct flowi6 fl6;
-	struct sock *sk = skb->sk;
-
-	fl6.fl6_dport = inet_sk(sk)->inet_dport;
-	fl6.daddr = ipv6_hdr(skb)->daddr;
-
-	ipv6_local_error(sk, EMSGSIZE, &fl6, mtu);
-}
-
 static int xfrm6_tunnel_check_size(struct sk_buff *skb)
 {
 	int mtu, ret = 0;
@@ -80,7 +72,7 @@ static int xfrm6_tunnel_check_size(struct sk_buff *skb)
 		if (xfrm6_local_dontfrag(skb))
 			xfrm6_local_rxpmtu(skb, mtu);
 		else if (skb->sk)
-			xfrm6_local_error(skb, mtu);
+			xfrm_local_error(skb, mtu);
 		else
 			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
 		ret = -EMSGSIZE;
@@ -142,7 +134,7 @@ static int __xfrm6_output(struct sk_buff *skb)
 		xfrm6_local_rxpmtu(skb, mtu);
 		return -EMSGSIZE;
 	} else if (!skb->local_df && skb->len > mtu && skb->sk) {
-		xfrm6_local_error(skb, mtu);
+		xfrm_local_error(skb, mtu);
 		return -EMSGSIZE;
 	}
 
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index eb4a842..2a16cb8 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -214,5 +214,31 @@ int xfrm_inner_extract_output(struct xfrm_state *x, struct sk_buff *skb)
 	return inner_mode->afinfo->extract_output(x, skb);
 }
 
+void xfrm_local_error(struct sk_buff *skb, unsigned int mtu)
+{
+	switch (skb->sk->sk_family) {
+	case AF_INET: {
+		struct iphdr *hdr = skb->encapsulation ? inner_ip_hdr(skb) :
+			ip_hdr(skb);
+		ip_local_error(skb->sk, EMSGSIZE, hdr->daddr,
+			       inet_sk(skb->sk)->inet_dport, mtu);
+		return;
+	}
+#if IS_ENABLED(CONFIG_IPV6)
+	case AF_INET6: {
+		struct flowi6 fl6;
+		struct ipv6hdr *hdr = skb->encapsulation ? inner_ipv6_hdr(skb) :
+			inner_ipv6_hdr(skb);
+		fl6.fl6_dport = inet_sk(skb->sk)->inet_dport;
+		fl6.daddr = hdr->daddr;
+		ipv6_local_error(skb->sk, EMSGSIZE, &fl6, mtu);
+		return;
+	}
+#endif
+	}
+	BUG();
+}
+
 EXPORT_SYMBOL_GPL(xfrm_output);
 EXPORT_SYMBOL_GPL(xfrm_inner_extract_output);
+EXPORT_SYMBOL_GPL(xfrm_local_error);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family
  2013-08-10 16:16                 ` Hannes Frederic Sowa
@ 2013-08-12  5:54                   ` Steffen Klassert
  2013-08-13  0:48                     ` [PATCH net-next] xfrm: make local error reporting more robust Hannes Frederic Sowa
  2013-08-13  1:56                     ` [PATCH net-next v2] " Hannes Frederic Sowa
  0 siblings, 2 replies; 16+ messages in thread
From: Steffen Klassert @ 2013-08-12  5:54 UTC (permalink / raw)
  To: Eric Dumazet, netdev, vi0oss

On Sat, Aug 10, 2013 at 06:16:29PM +0200, Hannes Frederic Sowa wrote:
> 
> Seems skb->encapsulated helps, but I still have to wire it up for the ipv6
> tunnels.
> 
> I just prototyped this patch, but I fear I now introduced a dependency
> from core xfrm to ipv6, which I would like to have prevented (this would
> even happen if I put xfrm_local_error in a header file). Is this actually
> a problem? I fear so. The other way would be to put the local_error
> handler as function pointers somewhere reachable from struct sock.
> 

Maybe we should put a local_error() function pointer to struct
xfrm_state_afinfo and call it via inner_mode->afinfo->local_error().

This should always call the right local_error function and we
would not need to touch generic networking code to fix it.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net-next] xfrm: make local error reporting more robust
  2013-08-12  5:54                   ` Steffen Klassert
@ 2013-08-13  0:48                     ` Hannes Frederic Sowa
  2013-08-13  1:56                     ` [PATCH net-next v2] " Hannes Frederic Sowa
  1 sibling, 0 replies; 16+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-13  0:48 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: Eric Dumazet, netdev, vi0oss

On Mon, Aug 12, 2013 at 07:54:14AM +0200, Steffen Klassert wrote:
> On Sat, Aug 10, 2013 at 06:16:29PM +0200, Hannes Frederic Sowa wrote:
> > 
> > Seems skb->encapsulated helps, but I still have to wire it up for the ipv6
> > tunnels.
> > 
> > I just prototyped this patch, but I fear I now introduced a dependency
> > from core xfrm to ipv6, which I would like to have prevented (this would
> > even happen if I put xfrm_local_error in a header file). Is this actually
> > a problem? I fear so. The other way would be to put the local_error
> > handler as function pointers somewhere reachable from struct sock.
> > 
> 
> Maybe we should put a local_error() function pointer to struct
> xfrm_state_afinfo and call it via inner_mode->afinfo->local_error().
> 
> This should always call the right local_error function and we
> would not need to touch generic networking code to fix it.

afinfo was a great hint. But I could not use inner_mode, because it
would still point to the wrong afinfo instance.

I tested this patch with vi0ss's great test script:
<https://gist.github.com/vi/5640512>

I will look at the skb->encapsulation for IPv6 logic again, as soon
as I understand the side effects when I introduce this. This will then
correct the destination address when sending back ipv6 errors.

Steffen, if you are fine with this patch, do you think it is a candidate
for stable?

[PATCH net-next] xfrm: make local error reporting more robust

In xfrm4 and xfrm6 we need to take care about sockets of the other
address family. This could happen because a 6in4 or 4in6 tunnel could
get protected by ipsec.

Because we don't want to have a run-time dependency on ipv6 when only
using ipv4 xfrm we have to embed a pointer to the correct local_error
function in xfrm_state_afinet and look it up when returning an error
depending on the socket address family.

Thanks to vi0ss for the great bug report:
<https://bugzilla.kernel.org/show_bug.cgi?id=58691>

Reported-by: <vi0oss@gmail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/net/xfrm.h      |  6 ++++++
 net/ipv4/xfrm4_output.c | 12 ++++++++++--
 net/ipv4/xfrm4_state.c  |  1 +
 net/ipv6/xfrm6_output.c |  4 ++--
 net/ipv6/xfrm6_state.c  |  1 +
 net/xfrm/xfrm_output.c  | 12 ++++++++++++
 net/xfrm/xfrm_state.c   |  7 ++-----
 7 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 94ce082..e823786 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -341,10 +341,13 @@ struct xfrm_state_afinfo {
 						  struct sk_buff *skb);
 	int			(*transport_finish)(struct sk_buff *skb,
 						    int async);
+	void			(*local_error)(struct sk_buff *skb, u32 mtu);
 };
 
 extern int xfrm_state_register_afinfo(struct xfrm_state_afinfo *afinfo);
 extern int xfrm_state_unregister_afinfo(struct xfrm_state_afinfo *afinfo);
+extern struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned int family);
+extern void xfrm_state_put_afinfo(struct xfrm_state_afinfo *afinfo);
 
 extern void xfrm_state_delete_tunnel(struct xfrm_state *x);
 
@@ -1477,6 +1480,7 @@ extern int xfrm_input_resume(struct sk_buff *skb, int nexthdr);
 extern int xfrm_output_resume(struct sk_buff *skb, int err);
 extern int xfrm_output(struct sk_buff *skb);
 extern int xfrm_inner_extract_output(struct xfrm_state *x, struct sk_buff *skb);
+extern void xfrm_local_error(struct sk_buff *skb, int mtu);
 extern int xfrm4_extract_header(struct sk_buff *skb);
 extern int xfrm4_extract_input(struct xfrm_state *x, struct sk_buff *skb);
 extern int xfrm4_rcv_encap(struct sk_buff *skb, int nexthdr, __be32 spi,
@@ -1497,6 +1501,7 @@ extern int xfrm4_tunnel_register(struct xfrm_tunnel *handler, unsigned short fam
 extern int xfrm4_tunnel_deregister(struct xfrm_tunnel *handler, unsigned short family);
 extern int xfrm4_mode_tunnel_input_register(struct xfrm_tunnel *handler);
 extern int xfrm4_mode_tunnel_input_deregister(struct xfrm_tunnel *handler);
+extern void xfrm4_local_error(struct sk_buff *skb, u32 mtu);
 extern int xfrm6_extract_header(struct sk_buff *skb);
 extern int xfrm6_extract_input(struct xfrm_state *x, struct sk_buff *skb);
 extern int xfrm6_rcv_spi(struct sk_buff *skb, int nexthdr, __be32 spi);
@@ -1514,6 +1519,7 @@ extern int xfrm6_output(struct sk_buff *skb);
 extern int xfrm6_output_finish(struct sk_buff *skb);
 extern int xfrm6_find_1stfragopt(struct xfrm_state *x, struct sk_buff *skb,
 				 u8 **prevhdr);
+extern void xfrm6_local_error(struct sk_buff *skb, u32 mtu);
 
 #ifdef CONFIG_XFRM
 extern int xfrm4_udp_encap_rcv(struct sock *sk, struct sk_buff *skb);
diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
index 327a617..7a5491f 100644
--- a/net/ipv4/xfrm4_output.c
+++ b/net/ipv4/xfrm4_output.c
@@ -33,8 +33,7 @@ static int xfrm4_tunnel_check_size(struct sk_buff *skb)
 	mtu = dst_mtu(dst);
 	if (skb->len > mtu) {
 		if (skb->sk)
-			ip_local_error(skb->sk, EMSGSIZE, ip_hdr(skb)->daddr,
-				       inet_sk(skb->sk)->inet_dport, mtu);
+			xfrm_local_error(skb, mtu);
 		else
 			icmp_send(skb, ICMP_DEST_UNREACH,
 				  ICMP_FRAG_NEEDED, htonl(mtu));
@@ -99,3 +98,12 @@ int xfrm4_output(struct sk_buff *skb)
 			    x->outer_mode->afinfo->output_finish,
 			    !(IPCB(skb)->flags & IPSKB_REROUTED));
 }
+
+void xfrm4_local_error(struct sk_buff *skb, u32 mtu)
+{
+	struct iphdr *hdr;
+
+	hdr = skb->encapsulation ? inner_ip_hdr(skb) : ip_hdr(skb);
+	ip_local_error(skb->sk, EMSGSIZE, hdr->daddr,
+		       inet_sk(skb->sk)->inet_dport, mtu);
+}
diff --git a/net/ipv4/xfrm4_state.c b/net/ipv4/xfrm4_state.c
index 9258e75..0b2a064 100644
--- a/net/ipv4/xfrm4_state.c
+++ b/net/ipv4/xfrm4_state.c
@@ -83,6 +83,7 @@ static struct xfrm_state_afinfo xfrm4_state_afinfo = {
 	.extract_input		= xfrm4_extract_input,
 	.extract_output		= xfrm4_extract_output,
 	.transport_finish	= xfrm4_transport_finish,
+	.local_error		= xfrm4_local_error,
 };
 
 void __init xfrm4_state_init(void)
diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
index 8755a30..f6f9110 100644
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -54,7 +54,7 @@ static void xfrm6_local_rxpmtu(struct sk_buff *skb, u32 mtu)
 	ipv6_local_rxpmtu(sk, &fl6, mtu);
 }
 
-static void xfrm6_local_error(struct sk_buff *skb, u32 mtu)
+void xfrm6_local_error(struct sk_buff *skb, u32 mtu)
 {
 	struct flowi6 fl6;
 	struct sock *sk = skb->sk;
@@ -80,7 +80,7 @@ static int xfrm6_tunnel_check_size(struct sk_buff *skb)
 		if (xfrm6_local_dontfrag(skb))
 			xfrm6_local_rxpmtu(skb, mtu);
 		else if (skb->sk)
-			xfrm6_local_error(skb, mtu);
+			xfrm_local_error(skb, mtu);
 		else
 			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
 		ret = -EMSGSIZE;
diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c
index d8c70b8..3fc9701 100644
--- a/net/ipv6/xfrm6_state.c
+++ b/net/ipv6/xfrm6_state.c
@@ -183,6 +183,7 @@ static struct xfrm_state_afinfo xfrm6_state_afinfo = {
 	.extract_input		= xfrm6_extract_input,
 	.extract_output		= xfrm6_extract_output,
 	.transport_finish	= xfrm6_transport_finish,
+	.local_error		= xfrm6_local_error,
 };
 
 int __init xfrm6_state_init(void)
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index eb4a842..c445a2f 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -214,5 +214,17 @@ int xfrm_inner_extract_output(struct xfrm_state *x, struct sk_buff *skb)
 	return inner_mode->afinfo->extract_output(x, skb);
 }
 
+void xfrm_local_error(struct sk_buff *skb, int mtu)
+{
+	struct xfrm_state_afinfo *afinfo;
+
+	afinfo = xfrm_state_get_afinfo(skb->sk->sk_family);
+	if (!afinfo)
+		return;
+
+	afinfo->local_error(skb, mtu);
+	xfrm_state_put_afinfo(afinfo);
+}
+
 EXPORT_SYMBOL_GPL(xfrm_output);
 EXPORT_SYMBOL_GPL(xfrm_inner_extract_output);
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 78f66fa..54c0acd 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -39,9 +39,6 @@ static DEFINE_SPINLOCK(xfrm_state_lock);
 
 static unsigned int xfrm_state_hashmax __read_mostly = 1 * 1024 * 1024;
 
-static struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned int family);
-static void xfrm_state_put_afinfo(struct xfrm_state_afinfo *afinfo);
-
 static inline unsigned int xfrm_dst_hash(struct net *net,
 					 const xfrm_address_t *daddr,
 					 const xfrm_address_t *saddr,
@@ -1860,7 +1857,7 @@ int xfrm_state_unregister_afinfo(struct xfrm_state_afinfo *afinfo)
 }
 EXPORT_SYMBOL(xfrm_state_unregister_afinfo);
 
-static struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned int family)
+struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned int family)
 {
 	struct xfrm_state_afinfo *afinfo;
 	if (unlikely(family >= NPROTO))
@@ -1872,7 +1869,7 @@ static struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned int family)
 	return afinfo;
 }
 
-static void xfrm_state_put_afinfo(struct xfrm_state_afinfo *afinfo)
+void xfrm_state_put_afinfo(struct xfrm_state_afinfo *afinfo)
 {
 	rcu_read_unlock();
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next v2] xfrm: make local error reporting more robust
  2013-08-12  5:54                   ` Steffen Klassert
  2013-08-13  0:48                     ` [PATCH net-next] xfrm: make local error reporting more robust Hannes Frederic Sowa
@ 2013-08-13  1:56                     ` Hannes Frederic Sowa
  2013-08-13 23:35                       ` David Miller
  1 sibling, 1 reply; 16+ messages in thread
From: Hannes Frederic Sowa @ 2013-08-13  1:56 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: Eric Dumazet, netdev, vi0oss

On Mon, Aug 12, 2013 at 07:54:14AM +0200, Steffen Klassert wrote:
> On Sat, Aug 10, 2013 at 06:16:29PM +0200, Hannes Frederic Sowa wrote:
> > 
> > Seems skb->encapsulated helps, but I still have to wire it up for the ipv6
> > tunnels.
> > 
> > I just prototyped this patch, but I fear I now introduced a dependency
> > from core xfrm to ipv6, which I would like to have prevented (this would
> > even happen if I put xfrm_local_error in a header file). Is this actually
> > a problem? I fear so. The other way would be to put the local_error
> > handler as function pointers somewhere reachable from struct sock.
> > 
> 
> Maybe we should put a local_error() function pointer to struct
> xfrm_state_afinfo and call it via inner_mode->afinfo->local_error().
> 
> This should always call the right local_error function and we
> would not need to touch generic networking code to fix it.

Sorry, had to do a v2, because I missed two more unsafe skb->sk dereferences.
I will post a further one (unsafe determination of mtu) as a seperate patch
(needs its own commit message).

[PATCH net-next v2] xfrm: make local error reporting more robust

In xfrm4 and xfrm6 we need to take care about sockets of the other
address family. This could happen because a 6in4 or 4in6 tunnel could
get protected by ipsec.

Because we don't want to have a run-time dependency on ipv6 when only
using ipv4 xfrm we have to embed a pointer to the correct local_error
function in xfrm_state_afinet and look it up when returning an error
depending on the socket address family.

Thanks to vi0ss for the great bug report:
<https://bugzilla.kernel.org/show_bug.cgi?id=58691>

v2:
a) fix two more unsafe interpretations of skb->sk as ipv6 socket
   (xfrm6_local_dontfrag and __xfrm6_output)

Reported-by: <vi0oss@gmail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/net/xfrm.h      |  6 ++++++
 net/ipv4/xfrm4_output.c | 12 ++++++++++--
 net/ipv4/xfrm4_state.c  |  1 +
 net/ipv6/xfrm6_output.c | 10 ++++++----
 net/ipv6/xfrm6_state.c  |  1 +
 net/xfrm/xfrm_output.c  | 12 ++++++++++++
 net/xfrm/xfrm_state.c   |  7 ++-----
 7 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 94ce082..e823786 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -341,10 +341,13 @@ struct xfrm_state_afinfo {
 						  struct sk_buff *skb);
 	int			(*transport_finish)(struct sk_buff *skb,
 						    int async);
+	void			(*local_error)(struct sk_buff *skb, u32 mtu);
 };
 
 extern int xfrm_state_register_afinfo(struct xfrm_state_afinfo *afinfo);
 extern int xfrm_state_unregister_afinfo(struct xfrm_state_afinfo *afinfo);
+extern struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned int family);
+extern void xfrm_state_put_afinfo(struct xfrm_state_afinfo *afinfo);
 
 extern void xfrm_state_delete_tunnel(struct xfrm_state *x);
 
@@ -1477,6 +1480,7 @@ extern int xfrm_input_resume(struct sk_buff *skb, int nexthdr);
 extern int xfrm_output_resume(struct sk_buff *skb, int err);
 extern int xfrm_output(struct sk_buff *skb);
 extern int xfrm_inner_extract_output(struct xfrm_state *x, struct sk_buff *skb);
+extern void xfrm_local_error(struct sk_buff *skb, int mtu);
 extern int xfrm4_extract_header(struct sk_buff *skb);
 extern int xfrm4_extract_input(struct xfrm_state *x, struct sk_buff *skb);
 extern int xfrm4_rcv_encap(struct sk_buff *skb, int nexthdr, __be32 spi,
@@ -1497,6 +1501,7 @@ extern int xfrm4_tunnel_register(struct xfrm_tunnel *handler, unsigned short fam
 extern int xfrm4_tunnel_deregister(struct xfrm_tunnel *handler, unsigned short family);
 extern int xfrm4_mode_tunnel_input_register(struct xfrm_tunnel *handler);
 extern int xfrm4_mode_tunnel_input_deregister(struct xfrm_tunnel *handler);
+extern void xfrm4_local_error(struct sk_buff *skb, u32 mtu);
 extern int xfrm6_extract_header(struct sk_buff *skb);
 extern int xfrm6_extract_input(struct xfrm_state *x, struct sk_buff *skb);
 extern int xfrm6_rcv_spi(struct sk_buff *skb, int nexthdr, __be32 spi);
@@ -1514,6 +1519,7 @@ extern int xfrm6_output(struct sk_buff *skb);
 extern int xfrm6_output_finish(struct sk_buff *skb);
 extern int xfrm6_find_1stfragopt(struct xfrm_state *x, struct sk_buff *skb,
 				 u8 **prevhdr);
+extern void xfrm6_local_error(struct sk_buff *skb, u32 mtu);
 
 #ifdef CONFIG_XFRM
 extern int xfrm4_udp_encap_rcv(struct sock *sk, struct sk_buff *skb);
diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
index 327a617..7a5491f 100644
--- a/net/ipv4/xfrm4_output.c
+++ b/net/ipv4/xfrm4_output.c
@@ -33,8 +33,7 @@ static int xfrm4_tunnel_check_size(struct sk_buff *skb)
 	mtu = dst_mtu(dst);
 	if (skb->len > mtu) {
 		if (skb->sk)
-			ip_local_error(skb->sk, EMSGSIZE, ip_hdr(skb)->daddr,
-				       inet_sk(skb->sk)->inet_dport, mtu);
+			xfrm_local_error(skb, mtu);
 		else
 			icmp_send(skb, ICMP_DEST_UNREACH,
 				  ICMP_FRAG_NEEDED, htonl(mtu));
@@ -99,3 +98,12 @@ int xfrm4_output(struct sk_buff *skb)
 			    x->outer_mode->afinfo->output_finish,
 			    !(IPCB(skb)->flags & IPSKB_REROUTED));
 }
+
+void xfrm4_local_error(struct sk_buff *skb, u32 mtu)
+{
+	struct iphdr *hdr;
+
+	hdr = skb->encapsulation ? inner_ip_hdr(skb) : ip_hdr(skb);
+	ip_local_error(skb->sk, EMSGSIZE, hdr->daddr,
+		       inet_sk(skb->sk)->inet_dport, mtu);
+}
diff --git a/net/ipv4/xfrm4_state.c b/net/ipv4/xfrm4_state.c
index 9258e75..0b2a064 100644
--- a/net/ipv4/xfrm4_state.c
+++ b/net/ipv4/xfrm4_state.c
@@ -83,6 +83,7 @@ static struct xfrm_state_afinfo xfrm4_state_afinfo = {
 	.extract_input		= xfrm4_extract_input,
 	.extract_output		= xfrm4_extract_output,
 	.transport_finish	= xfrm4_transport_finish,
+	.local_error		= xfrm4_local_error,
 };
 
 void __init xfrm4_state_init(void)
diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
index 8755a30..b64fff3 100644
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -34,8 +34,10 @@ static int xfrm6_local_dontfrag(struct sk_buff *skb)
 	struct sock *sk = skb->sk;
 
 	if (sk) {
-		proto = sk->sk_protocol;
+		if (sk->sk_family != AF_INET6)
+			return 0;
 
+		proto = sk->sk_protocol;
 		if (proto == IPPROTO_UDP || proto == IPPROTO_RAW)
 			return inet6_sk(sk)->dontfrag;
 	}
@@ -54,7 +56,7 @@ static void xfrm6_local_rxpmtu(struct sk_buff *skb, u32 mtu)
 	ipv6_local_rxpmtu(sk, &fl6, mtu);
 }
 
-static void xfrm6_local_error(struct sk_buff *skb, u32 mtu)
+void xfrm6_local_error(struct sk_buff *skb, u32 mtu)
 {
 	struct flowi6 fl6;
 	struct sock *sk = skb->sk;
@@ -80,7 +82,7 @@ static int xfrm6_tunnel_check_size(struct sk_buff *skb)
 		if (xfrm6_local_dontfrag(skb))
 			xfrm6_local_rxpmtu(skb, mtu);
 		else if (skb->sk)
-			xfrm6_local_error(skb, mtu);
+			xfrm_local_error(skb, mtu);
 		else
 			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
 		ret = -EMSGSIZE;
@@ -142,7 +144,7 @@ static int __xfrm6_output(struct sk_buff *skb)
 		xfrm6_local_rxpmtu(skb, mtu);
 		return -EMSGSIZE;
 	} else if (!skb->local_df && skb->len > mtu && skb->sk) {
-		xfrm6_local_error(skb, mtu);
+		xfrm_local_error(skb, mtu);
 		return -EMSGSIZE;
 	}
 
diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c
index d8c70b8..3fc9701 100644
--- a/net/ipv6/xfrm6_state.c
+++ b/net/ipv6/xfrm6_state.c
@@ -183,6 +183,7 @@ static struct xfrm_state_afinfo xfrm6_state_afinfo = {
 	.extract_input		= xfrm6_extract_input,
 	.extract_output		= xfrm6_extract_output,
 	.transport_finish	= xfrm6_transport_finish,
+	.local_error		= xfrm6_local_error,
 };
 
 int __init xfrm6_state_init(void)
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index eb4a842..c445a2f 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -214,5 +214,17 @@ int xfrm_inner_extract_output(struct xfrm_state *x, struct sk_buff *skb)
 	return inner_mode->afinfo->extract_output(x, skb);
 }
 
+void xfrm_local_error(struct sk_buff *skb, int mtu)
+{
+	struct xfrm_state_afinfo *afinfo;
+
+	afinfo = xfrm_state_get_afinfo(skb->sk->sk_family);
+	if (!afinfo)
+		return;
+
+	afinfo->local_error(skb, mtu);
+	xfrm_state_put_afinfo(afinfo);
+}
+
 EXPORT_SYMBOL_GPL(xfrm_output);
 EXPORT_SYMBOL_GPL(xfrm_inner_extract_output);
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 78f66fa..54c0acd 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -39,9 +39,6 @@ static DEFINE_SPINLOCK(xfrm_state_lock);
 
 static unsigned int xfrm_state_hashmax __read_mostly = 1 * 1024 * 1024;
 
-static struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned int family);
-static void xfrm_state_put_afinfo(struct xfrm_state_afinfo *afinfo);
-
 static inline unsigned int xfrm_dst_hash(struct net *net,
 					 const xfrm_address_t *daddr,
 					 const xfrm_address_t *saddr,
@@ -1860,7 +1857,7 @@ int xfrm_state_unregister_afinfo(struct xfrm_state_afinfo *afinfo)
 }
 EXPORT_SYMBOL(xfrm_state_unregister_afinfo);
 
-static struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned int family)
+struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned int family)
 {
 	struct xfrm_state_afinfo *afinfo;
 	if (unlikely(family >= NPROTO))
@@ -1872,7 +1869,7 @@ static struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned int family)
 	return afinfo;
 }
 
-static void xfrm_state_put_afinfo(struct xfrm_state_afinfo *afinfo)
+void xfrm_state_put_afinfo(struct xfrm_state_afinfo *afinfo)
 {
 	rcu_read_unlock();
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2] xfrm: make local error reporting more robust
  2013-08-13  1:56                     ` [PATCH net-next v2] " Hannes Frederic Sowa
@ 2013-08-13 23:35                       ` David Miller
  0 siblings, 0 replies; 16+ messages in thread
From: David Miller @ 2013-08-13 23:35 UTC (permalink / raw)
  To: hannes; +Cc: steffen.klassert, eric.dumazet, netdev, vi0oss

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Tue, 13 Aug 2013 03:56:44 +0200

> On Mon, Aug 12, 2013 at 07:54:14AM +0200, Steffen Klassert wrote:
>> On Sat, Aug 10, 2013 at 06:16:29PM +0200, Hannes Frederic Sowa wrote:
>> > 
>> > Seems skb->encapsulated helps, but I still have to wire it up for the ipv6
>> > tunnels.
>> > 
>> > I just prototyped this patch, but I fear I now introduced a dependency
>> > from core xfrm to ipv6, which I would like to have prevented (this would
>> > even happen if I put xfrm_local_error in a header file). Is this actually
>> > a problem? I fear so. The other way would be to put the local_error
>> > handler as function pointers somewhere reachable from struct sock.
>> > 
>> 
>> Maybe we should put a local_error() function pointer to struct
>> xfrm_state_afinfo and call it via inner_mode->afinfo->local_error().
>> 
>> This should always call the right local_error function and we
>> would not need to touch generic networking code to fix it.
> 
> Sorry, had to do a v2, because I missed two more unsafe skb->sk dereferences.
> I will post a further one (unsafe determination of mtu) as a seperate patch
> (needs its own commit message).
> 
> [PATCH net-next v2] xfrm: make local error reporting more robust

FWIW, this looks fine to me, and I hope Steffen will take care of
it soon.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2013-08-13 23:35 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-29 14:50 [PATCH RFC] xfrm{4,6}: only report errors back to local sockets if we don't cross address family Hannes Frederic Sowa
2013-07-30  8:21 ` Steffen Klassert
2013-07-30  8:30   ` Hannes Frederic Sowa
2013-07-30 10:26     ` Steffen Klassert
2013-07-30 10:40       ` Hannes Frederic Sowa
2013-07-31  0:01         ` David Miller
2013-08-01  8:11       ` Hannes Frederic Sowa
2013-08-01 10:05         ` Steffen Klassert
2013-08-08 22:44           ` Hannes Frederic Sowa
2013-08-08 22:57             ` Eric Dumazet
2013-08-08 23:06               ` Hannes Frederic Sowa
2013-08-10 16:16                 ` Hannes Frederic Sowa
2013-08-12  5:54                   ` Steffen Klassert
2013-08-13  0:48                     ` [PATCH net-next] xfrm: make local error reporting more robust Hannes Frederic Sowa
2013-08-13  1:56                     ` [PATCH net-next v2] " Hannes Frederic Sowa
2013-08-13 23:35                       ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).