All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] ipv4: pmtu fixes
@ 2012-10-08  8:46 Steffen Klassert
  2012-10-08  8:47 ` [PATCH 1/3] ipv4: Always invalidate or update the route on pmtu events Steffen Klassert
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Steffen Klassert @ 2012-10-08  8:46 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

This patchset fixes some issues that came with the routing cache removal.

1) IPsec and others (udp, ipvs) may cache output routes, these routes
need to be invalidated on pmtu events in the same way e.g. tcp socket
cached routes are invalidated. With this we always invalidate or update
(if we already use a nh exeption route) the old route on pmtu events. 

This has the drawback that we may needlessly invalidate an uncached route,
but this fixes all the users that cache routes and pmtu events are rare, so
this should not be a real issue.

2) We create nh exeptions if a user (e.g. tracepath) tries to do pmtu
dicsovery with packets bigger than the output device mtu. The device mtu
is not learned and does not expire, so don't create an exeption route.

3) We report cached pmtu values to userspace even if they are expired.
Fix this by checking for expiration before we report.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] ipv4: Always invalidate or update the route on pmtu events
  2012-10-08  8:46 [PATCH 0/3] ipv4: pmtu fixes Steffen Klassert
@ 2012-10-08  8:47 ` Steffen Klassert
  2012-10-08  8:48 ` [PATCH 2/3] ipv4: Don't create nh exeption when the device mtu is smaller than the reported pmtu Steffen Klassert
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Steffen Klassert @ 2012-10-08  8:47 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

Some protocols, like IPsec still cache routes. So we need to invalidate
the old route on pmtu events to avoid the reuse of stale routes.
We also need to update the mtu and expire time of the route if we already
use a nh exeption route, otherwise we ignore newly learned pmtu values
after the first expiration.

With this patch we always invalidate or update the route on pmtu events.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/route.c |   22 +++++++++++-----------
 1 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index ff62206..90ba835 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -904,22 +904,29 @@ out:	kfree_skb(skb);
 	return 0;
 }
 
-static u32 __ip_rt_update_pmtu(struct rtable *rt, struct flowi4 *fl4, u32 mtu)
+static void __ip_rt_update_pmtu(struct rtable *rt, struct flowi4 *fl4, u32 mtu)
 {
+	struct dst_entry *dst = &rt->dst;
 	struct fib_result res;
 
 	if (mtu < ip_rt_min_pmtu)
 		mtu = ip_rt_min_pmtu;
 
+	if (!rt->rt_pmtu) {
+		dst->obsolete = DST_OBSOLETE_KILL;
+	} else {
+		rt->rt_pmtu = mtu;
+		dst->expires = max(1UL, jiffies + ip_rt_mtu_expires);
+	}
+
 	rcu_read_lock();
-	if (fib_lookup(dev_net(rt->dst.dev), fl4, &res) == 0) {
+	if (fib_lookup(dev_net(dst->dev), fl4, &res) == 0) {
 		struct fib_nh *nh = &FIB_RES_NH(res);
 
 		update_or_create_fnhe(nh, fl4->daddr, 0, mtu,
 				      jiffies + ip_rt_mtu_expires);
 	}
 	rcu_read_unlock();
-	return mtu;
 }
 
 static void ip_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
@@ -929,14 +936,7 @@ static void ip_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
 	struct flowi4 fl4;
 
 	ip_rt_build_flow_key(&fl4, sk, skb);
-	mtu = __ip_rt_update_pmtu(rt, &fl4, mtu);
-
-	if (!rt->rt_pmtu) {
-		dst->obsolete = DST_OBSOLETE_KILL;
-	} else {
-		rt->rt_pmtu = mtu;
-		rt->dst.expires = max(1UL, jiffies + ip_rt_mtu_expires);
-	}
+	__ip_rt_update_pmtu(rt, &fl4, mtu);
 }
 
 void ipv4_update_pmtu(struct sk_buff *skb, struct net *net, u32 mtu,
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] ipv4: Don't create nh exeption when the device mtu is smaller than the reported pmtu
  2012-10-08  8:46 [PATCH 0/3] ipv4: pmtu fixes Steffen Klassert
  2012-10-08  8:47 ` [PATCH 1/3] ipv4: Always invalidate or update the route on pmtu events Steffen Klassert
@ 2012-10-08  8:48 ` Steffen Klassert
  2012-10-08  8:48 ` [PATCH 3/3] ipv4: Don't report stale pmtu values to userspace Steffen Klassert
  2012-10-08 18:47 ` [PATCH 0/3] ipv4: pmtu fixes David Miller
  3 siblings, 0 replies; 8+ messages in thread
From: Steffen Klassert @ 2012-10-08  8:48 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

When a local tool like tracepath tries to send packets bigger than
the device mtu, we create a nh exeption and set the pmtu to device
mtu. The device mtu does not expire, so check if the device mtu is
smaller than the reported pmtu and don't crerate a nh exeption in
that case.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/route.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 90ba835..741df67 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -909,6 +909,9 @@ static void __ip_rt_update_pmtu(struct rtable *rt, struct flowi4 *fl4, u32 mtu)
 	struct dst_entry *dst = &rt->dst;
 	struct fib_result res;
 
+	if (dst->dev->mtu < mtu)
+		return;
+
 	if (mtu < ip_rt_min_pmtu)
 		mtu = ip_rt_min_pmtu;
 
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] ipv4: Don't report stale pmtu values to userspace
  2012-10-08  8:46 [PATCH 0/3] ipv4: pmtu fixes Steffen Klassert
  2012-10-08  8:47 ` [PATCH 1/3] ipv4: Always invalidate or update the route on pmtu events Steffen Klassert
  2012-10-08  8:48 ` [PATCH 2/3] ipv4: Don't create nh exeption when the device mtu is smaller than the reported pmtu Steffen Klassert
@ 2012-10-08  8:48 ` Steffen Klassert
  2012-10-08  9:55   ` Eric Dumazet
  2012-10-08 18:47 ` [PATCH 0/3] ipv4: pmtu fixes David Miller
  3 siblings, 1 reply; 8+ messages in thread
From: Steffen Klassert @ 2012-10-08  8:48 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

We report cached pmtu values even if they are already expired.
Change this to not report these values after they are expired.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/route.c |   17 +++++++++--------
 1 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 741df67..24b52dd 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2187,8 +2187,16 @@ static int rt_fill_info(struct net *net,  __be32 dst, __be32 src,
 	    nla_put_be32(skb, RTA_GATEWAY, rt->rt_gateway))
 		goto nla_put_failure;
 
+	expires = rt->dst.expires;
+	if (expires) {
+		if (time_before(jiffies, expires))
+			expires -= jiffies;
+		else
+			expires = 0;
+	}
+
 	memcpy(metrics, dst_metrics_ptr(&rt->dst), sizeof(metrics));
-	if (rt->rt_pmtu)
+	if (rt->rt_pmtu && expires)
 		metrics[RTAX_MTU - 1] = rt->rt_pmtu;
 	if (rtnetlink_put_metrics(skb, metrics) < 0)
 		goto nla_put_failure;
@@ -2198,13 +2206,6 @@ static int rt_fill_info(struct net *net,  __be32 dst, __be32 src,
 		goto nla_put_failure;
 
 	error = rt->dst.error;
-	expires = rt->dst.expires;
-	if (expires) {
-		if (time_before(jiffies, expires))
-			expires -= jiffies;
-		else
-			expires = 0;
-	}
 
 	if (rt_is_input_route(rt)) {
 		if (nla_put_u32(skb, RTA_IIF, rt->rt_iif))
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] ipv4: Don't report stale pmtu values to userspace
  2012-10-08  8:48 ` [PATCH 3/3] ipv4: Don't report stale pmtu values to userspace Steffen Klassert
@ 2012-10-08  9:55   ` Eric Dumazet
  2012-10-08 10:38     ` Steffen Klassert
  2012-10-08 10:56     ` [PATCH v2 " Steffen Klassert
  0 siblings, 2 replies; 8+ messages in thread
From: Eric Dumazet @ 2012-10-08  9:55 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: David Miller, netdev

On Mon, 2012-10-08 at 10:48 +0200, Steffen Klassert wrote:
> We report cached pmtu values even if they are already expired.
> Change this to not report these values after they are expired.
> 
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> ---
>  net/ipv4/route.c |   17 +++++++++--------
>  1 files changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 741df67..24b52dd 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -2187,8 +2187,16 @@ static int rt_fill_info(struct net *net,  __be32 dst, __be32 src,
>  	    nla_put_be32(skb, RTA_GATEWAY, rt->rt_gateway))
>  		goto nla_put_failure;
>  
> +	expires = rt->dst.expires;
> +	if (expires) {
> +		if (time_before(jiffies, expires))
> +			expires -= jiffies;

Seeing this racy code, we could fix it as well.

jiffies is volatile and can change between the test and subtraction.

So we could in theory get strange result.

I suggest using :

if (expires) {
	unsigned long now = jiffies;

	if (time_before(now, expires))
		expires -= now;
	else
		expires = 0;
}

or 

if (expires)
	expires = max_t(signed long, 0L, expires - jiffies);




> +		else
> +			expires = 0;
> +	}
> +

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] ipv4: Don't report stale pmtu values to userspace
  2012-10-08  9:55   ` Eric Dumazet
@ 2012-10-08 10:38     ` Steffen Klassert
  2012-10-08 10:56     ` [PATCH v2 " Steffen Klassert
  1 sibling, 0 replies; 8+ messages in thread
From: Steffen Klassert @ 2012-10-08 10:38 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev

On Mon, Oct 08, 2012 at 11:55:44AM +0200, Eric Dumazet wrote:
> On Mon, 2012-10-08 at 10:48 +0200, Steffen Klassert wrote:
> > We report cached pmtu values even if they are already expired.
> > Change this to not report these values after they are expired.
> > 
> > Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> > ---
> >  net/ipv4/route.c |   17 +++++++++--------
> >  1 files changed, 9 insertions(+), 8 deletions(-)
> > 
> > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > index 741df67..24b52dd 100644
> > --- a/net/ipv4/route.c
> > +++ b/net/ipv4/route.c
> > @@ -2187,8 +2187,16 @@ static int rt_fill_info(struct net *net,  __be32 dst, __be32 src,
> >  	    nla_put_be32(skb, RTA_GATEWAY, rt->rt_gateway))
> >  		goto nla_put_failure;
> >  
> > +	expires = rt->dst.expires;
> > +	if (expires) {
> > +		if (time_before(jiffies, expires))
> > +			expires -= jiffies;
> 
> Seeing this racy code, we could fix it as well.

Sure, we can :)

> 
> jiffies is volatile and can change between the test and subtraction.
> 
> So we could in theory get strange result.
> 
> I suggest using :
> 
> if (expires) {
> 	unsigned long now = jiffies;
> 
> 	if (time_before(now, expires))
> 		expires -= now;
> 	else
> 		expires = 0;
> }

I'll go with the version above and send an updated patch.

Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 3/3] ipv4: Don't report stale pmtu values to userspace
  2012-10-08  9:55   ` Eric Dumazet
  2012-10-08 10:38     ` Steffen Klassert
@ 2012-10-08 10:56     ` Steffen Klassert
  1 sibling, 0 replies; 8+ messages in thread
From: Steffen Klassert @ 2012-10-08 10:56 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev

We report cached pmtu values even if they are already expired.
Change this to not report these values after they are expired
and fix a race in the expire time calculation, as suggested by
Eric Dumazet.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/route.c |   19 +++++++++++--------
 1 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 741df67..132e0df 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2187,8 +2187,18 @@ static int rt_fill_info(struct net *net,  __be32 dst, __be32 src,
 	    nla_put_be32(skb, RTA_GATEWAY, rt->rt_gateway))
 		goto nla_put_failure;
 
+	expires = rt->dst.expires;
+	if (expires) {
+		unsigned long now = jiffies;
+
+		if (time_before(now, expires))
+			expires -= now;
+		else
+			expires = 0;
+	}
+
 	memcpy(metrics, dst_metrics_ptr(&rt->dst), sizeof(metrics));
-	if (rt->rt_pmtu)
+	if (rt->rt_pmtu && expires)
 		metrics[RTAX_MTU - 1] = rt->rt_pmtu;
 	if (rtnetlink_put_metrics(skb, metrics) < 0)
 		goto nla_put_failure;
@@ -2198,13 +2208,6 @@ static int rt_fill_info(struct net *net,  __be32 dst, __be32 src,
 		goto nla_put_failure;
 
 	error = rt->dst.error;
-	expires = rt->dst.expires;
-	if (expires) {
-		if (time_before(jiffies, expires))
-			expires -= jiffies;
-		else
-			expires = 0;
-	}
 
 	if (rt_is_input_route(rt)) {
 		if (nla_put_u32(skb, RTA_IIF, rt->rt_iif))
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3] ipv4: pmtu fixes
  2012-10-08  8:46 [PATCH 0/3] ipv4: pmtu fixes Steffen Klassert
                   ` (2 preceding siblings ...)
  2012-10-08  8:48 ` [PATCH 3/3] ipv4: Don't report stale pmtu values to userspace Steffen Klassert
@ 2012-10-08 18:47 ` David Miller
  3 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2012-10-08 18:47 UTC (permalink / raw)
  To: steffen.klassert; +Cc: netdev

From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Mon, 8 Oct 2012 10:46:42 +0200

> This patchset fixes some issues that came with the routing cache removal.
> 
> 1) IPsec and others (udp, ipvs) may cache output routes, these routes
> need to be invalidated on pmtu events in the same way e.g. tcp socket
> cached routes are invalidated. With this we always invalidate or update
> (if we already use a nh exeption route) the old route on pmtu events. 
> 
> This has the drawback that we may needlessly invalidate an uncached route,
> but this fixes all the users that cache routes and pmtu events are rare, so
> this should not be a real issue.
> 
> 2) We create nh exeptions if a user (e.g. tracepath) tries to do pmtu
> dicsovery with packets bigger than the output device mtu. The device mtu
> is not learned and does not expire, so don't create an exeption route.
> 
> 3) We report cached pmtu values to userspace even if they are expired.
> Fix this by checking for expiration before we report.

All applied and queued up for -stable, thanks!

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-10-08 18:47 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-08  8:46 [PATCH 0/3] ipv4: pmtu fixes Steffen Klassert
2012-10-08  8:47 ` [PATCH 1/3] ipv4: Always invalidate or update the route on pmtu events Steffen Klassert
2012-10-08  8:48 ` [PATCH 2/3] ipv4: Don't create nh exeption when the device mtu is smaller than the reported pmtu Steffen Klassert
2012-10-08  8:48 ` [PATCH 3/3] ipv4: Don't report stale pmtu values to userspace Steffen Klassert
2012-10-08  9:55   ` Eric Dumazet
2012-10-08 10:38     ` Steffen Klassert
2012-10-08 10:56     ` [PATCH v2 " Steffen Klassert
2012-10-08 18:47 ` [PATCH 0/3] ipv4: pmtu fixes David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.