netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net v2] ipv6: route: purge exception on removal
@ 2019-02-20 17:18 Paolo Abeni
  2019-02-21 15:10 ` David Ahern
  2019-02-22 19:46 ` David Miller
  0 siblings, 2 replies; 5+ messages in thread
From: Paolo Abeni @ 2019-02-20 17:18 UTC (permalink / raw)
  To: netdev; +Cc: David Ahern, David S. Miller

When a netdevice is unregistered, we flush the relevant exception
via rt6_sync_down_dev() -> fib6_ifdown() -> fib6_del() -> fib6_del_route().

Finally, we end-up calling rt6_remove_exception(), where we release
the relevant dst, while we keep the references to the related fib6_info and
dev. Such references should be released later when the dst will be
destroyed.

There are a number of caches that can keep the exception around for an
unlimited amount of time - namely dst_cache, possibly even socket cache.
As a result device registration may hang, as demonstrated by this script:

ip netns add cl
ip netns add rt
ip netns add srv
ip netns exec rt sysctl -w net.ipv6.conf.all.forwarding=1

ip link add name cl_veth type veth peer name cl_rt_veth
ip link set dev cl_veth netns cl
ip -n cl link set dev cl_veth up
ip -n cl addr add dev cl_veth 2001::2/64
ip -n cl route add default via 2001::1

ip -n cl link add tunv6 type ip6tnl mode ip6ip6 local 2001::2 remote 2002::1 hoplimit 64 dev cl_veth
ip -n cl link set tunv6 up
ip -n cl addr add 2013::2/64 dev tunv6

ip link set dev cl_rt_veth netns rt
ip -n rt link set dev cl_rt_veth up
ip -n rt addr add dev cl_rt_veth 2001::1/64

ip link add name rt_srv_veth type veth peer name srv_veth
ip link set dev srv_veth netns srv
ip -n srv link set dev srv_veth up
ip -n srv addr add dev srv_veth 2002::1/64
ip -n srv route add default via 2002::2

ip -n srv link add tunv6 type ip6tnl mode ip6ip6 local 2002::1 remote 2001::2 hoplimit 64 dev srv_veth
ip -n srv link set tunv6 up
ip -n srv addr add 2013::1/64 dev tunv6

ip link set dev rt_srv_veth netns rt
ip -n rt link set dev rt_srv_veth up
ip -n rt addr add dev rt_srv_veth 2002::2/64

ip netns exec srv netserver & sleep 0.1
ip netns exec cl ping6 -c 4 2013::1
ip netns exec cl netperf -H 2013::1 -t TCP_STREAM -l 3 & sleep 1
ip -n rt link set dev rt_srv_veth mtu 1400
wait %2

ip -n cl link del cl_veth

This commit addresses the issue purging all the references held by the
exception at time, as we currently do for e.g. ipv6 pcpu dst entries.

v1 -> v2:
 - re-order the code to avoid accessing dst and net after dst_dev_put()

Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/ipv6/route.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 964491cf3672..bd09abd1fb22 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1274,18 +1274,29 @@ static DEFINE_SPINLOCK(rt6_exception_lock);
 static void rt6_remove_exception(struct rt6_exception_bucket *bucket,
 				 struct rt6_exception *rt6_ex)
 {
+	struct fib6_info *from;
 	struct net *net;
 
 	if (!bucket || !rt6_ex)
 		return;
 
 	net = dev_net(rt6_ex->rt6i->dst.dev);
+	net->ipv6.rt6_stats->fib_rt_cache--;
+
+	/* purge completely the exception to allow releasing the held resources:
+	 * some [sk] cache may keep the dst around for unlimited time
+	 */
+	from = rcu_dereference_protected(rt6_ex->rt6i->from,
+					 lockdep_is_held(&rt6_exception_lock));
+	rcu_assign_pointer(rt6_ex->rt6i->from, NULL);
+	fib6_info_release(from);
+	dst_dev_put(&rt6_ex->rt6i->dst);
+
 	hlist_del_rcu(&rt6_ex->hlist);
 	dst_release(&rt6_ex->rt6i->dst);
 	kfree_rcu(rt6_ex, rcu);
 	WARN_ON_ONCE(!bucket->depth);
 	bucket->depth--;
-	net->ipv6.rt6_stats->fib_rt_cache--;
 }
 
 /* Remove oldest rt6_ex in bucket and free the memory
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net v2] ipv6: route: purge exception on removal
  2019-02-20 17:18 [PATCH net v2] ipv6: route: purge exception on removal Paolo Abeni
@ 2019-02-21 15:10 ` David Ahern
  2019-02-21 17:48   ` Paolo Abeni
  2019-02-21 18:11   ` Stefano Brivio
  2019-02-22 19:46 ` David Miller
  1 sibling, 2 replies; 5+ messages in thread
From: David Ahern @ 2019-02-21 15:10 UTC (permalink / raw)
  To: Paolo Abeni, netdev; +Cc: David S. Miller

On 2/20/19 12:18 PM, Paolo Abeni wrote:
> When a netdevice is unregistered, we flush the relevant exception
> via rt6_sync_down_dev() -> fib6_ifdown() -> fib6_del() -> fib6_del_route().
> 
> Finally, we end-up calling rt6_remove_exception(), where we release
> the relevant dst, while we keep the references to the related fib6_info and
> dev. Such references should be released later when the dst will be
> destroyed.
> 
> There are a number of caches that can keep the exception around for an
> unlimited amount of time - namely dst_cache, possibly even socket cache.
> As a result device registration may hang, as demonstrated by this script:
> 
> ip netns add cl
> ip netns add rt
> ip netns add srv
> ip netns exec rt sysctl -w net.ipv6.conf.all.forwarding=1
> 
> ip link add name cl_veth type veth peer name cl_rt_veth
> ip link set dev cl_veth netns cl
> ip -n cl link set dev cl_veth up
> ip -n cl addr add dev cl_veth 2001::2/64
> ip -n cl route add default via 2001::1
> 
> ip -n cl link add tunv6 type ip6tnl mode ip6ip6 local 2001::2 remote 2002::1 hoplimit 64 dev cl_veth
> ip -n cl link set tunv6 up
> ip -n cl addr add 2013::2/64 dev tunv6
> 
> ip link set dev cl_rt_veth netns rt
> ip -n rt link set dev cl_rt_veth up
> ip -n rt addr add dev cl_rt_veth 2001::1/64
> 
> ip link add name rt_srv_veth type veth peer name srv_veth
> ip link set dev srv_veth netns srv
> ip -n srv link set dev srv_veth up
> ip -n srv addr add dev srv_veth 2002::1/64
> ip -n srv route add default via 2002::2
> 
> ip -n srv link add tunv6 type ip6tnl mode ip6ip6 local 2002::1 remote 2001::2 hoplimit 64 dev srv_veth
> ip -n srv link set tunv6 up
> ip -n srv addr add 2013::1/64 dev tunv6
> 
> ip link set dev rt_srv_veth netns rt
> ip -n rt link set dev rt_srv_veth up
> ip -n rt addr add dev rt_srv_veth 2002::2/64
> 
> ip netns exec srv netserver & sleep 0.1
> ip netns exec cl ping6 -c 4 2013::1
> ip netns exec cl netperf -H 2013::1 -t TCP_STREAM -l 3 & sleep 1
> ip -n rt link set dev rt_srv_veth mtu 1400
> wait %2
> 
> ip -n cl link del cl_veth
> 
> This commit addresses the issue purging all the references held by the
> exception at time, as we currently do for e.g. ipv6 pcpu dst entries.
> 
> v1 -> v2:
>  - re-order the code to avoid accessing dst and net after dst_dev_put()
> 
> Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  net/ipv6/route.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 

I am surprised this was not found by the existing pmtu script which
creates exceptions. Please add this test case to selftests to capture
this specific set of events.

Reviewed-by: David Ahern <dsahern@gmail.com>

Thanks for the resolving.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net v2] ipv6: route: purge exception on removal
  2019-02-21 15:10 ` David Ahern
@ 2019-02-21 17:48   ` Paolo Abeni
  2019-02-21 18:11   ` Stefano Brivio
  1 sibling, 0 replies; 5+ messages in thread
From: Paolo Abeni @ 2019-02-21 17:48 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: David S. Miller

On Thu, 2019-02-21 at 10:10 -0500, David Ahern wrote:
> I am surprised this was not found by the existing pmtu script which
> creates exceptions. Please add this test case to selftests to capture
> this specific set of events.

Sure, I'll have a look at the self-tests. Should I target net (since
the fix is there) or net-next, since it'a new "feature" (test)? 

Thank you for reviewing,

Paolo


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net v2] ipv6: route: purge exception on removal
  2019-02-21 15:10 ` David Ahern
  2019-02-21 17:48   ` Paolo Abeni
@ 2019-02-21 18:11   ` Stefano Brivio
  1 sibling, 0 replies; 5+ messages in thread
From: Stefano Brivio @ 2019-02-21 18:11 UTC (permalink / raw)
  To: David Ahern, Paolo Abeni; +Cc: netdev, David S. Miller

On Thu, 21 Feb 2019 10:10:32 -0500
David Ahern <dsahern@gmail.com> wrote:

> I am surprised this was not found by the existing pmtu script which
> creates exceptions. Please add this test case to selftests to capture
> this specific set of events.

I think the reason is that, to keep the cleanup function minimal, I
just decided to tear down namespaces after tests, and the tests won't
catch this.

Paolo, I guess you could either make cleanup() symmetric with the
setup_routing() function (but then we don't catch issues related to
namespaces teardown), or introduce just two tests (IPv4 and IPv6) doing
this with one existing tunnel setup (and we don't have variability on
device types, but I don't think it's very relevant).

The alternative of doubling every test (with namespace *and* "orderly"
teardown) looks a bit overkill to me.

-- 
Stefano

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net v2] ipv6: route: purge exception on removal
  2019-02-20 17:18 [PATCH net v2] ipv6: route: purge exception on removal Paolo Abeni
  2019-02-21 15:10 ` David Ahern
@ 2019-02-22 19:46 ` David Miller
  1 sibling, 0 replies; 5+ messages in thread
From: David Miller @ 2019-02-22 19:46 UTC (permalink / raw)
  To: pabeni; +Cc: netdev, dsahern

From: Paolo Abeni <pabeni@redhat.com>
Date: Wed, 20 Feb 2019 18:18:12 +0100

> When a netdevice is unregistered, we flush the relevant exception
> via rt6_sync_down_dev() -> fib6_ifdown() -> fib6_del() -> fib6_del_route().
> 
> Finally, we end-up calling rt6_remove_exception(), where we release
> the relevant dst, while we keep the references to the related fib6_info and
> dev. Such references should be released later when the dst will be
> destroyed.
> 
> There are a number of caches that can keep the exception around for an
> unlimited amount of time - namely dst_cache, possibly even socket cache.
> As a result device registration may hang, as demonstrated by this script:
 ...
> Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-02-22 19:46 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-20 17:18 [PATCH net v2] ipv6: route: purge exception on removal Paolo Abeni
2019-02-21 15:10 ` David Ahern
2019-02-21 17:48   ` Paolo Abeni
2019-02-21 18:11   ` Stefano Brivio
2019-02-22 19:46 ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).