* [PATCH] net: ipv6: Make address flushing on ifdown optional
@ 2016-02-13 22:23 David Ahern
  2016-02-13 22:25 ` David Ahern
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: David Ahern @ 2016-02-13 22:23 UTC (permalink / raw)
  To: netdev; +Cc: hannes, David Ahern

Currently, all ipv6 addresses are flushed when the interface is configured
down, including global, static addresses:

    $ ip -6 addr add dev eth1 2000:11:1:1::1/64
    $ ip addr show dev eth1
    3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
        link/ether 02:04:11:22:33:01 brd ff:ff:ff:ff:ff:ff
        inet6 2000:11:1:1::1/64 scope global tentative
           valid_lft forever preferred_lft forever
    $ ip link set dev eth1 up
    $ ip link set dev eth1 down
    $ ip addr show dev eth1
    3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
        link/ether 02:04:11:22:33:01 brd ff:ff:ff:ff:ff:ff

Add a new sysctl to make this behavior optional. The new setting defaults to
flushing all addresses to maintain backwards compatibility. When it is set,
global addresses with no expiration time are not flushed on an admin down:

    $ echo 1 > /proc/sys/net/ipv6/conf/eth1/keep_addr_on_down
    $ ip -6 addr add dev eth1 2000:11:1:1::1/64
    $ ip addr show dev eth1
    3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
        link/ether 02:04:11:22:33:01 brd ff:ff:ff:ff:ff:ff
        inet6 2000:11:1:1::1/64 scope global tentative
           valid_lft forever preferred_lft forever
    $ ip link set dev eth1 up
    $ ip link set dev eth1 down
    $ ip addr show dev eth1
    3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
        link/ether 02:04:11:22:33:01 brd ff:ff:ff:ff:ff:ff
        inet6 2000:11:1:1::1/64 scope global
           valid_lft forever preferred_lft forever
        inet6 fe80::4:11ff:fe22:3301/64 scope link
           valid_lft forever preferred_lft forever

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
Dave: per the discussion at netconf tossing this out again. While the
      failure semantics are not ideal it only occurs on GFP_ATOMIC
      memory failures.

v6:
- rebase to 4.5.0-rc2
- flip logic from flush_addr to keep_addr with default 0. makes it easier
  to use the 'all' conf for interfaces.

v5:
- renamed managed to user_managed as requested by Hannes
- handle addrconf_dst_alloc failure and cleanup ifp as noted by Dave
  -- tested by faking allocation failure
- minor ordering changes in addrconf_ifdown() to handle changes under lock

v4:
- rebased to top of tree
- updated to clear all routes on admin down and re-added on admin up
- verified the route tables (main and local) on a link down have *no*
  remnants of the configured, global address. On a link up all routes
  are restored -- multicast, linklocal, local routes and connected.

v3:
- fix local variable ordering and comment style per Dave's comment
- consistency in DEVCONF naming per Brian Haley's comment
- added entry to Documentation/networking/ip-sysctl.txt

v2:
- only keep static addresses as suggested by Hannes
- added new managed flag to track configured addresses
- on ifdown do not remove configured addresses from inet6_addr_lst
- on ifdown reset the TENTATIVE flag and set state to DAD so that DAD is
  redone when link is brought up again


 Documentation/networking/ip-sysctl.txt |   6 ++
 include/linux/ipv6.h                   |   1 +
 include/net/if_inet6.h                 |   1 +
 include/uapi/linux/ipv6.h              |   1 +
 net/ipv6/addrconf.c                    | 128 +++++++++++++++++++++++++++++----
 5 files changed, 122 insertions(+), 15 deletions(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 24ce97f42d35..7ddbbb67f0db 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -1563,6 +1563,12 @@ temp_prefered_lft - INTEGER
 	Preferred lifetime (in seconds) for temporary addresses.
 	Default: 86400 (1 day)
 
+keep_addr_on_down - BOOLEAN
+	Keep all IPv6 addresses on an interface down event. If set static
+	global addresses with no expiration time are not flushed.
+
+	Default: disabled
+
 max_desync_factor - INTEGER
 	Maximum value for DESYNC_FACTOR, which is a random value
 	that ensures that clients don't synchronize with each
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 4b2267e1b7c3..7edc14fb66b6 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -62,6 +62,7 @@ struct ipv6_devconf {
 		struct in6_addr secret;
 	} stable_secret;
 	__s32		use_oif_addrs_only;
+	__s32		keep_addr_on_down;
 	void		*sysctl;
 };
 
diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index 1c8b6820b694..01ba6a286a4b 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -72,6 +72,7 @@ struct inet6_ifaddr {
 	int			regen_count;
 
 	bool			tokenized;
+	bool			user_managed;
 
 	struct rcu_head		rcu;
 	struct in6_addr		peer_addr;
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index ec117b65d5a5..395876060f50 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -176,6 +176,7 @@ enum {
 	DEVCONF_IGNORE_ROUTES_WITH_LINKDOWN,
 	DEVCONF_DROP_UNICAST_IN_L2_MULTICAST,
 	DEVCONF_DROP_UNSOLICITED_NA,
+	DEVCONF_KEEP_ADDR_ON_DOWN,
 	DEVCONF_MAX
 };
 
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index ac0ba9e4e06b..0bcb0f538e54 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -216,6 +216,7 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
 	},
 	.use_oif_addrs_only	= 0,
 	.ignore_routes_with_linkdown = 0,
+	.keep_addr_on_down	= 0,
 };
 
 static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
@@ -260,6 +261,7 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
 	},
 	.use_oif_addrs_only	= 0,
 	.ignore_routes_with_linkdown = 0,
+	.keep_addr_on_down	= 0,
 };
 
 /* Check if a valid qdisc is available */
@@ -962,6 +964,7 @@ ipv6_add_addr(struct inet6_dev *idev, const struct in6_addr *addr,
 	ifa->prefered_lft = prefered_lft;
 	ifa->cstamp = ifa->tstamp = jiffies;
 	ifa->tokenized = false;
+	ifa->user_managed = false;
 
 	ifa->rt = rt;
 
@@ -2701,6 +2704,9 @@ static int inet6_addr_add(struct net *net, int ifindex,
 			    valid_lft, prefered_lft);
 
 	if (!IS_ERR(ifp)) {
+		if (!expires)
+			ifp->user_managed = true;
+
 		if (!(ifa_flags & IFA_F_NOPREFIXROUTE)) {
 			addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev,
 					      expires, flags);
@@ -3168,6 +3174,55 @@ static void addrconf_gre_config(struct net_device *dev)
 }
 #endif
 
+static int fixup_user_managed_addr(struct inet6_dev *idev,
+				   struct inet6_ifaddr *ifp)
+{
+	if (!ifp->rt) {
+		struct rt6_info *rt;
+
+		rt = addrconf_dst_alloc(idev, &ifp->addr, false);
+		if (unlikely(IS_ERR(rt)))
+			return PTR_ERR(rt);
+
+		ifp->rt = rt;
+	}
+
+	if (!(ifp->flags & IFA_F_NOPREFIXROUTE)) {
+		addrconf_prefix_route(&ifp->addr, ifp->prefix_len,
+				      idev->dev, 0, 0);
+	}
+
+	addrconf_dad_start(ifp);
+
+	return 0;
+}
+
+static void addrconf_user_managed_addr(struct net_device *dev)
+{
+	struct inet6_ifaddr *ifp, *tmp;
+	struct inet6_dev *idev;
+
+	idev = __in6_dev_get(dev);
+	if (!idev)
+		return;
+
+	write_lock_bh(&idev->lock);
+
+	list_for_each_entry_safe(ifp, tmp, &idev->addr_list, if_list) {
+		if (ifp->user_managed &&
+		    fixup_user_managed_addr(idev, ifp) < 0) {
+			write_unlock_bh(&idev->lock);
+			ipv6_del_addr(ifp);
+			write_lock_bh(&idev->lock);
+
+			net_info_ratelimited("%s: Failed to add prefix route for address %pI6c; dropping\n",
+					     idev->dev->name, &ifp->addr);
+		}
+	}
+
+	write_unlock_bh(&idev->lock);
+}
+
 static int addrconf_notify(struct notifier_block *this, unsigned long event,
 			   void *ptr)
 {
@@ -3253,6 +3308,8 @@ static int addrconf_notify(struct notifier_block *this, unsigned long event,
 			run_pending = 1;
 		}
 
+		addrconf_user_managed_addr(dev);
+
 		switch (dev->type) {
 #if IS_ENABLED(CONFIG_IPV6_SIT)
 		case ARPHRD_SIT:
@@ -3356,7 +3413,9 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 {
 	struct net *net = dev_net(dev);
 	struct inet6_dev *idev;
-	struct inet6_ifaddr *ifa;
+	struct inet6_ifaddr *ifa, *tmp;
+	struct list_head del_list;
+	int keep_addr;
 	int state, i;
 
 	ASSERT_RTNL();
@@ -3383,6 +3442,10 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 
 	}
 
+	keep_addr = net->ipv6.devconf_all->keep_addr_on_down;
+	if (!keep_addr)
+		keep_addr = idev->cnf.keep_addr_on_down;
+
 	/* Step 2: clear hash table */
 	for (i = 0; i < IN6_ADDR_HSIZE; i++) {
 		struct hlist_head *h = &inet6_addr_lst[i];
@@ -3391,9 +3454,12 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 restart:
 		hlist_for_each_entry_rcu(ifa, h, addr_lst) {
 			if (ifa->idev == idev) {
-				hlist_del_init_rcu(&ifa->addr_lst);
 				addrconf_del_dad_work(ifa);
-				goto restart;
+				if (how || !keep_addr || !ifa->user_managed) {
+					hlist_del_init_rcu(&ifa->addr_lst);
+					goto restart;
+				}
+
 			}
 		}
 		spin_unlock_bh(&addrconf_hash_lock);
@@ -3427,31 +3493,52 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 		write_lock_bh(&idev->lock);
 	}
 
-	while (!list_empty(&idev->addr_list)) {
-		ifa = list_first_entry(&idev->addr_list,
-				       struct inet6_ifaddr, if_list);
-		addrconf_del_dad_work(ifa);
+	INIT_LIST_HEAD(&del_list);
+	list_for_each_entry_safe(ifa, tmp, &idev->addr_list, if_list) {
+		bool keep_ifa = false;
 
-		list_del(&ifa->if_list);
+		if (!how && keep_addr && ifa->user_managed)
+			keep_ifa = true;
 
-		write_unlock_bh(&idev->lock);
+		addrconf_del_dad_work(ifa);
 
+		write_unlock_bh(&idev->lock);
 		spin_lock_bh(&ifa->lock);
-		state = ifa->state;
-		ifa->state = INET6_IFADDR_STATE_DEAD;
+
+		if (unlikely(keep_ifa)) {
+			/* set state to skip the notifier below */
+			state = INET6_IFADDR_STATE_DEAD;
+			ifa->state = 0;
+			if (!(ifa->flags & IFA_F_NODAD))
+				ifa->flags |= IFA_F_TENTATIVE;
+		} else {
+			state = ifa->state;
+			ifa->state = INET6_IFADDR_STATE_DEAD;
+
+			list_del(&ifa->if_list);
+			list_add(&ifa->if_list, &del_list);
+		}
+
 		spin_unlock_bh(&ifa->lock);
 
 		if (state != INET6_IFADDR_STATE_DEAD) {
 			__ipv6_ifa_notify(RTM_DELADDR, ifa);
 			inet6addr_notifier_call_chain(NETDEV_DOWN, ifa);
 		}
-		in6_ifa_put(ifa);
 
 		write_lock_bh(&idev->lock);
 	}
 
 	write_unlock_bh(&idev->lock);
 
+	while (!list_empty(&del_list)) {
+		ifa = list_first_entry(&del_list,
+				       struct inet6_ifaddr, if_list);
+		list_del(&ifa->if_list);
+
+		in6_ifa_put(ifa);
+	}
+
 	/* Step 5: Discard anycast and multicast list */
 	if (how) {
 		ipv6_ac_destroy_dev(idev);
@@ -4713,6 +4800,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
 	array[DEVCONF_USE_OIF_ADDRS_ONLY] = cnf->use_oif_addrs_only;
 	array[DEVCONF_DROP_UNICAST_IN_L2_MULTICAST] = cnf->drop_unicast_in_l2_multicast;
 	array[DEVCONF_DROP_UNSOLICITED_NA] = cnf->drop_unsolicited_na;
+	array[DEVCONF_KEEP_ADDR_ON_DOWN] = cnf->keep_addr_on_down;
 }
 
 static inline size_t inet6_ifla6_size(void)
@@ -5194,10 +5282,12 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
 			if (rt)
 				ip6_del_rt(rt);
 		}
-		dst_hold(&ifp->rt->dst);
-
-		ip6_del_rt(ifp->rt);
+		if (ifp->rt) {
+			dst_hold(&ifp->rt->dst);
 
+			ip6_del_rt(ifp->rt);
+			ifp->rt = NULL;
+		}
 		rt_genid_bump_ipv6(net);
 		break;
 	}
@@ -5801,6 +5891,14 @@ static struct addrconf_sysctl_table
 			.proc_handler	= proc_dointvec,
 		},
 		{
+			.procname       = "keep_addr_on_down",
+			.data           = &ipv6_devconf.keep_addr_on_down,
+			.maxlen         = sizeof(int),
+			.mode           = 0644,
+			.proc_handler   = proc_dointvec,
+
+		},
+		{
 			/* sentinel */
 		}
 	},
-- 
2.1.4


* Re: [PATCH] net: ipv6: Make address flushing on ifdown optional
  2016-02-13 22:23 [PATCH] net: ipv6: Make address flushing on ifdown optional David Ahern
@ 2016-02-13 22:25 ` David Ahern
  2016-02-16  8:45 ` YOSHIFUJI Hideaki
  2016-02-17 18:05 ` David Miller
  2 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2016-02-13 22:25 UTC (permalink / raw)
  To: netdev; +Cc: hannes

grrr.... changed the subject in the send-email editor, which dropped it.
Obviously this is for net-next.

On 2/13/16 11:23 PM, David Ahern wrote:
> Currently, all ipv6 addresses are flushed when the interface is configured
> down, including global, static addresses:
>
>      $ ip -6 addr add dev eth1 2000:11:1:1::1/64
>      $ ip addr show dev eth1
>      3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
>          link/ether 02:04:11:22:33:01 brd ff:ff:ff:ff:ff:ff
>          inet6 2000:11:1:1::1/64 scope global tentative
>             valid_lft forever preferred_lft forever
>      $ ip link set dev eth1 up
>      $ ip link set dev eth1 down
>      $ ip addr show dev eth1
>      3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
>          link/ether 02:04:11:22:33:01 brd ff:ff:ff:ff:ff:ff
>
> Add a new sysctl to make this behavior optional. The new setting defaults to
> flushing all addresses to maintain backwards compatibility. When it is set,
> global addresses with no expiration time are not flushed on an admin down:
>
>      $ echo 1 > /proc/sys/net/ipv6/conf/eth1/keep_addr_on_down
>      $ ip -6 addr add dev eth1 2000:11:1:1::1/64
>      $ ip addr show dev eth1
>      3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
>          link/ether 02:04:11:22:33:01 brd ff:ff:ff:ff:ff:ff
>          inet6 2000:11:1:1::1/64 scope global tentative
>             valid_lft forever preferred_lft forever
>      $ ip link set dev eth1 up
>      $ ip link set dev eth1 down
>      $ ip addr show dev eth1
>      3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
>          link/ether 02:04:11:22:33:01 brd ff:ff:ff:ff:ff:ff
>          inet6 2000:11:1:1::1/64 scope global
>             valid_lft forever preferred_lft forever
>          inet6 fe80::4:11ff:fe22:3301/64 scope link
>             valid_lft forever preferred_lft forever
>
> Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
> ---


* Re: [PATCH] net: ipv6: Make address flushing on ifdown optional
  2016-02-13 22:23 [PATCH] net: ipv6: Make address flushing on ifdown optional David Ahern
  2016-02-13 22:25 ` David Ahern
@ 2016-02-16  8:45 ` YOSHIFUJI Hideaki
  2016-02-16 15:16   ` David Ahern
  2016-02-17 18:05 ` David Miller
  2 siblings, 1 reply; 8+ messages in thread
From: YOSHIFUJI Hideaki @ 2016-02-16  8:45 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: hideaki.yoshifuji, hannes

Hi,

David Ahern wrote:
> Currently, all ipv6 addresses are flushed when the interface is configured
> down, including global, static addresses:
> 
>     $ ip -6 addr add dev eth1 2000:11:1:1::1/64
>     $ ip addr show dev eth1
>     3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
>         link/ether 02:04:11:22:33:01 brd ff:ff:ff:ff:ff:ff
>         inet6 2000:11:1:1::1/64 scope global tentative
>            valid_lft forever preferred_lft forever
>     $ ip link set dev eth1 up
>     $ ip link set dev eth1 down
>     $ ip addr show dev eth1
>     3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
>         link/ether 02:04:11:22:33:01 brd ff:ff:ff:ff:ff:ff
> 
> Add a new sysctl to make this behavior optional. The new setting defaults to
> flushing all addresses to maintain backwards compatibility. When it is set,
> global addresses with no expiration time are not flushed on an admin down:
> 
>     $ echo 1 > /proc/sys/net/ipv6/conf/eth1/keep_addr_on_down
>     $ ip -6 addr add dev eth1 2000:11:1:1::1/64
>     $ ip addr show dev eth1
>     3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
>         link/ether 02:04:11:22:33:01 brd ff:ff:ff:ff:ff:ff
>         inet6 2000:11:1:1::1/64 scope global tentative
>            valid_lft forever preferred_lft forever
>     $ ip link set dev eth1 up
>     $ ip link set dev eth1 down
>     $ ip addr show dev eth1
>     3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
>         link/ether 02:04:11:22:33:01 brd ff:ff:ff:ff:ff:ff
>         inet6 2000:11:1:1::1/64 scope global
>            valid_lft forever preferred_lft forever
>         inet6 fe80::4:11ff:fe22:3301/64 scope link
>            valid_lft forever preferred_lft forever
> 
> Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
> ---
> Dave: per the discussion at netconf tossing this out again. While the
>       failure semantics are not ideal it only occurs on GFP_ATOMIC
>       memory failures.
:
> diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
> index 24ce97f42d35..7ddbbb67f0db 100644
> --- a/Documentation/networking/ip-sysctl.txt
> +++ b/Documentation/networking/ip-sysctl.txt
> @@ -1563,6 +1563,12 @@ temp_prefered_lft - INTEGER
>  	Preferred lifetime (in seconds) for temporary addresses.
>  	Default: 86400 (1 day)
>  
> +keep_addr_on_down - BOOLEAN
> +	Keep all IPv6 addresses on an interface down event. If set static
> +	global addresses with no expiration time are not flushed.
> +
> +	Default: disabled
> +

How about this:
   1: enabled
   0: system default
  -1: disabled
so that an interface can override the system-wide config?
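
One possible reading of the three values proposed above, as a plain C
sketch outside the kernel. The function name keep_addr_tristate and its
parameters are hypothetical; all_val stands in for the conf/all value and
iface_val for the per-interface value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state resolver:
 *   iface_val > 0  -> enabled for this interface
 *   iface_val < 0  -> disabled for this interface
 *   iface_val == 0 -> follow the system-wide ("all") value
 */
static bool keep_addr_tristate(int all_val, int iface_val)
{
	if (iface_val > 0)
		return true;
	if (iface_val < 0)
		return false;

	return all_val > 0;
}
```

With this shape an admin could enable the feature system-wide and still
opt a single interface out by writing -1 to it.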

>  max_desync_factor - INTEGER
>  	Maximum value for DESYNC_FACTOR, which is a random value
>  	that ensures that clients don't synchronize with each
> diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
> index 4b2267e1b7c3..7edc14fb66b6 100644
> --- a/include/linux/ipv6.h
> +++ b/include/linux/ipv6.h
> @@ -62,6 +62,7 @@ struct ipv6_devconf {
>  		struct in6_addr secret;
>  	} stable_secret;
>  	__s32		use_oif_addrs_only;
> +	__s32		keep_addr_on_down;
>  	void		*sysctl;
>  };
>  
> diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
> index 1c8b6820b694..01ba6a286a4b 100644
> --- a/include/net/if_inet6.h
> +++ b/include/net/if_inet6.h
> @@ -72,6 +72,7 @@ struct inet6_ifaddr {
>  	int			regen_count;
>  
>  	bool			tokenized;
> +	bool			user_managed;

Can't we use IFA_F_PERMANENT?

> diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
> index ec117b65d5a5..395876060f50 100644
> --- a/include/uapi/linux/ipv6.h
> +++ b/include/uapi/linux/ipv6.h
> @@ -176,6 +176,7 @@ enum {
>  	DEVCONF_IGNORE_ROUTES_WITH_LINKDOWN,
>  	DEVCONF_DROP_UNICAST_IN_L2_MULTICAST,
>  	DEVCONF_DROP_UNSOLICITED_NA,
> +	DEVCONF_KEEP_ADDR_ON_DOWN,
>  	DEVCONF_MAX
>  };
>  
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index ac0ba9e4e06b..0bcb0f538e54 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -216,6 +216,7 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
>  	},
>  	.use_oif_addrs_only	= 0,
>  	.ignore_routes_with_linkdown = 0,
> +	.keep_addr_on_down	= 0,
>  };
>  
>  static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
> @@ -260,6 +261,7 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
>  	},
>  	.use_oif_addrs_only	= 0,
>  	.ignore_routes_with_linkdown = 0,
> +	.keep_addr_on_down	= 0,
>  };
>  
>  /* Check if a valid qdisc is available */
> @@ -962,6 +964,7 @@ ipv6_add_addr(struct inet6_dev *idev, const struct in6_addr *addr,
>  	ifa->prefered_lft = prefered_lft;
>  	ifa->cstamp = ifa->tstamp = jiffies;
>  	ifa->tokenized = false;
> +	ifa->user_managed = false;
>  
>  	ifa->rt = rt;
>  
> @@ -2701,6 +2704,9 @@ static int inet6_addr_add(struct net *net, int ifindex,
>  			    valid_lft, prefered_lft);
>  
>  	if (!IS_ERR(ifp)) {
> +		if (!expires)
> +			ifp->user_managed = true;
> +
>  		if (!(ifa_flags & IFA_F_NOPREFIXROUTE)) {
>  			addrconf_prefix_route(&ifp->addr, ifp->prefix_len, dev,
>  					      expires, flags);
> @@ -3168,6 +3174,55 @@ static void addrconf_gre_config(struct net_device *dev)
>  }
>  #endif
>  
> +static int fixup_user_managed_addr(struct inet6_dev *idev,
> +				   struct inet6_ifaddr *ifp)
> +{
> +	if (!ifp->rt) {
> +		struct rt6_info *rt;
> +
> +		rt = addrconf_dst_alloc(idev, &ifp->addr, false);
> +		if (unlikely(IS_ERR(rt)))
> +			return PTR_ERR(rt);
> +
> +		ifp->rt = rt;
> +	}
> +
> +	if (!(ifp->flags & IFA_F_NOPREFIXROUTE)) {
> +		addrconf_prefix_route(&ifp->addr, ifp->prefix_len,
> +				      idev->dev, 0, 0);
> +	}
> +
> +	addrconf_dad_start(ifp);
> +
> +	return 0;
> +}
> +
> +static void addrconf_user_managed_addr(struct net_device *dev)
> +{
> +	struct inet6_ifaddr *ifp, *tmp;
> +	struct inet6_dev *idev;
> +
> +	idev = __in6_dev_get(dev);
> +	if (!idev)
> +		return;
> +
> +	write_lock_bh(&idev->lock);
> +
> +	list_for_each_entry_safe(ifp, tmp, &idev->addr_list, if_list) {
> +		if (ifp->user_managed &&
> +		    fixup_user_managed_addr(idev, ifp) < 0) {
> +			write_unlock_bh(&idev->lock);
> +			ipv6_del_addr(ifp);
> +			write_lock_bh(&idev->lock);
> +
> +			net_info_ratelimited("%s: Failed to add prefix route for address %pI6c; dropping\n",
> +					     idev->dev->name, &ifp->addr);
> +		}
> +	}
> +
> +	write_unlock_bh(&idev->lock);
> +}
> +
>  static int addrconf_notify(struct notifier_block *this, unsigned long event,
>  			   void *ptr)
>  {
> @@ -3253,6 +3308,8 @@ static int addrconf_notify(struct notifier_block *this, unsigned long event,
>  			run_pending = 1;
>  		}
>  
> +		addrconf_user_managed_addr(dev);
> +
>  		switch (dev->type) {
>  #if IS_ENABLED(CONFIG_IPV6_SIT)
>  		case ARPHRD_SIT:
> @@ -3356,7 +3413,9 @@ static int addrconf_ifdown(struct net_device *dev, int how)
>  {
>  	struct net *net = dev_net(dev);
>  	struct inet6_dev *idev;
> -	struct inet6_ifaddr *ifa;
> +	struct inet6_ifaddr *ifa, *tmp;
> +	struct list_head del_list;
> +	int keep_addr;
>  	int state, i;
>  
>  	ASSERT_RTNL();
> @@ -3383,6 +3442,10 @@ static int addrconf_ifdown(struct net_device *dev, int how)
>  
>  	}
>  
> +	keep_addr = net->ipv6.devconf_all->keep_addr_on_down;
> +	if (!keep_addr)
> +		keep_addr = idev->cnf.keep_addr_on_down;
> +
>  	/* Step 2: clear hash table */
>  	for (i = 0; i < IN6_ADDR_HSIZE; i++) {
>  		struct hlist_head *h = &inet6_addr_lst[i];
> @@ -3391,9 +3454,12 @@ static int addrconf_ifdown(struct net_device *dev, int how)
>  restart:
>  		hlist_for_each_entry_rcu(ifa, h, addr_lst) {
>  			if (ifa->idev == idev) {
> -				hlist_del_init_rcu(&ifa->addr_lst);
>  				addrconf_del_dad_work(ifa);
> -				goto restart;
> +				if (how || !keep_addr || !ifa->user_managed) {

keep_addr <= 0

> +					hlist_del_init_rcu(&ifa->addr_lst);
> +					goto restart;
> +				}
> +
>  			}
>  		}
>  		spin_unlock_bh(&addrconf_hash_lock);
> @@ -3427,31 +3493,52 @@ static int addrconf_ifdown(struct net_device *dev, int how)
>  		write_lock_bh(&idev->lock);
>  	}
>  
> -	while (!list_empty(&idev->addr_list)) {
> -		ifa = list_first_entry(&idev->addr_list,
> -				       struct inet6_ifaddr, if_list);
> -		addrconf_del_dad_work(ifa);
> +	INIT_LIST_HEAD(&del_list);
> +	list_for_each_entry_safe(ifa, tmp, &idev->addr_list, if_list) {
> +		bool keep_ifa = false;
>  
> -		list_del(&ifa->if_list);
> +		if (!how && keep_addr && ifa->user_managed)

keep_addr > 0

etc...

> +			keep_ifa = true;
>  
> -		write_unlock_bh(&idev->lock);
> +		addrconf_del_dad_work(ifa);
>  
> +		write_unlock_bh(&idev->lock);
>  		spin_lock_bh(&ifa->lock);
> -		state = ifa->state;
> -		ifa->state = INET6_IFADDR_STATE_DEAD;
> +
> +		if (unlikely(keep_ifa)) {
> +			/* set state to skip the notifier below */
> +			state = INET6_IFADDR_STATE_DEAD;
> +			ifa->state = 0;
> +			if (!(ifa->flags & IFA_F_NODAD))
> +				ifa->flags |= IFA_F_TENTATIVE;
> +		} else {
> +			state = ifa->state;
> +			ifa->state = INET6_IFADDR_STATE_DEAD;
> +
> +			list_del(&ifa->if_list);
> +			list_add(&ifa->if_list, &del_list);
> +		}
> +
>  		spin_unlock_bh(&ifa->lock);
>  
>  		if (state != INET6_IFADDR_STATE_DEAD) {
>  			__ipv6_ifa_notify(RTM_DELADDR, ifa);
>  			inet6addr_notifier_call_chain(NETDEV_DOWN, ifa);
>  		}
> -		in6_ifa_put(ifa);
>  
>  		write_lock_bh(&idev->lock);
>  	}
>  
>  	write_unlock_bh(&idev->lock);
>  
> +	while (!list_empty(&del_list)) {
> +		ifa = list_first_entry(&del_list,
> +				       struct inet6_ifaddr, if_list);
> +		list_del(&ifa->if_list);
> +
> +		in6_ifa_put(ifa);
> +	}
> +
>  	/* Step 5: Discard anycast and multicast list */
>  	if (how) {
>  		ipv6_ac_destroy_dev(idev);
> @@ -4713,6 +4800,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
>  	array[DEVCONF_USE_OIF_ADDRS_ONLY] = cnf->use_oif_addrs_only;
>  	array[DEVCONF_DROP_UNICAST_IN_L2_MULTICAST] = cnf->drop_unicast_in_l2_multicast;
>  	array[DEVCONF_DROP_UNSOLICITED_NA] = cnf->drop_unsolicited_na;
> +	array[DEVCONF_KEEP_ADDR_ON_DOWN] = cnf->keep_addr_on_down;
>  }
>  
>  static inline size_t inet6_ifla6_size(void)
> @@ -5194,10 +5282,12 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
>  			if (rt)
>  				ip6_del_rt(rt);
>  		}
> -		dst_hold(&ifp->rt->dst);
> -
> -		ip6_del_rt(ifp->rt);
> +		if (ifp->rt) {
> +			dst_hold(&ifp->rt->dst);
>  
> +			ip6_del_rt(ifp->rt);
> +			ifp->rt = NULL;
> +		}
>  		rt_genid_bump_ipv6(net);
>  		break;
>  	}
> @@ -5801,6 +5891,14 @@ static struct addrconf_sysctl_table
>  			.proc_handler	= proc_dointvec,
>  		},
>  		{
> +			.procname       = "keep_addr_on_down",
> +			.data           = &ipv6_devconf.keep_addr_on_down,
> +			.maxlen         = sizeof(int),
> +			.mode           = 0644,
> +			.proc_handler   = proc_dointvec,
> +
> +		},
> +		{
>  			/* sentinel */
>  		}
>  	},
> 

-- 
Hideaki Yoshifuji <hideaki.yoshifuji@miraclelinux.com>
Technical Division, MIRACLE LINUX CORPORATION


* Re: [PATCH] net: ipv6: Make address flushing on ifdown optional
  2016-02-16  8:45 ` YOSHIFUJI Hideaki
@ 2016-02-16 15:16   ` David Ahern
  2016-02-17  2:10     ` YOSHIFUJI Hideaki
  0 siblings, 1 reply; 8+ messages in thread
From: David Ahern @ 2016-02-16 15:16 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki, netdev; +Cc: hannes

On 2/16/16 1:45 AM, YOSHIFUJI Hideaki wrote:
>> diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
>> index 24ce97f42d35..7ddbbb67f0db 100644
>> --- a/Documentation/networking/ip-sysctl.txt
>> +++ b/Documentation/networking/ip-sysctl.txt
>> @@ -1563,6 +1563,12 @@ temp_prefered_lft - INTEGER
>>   	Preferred lifetime (in seconds) for temporary addresses.
>>   	Default: 86400 (1 day)
>>   
>> +keep_addr_on_down - BOOLEAN
>> +	Keep all IPv6 addresses on an interface down event. If set static
>> +	global addresses with no expiration time are not flushed.
>> +
>> +	Default: disabled
>> +
> 
> How about this:
>     1: enabled
>     0: system default
>    -1: disabled
> so that an interface can override the system-wide config?

It is my understanding that the 'all' settings override the individual
interface settings. From Documentation/networking/ip-sysctl.txt +1346:

conf/all/*:
        Change all the interface-specific settings.


-----8<-----


>> diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
>> index 1c8b6820b694..01ba6a286a4b 100644
>> --- a/include/net/if_inet6.h
>> +++ b/include/net/if_inet6.h
>> @@ -72,6 +72,7 @@ struct inet6_ifaddr {
>>   	int			regen_count;
>>   
>>   	bool			tokenized;
>> +	bool			user_managed;
> 
> Can't we use IFA_F_PERMANENT?

I think so. Will fix.


-----8<-----

>> @@ -3356,7 +3413,9 @@ static int addrconf_ifdown(struct net_device *dev, int how)
>>   {
>>   	struct net *net = dev_net(dev);
>>   	struct inet6_dev *idev;
>> -	struct inet6_ifaddr *ifa;
>> +	struct inet6_ifaddr *ifa, *tmp;
>> +	struct list_head del_list;
>> +	int keep_addr;
>>   	int state, i;
>>   
>>   	ASSERT_RTNL();
>> @@ -3383,6 +3442,10 @@ static int addrconf_ifdown(struct net_device *dev, int how)
>>   
>>   	}
>>   
>> +	keep_addr = net->ipv6.devconf_all->keep_addr_on_down;
>> +	if (!keep_addr)
>> +		keep_addr = idev->cnf.keep_addr_on_down;
>> +
>>   	/* Step 2: clear hash table */
>>   	for (i = 0; i < IN6_ADDR_HSIZE; i++) {
>>   		struct hlist_head *h = &inet6_addr_lst[i];

So what I have here is: if the system-wide setting says keep the address,
it is kept. Otherwise, if the individual interface setting is enabled, the
address is kept.
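
The same resolution can be sketched as standalone C. This is an
illustration only, not kernel code: resolve_keep_addr and keep_ifa are
hypothetical helpers, and their integer parameters stand in for
net->ipv6.devconf_all->keep_addr_on_down, idev->cnf.keep_addr_on_down,
the 'how' argument of addrconf_ifdown() and ifa->user_managed.

```c
#include <assert.h>
#include <stdbool.h>

/* A nonzero "all" value wins; otherwise the per-interface value decides. */
static int resolve_keep_addr(int all_val, int iface_val)
{
	int keep_addr = all_val;

	if (!keep_addr)
		keep_addr = iface_val;

	return keep_addr;
}

/* Mirrors the test used in the patch: an address is kept only when
 * 'how' is zero, the resolved setting is on, and the address is
 * marked user_managed.
 */
static bool keep_ifa(int how, int keep_addr, bool user_managed)
{
	return !how && keep_addr && user_managed;
}
```

Note that with this logic a per-interface 0 cannot turn the feature off
once the system-wide value is 1.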


* Re: [PATCH] net: ipv6: Make address flushing on ifdown optional
  2016-02-16 15:16   ` David Ahern
@ 2016-02-17  2:10     ` YOSHIFUJI Hideaki
  2016-02-17  3:45       ` David Ahern
  0 siblings, 1 reply; 8+ messages in thread
From: YOSHIFUJI Hideaki @ 2016-02-17  2:10 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: hideaki.yoshifuji, hannes

Hi,

David Ahern wrote:
> On 2/16/16 1:45 AM, YOSHIFUJI Hideaki wrote:
>>> diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
>>> index 24ce97f42d35..7ddbbb67f0db 100644
>>> --- a/Documentation/networking/ip-sysctl.txt
>>> +++ b/Documentation/networking/ip-sysctl.txt
>>> @@ -1563,6 +1563,12 @@ temp_prefered_lft - INTEGER
>>>   	Preferred lifetime (in seconds) for temporary addresses.
>>>   	Default: 86400 (1 day)
>>>   
>>> +keep_addr_on_down - BOOLEAN
>>> +	Keep all IPv6 addresses on an interface down event. If set static
>>> +	global addresses with no expiration time are not flushed.
>>> +
>>> +	Default: disabled
>>> +
>>
>> How about this:
>>     1: enabled
>>     0: system default
>>    -1: disabled
>> so that an interface can override the system-wide config?
> 
> It is my understanding that the 'all' settings override the individual
> interface settings. From Documentation/networking/ip-sysctl.txt +1346:
> 
> conf/all/*:
>         Change all the interface-specific settings.

Well, the document is not correct.
1) Some of the "all" variables set all interface-specific settings.
2) Some of the "all" variables override interface-specific settings.
3) Some provide "fall-back" values; for these, an interface-specific
   setting overrides the corresponding "all" variable.
   (Note: the "default" variables are the values that per-interface
   settings are initialized to.)
4) Others are ignored (they exist but are no-ops).
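
The four behaviours listed above could be modelled roughly as below. This
is a sketch under assumptions, not how any particular sysctl is
implemented: the enum labels and effective_value are hypothetical, and
treating 0 as "not set" is a simplification chosen for the example.

```c
#include <assert.h>

/* Hypothetical labels for the four kinds of "all" variable. */
enum all_semantics {
	ALL_SETS_EACH,	/* 1) a write to "all" is copied to every interface */
	ALL_OVERRIDES,	/* 2) "all" takes precedence when set */
	ALL_FALLBACK,	/* 3) interface value wins, "all" fills the gap */
	ALL_IGNORED,	/* 4) "all" exists but has no effect */
};

/* Value one interface would see at lookup time. */
static int effective_value(enum all_semantics sem, int all_val, int iface_val)
{
	switch (sem) {
	case ALL_OVERRIDES:
		return all_val ? all_val : iface_val;
	case ALL_FALLBACK:
		return iface_val ? iface_val : all_val;
	case ALL_SETS_EACH:	/* already propagated at write time */
	case ALL_IGNORED:
	default:
		return iface_val;
	}
}
```

Under this model the patch as posted behaves like ALL_OVERRIDES, while the
1/0/-1 suggestion corresponds to ALL_FALLBACK.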

> 
> 
> -----8<-----
> 
> 
>>> diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
>>> index 1c8b6820b694..01ba6a286a4b 100644
>>> --- a/include/net/if_inet6.h
>>> +++ b/include/net/if_inet6.h
>>> @@ -72,6 +72,7 @@ struct inet6_ifaddr {
>>>   	int			regen_count;
>>>   
>>>   	bool			tokenized;
>>> +	bool			user_managed;
>>
>> Can't we use IFA_F_PERMANENT?
> 
> I think so. Will fix.
> 
> 
> -----8<-----
> 
>>> @@ -3356,7 +3413,9 @@ static int addrconf_ifdown(struct net_device *dev, int how)
>>>   {
>>>   	struct net *net = dev_net(dev);
>>>   	struct inet6_dev *idev;
>>> -	struct inet6_ifaddr *ifa;
>>> +	struct inet6_ifaddr *ifa, *tmp;
>>> +	struct list_head del_list;
>>> +	int keep_addr;
>>>   	int state, i;
>>>   
>>>   	ASSERT_RTNL();
>>> @@ -3383,6 +3442,10 @@ static int addrconf_ifdown(struct net_device *dev, int how)
>>>   
>>>   	}
>>>   
>>> +	keep_addr = net->ipv6.devconf_all->keep_addr_on_down;
>>> +	if (!keep_addr)
>>> +		keep_addr = idev->cnf.keep_addr_on_down;
>>> +
>>>   	/* Step 2: clear hash table */
>>>   	for (i = 0; i < IN6_ADDR_HSIZE; i++) {
>>>   		struct hlist_head *h = &inet6_addr_lst[i];
> 
> So what I have here is: if the system-wide setting says keep the address,
> it is kept. Otherwise, if the individual interface setting is enabled, the
> address is kept.
> 
> 
> 

Other admins may want to enable it system-wide with some exceptions.

And well, you could just check the per-interface configuration; see 4 above.

-- 
Hideaki Yoshifuji <hideaki.yoshifuji@miraclelinux.com>
Technical Division, MIRACLE LINUX CORPORATION


* Re: [PATCH] net: ipv6: Make address flushing on ifdown optional
  2016-02-17  2:10     ` YOSHIFUJI Hideaki
@ 2016-02-17  3:45       ` David Ahern
  0 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2016-02-17  3:45 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki, netdev; +Cc: hannes

On 2/16/16 7:10 PM, YOSHIFUJI Hideaki wrote:
> Hi,
> 
> David Ahern wrote:
>> On 2/16/16 1:45 AM, YOSHIFUJI Hideaki wrote:
>>>> diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
>>>> index 24ce97f42d35..7ddbbb67f0db 100644
>>>> --- a/Documentation/networking/ip-sysctl.txt
>>>> +++ b/Documentation/networking/ip-sysctl.txt
>>>> @@ -1563,6 +1563,12 @@ temp_prefered_lft - INTEGER
>>>>    	Preferred lifetime (in seconds) for temporary addresses.
>>>>    	Default: 86400 (1 day)
>>>>    
>>>> +keep_addr_on_down - BOOLEAN
>>>> +	Keep all IPv6 addresses on an interface down event. If set static
>>>> +	global addresses with no expiration time are not flushed.
>>>> +
>>>> +	Default: disabled
>>>> +
>>>
>>> How about this:
>>>      1: enabled
>>>      0: system default
>>>     -1: disabled
>>> so that an interface can override system-wide config?
>>
>> It is my understanding that the 'all' settings override the individual
>> interface settings. From Documentation/networking/ip-sysctl.txt +1346:
>>
>> conf/all/*:
>>          Change all the interface-specific settings.
> 
> Well, the document is not correct.
> 1) Some of the "all" variables set all interface-specific settings.
> 2) Some of the "all" variables override interface-specific settings.
> 3) Some provide "fall-back" values; such an interface-specific
>     setting overrides the corresponding "all" variable.
>     (Note: "default" variables hold the values that per-interface
>     settings are initialized to.)
> 4) Others are ignored (they exist but are no-ops).

Seems like a nightmare for an admin to understand which ones fall into
which category.

I really don't have a preference here beyond having the feature and
making it easy to enable (e.g., enable 'all' and it works for all). If
you want the 1/0/-1 trio and allow individual netdev settings to
override all then I will update the patch.

Thanks,
David
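
For illustration only, the 1/0/-1 proposal discussed in this message could
resolve to an effective setting roughly as sketched below. This is a
user-space sketch, not code from the patch: the enum, its member names, and
keep_addr_effective() are hypothetical, and it assumes that a non-zero
per-interface value always wins over the 'all' value.

```c
#include <stdbool.h>

/* Hypothetical names for the proposed tri-state per-interface values. */
enum keep_addr_setting {
	KEEP_ADDR_DISABLED       = -1,	/* interface opts out */
	KEEP_ADDR_SYSTEM_DEFAULT =  0,	/* defer to the 'all' setting */
	KEEP_ADDR_ENABLED        =  1,	/* interface opts in */
};

/*
 * Assumption: an explicit per-interface value (1 or -1) overrides the
 * system-wide value; 0 falls back to it.
 */
static bool keep_addr_effective(int all_setting, int dev_setting)
{
	if (dev_setting != KEEP_ADDR_SYSTEM_DEFAULT)
		return dev_setting > 0;

	return all_setting > 0;
}
```

With this resolution order an admin can enable the feature through 'all'
and still exclude a single interface by writing -1 to that interface.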


* Re: [PATCH] net: ipv6: Make address flushing on ifdown optional
  2016-02-13 22:23 [PATCH] net: ipv6: Make address flushing on ifdown optional David Ahern
  2016-02-13 22:25 ` David Ahern
  2016-02-16  8:45 ` YOSHIFUJI Hideaki
@ 2016-02-17 18:05 ` David Miller
  2016-02-17 18:09   ` David Ahern
  2 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2016-02-17 18:05 UTC (permalink / raw)
  To: dsa; +Cc: netdev, hannes

From: David Ahern <dsa@cumulusnetworks.com>
Date: Sat, 13 Feb 2016 14:23:27 -0800

> @@ -3427,31 +3493,52 @@ static int addrconf_ifdown(struct net_device *dev, int how)
>  		write_lock_bh(&idev->lock);
>  	}
>  
> -	while (!list_empty(&idev->addr_list)) {
> -		ifa = list_first_entry(&idev->addr_list,
> -				       struct inet6_ifaddr, if_list);
> -		addrconf_del_dad_work(ifa);
> +	INIT_LIST_HEAD(&del_list);
> +	list_for_each_entry_safe(ifa, tmp, &idev->addr_list, if_list) {
> +		bool keep_ifa = false;
>  
> -		list_del(&ifa->if_list);
> +		if (!how && keep_addr && ifa->user_managed)
> +			keep_ifa = true;

I think it would make sense to evaluate "!how && keep_addr" outside the
loop.  The only thing that changes is ifa->user_managed on each iteration.

But I also want some more documentation on what you are doing here.

I understand the address flushing on ifdown avoidance, but all of this
user_managed logic is not mentioned at all.  Why do you need it?  What
role does it play in achieving your goal?

Thanks.
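
A minimal sketch of the hoisting suggested above, assuming a plain array in
place of the kernel's idev->addr_list. The struct and function names are
hypothetical and the locking and list handling of addrconf_ifdown() are left
out; only the loop-invariant test is shown.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields of struct inet6_ifaddr used here. */
struct addr_sketch {
	bool user_managed;	/* flag from this version of the patch */
	bool kept;		/* result: address survives the ifdown */
};

/* Marks which addresses are kept and returns how many there are. */
static size_t mark_kept(struct addr_sketch *addrs, size_t n,
			int how, int keep_addr)
{
	/* Loop-invariant part, evaluated once before the loop. */
	const bool keep_allowed = !how && keep_addr;
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		/* Only the per-address flag varies per iteration. */
		addrs[i].kept = keep_allowed && addrs[i].user_managed;
		if (addrs[i].kept)
			kept++;
	}

	return kept;
}
```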


* Re: [PATCH] net: ipv6: Make address flushing on ifdown optional
  2016-02-17 18:05 ` David Miller
@ 2016-02-17 18:09   ` David Ahern
  0 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2016-02-17 18:09 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, hannes

On 2/17/16 11:05 AM, David Miller wrote:
> From: David Ahern <dsa@cumulusnetworks.com>
> Date: Sat, 13 Feb 2016 14:23:27 -0800
>
>> @@ -3427,31 +3493,52 @@ static int addrconf_ifdown(struct net_device *dev, int how)
>>   		write_lock_bh(&idev->lock);
>>   	}
>>
>> -	while (!list_empty(&idev->addr_list)) {
>> -		ifa = list_first_entry(&idev->addr_list,
>> -				       struct inet6_ifaddr, if_list);
>> -		addrconf_del_dad_work(ifa);
>> +	INIT_LIST_HEAD(&del_list);
>> +	list_for_each_entry_safe(ifa, tmp, &idev->addr_list, if_list) {
>> +		bool keep_ifa = false;
>>
>> -		list_del(&ifa->if_list);
>> +		if (!how && keep_addr && ifa->user_managed)
>> +			keep_ifa = true;
>
> I think it would make sense to evaluate "!how && keep_addr" outside the
> loop.  The only thing that changes is ifa->user_managed on each iteration.
>
> But I also want some more documentation on what you are doing here.
>
> I understand the address flushing on ifdown avoidance, but all of this
> user_managed logic is not mentioned at all.  Why do you need it?  What
> role does it play in achieving your goal?
>

Per the prior comment, user_managed will go away in favor of checking
IFA_F_PERMANENT. We only keep permanent addresses, which past versions
of the patch marked with the user_managed flag, but that flag is
redundant.
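
As a rough user-space sketch of that change, the per-address test could look
like the following. The struct and the function name keep_on_down() are
hypothetical, and the fallback definition of IFA_F_PERMANENT only mirrors the
value in <linux/if_addr.h> so the example builds without kernel headers; the
next version of the patch may be structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

#ifndef IFA_F_PERMANENT
#define IFA_F_PERMANENT 0x80	/* same value as in <linux/if_addr.h> */
#endif

/* Hypothetical stand-in for the flags field of struct inet6_ifaddr. */
struct ifaddr_sketch {
	uint32_t flags;
};

/*
 * An address is kept only on an admin down (how == 0), with the sysctl
 * enabled, and when the address is flagged permanent.
 */
static bool keep_on_down(const struct ifaddr_sketch *ifa,
			 int how, int keep_addr)
{
	return !how && keep_addr && (ifa->flags & IFA_F_PERMANENT);
}
```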


Thread overview: 8+ messages
2016-02-13 22:23 [PATCH] net: ipv6: Make address flushing on ifdown optional David Ahern
2016-02-13 22:25 ` David Ahern
2016-02-16  8:45 ` YOSHIFUJI Hideaki
2016-02-16 15:16   ` David Ahern
2016-02-17  2:10     ` YOSHIFUJI Hideaki
2016-02-17  3:45       ` David Ahern
2016-02-17 18:05 ` David Miller
2016-02-17 18:09   ` David Ahern
