netdev.vger.kernel.org archive mirror
* [PATCHv2 net 0/2] net: fix nsna_ping not working in team
@ 2023-01-12  0:41 Xin Long
  2023-01-12  0:41 ` [PATCHv2 net 1/2] ipv6: prevent only DAD and RS sending for IFF_NO_ADDRCONF Xin Long
  2023-01-12  0:41 ` [PATCHv2 net 2/2] kselftest: add a selftest for ipv6 dad and rs sending Xin Long
  0 siblings, 2 replies; 5+ messages in thread
From: Xin Long @ 2023-01-12  0:41 UTC (permalink / raw)
  To: network dev
  Cc: davem, kuba, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Hideaki YOSHIFUJI, David Ahern

Completely disabling ipv6 addrconf is too harsh for the team driver,
as the nsna_ping link-watch still needs it. The 1st patch fixes
this by only preventing DAD and RS sending, and the 2nd patch
adds a selftest for all factors that may prevent DAD and
RS sending, including the team/bond slave ports.

v1->v2:
  - no need to check IFF_NO_ADDRCONF in addrconf_dad_begin(), see
    Patch 1.
  - add a selftest for DAD and RS as David Ahern suggested, see
    Patch 2.

Xin Long (2):
  ipv6: prevent only DAD and RS sending for IFF_NO_ADDRCONF
  kselftest: add a selftest for ipv6 dad and rs sending

 net/ipv6/addrconf.c                        |  12 +--
 tools/testing/selftests/net/Makefile       |   1 +
 tools/testing/selftests/net/ipv6_dad_rs.sh | 111 +++++++++++++++++++++
 3 files changed, 117 insertions(+), 7 deletions(-)
 create mode 100755 tools/testing/selftests/net/ipv6_dad_rs.sh

-- 
2.31.1



* [PATCHv2 net 1/2] ipv6: prevent only DAD and RS sending for IFF_NO_ADDRCONF
  2023-01-12  0:41 [PATCHv2 net 0/2] net: fix nsna_ping not working in team Xin Long
@ 2023-01-12  0:41 ` Xin Long
  2023-01-14  5:33   ` Jakub Kicinski
  2023-01-12  0:41 ` [PATCHv2 net 2/2] kselftest: add a selftest for ipv6 dad and rs sending Xin Long
  1 sibling, 1 reply; 5+ messages in thread
From: Xin Long @ 2023-01-12  0:41 UTC (permalink / raw)
  To: network dev
  Cc: davem, kuba, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Hideaki YOSHIFUJI, David Ahern

Currently IFF_NO_ADDRCONF is used to prevent all ipv6 addrconf for the
slave ports of team, bonding and failover devices and it means no ipv6
packets can be sent out through these slave ports. However, for the team
device, the "nsna_ping" link_watch requires ipv6 addrconf; otherwise, the
link will be marked as failed.

The original issue fixed by IFF_NO_ADDRCONF was caused by DAD and RS
packets sent by slave ports in commit c2edacf80e15 ("bonding / ipv6: no
addrconf for slaves separately from master") where it's using IFF_SLAVE
and later changed to IFF_NO_ADDRCONF in commit 8a321cf7becc ("net: add
IFF_NO_ADDRCONF and use it in bonding to prevent ipv6 addrconf").

So instead of preventing all the ipv6 addrconf, it makes more sense to
only prevent DAD and RS sending for the slave ports: Firstly, check
IFF_NO_ADDRCONF in addrconf_dad_completed() to prevent RS as it did in
commit b52e1cce31ca ("ipv6: Don't send rs packets to the interface of
ARPHRD_TUNNEL"), and then also check IFF_NO_ADDRCONF where IFA_F_NODAD
is checked to prevent DAD.

Note that the check for flags & IFA_F_NODAD in addrconf_dad_begin() is
not necessary, as with IFA_F_NODAD, flags & IFA_F_TENTATIVE is always
false, so there's no need to add an IFF_NO_ADDRCONF check there either.
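
As an illustration only, the two decisions described above can be modelled
as a pair of small shell functions. This is a toy model, not kernel code:
the function names addr_starts_tentative and rs_allowed are hypothetical,
every argument is a 0/1 value standing in for a kernel flag test, and
rs_allowed covers only the conditions visible in the
addrconf_dad_completed() hunk of this patch.

```shell
# Toy model of the checks touched by this patch; not kernel code.
# Every argument is 0 or 1 and stands in for one kernel-side test.

# Echo 1 if a new address would start with IFA_F_TENTATIVE (so DAD runs).
# $1: IFA_F_NODAD set on the address
# $2: IFF_NO_ADDRCONF set on the device
addr_starts_tentative() {
	if [ "$1" = "0" ] && [ "$2" = "0" ]; then
		echo 1
	else
		echo 0
	fi
}

# Echo 1 if the conditions shown in the addrconf_dad_completed() hunk
# would all allow a Router Solicitation to be sent.
# $1: accept_ra check passes     $2: rtr_solicits != 0
# $3: device is loopback         $4: IFF_NO_ADDRCONF set on the device
# $5: device type is ARPHRD_TUNNEL
rs_allowed() {
	if [ "$1" = "1" ] && [ "$2" = "1" ] && [ "$3" = "0" ] &&
	   [ "$4" = "0" ] && [ "$5" = "0" ]; then
		echo 1
	else
		echo 0
	fi
}
```

Under this model a team or bond port (IFF_NO_ADDRCONF set) never starts an
address as tentative and never sends RS, while the address itself can
still be configured.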

Fixes: 0aa64df30b38 ("net: team: use IFF_NO_ADDRCONF flag to prevent ipv6 addrconf")
Reported-by: Liang Li <liali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/ipv6/addrconf.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f7a84a4acffc..de4186e5349c 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1124,7 +1124,8 @@ ipv6_add_addr(struct inet6_dev *idev, struct ifa6_config *cfg,
 	ifa->flags = cfg->ifa_flags;
 	ifa->ifa_proto = cfg->ifa_proto;
 	/* No need to add the TENTATIVE flag for addresses with NODAD */
-	if (!(cfg->ifa_flags & IFA_F_NODAD))
+	if (!(cfg->ifa_flags & IFA_F_NODAD) &&
+	    !(idev->dev->priv_flags & IFF_NO_ADDRCONF))
 		ifa->flags |= IFA_F_TENTATIVE;
 	ifa->valid_lft = cfg->valid_lft;
 	ifa->prefered_lft = cfg->preferred_lft;
@@ -3319,10 +3320,6 @@ static void addrconf_addr_gen(struct inet6_dev *idev, bool prefix_route)
 	if (netif_is_l3_master(idev->dev))
 		return;
 
-	/* no link local addresses on devices flagged as slaves */
-	if (idev->dev->priv_flags & IFF_NO_ADDRCONF)
-		return;
-
 	ipv6_addr_set(&addr, htonl(0xFE800000), 0, 0, 0);
 
 	switch (idev->cnf.addr_gen_mode) {
@@ -3564,7 +3561,6 @@ static int addrconf_notify(struct notifier_block *this, unsigned long event,
 			if (event == NETDEV_UP && !IS_ERR_OR_NULL(idev) &&
 			    dev->flags & IFF_UP && dev->flags & IFF_MULTICAST)
 				ipv6_mc_up(idev);
-			break;
 		}
 
 		if (event == NETDEV_UP) {
@@ -3855,7 +3851,8 @@ static int addrconf_ifdown(struct net_device *dev, bool unregister)
 			/* set state to skip the notifier below */
 			state = INET6_IFADDR_STATE_DEAD;
 			ifa->state = INET6_IFADDR_STATE_PREDAD;
-			if (!(ifa->flags & IFA_F_NODAD))
+			if (!(ifa->flags & IFA_F_NODAD) &&
+			    !(dev->priv_flags & IFF_NO_ADDRCONF))
 				ifa->flags |= IFA_F_TENTATIVE;
 
 			rt = ifa->rt;
@@ -4218,6 +4215,7 @@ static void addrconf_dad_completed(struct inet6_ifaddr *ifp, bool bump_id,
 		  ipv6_accept_ra(ifp->idev) &&
 		  ifp->idev->cnf.rtr_solicits != 0 &&
 		  (dev->flags & IFF_LOOPBACK) == 0 &&
+		  (dev->priv_flags & IFF_NO_ADDRCONF) == 0 &&
 		  (dev->type != ARPHRD_TUNNEL);
 	read_unlock_bh(&ifp->idev->lock);
 
-- 
2.31.1



* [PATCHv2 net 2/2] kselftest: add a selftest for ipv6 dad and rs sending
  2023-01-12  0:41 [PATCHv2 net 0/2] net: fix nsna_ping not working in team Xin Long
  2023-01-12  0:41 ` [PATCHv2 net 1/2] ipv6: prevent only DAD and RS sending for IFF_NO_ADDRCONF Xin Long
@ 2023-01-12  0:41 ` Xin Long
  1 sibling, 0 replies; 5+ messages in thread
From: Xin Long @ 2023-01-12  0:41 UTC (permalink / raw)
  To: network dev
  Cc: davem, kuba, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Hideaki YOSHIFUJI, David Ahern

This patch tests the factors, and their combinations, that may
enable or disable ipv6 DAD or RS on a slave port or dev.
For DAD, it includes:

  - sysctl "net.ipv6.conf.all.accept_dad"
  - sysctl "net.ipv6.conf.$dev_name.accept_dad"
  - inet6_ifaddr flag "IFA_F_NODAD"
  - netdev priv_flags "IFF_NO_ADDRCONF"

and for RS, it includes:

  - sysctl "net.ipv6.conf.$dev_name.accept_ra"
  - sysctl "net.ipv6.conf.$dev_name.router_solicitations"
  - netdev priv_flags "IFF_NO_ADDRCONF"

The test uses team/bond ports to have IFF_NO_ADDRCONF priv_flags
set, and "ip addr add ... nodad" to have IFA_F_NODAD flag set.
It uses "ip6tables" to count the DAD or RS packets while the
port or dev is brought up.

Note that a bridge port is also tested, as an example of a slave port
without the IFF_NO_ADDRCONF flag.
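
The expectations the script encodes can be summarised as two helper
functions. This is a sketch for readers, not part of the patch: the names
expect_ns and expect_rs are hypothetical, and the logic only mirrors the
expected values the script passes to do_test, not the kernel's own
decision code.

```shell
# Expected outcomes of the selftest matrix, written as pure functions.
# Each function echoes 1 if a packet is expected on link0, else 0.

# $1: master type (veth|bridge|bond|team)
# $2: net.ipv6.conf.all.accept_dad
# $3: net.ipv6.conf.link1.accept_dad
# $4: 1 if the address is added with "nodad", else 0
expect_ns() {
	case "$1" in
	bond|team) echo 0; return 0 ;;	# IFF_NO_ADDRCONF ports: never DAD
	esac
	if [ "$2" = "0" ] && [ "$3" = "0" ]; then
		echo 0			# DAD disabled by both sysctls
	elif [ "$4" = "1" ]; then
		echo 0			# IFA_F_NODAD on the address
	else
		echo 1
	fi
}

# $1: master type (veth|bridge|bond|team)
# $2: net.ipv6.conf.link1.accept_ra
# $3: net.ipv6.conf.link1.router_solicitations
expect_rs() {
	case "$1" in
	bond|team) echo 0; return 0 ;;	# IFF_NO_ADDRCONF ports: never RS
	esac
	if [ "$2" = "0" ] || [ "$3" = "0" ]; then
		echo 0
	else
		echo 1
	fi
}
```

For example, expect_ns bridge 1 0 0 yields 1 because a bridge port does
not carry IFF_NO_ADDRCONF, whereas expect_ns team 1 1 0 yields 0.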

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 tools/testing/selftests/net/Makefile       |   1 +
 tools/testing/selftests/net/ipv6_dad_rs.sh | 111 +++++++++++++++++++++
 2 files changed, 112 insertions(+)
 create mode 100755 tools/testing/selftests/net/ipv6_dad_rs.sh

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 3007e98a6d64..4a9905d10212 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -75,6 +75,7 @@ TEST_GEN_PROGS += so_incoming_cpu
 TEST_PROGS += sctp_vrf.sh
 TEST_GEN_FILES += sctp_hello
 TEST_GEN_FILES += csum
+TEST_PROGS += ipv6_dad_rs.sh
 
 TEST_FILES := settings
 
diff --git a/tools/testing/selftests/net/ipv6_dad_rs.sh b/tools/testing/selftests/net/ipv6_dad_rs.sh
new file mode 100755
index 000000000000..064afe806ce4
--- /dev/null
+++ b/tools/testing/selftests/net/ipv6_dad_rs.sh
@@ -0,0 +1,111 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Testing for DAD/RS on Ports/Devices.
+# TOPO: ns0 (link0) <---> (link1) ns1
+
+setup() {
+	local mac_addr
+	local ip6_addr
+
+	ip net add ns0
+	ip net add ns1
+	ip net exec ns0 ip link add link0 type veth peer link1 netns ns1
+	ip net exec ns0 ip link set link0 up
+
+	# The test uses global addrs, so drop the pkts for link-local addrs.
+	mac_addr=`ip net exec ns1 cat /sys/class/net/link1/address`
+	ip6_addr="ff02::1:ff${mac_addr:9:2}:${mac_addr:12:2}${mac_addr:15:2}"
+	ip net exec ns1 ip6tables -A OUTPUT -d $ip6_addr -j DROP
+}
+
+cleanup() {
+	ip net del ns1
+	ip net del ns0
+}
+
+check_pkts() {
+	local CNT=0
+
+	while ip net exec ns0 ip6tables -t raw -L -v | \
+		grep link0 | awk '$1 != "0" {exit 1}'; do
+		[ $((CNT++)) = "30" ] && return 1
+		sleep 0.1
+	done
+}
+
+do_test() {
+	local master_type="$1"
+	local icmpv6_type="$2"
+	local pkt_exp="$3"
+	local pkt_rcv="0"
+	local dad="$4"
+
+	ip net exec ns1 ip link set link1 down
+	[ $master_type != "veth" ] && {
+		ip net exec ns1 ip link add master_dev1 type $master_type
+		ip net exec ns1 ip link set link1 master master_dev1
+	}
+
+	ip net exec ns0 ip6tables -t raw -A PREROUTING -i link0 \
+		-p ipv6-icmp --icmpv6-type $icmpv6_type -j ACCEPT
+
+	ip net exec ns1 ip addr add 2000::1/64 dev link1 $dad
+	ip net exec ns1 ip link set link1 up
+	check_pkts && pkt_rcv="1"
+
+	ip net exec ns1 ip addr del 2000::1/64 dev link1 $dad
+	ip net exec ns0 ip6tables -t raw -D PREROUTING -i link0 \
+		-p ipv6-icmp --icmpv6-type $icmpv6_type -j ACCEPT
+
+	[ $master_type != "veth" ] &&
+		ip net exec ns1 ip link del master_dev1
+	test "$pkt_exp" = "$pkt_rcv"
+}
+
+test_rs() {
+	local rs=1
+
+	echo "- link_ra: $link_ra, link_rs: $link_rs"
+	ip net exec ns1 sysctl -qw net.ipv6.conf.link1.accept_ra=$link_ra
+	ip net exec ns1 sysctl -qw net.ipv6.conf.link1.router_solicitations=$link_rs
+
+	[ "$link_ra" = "0" -o  "$link_rs" = "0" ] && rs=0
+	do_test veth router-solicitation $rs   && echo "  veth device (RS $rs): PASS" &&
+	do_test bridge router-solicitation $rs && echo "  bridge port (RS $rs): PASS" &&
+	do_test bond router-solicitation 0     && echo "  bond slave  (RS 0): PASS" &&
+	do_test team router-solicitation 0     && echo "  team port   (RS 0): PASS"
+}
+
+test_dad() {
+	local nodad=""
+	local ns=1
+
+	echo "- all_dad: $all_dad, link_dad: $link_dad, addr_nodad: $addr_nodad"
+	ip net exec ns1 sysctl -qw net.ipv6.conf.all.accept_dad=$all_dad
+	ip net exec ns1 sysctl -qw net.ipv6.conf.link1.accept_dad=$link_dad
+
+	[ "$all_dad" = "0" -a "$link_dad" = "0" ] && ns=0
+	[ "$addr_nodad" = "1" ] && nodad="nodad"  && ns=0
+	do_test veth neighbor-solicitation $ns $nodad   && echo "  veth device (NS $ns): PASS" &&
+	do_test bridge neighbor-solicitation $ns $nodad && echo "  bridge port (NS $ns): PASS" &&
+	do_test bond neighbor-solicitation 0 $nodad     && echo "  bond slave  (NS 0): PASS" &&
+	do_test team neighbor-solicitation 0 $nodad     && echo "  team port   (NS 0): PASS"
+}
+
+trap cleanup EXIT
+setup && echo "Testing for DAD/RS on Ports/Devices:" && {
+	for all_dad in 0 1; do
+		for link_dad in 0 1; do
+			for addr_nodad in 0 1; do
+				test_dad || exit $?
+			done
+		done
+	done
+	for link_ra in 0 1; do
+		for link_rs in 0 1; do
+			test_rs || exit $?
+		done
+	done
+}
+exit $?
-- 
2.31.1
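
For readers unfamiliar with the ip6tables DROP rule in setup(): it targets
the solicited-node multicast address that the DAD probe for link1's
link-local address would be sent to. A minimal sketch of that derivation
follows, assuming the link-local address is generated from the MAC
(EUI-64 style), so that its low 24 bits equal the last three MAC octets.
The function name solicited_node_from_mac is hypothetical, and awk is used
here where the script uses bash substring expansion.

```shell
# Build the solicited-node multicast address ff02::1:ffXX:XXXX from the
# last three octets of a MAC address given as aa:bb:cc:dd:ee:ff.
# Assumption: the link-local address is derived from the MAC (EUI-64).
solicited_node_from_mac() {
	echo "$1" | awk -F: '{ printf "ff02::1:ff%s:%s%s\n", $4, $5, $6 }'
}
```

Calling solicited_node_from_mac 52:54:00:12:34:56 gives ff02::1:ff12:3456,
the same string the script builds from ${mac_addr:9:2}, ${mac_addr:12:2}
and ${mac_addr:15:2}.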



* Re: [PATCHv2 net 1/2] ipv6: prevent only DAD and RS sending for IFF_NO_ADDRCONF
  2023-01-12  0:41 ` [PATCHv2 net 1/2] ipv6: prevent only DAD and RS sending for IFF_NO_ADDRCONF Xin Long
@ 2023-01-14  5:33   ` Jakub Kicinski
  2023-01-14 17:23     ` Xin Long
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2023-01-14  5:33 UTC (permalink / raw)
  To: Xin Long
  Cc: network dev, davem, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Hideaki YOSHIFUJI, David Ahern

On Wed, 11 Jan 2023 19:41:56 -0500 Xin Long wrote:
> So instead of preventing all the ipv6 addrconf, it makes more sense to
> only prevent DAD and RS sending for the slave ports: Firstly, check
> IFF_NO_ADDRCONF in addrconf_dad_completed() to prevent RS as it did in
> commit b52e1cce31ca ("ipv6: Don't send rs packets to the interface of
> ARPHRD_TUNNEL"), and then also check IFF_NO_ADDRCONF where IFA_F_NODAD
> is checked to prevent DAD.

Maybe it's because I'm not an ipv6 expert but it feels to me like we're
getting into intricate / hacky territory. IIUC all addresses on legs of
bond/team will silently get nodad behavior? Isn't that risky for a fix?

Could we instead revert 0aa64df30b38 and take this via net-next?

Alternatively - could the team user space just tell the kernel what
behavior it wants? Instead of always putting the flag up, like we did 
in 0aa64df30b3, do it only when the user space opts in?


* Re: [PATCHv2 net 1/2] ipv6: prevent only DAD and RS sending for IFF_NO_ADDRCONF
  2023-01-14  5:33   ` Jakub Kicinski
@ 2023-01-14 17:23     ` Xin Long
  0 siblings, 0 replies; 5+ messages in thread
From: Xin Long @ 2023-01-14 17:23 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: network dev, davem, Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Hideaki YOSHIFUJI, David Ahern

On Sat, Jan 14, 2023 at 12:33 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 11 Jan 2023 19:41:56 -0500 Xin Long wrote:
> > So instead of preventing all the ipv6 addrconf, it makes more sense to
> > only prevent DAD and RS sending for the slave ports: Firstly, check
> > IFF_NO_ADDRCONF in addrconf_dad_completed() to prevent RS as it did in
> > commit b52e1cce31ca ("ipv6: Don't send rs packets to the interface of
> > ARPHRD_TUNNEL"), and then also check IFF_NO_ADDRCONF where IFA_F_NODAD
> > is checked to prevent DAD.
>
> Maybe it's because I'm not an ipv6 expert but it feels to me like we're
> getting into intricate / hacky territory. IIUC all addresses on legs of
> bond/team will silently get nodad behavior? Isn't that risky for a fix?
Understood.
I was actually thinking this would be less risky than completely disabling
ipv6 addrconf for IFF_NO_ADDRCONF.

>
> Could we instead revert 0aa64df30b38 and take this via net-next?
Fair enough.
I will send a revert of 0aa64df30b38.
Let's take a step back and think about doing it via net-next.

>
> Alternatively - could the team user space just tell the kernel what
> behavior it wants? Instead of always putting the flag up, like we did
> in 0aa64df30b3, do it only when the user space opts in?
Like when knowing nsna_ping link watch is used, but it is loaded after
the port is added in libteam, and yet the kernel has no idea what link
watch is used in userspace.
Jiri?

Thanks.

