All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net v2 0/4] vxlan: Various fixes
@ 2018-12-17 18:09 Petr Machata
  2018-12-17 18:09 ` [PATCH net v2 1/4] vxlan: Unmark offloaded bit on replaced FDB entries Petr Machata
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Petr Machata @ 2018-12-17 18:09 UTC (permalink / raw)
  To: netdev; +Cc: davem, Ido Schimmel, roopa

This patch set contains three fixes for the vxlan driver.

Patch #1 fixes handling of offload mark on replaced VXLAN FDB entries. A
way to trigger this is to replace the FDB entry with one that can not be
offloaded. A future patch set should make it possible to veto such FDB
changes. However the FDB might still fail to be offloaded due to another
issue, and the offload mark should reflect that.

Patch #2 fixes problems in __vxlan_dev_create() when a call to
rtnl_configure_link() fails. These failures would be tricky to hit on a
real system, the most likely vector is through an error in vxlan_open().
However, with the abovementioned vetoing patchset, vetoing the created
entry would trigger the same problems (and be easier to reproduce).

Patch #3 fixes a problem in vxlan_changelink(). In situations where the
default remote configured in the FDB table (if any) does not exactly
match the remote address configured at the VXLAN device, changing the
remote address breaks the default FDB entry. Patch #4 is then a self
test for this issue.

v2:
- Drop former patch #3
- Patch #2:
    - Delete the default entry before calling unregister_netdevice(). That
      takes care of former patch #3, hence tweak the commit message to
      mention that problem as well.

Petr Machata (4):
  vxlan: Unmark offloaded bit on replaced FDB entries
  vxlan: Fix error path in __vxlan_dev_create()
  vxlan: changelink: Fix handling of default remotes
  selftests: net: Add test_vxlan_fdb_changelink.sh

 drivers/net/vxlan.c                                | 25 ++++++++++++-------
 tools/testing/selftests/net/Makefile               |  1 +
 .../selftests/net/test_vxlan_fdb_changelink.sh     | 29 ++++++++++++++++++++++
 3 files changed, 46 insertions(+), 9 deletions(-)
 create mode 100755 tools/testing/selftests/net/test_vxlan_fdb_changelink.sh

-- 
2.4.11

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH net v2 1/4] vxlan: Unmark offloaded bit on replaced FDB entries
  2018-12-17 18:09 [PATCH net v2 0/4] vxlan: Various fixes Petr Machata
@ 2018-12-17 18:09 ` Petr Machata
  2018-12-17 18:10 ` [PATCH net v2 2/4] vxlan: Fix error path in __vxlan_dev_create() Petr Machata
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Petr Machata @ 2018-12-17 18:09 UTC (permalink / raw)
  To: netdev; +Cc: davem, Ido Schimmel, roopa

When rdst of an offloaded FDB entry is replaced, it certainly isn't
offloaded anymore. Drivers are notified about such replacements, and can
re-mark the entry as offloaded again if they so wish. However until a
driver does so explicitly, assume a replaced FDB entry is not offloaded.

Note that replaces coming via vxlan_fdb_external_learn_add() are always
immediately followed by an explicit offload marking.

Fixes: 0efe11733356 ("vxlan: Support marking RDSTs as offloaded")
Signed-off-by: Petr Machata <petrm@mellanox.com>
---
 drivers/net/vxlan.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 297cdeaef479..c9956c08edf5 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -568,6 +568,7 @@ static int vxlan_fdb_replace(struct vxlan_fdb *f,
 	rd->remote_port = port;
 	rd->remote_vni = vni;
 	rd->remote_ifindex = ifindex;
+	rd->offloaded = false;
 	return 1;
 }
 
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net v2 2/4] vxlan: Fix error path in __vxlan_dev_create()
  2018-12-17 18:09 [PATCH net v2 0/4] vxlan: Various fixes Petr Machata
  2018-12-17 18:09 ` [PATCH net v2 1/4] vxlan: Unmark offloaded bit on replaced FDB entries Petr Machata
@ 2018-12-17 18:10 ` Petr Machata
  2018-12-17 20:07   ` Roopa Prabhu
  2018-12-17 18:10 ` [PATCH net v2 3/4] vxlan: changelink: Fix handling of default remotes Petr Machata
  2018-12-17 18:10 ` [PATCH net v2 4/4] selftests: net: Add test_vxlan_fdb_changelink.sh Petr Machata
  3 siblings, 1 reply; 11+ messages in thread
From: Petr Machata @ 2018-12-17 18:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, Ido Schimmel, roopa

When a failure occurs in rtnl_configure_link(), the current code
calls unregister_netdevice() to roll back the earlier call to
register_netdevice(), and jumps to errout, which calls
vxlan_fdb_destroy().

However unregister_netdevice() calls transitively ndo_uninit, which is
vxlan_uninit(), and that already takes care of deleting the default FDB
entry by calling vxlan_fdb_delete_default(). Since the entry added
earlier in __vxlan_dev_create() is exactly the default entry, the
cleanup code in the errout block always leads to double free and thus a
panic.

Besides, since vxlan_fdb_delete_default() always destroys the FDB entry
with notification enabled, the deletion of the default entry is notified
even before the addition was notified.

Instead, before calling unregister_netdevice(), destroy the default FDB
entry by hand, which solves both problems.

Fixes: 0241b836732f ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
Signed-off-by: Petr Machata <petrm@mellanox.com>
---

Notes:
    v2:
    
    - Delete the default entry before calling unregister_netdevice(). That
      takes care of former patch #3, hence tweak the commit message to
      mention that problem as well.

 drivers/net/vxlan.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index c9956c08edf5..f29bf7d3e682 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -3282,14 +3282,15 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev,
 	}
 
 	err = register_netdevice(dev);
-	if (err)
-		goto errout;
+	if (err) {
+		if (f)
+			vxlan_fdb_destroy(vxlan, f, false);
+		return err;
+	}
 
 	err = rtnl_configure_link(dev, NULL);
-	if (err) {
-		unregister_netdevice(dev);
+	if (err)
 		goto errout;
-	}
 
 	/* notify default fdb entry */
 	if (f)
@@ -3297,9 +3298,15 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev,
 
 	list_add(&vxlan->next, &vn->vxlan_list);
 	return 0;
+
 errout:
+	/* unregister_netdevice() destroys the default FDB entry with deletion
+	 * notification. But the addition notification was not sent yet, so
+	 * destroy the entry by hand here.
+	 */
 	if (f)
 		vxlan_fdb_destroy(vxlan, f, false);
+	unregister_netdevice(dev);
 	return err;
 }
 
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net v2 3/4] vxlan: changelink: Fix handling of default remotes
  2018-12-17 18:09 [PATCH net v2 0/4] vxlan: Various fixes Petr Machata
  2018-12-17 18:09 ` [PATCH net v2 1/4] vxlan: Unmark offloaded bit on replaced FDB entries Petr Machata
  2018-12-17 18:10 ` [PATCH net v2 2/4] vxlan: Fix error path in __vxlan_dev_create() Petr Machata
@ 2018-12-17 18:10 ` Petr Machata
  2018-12-17 19:59   ` Roopa Prabhu
  2018-12-17 18:10 ` [PATCH net v2 4/4] selftests: net: Add test_vxlan_fdb_changelink.sh Petr Machata
  3 siblings, 1 reply; 11+ messages in thread
From: Petr Machata @ 2018-12-17 18:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, Ido Schimmel, roopa

Default remotes are stored as FDB entries with an Ethernet address of
00:00:00:00:00:00. When a request is made to change a remote address of
a VXLAN device, vxlan_changelink() first deletes the existing default
remote, and then creates a new FDB entry.

This works well as long as the list of default remotes matches exactly
the configuration of a VXLAN remote address. Thus when the VXLAN device
has a remote of X, there should be exactly one default remote FDB entry
X. If the VXLAN device has no remote address, there should be no such
entry.

Besides using "ip link set", it is possible to manipulate the list of
default remotes by using the "bridge fdb". It is therefore easy to break
the above condition. Under such circumstances, the __vxlan_fdb_delete()
call doesn't delete the FDB entry itself, but just one remote. The
following vxlan_fdb_create() then creates a new FDB entry, leading to a
situation where two entries exist for the address 00:00:00:00:00:00,
each with a different subset of default remotes.

An even more obvious breakage rooted in the same cause can be observed
when a remote address is configured for a VXLAN device that did not have
one before. In that case vxlan_changelink() doesn't remove any remote,
and just creates a new FDB entry for the new address:

$ ip link add name vx up type vxlan id 2000 dstport 4789
$ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.20 self permanent
$ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.30 self permanent
$ ip link set dev vx type vxlan remote 192.0.2.30
$ bridge fdb sh dev vx | grep 00:00:00:00:00:00
00:00:00:00:00:00 dst 192.0.2.30 self permanent <- new entry, 1 rdst
00:00:00:00:00:00 dst 192.0.2.20 self permanent <- orig. entry, 2 rdsts
00:00:00:00:00:00 dst 192.0.2.30 self permanent

To fix this, instead of calling vxlan_fdb_create() directly, defer to
vxlan_fdb_update(). That has logic to handle the duplicates properly.
Additionally, it also handles notifications, so drop that call from
changelink as well.

Fixes: 0241b836732f ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
Signed-off-by: Petr Machata <petrm@mellanox.com>
---
 drivers/net/vxlan.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index f29bf7d3e682..156cf4ff7000 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -3542,7 +3542,6 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[],
 	struct vxlan_rdst *dst = &vxlan->default_dst;
 	struct vxlan_rdst old_dst;
 	struct vxlan_config conf;
-	struct vxlan_fdb *f = NULL;
 	int err;
 
 	err = vxlan_nl2conf(tb, data,
@@ -3568,19 +3567,19 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[],
 					   old_dst.remote_ifindex, 0);
 
 		if (!vxlan_addr_any(&dst->remote_ip)) {
-			err = vxlan_fdb_create(vxlan, all_zeros_mac,
+			err = vxlan_fdb_update(vxlan, all_zeros_mac,
 					       &dst->remote_ip,
 					       NUD_REACHABLE | NUD_PERMANENT,
+					       NLM_F_APPEND | NLM_F_CREATE,
 					       vxlan->cfg.dst_port,
 					       dst->remote_vni,
 					       dst->remote_vni,
 					       dst->remote_ifindex,
-					       NTF_SELF, &f);
+					       NTF_SELF);
 			if (err) {
 				spin_unlock_bh(&vxlan->hash_lock);
 				return err;
 			}
-			vxlan_fdb_notify(vxlan, f, first_remote_rtnl(f), RTM_NEWNEIGH);
 		}
 		spin_unlock_bh(&vxlan->hash_lock);
 	}
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net v2 4/4] selftests: net: Add test_vxlan_fdb_changelink.sh
  2018-12-17 18:09 [PATCH net v2 0/4] vxlan: Various fixes Petr Machata
                   ` (2 preceding siblings ...)
  2018-12-17 18:10 ` [PATCH net v2 3/4] vxlan: changelink: Fix handling of default remotes Petr Machata
@ 2018-12-17 18:10 ` Petr Machata
  3 siblings, 0 replies; 11+ messages in thread
From: Petr Machata @ 2018-12-17 18:10 UTC (permalink / raw)
  To: netdev; +Cc: davem, Ido Schimmel, roopa

Add a test to exercise the fix from the previous patch.

Signed-off-by: Petr Machata <petrm@mellanox.com>
---
 tools/testing/selftests/net/Makefile               |  1 +
 .../selftests/net/test_vxlan_fdb_changelink.sh     | 29 ++++++++++++++++++++++
 2 files changed, 30 insertions(+)
 create mode 100755 tools/testing/selftests/net/test_vxlan_fdb_changelink.sh

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 256d82d5fa87..923570a9708a 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -7,6 +7,7 @@ CFLAGS += -I../../../../usr/include/
 TEST_PROGS := run_netsocktests run_afpackettests test_bpf.sh netdevice.sh rtnetlink.sh
 TEST_PROGS += fib_tests.sh fib-onlink-tests.sh pmtu.sh udpgso.sh ip_defrag.sh
 TEST_PROGS += udpgso_bench.sh fib_rule_tests.sh msg_zerocopy.sh psock_snd.sh
+TEST_PROGS += test_vxlan_fdb_changelink.sh
 TEST_PROGS_EXTENDED := in_netns.sh
 TEST_GEN_FILES =  socket
 TEST_GEN_FILES += psock_fanout psock_tpacket msg_zerocopy
diff --git a/tools/testing/selftests/net/test_vxlan_fdb_changelink.sh b/tools/testing/selftests/net/test_vxlan_fdb_changelink.sh
new file mode 100755
index 000000000000..2d442cdab11e
--- /dev/null
+++ b/tools/testing/selftests/net/test_vxlan_fdb_changelink.sh
@@ -0,0 +1,29 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+# Check FDB default-remote handling across "ip link set".
+
+check_remotes()
+{
+	local what=$1; shift
+	local N=$(bridge fdb sh dev vx | grep 00:00:00:00:00:00 | wc -l)
+
+	echo -ne "expected two remotes after $what\t"
+	if [[ $N != 2 ]]; then
+		echo "[FAIL]"
+		EXIT_STATUS=1
+	else
+		echo "[ OK ]"
+	fi
+}
+
+ip link add name vx up type vxlan id 2000 dstport 4789
+bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.20 self permanent
+bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.30 self permanent
+check_remotes "fdb append"
+
+ip link set dev vx type vxlan remote 192.0.2.30
+check_remotes "link set"
+
+ip link del dev vx
+exit $EXIT_STATUS
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net v2 3/4] vxlan: changelink: Fix handling of default remotes
  2018-12-17 18:10 ` [PATCH net v2 3/4] vxlan: changelink: Fix handling of default remotes Petr Machata
@ 2018-12-17 19:59   ` Roopa Prabhu
  2018-12-17 23:44     ` Petr Machata
  0 siblings, 1 reply; 11+ messages in thread
From: Roopa Prabhu @ 2018-12-17 19:59 UTC (permalink / raw)
  To: Petr Machata; +Cc: netdev, davem, Ido Schimmel

On Mon, Dec 17, 2018 at 10:10 AM Petr Machata <petrm@mellanox.com> wrote:
>
> Default remotes are stored as FDB entries with an Ethernet address of
> 00:00:00:00:00:00. When a request is made to change a remote address of
> a VXLAN device, vxlan_changelink() first deletes the existing default
> remote, and then creates a new FDB entry.
>
> This works well as long as the list of default remotes matches exactly
> the configuration of a VXLAN remote address. Thus when the VXLAN device
> has a remote of X, there should be exactly one default remote FDB entry
> X. If the VXLAN device has no remote address, there should be no such
> entry.
>
> Besides using "ip link set", it is possible to manipulate the list of
> default remotes by using the "bridge fdb". It is therefore easy to break
> the above condition. Under such circumstances, the __vxlan_fdb_delete()
> call doesn't delete the FDB entry itself, but just one remote. The
> following vxlan_fdb_create() then creates a new FDB entry, leading to a
> situation where two entries exist for the address 00:00:00:00:00:00,
> each with a different subset of default remotes.
>
> An even more obvious breakage rooted in the same cause can be observed
> when a remote address is configured for a VXLAN device that did not have
> one before. In that case vxlan_changelink() doesn't remove any remote,
> and just creates a new FDB entry for the new address:
>
> $ ip link add name vx up type vxlan id 2000 dstport 4789
> $ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.20 self permanent
> $ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.30 self permanent
> $ ip link set dev vx type vxlan remote 192.0.2.30
> $ bridge fdb sh dev vx | grep 00:00:00:00:00:00
> 00:00:00:00:00:00 dst 192.0.2.30 self permanent <- new entry, 1 rdst
> 00:00:00:00:00:00 dst 192.0.2.20 self permanent <- orig. entry, 2 rdsts
> 00:00:00:00:00:00 dst 192.0.2.30 self permanent
>
> To fix this, instead of calling vxlan_fdb_create() directly, defer to
> vxlan_fdb_update(). That has logic to handle the duplicates properly.
> Additionally, it also handles notifications, so drop that call from
> changelink as well.
>
> Fixes: 0241b836732f ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
> Signed-off-by: Petr Machata <petrm@mellanox.com>
> ---

Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>

This fix looks fine. But it almost seems like the changelink Fixes tag
is more appropriate here unless you are
sure that it worked correctly after the changelink. Though I agree
this needs to be fixed, deployments either
configure the default remote dst on the vxlan device or manage default
remote dsts via fdb api in control plane
 not both. The trigger for this bug seems like the use of both.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net v2 2/4] vxlan: Fix error path in __vxlan_dev_create()
  2018-12-17 18:10 ` [PATCH net v2 2/4] vxlan: Fix error path in __vxlan_dev_create() Petr Machata
@ 2018-12-17 20:07   ` Roopa Prabhu
  2018-12-17 23:41     ` Petr Machata
  0 siblings, 1 reply; 11+ messages in thread
From: Roopa Prabhu @ 2018-12-17 20:07 UTC (permalink / raw)
  To: Petr Machata; +Cc: netdev, David Miller, Ido Schimmel

On Mon, Dec 17, 2018 at 10:10 AM Petr Machata <petrm@mellanox.com> wrote:
>
> When a failure occurs in rtnl_configure_link(), the current code
> calls unregister_netdevice() to roll back the earlier call to
> register_netdevice(), and jumps to errout, which calls
> vxlan_fdb_destroy().
>
> However unregister_netdevice() calls transitively ndo_uninit, which is
> vxlan_uninit(), and that already takes care of deleting the default FDB
> entry by calling vxlan_fdb_delete_default(). Since the entry added
> earlier in __vxlan_dev_create() is exactly the default entry, the
> cleanup code in the errout block always leads to double free and thus a
> panic.
>
> Besides, since vxlan_fdb_delete_default() always destroys the FDB entry
> with notification enabled, the deletion of the default entry is notified
> even before the addition was notified.
>
> Instead, before calling unregister_netdevice(), destroy the default FDB
> entry by hand, which solves both problems.
>
> Fixes: 0241b836732f ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
> Signed-off-by: Petr Machata <petrm@mellanox.com>
> ---

Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>

thanks for adding the comment. this looks good.., minor nit..can we
remove the repeated vxlan_fdb_destroy by using a
unregister_netdev_onerr bool ?
eg .,.

>
> Notes:
>     v2:
>
>     - Delete the default entry before calling unregister_netdevice(). That
>       takes care of former patch #3, hence tweak the commit message to
>       mention that problem as well.
>
>  drivers/net/vxlan.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> index c9956c08edf5..f29bf7d3e682 100644
> --- a/drivers/net/vxlan.c
> +++ b/drivers/net/vxlan.c
> @@ -3282,14 +3282,15 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev,
>         }
>
>         err = register_netdevice(dev);
> -       if (err)
> -               goto errout;
> +       if (err) {
> +               if (f)
> +                       vxlan_fdb_destroy(vxlan, f, false);
> +               return err;
> +       }
>
>         err = rtnl_configure_link(dev, NULL);
> -       if (err) {
> -               unregister_netdevice(dev);
> +       if (err)
>                 goto errout;
> -       }
>
>         /* notify default fdb entry */
>         if (f)
> @@ -3297,9 +3298,15 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev,
>
>         list_add(&vxlan->next, &vn->vxlan_list);
>         return 0;
> +
>  errout:
> +       /* unregister_netdevice() destroys the default FDB entry with deletion
> +        * notification. But the addition notification was not sent yet, so
> +        * destroy the entry by hand here.
> +        */
>         if (f)
>                 vxlan_fdb_destroy(vxlan, f, false);
> +       unregister_netdevice(dev);

if (unregister_netdev_onerr)
         unregister_netdevice


>         return err;
>  }
>
> --
> 2.4.11
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net v2 2/4] vxlan: Fix error path in __vxlan_dev_create()
  2018-12-17 20:07   ` Roopa Prabhu
@ 2018-12-17 23:41     ` Petr Machata
  0 siblings, 0 replies; 11+ messages in thread
From: Petr Machata @ 2018-12-17 23:41 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: netdev, David Miller, Ido Schimmel

Roopa Prabhu <roopa@cumulusnetworks.com> writes:

>> @@ -3297,9 +3298,15 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev,
>>
>>         list_add(&vxlan->next, &vn->vxlan_list);
>>         return 0;
>> +
>>  errout:
>> +       /* unregister_netdevice() destroys the default FDB entry with deletion
>> +        * notification. But the addition notification was not sent yet, so
>> +        * destroy the entry by hand here.
>> +        */
>>         if (f)
>>                 vxlan_fdb_destroy(vxlan, f, false);
>> +       unregister_netdevice(dev);
>
> if (unregister_netdev_onerr)
>          unregister_netdevice

All right, I'll send v3 tomorrow.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net v2 3/4] vxlan: changelink: Fix handling of default remotes
  2018-12-17 19:59   ` Roopa Prabhu
@ 2018-12-17 23:44     ` Petr Machata
  2018-12-18  1:30       ` Roopa Prabhu
  0 siblings, 1 reply; 11+ messages in thread
From: Petr Machata @ 2018-12-17 23:44 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: netdev, davem, Ido Schimmel

Roopa Prabhu <roopa@cumulusnetworks.com> writes:

> On Mon, Dec 17, 2018 at 10:10 AM Petr Machata <petrm@mellanox.com> wrote:
>>
>> Default remotes are stored as FDB entries with an Ethernet address of
>> 00:00:00:00:00:00. When a request is made to change a remote address of
>> a VXLAN device, vxlan_changelink() first deletes the existing default
>> remote, and then creates a new FDB entry.
>>
>> This works well as long as the list of default remotes matches exactly
>> the configuration of a VXLAN remote address. Thus when the VXLAN device
>> has a remote of X, there should be exactly one default remote FDB entry
>> X. If the VXLAN device has no remote address, there should be no such
>> entry.
>>
>> Besides using "ip link set", it is possible to manipulate the list of
>> default remotes by using the "bridge fdb". It is therefore easy to break
>> the above condition. Under such circumstances, the __vxlan_fdb_delete()
>> call doesn't delete the FDB entry itself, but just one remote. The
>> following vxlan_fdb_create() then creates a new FDB entry, leading to a
>> situation where two entries exist for the address 00:00:00:00:00:00,
>> each with a different subset of default remotes.
>>
>> An even more obvious breakage rooted in the same cause can be observed
>> when a remote address is configured for a VXLAN device that did not have
>> one before. In that case vxlan_changelink() doesn't remove any remote,
>> and just creates a new FDB entry for the new address:
>>
>> $ ip link add name vx up type vxlan id 2000 dstport 4789
>> $ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.20 self permanent
>> $ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.30 self permanent
>> $ ip link set dev vx type vxlan remote 192.0.2.30
>> $ bridge fdb sh dev vx | grep 00:00:00:00:00:00
>> 00:00:00:00:00:00 dst 192.0.2.30 self permanent <- new entry, 1 rdst
>> 00:00:00:00:00:00 dst 192.0.2.20 self permanent <- orig. entry, 2 rdsts
>> 00:00:00:00:00:00 dst 192.0.2.30 self permanent
>>
>> To fix this, instead of calling vxlan_fdb_create() directly, defer to
>> vxlan_fdb_update(). That has logic to handle the duplicates properly.
>> Additionally, it also handles notifications, so drop that call from
>> changelink as well.
>>
>> Fixes: 0241b836732f ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
>> Signed-off-by: Petr Machata <petrm@mellanox.com>
>> ---
>
> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>
> This fix looks fine. But it almost seems like the changelink Fixes tag
> is more appropriate here unless you are
> sure that it worked correctly after the changelink.

I don't follow. What do you mean?

> Though I agree
> this needs to be fixed, deployments either
> configure the default remote dst on the vxlan device or manage default
> remote dsts via fdb api in control plane
>  not both. The trigger for this bug seems like the use of both.

Yep, you need to do both to trigger this. Which is why I think this went
undetected--otherwise people would have noticed some addresses are not
flooded.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net v2 3/4] vxlan: changelink: Fix handling of default remotes
  2018-12-17 23:44     ` Petr Machata
@ 2018-12-18  1:30       ` Roopa Prabhu
  2018-12-18 11:20         ` Petr Machata
  0 siblings, 1 reply; 11+ messages in thread
From: Roopa Prabhu @ 2018-12-18  1:30 UTC (permalink / raw)
  To: Petr Machata; +Cc: netdev, davem, Ido Schimmel

On Mon, Dec 17, 2018 at 3:44 PM Petr Machata <petrm@mellanox.com> wrote:
>
> Roopa Prabhu <roopa@cumulusnetworks.com> writes:
>
> > On Mon, Dec 17, 2018 at 10:10 AM Petr Machata <petrm@mellanox.com> wrote:
> >>
> >> Default remotes are stored as FDB entries with an Ethernet address of
> >> 00:00:00:00:00:00. When a request is made to change a remote address of
> >> a VXLAN device, vxlan_changelink() first deletes the existing default
> >> remote, and then creates a new FDB entry.
> >>
> >> This works well as long as the list of default remotes matches exactly
> >> the configuration of a VXLAN remote address. Thus when the VXLAN device
> >> has a remote of X, there should be exactly one default remote FDB entry
> >> X. If the VXLAN device has no remote address, there should be no such
> >> entry.
> >>
> >> Besides using "ip link set", it is possible to manipulate the list of
> >> default remotes by using the "bridge fdb". It is therefore easy to break
> >> the above condition. Under such circumstances, the __vxlan_fdb_delete()
> >> call doesn't delete the FDB entry itself, but just one remote. The
> >> following vxlan_fdb_create() then creates a new FDB entry, leading to a
> >> situation where two entries exist for the address 00:00:00:00:00:00,
> >> each with a different subset of default remotes.
> >>
> >> An even more obvious breakage rooted in the same cause can be observed
> >> when a remote address is configured for a VXLAN device that did not have
> >> one before. In that case vxlan_changelink() doesn't remove any remote,
> >> and just creates a new FDB entry for the new address:
> >>
> >> $ ip link add name vx up type vxlan id 2000 dstport 4789
> >> $ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.20 self permanent
> >> $ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.30 self permanent
> >> $ ip link set dev vx type vxlan remote 192.0.2.30
> >> $ bridge fdb sh dev vx | grep 00:00:00:00:00:00
> >> 00:00:00:00:00:00 dst 192.0.2.30 self permanent <- new entry, 1 rdst
> >> 00:00:00:00:00:00 dst 192.0.2.20 self permanent <- orig. entry, 2 rdsts
> >> 00:00:00:00:00:00 dst 192.0.2.30 self permanent
> >>
> >> To fix this, instead of calling vxlan_fdb_create() directly, defer to
> >> vxlan_fdb_update(). That has logic to handle the duplicates properly.
> >> Additionally, it also handles notifications, so drop that call from
> >> changelink as well.
> >>
> >> Fixes: 0241b836732f ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
> >> Signed-off-by: Petr Machata <petrm@mellanox.com>
> >> ---
> >
> > Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
> >
> > This fix looks fine. But it almost seems like the changelink Fixes tag
> > is more appropriate here unless you are
> > sure that it worked correctly after the changelink.
>
> I don't follow. What do you mean?

I meant the notify patch in the Fixes tag had no intentions to change
this. Also the vxlan changelink support
did not consider this as a test case. So i was wondering if this
problem existed before the commit in the Fixes tag.

regardless, ack on the fix. thanks.

>
> > Though I agree
> > this needs to be fixed, deployments either
> > configure the default remote dst on the vxlan device or manage default
> > remote dsts via fdb api in control plane
> >  not both. The trigger for this bug seems like the use of both.
>
> Yep, you need to do both to trigger this. Which is why I think this went
> undetected--otherwise people would have noticed some addresses are not
> flooded.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net v2 3/4] vxlan: changelink: Fix handling of default remotes
  2018-12-18  1:30       ` Roopa Prabhu
@ 2018-12-18 11:20         ` Petr Machata
  0 siblings, 0 replies; 11+ messages in thread
From: Petr Machata @ 2018-12-18 11:20 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: netdev, davem, Ido Schimmel

Roopa Prabhu <roopa@cumulusnetworks.com> writes:

> On Mon, Dec 17, 2018 at 3:44 PM Petr Machata <petrm@mellanox.com> wrote:
>>
>> Roopa Prabhu <roopa@cumulusnetworks.com> writes:
>> >> Fixes: 0241b836732f ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
>> >> Signed-off-by: Petr Machata <petrm@mellanox.com>
>> >> ---
>> >
>> > Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>> >
>> > This fix looks fine. But it almost seems like the changelink Fixes tag
>> > is more appropriate here unless you are
>> > sure that it worked correctly after the changelink.
>>
>> I don't follow. What do you mean?
>
> I meant the notify patch in the Fixes tag had no intentions to change
> this. Also the vxlan changelink support
> did not consider this as a test case. So i was wondering if this
> problem existed before the commit in the Fixes tag.

That patch changed vxlan_fdb_update() to vxlan_fdb_create() in two
instances: on changelink, and on create. In changelink, always calling
create may lead to a split of the default FDB entry. The fix is
basically a revert of that part of the patch.

> regardless, ack on the fix. thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-12-18 11:20 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-17 18:09 [PATCH net v2 0/4] vxlan: Various fixes Petr Machata
2018-12-17 18:09 ` [PATCH net v2 1/4] vxlan: Unmark offloaded bit on replaced FDB entries Petr Machata
2018-12-17 18:10 ` [PATCH net v2 2/4] vxlan: Fix error path in __vxlan_dev_create() Petr Machata
2018-12-17 20:07   ` Roopa Prabhu
2018-12-17 23:41     ` Petr Machata
2018-12-17 18:10 ` [PATCH net v2 3/4] vxlan: changelink: Fix handling of default remotes Petr Machata
2018-12-17 19:59   ` Roopa Prabhu
2018-12-17 23:44     ` Petr Machata
2018-12-18  1:30       ` Roopa Prabhu
2018-12-18 11:20         ` Petr Machata
2018-12-17 18:10 ` [PATCH net v2 4/4] selftests: net: Add test_vxlan_fdb_changelink.sh Petr Machata

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.