* [PATCH net 0/3] mlxsw: VXLAN and firmware flashing fixes
@ 2018-12-18 15:59 Ido Schimmel
  2018-12-18 15:59 ` [PATCH net 1/3] mlxsw: core: Increase timeout during firmware flash process Ido Schimmel
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Ido Schimmel @ 2018-12-18 15:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, Jiri Pirko, Petr Machata, Shalom Toledo,
	Alexander Petrovskiy, mlxsw, Ido Schimmel

Patch #1 fixes firmware flashing failures by increasing the time period
after which the driver fails the transaction with the firmware. The
problem is explained in detail in the commit message.

Patch #2 adds a missing trap for decapsulated ARP packets. It is
necessary for VXLAN routing to work.

Patch #3 fixes a memory leak during driver reload caused by NULLing a
pointer before kfree().

Please consider patch #1 for 4.19.y

Ido Schimmel (2):
  mlxsw: spectrum: Add trap for decapsulated ARP packets
  mlxsw: spectrum_nve: Fix memory leak upon driver reload

Shalom Toledo (1):
  mlxsw: core: Increase timeout during firmware flash process

 drivers/net/ethernet/mellanox/mlxsw/core.c    | 19 ++++++++++++++++++-
 drivers/net/ethernet/mellanox/mlxsw/core.h    |  3 +++
 .../net/ethernet/mellanox/mlxsw/spectrum.c    |  8 +++++++-
 .../ethernet/mellanox/mlxsw/spectrum_nve.c    |  2 +-
 drivers/net/ethernet/mellanox/mlxsw/trap.h    |  1 +
 5 files changed, 30 insertions(+), 3 deletions(-)

-- 
2.20.0

* [PATCH net 1/3] mlxsw: core: Increase timeout during firmware flash process
  2018-12-18 15:59 [PATCH net 0/3] mlxsw: VXLAN and firmware flashing fixes Ido Schimmel
@ 2018-12-18 15:59 ` Ido Schimmel
  2018-12-18 15:59 ` [PATCH net 2/3] mlxsw: spectrum: Add trap for decapsulated ARP packets Ido Schimmel
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Ido Schimmel @ 2018-12-18 15:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, Jiri Pirko, Petr Machata, Shalom Toledo,
	Alexander Petrovskiy, mlxsw, Ido Schimmel

From: Shalom Toledo <shalomt@mellanox.com>

During the firmware flash process, some EMADs time out, which causes the
driver to resend them, up to a limit of 5 retries. In some situations 5
retries are not enough and the EMAD access fails. If the failed EMAD was
related to the flashing process, the driver fails the flashing.

These timeouts are caused by cache misses in the CPU running the
firmware. If the CPU needs to fetch instructions from the flash while
firmware is being flashed, it must wait for the flashing to complete.
Since flashing takes time, pending EMADs can time out.

Fix by increasing the EMAD timeout from 200 ms to 3000 ms while firmware
is being flashed.

Fixes: ce6ef68f433f ("mlxsw: spectrum: Implement the ethtool flash_device callback")
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/core.c    | 19 ++++++++++++++++++-
 drivers/net/ethernet/mellanox/mlxsw/core.h    |  3 +++
 .../net/ethernet/mellanox/mlxsw/spectrum.c    |  7 ++++++-
 3 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index 30f751e69698..f7154f358f27 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -81,6 +81,7 @@ struct mlxsw_core {
 	struct mlxsw_core_port *ports;
 	unsigned int max_ports;
 	bool reload_fail;
+	bool fw_flash_in_progress;
 	unsigned long driver_priv[0];
 	/* driver_priv has to be always the last item */
 };
@@ -428,12 +429,16 @@ struct mlxsw_reg_trans {
 	struct rcu_head rcu;
 };
 
-#define MLXSW_EMAD_TIMEOUT_MS 200
+#define MLXSW_EMAD_TIMEOUT_DURING_FW_FLASH_MS	3000
+#define MLXSW_EMAD_TIMEOUT_MS			200
 
 static void mlxsw_emad_trans_timeout_schedule(struct mlxsw_reg_trans *trans)
 {
 	unsigned long timeout = msecs_to_jiffies(MLXSW_EMAD_TIMEOUT_MS);
 
+	if (trans->core->fw_flash_in_progress)
+		timeout = msecs_to_jiffies(MLXSW_EMAD_TIMEOUT_DURING_FW_FLASH_MS);
+
 	queue_delayed_work(trans->core->emad_wq, &trans->timeout_dw, timeout);
 }
 
@@ -1854,6 +1859,18 @@ int mlxsw_core_kvd_sizes_get(struct mlxsw_core *mlxsw_core,
 }
 EXPORT_SYMBOL(mlxsw_core_kvd_sizes_get);
 
+void mlxsw_core_fw_flash_start(struct mlxsw_core *mlxsw_core)
+{
+	mlxsw_core->fw_flash_in_progress = true;
+}
+EXPORT_SYMBOL(mlxsw_core_fw_flash_start);
+
+void mlxsw_core_fw_flash_end(struct mlxsw_core *mlxsw_core)
+{
+	mlxsw_core->fw_flash_in_progress = false;
+}
+EXPORT_SYMBOL(mlxsw_core_fw_flash_end);
+
 static int __init mlxsw_core_module_init(void)
 {
 	int err;
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.h b/drivers/net/ethernet/mellanox/mlxsw/core.h
index c35be477856f..c4e4971764e5 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.h
@@ -292,6 +292,9 @@ int mlxsw_core_kvd_sizes_get(struct mlxsw_core *mlxsw_core,
 			     u64 *p_single_size, u64 *p_double_size,
 			     u64 *p_linear_size);
 
+void mlxsw_core_fw_flash_start(struct mlxsw_core *mlxsw_core);
+void mlxsw_core_fw_flash_end(struct mlxsw_core *mlxsw_core);
+
 bool mlxsw_core_res_valid(struct mlxsw_core *mlxsw_core,
 			  enum mlxsw_res_id res_id);
 
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 9bec940330a4..75a2f1495455 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -309,8 +309,13 @@ static int mlxsw_sp_firmware_flash(struct mlxsw_sp *mlxsw_sp,
 		},
 		.mlxsw_sp = mlxsw_sp
 	};
+	int err;
+
+	mlxsw_core_fw_flash_start(mlxsw_sp->core);
+	err = mlxfw_firmware_flash(&mlxsw_sp_mlxfw_dev.mlxfw_dev, firmware);
+	mlxsw_core_fw_flash_end(mlxsw_sp->core);
 
-	return mlxfw_firmware_flash(&mlxsw_sp_mlxfw_dev.mlxfw_dev, firmware);
+	return err;
 }
 
 static int mlxsw_sp_fw_rev_validate(struct mlxsw_sp *mlxsw_sp)
-- 
2.20.0
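
For readers skimming the diff, the timeout selection can be sketched in
userspace C as below. This is a minimal sketch, not driver code:
struct core_sketch, emad_timeout_ms() and emad_worst_case_wait_ms() are
hypothetical names. Only the two timeout values and the
fw_flash_in_progress flag come from the patch, and the worst-case figure
assumes that every attempt waits for the full timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Timeout values taken from the patch. */
#define EMAD_TIMEOUT_MS                 200
#define EMAD_TIMEOUT_DURING_FW_FLASH_MS 3000

/* Hypothetical stand-in for struct mlxsw_core; only the new flag is modelled. */
struct core_sketch {
	bool fw_flash_in_progress;
};

/* Same choice the patch makes in mlxsw_emad_trans_timeout_schedule(). */
static unsigned int emad_timeout_ms(const struct core_sketch *core)
{
	assert(core != NULL);

	if (core->fw_flash_in_progress)
		return EMAD_TIMEOUT_DURING_FW_FLASH_MS;
	return EMAD_TIMEOUT_MS;
}

/*
 * Rough upper bound on how long the driver waits before giving up.
 * Assumption: each of the 'attempts' transmissions waits the full timeout.
 */
static unsigned int emad_worst_case_wait_ms(const struct core_sketch *core,
					    unsigned int attempts)
{
	return attempts * emad_timeout_ms(core);
}
```

Under that assumption, 5 attempts give the firmware at most 1000 ms with
the default timeout and 15000 ms while a flash is in progress.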

* [PATCH net 2/3] mlxsw: spectrum: Add trap for decapsulated ARP packets
  2018-12-18 15:59 [PATCH net 0/3] mlxsw: VXLAN and firmware flashing fixes Ido Schimmel
  2018-12-18 15:59 ` [PATCH net 1/3] mlxsw: core: Increase timeout during firmware flash process Ido Schimmel
@ 2018-12-18 15:59 ` Ido Schimmel
  2018-12-18 15:59 ` [PATCH net 3/3] mlxsw: spectrum_nve: Fix memory leak upon driver reload Ido Schimmel
  2018-12-18 17:19 ` [PATCH net 0/3] mlxsw: VXLAN and firmware flashing fixes David Miller
  3 siblings, 0 replies; 5+ messages in thread
From: Ido Schimmel @ 2018-12-18 15:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, Jiri Pirko, Petr Machata, Shalom Toledo,
	Alexander Petrovskiy, mlxsw, Ido Schimmel

After a packet is decapsulated, it is classified to the relevant FID
based on its VNI and undergoes L2 forwarding.

Unlike regular (non-encapsulated) ARP packets, Spectrum does not trap
decapsulated ARP packets during L2 forwarding and instead can only trap
such packets in the underlay router during decapsulation.

Add this missing packet trap, which is required for VXLAN routing when
the MAC of the target host is not known.

Fixes: b02597d513a9 ("mlxsw: spectrum: Add NVE packet traps")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 1 +
 drivers/net/ethernet/mellanox/mlxsw/trap.h     | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 75a2f1495455..f84b9c02fcc5 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -3526,6 +3526,7 @@ static const struct mlxsw_listener mlxsw_sp_listener[] = {
 	MLXSW_SP_RXL_MR_MARK(ACL2, TRAP_TO_CPU, MULTICAST, false),
 	/* NVE traps */
 	MLXSW_SP_RXL_MARK(NVE_ENCAP_ARP, TRAP_TO_CPU, ARP, false),
+	MLXSW_SP_RXL_NO_MARK(NVE_DECAP_ARP, TRAP_TO_CPU, ARP, false),
 };
 
 static int mlxsw_sp_cpu_policers_set(struct mlxsw_core *mlxsw_core)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/trap.h b/drivers/net/ethernet/mellanox/mlxsw/trap.h
index 6f18f4d3322a..451216dd7f6b 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/trap.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/trap.h
@@ -60,6 +60,7 @@ enum {
 	MLXSW_TRAP_ID_IPV6_MC_LINK_LOCAL_DEST = 0x91,
 	MLXSW_TRAP_ID_HOST_MISS_IPV6 = 0x92,
 	MLXSW_TRAP_ID_IPIP_DECAP_ERROR = 0xB1,
+	MLXSW_TRAP_ID_NVE_DECAP_ARP = 0xB8,
 	MLXSW_TRAP_ID_NVE_ENCAP_ARP = 0xBD,
 	MLXSW_TRAP_ID_ROUTER_ALERT_IPV4 = 0xD6,
 	MLXSW_TRAP_ID_ROUTER_ALERT_IPV6 = 0xD7,
-- 
2.20.0
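
As a rough illustration of what the new table entry changes, the sketch
below models the NVE part of the listener array as a plain lookup table.
It is not the driver's data structure: struct listener_sketch,
nve_listeners[] and listener_find() are hypothetical names, and the
'mark' field only records whether the entry uses the _MARK or _NO_MARK
macro variant. The two trap ID values are the ones from trap.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Trap IDs as they appear in trap.h after this patch. */
enum {
	TRAP_ID_NVE_DECAP_ARP = 0xB8,
	TRAP_ID_NVE_ENCAP_ARP = 0xBD,
};

/* Hypothetical, heavily simplified stand-in for struct mlxsw_listener. */
struct listener_sketch {
	int trap_id;
	bool mark;	/* true for the _MARK variant, false for _NO_MARK */
};

static const struct listener_sketch nve_listeners[] = {
	{ TRAP_ID_NVE_ENCAP_ARP, true },
	{ TRAP_ID_NVE_DECAP_ARP, false },	/* entry added by this patch */
};

/* Return the listener registered for trap_id, or NULL if there is none. */
static const struct listener_sketch *listener_find(int trap_id)
{
	size_t i;

	assert(trap_id >= 0);

	for (i = 0; i < sizeof(nve_listeners) / sizeof(nve_listeners[0]); i++) {
		if (nve_listeners[i].trap_id == trap_id)
			return &nve_listeners[i];
	}
	return NULL;
}
```

Without the second entry, a lookup for 0xB8 in this model returns NULL,
which corresponds to decapsulated ARP packets never reaching the CPU.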

* [PATCH net 3/3] mlxsw: spectrum_nve: Fix memory leak upon driver reload
  2018-12-18 15:59 [PATCH net 0/3] mlxsw: VXLAN and firmware flashing fixes Ido Schimmel
  2018-12-18 15:59 ` [PATCH net 1/3] mlxsw: core: Increase timeout during firmware flash process Ido Schimmel
  2018-12-18 15:59 ` [PATCH net 2/3] mlxsw: spectrum: Add trap for decapsulated ARP packets Ido Schimmel
@ 2018-12-18 15:59 ` Ido Schimmel
  2018-12-18 17:19 ` [PATCH net 0/3] mlxsw: VXLAN and firmware flashing fixes David Miller
  3 siblings, 0 replies; 5+ messages in thread
From: Ido Schimmel @ 2018-12-18 15:59 UTC (permalink / raw)
  To: netdev
  Cc: davem, Jiri Pirko, Petr Machata, Shalom Toledo,
	Alexander Petrovskiy, mlxsw, Ido Schimmel

The pointer was set to NULL before the memory was freed, so kfree() was
called with NULL and the allocation leaked. Trace from kmemleak:

unreferenced object 0xffff88820ae36528 (size 512):
  comm "devlink", pid 5374, jiffies 4295354033 (age 10829.296s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000a43f5195>] kmem_cache_alloc_trace+0x1be/0x330
    [<00000000312f8140>] mlxsw_sp_nve_init+0xcb/0x1ae0
    [<0000000009201d22>] mlxsw_sp_init+0x1382/0x2690
    [<000000007227d877>] mlxsw_sp1_init+0x1b5/0x260
    [<000000004a16feec>] __mlxsw_core_bus_device_register+0x776/0x1360
    [<0000000070ab954c>] mlxsw_devlink_core_bus_device_reload+0x129/0x220
    [<00000000432313d5>] devlink_nl_cmd_reload+0x119/0x1e0
    [<000000003821a06b>] genl_family_rcv_msg+0x813/0x1150
    [<00000000d54d04c0>] genl_rcv_msg+0xd1/0x180
    [<0000000040543d12>] netlink_rcv_skb+0x152/0x3c0
    [<00000000efc4eae8>] genl_rcv+0x2d/0x40
    [<00000000ea645603>] netlink_unicast+0x52f/0x740
    [<00000000641fca1a>] netlink_sendmsg+0x9c7/0xf50
    [<00000000fed4a4b8>] sock_sendmsg+0xbe/0x120
    [<00000000d85795a9>] __sys_sendto+0x397/0x620
    [<00000000c5f84622>] __x64_sys_sendto+0xe6/0x1a0

Fixes: 6e6030bd5412 ("mlxsw: spectrum_nve: Implement common NVE core")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_nve.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_nve.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_nve.c
index 5c13674439f1..b5b54b41349a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_nve.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_nve.c
@@ -977,6 +977,6 @@ void mlxsw_sp_nve_fini(struct mlxsw_sp *mlxsw_sp)
 {
 	WARN_ON(mlxsw_sp->nve->num_nve_tunnels);
 	rhashtable_destroy(&mlxsw_sp->nve->mc_list_ht);
-	mlxsw_sp->nve = NULL;
 	kfree(mlxsw_sp->nve);
+	mlxsw_sp->nve = NULL;
 }
-- 
2.20.0
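
The ordering problem can be reproduced outside the kernel. The sketch
below is a userspace analogue that uses free() in place of kfree(); both
treat a NULL argument as a no-op. struct nve_sketch, struct sp_sketch
and the two fini functions are hypothetical names chosen for
illustration, and the buggy variant returns the lost pointer only so
that a caller can observe the leak and release the memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct mlxsw_sp_nve and struct mlxsw_sp. */
struct nve_sketch {
	unsigned int num_nve_tunnels;
};

struct sp_sketch {
	struct nve_sketch *nve;
};

/*
 * Order before the fix: the pointer is cleared first, so free() receives
 * NULL and does nothing. The allocation is returned to make the leak visible.
 */
static struct nve_sketch *nve_fini_buggy(struct sp_sketch *sp)
{
	struct nve_sketch *leaked;

	assert(sp != NULL);
	leaked = sp->nve;

	sp->nve = NULL;
	free(sp->nve);	/* free(NULL): nothing is released */

	return leaked;
}

/* Order after the fix: release the memory, then clear the pointer. */
static void nve_fini_fixed(struct sp_sketch *sp)
{
	assert(sp != NULL);

	free(sp->nve);
	sp->nve = NULL;
}
```

In both variants sp->nve ends up NULL, which is why the bug is easy to
miss without a tool such as kmemleak.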

* Re: [PATCH net 0/3] mlxsw: VXLAN and firmware flashing fixes
  2018-12-18 15:59 [PATCH net 0/3] mlxsw: VXLAN and firmware flashing fixes Ido Schimmel
                   ` (2 preceding siblings ...)
  2018-12-18 15:59 ` [PATCH net 3/3] mlxsw: spectrum_nve: Fix memory leak upon driver reload Ido Schimmel
@ 2018-12-18 17:19 ` David Miller
  3 siblings, 0 replies; 5+ messages in thread
From: David Miller @ 2018-12-18 17:19 UTC (permalink / raw)
  To: idosch; +Cc: netdev, jiri, petrm, shalomt, alexpe, mlxsw

From: Ido Schimmel <idosch@mellanox.com>
Date: Tue, 18 Dec 2018 15:59:19 +0000

> Patch #1 fixes firmware flashing failures by increasing the time period
> after which the driver fails the transaction with the firmware. The
> problem is explained in detail in the commit message.
> 
> Patch #2 adds a missing trap for decapsulated ARP packets. It is
> necessary for VXLAN routing to work.
> 
> Patch #3 fixes a memory leak during driver reload caused by NULLing a
> pointer before kfree().
> 
> Please consider patch #1 for 4.19.y

Series applied and patch #1 queued up.

Thanks.
