LKML Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH net-next RFC 00/13] Add devlink reload level option
@ 2020-07-27 11:02 Moshe Shemesh
  2020-07-27 11:02 ` [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command Moshe Shemesh
                   ` (14 more replies)
  0 siblings, 15 replies; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-27 11:02 UTC (permalink / raw)
  To: David S. Miller, Jiri Pirko, Vasundhara Volam
  Cc: netdev, linux-kernel, Moshe Shemesh

Introduce new option on devlink reload API to enable the user to select the
reload level required. Complete support for all levels in mlx5.
The following reload levels are supported:
  driver: Driver entities re-instantiation only. 
  fw_reset: Firmware reset and driver entities re-instantiation. 
  fw_live_patch: Firmware live patching only.

Each driver which support this command should expose the reload levels
supported and the driver's default reload level.
The uAPI is backward compatible, if the reload level option is omitted
from the reload command, the driver's default reload level will be used.

Patch 1 adds the new API reload level option to devlink.
Patch 2 exposes the supported reload levels and default level on devlink
        dev get.
Patches 3-8 add support on mlx5 for devlink reload level fw-reset and
            handle the firmware reset events.
Patches 9-10 add devlink enable remote dev reset parameter and use it
             in mlx5.
Patches 11-12 mlx5 add devlink reload live patch support and event
              handling.
Patch 13 adds documentation file devlink-reload.rst 

Command examples:

# Run reload command with fw-reset reload level:
$ devlink dev reload pci/0000:82:00.0 level fw-reset

# Run reload command with driver reload level:
$ devlink dev reload pci/0000:82:00.0 level driver

# Run reload command with driver's default level (backward compatible):
$ devlink dev reload pci/0000:82:00.0


Moshe Shemesh (13):
  devlink: Add reload level option to devlink reload command
  devlink: Add reload levels data to dev get
  net/mlx5: Add functions to set/query MFRL register
  net/mlx5: Set cap for pci sync for fw update event
  net/mlx5: Handle sync reset request event
  net/mlx5: Handle sync reset now event
  net/mlx5: Handle sync reset abort event
  net/mlx5: Add support for devlink reload level fw reset
  devlink: Add enable_remote_dev_reset generic parameter
  net/mlx5: Add devlink param enable_remote_dev_reset support
  net/mlx5: Add support for fw live patch event
  net/mlx5: Add support for devlink reload level live patch
  devlink: Add Documentation/networking/devlink/devlink-reload.rst

 .../networking/devlink/devlink-params.rst     |   6 +
 .../networking/devlink/devlink-reload.rst     |  56 +++
 Documentation/networking/devlink/index.rst    |   1 +
 drivers/net/ethernet/mellanox/mlx4/main.c     |   6 +-
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
 .../net/ethernet/mellanox/mlx5/core/devlink.c | 114 +++++-
 .../mellanox/mlx5/core/diag/fw_tracer.c       |  31 ++
 .../mellanox/mlx5/core/diag/fw_tracer.h       |   1 +
 .../ethernet/mellanox/mlx5/core/fw_reset.c    | 328 ++++++++++++++++++
 .../ethernet/mellanox/mlx5/core/fw_reset.h    |  17 +
 .../net/ethernet/mellanox/mlx5/core/health.c  |  74 +++-
 .../net/ethernet/mellanox/mlx5/core/main.c    |  13 +
 drivers/net/ethernet/mellanox/mlxsw/core.c    |   6 +-
 drivers/net/netdevsim/dev.c                   |   6 +-
 include/linux/mlx5/device.h                   |   1 +
 include/linux/mlx5/driver.h                   |  12 +
 include/net/devlink.h                         |  10 +-
 include/uapi/linux/devlink.h                  |  22 ++
 net/core/devlink.c                            |  95 ++++-
 19 files changed, 764 insertions(+), 37 deletions(-)
 create mode 100644 Documentation/networking/devlink/devlink-reload.rst
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h

-- 
2.17.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
@ 2020-07-27 11:02 ` Moshe Shemesh
  2020-07-28  0:58   ` Jakub Kicinski
  2020-07-27 11:02 ` [PATCH net-next RFC 02/13] devlink: Add reload levels data to dev get Moshe Shemesh
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-27 11:02 UTC (permalink / raw)
  To: David S. Miller, Jiri Pirko, Vasundhara Volam
  Cc: netdev, linux-kernel, Moshe Shemesh

Add devlink reload level to allow the user to request a specific reload
level. The level parameter is optional, if not specified then driver's
default reload level is used (backward compatible).
Reload levels supported are:
driver: driver entities re-instantiation only.
fw_reset: firmware reset and driver entities re-instantiation.
fw_live_patch: firmware live patching only.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/main.c     |  6 ++-
 .../net/ethernet/mellanox/mlx5/core/devlink.c |  6 ++-
 drivers/net/ethernet/mellanox/mlxsw/core.c    |  6 ++-
 drivers/net/netdevsim/dev.c                   |  6 ++-
 include/net/devlink.h                         |  6 ++-
 include/uapi/linux/devlink.h                  | 19 +++++++
 net/core/devlink.c                            | 52 +++++++++++++++++--
 7 files changed, 86 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 954c22c79f6b..57d9d4381cb0 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -3935,7 +3935,7 @@ static int mlx4_restart_one_up(struct pci_dev *pdev, bool reload,
 			       struct devlink *devlink);
 
 static int mlx4_devlink_reload_down(struct devlink *devlink, bool netns_change,
-				    struct netlink_ext_ack *extack)
+				    enum devlink_reload_level level, struct netlink_ext_ack *extack)
 {
 	struct mlx4_priv *priv = devlink_priv(devlink);
 	struct mlx4_dev *dev = &priv->dev;
@@ -3951,7 +3951,7 @@ static int mlx4_devlink_reload_down(struct devlink *devlink, bool netns_change,
 	return 0;
 }
 
-static int mlx4_devlink_reload_up(struct devlink *devlink,
+static int mlx4_devlink_reload_up(struct devlink *devlink, enum devlink_reload_level level,
 				  struct netlink_ext_ack *extack)
 {
 	struct mlx4_priv *priv = devlink_priv(devlink);
@@ -3969,6 +3969,8 @@ static int mlx4_devlink_reload_up(struct devlink *devlink,
 
 static const struct devlink_ops mlx4_devlink_ops = {
 	.port_type_set	= mlx4_devlink_port_type_set,
+	.supported_reload_levels = BIT(DEVLINK_RELOAD_LEVEL_DRIVER),
+	.default_reload_level = DEVLINK_RELOAD_LEVEL_DRIVER,
 	.reload_down	= mlx4_devlink_reload_down,
 	.reload_up	= mlx4_devlink_reload_up,
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
index c709e9a385f6..5424e31a0f45 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
@@ -89,7 +89,7 @@ mlx5_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req,
 }
 
 static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
-				    struct netlink_ext_ack *extack)
+				    enum devlink_reload_level level, struct netlink_ext_ack *extack)
 {
 	struct mlx5_core_dev *dev = devlink_priv(devlink);
 
@@ -97,7 +97,7 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
 	return 0;
 }
 
-static int mlx5_devlink_reload_up(struct devlink *devlink,
+static int mlx5_devlink_reload_up(struct devlink *devlink, enum devlink_reload_level level,
 				  struct netlink_ext_ack *extack)
 {
 	struct mlx5_core_dev *dev = devlink_priv(devlink);
@@ -118,6 +118,8 @@ static const struct devlink_ops mlx5_devlink_ops = {
 #endif
 	.flash_update = mlx5_devlink_flash_update,
 	.info_get = mlx5_devlink_info_get,
+	.supported_reload_levels = BIT(DEVLINK_RELOAD_LEVEL_DRIVER),
+	.default_reload_level = DEVLINK_RELOAD_LEVEL_DRIVER,
 	.reload_down = mlx5_devlink_reload_down,
 	.reload_up = mlx5_devlink_reload_up,
 };
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index b01f8f2fab63..360d749eb042 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -1113,7 +1113,7 @@ mlxsw_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req,
 
 static int
 mlxsw_devlink_core_bus_device_reload_down(struct devlink *devlink,
-					  bool netns_change,
+					  bool netns_change, enum devlink_reload_level level,
 					  struct netlink_ext_ack *extack)
 {
 	struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
@@ -1126,7 +1126,7 @@ mlxsw_devlink_core_bus_device_reload_down(struct devlink *devlink,
 }
 
 static int
-mlxsw_devlink_core_bus_device_reload_up(struct devlink *devlink,
+mlxsw_devlink_core_bus_device_reload_up(struct devlink *devlink, enum devlink_reload_level level,
 					struct netlink_ext_ack *extack)
 {
 	struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
@@ -1266,6 +1266,8 @@ mlxsw_devlink_trap_policer_counter_get(struct devlink *devlink,
 }
 
 static const struct devlink_ops mlxsw_devlink_ops = {
+	.supported_reload_levels	= BIT(DEVLINK_RELOAD_LEVEL_FW_RESET),
+	.default_reload_level		= DEVLINK_RELOAD_LEVEL_FW_RESET,
 	.reload_down		= mlxsw_devlink_core_bus_device_reload_down,
 	.reload_up		= mlxsw_devlink_core_bus_device_reload_up,
 	.port_type_set			= mlxsw_devlink_port_type_set,
diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index ce719c830a77..680482f687f4 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -697,7 +697,7 @@ static int nsim_dev_reload_create(struct nsim_dev *nsim_dev,
 static void nsim_dev_reload_destroy(struct nsim_dev *nsim_dev);
 
 static int nsim_dev_reload_down(struct devlink *devlink, bool netns_change,
-				struct netlink_ext_ack *extack)
+				enum devlink_reload_level level, struct netlink_ext_ack *extack)
 {
 	struct nsim_dev *nsim_dev = devlink_priv(devlink);
 
@@ -713,7 +713,7 @@ static int nsim_dev_reload_down(struct devlink *devlink, bool netns_change,
 	return 0;
 }
 
-static int nsim_dev_reload_up(struct devlink *devlink,
+static int nsim_dev_reload_up(struct devlink *devlink, enum devlink_reload_level level,
 			      struct netlink_ext_ack *extack)
 {
 	struct nsim_dev *nsim_dev = devlink_priv(devlink);
@@ -873,6 +873,8 @@ nsim_dev_devlink_trap_policer_counter_get(struct devlink *devlink,
 }
 
 static const struct devlink_ops nsim_dev_devlink_ops = {
+	.supported_reload_levels = BIT(DEVLINK_RELOAD_LEVEL_DRIVER),
+	.default_reload_level = DEVLINK_RELOAD_LEVEL_DRIVER,
 	.reload_down = nsim_dev_reload_down,
 	.reload_up = nsim_dev_reload_up,
 	.info_get = nsim_dev_info_get,
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 19d990c8edcc..b291cd8d6be6 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -985,9 +985,11 @@ enum devlink_trap_group_generic_id {
 	}
 
 struct devlink_ops {
+	unsigned long supported_reload_levels;
+	enum devlink_reload_level default_reload_level;
 	int (*reload_down)(struct devlink *devlink, bool netns_change,
-			   struct netlink_ext_ack *extack);
-	int (*reload_up)(struct devlink *devlink,
+			   enum devlink_reload_level level, struct netlink_ext_ack *extack);
+	int (*reload_up)(struct devlink *devlink, enum devlink_reload_level level,
 			 struct netlink_ext_ack *extack);
 	int (*port_type_set)(struct devlink_port *devlink_port,
 			     enum devlink_port_type port_type);
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index cfef4245ea5a..fa5f66db5012 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -272,6 +272,23 @@ enum {
 	DEVLINK_ATTR_TRAP_METADATA_TYPE_FA_COOKIE,
 };
 
+/**
+ * enum devlink_reload_level - Reload level.
+ * @DEVLINK_RELOAD_LEVEL_DRIVER: Driver entities re-instantiation only.
+ * @DEVLINK_RELOAD_LEVEL_FW_RESET: FW reset and driver entities re-instantiation.
+ * @DEVLINK_RELOAD_LEVEL_FW_LIVE_PATCH: FW live patching only.
+ */
+enum devlink_reload_level {
+	DEVLINK_RELOAD_LEVEL_UNSPEC,
+	DEVLINK_RELOAD_LEVEL_DRIVER,
+	DEVLINK_RELOAD_LEVEL_FW_RESET,
+	DEVLINK_RELOAD_LEVEL_FW_LIVE_PATCH,
+
+	/* Add new reload levels above */
+	__DEVLINK_RELOAD_LEVEL_MAX,
+	DEVLINK_RELOAD_LEVEL_MAX = __DEVLINK_RELOAD_LEVEL_MAX - 1
+};
+
 enum devlink_attr {
 	/* don't change the order or add anything between, this is ABI! */
 	DEVLINK_ATTR_UNSPEC,
@@ -458,6 +475,8 @@ enum devlink_attr {
 	DEVLINK_ATTR_PORT_LANES,			/* u32 */
 	DEVLINK_ATTR_PORT_SPLITTABLE,			/* u8 */
 
+	DEVLINK_ATTR_RELOAD_LEVEL,		/* u8 */
+
 	/* add new attributes above here, update the policy in devlink.c */
 
 	__DEVLINK_ATTR_MAX,
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 0ca89196a367..31b367a1612d 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -462,6 +462,12 @@ static int devlink_nl_put_handle(struct sk_buff *msg, struct devlink *devlink)
 	return 0;
 }
 
+static bool
+devlink_reload_level_is_supported(struct devlink *devlink, enum devlink_reload_level level)
+{
+	return test_bit(level, &devlink->ops->supported_reload_levels);
+}
+
 static int devlink_nl_fill(struct sk_buff *msg, struct devlink *devlink,
 			   enum devlink_command cmd, u32 portid,
 			   u32 seq, int flags)
@@ -2958,21 +2964,21 @@ bool devlink_is_reload_failed(const struct devlink *devlink)
 EXPORT_SYMBOL_GPL(devlink_is_reload_failed);
 
 static int devlink_reload(struct devlink *devlink, struct net *dest_net,
-			  struct netlink_ext_ack *extack)
+			  enum devlink_reload_level level, struct netlink_ext_ack *extack)
 {
 	int err;
 
 	if (!devlink->reload_enabled)
 		return -EOPNOTSUPP;
 
-	err = devlink->ops->reload_down(devlink, !!dest_net, extack);
+	err = devlink->ops->reload_down(devlink, !!dest_net, level, extack);
 	if (err)
 		return err;
 
 	if (dest_net && !net_eq(dest_net, devlink_net(devlink)))
 		devlink_reload_netns_change(devlink, dest_net);
 
-	err = devlink->ops->reload_up(devlink, extack);
+	err = devlink->ops->reload_up(devlink, level, extack);
 	devlink_reload_failed_set(devlink, !!err);
 	return err;
 }
@@ -2980,6 +2986,7 @@ static int devlink_reload(struct devlink *devlink, struct net *dest_net,
 static int devlink_nl_cmd_reload(struct sk_buff *skb, struct genl_info *info)
 {
 	struct devlink *devlink = info->user_ptr[0];
+	enum devlink_reload_level level;
 	struct net *dest_net = NULL;
 	int err;
 
@@ -3000,7 +3007,20 @@ static int devlink_nl_cmd_reload(struct sk_buff *skb, struct genl_info *info)
 			return PTR_ERR(dest_net);
 	}
 
-	err = devlink_reload(devlink, dest_net, info->extack);
+	if (info->attrs[DEVLINK_ATTR_RELOAD_LEVEL])
+		level = nla_get_u8(info->attrs[DEVLINK_ATTR_RELOAD_LEVEL]);
+	else
+		level = devlink->ops->default_reload_level;
+
+	if (level == DEVLINK_RELOAD_LEVEL_UNSPEC || level > DEVLINK_RELOAD_LEVEL_MAX) {
+		NL_SET_ERR_MSG_MOD(info->extack, "Invalid reload level");
+		return -EINVAL;
+	} else if (!devlink_reload_level_is_supported(devlink, level)) {
+		NL_SET_ERR_MSG_MOD(info->extack, "Requested reload level is not supported");
+		return -EOPNOTSUPP;
+	}
+
+	err = devlink_reload(devlink, dest_net, level, info->extack);
 
 	if (dest_net)
 		put_net(dest_net);
@@ -7026,6 +7046,7 @@ static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_TRAP_POLICER_RATE] = { .type = NLA_U64 },
 	[DEVLINK_ATTR_TRAP_POLICER_BURST] = { .type = NLA_U64 },
 	[DEVLINK_ATTR_PORT_FUNCTION] = { .type = NLA_NESTED },
+	[DEVLINK_ATTR_RELOAD_LEVEL] = { .type = NLA_U8 },
 };
 
 static const struct genl_ops devlink_nl_ops[] = {
@@ -7351,6 +7372,21 @@ static struct genl_family devlink_nl_family __ro_after_init = {
 	.n_mcgrps	= ARRAY_SIZE(devlink_nl_mcgrps),
 };
 
+static int devlink_reload_levels_verify(struct devlink *devlink)
+{
+	const struct devlink_ops *ops;
+
+	if (!devlink_reload_supported(devlink))
+		return 0;
+
+	ops = devlink->ops;
+	if (WARN_ON(ops->supported_reload_levels >= BIT(__DEVLINK_RELOAD_LEVEL_MAX) ||
+		    ops->supported_reload_levels & BIT(DEVLINK_RELOAD_LEVEL_UNSPEC) ||
+		    !(ops->supported_reload_levels & BIT(ops->default_reload_level))))
+		return -EINVAL;
+	return 0;
+}
+
 /**
  *	devlink_alloc - Allocate new devlink instance resources
  *
@@ -7371,6 +7407,11 @@ struct devlink *devlink_alloc(const struct devlink_ops *ops, size_t priv_size)
 	if (!devlink)
 		return NULL;
 	devlink->ops = ops;
+	if (devlink_reload_levels_verify(devlink)) {
+		kfree(devlink);
+		return NULL;
+	}
+
 	xa_init_flags(&devlink->snapshot_ids, XA_FLAGS_ALLOC);
 	__devlink_net_set(devlink, &init_net);
 	INIT_LIST_HEAD(&devlink->port_list);
@@ -9599,7 +9640,8 @@ static void __net_exit devlink_pernet_pre_exit(struct net *net)
 		if (net_eq(devlink_net(devlink), net)) {
 			if (WARN_ON(!devlink_reload_supported(devlink)))
 				continue;
-			err = devlink_reload(devlink, &init_net, NULL);
+			err = devlink_reload(devlink, &init_net,
+					     devlink->ops->default_reload_level, NULL);
 			if (err && err != -EOPNOTSUPP)
 				pr_warn("Failed to reload devlink instance into init_net\n");
 		}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH net-next RFC 02/13] devlink: Add reload levels data to dev get
  2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
  2020-07-27 11:02 ` [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command Moshe Shemesh
@ 2020-07-27 11:02 ` Moshe Shemesh
  2020-07-28  0:58   ` Jakub Kicinski
  2020-07-27 11:02 ` [PATCH net-next RFC 03/13] net/mlx5: Add functions to set/query MFRL register Moshe Shemesh
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-27 11:02 UTC (permalink / raw)
  To: David S. Miller, Jiri Pirko, Vasundhara Volam
  Cc: netdev, linux-kernel, Moshe Shemesh

Expose devlink reload supported levels and driver's default level to the
user through devlink dev get command.

Examples:
$ devlink dev show
pci/0000:82:00.0:
  reload_levels_info:
    default_level driver
    supported_levels:
      driver fw_reset fw_live_patch
pci/0000:82:00.1:
  reload_levels_info:
    default_level driver
    supported_levels:
      driver fw_reset fw_live_patch

$ devlink dev show -jp
{
    "dev": {
        "pci/0000:82:00.0": {
            "reload_levels_info": {
                "default_level": "driver",
                "supported_levels": [ "driver","fw_reset","fw_live_patch" ]
            }
        },
        "pci/0000:82:00.1": {
            "reload_levels_info": {
                "default_level": "driver",
                "supported_levels": [ "driver","fw_reset","fw_live_patch" ]
            }
        }
    }
}

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
---
 include/uapi/linux/devlink.h |  3 +++
 net/core/devlink.c           | 38 +++++++++++++++++++++++++++++++-----
 2 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index fa5f66db5012..249e921ff106 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -476,6 +476,9 @@ enum devlink_attr {
 	DEVLINK_ATTR_PORT_SPLITTABLE,			/* u8 */
 
 	DEVLINK_ATTR_RELOAD_LEVEL,		/* u8 */
+	DEVLINK_ATTR_RELOAD_DEFAULT_LEVEL,	/* u8 */
+	DEVLINK_ATTR_RELOAD_SUPPORTED_LEVELS,	/* nested */
+	DEVLINK_ATTR_RELOAD_LEVELS_INFO,	/* nested */
 
 	/* add new attributes above here, update the policy in devlink.c */
 
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 31b367a1612d..f1812fc620d4 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -462,6 +462,11 @@ static int devlink_nl_put_handle(struct sk_buff *msg, struct devlink *devlink)
 	return 0;
 }
 
+static bool devlink_reload_supported(struct devlink *devlink)
+{
+	return devlink->ops->reload_down && devlink->ops->reload_up;
+}
+
 static bool
 devlink_reload_level_is_supported(struct devlink *devlink, enum devlink_reload_level level)
 {
@@ -472,7 +477,9 @@ static int devlink_nl_fill(struct sk_buff *msg, struct devlink *devlink,
 			   enum devlink_command cmd, u32 portid,
 			   u32 seq, int flags)
 {
+	struct nlattr *reload_levels_info, *supported_levels;
 	void *hdr;
+	int i;
 
 	hdr = genlmsg_put(msg, portid, seq, &devlink_nl_family, flags, cmd);
 	if (!hdr)
@@ -483,9 +490,35 @@ static int devlink_nl_fill(struct sk_buff *msg, struct devlink *devlink,
 	if (nla_put_u8(msg, DEVLINK_ATTR_RELOAD_FAILED, devlink->reload_failed))
 		goto nla_put_failure;
 
+	if (devlink_reload_supported(devlink)) {
+		reload_levels_info = nla_nest_start(msg, DEVLINK_ATTR_RELOAD_LEVELS_INFO);
+		if (!reload_levels_info)
+			goto nla_put_failure;
+		if (nla_put_u8(msg, DEVLINK_ATTR_RELOAD_DEFAULT_LEVEL,
+			       devlink->ops->default_reload_level))
+			goto reload_levels_info_nest_cancel;
+
+		supported_levels = nla_nest_start(msg, DEVLINK_ATTR_RELOAD_SUPPORTED_LEVELS);
+		if (!supported_levels)
+			goto reload_levels_info_nest_cancel;
+
+		for (i = 0; i <= DEVLINK_RELOAD_LEVEL_MAX; i++) {
+			if (!devlink_reload_level_is_supported(devlink, i))
+				continue;
+			if (nla_put_u8(msg, DEVLINK_ATTR_RELOAD_LEVEL, i))
+				goto supported_levels_nest_cancel;
+		}
+		nla_nest_end(msg, supported_levels);
+		nla_nest_end(msg, reload_levels_info);
+	}
+
 	genlmsg_end(msg, hdr);
 	return 0;
 
+supported_levels_nest_cancel:
+	nla_nest_cancel(msg, supported_levels);
+reload_levels_info_nest_cancel:
+	nla_nest_cancel(msg, reload_levels_info);
 nla_put_failure:
 	genlmsg_cancel(msg, hdr);
 	return -EMSGSIZE;
@@ -2943,11 +2976,6 @@ static void devlink_reload_netns_change(struct devlink *devlink,
 				     DEVLINK_CMD_PARAM_NEW);
 }
 
-static bool devlink_reload_supported(const struct devlink *devlink)
-{
-	return devlink->ops->reload_down && devlink->ops->reload_up;
-}
-
 static void devlink_reload_failed_set(struct devlink *devlink,
 				      bool reload_failed)
 {
-- 
2.17.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH net-next RFC 03/13] net/mlx5: Add functions to set/query MFRL register
  2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
  2020-07-27 11:02 ` [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command Moshe Shemesh
  2020-07-27 11:02 ` [PATCH net-next RFC 02/13] devlink: Add reload levels data to dev get Moshe Shemesh
@ 2020-07-27 11:02 ` Moshe Shemesh
  2020-07-27 11:02 ` [PATCH net-next RFC 04/13] net/mlx5: Set cap for pci sync for fw update event Moshe Shemesh
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-27 11:02 UTC (permalink / raw)
  To: David S. Miller, Jiri Pirko, Vasundhara Volam
  Cc: netdev, linux-kernel, Moshe Shemesh

Add functions to query and set the MFRL reset options supported by
firmware.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/Makefile  |  2 +-
 .../ethernet/mellanox/mlx5/core/fw_reset.c    | 46 +++++++++++++++++++
 .../ethernet/mellanox/mlx5/core/fw_reset.h    | 13 ++++++
 3 files changed, 60 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 10e6886c96ba..4d45a2f6fed6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -16,7 +16,7 @@ mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
 		transobj.o vport.o sriov.o fs_cmd.o fs_core.o pci_irq.o \
 		fs_counters.o rl.o lag.o dev.o events.o wq.o lib/gid.o \
 		lib/devcom.o lib/pci_vsc.o lib/dm.o diag/fs_tracepoint.o \
-		diag/fw_tracer.o diag/crdump.o devlink.o diag/rsc_dump.o
+		diag/fw_tracer.o diag/crdump.o devlink.o diag/rsc_dump.o fw_reset.o
 
 #
 # Netdev basic
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
new file mode 100644
index 000000000000..76d2cece29ac
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2020, Mellanox Technologies inc.  All rights reserved. */
+
+#include "fw_reset.h"
+
+static int mlx5_reg_mfrl_set(struct mlx5_core_dev *dev, u8 reset_level,
+			     u8 reset_type_sel, u8 sync_resp, bool sync_start)
+{
+	u32 out[MLX5_ST_SZ_DW(mfrl_reg)] = {};
+	u32 in[MLX5_ST_SZ_DW(mfrl_reg)] = {};
+
+	MLX5_SET(mfrl_reg, in, reset_level, reset_level);
+	MLX5_SET(mfrl_reg, in, rst_type_sel, reset_type_sel);
+	MLX5_SET(mfrl_reg, in, pci_sync_for_fw_update_resp, sync_resp);
+	MLX5_SET(mfrl_reg, in, pci_sync_for_fw_update_start, sync_start);
+
+	return mlx5_core_access_reg(dev, in, sizeof(in), out, sizeof(out), MLX5_REG_MFRL, 0, 1);
+}
+
+int mlx5_reg_mfrl_query(struct mlx5_core_dev *dev, u8 *reset_level, u8 *reset_type)
+{
+	u32 out[MLX5_ST_SZ_DW(mfrl_reg)] = {};
+	u32 in[MLX5_ST_SZ_DW(mfrl_reg)] = {};
+	int err;
+
+	err = mlx5_core_access_reg(dev, in, sizeof(in), out, sizeof(out), MLX5_REG_MFRL, 0, 0);
+	if (err)
+		return err;
+
+	if (reset_level)
+		*reset_level = MLX5_GET(mfrl_reg, out, reset_level);
+	if (reset_type)
+		*reset_type = MLX5_GET(mfrl_reg, out, reset_type);
+
+	return 0;
+}
+
+int mlx5_fw_set_reset_sync(struct mlx5_core_dev *dev, u8 reset_type_sel)
+{
+	return mlx5_reg_mfrl_set(dev, MLX5_MFRL_REG_RESET_LEVEL3, reset_type_sel, 0, true);
+}
+
+int mlx5_fw_set_live_patch(struct mlx5_core_dev *dev)
+{
+	return mlx5_reg_mfrl_set(dev, MLX5_MFRL_REG_RESET_LEVEL0, 0, 0, false);
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h
new file mode 100644
index 000000000000..1bbd95182ca6
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2020, Mellanox Technologies inc.  All rights reserved. */
+
+#ifndef __MLX5_FW_RESET_H
+#define __MLX5_FW_RESET_H
+
+#include "mlx5_core.h"
+
+int mlx5_reg_mfrl_query(struct mlx5_core_dev *dev, u8 *reset_level, u8 *reset_type);
+int mlx5_fw_set_reset_sync(struct mlx5_core_dev *dev, u8 reset_type_sel);
+int mlx5_fw_set_live_patch(struct mlx5_core_dev *dev);
+
+#endif
-- 
2.17.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH net-next RFC 04/13] net/mlx5: Set cap for pci sync for fw update event
  2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
                   ` (2 preceding siblings ...)
  2020-07-27 11:02 ` [PATCH net-next RFC 03/13] net/mlx5: Add functions to set/query MFRL register Moshe Shemesh
@ 2020-07-27 11:02 ` Moshe Shemesh
  2020-07-27 11:02 ` [PATCH net-next RFC 05/13] net/mlx5: Handle sync reset request event Moshe Shemesh
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-27 11:02 UTC (permalink / raw)
  To: David S. Miller, Jiri Pirko, Vasundhara Volam
  Cc: netdev, linux-kernel, Moshe Shemesh

Set capability to notify the firmware that this host driver is capable
of handling pci sync for firmware update events.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index e32d46c33701..5fb23cfecd71 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -548,6 +548,9 @@ static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx)
 	if (MLX5_CAP_GEN_MAX(dev, dct))
 		MLX5_SET(cmd_hca_cap, set_hca_cap, dct, 1);
 
+	if (MLX5_CAP_GEN_MAX(dev, pci_sync_for_fw_update_event))
+		MLX5_SET(cmd_hca_cap, set_hca_cap, pci_sync_for_fw_update_event, 1);
+
 	if (MLX5_CAP_GEN_MAX(dev, num_vhca_ports))
 		MLX5_SET(cmd_hca_cap,
 			 set_hca_cap,
-- 
2.17.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH net-next RFC 05/13] net/mlx5: Handle sync reset request event
  2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
                   ` (3 preceding siblings ...)
  2020-07-27 11:02 ` [PATCH net-next RFC 04/13] net/mlx5: Set cap for pci sync for fw update event Moshe Shemesh
@ 2020-07-27 11:02 ` Moshe Shemesh
  2020-07-27 11:02 ` [PATCH net-next RFC 06/13] net/mlx5: Handle sync reset now event Moshe Shemesh
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-27 11:02 UTC (permalink / raw)
  To: David S. Miller, Jiri Pirko, Vasundhara Volam
  Cc: netdev, linux-kernel, Moshe Shemesh

Once the driver gets sync_reset_request from firmware it prepares for the
coming reset and sends acknowledge.
After getting this event the driver expects device reset, either it will
trigger PCI reset on sync_reset_now event or such PCI reset will be
triggered by another PF of the same device. So it moves to reset
requested mode and if it gets PCI reset triggered by the other PF it
does silent recovery without reporting it as an error.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
---
 .../ethernet/mellanox/mlx5/core/fw_reset.c    | 88 +++++++++++++++++++
 .../ethernet/mellanox/mlx5/core/fw_reset.h    |  3 +
 .../net/ethernet/mellanox/mlx5/core/health.c  | 74 +++++++++++++---
 .../net/ethernet/mellanox/mlx5/core/main.c    | 10 +++
 include/linux/mlx5/driver.h                   | 11 +++
 5 files changed, 173 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
index 76d2cece29ac..13f83fc1783f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
@@ -3,6 +3,13 @@
 
 #include "fw_reset.h"
 
+struct mlx5_fw_reset {
+	struct mlx5_core_dev *dev;
+	struct mlx5_nb nb;
+	struct workqueue_struct *wq;
+	struct work_struct reset_request_work;
+};
+
 static int mlx5_reg_mfrl_set(struct mlx5_core_dev *dev, u8 reset_level,
 			     u8 reset_type_sel, u8 sync_resp, bool sync_start)
 {
@@ -44,3 +51,84 @@ int mlx5_fw_set_live_patch(struct mlx5_core_dev *dev)
 {
 	return mlx5_reg_mfrl_set(dev, MLX5_MFRL_REG_RESET_LEVEL0, 0, 0, false);
 }
+
+static int mlx5_fw_set_reset_sync_ack(struct mlx5_core_dev *dev)
+{
+	return mlx5_reg_mfrl_set(dev, MLX5_MFRL_REG_RESET_LEVEL3, 0, 1, false);
+}
+
+static void mlx5_sync_reset_request_event(struct work_struct *work)
+{
+	struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset,
+						      reset_request_work);
+	struct mlx5_core_dev *dev = fw_reset->dev;
+
+	mlx5_health_set_reset_requested_mode(dev);
+	mlx5_reload_health_poll_timer(dev);
+	if (mlx5_fw_set_reset_sync_ack(dev))
+		mlx5_core_warn(dev, "PCI Sync FW Update Reset Ack Failed.\n");
+	else
+		mlx5_core_warn(dev, "PCI Sync FW Update Reset Ack. Device reset is expected.\n");
+}
+
+static void mlx5_sync_reset_events_handle(struct mlx5_fw_reset *fw_reset, struct mlx5_eqe *eqe)
+{
+	struct mlx5_eqe_sync_fw_update *sync_fw_update_eqe;
+	u8 sync_event_rst_type;
+
+	sync_fw_update_eqe = &eqe->data.sync_fw_update;
+	sync_event_rst_type = sync_fw_update_eqe->sync_rst_state & SYNC_RST_STATE_MASK;
+	switch (sync_event_rst_type) {
+	case MLX5_SYNC_RST_STATE_RESET_REQUEST:
+		queue_work(fw_reset->wq, &fw_reset->reset_request_work);
+		break;
+	}
+}
+
+static int fw_reset_event_notifier(struct notifier_block *nb, unsigned long action, void *data)
+{
+	struct mlx5_fw_reset *fw_reset = mlx5_nb_cof(nb, struct mlx5_fw_reset, nb);
+	struct mlx5_eqe *eqe = data;
+
+	switch (eqe->sub_type) {
+	case MLX5_GENERAL_SUBTYPE_PCI_SYNC_FOR_FW_UPDATE_EVENT:
+		mlx5_sync_reset_events_handle(fw_reset, eqe);
+		break;
+	default:
+		return NOTIFY_DONE;
+	}
+
+	return NOTIFY_OK;
+}
+
+int mlx5_fw_reset_events_init(struct mlx5_core_dev *dev)
+{
+	struct mlx5_fw_reset *fw_reset = kzalloc(sizeof(*fw_reset), GFP_KERNEL);
+
+	if (!fw_reset)
+		return -ENOMEM;
+	fw_reset->wq = create_singlethread_workqueue("mlx5_fw_reset_events");
+	if (!fw_reset->wq) {
+		kfree(fw_reset);
+		return -ENOMEM;
+	}
+
+	fw_reset->dev = dev;
+	dev->priv.fw_reset = fw_reset;
+
+	INIT_WORK(&fw_reset->reset_request_work, mlx5_sync_reset_request_event);
+
+	MLX5_NB_INIT(&fw_reset->nb, fw_reset_event_notifier, GENERAL_EVENT);
+	mlx5_eq_notifier_register(dev, &fw_reset->nb);
+
+	return 0;
+}
+
+void mlx5_fw_reset_events_cleanup(struct mlx5_core_dev *dev)
+{
+	struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
+
+	mlx5_eq_notifier_unregister(dev, &fw_reset->nb);
+	destroy_workqueue(fw_reset->wq);
+	kvfree(dev->priv.fw_reset);
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h
index 1bbd95182ca6..278f538ea92a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h
@@ -10,4 +10,7 @@ int mlx5_reg_mfrl_query(struct mlx5_core_dev *dev, u8 *reset_level, u8 *reset_ty
 int mlx5_fw_set_reset_sync(struct mlx5_core_dev *dev, u8 reset_type_sel);
 int mlx5_fw_set_live_patch(struct mlx5_core_dev *dev);
 
+int mlx5_fw_reset_events_init(struct mlx5_core_dev *dev);
+void mlx5_fw_reset_events_cleanup(struct mlx5_core_dev *dev);
+
 #endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index b31f769d2df9..371134789375 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -47,6 +47,8 @@ enum {
 	MAX_MISSES			= 3,
 };
 
+#define MLX5_HEALTH_RR_POLL_INTERVAL	(HZ / 10)
+
 enum {
 	MLX5_HEALTH_SYNDR_FW_ERR		= 0x1,
 	MLX5_HEALTH_SYNDR_IRISC_ERR		= 0x7,
@@ -222,6 +224,7 @@ void mlx5_enter_error_state(struct mlx5_core_dev *dev, bool force)
 #define MLX5_FW_RESET_WAIT_MS	1000
 void mlx5_error_sw_reset(struct mlx5_core_dev *dev)
 {
+	struct mlx5_core_health *health = &dev->priv.health;
 	unsigned long end, delay_ms = MLX5_FW_RESET_WAIT_MS;
 	int lock = -EBUSY;
 
@@ -229,7 +232,8 @@ void mlx5_error_sw_reset(struct mlx5_core_dev *dev)
 	if (dev->state != MLX5_DEVICE_STATE_INTERNAL_ERROR)
 		goto unlock;
 
-	mlx5_core_err(dev, "start\n");
+	if (!test_bit(MLX5_HEALTH_RESET_FLAGS_SILENT_RECOVERY, &health->reset_flags))
+		mlx5_core_err(dev, "start\n");
 
 	if (check_fatal_sensors(dev) == MLX5_SENSOR_FW_SYND_RFR) {
 		/* Get cr-dump and reset FW semaphore */
@@ -262,7 +266,8 @@ void mlx5_error_sw_reset(struct mlx5_core_dev *dev)
 	if (!lock)
 		lock_sem_sw_reset(dev, false);
 
-	mlx5_core_err(dev, "end\n");
+	if (!test_bit(MLX5_HEALTH_RESET_FLAGS_SILENT_RECOVERY, &health->reset_flags))
+		mlx5_core_err(dev, "end\n");
 
 unlock:
 	mutex_unlock(&dev->intf_state_mutex);
@@ -310,10 +315,15 @@ static void mlx5_handle_bad_state(struct mlx5_core_dev *dev)
 #define MLX5_RECOVERY_WAIT_MSECS 60000
 static int mlx5_health_try_recover(struct mlx5_core_dev *dev)
 {
+	struct mlx5_core_health *health = &dev->priv.health;
 	unsigned long end;
 
-	mlx5_core_warn(dev, "handling bad device here\n");
-	mlx5_handle_bad_state(dev);
+	if (!test_bit(MLX5_HEALTH_RESET_FLAGS_SILENT_RECOVERY, &health->reset_flags)) {
+		mlx5_core_warn(dev, "handling bad device here\n");
+		mlx5_handle_bad_state(dev);
+	} else {
+		mlx5_disable_device(dev);
+	}
 	end = jiffies + msecs_to_jiffies(MLX5_RECOVERY_WAIT_MSECS);
 	while (sensor_pci_not_working(dev)) {
 		if (time_after(jiffies, end)) {
@@ -324,7 +334,8 @@ static int mlx5_health_try_recover(struct mlx5_core_dev *dev)
 		msleep(100);
 	}
 
-	mlx5_core_err(dev, "starting health recovery flow\n");
+	if (!test_and_clear_bit(MLX5_HEALTH_RESET_FLAGS_SILENT_RECOVERY, &health->reset_flags))
+		mlx5_core_err(dev, "starting health recovery flow\n");
 	mlx5_recover_device(dev);
 	if (!test_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state) ||
 	    check_fatal_sensors(dev)) {
@@ -609,7 +620,8 @@ static void mlx5_fw_fatal_reporter_err_work(struct work_struct *work)
 	dev = container_of(priv, struct mlx5_core_dev, priv);
 
 	mlx5_enter_error_state(dev, false);
-	if (IS_ERR_OR_NULL(health->fw_fatal_reporter)) {
+	if (test_bit(MLX5_HEALTH_RESET_FLAGS_SILENT_RECOVERY, &health->reset_flags) ||
+	    IS_ERR_OR_NULL(health->fw_fatal_reporter)) {
 		if (mlx5_health_try_recover(dev))
 			mlx5_core_err(dev, "health recovery failed\n");
 		return;
@@ -660,13 +672,17 @@ static void mlx5_fw_reporters_destroy(struct mlx5_core_dev *dev)
 		devlink_health_reporter_destroy(health->fw_fatal_reporter);
 }
 
-static unsigned long get_next_poll_jiffies(void)
+static unsigned long get_next_poll_jiffies(bool reset_requested)
 {
 	unsigned long next;
 
-	get_random_bytes(&next, sizeof(next));
-	next %= HZ;
-	next += jiffies + MLX5_HEALTH_POLL_INTERVAL;
+	if (reset_requested) {
+		next = jiffies + MLX5_HEALTH_RR_POLL_INTERVAL;
+	} else {
+		get_random_bytes(&next, sizeof(next));
+		next %= HZ;
+		next += jiffies + MLX5_HEALTH_POLL_INTERVAL;
+	}
 
 	return next;
 }
@@ -699,9 +715,16 @@ static void poll_health(struct timer_list *t)
 	fatal_error = check_fatal_sensors(dev);
 
 	if (fatal_error && !health->fatal_error) {
-		mlx5_core_err(dev, "Fatal error %u detected\n", fatal_error);
+		if (!test_and_clear_bit(MLX5_HEALTH_RESET_FLAGS_RESET_REQUESTED,
+					&health->reset_flags)) {
+			mlx5_core_err(dev, "Fatal error %u detected\n",
+				      fatal_error);
+			print_health_info(dev);
+		} else {
+			mlx5_core_warn(dev, "Got Device Reset\n");
+			set_bit(MLX5_HEALTH_RESET_FLAGS_SILENT_RECOVERY, &health->reset_flags);
+		}
 		dev->priv.health.fatal_error = fatal_error;
-		print_health_info(dev);
 		mlx5_trigger_health_work(dev);
 		goto out;
 	}
@@ -725,7 +748,9 @@ static void poll_health(struct timer_list *t)
 		queue_work(health->wq, &health->report_work);
 
 out:
-	mod_timer(&health->timer, get_next_poll_jiffies());
+	mod_timer(&health->timer,
+		  get_next_poll_jiffies(test_bit(MLX5_HEALTH_RESET_FLAGS_RESET_REQUESTED,
+						 &health->reset_flags)));
 }
 
 void mlx5_start_health_poll(struct mlx5_core_dev *dev)
@@ -756,6 +781,15 @@ void mlx5_stop_health_poll(struct mlx5_core_dev *dev, bool disable_health)
 	del_timer_sync(&health->timer);
 }
 
+void mlx5_reload_health_poll_timer(struct mlx5_core_dev *dev)
+{
+	struct mlx5_core_health *health = &dev->priv.health;
+
+	mod_timer(&health->timer,
+		  get_next_poll_jiffies(test_bit(MLX5_HEALTH_RESET_FLAGS_RESET_REQUESTED,
+						 &health->reset_flags)));
+}
+
 void mlx5_drain_health_wq(struct mlx5_core_dev *dev)
 {
 	struct mlx5_core_health *health = &dev->priv.health;
@@ -768,6 +802,20 @@ void mlx5_drain_health_wq(struct mlx5_core_dev *dev)
 	cancel_work_sync(&health->fatal_report_work);
 }
 
+void mlx5_health_set_reset_requested_mode(struct mlx5_core_dev *dev)
+{
+	struct mlx5_core_health *health = &dev->priv.health;
+
+	set_bit(MLX5_HEALTH_RESET_FLAGS_RESET_REQUESTED, &health->reset_flags);
+}
+
+void mlx5_health_clear_reset_requested_mode(struct mlx5_core_dev *dev)
+{
+	struct mlx5_core_health *health = &dev->priv.health;
+
+	clear_bit(MLX5_HEALTH_RESET_FLAGS_RESET_REQUESTED, &health->reset_flags);
+}
+
 void mlx5_health_flush(struct mlx5_core_dev *dev)
 {
 	struct mlx5_core_health *health = &dev->priv.health;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 5fb23cfecd71..8aa4d5aec105 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -57,6 +57,7 @@
 #include "lib/mpfs.h"
 #include "eswitch.h"
 #include "devlink.h"
+#include "fw_reset.h"
 #include "lib/mlx5.h"
 #include "fpga/core.h"
 #include "fpga/ipsec.h"
@@ -832,6 +833,12 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
 		goto err_eq_cleanup;
 	}
 
+	err = mlx5_fw_reset_events_init(dev);
+	if (err) {
+		mlx5_core_err(dev, "failed to initialize fw reset events\n");
+		goto err_events_cleanup;
+	}
+
 	mlx5_cq_debugfs_init(dev);
 
 	mlx5_init_reserved_gids(dev);
@@ -893,6 +900,8 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
 	mlx5_geneve_destroy(dev->geneve);
 	mlx5_vxlan_destroy(dev->vxlan);
 	mlx5_cq_debugfs_cleanup(dev);
+	mlx5_fw_reset_events_cleanup(dev);
+err_events_cleanup:
 	mlx5_events_cleanup(dev);
 err_eq_cleanup:
 	mlx5_eq_table_cleanup(dev);
@@ -920,6 +929,7 @@ static void mlx5_cleanup_once(struct mlx5_core_dev *dev)
 	mlx5_cleanup_clock(dev);
 	mlx5_cleanup_reserved_gids(dev);
 	mlx5_cq_debugfs_cleanup(dev);
+	mlx5_fw_reset_events_cleanup(dev);
 	mlx5_events_cleanup(dev);
 	mlx5_eq_table_cleanup(dev);
 	mlx5_irq_table_cleanup(dev);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 6a97ad601991..27b6086f3095 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -423,6 +423,11 @@ struct mlx5_sq_bfreg {
 	unsigned int		offset;
 };
 
+enum {
+	MLX5_HEALTH_RESET_FLAGS_RESET_REQUESTED,
+	MLX5_HEALTH_RESET_FLAGS_SILENT_RECOVERY,
+};
+
 struct mlx5_core_health {
 	struct health_buffer __iomem   *health;
 	__be32 __iomem		       *health_counter;
@@ -432,6 +437,7 @@ struct mlx5_core_health {
 	u8				synd;
 	u32				fatal_error;
 	u32				crdump_size;
+	unsigned long			reset_flags;
 	/* wq spinlock to synchronize draining */
 	spinlock_t			wq_lock;
 	struct workqueue_struct	       *wq;
@@ -501,6 +507,7 @@ struct mlx5_mpfs;
 struct mlx5_eswitch;
 struct mlx5_lag;
 struct mlx5_devcom;
+struct mlx5_fw_reset;
 struct mlx5_eq_table;
 struct mlx5_irq_table;
 
@@ -578,6 +585,7 @@ struct mlx5_priv {
 	struct mlx5_core_sriov	sriov;
 	struct mlx5_lag		*lag;
 	struct mlx5_devcom	*devcom;
+	struct mlx5_fw_reset	*fw_reset;
 	struct mlx5_core_roce	roce;
 	struct mlx5_fc_stats		fc_stats;
 	struct mlx5_rl_table            rl_table;
@@ -942,8 +950,11 @@ void mlx5_health_cleanup(struct mlx5_core_dev *dev);
 int mlx5_health_init(struct mlx5_core_dev *dev);
 void mlx5_start_health_poll(struct mlx5_core_dev *dev);
 void mlx5_stop_health_poll(struct mlx5_core_dev *dev, bool disable_health);
+void mlx5_reload_health_poll_timer(struct mlx5_core_dev *dev);
 void mlx5_drain_health_wq(struct mlx5_core_dev *dev);
 void mlx5_trigger_health_work(struct mlx5_core_dev *dev);
+void mlx5_health_set_reset_requested_mode(struct mlx5_core_dev *dev);
+void mlx5_health_clear_reset_requested_mode(struct mlx5_core_dev *dev);
 int mlx5_buf_alloc(struct mlx5_core_dev *dev,
 		   int size, struct mlx5_frag_buf *buf);
 void mlx5_buf_free(struct mlx5_core_dev *dev, struct mlx5_frag_buf *buf);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH net-next RFC 06/13] net/mlx5: Handle sync reset now event
  2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
                   ` (4 preceding siblings ...)
  2020-07-27 11:02 ` [PATCH net-next RFC 05/13] net/mlx5: Handle sync reset request event Moshe Shemesh
@ 2020-07-27 11:02 ` Moshe Shemesh
  2020-07-27 11:02 ` [PATCH net-next RFC 07/13] net/mlx5: Handle sync reset abort event Moshe Shemesh
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-27 11:02 UTC (permalink / raw)
  To: David S. Miller, Jiri Pirko, Vasundhara Volam
  Cc: netdev, linux-kernel, Moshe Shemesh

On sync_reset_now event the driver does reload and PCI link toggle to
activate firmware upgrade reset. When the firmware sends this event it
syncs the event on all PFs, so all PFs will do PCI link toggle at once.
To do PCI link toggle, the driver ensures that no other device ID under
the same bridge by checking that all the PF functions under the same PCI
bridge have same device ID. If no other device it uses PCI bridge link
control to turn link down and up.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
---
 .../ethernet/mellanox/mlx5/core/fw_reset.c    | 126 ++++++++++++++++++
 1 file changed, 126 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
index 13f83fc1783f..87f2b24f4aaa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
@@ -8,6 +8,9 @@ struct mlx5_fw_reset {
 	struct mlx5_nb nb;
 	struct workqueue_struct *wq;
 	struct work_struct reset_request_work;
+	struct work_struct reset_now_work;
+	struct completion done;
+	int ret;
 };
 
 static int mlx5_reg_mfrl_set(struct mlx5_core_dev *dev, u8 reset_level,
@@ -71,6 +74,124 @@ static void mlx5_sync_reset_request_event(struct work_struct *work)
 		mlx5_core_warn(dev, "PCI Sync FW Update Reset Ack. Device reset is expected.\n");
 }
 
+#define MLX5_PCI_LINK_UP_TIMEOUT 2000
+
+static int mlx5_pci_link_toggle(struct mlx5_core_dev *dev)
+{
+	struct pci_bus *bridge_bus = dev->pdev->bus;
+	struct pci_dev *bridge = bridge_bus->self;
+	u16 reg16, dev_id, sdev_id;
+	unsigned long timeout;
+	struct pci_dev *sdev;
+	int cap, err;
+	u32 reg32;
+
+	/* Check that all functions under the pci bridge are PFs of
+	 * this device otherwise fail this function.
+	 */
+	err = pci_read_config_word(dev->pdev, PCI_DEVICE_ID, &dev_id);
+	if (err)
+		return err;
+	list_for_each_entry(sdev, &bridge_bus->devices, bus_list) {
+		err = pci_read_config_word(sdev, PCI_DEVICE_ID, &sdev_id);
+		if (err)
+			return err;
+		if (sdev_id != dev_id)
+			return -EPERM;
+	}
+
+	cap = pci_find_capability(bridge, PCI_CAP_ID_EXP);
+	if (!cap)
+		return -EOPNOTSUPP;
+
+	list_for_each_entry(sdev, &bridge_bus->devices, bus_list) {
+		pci_save_state(sdev);
+		pci_cfg_access_lock(sdev);
+	}
+	/* PCI link toggle */
+	err = pci_read_config_word(bridge, cap + PCI_EXP_LNKCTL, &reg16);
+	if (err)
+		return err;
+	reg16 |= PCI_EXP_LNKCTL_LD;
+	err = pci_write_config_word(bridge, cap + PCI_EXP_LNKCTL, reg16);
+	if (err)
+		return err;
+	msleep(500);
+	reg16 &= ~PCI_EXP_LNKCTL_LD;
+	err = pci_write_config_word(bridge, cap + PCI_EXP_LNKCTL, reg16);
+	if (err)
+		return err;
+
+	/* Check link */
+	err = pci_read_config_dword(bridge, cap + PCI_EXP_LNKCAP, &reg32);
+	if (err)
+		return err;
+	if (!(reg32 & PCI_EXP_LNKCAP_DLLLARC)) {
+		mlx5_core_warn(dev, "No PCI link reporting capability (0x%08x)\n", reg32);
+		msleep(1000);
+		goto restore;
+	}
+
+	timeout = jiffies + msecs_to_jiffies(MLX5_PCI_LINK_UP_TIMEOUT);
+	do {
+		err = pci_read_config_word(bridge, cap + PCI_EXP_LNKSTA, &reg16);
+		if (err)
+			return err;
+		if (reg16 & PCI_EXP_LNKSTA_DLLLA)
+			break;
+		msleep(20);
+	} while (!time_after(jiffies, timeout));
+
+	if (reg16 & PCI_EXP_LNKSTA_DLLLA) {
+		mlx5_core_info(dev, "PCI Link up\n");
+	} else {
+		mlx5_core_err(dev, "PCI link not ready (0x%04x) after %d ms\n",
+			      reg16, MLX5_PCI_LINK_UP_TIMEOUT);
+		err = -ETIMEDOUT;
+	}
+
+restore:
+	list_for_each_entry(sdev, &bridge_bus->devices, bus_list) {
+		pci_cfg_access_unlock(sdev);
+		pci_restore_state(sdev);
+	}
+
+	return err;
+}
+
+static void mlx5_sync_reset_now_event(struct work_struct *work)
+{
+	struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset,
+						      reset_now_work);
+	struct mlx5_core_dev *dev = fw_reset->dev;
+	int err;
+
+	mlx5_health_clear_reset_requested_mode(dev);
+	mlx5_stop_health_poll(dev, true);
+
+	mlx5_core_warn(dev, "Sync Reset now. Device is going to reset.\n");
+
+	err = mlx5_cmd_fast_teardown_hca(dev);
+	if (err) {
+		mlx5_core_warn(dev, "Fast teardown failed, no reset done, err %d\n", err);
+		goto done;
+	}
+
+	err = mlx5_pci_link_toggle(dev);
+	if (err) {
+		mlx5_core_warn(dev, "mlx5_pci_link_toggle failed, no reset done, err %d\n", err);
+		goto done;
+	}
+
+	mlx5_enter_error_state(dev, true);
+	mlx5_unload_one(dev, false);
+done:
+	if (err)
+		mlx5_start_health_poll(dev);
+	fw_reset->ret = err;
+	complete(&fw_reset->done);
+}
+
 static void mlx5_sync_reset_events_handle(struct mlx5_fw_reset *fw_reset, struct mlx5_eqe *eqe)
 {
 	struct mlx5_eqe_sync_fw_update *sync_fw_update_eqe;
@@ -82,6 +203,9 @@ static void mlx5_sync_reset_events_handle(struct mlx5_fw_reset *fw_reset, struct
 	case MLX5_SYNC_RST_STATE_RESET_REQUEST:
 		queue_work(fw_reset->wq, &fw_reset->reset_request_work);
 		break;
+	case MLX5_SYNC_RST_STATE_RESET_NOW:
+		queue_work(fw_reset->wq, &fw_reset->reset_now_work);
+		break;
 	}
 }
 
@@ -117,10 +241,12 @@ int mlx5_fw_reset_events_init(struct mlx5_core_dev *dev)
 	dev->priv.fw_reset = fw_reset;
 
 	INIT_WORK(&fw_reset->reset_request_work, mlx5_sync_reset_request_event);
+	INIT_WORK(&fw_reset->reset_now_work, mlx5_sync_reset_now_event);
 
 	MLX5_NB_INIT(&fw_reset->nb, fw_reset_event_notifier, GENERAL_EVENT);
 	mlx5_eq_notifier_register(dev, &fw_reset->nb);
 
+	init_completion(&fw_reset->done);
 	return 0;
 }
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH net-next RFC 07/13] net/mlx5: Handle sync reset abort event
  2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
                   ` (5 preceding siblings ...)
  2020-07-27 11:02 ` [PATCH net-next RFC 06/13] net/mlx5: Handle sync reset now event Moshe Shemesh
@ 2020-07-27 11:02 ` Moshe Shemesh
  2020-07-27 11:02 ` [PATCH net-next RFC 08/13] net/mlx5: Add support for devlink reload level fw reset Moshe Shemesh
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-27 11:02 UTC (permalink / raw)
  To: David S. Miller, Jiri Pirko, Vasundhara Volam
  Cc: netdev, linux-kernel, Moshe Shemesh

If firmware sends sync_reset_abort to driver the driver should clear the
reset requested mode as reset is not expected any more.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/fw_reset.c    | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
index 87f2b24f4aaa..e665080e9a4e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
@@ -9,6 +9,7 @@ struct mlx5_fw_reset {
 	struct workqueue_struct *wq;
 	struct work_struct reset_request_work;
 	struct work_struct reset_now_work;
+	struct work_struct reset_abort_work;
 	struct completion done;
 	int ret;
 };
@@ -192,6 +193,16 @@ static void mlx5_sync_reset_now_event(struct work_struct *work)
 	complete(&fw_reset->done);
 }
 
+static void mlx5_sync_reset_abort_event(struct work_struct *work)
+{
+	struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset,
+						      reset_abort_work);
+	struct mlx5_core_dev *dev = fw_reset->dev;
+
+	mlx5_health_clear_reset_requested_mode(dev);
+	mlx5_core_warn(dev, "PCI Sync FW Update Reset Aborted.\n");
+}
+
 static void mlx5_sync_reset_events_handle(struct mlx5_fw_reset *fw_reset, struct mlx5_eqe *eqe)
 {
 	struct mlx5_eqe_sync_fw_update *sync_fw_update_eqe;
@@ -206,6 +217,9 @@ static void mlx5_sync_reset_events_handle(struct mlx5_fw_reset *fw_reset, struct
 	case MLX5_SYNC_RST_STATE_RESET_NOW:
 		queue_work(fw_reset->wq, &fw_reset->reset_now_work);
 		break;
+	case MLX5_SYNC_RST_STATE_RESET_ABORT:
+		queue_work(fw_reset->wq, &fw_reset->reset_abort_work);
+		break;
 	}
 }
 
@@ -242,6 +256,7 @@ int mlx5_fw_reset_events_init(struct mlx5_core_dev *dev)
 
 	INIT_WORK(&fw_reset->reset_request_work, mlx5_sync_reset_request_event);
 	INIT_WORK(&fw_reset->reset_now_work, mlx5_sync_reset_now_event);
+	INIT_WORK(&fw_reset->reset_abort_work, mlx5_sync_reset_abort_event);
 
 	MLX5_NB_INIT(&fw_reset->nb, fw_reset_event_notifier, GENERAL_EVENT);
 	mlx5_eq_notifier_register(dev, &fw_reset->nb);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH net-next RFC 08/13] net/mlx5: Add support for devlink reload level fw reset
  2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
                   ` (6 preceding siblings ...)
  2020-07-27 11:02 ` [PATCH net-next RFC 07/13] net/mlx5: Handle sync reset abort event Moshe Shemesh
@ 2020-07-27 11:02 ` Moshe Shemesh
  2020-07-27 11:02 ` [PATCH net-next RFC 09/13] devlink: Add enable_remote_dev_reset generic parameter Moshe Shemesh
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-27 11:02 UTC (permalink / raw)
  To: David S. Miller, Jiri Pirko, Vasundhara Volam
  Cc: netdev, linux-kernel, Moshe Shemesh

Add support for devlink reload level fw_reset which does firmware reset
and driver entities re-instantiation. Once this reload command is
executed the driver initiates fw sync reset flow, where the firmware
synchronizes all PFs on coming reset and driver's entities
re-instantiation.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/devlink.c | 53 +++++++++++++++++--
 .../ethernet/mellanox/mlx5/core/fw_reset.c    | 14 +++++
 .../ethernet/mellanox/mlx5/core/fw_reset.h    |  1 +
 3 files changed, 64 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
index 5424e31a0f45..905d55cab4c3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
@@ -4,6 +4,7 @@
 #include <devlink.h>
 
 #include "mlx5_core.h"
+#include "fw_reset.h"
 #include "fs_core.h"
 #include "eswitch.h"
 
@@ -88,13 +89,48 @@ mlx5_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req,
 	return 0;
 }
 
+static int mlx5_devlink_trigger_fw_reset(struct devlink *devlink, struct netlink_ext_ack *extack)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+	u8 reset_level, reset_type, net_port_alive;
+	int err;
+
+	err = mlx5_reg_mfrl_query(dev, &reset_level, &reset_type);
+	if (err)
+		return err;
+	if (!(reset_level & MLX5_MFRL_REG_RESET_LEVEL3)) {
+		NL_SET_ERR_MSG_MOD(extack, "FW reset requires reboot");
+		return -EINVAL;
+	}
+
+	net_port_alive = !!(reset_type & MLX5_MFRL_REG_RESET_TYPE_NET_PORT_ALIVE);
+	err = mlx5_fw_set_reset_sync(dev, net_port_alive);
+	if (err)
+		goto out;
+
+	err = mlx5_fw_wait_fw_reset_done(dev);
+out:
+	if (err)
+		NL_SET_ERR_MSG_MOD(extack, "FW reset command failed");
+	return err;
+}
+
 static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
 				    enum devlink_reload_level level, struct netlink_ext_ack *extack)
 {
 	struct mlx5_core_dev *dev = devlink_priv(devlink);
 
-	mlx5_unload_one(dev, false);
-	return 0;
+	switch (level) {
+	case DEVLINK_RELOAD_LEVEL_DRIVER:
+		mlx5_unload_one(dev, false);
+		return 0;
+	case DEVLINK_RELOAD_LEVEL_FW_RESET:
+		return mlx5_devlink_trigger_fw_reset(devlink, extack);
+	default:
+		/* Unsupported level should not get to this function */
+		WARN_ON(1);
+		return -EOPNOTSUPP;
+	}
 }
 
 static int mlx5_devlink_reload_up(struct devlink *devlink, enum devlink_reload_level level,
@@ -102,7 +138,15 @@ static int mlx5_devlink_reload_up(struct devlink *devlink, enum devlink_reload_l
 {
 	struct mlx5_core_dev *dev = devlink_priv(devlink);
 
-	return mlx5_load_one(dev, false);
+	switch (level) {
+	case DEVLINK_RELOAD_LEVEL_DRIVER:
+	case DEVLINK_RELOAD_LEVEL_FW_RESET:
+		return mlx5_load_one(dev, false);
+	default:
+		/* Unsupported level should not get to this function */
+		WARN_ON(1);
+		return -EOPNOTSUPP;
+	}
 }
 
 static const struct devlink_ops mlx5_devlink_ops = {
@@ -118,7 +162,8 @@ static const struct devlink_ops mlx5_devlink_ops = {
 #endif
 	.flash_update = mlx5_devlink_flash_update,
 	.info_get = mlx5_devlink_info_get,
-	.supported_reload_levels = BIT(DEVLINK_RELOAD_LEVEL_DRIVER),
+	.supported_reload_levels = BIT(DEVLINK_RELOAD_LEVEL_DRIVER) |
+				   BIT(DEVLINK_RELOAD_LEVEL_FW_RESET),
 	.default_reload_level = DEVLINK_RELOAD_LEVEL_DRIVER,
 	.reload_down = mlx5_devlink_reload_down,
 	.reload_up = mlx5_devlink_reload_up,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
index e665080e9a4e..f95df226b915 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
@@ -239,6 +239,20 @@ static int fw_reset_event_notifier(struct notifier_block *nb, unsigned long acti
 	return NOTIFY_OK;
 }
 
+#define MLX5_FW_RESET_TIMEOUT_MSEC 5000
+int mlx5_fw_wait_fw_reset_done(struct mlx5_core_dev *dev)
+{
+	unsigned long timeout = msecs_to_jiffies(MLX5_FW_RESET_TIMEOUT_MSEC);
+	struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
+
+	if (!wait_for_completion_timeout(&fw_reset->done, timeout)) {
+		mlx5_core_warn(dev, "FW sync reset timeout after %d seconds\n",
+			       MLX5_FW_RESET_TIMEOUT_MSEC / 1000);
+		return -ETIMEDOUT;
+	}
+	return fw_reset->ret;
+}
+
 int mlx5_fw_reset_events_init(struct mlx5_core_dev *dev)
 {
 	struct mlx5_fw_reset *fw_reset = kzalloc(sizeof(*fw_reset), GFP_KERNEL);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h
index 278f538ea92a..d7ee951a2258 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h
@@ -10,6 +10,7 @@ int mlx5_reg_mfrl_query(struct mlx5_core_dev *dev, u8 *reset_level, u8 *reset_ty
 int mlx5_fw_set_reset_sync(struct mlx5_core_dev *dev, u8 reset_type_sel);
 int mlx5_fw_set_live_patch(struct mlx5_core_dev *dev);
 
+int mlx5_fw_wait_fw_reset_done(struct mlx5_core_dev *dev);
 int mlx5_fw_reset_events_init(struct mlx5_core_dev *dev);
 void mlx5_fw_reset_events_cleanup(struct mlx5_core_dev *dev);
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH net-next RFC 09/13] devlink: Add enable_remote_dev_reset generic parameter
  2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
                   ` (7 preceding siblings ...)
  2020-07-27 11:02 ` [PATCH net-next RFC 08/13] net/mlx5: Add support for devlink reload level fw reset Moshe Shemesh
@ 2020-07-27 11:02 ` Moshe Shemesh
  2020-07-28  0:59   ` Jakub Kicinski
  2020-07-27 11:02 ` [PATCH net-next RFC 10/13] net/mlx5: Add devlink param enable_remote_dev_reset support Moshe Shemesh
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-27 11:02 UTC (permalink / raw)
  To: David S. Miller, Jiri Pirko, Vasundhara Volam
  Cc: netdev, linux-kernel, Moshe Shemesh

The enable_remote_dev_reset devlink param flags that the host admin
allows device resets that can be initiated by other hosts. This
parameter is useful for setups where a device is shared by different
hosts, such as multi-host setup. Once the user set this parameter to
false, the driver should NACK any attempt to reset the device while the
driver is loaded.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
---
 Documentation/networking/devlink/devlink-params.rst | 6 ++++++
 include/net/devlink.h                               | 4 ++++
 net/core/devlink.c                                  | 5 +++++
 3 files changed, 15 insertions(+)

diff --git a/Documentation/networking/devlink/devlink-params.rst b/Documentation/networking/devlink/devlink-params.rst
index d075fd090b3d..54c9f107c4b0 100644
--- a/Documentation/networking/devlink/devlink-params.rst
+++ b/Documentation/networking/devlink/devlink-params.rst
@@ -108,3 +108,9 @@ own name.
    * - ``region_snapshot_enable``
      - Boolean
      - Enable capture of ``devlink-region`` snapshots.
+   * - ``enable_remote_dev_reset``
+     - Boolean
+     - Enable device reset by remote host. When cleared, the device driver
+       will NACK any attempt of other host to reset the device. This parameter
+       is useful for setups where a device is shared by different hosts, such
+       as multi-host setup.
diff --git a/include/net/devlink.h b/include/net/devlink.h
index b291cd8d6be6..125e7bb9bb82 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -420,6 +420,7 @@ enum devlink_param_generic_id {
 	DEVLINK_PARAM_GENERIC_ID_FW_LOAD_POLICY,
 	DEVLINK_PARAM_GENERIC_ID_RESET_DEV_ON_DRV_PROBE,
 	DEVLINK_PARAM_GENERIC_ID_ENABLE_ROCE,
+	DEVLINK_PARAM_GENERIC_ID_ENABLE_REMOTE_DEV_RESET,
 
 	/* add new param generic ids above here*/
 	__DEVLINK_PARAM_GENERIC_ID_MAX,
@@ -457,6 +458,9 @@ enum devlink_param_generic_id {
 #define DEVLINK_PARAM_GENERIC_ENABLE_ROCE_NAME "enable_roce"
 #define DEVLINK_PARAM_GENERIC_ENABLE_ROCE_TYPE DEVLINK_PARAM_TYPE_BOOL
 
+#define DEVLINK_PARAM_GENERIC_ENABLE_REMOTE_DEV_RESET_NAME "enable_remote_dev_reset"
+#define DEVLINK_PARAM_GENERIC_ENABLE_REMOTE_DEV_RESET_TYPE DEVLINK_PARAM_TYPE_BOOL
+
 #define DEVLINK_PARAM_GENERIC(_id, _cmodes, _get, _set, _validate)	\
 {									\
 	.id = DEVLINK_PARAM_GENERIC_ID_##_id,				\
diff --git a/net/core/devlink.c b/net/core/devlink.c
index f1812fc620d4..4d7b0f2a6f7b 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -3230,6 +3230,11 @@ static const struct devlink_param devlink_param_generic[] = {
 		.name = DEVLINK_PARAM_GENERIC_ENABLE_ROCE_NAME,
 		.type = DEVLINK_PARAM_GENERIC_ENABLE_ROCE_TYPE,
 	},
+	{
+		.id = DEVLINK_PARAM_GENERIC_ID_ENABLE_REMOTE_DEV_RESET,
+		.name = DEVLINK_PARAM_GENERIC_ENABLE_REMOTE_DEV_RESET_NAME,
+		.type = DEVLINK_PARAM_GENERIC_ENABLE_REMOTE_DEV_RESET_TYPE,
+	},
 };
 
 static int devlink_param_generic_verify(const struct devlink_param *param)
-- 
2.17.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH net-next RFC 10/13] net/mlx5: Add devlink param enable_remote_dev_reset support
  2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
                   ` (8 preceding siblings ...)
  2020-07-27 11:02 ` [PATCH net-next RFC 09/13] devlink: Add enable_remote_dev_reset generic parameter Moshe Shemesh
@ 2020-07-27 11:02 ` Moshe Shemesh
  2020-07-28  0:59   ` Jakub Kicinski
  2020-07-27 11:02 ` [PATCH net-next RFC 11/13] net/mlx5: Add support for fw live patch event Moshe Shemesh
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-27 11:02 UTC (permalink / raw)
  To: David S. Miller, Jiri Pirko, Vasundhara Volam
  Cc: netdev, linux-kernel, Moshe Shemesh

The enable_remote_dev_reset devlink param flags that the host admin
allows resets by other hosts. In case it is cleared mlx5 host PF driver
will send NACK on pci sync for firmware update reset request and the
command will fail.
By default enable_remote_dev_reset parameter is true, so pci sync for
firmware update reset is enabled.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/devlink.c | 29 +++++++++++++++++++
 .../ethernet/mellanox/mlx5/core/fw_reset.c    | 12 ++++++++
 include/linux/mlx5/driver.h                   |  1 +
 3 files changed, 42 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
index 905d55cab4c3..a81204b3dd7c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
@@ -275,6 +275,32 @@ static int mlx5_devlink_large_group_num_validate(struct devlink *devlink, u32 id
 }
 #endif
 
+static int mlx5_devlink_enable_remote_dev_reset_set(struct devlink *devlink, u32 id,
+						    struct devlink_param_gset_ctx *ctx)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+	struct mlx5_core_health *health;
+
+	health = &dev->priv.health;
+	if (ctx->val.vbool)
+		clear_bit(MLX5_HEALTH_RESET_FLAGS_NACK_RESET_REQUEST, &health->reset_flags);
+	else
+		set_bit(MLX5_HEALTH_RESET_FLAGS_NACK_RESET_REQUEST, &health->reset_flags);
+	return 0;
+}
+
+static int mlx5_devlink_enable_remote_dev_reset_get(struct devlink *devlink, u32 id,
+						    struct devlink_param_gset_ctx *ctx)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+	struct mlx5_core_health *health;
+
+	health = &dev->priv.health;
+	ctx->val.vbool = !test_bit(MLX5_HEALTH_RESET_FLAGS_NACK_RESET_REQUEST,
+				   &health->reset_flags);
+	return 0;
+}
+
 static const struct devlink_param mlx5_devlink_params[] = {
 	DEVLINK_PARAM_DRIVER(MLX5_DEVLINK_PARAM_ID_FLOW_STEERING_MODE,
 			     "flow_steering_mode", DEVLINK_PARAM_TYPE_STRING,
@@ -290,6 +316,9 @@ static const struct devlink_param mlx5_devlink_params[] = {
 			     NULL, NULL,
 			     mlx5_devlink_large_group_num_validate),
 #endif
+	DEVLINK_PARAM_GENERIC(ENABLE_REMOTE_DEV_RESET, BIT(DEVLINK_PARAM_CMODE_RUNTIME),
+			      mlx5_devlink_enable_remote_dev_reset_get,
+			      mlx5_devlink_enable_remote_dev_reset_set, NULL),
 };
 
 static void mlx5_devlink_set_params_init_values(struct devlink *devlink)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
index f95df226b915..45fdecbadf52 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
@@ -61,12 +61,24 @@ static int mlx5_fw_set_reset_sync_ack(struct mlx5_core_dev *dev)
 	return mlx5_reg_mfrl_set(dev, MLX5_MFRL_REG_RESET_LEVEL3, 0, 1, false);
 }
 
+static int mlx5_fw_set_reset_sync_nack(struct mlx5_core_dev *dev)
+{
+	return mlx5_reg_mfrl_set(dev, MLX5_MFRL_REG_RESET_LEVEL3, 0, 2, false);
+}
+
 static void mlx5_sync_reset_request_event(struct work_struct *work)
 {
 	struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset,
 						      reset_request_work);
 	struct mlx5_core_dev *dev = fw_reset->dev;
+	int err;
 
+	if (test_bit(MLX5_HEALTH_RESET_FLAGS_NACK_RESET_REQUEST, &dev->priv.health.reset_flags)) {
+		err = mlx5_fw_set_reset_sync_nack(dev);
+		mlx5_core_warn(dev, "PCI Sync FW Update Reset Nack %s",
+			       err ? "Failed" : "Sent");
+		return;
+	}
 	mlx5_health_set_reset_requested_mode(dev);
 	mlx5_reload_health_poll_timer(dev);
 	if (mlx5_fw_set_reset_sync_ack(dev))
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 27b6086f3095..4999dff9dc8c 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -426,6 +426,7 @@ struct mlx5_sq_bfreg {
 enum {
 	MLX5_HEALTH_RESET_FLAGS_RESET_REQUESTED,
 	MLX5_HEALTH_RESET_FLAGS_SILENT_RECOVERY,
+	MLX5_HEALTH_RESET_FLAGS_NACK_RESET_REQUEST,
 };
 
 struct mlx5_core_health {
-- 
2.17.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH net-next RFC 11/13] net/mlx5: Add support for fw live patch event
  2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
                   ` (9 preceding siblings ...)
  2020-07-27 11:02 ` [PATCH net-next RFC 10/13] net/mlx5: Add devlink param enable_remote_dev_reset support Moshe Shemesh
@ 2020-07-27 11:02 ` Moshe Shemesh
  2020-07-27 11:02 ` [PATCH net-next RFC 12/13] net/mlx5: Add support for devlink reload level live patch Moshe Shemesh
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-27 11:02 UTC (permalink / raw)
  To: David S. Miller, Jiri Pirko, Vasundhara Volam
  Cc: netdev, linux-kernel, Moshe Shemesh

Firmware live patch event notifies the driver that the firmware was just
updated using live patch. In such case the driver should not reload or
re-initiate entities, part to updating the firmware version and
re-initiate the firmware tracer which can be updated by live patch with
new strings database to help debugging an issue.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
---
 .../mellanox/mlx5/core/diag/fw_tracer.c       | 31 +++++++++++++++++++
 .../mellanox/mlx5/core/diag/fw_tracer.h       |  1 +
 .../ethernet/mellanox/mlx5/core/fw_reset.c    | 27 ++++++++++++++++
 include/linux/mlx5/device.h                   |  1 +
 4 files changed, 60 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
index ad3594c4afcb..08dae045d185 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
@@ -1064,6 +1064,37 @@ void mlx5_fw_tracer_destroy(struct mlx5_fw_tracer *tracer)
 	kvfree(tracer);
 }
 
+int mlx5_fw_tracer_recreate_strings_db(struct mlx5_fw_tracer *tracer)
+{
+	struct mlx5_core_dev *dev;
+	int err;
+
+	if (IS_ERR_OR_NULL(tracer))
+		return -EINVAL;
+
+	cancel_work_sync(&tracer->read_fw_strings_work);
+	mlx5_fw_tracer_clean_ready_list(tracer);
+	mlx5_fw_tracer_clean_print_hash(tracer);
+	mlx5_fw_tracer_clean_saved_traces_array(tracer);
+	mlx5_fw_tracer_free_strings_db(tracer);
+
+	dev = tracer->dev;
+	err = mlx5_query_mtrc_caps(tracer);
+	if (err) {
+		mlx5_core_dbg(dev, "FWTracer: Failed to query capabilities %d\n", err);
+		return err;
+	}
+
+	err = mlx5_fw_tracer_allocate_strings_db(tracer);
+	if (err) {
+		mlx5_core_warn(dev, "FWTracer: Allocate strings DB failed %d\n", err);
+		return err;
+	}
+	mlx5_fw_tracer_init_saved_traces_array(tracer);
+
+	return 0;
+}
+
 static int fw_tracer_event(struct notifier_block *nb, unsigned long action, void *data)
 {
 	struct mlx5_fw_tracer *tracer = mlx5_nb_cof(nb, struct mlx5_fw_tracer, nb);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h
index 40601fba80ba..1a755098aeeb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.h
@@ -191,5 +191,6 @@ void mlx5_fw_tracer_destroy(struct mlx5_fw_tracer *tracer);
 int mlx5_fw_tracer_trigger_core_dump_general(struct mlx5_core_dev *dev);
 int mlx5_fw_tracer_get_saved_traces_objects(struct mlx5_fw_tracer *tracer,
 					    struct devlink_fmsg *fmsg);
+int mlx5_fw_tracer_recreate_strings_db(struct mlx5_fw_tracer *tracer);
 
 #endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
index 45fdecbadf52..87465a5e5577 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
@@ -2,11 +2,13 @@
 /* Copyright (c) 2020, Mellanox Technologies inc.  All rights reserved. */
 
 #include "fw_reset.h"
+#include "diag/fw_tracer.h"
 
 struct mlx5_fw_reset {
 	struct mlx5_core_dev *dev;
 	struct mlx5_nb nb;
 	struct workqueue_struct *wq;
+	struct work_struct fw_live_patch_work;
 	struct work_struct reset_request_work;
 	struct work_struct reset_now_work;
 	struct work_struct reset_abort_work;
@@ -66,6 +68,27 @@ static int mlx5_fw_set_reset_sync_nack(struct mlx5_core_dev *dev)
 	return mlx5_reg_mfrl_set(dev, MLX5_MFRL_REG_RESET_LEVEL3, 0, 2, false);
 }
 
+static void mlx5_fw_live_patch_event(struct work_struct *work)
+{
+	struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset,
+						      fw_live_patch_work);
+	struct mlx5_core_dev *dev = fw_reset->dev;
+	struct mlx5_fw_tracer *tracer;
+
+	mlx5_core_info(dev, "Live patch updated firmware version: %d.%d.%d\n", fw_rev_maj(dev),
+		       fw_rev_min(dev), fw_rev_sub(dev));
+
+	tracer = dev->tracer;
+	if (IS_ERR_OR_NULL(tracer))
+		return;
+
+	mlx5_fw_tracer_cleanup(tracer);
+	if (mlx5_fw_tracer_recreate_strings_db(tracer))
+		mlx5_core_err(dev, "Failed to recreate FW tracer strings DB\n");
+	if (mlx5_fw_tracer_init(tracer))
+		mlx5_core_err(dev, "Failed to re-initialize FW tracer\n");
+}
+
 static void mlx5_sync_reset_request_event(struct work_struct *work)
 {
 	struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset,
@@ -241,6 +264,9 @@ static int fw_reset_event_notifier(struct notifier_block *nb, unsigned long acti
 	struct mlx5_eqe *eqe = data;
 
 	switch (eqe->sub_type) {
+	case MLX5_GENERAL_SUBTYPE_FW_LIVE_PATCH_EVENT:
+			queue_work(fw_reset->wq, &fw_reset->fw_live_patch_work);
+		break;
 	case MLX5_GENERAL_SUBTYPE_PCI_SYNC_FOR_FW_UPDATE_EVENT:
 		mlx5_sync_reset_events_handle(fw_reset, eqe);
 		break;
@@ -280,6 +306,7 @@ int mlx5_fw_reset_events_init(struct mlx5_core_dev *dev)
 	fw_reset->dev = dev;
 	dev->priv.fw_reset = fw_reset;
 
+	INIT_WORK(&fw_reset->fw_live_patch_work, mlx5_fw_live_patch_event);
 	INIT_WORK(&fw_reset->reset_request_work, mlx5_sync_reset_request_event);
 	INIT_WORK(&fw_reset->reset_now_work, mlx5_sync_reset_now_event);
 	INIT_WORK(&fw_reset->reset_abort_work, mlx5_sync_reset_abort_event);
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index 57db125e5802..58e63bb718a2 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -364,6 +364,7 @@ enum {
 enum {
 	MLX5_GENERAL_SUBTYPE_DELAY_DROP_TIMEOUT = 0x1,
 	MLX5_GENERAL_SUBTYPE_PCI_POWER_CHANGE_EVENT = 0x5,
+	MLX5_GENERAL_SUBTYPE_FW_LIVE_PATCH_EVENT = 0x7,
 	MLX5_GENERAL_SUBTYPE_PCI_SYNC_FOR_FW_UPDATE_EVENT = 0x8,
 };
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH net-next RFC 12/13] net/mlx5: Add support for devlink reload level live patch
  2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
                   ` (10 preceding siblings ...)
  2020-07-27 11:02 ` [PATCH net-next RFC 11/13] net/mlx5: Add support for fw live patch event Moshe Shemesh
@ 2020-07-27 11:02 ` Moshe Shemesh
  2020-07-27 11:02 ` [PATCH net-next RFC 13/13] devlink: Add Documentation/networking/devlink/devlink-reload.rst Moshe Shemesh
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-27 11:02 UTC (permalink / raw)
  To: David S. Miller, Jiri Pirko, Vasundhara Volam
  Cc: netdev, linux-kernel, Moshe Shemesh

Add support for devlink reload level fw_live_patch which does live
patching to firmware.
The driver checks if the firmware is capable of handling the pending
firmware changes as a live patch. If it is then it triggers
fw_live_patch flow.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/devlink.c | 30 ++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
index a81204b3dd7c..804e1e7b25cc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
@@ -115,6 +115,29 @@ static int mlx5_devlink_trigger_fw_reset(struct devlink *devlink, struct netlink
 	return err;
 }
 
+static int mlx5_devlink_trigger_fw_live_patch(struct devlink *devlink,
+					      struct netlink_ext_ack *extack)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+	u8 reset_level;
+	int err;
+
+	err = mlx5_reg_mfrl_query(dev, &reset_level, NULL);
+	if (err)
+		return err;
+	if (!(reset_level & MLX5_MFRL_REG_RESET_LEVEL0)) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "FW upgrade to the stored FW can't be done by FW live patching");
+		return -EINVAL;
+	}
+
+	err = mlx5_fw_set_live_patch(dev);
+	if (err)
+		return err;
+
+	return 0;
+}
+
 static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
 				    enum devlink_reload_level level, struct netlink_ext_ack *extack)
 {
@@ -126,6 +149,8 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change,
 		return 0;
 	case DEVLINK_RELOAD_LEVEL_FW_RESET:
 		return mlx5_devlink_trigger_fw_reset(devlink, extack);
+	case DEVLINK_RELOAD_LEVEL_FW_LIVE_PATCH:
+		return mlx5_devlink_trigger_fw_live_patch(devlink, extack);
 	default:
 		/* Unsupported level should not get to this function */
 		WARN_ON(1);
@@ -142,6 +167,8 @@ static int mlx5_devlink_reload_up(struct devlink *devlink, enum devlink_reload_l
 	case DEVLINK_RELOAD_LEVEL_DRIVER:
 	case DEVLINK_RELOAD_LEVEL_FW_RESET:
 		return mlx5_load_one(dev, false);
+	case DEVLINK_RELOAD_LEVEL_FW_LIVE_PATCH:
+		return 0;
 	default:
 		/* Unsupported level should not get to this function */
 		WARN_ON(1);
@@ -163,7 +190,8 @@ static const struct devlink_ops mlx5_devlink_ops = {
 	.flash_update = mlx5_devlink_flash_update,
 	.info_get = mlx5_devlink_info_get,
 	.supported_reload_levels = BIT(DEVLINK_RELOAD_LEVEL_DRIVER) |
-				   BIT(DEVLINK_RELOAD_LEVEL_FW_RESET),
+				   BIT(DEVLINK_RELOAD_LEVEL_FW_RESET) |
+				   BIT(DEVLINK_RELOAD_LEVEL_FW_LIVE_PATCH),
 	.default_reload_level = DEVLINK_RELOAD_LEVEL_DRIVER,
 	.reload_down = mlx5_devlink_reload_down,
 	.reload_up = mlx5_devlink_reload_up,
-- 
2.17.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH net-next RFC 13/13] devlink: Add Documentation/networking/devlink/devlink-reload.rst
  2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
                   ` (11 preceding siblings ...)
  2020-07-27 11:02 ` [PATCH net-next RFC 12/13] net/mlx5: Add support for devlink reload level live patch Moshe Shemesh
@ 2020-07-27 11:02 ` Moshe Shemesh
  2020-07-28  5:25 ` [PATCH net-next RFC 00/13] Add devlink reload level option Vasundhara Volam
  2020-07-28 16:37 ` Jacob Keller
  14 siblings, 0 replies; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-27 11:02 UTC (permalink / raw)
  To: David S. Miller, Jiri Pirko, Vasundhara Volam
  Cc: netdev, linux-kernel, Moshe Shemesh

Add devlink reload rst documentation file.
Update index file to include it.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
---
 .../networking/devlink/devlink-reload.rst     | 56 +++++++++++++++++++
 Documentation/networking/devlink/index.rst    |  1 +
 2 files changed, 57 insertions(+)
 create mode 100644 Documentation/networking/devlink/devlink-reload.rst

diff --git a/Documentation/networking/devlink/devlink-reload.rst b/Documentation/networking/devlink/devlink-reload.rst
new file mode 100644
index 000000000000..092574229bab
--- /dev/null
+++ b/Documentation/networking/devlink/devlink-reload.rst
@@ -0,0 +1,56 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============
+Devlink Reload
+==============
+
+``devlink-reload`` provides mechanism to reload either driver or firmware or both,
+depends on reload level selected.
+The driver reload is used to re-initiate driver's entities, including applying
+new values to ``devlink`` entities which are used during driver load, such as
+``devlink-params`` in configuration mode ``driverinit`` or ``devlink-resources``.
+The firmware reload is used either to reset the firmware or upgrade the firmware
+if new firmware is already stored and waiting to be activated.
+Some driver's may support fw_live_patch which will do firmware upgrade,
+applying changes without the need for reset.
+
+Reload levels
+=============
+
+Reload may be set in different reload levels.
+
+.. list-table:: Possible reload levels
+   :widths: 5 90
+
+   * - Name
+     - Description
+   * - ``driver``
+     - Driver entities re-instantiation only.
+   * - ``fw_reset``
+     - Firmware reset and driver entities re-instantiation. Can be used for
+       firmware upgrade if new firmware is stored and driver supports such
+       firmware upgrade.
+   * - ``fw_live_patch``
+     - Firmware live patch, applies firmware changes without reset.
+
+Change namespace
+================
+
+All devlink instances are created in init_net and stay there for a
+lifetime. Allow user to be able to move devlink instances into
+namespaces during devlink reload operation. That ensures proper
+re-instantiation of driver objects, including netdevices.
+
+example usage
+-------------
+
+.. code:: shell
+
+    $ devlink dev reload help
+    $ devlink dev reload DEV [ netns { PID | NAME | ID } ] [ level { driver | fw_reset | fw_live_patch } ]
+
+    # Run reload command with driver's default level:
+    $ devlink dev reload pci/0000:82:00.0
+
+    # Run reload command with fw_reset level:
+    $ devlink dev reload pci/0000:82:00.0 level fw_reset
diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst
index 7684ae5c4a4a..d82874760ae2 100644
--- a/Documentation/networking/devlink/index.rst
+++ b/Documentation/networking/devlink/index.rst
@@ -20,6 +20,7 @@ general.
    devlink-params
    devlink-region
    devlink-resource
+   devlink-reload
    devlink-trap
 
 Driver-specific documentation
-- 
2.17.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-07-27 11:02 ` [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command Moshe Shemesh
@ 2020-07-28  0:58   ` Jakub Kicinski
  2020-07-28 13:58     ` Jiri Pirko
  0 siblings, 1 reply; 58+ messages in thread
From: Jakub Kicinski @ 2020-07-28  0:58 UTC (permalink / raw)
  To: Moshe Shemesh, netdev, linux-kernel
  Cc: David S. Miller, Jiri Pirko, Vasundhara Volam

On Mon, 27 Jul 2020 14:02:21 +0300 Moshe Shemesh wrote:
> Add devlink reload level to allow the user to request a specific reload
> level. The level parameter is optional, if not specified then driver's
> default reload level is used (backward compatible).

Please don't leave space for driver-specific behavior. The OS is
supposed to abstract device differences away.

Previously the purpose of reload was to activate new devlink params
(with driverinit cmode), now you want the ability to activate new
firmware. Let users specify their intent and their constraints.

> Reload levels supported are:
> driver: driver entities re-instantiation only.
> fw_reset: firmware reset and driver entities re-instantiation.
> fw_live_patch: firmware live patching only.

I'm concerned live_patch is not first - it's the lowest impact (since
it's live). Please make sure you clearly specify the expected behavior
for the new API.

The notion of multi-host is key for live patching, so it has to be
mentioned.

> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 02/13] devlink: Add reload levels data to dev get
  2020-07-27 11:02 ` [PATCH net-next RFC 02/13] devlink: Add reload levels data to dev get Moshe Shemesh
@ 2020-07-28  0:58   ` Jakub Kicinski
  2020-07-29 14:37     ` Moshe Shemesh
  0 siblings, 1 reply; 58+ messages in thread
From: Jakub Kicinski @ 2020-07-28  0:58 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: David S. Miller, Jiri Pirko, Vasundhara Volam, netdev, linux-kernel

On Mon, 27 Jul 2020 14:02:22 +0300 Moshe Shemesh wrote:
> Expose devlink reload supported levels and driver's default level to the
> user through devlink dev get command.
> 
> Examples:
> $ devlink dev show
> pci/0000:82:00.0:
>   reload_levels_info:
>     default_level driver
>     supported_levels:
>       driver fw_reset fw_live_patch
> pci/0000:82:00.1:
>   reload_levels_info:
>     default_level driver
>     supported_levels:
>       driver fw_reset fw_live_patch
> 
> $ devlink dev show -jp
> {
>     "dev": {
>         "pci/0000:82:00.0": {
>             "reload_levels_info": {
>                 "default_level": "driver",
>                 "supported_levels": [ "driver","fw_reset","fw_live_patch" ]
>             }
>         },
>         "pci/0000:82:00.1": {
>             "reload_levels_info": {
>                 "default_level": "driver",
>                 "supported_levels": [ "driver","fw_reset","fw_live_patch" ]
>             }
>         }
>     }
> }
> 
> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>

The fact that the driver supports fw_live_patch, does not necessarily
mean that the currently running FW can be live upgraded to the
currently flashed one, right? 

This interface does not appear to be optimal for the purpose.

Again, documentation of what can be lost (in terms of configuration and
features) upon upgrade is missing.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 09/13] devlink: Add enable_remote_dev_reset generic parameter
  2020-07-27 11:02 ` [PATCH net-next RFC 09/13] devlink: Add enable_remote_dev_reset generic parameter Moshe Shemesh
@ 2020-07-28  0:59   ` Jakub Kicinski
  2020-07-29 14:42     ` Moshe Shemesh
  0 siblings, 1 reply; 58+ messages in thread
From: Jakub Kicinski @ 2020-07-28  0:59 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: David S. Miller, Jiri Pirko, Vasundhara Volam, netdev, linux-kernel

On Mon, 27 Jul 2020 14:02:29 +0300 Moshe Shemesh wrote:
> The enable_remote_dev_reset devlink param flags that the host admin
> allows device resets that can be initiated by other hosts. This
> parameter is useful for setups where a device is shared by different
> hosts, such as multi-host setup. Once the user set this parameter to
> false, the driver should NACK any attempt to reset the device while the
> driver is loaded.
> 
> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>

There needs to be a devlink event generated when reset is triggered
(remotely or not).

You're also missing failure reasons. Users need to know if the reset
request was clearly nacked by some host, not supported, etc. vs
unexpected failure.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 10/13] net/mlx5: Add devlink param enable_remote_dev_reset support
  2020-07-27 11:02 ` [PATCH net-next RFC 10/13] net/mlx5: Add devlink param enable_remote_dev_reset support Moshe Shemesh
@ 2020-07-28  0:59   ` Jakub Kicinski
  0 siblings, 0 replies; 58+ messages in thread
From: Jakub Kicinski @ 2020-07-28  0:59 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: David S. Miller, Jiri Pirko, Vasundhara Volam, netdev, linux-kernel

On Mon, 27 Jul 2020 14:02:30 +0300 Moshe Shemesh wrote:
>  static void mlx5_sync_reset_request_event(struct work_struct *work)
>  {
>  	struct mlx5_fw_reset *fw_reset = container_of(work, struct mlx5_fw_reset,
>  						      reset_request_work);
>  	struct mlx5_core_dev *dev = fw_reset->dev;
> +	int err;
>  
> +	if (test_bit(MLX5_HEALTH_RESET_FLAGS_NACK_RESET_REQUEST, &dev->priv.health.reset_flags)) {
> +		err = mlx5_fw_set_reset_sync_nack(dev);
> +		mlx5_core_warn(dev, "PCI Sync FW Update Reset Nack %s",
> +			       err ? "Failed" : "Sent");
> +		return;
> +	}

What if the NACK fails? Does the reset still proceed?

>  	mlx5_health_set_reset_requested_mode(dev);
>  	mlx5_reload_health_poll_timer(dev);
>  	if (mlx5_fw_set_reset_sync_ack(dev))

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 00/13] Add devlink reload level option
  2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
                   ` (12 preceding siblings ...)
  2020-07-27 11:02 ` [PATCH net-next RFC 13/13] devlink: Add Documentation/networking/devlink/devlink-reload.rst Moshe Shemesh
@ 2020-07-28  5:25 ` Vasundhara Volam
  2020-07-28 16:43   ` Jacob Keller
  2020-07-28 16:37 ` Jacob Keller
  14 siblings, 1 reply; 58+ messages in thread
From: Vasundhara Volam @ 2020-07-28  5:25 UTC (permalink / raw)
  To: Moshe Shemesh; +Cc: David S. Miller, Jiri Pirko, Netdev, open list

On Mon, Jul 27, 2020 at 4:36 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>
> Introduce new option on devlink reload API to enable the user to select the
> reload level required. Complete support for all levels in mlx5.
> The following reload levels are supported:
>   driver: Driver entities re-instantiation only.
>   fw_reset: Firmware reset and driver entities re-instantiation.
The Name is a little confusing. I think it should be renamed to
fw_live_reset (in which both firmware and driver entities are
re-instantiated).  For only fw_reset, the driver should not undergo
reset (it requires a driver reload for firmware to undergo reset).

>   fw_live_patch: Firmware live patching only.
This level is not clear. Is this similar to flashing??

Also I have a basic query. The reload command is split into
reload_up/reload_down handlers (Please correct me if this behaviour is
changed with this patchset). What if the vendor specific driver does
not support up/down and needs only a single handler to fire a firmware
reset or firmware live reset command?
>
> Each driver which support this command should expose the reload levels
> supported and the driver's default reload level.
> The uAPI is backward compatible, if the reload level option is omitted
> from the reload command, the driver's default reload level will be used.
>
> Patch 1 adds the new API reload level option to devlink.
> Patch 2 exposes the supported reload levels and default level on devlink
>         dev get.
> Patches 3-8 add support on mlx5 for devlink reload level fw-reset and
>             handle the firmware reset events.
> Patches 9-10 add devlink enable remote dev reset parameter and use it
>              in mlx5.
> Patches 11-12 mlx5 add devlink reload live patch support and event
>               handling.
> Patch 13 adds documentation file devlink-reload.rst
>
> Command examples:
>
> # Run reload command with fw-reset reload level:
> $ devlink dev reload pci/0000:82:00.0 level fw-reset
>
> # Run reload command with driver reload level:
> $ devlink dev reload pci/0000:82:00.0 level driver
>
> # Run reload command with driver's default level (backward compatible):
> $ devlink dev reload pci/0000:82:00.0
>
>
> Moshe Shemesh (13):
>   devlink: Add reload level option to devlink reload command
>   devlink: Add reload levels data to dev get
>   net/mlx5: Add functions to set/query MFRL register
>   net/mlx5: Set cap for pci sync for fw update event
>   net/mlx5: Handle sync reset request event
>   net/mlx5: Handle sync reset now event
>   net/mlx5: Handle sync reset abort event
>   net/mlx5: Add support for devlink reload level fw reset
>   devlink: Add enable_remote_dev_reset generic parameter
>   net/mlx5: Add devlink param enable_remote_dev_reset support
>   net/mlx5: Add support for fw live patch event
>   net/mlx5: Add support for devlink reload level live patch
>   devlink: Add Documentation/networking/devlink/devlink-reload.rst
>
>  .../networking/devlink/devlink-params.rst     |   6 +
>  .../networking/devlink/devlink-reload.rst     |  56 +++
>  Documentation/networking/devlink/index.rst    |   1 +
>  drivers/net/ethernet/mellanox/mlx4/main.c     |   6 +-
>  .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
>  .../net/ethernet/mellanox/mlx5/core/devlink.c | 114 +++++-
>  .../mellanox/mlx5/core/diag/fw_tracer.c       |  31 ++
>  .../mellanox/mlx5/core/diag/fw_tracer.h       |   1 +
>  .../ethernet/mellanox/mlx5/core/fw_reset.c    | 328 ++++++++++++++++++
>  .../ethernet/mellanox/mlx5/core/fw_reset.h    |  17 +
>  .../net/ethernet/mellanox/mlx5/core/health.c  |  74 +++-
>  .../net/ethernet/mellanox/mlx5/core/main.c    |  13 +
>  drivers/net/ethernet/mellanox/mlxsw/core.c    |   6 +-
>  drivers/net/netdevsim/dev.c                   |   6 +-
>  include/linux/mlx5/device.h                   |   1 +
>  include/linux/mlx5/driver.h                   |  12 +
>  include/net/devlink.h                         |  10 +-
>  include/uapi/linux/devlink.h                  |  22 ++
>  net/core/devlink.c                            |  95 ++++-
>  19 files changed, 764 insertions(+), 37 deletions(-)
>  create mode 100644 Documentation/networking/devlink/devlink-reload.rst
>  create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
>  create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h
>
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-07-28  0:58   ` Jakub Kicinski
@ 2020-07-28 13:58     ` Jiri Pirko
  2020-07-28 16:47       ` Jacob Keller
  0 siblings, 1 reply; 58+ messages in thread
From: Jiri Pirko @ 2020-07-28 13:58 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Moshe Shemesh, netdev, linux-kernel, David S. Miller, Jiri Pirko,
	Vasundhara Volam

Tue, Jul 28, 2020 at 02:58:02AM CEST, kuba@kernel.org wrote:
>On Mon, 27 Jul 2020 14:02:21 +0300 Moshe Shemesh wrote:
>> Add devlink reload level to allow the user to request a specific reload
>> level. The level parameter is optional, if not specified then driver's
>> default reload level is used (backward compatible).
>
>Please don't leave space for driver-specific behavior. The OS is
>supposed to abstract device differences away.

But this is needed to maintain the existing behaviour which is different
for different drivers.


>
>Previously the purpose of reload was to activate new devlink params
>(with driverinit cmode), now you want the ability to activate new
>firmware. Let users specify their intent and their constraints.
>
>> Reload levels supported are:
>> driver: driver entities re-instantiation only.
>> fw_reset: firmware reset and driver entities re-instantiation.
>> fw_live_patch: firmware live patching only.
>
>I'm concerned live_patch is not first - it's the lowest impact (since
>it's live). Please make sure you clearly specify the expected behavior
>for the new API.
>
>The notion of multi-host is key for live patching, so it has to be
>mentioned.
>
>> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 00/13] Add devlink reload level option
  2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
                   ` (13 preceding siblings ...)
  2020-07-28  5:25 ` [PATCH net-next RFC 00/13] Add devlink reload level option Vasundhara Volam
@ 2020-07-28 16:37 ` Jacob Keller
  14 siblings, 0 replies; 58+ messages in thread
From: Jacob Keller @ 2020-07-28 16:37 UTC (permalink / raw)
  To: Moshe Shemesh, David S. Miller, Jiri Pirko, Vasundhara Volam
  Cc: netdev, linux-kernel



On 7/27/2020 4:02 AM, Moshe Shemesh wrote:
> Introduce new option on devlink reload API to enable the user to select the
> reload level required. Complete support for all levels in mlx5.
> The following reload levels are supported:
>   driver: Driver entities re-instantiation only. 

So, this is the current support. Ok.

>   fw_reset: Firmware reset and driver entities re-instantiation. 


This would include firmware update? What about differing levels of
device/firmware reset? I.e. I think some of our HW has function level
reset, device wide reset, as well as EMP reset. For us, only EMP reset
would trigger firmware update.

>   fw_live_patch: Firmware live patching only.

This is for update without reset, right?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 00/13] Add devlink reload level option
  2020-07-28  5:25 ` [PATCH net-next RFC 00/13] Add devlink reload level option Vasundhara Volam
@ 2020-07-28 16:43   ` Jacob Keller
  2020-08-03 10:24     ` Vasundhara Volam
  0 siblings, 1 reply; 58+ messages in thread
From: Jacob Keller @ 2020-07-28 16:43 UTC (permalink / raw)
  To: Vasundhara Volam, Moshe Shemesh
  Cc: David S. Miller, Jiri Pirko, Netdev, open list



On 7/27/2020 10:25 PM, Vasundhara Volam wrote:
> On Mon, Jul 27, 2020 at 4:36 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>>
>> Introduce new option on devlink reload API to enable the user to select the
>> reload level required. Complete support for all levels in mlx5.
>> The following reload levels are supported:
>>   driver: Driver entities re-instantiation only.
>>   fw_reset: Firmware reset and driver entities re-instantiation.
> The Name is a little confusing. I think it should be renamed to
> fw_live_reset (in which both firmware and driver entities are
> re-instantiated).  For only fw_reset, the driver should not undergo
> reset (it requires a driver reload for firmware to undergo reset).
> 

So, I think the differentiation here is that "live_patch" doesn't reset
anything.

>>   fw_live_patch: Firmware live patching only.
> This level is not clear. Is this similar to flashing??
> 
> Also I have a basic query. The reload command is split into
> reload_up/reload_down handlers (Please correct me if this behaviour is
> changed with this patchset). What if the vendor specific driver does
> not support up/down and needs only a single handler to fire a firmware
> reset or firmware live reset command?

In the "reload_down" handler, they would trigger the appropriate reset,
and quiesce anything that needs to be done. Then on reload up, it would
restore and bring up anything quiesced in the first stage.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-07-28 13:58     ` Jiri Pirko
@ 2020-07-28 16:47       ` Jacob Keller
  2020-07-28 18:44         ` Jakub Kicinski
  0 siblings, 1 reply; 58+ messages in thread
From: Jacob Keller @ 2020-07-28 16:47 UTC (permalink / raw)
  To: Jiri Pirko, Jakub Kicinski
  Cc: Moshe Shemesh, netdev, linux-kernel, David S. Miller, Jiri Pirko,
	Vasundhara Volam



On 7/28/2020 6:58 AM, Jiri Pirko wrote:
> Tue, Jul 28, 2020 at 02:58:02AM CEST, kuba@kernel.org wrote:
>> On Mon, 27 Jul 2020 14:02:21 +0300 Moshe Shemesh wrote:
>>> Add devlink reload level to allow the user to request a specific reload
>>> level. The level parameter is optional, if not specified then driver's
>>> default reload level is used (backward compatible).
>>
>> Please don't leave space for driver-specific behavior. The OS is
>> supposed to abstract device differences away.
> 
> But this is needed to maintain the existing behaviour which is different
> for different drivers.
> 

Which drivers behave differently here?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-07-28 16:47       ` Jacob Keller
@ 2020-07-28 18:44         ` Jakub Kicinski
  2020-07-28 19:18           ` Jacob Keller
  0 siblings, 1 reply; 58+ messages in thread
From: Jakub Kicinski @ 2020-07-28 18:44 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Jiri Pirko, Moshe Shemesh, netdev, linux-kernel, David S. Miller,
	Jiri Pirko, Vasundhara Volam

On Tue, 28 Jul 2020 09:47:00 -0700 Jacob Keller wrote:
> On 7/28/2020 6:58 AM, Jiri Pirko wrote:
> > But this is needed to maintain the existing behaviour which is different
> > for different drivers.
> 
> Which drivers behave differently here?

I think Jiri refers to mlxsw vs mlx5.

mlxsw loads firmware on probe, by default at least. So reloading the
driver implies a FW reset. NIC drivers OTOH don't generally load FW
so they didn't reset FW.

Now since we're redefining the API from "do a reload so that driverinit
params are applied" (or "so that all netdevs get spawned in a new
netns") to "do a reset of depth X" we have to change the paradigm.

What I was trying to suggest is that we should not have to re-define
the API like this.

From user perspective what's important is what the reset achieves (and
perhaps how destructive it is). We can define the reset levels as:

$ devlink dev reload pci/0000:82:00.0 net-ns-respawn
$ devlink dev reload pci/0000:82:00.0 driver-param-init
$ devlink dev reload pci/0000:82:00.0 fw-activate

combining should be possible when user wants multiple things to happen:

$ devlink dev reload pci/0000:82:00.0 fw-activate driver-param-init


Then we have the use case of a "live reset" which is slightly
under-defined right now IMHO, but we can extend it as:

$ devlink dev reload pci/0000:82:00.0 fw-activate --live


We can also add the "reset level" specifier - for the cases where
device is misbehaving:

$ devlink dev reload pci/0000:82:00.0 level [driver|fw|hardware]


But I don't think that we can go from the current reload command
cleanly to just a level reset. The driver-specific default is a bad
smell which indicates we're changing semantics from what user wants 
to what the reset depth is. Our semantics with the patch as it stands
are in fact:
 - if you want to load new params or change netns, don't pass the level
   - the "driver default" workaround dictates the right reset level for
   param init;
 - if you want to activate new firmware - select the reset level you'd
   like from the reset level options.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-07-28 18:44         ` Jakub Kicinski
@ 2020-07-28 19:18           ` Jacob Keller
  2020-07-28 20:06             ` Jakub Kicinski
  0 siblings, 1 reply; 58+ messages in thread
From: Jacob Keller @ 2020-07-28 19:18 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jiri Pirko, Moshe Shemesh, netdev, linux-kernel, David S. Miller,
	Jiri Pirko, Vasundhara Volam



On 7/28/2020 11:44 AM, Jakub Kicinski wrote:
> On Tue, 28 Jul 2020 09:47:00 -0700 Jacob Keller wrote:
>> On 7/28/2020 6:58 AM, Jiri Pirko wrote:
>>> But this is needed to maintain the existing behaviour which is different
>>> for different drivers.
>>
>> Which drivers behave differently here?
> 
> I think Jiri refers to mlxsw vs mlx5.
> 
> mlxsw loads firmware on probe, by default at least. So reloading the
> driver implies a FW reset. NIC drivers OTOH don't generally load FW
> so they didn't reset FW.
> 

Ok.

> Now since we're redefining the API from "do a reload so that driverinit
> params are applied" (or "so that all netdevs get spawned in a new
> netns") to "do a reset of depth X" we have to change the paradigm.
> 
> What I was trying to suggest is that we should not have to re-define
> the API like this.

Ok.

> 
> From user perspective what's important is what the reset achieves (and
> perhaps how destructive it is). We can define the reset levels as:
> 
> $ devlink dev reload pci/0000:82:00.0 net-ns-respawn
> $ devlink dev reload pci/0000:82:00.0 driver-param-init
> $ devlink dev reload pci/0000:82:00.0 fw-activate
> 
> combining should be possible when user wants multiple things to happen:
> 
> $ devlink dev reload pci/0000:82:00.0 fw-activate driver-param-init
> 

Where today "driver-param-init" is the default behavior. But didn't we
just say that mlxsw also does the equivalent of fw-activate?

> 
> Then we have the use case of a "live reset" which is slightly
> under-defined right now IMHO, but we can extend it as:
> 
> $ devlink dev reload pci/0000:82:00.0 fw-activate --live
> 

Yea, I think live fw patching things aren't quite as defined yet.

> 
> We can also add the "reset level" specifier - for the cases where
> device is misbehaving:
> 
> $ devlink dev reload pci/0000:82:00.0 level [driver|fw|hardware]
> 

I guess I don't quite see how level fits in? This is orthogonal to the
other settings?

> 
> But I don't think that we can go from the current reload command
> cleanly to just a level reset. The driver-specific default is a bad
> smell which indicates we're changing semantics from what user wants 
> to what the reset depth is. Our semantics with the patch as it stands
> are in fact:
>  - if you want to load new params or change netns, don't pass the level
>    - the "driver default" workaround dictates the right reset level for
>    param init;
>  - if you want to activate new firmware - select the reset level you'd
>    like from the reset level options.
> 

I think I agree, having the "what gets reloaded" as a separate vector
makes sense and avoids confusion. For example for ice hardware,
"fw-activate" really does mean "Do an EMP reset" rather than just a
"device reset" which could be interpreted differently. ice can do
function level reset, or core device reset also. Neither of those two
resets activates firmware.

Additionally the current function load process in ice doesn't support
driver-init at all. That's something I'd like to see happen but is quite
a significant refactor for how the driver loads. We need to refactor
everything out so that devlink is created early on and factor out
load/unload into handlers that can be called by the devlink reload. As I
have primarily been focused on flash update I sort of left that for the
future because it was a huge task to solve.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-07-28 19:18           ` Jacob Keller
@ 2020-07-28 20:06             ` Jakub Kicinski
  2020-07-29 14:54               ` Moshe Shemesh
  0 siblings, 1 reply; 58+ messages in thread
From: Jakub Kicinski @ 2020-07-28 20:06 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Jiri Pirko, Moshe Shemesh, netdev, linux-kernel, David S. Miller,
	Jiri Pirko, Vasundhara Volam

On Tue, 28 Jul 2020 12:18:30 -0700 Jacob Keller wrote:
> On 7/28/2020 11:44 AM, Jakub Kicinski wrote:
> > From user perspective what's important is what the reset achieves (and
> > perhaps how destructive it is). We can define the reset levels as:
> > 
> > $ devlink dev reload pci/0000:82:00.0 net-ns-respawn
> > $ devlink dev reload pci/0000:82:00.0 driver-param-init
> > $ devlink dev reload pci/0000:82:00.0 fw-activate
> > 
> > combining should be possible when user wants multiple things to happen:
> > 
> > $ devlink dev reload pci/0000:82:00.0 fw-activate driver-param-init
> 
> Where today "driver-param-init" is the default behavior. But didn't we
> just say that mlxsw also does the equivalent of fw-activate?

Actually the default should probably be the combination of
driver-param-init and net-ns-respawn.

My expectations would be that the driver must perform the lowest reset 
level possible that satisfies the requested functional change. 
IOW driver may do more, in fact it should be acceptable for the driver
to always for a full HW reset (unless --live or other constraint is
specified).

> > We can also add the "reset level" specifier - for the cases where
> > device is misbehaving:
> > 
> > $ devlink dev reload pci/0000:82:00.0 level [driver|fw|hardware]
> 
> I guess I don't quite see how level fits in? This is orthogonal to the
> other settings?

Yup, it is, it's already orthogonal to what reload does today, hence the
need for the "driver default" hack.

> > But I don't think that we can go from the current reload command
> > cleanly to just a level reset. The driver-specific default is a bad
> > smell which indicates we're changing semantics from what user wants 
> > to what the reset depth is. Our semantics with the patch as it stands
> > are in fact:
> >  - if you want to load new params or change netns, don't pass the level
> >    - the "driver default" workaround dictates the right reset level for
> >    param init;
> >  - if you want to activate new firmware - select the reset level you'd
> >    like from the reset level options.
> >   
> 
> I think I agree, having the "what gets reloaded" as a separate vector
> makes sense and avoids confusion. For example for ice hardware,
> "fw-activate" really does mean "Do an EMP reset" rather than just a
> "device reset" which could be interpreted differently. ice can do
> function level reset, or core device reset also. Neither of those two
> resets activates firmware.
> 
> Additionally the current function load process in ice doesn't support
> driver-init at all. That's something I'd like to see happen but is quite
> a significant refactor for how the driver loads. We need to refactor
> everything out so that devlink is created early on and factor out
> load/unload into handlers that can be called by the devlink reload. As I
> have primarily been focused on flash update I sort of left that for the
> future because it was a huge task to solve.

Cool! That was what I was concerned about, but I didn't know any
existing driver already has the problem. "FW reset" is not nearly
a clear enough operation. We'd end up with drivers differing and 
users having to refer to vendor documentation to find out which 
"reset level" maps to what.

I think the components in ethtool-reset try to address the same
problem, and they have the notion of per-port, and per-device.
In the modern world we lack the per-host notion, but that's still 
strictly clearer than the limited API proposed here.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 02/13] devlink: Add reload levels data to dev get
  2020-07-28  0:58   ` Jakub Kicinski
@ 2020-07-29 14:37     ` Moshe Shemesh
  2020-07-29 21:11       ` Jakub Kicinski
  0 siblings, 1 reply; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-29 14:37 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S. Miller, Jiri Pirko, Vasundhara Volam, netdev, linux-kernel


On 7/28/2020 3:58 AM, Jakub Kicinski wrote:
> On Mon, 27 Jul 2020 14:02:22 +0300 Moshe Shemesh wrote:
>> Expose devlink reload supported levels and driver's default level to the
>> user through devlink dev get command.
>>
>> Examples:
>> $ devlink dev show
>> pci/0000:82:00.0:
>>    reload_levels_info:
>>      default_level driver
>>      supported_levels:
>>        driver fw_reset fw_live_patch
>> pci/0000:82:00.1:
>>    reload_levels_info:
>>      default_level driver
>>      supported_levels:
>>        driver fw_reset fw_live_patch
>>
>> $ devlink dev show -jp
>> {
>>      "dev": {
>>          "pci/0000:82:00.0": {
>>              "reload_levels_info": {
>>                  "default_level": "driver",
>>                  "supported_levels": [ "driver","fw_reset","fw_live_patch" ]
>>              }
>>          },
>>          "pci/0000:82:00.1": {
>>              "reload_levels_info": {
>>                  "default_level": "driver",
>>                  "supported_levels": [ "driver","fw_reset","fw_live_patch" ]
>>              }
>>          }
>>      }
>> }
>>
>> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
> The fact that the driver supports fw_live_patch, does not necessarily
> mean that the currently running FW can be live upgraded to the
> currently flashed one, right?


That's correct, though the feature is supported, the firmware gap may 
not be suitable for live_patch.

The user will be noted accordingly by extack message.

>
> This interface does not appear to be optimal for the purpose.
>
> Again, documentation of what can be lost (in terms of configuration and
> features) upon upgrade is missing.


I will clarify in documentation. On live_patch nothing should be lost or 
re-initialized, that's the "live" thing.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 09/13] devlink: Add enable_remote_dev_reset generic parameter
  2020-07-28  0:59   ` Jakub Kicinski
@ 2020-07-29 14:42     ` Moshe Shemesh
  2020-07-29 20:57       ` Jakub Kicinski
  0 siblings, 1 reply; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-29 14:42 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S. Miller, Jiri Pirko, Vasundhara Volam, netdev, linux-kernel


On 7/28/2020 3:59 AM, Jakub Kicinski wrote:
> On Mon, 27 Jul 2020 14:02:29 +0300 Moshe Shemesh wrote:
>> The enable_remote_dev_reset devlink param flags that the host admin
>> allows device resets that can be initiated by other hosts. This
>> parameter is useful for setups where a device is shared by different
>> hosts, such as multi-host setup. Once the user set this parameter to
>> false, the driver should NACK any attempt to reset the device while the
>> driver is loaded.
>>
>> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
> There needs to be a devlink event generated when reset is triggered
> (remotely or not).
>
> You're also missing failure reasons. Users need to know if the reset
> request was clearly nacked by some host, not supported, etc. vs
> unexpected failure.


I will fix and send extack message to the user accordingly.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-07-28 20:06             ` Jakub Kicinski
@ 2020-07-29 14:54               ` Moshe Shemesh
  2020-07-29 21:07                 ` Jakub Kicinski
  0 siblings, 1 reply; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-29 14:54 UTC (permalink / raw)
  To: Jakub Kicinski, Jacob Keller
  Cc: Jiri Pirko, netdev, linux-kernel, David S. Miller, Jiri Pirko,
	Vasundhara Volam


On 7/28/2020 11:06 PM, Jakub Kicinski wrote:
> On Tue, 28 Jul 2020 12:18:30 -0700 Jacob Keller wrote:
>> On 7/28/2020 11:44 AM, Jakub Kicinski wrote:
>>>  From user perspective what's important is what the reset achieves (and
>>> perhaps how destructive it is). We can define the reset levels as:
>>>
>>> $ devlink dev reload pci/0000:82:00.0 net-ns-respawn
>>> $ devlink dev reload pci/0000:82:00.0 driver-param-init
>>> $ devlink dev reload pci/0000:82:00.0 fw-activate
>>>
>>> combining should be possible when user wants multiple things to happen:
>>>
>>> $ devlink dev reload pci/0000:82:00.0 fw-activate driver-param-init
>> Where today "driver-param-init" is the default behavior. But didn't we
>> just say that mlxsw also does the equivalent of fw-activate?
> Actually the default should probably be the combination of
> driver-param-init and net-ns-respawn.


What about the support of these combinations, one device needs to reset 
fw to apply the param init,

while another device can apply param-init without fw reset, but has to 
reload the driver for fw-reset.

So the support per driver will be a matrix of combinations ?

> My expectations would be that the driver must perform the lowest reset
> level possible that satisfies the requested functional change.
> IOW driver may do more, in fact it should be acceptable for the driver
> to always for a full HW reset (unless --live or other constraint is
> specified).


OK, but some combinations may still not be valid for specific driver 
even if it tries lowest level possible.

>>> We can also add the "reset level" specifier - for the cases where
>>> device is misbehaving:
>>>
>>> $ devlink dev reload pci/0000:82:00.0 level [driver|fw|hardware]
>> I guess I don't quite see how level fits in? This is orthogonal to the
>> other settings?
> Yup, it is, it's already orthogonal to what reload does today, hence the
> need for the "driver default" hack.
>
>>> But I don't think that we can go from the current reload command
>>> cleanly to just a level reset. The driver-specific default is a bad
>>> smell which indicates we're changing semantics from what user wants
>>> to what the reset depth is. Our semantics with the patch as it stands
>>> are in fact:
>>>   - if you want to load new params or change netns, don't pass the level
>>>     - the "driver default" workaround dictates the right reset level for
>>>     param init;
>>>   - if you want to activate new firmware - select the reset level you'd
>>>     like from the reset level options.
>>>    
>> I think I agree, having the "what gets reloaded" as a separate vector
>> makes sense and avoids confusion. For example for ice hardware,
>> "fw-activate" really does mean "Do an EMP reset" rather than just a
>> "device reset" which could be interpreted differently. ice can do
>> function level reset, or core device reset also. Neither of those two
>> resets activates firmware.
>>
>> Additionally the current function load process in ice doesn't support
>> driver-init at all. That's something I'd like to see happen but is quite
>> a significant refactor for how the driver loads. We need to refactor
>> everything out so that devlink is created early on and factor out
>> load/unload into handlers that can be called by the devlink reload. As I
>> have primarily been focused on flash update I sort of left that for the
>> future because it was a huge task to solve.
> Cool! That was what I was concerned about, but I didn't know any
> existing driver already has the problem. "FW reset" is not nearly
> a clear enough operation. We'd end up with drivers differing and
> users having to refer to vendor documentation to find out which
> "reset level" maps to what.
>
> I think the components in ethtool-reset try to address the same
> problem, and they have the notion of per-port, and per-device.
> In the modern world we lack the per-host notion, but that's still
> strictly clearer than the limited API proposed here.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 09/13] devlink: Add enable_remote_dev_reset generic parameter
  2020-07-29 14:42     ` Moshe Shemesh
@ 2020-07-29 20:57       ` Jakub Kicinski
  2020-07-30 12:08         ` Moshe Shemesh
  0 siblings, 1 reply; 58+ messages in thread
From: Jakub Kicinski @ 2020-07-29 20:57 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: David S. Miller, Jiri Pirko, Vasundhara Volam, netdev, linux-kernel

On Wed, 29 Jul 2020 17:42:12 +0300 Moshe Shemesh wrote:
> On 7/28/2020 3:59 AM, Jakub Kicinski wrote:
> > On Mon, 27 Jul 2020 14:02:29 +0300 Moshe Shemesh wrote:  
> >> The enable_remote_dev_reset devlink param flags that the host admin
> >> allows device resets that can be initiated by other hosts. This
> >> parameter is useful for setups where a device is shared by different
> >> hosts, such as multi-host setup. Once the user set this parameter to
> >> false, the driver should NACK any attempt to reset the device while the
> >> driver is loaded.
> >>
> >> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>  
> > There needs to be a devlink event generated when reset is triggered
> > (remotely or not).
> >
> > You're also missing failure reasons. Users need to know if the reset
> > request was clearly nacked by some host, not supported, etc. vs
> > unexpected failure.  
> 
> I will fix and send extack message to the user accordingly.

I'd suggest the reason codes to be somewhat standard.

The groups I can think of:
 - timeout - device did not respond to the reset request
 - device reject - FW or else has nacked the activation req
 - host incapable - one of the participating hosts (in MH) is not
   capable of handling live activation
 - host denied - one of the participating hosts has NACKed
 - host timeout - one of the p. hosts did not ack or done the procedure
   in time (e.g. has not toggled the link)
 - failed reset - the activation itself had failed
 - failed reinit - one of p. hosts was not able to cleanly come back up

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-07-29 14:54               ` Moshe Shemesh
@ 2020-07-29 21:07                 ` Jakub Kicinski
  2020-07-30 12:30                   ` Moshe Shemesh
  0 siblings, 1 reply; 58+ messages in thread
From: Jakub Kicinski @ 2020-07-29 21:07 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: Jacob Keller, Jiri Pirko, netdev, linux-kernel, David S. Miller,
	Jiri Pirko, Vasundhara Volam

On Wed, 29 Jul 2020 17:54:08 +0300 Moshe Shemesh wrote:
> On 7/28/2020 11:06 PM, Jakub Kicinski wrote:
> > On Tue, 28 Jul 2020 12:18:30 -0700 Jacob Keller wrote:  
> >> On 7/28/2020 11:44 AM, Jakub Kicinski wrote:  
> >>>  From user perspective what's important is what the reset achieves (and
> >>> perhaps how destructive it is). We can define the reset levels as:
> >>>
> >>> $ devlink dev reload pci/0000:82:00.0 net-ns-respawn
> >>> $ devlink dev reload pci/0000:82:00.0 driver-param-init
> >>> $ devlink dev reload pci/0000:82:00.0 fw-activate
> >>>
> >>> combining should be possible when user wants multiple things to happen:
> >>>
> >>> $ devlink dev reload pci/0000:82:00.0 fw-activate driver-param-init  
> >> Where today "driver-param-init" is the default behavior. But didn't we
> >> just say that mlxsw also does the equivalent of fw-activate?  
> > Actually the default should probably be the combination of
> > driver-param-init and net-ns-respawn.  
> 
> What about the support of these combinations, one device needs to reset 
> fw to apply the param init, while another device can apply param-init
> without fw reset, but has to reload the driver for fw-reset.
> 
> So the support per driver will be a matrix of combinations ?

Note that there is no driver reload in my examples, driver reload is
likely not user's goal. Whatever the driver needs to reset to satisfy
the goal is fair game IMO.

It's already the case that some drivers reset FW for param init and some
don't and nobody is complaining.

We should treat constraints separate (in this set we have the live
activation which is a constraint on the reload operation).

> > My expectations would be that the driver must perform the lowest
> > reset level possible that satisfies the requested functional change.
> > IOW driver may do more, in fact it should be acceptable for the
> > driver to always for a full HW reset (unless --live or other
> > constraint is specified).  
> 
> OK, but some combinations may still not be valid for specific driver 
> even if it tries lowest level possible.

Can you give an example?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 02/13] devlink: Add reload levels data to dev get
  2020-07-29 14:37     ` Moshe Shemesh
@ 2020-07-29 21:11       ` Jakub Kicinski
  2020-07-30 12:05         ` Moshe Shemesh
  0 siblings, 1 reply; 58+ messages in thread
From: Jakub Kicinski @ 2020-07-29 21:11 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: David S. Miller, Jiri Pirko, Vasundhara Volam, netdev, linux-kernel

On Wed, 29 Jul 2020 17:37:41 +0300 Moshe Shemesh wrote:
> > The fact that the driver supports fw_live_patch, does not necessarily
> > mean that the currently running FW can be live upgraded to the
> > currently flashed one, right?  
> 
> That's correct, though the feature is supported, the firmware gap may 
> not be suitable for live_patch.
> 
> The user will be noted accordingly by extack message.

That's kinda late, because use may have paid the cost of migrating the
workload or otherwise taking precautions - and if live reset fails all
this work is wasted.

While the device most likely knows upfront whether it can be live reset
or not, otherwise I don't see how it could reject the reset reliably.

> > This interface does not appear to be optimal for the purpose.
> >
> > Again, documentation of what can be lost (in terms of configuration and
> > features) upon upgrade is missing.  
> 
> I will clarify in documentation. On live_patch nothing should be lost or 
> re-initialized, that's the "live" thing.

Okay, so FW upgrade cannot be allowed when it'd mean the device gets
de-featured? Also no link loss, correct? What's the expected length of
traffic interruption (order of magnitude)?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 02/13] devlink: Add reload levels data to dev get
  2020-07-29 21:11       ` Jakub Kicinski
@ 2020-07-30 12:05         ` Moshe Shemesh
  0 siblings, 0 replies; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-30 12:05 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S. Miller, Jiri Pirko, Vasundhara Volam, netdev, linux-kernel


On 7/30/2020 12:11 AM, Jakub Kicinski wrote:
> On Wed, 29 Jul 2020 17:37:41 +0300 Moshe Shemesh wrote:
>>> The fact that the driver supports fw_live_patch, does not necessarily
>>> mean that the currently running FW can be live upgraded to the
>>> currently flashed one, right?
>> That's correct, though the feature is supported, the firmware gap may
>> not be suitable for live_patch.
>>
>> The user will be noted accordingly by extack message.
> That's kinda late, because use may have paid the cost of migrating the
> workload or otherwise taking precautions - and if live reset fails all
> this work is wasted.
>
> While the device most likely knows upfront whether it can be live reset
> or not, otherwise I don't see how it could reject the reset reliably.


The device knows if the new FW can be updated by live-patch or need 
reset once the new version is stored and it so it can check the gaps.

So once the new FW is stored I can query if it is a change that can do 
by live_patch or need full fw_reset.

>>> This interface does not appear to be optimal for the purpose.
>>>
>>> Again, documentation of what can be lost (in terms of configuration and
>>> features) upon upgrade is missing.
>> I will clarify in documentation. On live_patch nothing should be lost or
>> re-initialized, that's the "live" thing.
> Okay, so FW upgrade cannot be allowed when it'd mean the device gets
> de-featured? Also no link loss, correct? What's the expected length of
> traffic interruption (order of magnitude)?


That's different between fw_live_patch and fw_reset, that's why I see it 
as different level.

The live_patch is totally live, no link loss, no data interruption at all.

But when the firmware gap for upgrade is not suitable for live patch, 
the user can choose to do full fw reset, that can include link loss 
(depends on device) for few seconds and some configuration which is not 
saved by the driver or was not configured through the driver (some other 
tool) need to re-configure.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 09/13] devlink: Add enable_remote_dev_reset generic parameter
  2020-07-29 20:57       ` Jakub Kicinski
@ 2020-07-30 12:08         ` Moshe Shemesh
  0 siblings, 0 replies; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-30 12:08 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S. Miller, Jiri Pirko, Vasundhara Volam, netdev, linux-kernel


On 7/29/2020 11:57 PM, Jakub Kicinski wrote:
> On Wed, 29 Jul 2020 17:42:12 +0300 Moshe Shemesh wrote:
>> On 7/28/2020 3:59 AM, Jakub Kicinski wrote:
>>> On Mon, 27 Jul 2020 14:02:29 +0300 Moshe Shemesh wrote:
>>>> The enable_remote_dev_reset devlink param flags that the host admin
>>>> allows device resets that can be initiated by other hosts. This
>>>> parameter is useful for setups where a device is shared by different
>>>> hosts, such as multi-host setup. Once the user set this parameter to
>>>> false, the driver should NACK any attempt to reset the device while the
>>>> driver is loaded.
>>>>
>>>> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
>>> There needs to be a devlink event generated when reset is triggered
>>> (remotely or not).
>>>
>>> You're also missing failure reasons. Users need to know if the reset
>>> request was clearly nacked by some host, not supported, etc. vs
>>> unexpected failure.
>> I will fix and send extack message to the user accordingly.
> I'd suggest the reason codes to be somewhat standard.
>
> The groups I can think of:
>   - timeout - device did not respond to the reset request
>   - device reject - FW or else has nacked the activation req
>   - host incapable - one of the participating hosts (in MH) is not
>     capable of handling live activation
>   - host denied - one of the participating hosts has NACKed
>   - host timeout - one of the p. hosts did not ack or done the procedure
>     in time (e.g. has not toggled the link)
>   - failed reset - the activation itself had failed
>   - failed reinit - one of p. hosts was not able to cleanly come back up


Sounds good, that seems to cover all options of fw_reset process to fail.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-07-29 21:07                 ` Jakub Kicinski
@ 2020-07-30 12:30                   ` Moshe Shemesh
  2020-07-30 23:11                     ` Jakub Kicinski
  0 siblings, 1 reply; 58+ messages in thread
From: Moshe Shemesh @ 2020-07-30 12:30 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jacob Keller, Jiri Pirko, netdev, linux-kernel, David S. Miller,
	Jiri Pirko, Vasundhara Volam


On 7/30/2020 12:07 AM, Jakub Kicinski wrote:
> On Wed, 29 Jul 2020 17:54:08 +0300 Moshe Shemesh wrote:
>> On 7/28/2020 11:06 PM, Jakub Kicinski wrote:
>>> On Tue, 28 Jul 2020 12:18:30 -0700 Jacob Keller wrote:
>>>> On 7/28/2020 11:44 AM, Jakub Kicinski wrote:
>>>>>   From user perspective what's important is what the reset achieves (and
>>>>> perhaps how destructive it is). We can define the reset levels as:
>>>>>
>>>>> $ devlink dev reload pci/0000:82:00.0 net-ns-respawn
>>>>> $ devlink dev reload pci/0000:82:00.0 driver-param-init
>>>>> $ devlink dev reload pci/0000:82:00.0 fw-activate
>>>>>
>>>>> combining should be possible when user wants multiple things to happen:
>>>>>
>>>>> $ devlink dev reload pci/0000:82:00.0 fw-activate driver-param-init
>>>> Where today "driver-param-init" is the default behavior. But didn't we
>>>> just say that mlxsw also does the equivalent of fw-activate?
>>> Actually the default should probably be the combination of
>>> driver-param-init and net-ns-respawn.
>> What about the support of these combinations, one device needs to reset
>> fw to apply the param init, while another device can apply param-init
>> without fw reset, but has to reload the driver for fw-reset.
>>
>> So the support per driver will be a matrix of combinations ?
> Note that there is no driver reload in my examples, driver reload is
> likely not user's goal. Whatever the driver needs to reset to satisfy
> the goal is fair game IMO.


Actually, driver-param-init (cmode driverinit) implicit driver 
re-initialization.

> It's already the case that some drivers reset FW for param init and some
> don't and nobody is complaining.


Right, driver may need more than driver re-initialization for 
driver-param-init, but I think that driver re-initialization is the 
minimum for driver-param-init.

> We should treat constraints separate (in this set we have the live
> activation which is a constraint on the reload operation).
>
>>> My expectations would be that the driver must perform the lowest
>>> reset level possible that satisfies the requested functional change.
>>> IOW driver may do more, in fact it should be acceptable for the
>>> driver to always for a full HW reset (unless --live or other
>>> constraint is specified).
>> OK, but some combinations may still not be valid for specific driver
>> even if it tries lowest level possible.
> Can you give an example?


For example take the combination of fw-live-patch and param-init.

The fw-live-patch needs no re-initialization, while the param-init 
requires driver re-initialization.

So the only way to do that is to the one command after the other, not 
really combining.

Other combination, as fw-atcivate and param-init may not be valid for a 
specific driver as it doesn't support one of them and so can't even run 
one after the other.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-07-30 12:30                   ` Moshe Shemesh
@ 2020-07-30 23:11                     ` Jakub Kicinski
  2020-08-01 21:32                       ` Moshe Shemesh
  0 siblings, 1 reply; 58+ messages in thread
From: Jakub Kicinski @ 2020-07-30 23:11 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: Jacob Keller, Jiri Pirko, netdev, linux-kernel, David S. Miller,
	Jiri Pirko, Vasundhara Volam

On Thu, 30 Jul 2020 15:30:45 +0300 Moshe Shemesh wrote:
> >>> My expectations would be that the driver must perform the lowest
> >>> reset level possible that satisfies the requested functional change.
> >>> IOW driver may do more, in fact it should be acceptable for the
> >>> driver to always for a full HW reset (unless --live or other
> >>> constraint is specified).  
> >> OK, but some combinations may still not be valid for specific driver
> >> even if it tries lowest level possible.  
> > Can you give an example?  
> 
> For example take the combination of fw-live-patch and param-init.
> 
> The fw-live-patch needs no re-initialization, while the param-init 
> requires driver re-initialization.
> 
> So the only way to do that is to the one command after the other, not 
> really combining.

You need to read my responses more carefully. I don't have
fw-live-patch in my proposal. The operation is fw-activate,
--live is independent and an constraint, not an operation.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-07-30 23:11                     ` Jakub Kicinski
@ 2020-08-01 21:32                       ` Moshe Shemesh
  2020-08-03 14:14                         ` Jiri Pirko
  0 siblings, 1 reply; 58+ messages in thread
From: Moshe Shemesh @ 2020-08-01 21:32 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jacob Keller, Jiri Pirko, netdev, linux-kernel, David S. Miller,
	Jiri Pirko, Vasundhara Volam


On 7/31/2020 2:11 AM, Jakub Kicinski wrote:
> On Thu, 30 Jul 2020 15:30:45 +0300 Moshe Shemesh wrote:
>>>>> My expectations would be that the driver must perform the lowest
>>>>> reset level possible that satisfies the requested functional change.
>>>>> IOW driver may do more, in fact it should be acceptable for the
>>>>> driver to always for a full HW reset (unless --live or other
>>>>> constraint is specified).
>>>> OK, but some combinations may still not be valid for specific driver
>>>> even if it tries lowest level possible.
>>> Can you give an example?
>> For example take the combination of fw-live-patch and param-init.
>>
>> The fw-live-patch needs no re-initialization, while the param-init
>> requires driver re-initialization.
>>
>> So the only way to do that is to the one command after the other, not
>> really combining.
> You need to read my responses more carefully. I don't have
> fw-live-patch in my proposal. The operation is fw-activate,
> --live is independent and an constraint, not an operation.


OK, I probably didn't get the whole picture right.

I am not sure I got it yet, please review if that's the uAPI that you 
mean to:

devlink dev reload [ net-ns-respawn { PID | NAME | ID } ] [ 
driver-param-init ] [ fw-activate [ --live] ]


Also, I recall that before devlink param was added the devlink reload 
was used for devlink resources.

I am not sure it is still used for devlink resources as I don't see it 
in the code of devlink reload.

But if it is we probably should add it as another operation.

Jiri, please comment on that.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 00/13] Add devlink reload level option
  2020-07-28 16:43   ` Jacob Keller
@ 2020-08-03 10:24     ` Vasundhara Volam
  2020-08-03 12:17       ` Moshe Shemesh
  0 siblings, 1 reply; 58+ messages in thread
From: Vasundhara Volam @ 2020-08-03 10:24 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Moshe Shemesh, David S. Miller, Jiri Pirko, Netdev, open list

On Tue, Jul 28, 2020 at 10:13 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
>
>
>
> On 7/27/2020 10:25 PM, Vasundhara Volam wrote:
> > On Mon, Jul 27, 2020 at 4:36 PM Moshe Shemesh <moshe@mellanox.com> wrote:
> >>
> >> Introduce new option on devlink reload API to enable the user to select the
> >> reload level required. Complete support for all levels in mlx5.
> >> The following reload levels are supported:
> >>   driver: Driver entities re-instantiation only.
> >>   fw_reset: Firmware reset and driver entities re-instantiation.
> > The Name is a little confusing. I think it should be renamed to
> > fw_live_reset (in which both firmware and driver entities are
> > re-instantiated).  For only fw_reset, the driver should not undergo
> > reset (it requires a driver reload for firmware to undergo reset).
> >
>
> So, I think the differentiation here is that "live_patch" doesn't reset
> anything.
This seems similar to flashing the firmware and does not reset anything.

>
> >>   fw_live_patch: Firmware live patching only.
> > This level is not clear. Is this similar to flashing??
> >
> > Also I have a basic query. The reload command is split into
> > reload_up/reload_down handlers (Please correct me if this behaviour is
> > changed with this patchset). What if the vendor specific driver does
> > not support up/down and needs only a single handler to fire a firmware
> > reset or firmware live reset command?
>
> In the "reload_down" handler, they would trigger the appropriate reset,
> and quiesce anything that needs to be done. Then on reload up, it would
> restore and bring up anything quiesced in the first stage.
Yes, I got the "reload_down" and "reload_up". Similar to the device
"remove" and "re-probe" respectively.

But our requirement is a similar "ethtool reset" command, where
ethtool calls a single callback in driver and driver just sends a
firmware command for doing the reset. Once firmware receives the
command, it will initiate the reset of driver and firmware entities
asynchronously.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 00/13] Add devlink reload level option
  2020-08-03 10:24     ` Vasundhara Volam
@ 2020-08-03 12:17       ` Moshe Shemesh
  2020-08-03 12:47         ` Vasundhara Volam
  0 siblings, 1 reply; 58+ messages in thread
From: Moshe Shemesh @ 2020-08-03 12:17 UTC (permalink / raw)
  To: Vasundhara Volam, Jacob Keller
  Cc: David S. Miller, Jiri Pirko, Netdev, open list


On 8/3/2020 1:24 PM, Vasundhara Volam wrote:
> On Tue, Jul 28, 2020 at 10:13 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
>>
>>
>> On 7/27/2020 10:25 PM, Vasundhara Volam wrote:
>>> On Mon, Jul 27, 2020 at 4:36 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>>>> Introduce new option on devlink reload API to enable the user to select the
>>>> reload level required. Complete support for all levels in mlx5.
>>>> The following reload levels are supported:
>>>>    driver: Driver entities re-instantiation only.
>>>>    fw_reset: Firmware reset and driver entities re-instantiation.
>>> The Name is a little confusing. I think it should be renamed to
>>> fw_live_reset (in which both firmware and driver entities are
>>> re-instantiated).  For only fw_reset, the driver should not undergo
>>> reset (it requires a driver reload for firmware to undergo reset).
>>>
>> So, I think the differentiation here is that "live_patch" doesn't reset
>> anything.
> This seems similar to flashing the firmware and does not reset anything.


The live patch is activating fw change without reset.

It is not suitable for any fw change but fw gaps which don't require reset.

I can query the fw to check if the pending image change is suitable or 
require fw reset.

>>>>    fw_live_patch: Firmware live patching only.
>>> This level is not clear. Is this similar to flashing??
>>>
>>> Also I have a basic query. The reload command is split into
>>> reload_up/reload_down handlers (Please correct me if this behaviour is
>>> changed with this patchset). What if the vendor specific driver does
>>> not support up/down and needs only a single handler to fire a firmware
>>> reset or firmware live reset command?
>> In the "reload_down" handler, they would trigger the appropriate reset,
>> and quiesce anything that needs to be done. Then on reload up, it would
>> restore and bring up anything quiesced in the first stage.
> Yes, I got the "reload_down" and "reload_up". Similar to the device
> "remove" and "re-probe" respectively.
>
> But our requirement is a similar "ethtool reset" command, where
> ethtool calls a single callback in driver and driver just sends a
> firmware command for doing the reset. Once firmware receives the
> command, it will initiate the reset of driver and firmware entities
> asynchronously.


It is similar to mlx5 case here for fw_reset. The driver triggers the fw 
command to reset and all PFs drivers gets events to handle and do 
re-initialization.  To fit it to the devlink reload_down and reload_up, 
I wait for the event handler to complete and it stops at driver unload 
to have the driver up by devlink reload_up. See patch 8 in this patchset.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 00/13] Add devlink reload level option
  2020-08-03 12:17       ` Moshe Shemesh
@ 2020-08-03 12:47         ` Vasundhara Volam
  2020-08-03 13:52           ` Moshe Shemesh
  0 siblings, 1 reply; 58+ messages in thread
From: Vasundhara Volam @ 2020-08-03 12:47 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: Jacob Keller, David S. Miller, Jiri Pirko, Netdev, open list

On Mon, Aug 3, 2020 at 5:47 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>
>
> On 8/3/2020 1:24 PM, Vasundhara Volam wrote:
> > On Tue, Jul 28, 2020 at 10:13 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
> >>
> >>
> >> On 7/27/2020 10:25 PM, Vasundhara Volam wrote:
> >>> On Mon, Jul 27, 2020 at 4:36 PM Moshe Shemesh <moshe@mellanox.com> wrote:
> >>>> Introduce new option on devlink reload API to enable the user to select the
> >>>> reload level required. Complete support for all levels in mlx5.
> >>>> The following reload levels are supported:
> >>>>    driver: Driver entities re-instantiation only.
> >>>>    fw_reset: Firmware reset and driver entities re-instantiation.
> >>> The Name is a little confusing. I think it should be renamed to
> >>> fw_live_reset (in which both firmware and driver entities are
> >>> re-instantiated).  For only fw_reset, the driver should not undergo
> >>> reset (it requires a driver reload for firmware to undergo reset).
> >>>
> >> So, I think the differentiation here is that "live_patch" doesn't reset
> >> anything.
> > This seems similar to flashing the firmware and does not reset anything.
>
>
> The live patch is activating fw change without reset.
>
> It is not suitable for any fw change but fw gaps which don't require reset.
>
> I can query the fw to check if the pending image change is suitable or
> require fw reset.
Okay.
>
> >>>>    fw_live_patch: Firmware live patching only.
> >>> This level is not clear. Is this similar to flashing??
> >>>
> >>> Also I have a basic query. The reload command is split into
> >>> reload_up/reload_down handlers (Please correct me if this behaviour is
> >>> changed with this patchset). What if the vendor specific driver does
> >>> not support up/down and needs only a single handler to fire a firmware
> >>> reset or firmware live reset command?
> >> In the "reload_down" handler, they would trigger the appropriate reset,
> >> and quiesce anything that needs to be done. Then on reload up, it would
> >> restore and bring up anything quiesced in the first stage.
> > Yes, I got the "reload_down" and "reload_up". Similar to the device
> > "remove" and "re-probe" respectively.
> >
> > But our requirement is a similar "ethtool reset" command, where
> > ethtool calls a single callback in driver and driver just sends a
> > firmware command for doing the reset. Once firmware receives the
> > command, it will initiate the reset of driver and firmware entities
> > asynchronously.
>
>
> It is similar to mlx5 case here for fw_reset. The driver triggers the fw
> command to reset and all PFs drivers gets events to handle and do
> re-initialization.  To fit it to the devlink reload_down and reload_up,
> I wait for the event handler to complete and it stops at driver unload
> to have the driver up by devlink reload_up. See patch 8 in this patchset.
>
Yes, I see reload_down is triggering the reset. In our driver, after
triggering the reset through a firmware command, reset is done in
another context as the driver initiates the reset only after receiving
an ASYNC event from the firmware.

Probably, we have to use reload_down() to send firmware command to
trigger reset and do nothing in reload_up. And returning from reload
does not mean that reset is complete as it is done in another context
and the driver notifies the health reporter once the reset is
complete. devlink framework may have to allow drivers to implement
reload_down only to look more clean or call reload_up only if the
driver notifies the devlink once reset is completed from another
context. Please suggest.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 00/13] Add devlink reload level option
  2020-08-03 12:47         ` Vasundhara Volam
@ 2020-08-03 13:52           ` Moshe Shemesh
  2020-08-04 10:13             ` Vasundhara Volam
  0 siblings, 1 reply; 58+ messages in thread
From: Moshe Shemesh @ 2020-08-03 13:52 UTC (permalink / raw)
  To: Vasundhara Volam
  Cc: Jacob Keller, David S. Miller, Jiri Pirko, Netdev, open list


On 8/3/2020 3:47 PM, Vasundhara Volam wrote:
> On Mon, Aug 3, 2020 at 5:47 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>>
>> On 8/3/2020 1:24 PM, Vasundhara Volam wrote:
>>> On Tue, Jul 28, 2020 at 10:13 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
>>>>
>>>> On 7/27/2020 10:25 PM, Vasundhara Volam wrote:
>>>>> On Mon, Jul 27, 2020 at 4:36 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>>>>>> Introduce new option on devlink reload API to enable the user to select the
>>>>>> reload level required. Complete support for all levels in mlx5.
>>>>>> The following reload levels are supported:
>>>>>>     driver: Driver entities re-instantiation only.
>>>>>>     fw_reset: Firmware reset and driver entities re-instantiation.
>>>>> The Name is a little confusing. I think it should be renamed to
>>>>> fw_live_reset (in which both firmware and driver entities are
>>>>> re-instantiated).  For only fw_reset, the driver should not undergo
>>>>> reset (it requires a driver reload for firmware to undergo reset).
>>>>>
>>>> So, I think the differentiation here is that "live_patch" doesn't reset
>>>> anything.
>>> This seems similar to flashing the firmware and does not reset anything.
>>
>> The live patch is activating fw change without reset.
>>
>> It is not suitable for any fw change but fw gaps which don't require reset.
>>
>> I can query the fw to check if the pending image change is suitable or
>> require fw reset.
> Okay.
>>>>>>     fw_live_patch: Firmware live patching only.
>>>>> This level is not clear. Is this similar to flashing??
>>>>>
>>>>> Also I have a basic query. The reload command is split into
>>>>> reload_up/reload_down handlers (Please correct me if this behaviour is
>>>>> changed with this patchset). What if the vendor specific driver does
>>>>> not support up/down and needs only a single handler to fire a firmware
>>>>> reset or firmware live reset command?
>>>> In the "reload_down" handler, they would trigger the appropriate reset,
>>>> and quiesce anything that needs to be done. Then on reload up, it would
>>>> restore and bring up anything quiesced in the first stage.
>>> Yes, I got the "reload_down" and "reload_up". Similar to the device
>>> "remove" and "re-probe" respectively.
>>>
>>> But our requirement is a similar "ethtool reset" command, where
>>> ethtool calls a single callback in driver and driver just sends a
>>> firmware command for doing the reset. Once firmware receives the
>>> command, it will initiate the reset of driver and firmware entities
>>> asynchronously.
>>
>> It is similar to mlx5 case here for fw_reset. The driver triggers the fw
>> command to reset and all PFs drivers gets events to handle and do
>> re-initialization.  To fit it to the devlink reload_down and reload_up,
>> I wait for the event handler to complete and it stops at driver unload
>> to have the driver up by devlink reload_up. See patch 8 in this patchset.
>>
> Yes, I see reload_down is triggering the reset. In our driver, after
> triggering the reset through a firmware command, reset is done in
> another context as the driver initiates the reset only after receiving
> an ASYNC event from the firmware.


Same here.

>
> Probably, we have to use reload_down() to send firmware command to
> trigger reset and do nothing in reload_up.
I had that in previous version, but its wrong to use devlink reload this 
way, so I added wait with timeout for the event handling to complete 
before unload_down function ends. See mlx5_fw_wait_fw_reset_done(). Also 
the event handler stops before load back to have that done by devlink 
reload_up.
>   And returning from reload
> does not mean that reset is complete as it is done in another context
> and the driver notifies the health reporter once the reset is
> complete. devlink framework may have to allow drivers to implement
> reload_down only to look more clean or call reload_up only if the
> driver notifies the devlink once reset is completed from another
> context. Please suggest.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-08-01 21:32                       ` Moshe Shemesh
@ 2020-08-03 14:14                         ` Jiri Pirko
  2020-08-03 20:57                           ` Jakub Kicinski
  0 siblings, 1 reply; 58+ messages in thread
From: Jiri Pirko @ 2020-08-03 14:14 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: Jakub Kicinski, Jacob Keller, netdev, linux-kernel,
	David S. Miller, Jiri Pirko, Vasundhara Volam

Sat, Aug 01, 2020 at 11:32:25PM CEST, moshe@mellanox.com wrote:
>
>On 7/31/2020 2:11 AM, Jakub Kicinski wrote:
>> On Thu, 30 Jul 2020 15:30:45 +0300 Moshe Shemesh wrote:
>> > > > > My expectations would be that the driver must perform the lowest
>> > > > > reset level possible that satisfies the requested functional change.
>> > > > > IOW driver may do more, in fact it should be acceptable for the
>> > > > > driver to always for a full HW reset (unless --live or other
>> > > > > constraint is specified).
>> > > > OK, but some combinations may still not be valid for specific driver
>> > > > even if it tries lowest level possible.
>> > > Can you give an example?
>> > For example take the combination of fw-live-patch and param-init.
>> > 
>> > The fw-live-patch needs no re-initialization, while the param-init
>> > requires driver re-initialization.
>> > 
>> > So the only way to do that is to the one command after the other, not
>> > really combining.
>> You need to read my responses more carefully. I don't have
>> fw-live-patch in my proposal. The operation is fw-activate,
>> --live is independent and an constraint, not an operation.
>
>
>OK, I probably didn't get the whole picture right.
>
>I am not sure I got it yet, please review if that's the uAPI that you mean
>to:
>
>devlink dev reload [ net-ns-respawn { PID | NAME | ID } ] [ driver-param-init
>] [ fw-activate [ --live] ]

Jakub, why do you prefer to have another extra level-specific option
"live"? I think it is clear to have it as a separate level. The behaviour
of the operation is quite different.


>
>
>Also, I recall that before devlink param was added the devlink reload was
>used for devlink resources.

Yes. That was the primary usecase. That is also why mlxsw does fw reset,
because the fw reset is needed in order to pass resources configuration.

So I don't think that the name should be "driver-param-init" as it is
not specific to params.


>
>I am not sure it is still used for devlink resources as I don't see it in the
>code of devlink reload.
>
>But if it is we probably should add it as another operation.
>
>Jiri, please comment on that.
>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-08-03 14:14                         ` Jiri Pirko
@ 2020-08-03 20:57                           ` Jakub Kicinski
  2020-08-04 10:04                             ` Jiri Pirko
  0 siblings, 1 reply; 58+ messages in thread
From: Jakub Kicinski @ 2020-08-03 20:57 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Moshe Shemesh, Jacob Keller, netdev, linux-kernel,
	David S. Miller, Jiri Pirko, Vasundhara Volam

On Mon, 3 Aug 2020 16:14:42 +0200 Jiri Pirko wrote:
> >devlink dev reload [ net-ns-respawn { PID | NAME | ID } ] [ driver-param-init
> >] [ fw-activate [ --live] ]  
> 
> Jakub, why do you prefer to have another extra level-specific option
> "live"? I think it is clear to have it as a separate level. The behaviour
> of the operation is quite different.

I was trying to avoid having to provide a Cartesian product of
operation and system disruption level, if any other action can
be done "live" at some point.

But no strong feelings about that one.

Really, as long as there is no driver-specific defaults (or as 
little driver-specific anything as possible) and user actions 
are clearly expressed (fw-reset does not necessarily imply
fw-activation) - the API will be fine IMO.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-08-03 20:57                           ` Jakub Kicinski
@ 2020-08-04 10:04                             ` Jiri Pirko
  2020-08-04 20:39                               ` Jakub Kicinski
  0 siblings, 1 reply; 58+ messages in thread
From: Jiri Pirko @ 2020-08-04 10:04 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Moshe Shemesh, Jacob Keller, netdev, linux-kernel,
	David S. Miller, Jiri Pirko, Vasundhara Volam

Mon, Aug 03, 2020 at 10:57:03PM CEST, kuba@kernel.org wrote:
>On Mon, 3 Aug 2020 16:14:42 +0200 Jiri Pirko wrote:
>> >devlink dev reload [ net-ns-respawn { PID | NAME | ID } ] [ driver-param-init
>> >] [ fw-activate [ --live] ]  
>> 
>> Jakub, why do you prefer to have another extra level-specific option
>> "live"? I think it is clear to have it as a separate level. The behaviour
>> of the operation is quite different.
>
>I was trying to avoid having to provide a Cartesian product of
>operation and system disruption level, if any other action can
>be done "live" at some point.
>
>But no strong feelings about that one.
>
>Really, as long as there is no driver-specific defaults (or as 
>little driver-specific anything as possible) and user actions 
>are clearly expressed (fw-reset does not necessarily imply
>fw-activation) - the API will be fine IMO.

Clear actions, that is what I'm fine with.

But not sure how you think we can achieve no driver-specific defaults.
We have them already :/ I don't think we can easily remove them and not
break user expectations.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 00/13] Add devlink reload level option
  2020-08-03 13:52           ` Moshe Shemesh
@ 2020-08-04 10:13             ` Vasundhara Volam
  2020-08-05  6:32               ` Moshe Shemesh
  0 siblings, 1 reply; 58+ messages in thread
From: Vasundhara Volam @ 2020-08-04 10:13 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: Jacob Keller, David S. Miller, Jiri Pirko, Netdev, open list

On Mon, Aug 3, 2020 at 7:23 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>
>
> On 8/3/2020 3:47 PM, Vasundhara Volam wrote:
> > On Mon, Aug 3, 2020 at 5:47 PM Moshe Shemesh <moshe@mellanox.com> wrote:
> >>
> >> On 8/3/2020 1:24 PM, Vasundhara Volam wrote:
> >>> On Tue, Jul 28, 2020 at 10:13 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
> >>>>
> >>>> On 7/27/2020 10:25 PM, Vasundhara Volam wrote:
> >>>>> On Mon, Jul 27, 2020 at 4:36 PM Moshe Shemesh <moshe@mellanox.com> wrote:
> >>>>>> Introduce new option on devlink reload API to enable the user to select the
> >>>>>> reload level required. Complete support for all levels in mlx5.
> >>>>>> The following reload levels are supported:
> >>>>>>     driver: Driver entities re-instantiation only.
> >>>>>>     fw_reset: Firmware reset and driver entities re-instantiation.
> >>>>> The Name is a little confusing. I think it should be renamed to
> >>>>> fw_live_reset (in which both firmware and driver entities are
> >>>>> re-instantiated).  For only fw_reset, the driver should not undergo
> >>>>> reset (it requires a driver reload for firmware to undergo reset).
> >>>>>
> >>>> So, I think the differentiation here is that "live_patch" doesn't reset
> >>>> anything.
> >>> This seems similar to flashing the firmware and does not reset anything.
> >>
> >> The live patch is activating fw change without reset.
> >>
> >> It is not suitable for any fw change but fw gaps which don't require reset.
> >>
> >> I can query the fw to check if the pending image change is suitable or
> >> require fw reset.
> > Okay.
> >>>>>>     fw_live_patch: Firmware live patching only.
> >>>>> This level is not clear. Is this similar to flashing??
> >>>>>
> >>>>> Also I have a basic query. The reload command is split into
> >>>>> reload_up/reload_down handlers (Please correct me if this behaviour is
> >>>>> changed with this patchset). What if the vendor specific driver does
> >>>>> not support up/down and needs only a single handler to fire a firmware
> >>>>> reset or firmware live reset command?
> >>>> In the "reload_down" handler, they would trigger the appropriate reset,
> >>>> and quiesce anything that needs to be done. Then on reload up, it would
> >>>> restore and bring up anything quiesced in the first stage.
> >>> Yes, I got the "reload_down" and "reload_up". Similar to the device
> >>> "remove" and "re-probe" respectively.
> >>>
> >>> But our requirement is a similar "ethtool reset" command, where
> >>> ethtool calls a single callback in driver and driver just sends a
> >>> firmware command for doing the reset. Once firmware receives the
> >>> command, it will initiate the reset of driver and firmware entities
> >>> asynchronously.
> >>
> >> It is similar to mlx5 case here for fw_reset. The driver triggers the fw
> >> command to reset and all PFs drivers gets events to handle and do
> >> re-initialization.  To fit it to the devlink reload_down and reload_up,
> >> I wait for the event handler to complete and it stops at driver unload
> >> to have the driver up by devlink reload_up. See patch 8 in this patchset.
> >>
> > Yes, I see reload_down is triggering the reset. In our driver, after
> > triggering the reset through a firmware command, reset is done in
> > another context as the driver initiates the reset only after receiving
> > an ASYNC event from the firmware.
>
>
> Same here.
>
> >
> > Probably, we have to use reload_down() to send firmware command to
> > trigger reset and do nothing in reload_up.
> I had that in previous version, but its wrong to use devlink reload this
> way, so I added wait with timeout for the event handling to complete
> before unload_down function ends. See mlx5_fw_wait_fw_reset_done(). Also
> the event handler stops before load back to have that done by devlink
> reload_up.
But "devlink dev reload" will be invoked by the user only on a single
dev handler and all function drivers will be re-instantiated upon the
ASYNC event. reload_down and reload_up are invoked only the function
which the user invoked.

Take an example of a 2-port (PF0 and PF1) adapter on a single host and
with some VFs loaded on the device. User invokes "devlink dev reload"
on PF0, ASYNC event is received on 2 PFs and VFs for reset. All the
function drivers will be re-instantiated including PF0.

If we wait for some time in reload_down() of PF0 and then call load in
reload_up(), this code will be different from other function drivers.

> >   And returning from reload
> > does not mean that reset is complete as it is done in another context
> > and the driver notifies the health reporter once the reset is
> > complete. devlink framework may have to allow drivers to implement
> > reload_down only to look more clean or call reload_up only if the
> > driver notifies the devlink once reset is completed from another
> > context. Please suggest.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-08-04 10:04                             ` Jiri Pirko
@ 2020-08-04 20:39                               ` Jakub Kicinski
  2020-08-05 11:02                                 ` Jiri Pirko
  0 siblings, 1 reply; 58+ messages in thread
From: Jakub Kicinski @ 2020-08-04 20:39 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Moshe Shemesh, Jacob Keller, netdev, linux-kernel,
	David S. Miller, Jiri Pirko, Vasundhara Volam

On Tue, 4 Aug 2020 12:04:18 +0200 Jiri Pirko wrote:
> Mon, Aug 03, 2020 at 10:57:03PM CEST, kuba@kernel.org wrote:
> >I was trying to avoid having to provide a Cartesian product of
> >operation and system disruption level, if any other action can
> >be done "live" at some point.
> >
> >But no strong feelings about that one.
> >
> >Really, as long as there is no driver-specific defaults (or as 
> >little driver-specific anything as possible) and user actions 
> >are clearly expressed (fw-reset does not necessarily imply
> >fw-activation) - the API will be fine IMO.  
> 
> Clear actions, that is what I'm fine with.
> 
> But not sure how you think we can achieve no driver-specific defaults.
> We have them already :/ I don't think we can easily remove them and not
> break user expectations.

AFAIU the per-driver default is needed because we went too low 
level with what the action constitutes. We need maintain the higher
level actions.

The user clearly did not care if FW was reset during devlink reload
before this set, so what has changed? The objective user has is to
activate their config / FW / move to different net ns. 

Reloading the driver or resetting FW is a low level detail which
achieves different things for different implementations. So it's 
not a suitable abstraction -> IOW we need the driver default.


The work flow for the user is:

0. download fw to /lib/firmware
1. devlink flash $dev $fw
2. if live activation is enabled
   yes - devlink reload $dev $live-activate
   no - report machine has to be drained for reboot

fw-reset can't be $live-activate, because as Jake said fw-reset does
not activate the new image for Intel. So will we end up per-driver
defaults in the kernel space, and user space maintaining a mapping from
a driver to what a "level" of reset implies.

I hope this makes things crystal clear. Please explain what problems
you're seeing and extensions you're expecting. A list of user scenarios
you foresee would be v. useful.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 00/13] Add devlink reload level option
  2020-08-04 10:13             ` Vasundhara Volam
@ 2020-08-05  6:32               ` Moshe Shemesh
  2020-08-05  6:55                 ` Vasundhara Volam
  0 siblings, 1 reply; 58+ messages in thread
From: Moshe Shemesh @ 2020-08-05  6:32 UTC (permalink / raw)
  To: Vasundhara Volam
  Cc: Jacob Keller, David S. Miller, Jiri Pirko, Netdev, open list


On 8/4/2020 1:13 PM, Vasundhara Volam wrote:
> On Mon, Aug 3, 2020 at 7:23 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>>
>> On 8/3/2020 3:47 PM, Vasundhara Volam wrote:
>>> On Mon, Aug 3, 2020 at 5:47 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>>>> On 8/3/2020 1:24 PM, Vasundhara Volam wrote:
>>>>> On Tue, Jul 28, 2020 at 10:13 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
>>>>>> On 7/27/2020 10:25 PM, Vasundhara Volam wrote:
>>>>>>> On Mon, Jul 27, 2020 at 4:36 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>>>>>>>> Introduce new option on devlink reload API to enable the user to select the
>>>>>>>> reload level required. Complete support for all levels in mlx5.
>>>>>>>> The following reload levels are supported:
>>>>>>>>      driver: Driver entities re-instantiation only.
>>>>>>>>      fw_reset: Firmware reset and driver entities re-instantiation.
>>>>>>> The Name is a little confusing. I think it should be renamed to
>>>>>>> fw_live_reset (in which both firmware and driver entities are
>>>>>>> re-instantiated).  For only fw_reset, the driver should not undergo
>>>>>>> reset (it requires a driver reload for firmware to undergo reset).
>>>>>>>
>>>>>> So, I think the differentiation here is that "live_patch" doesn't reset
>>>>>> anything.
>>>>> This seems similar to flashing the firmware and does not reset anything.
>>>> The live patch is activating fw change without reset.
>>>>
>>>> It is not suitable for any fw change but fw gaps which don't require reset.
>>>>
>>>> I can query the fw to check if the pending image change is suitable or
>>>> require fw reset.
>>> Okay.
>>>>>>>>      fw_live_patch: Firmware live patching only.
>>>>>>> This level is not clear. Is this similar to flashing??
>>>>>>>
>>>>>>> Also I have a basic query. The reload command is split into
>>>>>>> reload_up/reload_down handlers (Please correct me if this behaviour is
>>>>>>> changed with this patchset). What if the vendor specific driver does
>>>>>>> not support up/down and needs only a single handler to fire a firmware
>>>>>>> reset or firmware live reset command?
>>>>>> In the "reload_down" handler, they would trigger the appropriate reset,
>>>>>> and quiesce anything that needs to be done. Then on reload up, it would
>>>>>> restore and bring up anything quiesced in the first stage.
>>>>> Yes, I got the "reload_down" and "reload_up". Similar to the device
>>>>> "remove" and "re-probe" respectively.
>>>>>
>>>>> But our requirement is a similar "ethtool reset" command, where
>>>>> ethtool calls a single callback in driver and driver just sends a
>>>>> firmware command for doing the reset. Once firmware receives the
>>>>> command, it will initiate the reset of driver and firmware entities
>>>>> asynchronously.
>>>> It is similar to mlx5 case here for fw_reset. The driver triggers the fw
>>>> command to reset and all PFs drivers gets events to handle and do
>>>> re-initialization.  To fit it to the devlink reload_down and reload_up,
>>>> I wait for the event handler to complete and it stops at driver unload
>>>> to have the driver up by devlink reload_up. See patch 8 in this patchset.
>>>>
>>> Yes, I see reload_down is triggering the reset. In our driver, after
>>> triggering the reset through a firmware command, reset is done in
>>> another context as the driver initiates the reset only after receiving
>>> an ASYNC event from the firmware.
>>
>> Same here.
>>
>>> Probably, we have to use reload_down() to send firmware command to
>>> trigger reset and do nothing in reload_up.
>> I had that in previous version, but its wrong to use devlink reload this
>> way, so I added wait with timeout for the event handling to complete
>> before unload_down function ends. See mlx5_fw_wait_fw_reset_done(). Also
>> the event handler stops before load back to have that done by devlink
>> reload_up.
> But "devlink dev reload" will be invoked by the user only on a single
> dev handler and all function drivers will be re-instantiated upon the
> ASYNC event. reload_down and reload_up are invoked only the function
> which the user invoked.
>
> Take an example of a 2-port (PF0 and PF1) adapter on a single host and
> with some VFs loaded on the device. User invokes "devlink dev reload"
> on PF0, ASYNC event is received on 2 PFs and VFs for reset. All the
> function drivers will be re-instantiated including PF0.
>
> If we wait for some time in reload_down() of PF0 and then call load in
> reload_up(), this code will be different from other function drivers.


I see your point here, but the user run devlink reload command on one 
PF, in this case of fw-reset it will influence other PFs, but that's a 
result of the fw-reset, the user if asked for params change or namespace 
change that was for this PF.

>>>    And returning from reload
>>> does not mean that reset is complete as it is done in another context
>>> and the driver notifies the health reporter once the reset is
>>> complete. devlink framework may have to allow drivers to implement
>>> reload_down only to look more clean or call reload_up only if the
>>> driver notifies the devlink once reset is completed from another
>>> context. Please suggest.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 00/13] Add devlink reload level option
  2020-08-05  6:32               ` Moshe Shemesh
@ 2020-08-05  6:55                 ` Vasundhara Volam
  2020-08-05  8:20                   ` Moshe Shemesh
  0 siblings, 1 reply; 58+ messages in thread
From: Vasundhara Volam @ 2020-08-05  6:55 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: Jacob Keller, David S. Miller, Jiri Pirko, Netdev, open list

On Wed, Aug 5, 2020 at 12:02 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>
>
> On 8/4/2020 1:13 PM, Vasundhara Volam wrote:
> > On Mon, Aug 3, 2020 at 7:23 PM Moshe Shemesh <moshe@mellanox.com> wrote:
> >>
> >> On 8/3/2020 3:47 PM, Vasundhara Volam wrote:
> >>> On Mon, Aug 3, 2020 at 5:47 PM Moshe Shemesh <moshe@mellanox.com> wrote:
> >>>> On 8/3/2020 1:24 PM, Vasundhara Volam wrote:
> >>>>> On Tue, Jul 28, 2020 at 10:13 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
> >>>>>> On 7/27/2020 10:25 PM, Vasundhara Volam wrote:
> >>>>>>> On Mon, Jul 27, 2020 at 4:36 PM Moshe Shemesh <moshe@mellanox.com> wrote:
> >>>>>>>> Introduce new option on devlink reload API to enable the user to select the
> >>>>>>>> reload level required. Complete support for all levels in mlx5.
> >>>>>>>> The following reload levels are supported:
> >>>>>>>>      driver: Driver entities re-instantiation only.
> >>>>>>>>      fw_reset: Firmware reset and driver entities re-instantiation.
> >>>>>>> The Name is a little confusing. I think it should be renamed to
> >>>>>>> fw_live_reset (in which both firmware and driver entities are
> >>>>>>> re-instantiated).  For only fw_reset, the driver should not undergo
> >>>>>>> reset (it requires a driver reload for firmware to undergo reset).
> >>>>>>>
> >>>>>> So, I think the differentiation here is that "live_patch" doesn't reset
> >>>>>> anything.
> >>>>> This seems similar to flashing the firmware and does not reset anything.
> >>>> The live patch is activating fw change without reset.
> >>>>
> >>>> It is not suitable for any fw change but fw gaps which don't require reset.
> >>>>
> >>>> I can query the fw to check if the pending image change is suitable or
> >>>> require fw reset.
> >>> Okay.
> >>>>>>>>      fw_live_patch: Firmware live patching only.
> >>>>>>> This level is not clear. Is this similar to flashing??
> >>>>>>>
> >>>>>>> Also I have a basic query. The reload command is split into
> >>>>>>> reload_up/reload_down handlers (Please correct me if this behaviour is
> >>>>>>> changed with this patchset). What if the vendor specific driver does
> >>>>>>> not support up/down and needs only a single handler to fire a firmware
> >>>>>>> reset or firmware live reset command?
> >>>>>> In the "reload_down" handler, they would trigger the appropriate reset,
> >>>>>> and quiesce anything that needs to be done. Then on reload up, it would
> >>>>>> restore and bring up anything quiesced in the first stage.
> >>>>> Yes, I got the "reload_down" and "reload_up". Similar to the device
> >>>>> "remove" and "re-probe" respectively.
> >>>>>
> >>>>> But our requirement is a similar "ethtool reset" command, where
> >>>>> ethtool calls a single callback in driver and driver just sends a
> >>>>> firmware command for doing the reset. Once firmware receives the
> >>>>> command, it will initiate the reset of driver and firmware entities
> >>>>> asynchronously.
> >>>> It is similar to mlx5 case here for fw_reset. The driver triggers the fw
> >>>> command to reset and all PFs drivers gets events to handle and do
> >>>> re-initialization.  To fit it to the devlink reload_down and reload_up,
> >>>> I wait for the event handler to complete and it stops at driver unload
> >>>> to have the driver up by devlink reload_up. See patch 8 in this patchset.
> >>>>
> >>> Yes, I see reload_down is triggering the reset. In our driver, after
> >>> triggering the reset through a firmware command, reset is done in
> >>> another context as the driver initiates the reset only after receiving
> >>> an ASYNC event from the firmware.
> >>
> >> Same here.
> >>
> >>> Probably, we have to use reload_down() to send firmware command to
> >>> trigger reset and do nothing in reload_up.
> >> I had that in previous version, but its wrong to use devlink reload this
> >> way, so I added wait with timeout for the event handling to complete
> >> before unload_down function ends. See mlx5_fw_wait_fw_reset_done(). Also
> >> the event handler stops before load back to have that done by devlink
> >> reload_up.
> > But "devlink dev reload" will be invoked by the user only on a single
> > dev handler and all function drivers will be re-instantiated upon the
> > ASYNC event. reload_down and reload_up are invoked only the function
> > which the user invoked.
> >
> > Take an example of a 2-port (PF0 and PF1) adapter on a single host and
> > with some VFs loaded on the device. User invokes "devlink dev reload"
> > on PF0, ASYNC event is received on 2 PFs and VFs for reset. All the
> > function drivers will be re-instantiated including PF0.
> >
> > If we wait for some time in reload_down() of PF0 and then call load in
> > reload_up(), this code will be different from other function drivers.
>
>
> I see your point here, but the user run devlink reload command on one
> PF, in this case of fw-reset it will influence other PFs, but that's a
> result of the fw-reset, the user if asked for params change or namespace
> change that was for this PF.
Right, if any driver is implementing only fw-reset have to leave
reload_up as an empty function.

>
> >>>    And returning from reload
> >>> does not mean that reset is complete as it is done in another context
> >>> and the driver notifies the health reporter once the reset is
> >>> complete. devlink framework may have to allow drivers to implement
> >>> reload_down only to look more clean or call reload_up only if the
> >>> driver notifies the devlink once reset is completed from another
> >>> context. Please suggest.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 00/13] Add devlink reload level option
  2020-08-05  6:55                 ` Vasundhara Volam
@ 2020-08-05  8:20                   ` Moshe Shemesh
  2020-08-12  9:34                     ` Vasundhara Volam
  0 siblings, 1 reply; 58+ messages in thread
From: Moshe Shemesh @ 2020-08-05  8:20 UTC (permalink / raw)
  To: Vasundhara Volam
  Cc: Jacob Keller, David S. Miller, Jiri Pirko, Netdev, open list


On 8/5/2020 9:55 AM, Vasundhara Volam wrote:
> On Wed, Aug 5, 2020 at 12:02 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>>
>> On 8/4/2020 1:13 PM, Vasundhara Volam wrote:
>>> On Mon, Aug 3, 2020 at 7:23 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>>>> On 8/3/2020 3:47 PM, Vasundhara Volam wrote:
>>>>> On Mon, Aug 3, 2020 at 5:47 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>>>>>> On 8/3/2020 1:24 PM, Vasundhara Volam wrote:
>>>>>>> On Tue, Jul 28, 2020 at 10:13 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
>>>>>>>> On 7/27/2020 10:25 PM, Vasundhara Volam wrote:
>>>>>>>>> On Mon, Jul 27, 2020 at 4:36 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>>>>>>>>>> Introduce new option on devlink reload API to enable the user to select the
>>>>>>>>>> reload level required. Complete support for all levels in mlx5.
>>>>>>>>>> The following reload levels are supported:
>>>>>>>>>>       driver: Driver entities re-instantiation only.
>>>>>>>>>>       fw_reset: Firmware reset and driver entities re-instantiation.
>>>>>>>>> The Name is a little confusing. I think it should be renamed to
>>>>>>>>> fw_live_reset (in which both firmware and driver entities are
>>>>>>>>> re-instantiated).  For only fw_reset, the driver should not undergo
>>>>>>>>> reset (it requires a driver reload for firmware to undergo reset).
>>>>>>>>>
>>>>>>>> So, I think the differentiation here is that "live_patch" doesn't reset
>>>>>>>> anything.
>>>>>>> This seems similar to flashing the firmware and does not reset anything.
>>>>>> The live patch is activating fw change without reset.
>>>>>>
>>>>>> It is not suitable for any fw change but fw gaps which don't require reset.
>>>>>>
>>>>>> I can query the fw to check if the pending image change is suitable or
>>>>>> require fw reset.
>>>>> Okay.
>>>>>>>>>>       fw_live_patch: Firmware live patching only.
>>>>>>>>> This level is not clear. Is this similar to flashing??
>>>>>>>>>
>>>>>>>>> Also I have a basic query. The reload command is split into
>>>>>>>>> reload_up/reload_down handlers (Please correct me if this behaviour is
>>>>>>>>> changed with this patchset). What if the vendor specific driver does
>>>>>>>>> not support up/down and needs only a single handler to fire a firmware
>>>>>>>>> reset or firmware live reset command?
>>>>>>>> In the "reload_down" handler, they would trigger the appropriate reset,
>>>>>>>> and quiesce anything that needs to be done. Then on reload up, it would
>>>>>>>> restore and bring up anything quiesced in the first stage.
>>>>>>> Yes, I got the "reload_down" and "reload_up". Similar to the device
>>>>>>> "remove" and "re-probe" respectively.
>>>>>>>
>>>>>>> But our requirement is a similar "ethtool reset" command, where
>>>>>>> ethtool calls a single callback in driver and driver just sends a
>>>>>>> firmware command for doing the reset. Once firmware receives the
>>>>>>> command, it will initiate the reset of driver and firmware entities
>>>>>>> asynchronously.
>>>>>> It is similar to mlx5 case here for fw_reset. The driver triggers the fw
>>>>>> command to reset and all PFs drivers gets events to handle and do
>>>>>> re-initialization.  To fit it to the devlink reload_down and reload_up,
>>>>>> I wait for the event handler to complete and it stops at driver unload
>>>>>> to have the driver up by devlink reload_up. See patch 8 in this patchset.
>>>>>>
>>>>> Yes, I see reload_down is triggering the reset. In our driver, after
>>>>> triggering the reset through a firmware command, reset is done in
>>>>> another context as the driver initiates the reset only after receiving
>>>>> an ASYNC event from the firmware.
>>>> Same here.
>>>>
>>>>> Probably, we have to use reload_down() to send firmware command to
>>>>> trigger reset and do nothing in reload_up.
>>>> I had that in previous version, but its wrong to use devlink reload this
>>>> way, so I added wait with timeout for the event handling to complete
>>>> before unload_down function ends. See mlx5_fw_wait_fw_reset_done(). Also
>>>> the event handler stops before load back to have that done by devlink
>>>> reload_up.
>>> But "devlink dev reload" will be invoked by the user only on a single
>>> dev handler and all function drivers will be re-instantiated upon the
>>> ASYNC event. reload_down and reload_up are invoked only the function
>>> which the user invoked.
>>>
>>> Take an example of a 2-port (PF0 and PF1) adapter on a single host and
>>> with some VFs loaded on the device. User invokes "devlink dev reload"
>>> on PF0, ASYNC event is received on 2 PFs and VFs for reset. All the
>>> function drivers will be re-instantiated including PF0.
>>>
>>> If we wait for some time in reload_down() of PF0 and then call load in
>>> reload_up(), this code will be different from other function drivers.
>>
>> I see your point here, but the user run devlink reload command on one
>> PF, in this case of fw-reset it will influence other PFs, but that's a
>> result of the fw-reset, the user if asked for params change or namespace
>> change that was for this PF.
> Right, if any driver is implementing only fw-reset have to leave
> reload_up as an empty function.


No, its not only up the driver. The netns option is implemented by 
devlink and its running between reload_down and reload_up.

>>>>>     And returning from reload
>>>>> does not mean that reset is complete as it is done in another context
>>>>> and the driver notifies the health reporter once the reset is
>>>>> complete. devlink framework may have to allow drivers to implement
>>>>> reload_down only to look more clean or call reload_up only if the
>>>>> driver notifies the devlink once reset is completed from another
>>>>> context. Please suggest.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-08-04 20:39                               ` Jakub Kicinski
@ 2020-08-05 11:02                                 ` Jiri Pirko
  2020-08-06 18:25                                   ` Jakub Kicinski
  0 siblings, 1 reply; 58+ messages in thread
From: Jiri Pirko @ 2020-08-05 11:02 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Moshe Shemesh, Jacob Keller, netdev, linux-kernel,
	David S. Miller, Jiri Pirko, Vasundhara Volam

Tue, Aug 04, 2020 at 10:39:46PM CEST, kuba@kernel.org wrote:
>On Tue, 4 Aug 2020 12:04:18 +0200 Jiri Pirko wrote:
>> Mon, Aug 03, 2020 at 10:57:03PM CEST, kuba@kernel.org wrote:
>> >I was trying to avoid having to provide a Cartesian product of
>> >operation and system disruption level, if any other action can
>> >be done "live" at some point.
>> >
>> >But no strong feelings about that one.
>> >
>> >Really, as long as there is no driver-specific defaults (or as 
>> >little driver-specific anything as possible) and user actions 
>> >are clearly expressed (fw-reset does not necessarily imply
>> >fw-activation) - the API will be fine IMO.  
>> 
>> Clear actions, that is what I'm fine with.
>> 
>> But not sure how you think we can achieve no driver-specific defaults.
>> We have them already :/ I don't think we can easily remove them and not
>> break user expectations.
>
>AFAIU the per-driver default is needed because we went too low 
>level with what the action constitutes. We need maintain the higher
>level actions.
>
>The user clearly did not care if FW was reset during devlink reload
>before this set, so what has changed? The objective user has is to

Well for mlxsw, the user is used to this flow:
devlink dev flash - flash new fw
devlink dev reload - new fw is activated and reset and driver instances
are re-created.


>activate their config / FW / move to different net ns. 
>
>Reloading the driver or resetting FW is a low level detail which
>achieves different things for different implementations. So it's 
>not a suitable abstraction -> IOW we need the driver default.

I'm confused. So you think we need the driver default?


>
>
>The work flow for the user is:
>
>0. download fw to /lib/firmware
>1. devlink flash $dev $fw
>2. if live activation is enabled
>   yes - devlink reload $dev $live-activate
>   no - report machine has to be drained for reboot
>
>fw-reset can't be $live-activate, because as Jake said fw-reset does
>not activate the new image for Intel. So will we end up per-driver
>defaults in the kernel space, and user space maintaining a mapping from

Well, that is what what is Moshe's proposal. Per-driver kernel default..
I'm not sure what we are arguing about then :/


>a driver to what a "level" of reset implies.
>
>I hope this makes things crystal clear. Please explain what problems
>you're seeing and extensions you're expecting. A list of user scenarios
>you foresee would be v. useful.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-08-05 11:02                                 ` Jiri Pirko
@ 2020-08-06 18:25                                   ` Jakub Kicinski
  2020-08-06 22:56                                     ` Jacob Keller
  2020-08-09 13:21                                     ` Moshe Shemesh
  0 siblings, 2 replies; 58+ messages in thread
From: Jakub Kicinski @ 2020-08-06 18:25 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Moshe Shemesh, Jacob Keller, netdev, linux-kernel,
	David S. Miller, Jiri Pirko, Vasundhara Volam

On Wed, 5 Aug 2020 13:02:58 +0200 Jiri Pirko wrote:
> Tue, Aug 04, 2020 at 10:39:46PM CEST, kuba@kernel.org wrote:
> >AFAIU the per-driver default is needed because we went too low 
> >level with what the action constitutes. We need maintain the higher
> >level actions.
> >
> >The user clearly did not care if FW was reset during devlink reload
> >before this set, so what has changed? The objective user has is to  
> 
> Well for mlxsw, the user is used to this flow:
> devlink dev flash - flash new fw
> devlink dev reload - new fw is activated and reset and driver instances
> are re-created.

Ugh, if the current behavior already implies fw-activation for some
drivers then the default has to probably be "do all the things" :S

> >activate their config / FW / move to different net ns. 
> >
> >Reloading the driver or resetting FW is a low level detail which
> >achieves different things for different implementations. So it's 
> >not a suitable abstraction -> IOW we need the driver default.  
> 
> I'm confused. So you think we need the driver default?

No, I'm talking about the state of this patch set. _In this patchset_ 
we need a driver default because of the unsuitable abstraction.

Better design would not require it.

> >The work flow for the user is:
> >
> >0. download fw to /lib/firmware
> >1. devlink flash $dev $fw
> >2. if live activation is enabled
> >   yes - devlink reload $dev $live-activate
> >   no - report machine has to be drained for reboot
> >
> >fw-reset can't be $live-activate, because as Jake said fw-reset does
> >not activate the new image for Intel. So will we end up per-driver
> >defaults in the kernel space, and user space maintaining a mapping from  
> 
> Well, that is what what is Moshe's proposal. Per-driver kernel default..
> I'm not sure what we are arguing about then :/

The fact that if I do a pure "driver reload" it will active new
firmware for mlxsw but not for mlx5. In this patchset for mlx5 I need
driver reload fw-reset. And for Intel there is no suitable option.

> >a driver to what a "level" of reset implies.
> >
> >I hope this makes things crystal clear. Please explain what problems
> >you're seeing and extensions you're expecting. A list of user scenarios
> >you foresee would be v. useful.  

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-08-06 18:25                                   ` Jakub Kicinski
@ 2020-08-06 22:56                                     ` Jacob Keller
  2020-08-09 13:21                                     ` Moshe Shemesh
  1 sibling, 0 replies; 58+ messages in thread
From: Jacob Keller @ 2020-08-06 22:56 UTC (permalink / raw)
  To: Jakub Kicinski, Jiri Pirko
  Cc: Moshe Shemesh, netdev, linux-kernel, David S. Miller, Jiri Pirko,
	Vasundhara Volam



On 8/6/2020 11:25 AM, Jakub Kicinski wrote:
> On Wed, 5 Aug 2020 13:02:58 +0200 Jiri Pirko wrote:
>> Tue, Aug 04, 2020 at 10:39:46PM CEST, kuba@kernel.org wrote:
>>> AFAIU the per-driver default is needed because we went too low 
>>> level with what the action constitutes. We need maintain the higher
>>> level actions.
>>>
>>> The user clearly did not care if FW was reset during devlink reload
>>> before this set, so what has changed? The objective user has is to  
>>
>> Well for mlxsw, the user is used to this flow:
>> devlink dev flash - flash new fw
>> devlink dev reload - new fw is activated and reset and driver instances
>> are re-created.
> 
> Ugh, if the current behavior already implies fw-activation for some
> drivers then the default has to probably be "do all the things" :S
> 
>>> activate their config / FW / move to different net ns. 
>>>
>>> Reloading the driver or resetting FW is a low level detail which
>>> achieves different things for different implementations. So it's 
>>> not a suitable abstraction -> IOW we need the driver default.  
>>
>> I'm confused. So you think we need the driver default?
> 
> No, I'm talking about the state of this patch set. _In this patchset_ 
> we need a driver default because of the unsuitable abstraction.
> 
> Better design would not require it.
> 
>>> The work flow for the user is:
>>>
>>> 0. download fw to /lib/firmware
>>> 1. devlink flash $dev $fw
>>> 2. if live activation is enabled
>>>   yes - devlink reload $dev $live-activate
>>>   no - report machine has to be drained for reboot
>>>
>>> fw-reset can't be $live-activate, because as Jake said fw-reset does
>>> not activate the new image for Intel. So will we end up per-driver
>>> defaults in the kernel space, and user space maintaining a mapping from  
>>
>> Well, that is what what is Moshe's proposal. Per-driver kernel default..
>> I'm not sure what we are arguing about then :/
> 
> The fact that if I do a pure "driver reload" it will active new
> firmware for mlxsw but not for mlx5. In this patchset for mlx5 I need
> driver reload fw-reset. And for Intel there is no suitable option.
> 

I want to clarify here, at least for ice: we *do* have a reset that can
activate firmware, but we have various level of reset which not all of
them do a fw activation. We have several levels of firmware reset,
including a "PF" reset that only resets data associated with that
function, a CORE reset which resets all functions, and then an EMP reset
which will activate the new firmware. For all of these resets, affected
PFs are notified over their firmware admin message queue. However, there
isn't a notion of negotiating beyond a message indicating what type of
reset is occurring.

I mostly wanted to clarify that "fw-reset" as a name doesn't necessarily
imply firmware activation. (hence separating fw-activate vs fw-reset).

Thanks,
Jake

>>> a driver to what a "level" of reset implies.
>>>
>>> I hope this makes things crystal clear. Please explain what problems
>>> you're seeing and extensions you're expecting. A list of user scenarios
>>> you foresee would be v. useful.  

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-08-06 18:25                                   ` Jakub Kicinski
  2020-08-06 22:56                                     ` Jacob Keller
@ 2020-08-09 13:21                                     ` Moshe Shemesh
  2020-08-10 16:53                                       ` Jakub Kicinski
  1 sibling, 1 reply; 58+ messages in thread
From: Moshe Shemesh @ 2020-08-09 13:21 UTC (permalink / raw)
  To: Jakub Kicinski, Jiri Pirko
  Cc: Jacob Keller, netdev, linux-kernel, David S. Miller, Jiri Pirko,
	Vasundhara Volam


On 8/6/2020 9:25 PM, Jakub Kicinski wrote:
> On Wed, 5 Aug 2020 13:02:58 +0200 Jiri Pirko wrote:
>> Tue, Aug 04, 2020 at 10:39:46PM CEST, kuba@kernel.org wrote:
>>> AFAIU the per-driver default is needed because we went too low
>>> level with what the action constitutes. We need maintain the higher
>>> level actions.
>>>
>>> The user clearly did not care if FW was reset during devlink reload
>>> before this set, so what has changed? The objective user has is to
>> Well for mlxsw, the user is used to this flow:
>> devlink dev flash - flash new fw
>> devlink dev reload - new fw is activated and reset and driver instances
>> are re-created.
> Ugh, if the current behavior already implies fw-activation for some
> drivers then the default has to probably be "do all the things" :S


Okay, so devlink reload default for mlx5 will include also fw-activate 
to align with mlxsw default.

Meaning drivers that supports fw-activate will add it to the default.

The flow of devlink reload default on mlx5 will be:

If there is FW image pending and live patch is suitable to apply, do 
live patch and driver re-initialization.

If there is FW image pending but live patch doesn't fit do fw-reset and 
driver-initialization.

If no FW image pending just do driver-initialization.


I still think I should on top of that add the level option to be 
selected by the user if he prefers a specific action, so the uAPI would be:

devlink dev reload [ netns { PID | NAME | ID } ] [ level { fw-live-patch 
| driver-reinit |fw-activate } ]

But I am still missing something: fw-activate implies that it will 
activate a new FW image stored on flash, pending activation. What if the 
user wants to reset and reload the FW if no new FW pending ? Should we 
add --force option to fw-activate level ?


>>> activate their config / FW / move to different net ns.
>>>
>>> Reloading the driver or resetting FW is a low level detail which
>>> achieves different things for different implementations. So it's
>>> not a suitable abstraction -> IOW we need the driver default.
>> I'm confused. So you think we need the driver default?
> No, I'm talking about the state of this patch set. _In this patchset_
> we need a driver default because of the unsuitable abstraction.
>
> Better design would not require it.
>
>>> The work flow for the user is:
>>>
>>> 0. download fw to /lib/firmware
>>> 1. devlink flash $dev $fw
>>> 2. if live activation is enabled
>>>    yes - devlink reload $dev $live-activate
>>>    no - report machine has to be drained for reboot
>>>
>>> fw-reset can't be $live-activate, because as Jake said fw-reset does
>>> not activate the new image for Intel. So will we end up per-driver
>>> defaults in the kernel space, and user space maintaining a mapping from
>> Well, that is what what is Moshe's proposal. Per-driver kernel default..
>> I'm not sure what we are arguing about then :/
> The fact that if I do a pure "driver reload" it will active new
> firmware for mlxsw but not for mlx5. In this patchset for mlx5 I need
> driver reload fw-reset. And for Intel there is no suitable option.
>
>>> a driver to what a "level" of reset implies.
>>>
>>> I hope this makes things crystal clear. Please explain what problems
>>> you're seeing and extensions you're expecting. A list of user scenarios
>>> you foresee would be v. useful.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-08-09 13:21                                     ` Moshe Shemesh
@ 2020-08-10 16:53                                       ` Jakub Kicinski
  2020-08-10 17:09                                         ` Jacob Keller
  2020-08-11  5:46                                         ` Jiri Pirko
  0 siblings, 2 replies; 58+ messages in thread
From: Jakub Kicinski @ 2020-08-10 16:53 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: Jiri Pirko, Jacob Keller, netdev, linux-kernel, David S. Miller,
	Jiri Pirko, Vasundhara Volam

On Sun, 9 Aug 2020 16:21:29 +0300 Moshe Shemesh wrote:
> Okay, so devlink reload default for mlx5 will include also fw-activate 
> to align with mlxsw default.
> 
> Meaning drivers that supports fw-activate will add it to the default.

No per-driver default.

Maybe the difference between mlxsw and mlx5 can be simply explained by
the fact that mlxsw loads firmware from /lib/firmware on every probe
(more or less).

It's only natural for a driver which loads FW from disk to load it on
driver reload.

> The flow of devlink reload default on mlx5 will be:
> 
> If there is FW image pending and live patch is suitable to apply, do 
> live patch and driver re-initialization.
> 
> If there is FW image pending but live patch doesn't fit do fw-reset and 
> driver-initialization.
> 
> If no FW image pending just do driver-initialization.

This sounds too complicated. Don't try to guess what the user wants.

> I still think I should on top of that add the level option to be 
> selected by the user if he prefers a specific action, so the uAPI would be:
> 
> devlink dev reload [ netns { PID | NAME | ID } ] [ level { fw-live-patch 
> | driver-reinit |fw-activate } ]

I'm all for the level/action.

> But I am still missing something: fw-activate implies that it will 
> activate a new FW image stored on flash, pending activation. What if the 
> user wants to reset and reload the FW if no new FW pending ? Should we 
> add --force option to fw-activate level ?

Since reload does not check today if anything changed - i.e. if reload
is actually needed, neither should fw-activate, IMO. I'd expect the
"--force behavior" to be the default.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-08-10 16:53                                       ` Jakub Kicinski
@ 2020-08-10 17:09                                         ` Jacob Keller
  2020-08-10 18:17                                           ` Jakub Kicinski
  2020-08-11  5:46                                         ` Jiri Pirko
  1 sibling, 1 reply; 58+ messages in thread
From: Jacob Keller @ 2020-08-10 17:09 UTC (permalink / raw)
  To: Jakub Kicinski, Moshe Shemesh
  Cc: Jiri Pirko, netdev, linux-kernel, David S. Miller, Jiri Pirko,
	Vasundhara Volam



On 8/10/2020 9:53 AM, Jakub Kicinski wrote:
> On Sun, 9 Aug 2020 16:21:29 +0300 Moshe Shemesh wrote:
>> Okay, so devlink reload default for mlx5 will include also fw-activate 
>> to align with mlxsw default.
>>
>> Meaning drivers that supports fw-activate will add it to the default.
> 
> No per-driver default.
> 
> Maybe the difference between mlxsw and mlx5 can be simply explained by
> the fact that mlxsw loads firmware from /lib/firmware on every probe
> (more or less).
> 
> It's only natural for a driver which loads FW from disk to load it on
> driver reload.
> 

This seems reasonable to me as long as the drivers document this
behavior in their devlink/<driver>.rst. We shouldn't change existing
behavior. One could argue that this difference in behavior amounts to a
"driver default"... but I agree that we shouldn't enshrine that in the
interface.


>> The flow of devlink reload default on mlx5 will be:
>>
>> If there is FW image pending and live patch is suitable to apply, do 
>> live patch and driver re-initialization.
>>
>> If there is FW image pending but live patch doesn't fit do fw-reset and 
>> driver-initialization.
>>
>> If no FW image pending just do driver-initialization.
> 
> This sounds too complicated. Don't try to guess what the user wants.
> >> I still think I should on top of that add the level option to be
>> selected by the user if he prefers a specific action, so the uAPI would be:
>>
>> devlink dev reload [ netns { PID | NAME | ID } ] [ level { fw-live-patch 
>> | driver-reinit |fw-activate } ]
> 
> I'm all for the level/action.
> 

Yep, same here.

>> But I am still missing something: fw-activate implies that it will 
>> activate a new FW image stored on flash, pending activation. What if the 
>> user wants to reset and reload the FW if no new FW pending ? Should we 
>> add --force option to fw-activate level ?
> 
> Since reload does not check today if anything changed - i.e. if reload
> is actually needed, neither should fw-activate, IMO. I'd expect the
> "--force behavior" to be the default.
> 

Yep. What about if there is HW/FW that can't initiate the fw-activate
reset unless there is a pending update? I think ice firmware might
respond to the "please reset/activate" command with a specific status
code indicating that no update was pending.

I think the simplest solution is to just interpret this as a success.
Alternatively we could report a specific error to inform user that no
activation took place?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-08-10 17:09                                         ` Jacob Keller
@ 2020-08-10 18:17                                           ` Jakub Kicinski
  0 siblings, 0 replies; 58+ messages in thread
From: Jakub Kicinski @ 2020-08-10 18:17 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Moshe Shemesh, Jiri Pirko, netdev, linux-kernel, David S. Miller,
	Jiri Pirko, Vasundhara Volam

On Mon, 10 Aug 2020 10:09:20 -0700 Jacob Keller wrote:
> >> But I am still missing something: fw-activate implies that it will 
> >> activate a new FW image stored on flash, pending activation. What if the 
> >> user wants to reset and reload the FW if no new FW pending ? Should we 
> >> add --force option to fw-activate level ?  
> > 
> > Since reload does not check today if anything changed - i.e. if reload
> > is actually needed, neither should fw-activate, IMO. I'd expect the
> > "--force behavior" to be the default.
> >   
> 
> Yep. What about if there is HW/FW that can't initiate the fw-activate
> reset unless there is a pending update? I think ice firmware might
> respond to the "please reset/activate" command with a specific status
> code indicating that no update was pending.
> 
> I think the simplest solution is to just interpret this as a success.
> Alternatively we could report a specific error to inform user that no
> activation took place?

I'd do EOPNOTSUPP + extack.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command
  2020-08-10 16:53                                       ` Jakub Kicinski
  2020-08-10 17:09                                         ` Jacob Keller
@ 2020-08-11  5:46                                         ` Jiri Pirko
  1 sibling, 0 replies; 58+ messages in thread
From: Jiri Pirko @ 2020-08-11  5:46 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Moshe Shemesh, Jacob Keller, netdev, linux-kernel,
	David S. Miller, Jiri Pirko, Vasundhara Volam

Mon, Aug 10, 2020 at 06:53:05PM CEST, kuba@kernel.org wrote:
>On Sun, 9 Aug 2020 16:21:29 +0300 Moshe Shemesh wrote:
>> Okay, so devlink reload default for mlx5 will include also fw-activate 
>> to align with mlxsw default.
>> 
>> Meaning drivers that supports fw-activate will add it to the default.
>
>No per-driver default.
>
>Maybe the difference between mlxsw and mlx5 can be simply explained by
>the fact that mlxsw loads firmware from /lib/firmware on every probe
>(more or less).
>
>It's only natural for a driver which loads FW from disk to load it on
>driver reload.

We don't load it on reaload... We just do reset witn activation.

>
>> The flow of devlink reload default on mlx5 will be:
>> 
>> If there is FW image pending and live patch is suitable to apply, do 
>> live patch and driver re-initialization.
>> 
>> If there is FW image pending but live patch doesn't fit do fw-reset and 
>> driver-initialization.
>> 
>> If no FW image pending just do driver-initialization.
>
>This sounds too complicated. Don't try to guess what the user wants.
>
>> I still think I should on top of that add the level option to be 
>> selected by the user if he prefers a specific action, so the uAPI would be:
>> 
>> devlink dev reload [ netns { PID | NAME | ID } ] [ level { fw-live-patch 
>> | driver-reinit |fw-activate } ]
>
>I'm all for the level/action.
>
>> But I am still missing something: fw-activate implies that it will 
>> activate a new FW image stored on flash, pending activation. What if the 
>> user wants to reset and reload the FW if no new FW pending ? Should we 
>> add --force option to fw-activate level ?
>
>Since reload does not check today if anything changed - i.e. if reload
>is actually needed, neither should fw-activate, IMO. I'd expect the
>"--force behavior" to be the default.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next RFC 00/13] Add devlink reload level option
  2020-08-05  8:20                   ` Moshe Shemesh
@ 2020-08-12  9:34                     ` Vasundhara Volam
  0 siblings, 0 replies; 58+ messages in thread
From: Vasundhara Volam @ 2020-08-12  9:34 UTC (permalink / raw)
  To: Moshe Shemesh
  Cc: Jacob Keller, David S. Miller, Jiri Pirko, Netdev, open list

On Wed, Aug 5, 2020 at 1:51 PM Moshe Shemesh <moshe@mellanox.com> wrote:
>
>
> On 8/5/2020 9:55 AM, Vasundhara Volam wrote:
> > On Wed, Aug 5, 2020 at 12:02 PM Moshe Shemesh <moshe@mellanox.com> wrote:
> >>
> >> On 8/4/2020 1:13 PM, Vasundhara Volam wrote:
> >>> On Mon, Aug 3, 2020 at 7:23 PM Moshe Shemesh <moshe@mellanox.com> wrote:
> >>>> On 8/3/2020 3:47 PM, Vasundhara Volam wrote:
> >>>>> On Mon, Aug 3, 2020 at 5:47 PM Moshe Shemesh <moshe@mellanox.com> wrote:
> >>>>>> On 8/3/2020 1:24 PM, Vasundhara Volam wrote:
> >>>>>>> On Tue, Jul 28, 2020 at 10:13 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
> >>>>>>>> On 7/27/2020 10:25 PM, Vasundhara Volam wrote:
> >>>>>>>>> On Mon, Jul 27, 2020 at 4:36 PM Moshe Shemesh <moshe@mellanox.com> wrote:
> >>>>>>>>>> Introduce new option on devlink reload API to enable the user to select the
> >>>>>>>>>> reload level required. Complete support for all levels in mlx5.
> >>>>>>>>>> The following reload levels are supported:
> >>>>>>>>>>       driver: Driver entities re-instantiation only.
> >>>>>>>>>>       fw_reset: Firmware reset and driver entities re-instantiation.
> >>>>>>>>> The Name is a little confusing. I think it should be renamed to
> >>>>>>>>> fw_live_reset (in which both firmware and driver entities are
> >>>>>>>>> re-instantiated).  For only fw_reset, the driver should not undergo
> >>>>>>>>> reset (it requires a driver reload for firmware to undergo reset).
> >>>>>>>>>
> >>>>>>>> So, I think the differentiation here is that "live_patch" doesn't reset
> >>>>>>>> anything.
> >>>>>>> This seems similar to flashing the firmware and does not reset anything.
> >>>>>> The live patch is activating fw change without reset.
> >>>>>>
> >>>>>> It is not suitable for any fw change but fw gaps which don't require reset.
> >>>>>>
> >>>>>> I can query the fw to check if the pending image change is suitable or
> >>>>>> require fw reset.
> >>>>> Okay.
> >>>>>>>>>>       fw_live_patch: Firmware live patching only.
> >>>>>>>>> This level is not clear. Is this similar to flashing??
> >>>>>>>>>
> >>>>>>>>> Also I have a basic query. The reload command is split into
> >>>>>>>>> reload_up/reload_down handlers (Please correct me if this behaviour is
> >>>>>>>>> changed with this patchset). What if the vendor specific driver does
> >>>>>>>>> not support up/down and needs only a single handler to fire a firmware
> >>>>>>>>> reset or firmware live reset command?
> >>>>>>>> In the "reload_down" handler, they would trigger the appropriate reset,
> >>>>>>>> and quiesce anything that needs to be done. Then on reload up, it would
> >>>>>>>> restore and bring up anything quiesced in the first stage.
> >>>>>>> Yes, I got the "reload_down" and "reload_up". Similar to the device
> >>>>>>> "remove" and "re-probe" respectively.
> >>>>>>>
> >>>>>>> But our requirement is a similar "ethtool reset" command, where
> >>>>>>> ethtool calls a single callback in driver and driver just sends a
> >>>>>>> firmware command for doing the reset. Once firmware receives the
> >>>>>>> command, it will initiate the reset of driver and firmware entities
> >>>>>>> asynchronously.
> >>>>>> It is similar to mlx5 case here for fw_reset. The driver triggers the fw
> >>>>>> command to reset and all PFs drivers gets events to handle and do
> >>>>>> re-initialization.  To fit it to the devlink reload_down and reload_up,
> >>>>>> I wait for the event handler to complete and it stops at driver unload
> >>>>>> to have the driver up by devlink reload_up. See patch 8 in this patchset.
> >>>>>>
> >>>>> Yes, I see reload_down is triggering the reset. In our driver, after
> >>>>> triggering the reset through a firmware command, reset is done in
> >>>>> another context as the driver initiates the reset only after receiving
> >>>>> an ASYNC event from the firmware.
> >>>> Same here.
> >>>>
> >>>>> Probably, we have to use reload_down() to send firmware command to
> >>>>> trigger reset and do nothing in reload_up.
> >>>> I had that in previous version, but its wrong to use devlink reload this
> >>>> way, so I added wait with timeout for the event handling to complete
> >>>> before unload_down function ends. See mlx5_fw_wait_fw_reset_done(). Also
> >>>> the event handler stops before load back to have that done by devlink
> >>>> reload_up.
> >>> But "devlink dev reload" will be invoked by the user only on a single
> >>> dev handler and all function drivers will be re-instantiated upon the
> >>> ASYNC event. reload_down and reload_up are invoked only the function
> >>> which the user invoked.
> >>>
> >>> Take an example of a 2-port (PF0 and PF1) adapter on a single host and
> >>> with some VFs loaded on the device. User invokes "devlink dev reload"
> >>> on PF0, ASYNC event is received on 2 PFs and VFs for reset. All the
> >>> function drivers will be re-instantiated including PF0.
> >>>
> >>> If we wait for some time in reload_down() of PF0 and then call load in
> >>> reload_up(), this code will be different from other function drivers.
> >>
> >> I see your point here, but the user run devlink reload command on one
> >> PF, in this case of fw-reset it will influence other PFs, but that's a
> >> result of the fw-reset, the user if asked for params change or namespace
> >> change that was for this PF.
> > Right, if any driver is implementing only fw-reset have to leave
> > reload_up as an empty function.
>
>
> No, its not only up the driver. The netns option is implemented by
> devlink and its running between reload_down and reload_up.
What I mean is, driver will provide a reload_up handler but it will
not do anything and simply return 0.
>
> >>>>>     And returning from reload
> >>>>> does not mean that reset is complete as it is done in another context
> >>>>> and the driver notifies the health reporter once the reset is
> >>>>> complete. devlink framework may have to allow drivers to implement
> >>>>> reload_down only to look more clean or call reload_up only if the
> >>>>> driver notifies the devlink once reset is completed from another
> >>>>> context. Please suggest.

^ permalink raw reply	[flat|nested] 58+ messages in thread

end of thread, back to index

Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-27 11:02 [PATCH net-next RFC 00/13] Add devlink reload level option Moshe Shemesh
2020-07-27 11:02 ` [PATCH net-next RFC 01/13] devlink: Add reload level option to devlink reload command Moshe Shemesh
2020-07-28  0:58   ` Jakub Kicinski
2020-07-28 13:58     ` Jiri Pirko
2020-07-28 16:47       ` Jacob Keller
2020-07-28 18:44         ` Jakub Kicinski
2020-07-28 19:18           ` Jacob Keller
2020-07-28 20:06             ` Jakub Kicinski
2020-07-29 14:54               ` Moshe Shemesh
2020-07-29 21:07                 ` Jakub Kicinski
2020-07-30 12:30                   ` Moshe Shemesh
2020-07-30 23:11                     ` Jakub Kicinski
2020-08-01 21:32                       ` Moshe Shemesh
2020-08-03 14:14                         ` Jiri Pirko
2020-08-03 20:57                           ` Jakub Kicinski
2020-08-04 10:04                             ` Jiri Pirko
2020-08-04 20:39                               ` Jakub Kicinski
2020-08-05 11:02                                 ` Jiri Pirko
2020-08-06 18:25                                   ` Jakub Kicinski
2020-08-06 22:56                                     ` Jacob Keller
2020-08-09 13:21                                     ` Moshe Shemesh
2020-08-10 16:53                                       ` Jakub Kicinski
2020-08-10 17:09                                         ` Jacob Keller
2020-08-10 18:17                                           ` Jakub Kicinski
2020-08-11  5:46                                         ` Jiri Pirko
2020-07-27 11:02 ` [PATCH net-next RFC 02/13] devlink: Add reload levels data to dev get Moshe Shemesh
2020-07-28  0:58   ` Jakub Kicinski
2020-07-29 14:37     ` Moshe Shemesh
2020-07-29 21:11       ` Jakub Kicinski
2020-07-30 12:05         ` Moshe Shemesh
2020-07-27 11:02 ` [PATCH net-next RFC 03/13] net/mlx5: Add functions to set/query MFRL register Moshe Shemesh
2020-07-27 11:02 ` [PATCH net-next RFC 04/13] net/mlx5: Set cap for pci sync for fw update event Moshe Shemesh
2020-07-27 11:02 ` [PATCH net-next RFC 05/13] net/mlx5: Handle sync reset request event Moshe Shemesh
2020-07-27 11:02 ` [PATCH net-next RFC 06/13] net/mlx5: Handle sync reset now event Moshe Shemesh
2020-07-27 11:02 ` [PATCH net-next RFC 07/13] net/mlx5: Handle sync reset abort event Moshe Shemesh
2020-07-27 11:02 ` [PATCH net-next RFC 08/13] net/mlx5: Add support for devlink reload level fw reset Moshe Shemesh
2020-07-27 11:02 ` [PATCH net-next RFC 09/13] devlink: Add enable_remote_dev_reset generic parameter Moshe Shemesh
2020-07-28  0:59   ` Jakub Kicinski
2020-07-29 14:42     ` Moshe Shemesh
2020-07-29 20:57       ` Jakub Kicinski
2020-07-30 12:08         ` Moshe Shemesh
2020-07-27 11:02 ` [PATCH net-next RFC 10/13] net/mlx5: Add devlink param enable_remote_dev_reset support Moshe Shemesh
2020-07-28  0:59   ` Jakub Kicinski
2020-07-27 11:02 ` [PATCH net-next RFC 11/13] net/mlx5: Add support for fw live patch event Moshe Shemesh
2020-07-27 11:02 ` [PATCH net-next RFC 12/13] net/mlx5: Add support for devlink reload level live patch Moshe Shemesh
2020-07-27 11:02 ` [PATCH net-next RFC 13/13] devlink: Add Documentation/networking/devlink/devlink-reload.rst Moshe Shemesh
2020-07-28  5:25 ` [PATCH net-next RFC 00/13] Add devlink reload level option Vasundhara Volam
2020-07-28 16:43   ` Jacob Keller
2020-08-03 10:24     ` Vasundhara Volam
2020-08-03 12:17       ` Moshe Shemesh
2020-08-03 12:47         ` Vasundhara Volam
2020-08-03 13:52           ` Moshe Shemesh
2020-08-04 10:13             ` Vasundhara Volam
2020-08-05  6:32               ` Moshe Shemesh
2020-08-05  6:55                 ` Vasundhara Volam
2020-08-05  8:20                   ` Moshe Shemesh
2020-08-12  9:34                     ` Vasundhara Volam
2020-07-28 16:37 ` Jacob Keller

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git