netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [net-next] devlink: move request_firmware out of driver
@ 2020-11-13  0:01 Jacob Keller
  2020-11-13 21:12 ` Jakub Kicinski
  0 siblings, 1 reply; 7+ messages in thread
From: Jacob Keller @ 2020-11-13  0:01 UTC (permalink / raw)
  To: netdev
  Cc: Jacob Keller, Jiri Pirko, Michael Chan, Shannon Nelson,
	Saeed Mahameed, Boris Pismenny, Bin Luo, Jakub Kicinksi

All drivers which implement the devlink flash update support, with the
exception of netdevsim, use either request_firmware or
request_firmware_direct to locate the firmware file. Rather than having
each driver do this separately as part of its .flash_update
implementation, perform the request_firmware within net/core/devlink.c

Replace the file_name paramter in the struct devlink_flash_update_params
with a pointer to the fw object.

Use request_firmware rather than request_firmware_direct. Although most
Linux distributions today do not have the fallback mechanism
implemented, only about half the drivers used the _direct request, as
compared to the generic request_firmware. In the event that
a distribution does support the fallback mechanism, the devlink flash
update ought to be able to use it to provide the firmware contents. For
distributions which do not support the fallback userspace mechanism,
there should be essentially no difference between request_firmware and
request_firmware_direct.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiri Pirko <jiri@nvidia.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: Shannon Nelson <snelson@pensando.io>
Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: Boris Pismenny <borisp@nvidia.com>
Cc: Bin Luo <luobin9@huawei.com>
Cc: Jakub Kicinksi <kuba@kernel.org>
---
This is a follow to the discussion that took place at [1]. After reading
through the docs for request_firmware vs request_firmware_direct, I believe
that the net/core/devlink.c should be using request_firmware. While it is
true that no distribution supports this today, it seems like we shouldn't
rule it out entirely here. I'm willing to change this if we think it's not
worth bothering to support it.

Note that I have only compile-tested the drivers other than ice, as I do not
have hw for them. The only tricky transformation was in the bnxt driver
which shares code with the ethtool implementation. The rest were pretty
straight forward transformations.

One other thing came to my attention while working on this and while
discussing the ice devlink support with colleagues: the userspace devlink
program doesn't really indicate that the flash file must be located in the
firmware search path (usually /lib/firmware/*). It is probably worth some
effort to make the userspace tool error out more clearly when the file can't
be found.

[1] https://lore.kernel.org/netdev/5fe24aae-6401-c879-b235-a12c1416d00b@intel.com/

 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c |  2 +-
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 33 ++++++++++++-------
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.h |  4 +--
 .../net/ethernet/huawei/hinic/hinic_devlink.c | 12 +------
 drivers/net/ethernet/intel/ice/ice_devlink.c  | 13 +-------
 .../net/ethernet/mellanox/mlx5/core/devlink.c | 10 +-----
 drivers/net/ethernet/mellanox/mlxsw/core.c    | 11 +------
 .../net/ethernet/netronome/nfp/nfp_devlink.c  |  2 +-
 drivers/net/ethernet/netronome/nfp/nfp_main.c | 17 ++--------
 drivers/net/ethernet/netronome/nfp/nfp_main.h |  2 +-
 .../ethernet/pensando/ionic/ionic_devlink.c   |  2 +-
 .../ethernet/pensando/ionic/ionic_devlink.h   |  2 +-
 .../net/ethernet/pensando/ionic/ionic_fw.c    | 12 +------
 include/net/devlink.h                         |  7 ++--
 net/core/devlink.c                            | 26 ++++++++++++---
 15 files changed, 61 insertions(+), 94 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
index 184b6d0513b2..4ebae8a236fd 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
@@ -32,7 +32,7 @@ bnxt_dl_flash_update(struct devlink *dl,
 
 	devlink_flash_update_begin_notify(dl);
 	devlink_flash_update_status_notify(dl, "Preparing to flash", NULL, 0, 0);
-	rc = bnxt_flash_package_from_file(bp->dev, params->file_name, 0);
+	rc = bnxt_flash_package_from_fw_obj(bp->dev, params->fw, 0);
 	if (!rc)
 		devlink_flash_update_status_notify(dl, "Flashing done", NULL, 0, 0);
 	else
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 53687bc7fcf5..91e73aedcdff 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2416,13 +2416,12 @@ static int bnxt_flash_firmware_from_file(struct net_device *dev,
 	return rc;
 }
 
-int bnxt_flash_package_from_file(struct net_device *dev, const char *filename,
-				 u32 install_type)
+int bnxt_flash_package_from_fw_obj(struct net_device *dev, const struct firmware *fw,
+				   u32 install_type)
 {
 	struct bnxt *bp = netdev_priv(dev);
 	struct hwrm_nvm_install_update_output *resp = bp->hwrm_cmd_resp_addr;
 	struct hwrm_nvm_install_update_input install = {0};
-	const struct firmware *fw;
 	u32 item_len;
 	int rc = 0;
 	u16 index;
@@ -2437,13 +2436,6 @@ int bnxt_flash_package_from_file(struct net_device *dev, const char *filename,
 		return rc;
 	}
 
-	rc = request_firmware(&fw, filename, &dev->dev);
-	if (rc != 0) {
-		netdev_err(dev, "PKG error %d requesting file: %s\n",
-			   rc, filename);
-		return rc;
-	}
-
 	if (fw->size > item_len) {
 		netdev_err(dev, "PKG insufficient update area in nvram: %lu\n",
 			   (unsigned long)fw->size);
@@ -2475,7 +2467,6 @@ int bnxt_flash_package_from_file(struct net_device *dev, const char *filename,
 					  dma_handle);
 		}
 	}
-	release_firmware(fw);
 	if (rc)
 		goto err_exit;
 
@@ -2514,6 +2505,26 @@ int bnxt_flash_package_from_file(struct net_device *dev, const char *filename,
 	return rc;
 }
 
+static int bnxt_flash_package_from_file(struct net_device *dev, const char *filename,
+					u32 install_type)
+{
+	const struct firmware *fw;
+	int rc;
+
+	rc = request_firmware(&fw, filename, &dev->dev);
+	if (rc != 0) {
+		netdev_err(dev, "PKG error %d requesting file: %s\n",
+			   rc, filename);
+		return rc;
+	}
+
+	rc = bnxt_flash_package_from_fw_obj(dev, fw, install_type);
+
+	release_firmware(fw);
+
+	return rc;
+}
+
 static int bnxt_flash_device(struct net_device *dev,
 			     struct ethtool_flash *flash)
 {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h
index fa6fbde52bea..0a57cb6a4a4b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h
@@ -94,8 +94,8 @@ u32 bnxt_fw_to_ethtool_speed(u16);
 u16 bnxt_get_fw_auto_link_speeds(u32);
 int bnxt_hwrm_nvm_get_dev_info(struct bnxt *bp,
 			       struct hwrm_nvm_get_dev_info_output *nvm_dev_info);
-int bnxt_flash_package_from_file(struct net_device *dev, const char *filename,
-				 u32 install_type);
+int bnxt_flash_package_from_fw_obj(struct net_device *dev, const struct firmware *fw,
+				   u32 install_type);
 void bnxt_ethtool_init(struct bnxt *bp);
 void bnxt_ethtool_free(struct bnxt *bp);
 
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_devlink.c b/drivers/net/ethernet/huawei/hinic/hinic_devlink.c
index 2630d667f393..58d5646444b0 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_devlink.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_devlink.c
@@ -285,18 +285,8 @@ static int hinic_devlink_flash_update(struct devlink *devlink,
 				      struct netlink_ext_ack *extack)
 {
 	struct hinic_devlink_priv *priv = devlink_priv(devlink);
-	const struct firmware *fw;
-	int err;
 
-	err = request_firmware_direct(&fw, params->file_name,
-				      &priv->hwdev->hwif->pdev->dev);
-	if (err)
-		return err;
-
-	err = hinic_firmware_update(priv, fw, extack);
-	release_firmware(fw);
-
-	return err;
+	return hinic_firmware_update(priv, params->fw, extack);
 }
 
 static const struct devlink_ops hinic_devlink_ops = {
diff --git a/drivers/net/ethernet/intel/ice/ice_devlink.c b/drivers/net/ethernet/intel/ice/ice_devlink.c
index 511da59bd6f2..348212cb23b0 100644
--- a/drivers/net/ethernet/intel/ice/ice_devlink.c
+++ b/drivers/net/ethernet/intel/ice/ice_devlink.c
@@ -249,7 +249,6 @@ ice_devlink_flash_update(struct devlink *devlink,
 	struct ice_pf *pf = devlink_priv(devlink);
 	struct device *dev = &pf->pdev->dev;
 	struct ice_hw *hw = &pf->hw;
-	const struct firmware *fw;
 	u8 preservation;
 	int err;
 
@@ -277,21 +276,11 @@ ice_devlink_flash_update(struct devlink *devlink,
 	if (err)
 		return err;
 
-	err = request_firmware(&fw, params->file_name, dev);
-	if (err) {
-		NL_SET_ERR_MSG_MOD(extack, "Unable to read file from disk");
-		return err;
-	}
-
-	dev_dbg(dev, "Beginning flash update with file '%s'\n", params->file_name);
-
 	devlink_flash_update_begin_notify(devlink);
 	devlink_flash_update_status_notify(devlink, "Preparing to flash", NULL, 0, 0);
-	err = ice_flash_pldm_image(pf, fw, preservation, extack);
+	err = ice_flash_pldm_image(pf, params->fw, preservation, extack);
 	devlink_flash_update_end_notify(devlink);
 
-	release_firmware(fw);
-
 	return err;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
index a28f95df2901..b22c692ccda4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
@@ -14,16 +14,8 @@ static int mlx5_devlink_flash_update(struct devlink *devlink,
 {
 	struct mlx5_core_dev *dev = devlink_priv(devlink);
 	const struct firmware *fw;
-	int err;
 
-	err = request_firmware_direct(&fw, params->file_name, &dev->pdev->dev);
-	if (err)
-		return err;
-
-	err = mlx5_firmware_flash(dev, fw, extack);
-	release_firmware(fw);
-
-	return err;
+	return mlx5_firmware_flash(dev, params->fw, extack);
 }
 
 static u8 mlx5_fw_ver_major(u32 version)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index 937b8e46f8c7..88b751470c58 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -1116,16 +1116,7 @@ static int mlxsw_core_fw_flash_update(struct mlxsw_core *mlxsw_core,
 				      struct devlink_flash_update_params *params,
 				      struct netlink_ext_ack *extack)
 {
-	const struct firmware *firmware;
-	int err;
-
-	err = request_firmware_direct(&firmware, params->file_name, mlxsw_core->bus_info->dev);
-	if (err)
-		return err;
-	err = mlxsw_core_fw_flash(mlxsw_core, firmware, extack);
-	release_firmware(firmware);
-
-	return err;
+	return mlxsw_core_fw_flash(mlxsw_core, params->fw, extack);
 }
 
 static int mlxsw_core_devlink_param_fw_load_policy_validate(struct devlink *devlink, u32 id,
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
index 97d2b03208de..713ee3041d49 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
@@ -333,7 +333,7 @@ nfp_devlink_flash_update(struct devlink *devlink,
 			 struct devlink_flash_update_params *params,
 			 struct netlink_ext_ack *extack)
 {
-	return nfp_flash_update_common(devlink_priv(devlink), params->file_name, extack);
+	return nfp_flash_update_common(devlink_priv(devlink), params->fw, extack);
 }
 
 const struct devlink_ops nfp_devlink_ops = {
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_main.c b/drivers/net/ethernet/netronome/nfp/nfp_main.c
index 7ff2ccbd43b0..d0835af1a577 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_main.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_main.c
@@ -301,11 +301,10 @@ static int nfp_pcie_sriov_configure(struct pci_dev *pdev, int num_vfs)
 		return nfp_pcie_sriov_enable(pdev, num_vfs);
 }
 
-int nfp_flash_update_common(struct nfp_pf *pf, const char *path,
+int nfp_flash_update_common(struct nfp_pf *pf, const struct firmware *fw,
 			    struct netlink_ext_ack *extack)
 {
 	struct device *dev = &pf->pdev->dev;
-	const struct firmware *fw;
 	struct nfp_nsp *nsp;
 	int err;
 
@@ -319,24 +318,12 @@ int nfp_flash_update_common(struct nfp_pf *pf, const char *path,
 		return err;
 	}
 
-	err = request_firmware_direct(&fw, path, dev);
-	if (err) {
-		NL_SET_ERR_MSG_MOD(extack,
-				   "unable to read flash file from disk");
-		goto exit_close_nsp;
-	}
-
-	dev_info(dev, "Please be patient while writing flash image: %s\n",
-		 path);
-
 	err = nfp_nsp_write_flash(nsp, fw);
 	if (err < 0)
-		goto exit_release_fw;
+		goto exit_close_nsp;
 	dev_info(dev, "Finished writing flash image\n");
 	err = 0;
 
-exit_release_fw:
-	release_firmware(fw);
 exit_close_nsp:
 	nfp_nsp_close(nsp);
 	return err;
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_main.h b/drivers/net/ethernet/netronome/nfp/nfp_main.h
index fa6b13a05941..a7dede946a33 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_main.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_main.h
@@ -166,7 +166,7 @@ nfp_pf_map_rtsym(struct nfp_pf *pf, const char *name, const char *sym_fmt,
 		 unsigned int min_size, struct nfp_cpp_area **area);
 int nfp_mbox_cmd(struct nfp_pf *pf, u32 cmd, void *in_data, u64 in_length,
 		 void *out_data, u64 out_length);
-int nfp_flash_update_common(struct nfp_pf *pf, const char *path,
+int nfp_flash_update_common(struct nfp_pf *pf, const struct firmware *fw,
 			    struct netlink_ext_ack *extack);
 
 enum nfp_dump_diag {
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_devlink.c b/drivers/net/ethernet/pensando/ionic/ionic_devlink.c
index 51d64718ed9f..b41301a5b0df 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_devlink.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_devlink.c
@@ -15,7 +15,7 @@ static int ionic_dl_flash_update(struct devlink *dl,
 {
 	struct ionic *ionic = devlink_priv(dl);
 
-	return ionic_firmware_update(ionic->lif, params->file_name, extack);
+	return ionic_firmware_update(ionic->lif, params->fw, extack);
 }
 
 static int ionic_dl_info_get(struct devlink *dl, struct devlink_info_req *req,
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_devlink.h b/drivers/net/ethernet/pensando/ionic/ionic_devlink.h
index 5c01a9e306d8..0a77e8e810c5 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_devlink.h
+++ b/drivers/net/ethernet/pensando/ionic/ionic_devlink.h
@@ -6,7 +6,7 @@
 
 #include <net/devlink.h>
 
-int ionic_firmware_update(struct ionic_lif *lif, const char *fw_name,
+int ionic_firmware_update(struct ionic_lif *lif, const struct firmware *fw,
 			  struct netlink_ext_ack *extack);
 
 struct ionic *ionic_devlink_alloc(struct device *dev);
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_fw.c b/drivers/net/ethernet/pensando/ionic/ionic_fw.c
index d7bbf336c6f6..fd2ce134f66c 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_fw.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_fw.c
@@ -91,7 +91,7 @@ static int ionic_fw_status_long_wait(struct ionic *ionic,
 	return err;
 }
 
-int ionic_firmware_update(struct ionic_lif *lif, const char *fw_name,
+int ionic_firmware_update(struct ionic_lif *lif, const struct firmware *fw,
 			  struct netlink_ext_ack *extack)
 {
 	struct ionic_dev *idev = &lif->ionic->idev;
@@ -99,24 +99,15 @@ int ionic_firmware_update(struct ionic_lif *lif, const char *fw_name,
 	struct ionic *ionic = lif->ionic;
 	union ionic_dev_cmd_comp comp;
 	u32 buf_sz, copy_sz, offset;
-	const struct firmware *fw;
 	struct devlink *dl;
 	int next_interval;
 	int err = 0;
 	u8 fw_slot;
 
-	netdev_info(netdev, "Installing firmware %s\n", fw_name);
-
 	dl = priv_to_devlink(ionic);
 	devlink_flash_update_begin_notify(dl);
 	devlink_flash_update_status_notify(dl, "Preparing to flash", NULL, 0, 0);
 
-	err = request_firmware(&fw, fw_name, ionic->dev);
-	if (err) {
-		NL_SET_ERR_MSG_MOD(extack, "Unable to find firmware file");
-		goto err_out;
-	}
-
 	buf_sz = sizeof(idev->dev_cmd_regs->data);
 
 	netdev_dbg(netdev,
@@ -200,7 +191,6 @@ int ionic_firmware_update(struct ionic_lif *lif, const char *fw_name,
 		devlink_flash_update_status_notify(dl, "Flash failed", NULL, 0, 0);
 	else
 		devlink_flash_update_status_notify(dl, "Flash done", NULL, 0, 0);
-	release_firmware(fw);
 	devlink_flash_update_end_notify(dl);
 	return err;
 }
diff --git a/include/net/devlink.h b/include/net/devlink.h
index b01bb9bca5a2..d1d125a33322 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -19,6 +19,7 @@
 #include <net/flow_offload.h>
 #include <uapi/linux/devlink.h>
 #include <linux/xarray.h>
+#include <linux/firmware.h>
 
 #define DEVLINK_RELOAD_STATS_ARRAY_SIZE \
 	(__DEVLINK_RELOAD_LIMIT_MAX * __DEVLINK_RELOAD_ACTION_MAX)
@@ -566,15 +567,15 @@ enum devlink_param_generic_id {
 
 /**
  * struct devlink_flash_update_params - Flash Update parameters
- * @file_name: the name of the flash firmware file to update from
+ * @fw: pointer to the firmware data to update from
  * @component: the flash component to update
  *
- * With the exception of file_name, drivers must opt-in to parameters by
+ * With the exception of fw, drivers must opt-in to parameters by
  * setting the appropriate bit in the supported_flash_update_params field in
  * their devlink_ops structure.
  */
 struct devlink_flash_update_params {
-	const char *file_name;
+	const struct firmware *fw;
 	const char *component;
 	u32 overwrite_mask;
 };
diff --git a/net/core/devlink.c b/net/core/devlink.c
index a932d95be798..6002ac20508c 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -3429,10 +3429,12 @@ EXPORT_SYMBOL_GPL(devlink_flash_update_timeout_notify);
 static int devlink_nl_cmd_flash_update(struct sk_buff *skb,
 				       struct genl_info *info)
 {
-	struct nlattr *nla_component, *nla_overwrite_mask;
+	struct nlattr *nla_component, *nla_overwrite_mask, *nla_file_name;
 	struct devlink_flash_update_params params = {};
 	struct devlink *devlink = info->user_ptr[0];
+	const char *file_name;
 	u32 supported_params;
+	int ret;
 
 	if (!devlink->ops->flash_update)
 		return -EOPNOTSUPP;
@@ -3442,8 +3444,6 @@ static int devlink_nl_cmd_flash_update(struct sk_buff *skb,
 
 	supported_params = devlink->ops->supported_flash_update_params;
 
-	params.file_name = nla_data(info->attrs[DEVLINK_ATTR_FLASH_UPDATE_FILE_NAME]);
-
 	nla_component = info->attrs[DEVLINK_ATTR_FLASH_UPDATE_COMPONENT];
 	if (nla_component) {
 		if (!(supported_params & DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT)) {
@@ -3467,7 +3467,19 @@ static int devlink_nl_cmd_flash_update(struct sk_buff *skb,
 		params.overwrite_mask = sections.value & sections.selector;
 	}
 
-	return devlink->ops->flash_update(devlink, &params, info->extack);
+	nla_file_name = info->attrs[DEVLINK_ATTR_FLASH_UPDATE_FILE_NAME];
+	file_name = nla_data(nla_file_name);
+	ret = request_firmware(&params.fw, file_name, devlink->dev);
+	if (ret) {
+		NL_SET_ERR_MSG_ATTR(info->extack, nla_file_name, "failed to locate the requested firmware file");
+		return ret;
+	}
+
+	ret = devlink->ops->flash_update(devlink, &params, info->extack);
+
+	release_firmware(params.fw);
+
+	return ret;
 }
 
 static const struct devlink_param devlink_param_generic[] = {
@@ -10221,12 +10233,16 @@ int devlink_compat_flash_update(struct net_device *dev, const char *file_name)
 		goto out;
 	}
 
-	params.file_name = file_name;
+	ret = request_firmware(&params.fw, file_name, devlink->dev);
+	if (ret)
+		goto out;
 
 	mutex_lock(&devlink->lock);
 	ret = devlink->ops->flash_update(devlink, &params, NULL);
 	mutex_unlock(&devlink->lock);
 
+	release_firmware(params.fw);
+
 out:
 	rtnl_lock();
 	dev_put(dev);

base-commit: 34b93f19c92ca7720efe25e852d480bb13101dec
-- 
2.29.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [net-next] devlink: move request_firmware out of driver
  2020-11-13  0:01 [net-next] devlink: move request_firmware out of driver Jacob Keller
@ 2020-11-13 21:12 ` Jakub Kicinski
  2020-11-13 21:34   ` Jacob Keller
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Kicinski @ 2020-11-13 21:12 UTC (permalink / raw)
  To: Jacob Keller
  Cc: netdev, Jiri Pirko, Michael Chan, Shannon Nelson, Saeed Mahameed,
	Boris Pismenny, Bin Luo

On Thu, 12 Nov 2020 16:01:42 -0800 Jacob Keller wrote:
> All drivers which implement the devlink flash update support, with the
> exception of netdevsim, use either request_firmware or
> request_firmware_direct to locate the firmware file. Rather than having
> each driver do this separately as part of its .flash_update
> implementation, perform the request_firmware within net/core/devlink.c
> 
> Replace the file_name paramter in the struct devlink_flash_update_params
> with a pointer to the fw object.
> 
> Use request_firmware rather than request_firmware_direct. Although most
> Linux distributions today do not have the fallback mechanism
> implemented, only about half the drivers used the _direct request, as
> compared to the generic request_firmware. In the event that
> a distribution does support the fallback mechanism, the devlink flash
> update ought to be able to use it to provide the firmware contents. For
> distributions which do not support the fallback userspace mechanism,
> there should be essentially no difference between request_firmware and
> request_firmware_direct.

Thanks for working on this!

> This is a follow to the discussion that took place at [1]. After reading
> through the docs for request_firmware vs request_firmware_direct, I believe
> that the net/core/devlink.c should be using request_firmware. While it is
> true that no distribution supports this today, it seems like we shouldn't
> rule it out entirely here. I'm willing to change this if we think it's not
> worth bothering to support it.
> 
> Note that I have only compile-tested the drivers other than ice, as I do not
> have hw for them. The only tricky transformation was in the bnxt driver
> which shares code with the ethtool implementation. The rest were pretty
> straight forward transformations.
> 
> One other thing came to my attention while working on this and while
> discussing the ice devlink support with colleagues: the userspace devlink
> program doesn't really indicate that the flash file must be located in the
> firmware search path (usually /lib/firmware/*).

It's in the man page, same as ethool.

> It is probably worth some effort to make the userspace tool error out
> more clearly when the file can't be found.

Can do, although the path is configurable AFAIR through some kconfig,
and an extack from the kernel would probably be as informative?

Or are you thinking of doing something like copying/linking the file to
/lib/firmware automatically if user provides path relative to CWD?


../drivers/net/ethernet/intel/ice/ice_devlink.c:250:17: warning: unused variable ‘dev’ [-Wunused-variable]
  250 |  struct device *dev = &pf->pdev->dev;
      |                 ^~~
../drivers/net/ethernet/qlogic/qed/qed_mcp.c:510:9: warning: context imbalance in '_qed_mcp_cmd_and_union' - unexpected unlock
../drivers/net/ethernet/mellanox/mlx5/core/devlink.c: In function ‘mlx5_devlink_flash_update’:
../drivers/net/ethernet/mellanox/mlx5/core/devlink.c:16:25: warning: unused variable ‘fw’ [-Wunused-variable]
   16 |  const struct firmware *fw;
      |                         ^~

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [net-next] devlink: move request_firmware out of driver
  2020-11-13 21:12 ` Jakub Kicinski
@ 2020-11-13 21:34   ` Jacob Keller
  2020-11-13 22:32     ` devlink userspace process appears stuck (was: Re: [net-next] devlink: move request_firmware out of driver) Jacob Keller
  0 siblings, 1 reply; 7+ messages in thread
From: Jacob Keller @ 2020-11-13 21:34 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Jiri Pirko, Michael Chan, Shannon Nelson, Saeed Mahameed,
	Boris Pismenny, Bin Luo



On 11/13/2020 1:12 PM, Jakub Kicinski wrote:
> On Thu, 12 Nov 2020 16:01:42 -0800 Jacob Keller wrote:
>> All drivers which implement the devlink flash update support, with the
>> exception of netdevsim, use either request_firmware or
>> request_firmware_direct to locate the firmware file. Rather than having
>> each driver do this separately as part of its .flash_update
>> implementation, perform the request_firmware within net/core/devlink.c
>>
>> Replace the file_name paramter in the struct devlink_flash_update_params
>> with a pointer to the fw object.
>>
>> Use request_firmware rather than request_firmware_direct. Although most
>> Linux distributions today do not have the fallback mechanism
>> implemented, only about half the drivers used the _direct request, as
>> compared to the generic request_firmware. In the event that
>> a distribution does support the fallback mechanism, the devlink flash
>> update ought to be able to use it to provide the firmware contents. For
>> distributions which do not support the fallback userspace mechanism,
>> there should be essentially no difference between request_firmware and
>> request_firmware_direct.
> 
> Thanks for working on this!
> 
>> This is a follow to the discussion that took place at [1]. After reading
>> through the docs for request_firmware vs request_firmware_direct, I believe
>> that the net/core/devlink.c should be using request_firmware. While it is
>> true that no distribution supports this today, it seems like we shouldn't
>> rule it out entirely here. I'm willing to change this if we think it's not
>> worth bothering to support it.
>>
>> Note that I have only compile-tested the drivers other than ice, as I do not
>> have hw for them. The only tricky transformation was in the bnxt driver
>> which shares code with the ethtool implementation. The rest were pretty
>> straight forward transformations.
>>
>> One other thing came to my attention while working on this and while
>> discussing the ice devlink support with colleagues: the userspace devlink
>> program doesn't really indicate that the flash file must be located in the
>> firmware search path (usually /lib/firmware/*).
> 
> It's in the man page, same as ethool.

Yea.

> 
>> It is probably worth some effort to make the userspace tool error out
>> more clearly when the file can't be found.
> 
> Can do, although the path is configurable AFAIR through some kconfig,
> and an extack from the kernel would probably be as informative?
> 

Well, at least with ice, the experience is pretty bad. I tried out with
a garbage file name on one of my test systems. This was on a slightly
older kernel without this patch applied, and the device had a pending
update that had not yet been finalized with a reset:

$ devlink dev flash pci/0000:af:00.0 file garbage_file_does_not_exist
Canceling previous pending update


The update looks like it got stuck, but actually it failed. Somehow the
extack error over the socket didn't get handled by the application very
well. Something buggy in the forked process probably.

I do get this in the dmesg though:

Nov 13 13:12:57 jekeller-stp-glorfindel kernel: ice 0000:af:00.0: Direct
firmware load for garbage_file_does_not_exist failed with error -2



This happens because the way the ice driver structured the checks we
handle cancelling a previous update before checking if we can get the
firmware object. In the new patch that goes away since we get the fw
object pretty early on.

I tried again after applying this patch to that system:

$ devlink dev flash pci/0000:af:00.0 file garbage_file_does_not_exist
Error: failed to locate the requested firmware file.
devlink answers: No such file or directory

So at least it is improved. But if "garbage_file_does_not_exist" happens
to be a file relative to the CWD, then it is very confusing to users who
may not fully understand the request_firmware interface..

What I'd have liked to see is something where the userspace tool
complains more loudly in the case where the file exists relative to CWD
and says something like "firmware files must be located in the firmware
search path".

I think we probably should also root cause and fix the bug I found above
where the program doesn't properly quit.

> Or are you thinking of doing something like copying/linking the file to
> /lib/firmware automatically if user provides path relative to CWD?
> 

I thought about copying to /lib/firmware, but I think that is
problematic. What if the file is quite large? Presumably we want to use
a temporary file, which now means we need to manage that file and ensure
we don't choose a name that collides with anything else, and that we
properly clean up that file all the time on exit...

That seems overly complicated. Mostly I think just offering a warning
like "we couldn't find your file in the search path, but it was relative
to the CWD.. you probably forgot to move the file into /lib/firmware".

> 
> ../drivers/net/ethernet/intel/ice/ice_devlink.c:250:17: warning: unused variable ‘dev’ [-Wunused-variable]
>   250 |  struct device *dev = &pf->pdev->dev;
>       |                 ^~~
> ../drivers/net/ethernet/qlogic/qed/qed_mcp.c:510:9: warning: context imbalance in '_qed_mcp_cmd_and_union' - unexpected unlock
> ../drivers/net/ethernet/mellanox/mlx5/core/devlink.c: In function ‘mlx5_devlink_flash_update’:
> ../drivers/net/ethernet/mellanox/mlx5/core/devlink.c:16:25: warning: unused variable ‘fw’ [-Wunused-variable]
>    16 |  const struct firmware *fw;
>       |                         ^~
> 

Heh. I knew I forgot something! Will fix.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 7+ messages in thread

* devlink userspace process appears stuck (was: Re: [net-next] devlink: move request_firmware out of driver)
  2020-11-13 21:34   ` Jacob Keller
@ 2020-11-13 22:32     ` Jacob Keller
  2020-11-13 22:51       ` Jacob Keller
  0 siblings, 1 reply; 7+ messages in thread
From: Jacob Keller @ 2020-11-13 22:32 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Jiri Pirko, Michael Chan, Shannon Nelson, Saeed Mahameed,
	Boris Pismenny, Bin Luo



On 11/13/2020 1:34 PM, Jacob Keller wrote:
> Well, at least with ice, the experience is pretty bad. I tried out with
> a garbage file name on one of my test systems. This was on a slightly
> older kernel without this patch applied, and the device had a pending
> update that had not yet been finalized with a reset:
> 
> $ devlink dev flash pci/0000:af:00.0 file garbage_file_does_not_exist
> Canceling previous pending update
> 
> 
> The update looks like it got stuck, but actually it failed. Somehow the
> extack error over the socket didn't get handled by the application very
> well. Something buggy in the forked process probably.
> 
> I do get this in the dmesg though:
> 
> Nov 13 13:12:57 jekeller-stp-glorfindel kernel: ice 0000:af:00.0: Direct
> firmware load for garbage_file_does_not_exist failed with error -2
> 

I think I figured out what is going on here, but I'm not sure what the
best solution is.

in userspace devlink.c:3410, the condition for exiting the while loop
that monitors the flash update process is:

(!ctx.flash_done || (ctx.not_first && !ctx.received_end))

This condition means keep looping until flash is done *OR* we've
received a message but have not yet received the end.

In the ice driver implementation, we perform a check for a pending flash
update first, which could trigger a cancellation that causes us to send
back a "cancelling previous pending flash update" status message, which
was sent *before* the devlink_flash_update_begin_notify(). Then, after
this we request the firmware object, which fails, and results in an
error code being reported back..

However, we will never send either the begin or end notification at this
point. Thus, the devlink userspace application will never quit, and
won't display the extack message.

This occurs because we sent a status notify message before we actually
sent a "begin notify". I think the ordering was done because of trying
to avoid having a complicated cleanup logic.

In some sense, this is a bug in the ice driver.. but in another sense
this is the devlink application being too strict about the requirements
on ordering of these messages..

I guess one method if we really want to remain strict is moving the
"begin" and "end" notifications outside of the driver into core so that
it always sends a begin before calling the .flash_update handler, and
always sends an end before exiting.

I guess we could simply relax the restriction on "not first" so that
we'll always exit in the case where the forked process has quit on us,
even if we haven't received a proper flash end notification...

Thoughts?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: devlink userspace process appears stuck (was: Re: [net-next] devlink: move request_firmware out of driver)
  2020-11-13 22:32     ` devlink userspace process appears stuck (was: Re: [net-next] devlink: move request_firmware out of driver) Jacob Keller
@ 2020-11-13 22:51       ` Jacob Keller
  2020-11-15  4:10         ` Jakub Kicinski
  0 siblings, 1 reply; 7+ messages in thread
From: Jacob Keller @ 2020-11-13 22:51 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Jiri Pirko, Michael Chan, Shannon Nelson, Saeed Mahameed,
	Boris Pismenny, Bin Luo



On 11/13/2020 2:32 PM, Jacob Keller wrote:
> 
> 
> On 11/13/2020 1:34 PM, Jacob Keller wrote:
>> Well, at least with ice, the experience is pretty bad. I tried out with
>> a garbage file name on one of my test systems. This was on a slightly
>> older kernel without this patch applied, and the device had a pending
>> update that had not yet been finalized with a reset:
>>
>> $ devlink dev flash pci/0000:af:00.0 file garbage_file_does_not_exist
>> Canceling previous pending update
>>
>>
>> The update looks like it got stuck, but actually it failed. Somehow the
>> extack error over the socket didn't get handled by the application very
>> well. Something buggy in the forked process probably.
>>
>> I do get this in the dmesg though:
>>
>> Nov 13 13:12:57 jekeller-stp-glorfindel kernel: ice 0000:af:00.0: Direct
>> firmware load for garbage_file_does_not_exist failed with error -2
>>
> 
> I think I figured out what is going on here, but I'm not sure what the
> best solution is.
> 
> in userspace devlink.c:3410, the condition for exiting the while loop
> that monitors the flash update process is:
> 
> (!ctx.flash_done || (ctx.not_first && !ctx.received_end))
> 

FWIW changing this to

(!ctx.flash_done && !ctx.received_end)

works for this problem, but I suspect that the original condition
intended to try and catch the case where flash update has exited but we
haven't yet processed all of the status messages?

I mean in some sense we could just wait for !ctx.flash_done only. Then
we'd always loop until the initial request exits.

There's a slight issue with the netlink extack message not being
displayed on its own line, but I think that just needs us to add a
pr_out("\n") somewhere to fix it.


> This condition means keep looping until flash is done *OR* we've
> received a message but have not yet received the end.
> 
> In the ice driver implementation, we perform a check for a pending flash
> update first, which could trigger a cancellation that causes us to send
> back a "cancelling previous pending flash update" status message, which
> was sent *before* the devlink_flash_update_begin_notify(). Then, after
> this we request the firmware object, which fails, and results in an
> error code being reported back..
> 
> However, we will never send either the begin or end notification at this
> point. Thus, the devlink userspace application will never quit, and
> won't display the extack message.
> 
> This occurs because we sent a status notify message before we actually
> sent a "begin notify". I think the ordering was done because of trying
> to avoid having a complicated cleanup logic.
> 
> In some sense, this is a bug in the ice driver.. but in another sense
> this is the devlink application being too strict about the requirements
> on ordering of these messages..
> 
> I guess one method if we really want to remain strict is moving the
> "begin" and "end" notifications outside of the driver into core so that
> it always sends a begin before calling the .flash_update handler, and
> always sends an end before exiting.
> 
> I guess we could simply relax the restriction on "not first" so that
> we'll always exit in the case where the forked process has quit on us,
> even if we haven't received a proper flash end notification...
> 
> Thoughts?
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: devlink userspace process appears stuck (was: Re: [net-next] devlink: move request_firmware out of driver)
  2020-11-13 22:51       ` Jacob Keller
@ 2020-11-15  4:10         ` Jakub Kicinski
  2020-11-17 17:45           ` Jacob Keller
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Kicinski @ 2020-11-15  4:10 UTC (permalink / raw)
  To: Jacob Keller
  Cc: netdev, Jiri Pirko, Michael Chan, Shannon Nelson, Saeed Mahameed,
	Boris Pismenny, Bin Luo

On Fri, 13 Nov 2020 14:51:36 -0800 Jacob Keller wrote:
> On 11/13/2020 2:32 PM, Jacob Keller wrote:
> > 
> > 
> > On 11/13/2020 1:34 PM, Jacob Keller wrote:  
> >> Well, at least with ice, the experience is pretty bad. I tried out with
> >> a garbage file name on one of my test systems. This was on a slightly
> >> older kernel without this patch applied, and the device had a pending
> >> update that had not yet been finalized with a reset:
> >>
> >> $ devlink dev flash pci/0000:af:00.0 file garbage_file_does_not_exist
> >> Canceling previous pending update
> >>
> >>
> >> The update looks like it got stuck, but actually it failed. Somehow the
> >> extack error over the socket didn't get handled by the application very
> >> well. Something buggy in the forked process probably.
> >>
> >> I do get this in the dmesg though:
> >>
> >> Nov 13 13:12:57 jekeller-stp-glorfindel kernel: ice 0000:af:00.0: Direct
> >> firmware load for garbage_file_does_not_exist failed with error -2
> >>  
> > 
> > I think I figured out what is going on here, but I'm not sure what the
> > best solution is.
> > 
> > in userspace devlink.c:3410, the condition for exiting the while loop
> > that monitors the flash update process is:
> > 
> > (!ctx.flash_done || (ctx.not_first && !ctx.received_end))
> >   
> 
> FWIW changing this to
> 
> (!ctx.flash_done && !ctx.received_end)
> 
> works for this problem, but I suspect that the original condition
> intended to try and catch the case where flash update has exited but we
> haven't yet processed all of the status messages?

Yeah... I've only looked at this for 5 minutes, but it seems that ice
should not send notifications outside of begin / end (in fact it could
be nice to add an appropriate WARN_ON() in notify())...

> I mean in some sense we could just wait for !ctx.flash_done only. Then
> we'd always loop until the initial request exits.
> 
> There's a slight issue with the netlink extack message not being
> displayed on its own line, but I think that just needs us to add a
> pr_out("\n") somewhere to fix it.
> 
> 
> > This condition means keep looping until flash is done *OR* we've
> > received a message but have not yet received the end.
> > 
> > In the ice driver implementation, we perform a check for a pending flash
> > update first, which could trigger a cancellation that causes us to send
> > back a "cancelling previous pending flash update" status message, which
> > was sent *before* the devlink_flash_update_begin_notify(). Then, after
> > this we request the firmware object, which fails, and results in an
> > error code being reported back..
> > 
> > However, we will never send either the begin or end notification at this
> > point. Thus, the devlink userspace application will never quit, and
> > won't display the extack message.
> > 
> > This occurs because we sent a status notify message before we actually
> > sent a "begin notify". I think the ordering was done because of trying
> > to avoid having a complicated cleanup logic.
> > 
> > In some sense, this is a bug in the ice driver.. but in another sense
> > this is the devlink application being too strict about the requirements
> > on ordering of these messages..
> > 
> > I guess one method if we really want to remain strict is moving the
> > "begin" and "end" notifications outside of the driver into core so that
> > it always sends a begin before calling the .flash_update handler, and
> > always sends an end before exiting.
> > 
> > I guess we could simply relax the restriction on "not first" so that
> > we'll always exit in the case where the forked process has quit on us,
> > even if we haven't received a proper flash end notification...
> > 
> > Thoughts?
> >   


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: devlink userspace process appears stuck (was: Re: [net-next] devlink: move request_firmware out of driver)
  2020-11-15  4:10         ` Jakub Kicinski
@ 2020-11-17 17:45           ` Jacob Keller
  0 siblings, 0 replies; 7+ messages in thread
From: Jacob Keller @ 2020-11-17 17:45 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Jiri Pirko, Michael Chan, Shannon Nelson, Saeed Mahameed,
	Boris Pismenny, Bin Luo



On 11/14/2020 8:10 PM, Jakub Kicinski wrote:
> On Fri, 13 Nov 2020 14:51:36 -0800 Jacob Keller wrote:
>> On 11/13/2020 2:32 PM, Jacob Keller wrote:
>>>
>>>
>>> On 11/13/2020 1:34 PM, Jacob Keller wrote:  
>>>> Well, at least with ice, the experience is pretty bad. I tried out with
>>>> a garbage file name on one of my test systems. This was on a slightly
>>>> older kernel without this patch applied, and the device had a pending
>>>> update that had not yet been finalized with a reset:
>>>>
>>>> $ devlink dev flash pci/0000:af:00.0 file garbage_file_does_not_exist
>>>> Canceling previous pending update
>>>>
>>>>
>>>> The update looks like it got stuck, but actually it failed. Somehow the
>>>> extack error over the socket didn't get handled by the application very
>>>> well. Something buggy in the forked process probably.
>>>>
>>>> I do get this in the dmesg though:
>>>>
>>>> Nov 13 13:12:57 jekeller-stp-glorfindel kernel: ice 0000:af:00.0: Direct
>>>> firmware load for garbage_file_does_not_exist failed with error -2
>>>>  
>>>
>>> I think I figured out what is going on here, but I'm not sure what the
>>> best solution is.
>>>
>>> in userspace devlink.c:3410, the condition for exiting the while loop
>>> that monitors the flash update process is:
>>>
>>> (!ctx.flash_done || (ctx.not_first && !ctx.received_end))
>>>   
>>
>> FWIW changing this to
>>
>> (!ctx.flash_done && !ctx.received_end)
>>
>> works for this problem, but I suspect that the original condition
>> intended to try and catch the case where flash update has exited but we
>> haven't yet processed all of the status messages?
> 
> Yeah... I've only looked at this for 5 minutes, but it seems that ice
> should not send notifications outside of begin / end (in fact it could
> be nice to add an appropriate WARN_ON() in notify())...
> 

What about just moving begin/end outside of drivers entirely? These two
functions don't take any arguments from the drivers... Doing the calls
in the core stack would make it impossible for a driver to do the wrong
thing, and would make it easier to handle the error cleanup routine.

Will send a patch to that effect soon for further discussion.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-11-17 17:45 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-13  0:01 [net-next] devlink: move request_firmware out of driver Jacob Keller
2020-11-13 21:12 ` Jakub Kicinski
2020-11-13 21:34   ` Jacob Keller
2020-11-13 22:32     ` devlink userspace process appears stuck (was: Re: [net-next] devlink: move request_firmware out of driver) Jacob Keller
2020-11-13 22:51       ` Jacob Keller
2020-11-15  4:10         ` Jakub Kicinski
2020-11-17 17:45           ` Jacob Keller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).