All of lore.kernel.org
 help / color / mirror / Atom feed
* [net-next 0/4] devlink: add dry run support for flash update
@ 2021-10-08 10:41 Jacob Keller
  2021-10-08 10:41 ` [net-next 1/4] ice: move and rename ice_check_for_pending_update Jacob Keller
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Jacob Keller @ 2021-10-08 10:41 UTC (permalink / raw)
  To: netdev; +Cc: Jiri Pirko, Jakub Kicinski, Jacob Keller

This is an implementation of a previous idea I had discussed on the list at
https://lore.kernel.org/netdev/51a6e7a33c7d40889c80bf37159f210e@intel.com/

The idea is to allow user space to query whether a given destructive devlink
command would work without actually performing any actions. This is commonly
referred to as a "dry run", and is intended to give tools and system
administrators the ability to test things like flash image validity, or
whether a given option is valid without having to risk performing the update
when not sufficiently ready.

The intention is that all "destructive" commands can be updated to support
the new DEVLINK_ATTR_DRY_RUN, although this series only implements it for
flash update.

I expect we would want to support this for commands such as reload as well
as other commands which perform some action with no interface to check state
before hand.

I tried to implement the DRY_RUN checks along with useful extended ACK
messages so that even if a driver does not support DRY_RUN, some useful
information can be retrieved. (i.e. the stack indicates that flash update is
supported and will validate the other attributes first before rejecting the
command due to inability to fully validate the run within the driver).

Jacob Keller (4):
  ice: move and rename ice_check_for_pending_update
  ice: move ice_devlink_flash_update and merge with ice_flash_pldm_image
  devlink: add dry run attribute to flash update
  ice: support dry run of a flash update to validate firmware file

 Documentation/driver-api/pldmfw/index.rst     |  10 ++
 drivers/net/ethernet/intel/ice/ice_devlink.c  |  53 +-----
 .../net/ethernet/intel/ice/ice_fw_update.c    | 170 ++++++++++--------
 .../net/ethernet/intel/ice/ice_fw_update.h    |   7 +-
 include/linux/pldmfw.h                        |   5 +
 include/net/devlink.h                         |   2 +
 include/uapi/linux/devlink.h                  |   2 +
 lib/pldmfw/pldmfw.c                           |  12 ++
 net/core/devlink.c                            |  19 +-
 9 files changed, 145 insertions(+), 135 deletions(-)


base-commit: c514fbb6231483b05c97eb22587188d4c453b28e
-- 
2.31.1.331.gb0c09ab8796f


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [net-next 1/4] ice: move and rename ice_check_for_pending_update
  2021-10-08 10:41 [net-next 0/4] devlink: add dry run support for flash update Jacob Keller
@ 2021-10-08 10:41 ` Jacob Keller
  2021-10-08 10:41 ` [net-next 2/4] ice: move ice_devlink_flash_update and merge with ice_flash_pldm_image Jacob Keller
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Jacob Keller @ 2021-10-08 10:41 UTC (permalink / raw)
  To: netdev; +Cc: Jiri Pirko, Jakub Kicinski, Jacob Keller

The ice_devlink_flash_update function performs a few checks and then
calls ice_flash_pldm_image. One of these checks is to call
ice_check_for_pending_update. This function checks if the device has
a pending update, and cancels it if so. This is necessary to allow
a new flash update to proceed.

We want to refactor the ice code to eliminate ice_devlink_flash_update,
moving its checks into ice_flash_pldm_image.

To do this, ice_check_for_pending_update will become static, and only
called by ice_flash_pldm_image. To make this change easier to review,
first just move the function up within the ice_fw_update.c file.

While at it, note that the function has a misleading name. Its primary
action is to cancel a pending update. Using the verb "check" does not
imply this. Rename it to ice_cancel_pending_update.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_devlink.c  |   2 +-
 .../net/ethernet/intel/ice/ice_fw_update.c    | 152 +++++++++---------
 .../net/ethernet/intel/ice/ice_fw_update.h    |   4 +-
 3 files changed, 79 insertions(+), 79 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_devlink.c b/drivers/net/ethernet/intel/ice/ice_devlink.c
index cae1cd97a1ef..a11a1563b653 100644
--- a/drivers/net/ethernet/intel/ice/ice_devlink.c
+++ b/drivers/net/ethernet/intel/ice/ice_devlink.c
@@ -412,7 +412,7 @@ ice_devlink_flash_update(struct devlink *devlink,
 		return -EOPNOTSUPP;
 	}
 
-	err = ice_check_for_pending_update(pf, NULL, extack);
+	err = ice_cancel_pending_update(pf, NULL, extack);
 	if (err)
 		return err;
 
diff --git a/drivers/net/ethernet/intel/ice/ice_fw_update.c b/drivers/net/ethernet/intel/ice/ice_fw_update.c
index f8601d5b0b19..ae1360d8554e 100644
--- a/drivers/net/ethernet/intel/ice/ice_fw_update.c
+++ b/drivers/net/ethernet/intel/ice/ice_fw_update.c
@@ -648,89 +648,18 @@ static const struct pldmfw_ops ice_fwu_ops = {
 };
 
 /**
- * ice_flash_pldm_image - Write a PLDM-formatted firmware image to the device
- * @pf: private device driver structure
- * @fw: firmware object pointing to the relevant firmware file
- * @preservation: preservation level to request from firmware
- * @extack: netlink extended ACK structure
- *
- * Parse the data for a given firmware file, verifying that it is a valid PLDM
- * formatted image that matches this device.
- *
- * Extract the device record Package Data and Component Tables and send them
- * to the firmware. Extract and write the flash data for each of the three
- * main flash components, "fw.mgmt", "fw.undi", and "fw.netlist". Notify
- * firmware once the data is written to the inactive banks.
- *
- * Returns: zero on success or a negative error code on failure.
- */
-int ice_flash_pldm_image(struct ice_pf *pf, const struct firmware *fw,
-			 u8 preservation, struct netlink_ext_ack *extack)
-{
-	struct device *dev = ice_pf_to_dev(pf);
-	struct ice_hw *hw = &pf->hw;
-	struct ice_fwu_priv priv;
-	enum ice_status status;
-	int err;
-
-	switch (preservation) {
-	case ICE_AQC_NVM_PRESERVE_ALL:
-	case ICE_AQC_NVM_PRESERVE_SELECTED:
-	case ICE_AQC_NVM_NO_PRESERVATION:
-	case ICE_AQC_NVM_FACTORY_DEFAULT:
-		break;
-	default:
-		WARN(1, "Unexpected preservation level request %u", preservation);
-		return -EINVAL;
-	}
-
-	memset(&priv, 0, sizeof(priv));
-
-	priv.context.ops = &ice_fwu_ops;
-	priv.context.dev = dev;
-	priv.extack = extack;
-	priv.pf = pf;
-	priv.activate_flags = preservation;
-
-	status = ice_acquire_nvm(hw, ICE_RES_WRITE);
-	if (status) {
-		dev_err(dev, "Failed to acquire device flash lock, err %s aq_err %s\n",
-			ice_stat_str(status),
-			ice_aq_str(hw->adminq.sq_last_status));
-		NL_SET_ERR_MSG_MOD(extack, "Failed to acquire device flash lock");
-		return -EIO;
-	}
-
-	err = pldmfw_flash_image(&priv.context, fw);
-	if (err == -ENOENT) {
-		dev_err(dev, "Firmware image has no record matching this device\n");
-		NL_SET_ERR_MSG_MOD(extack, "Firmware image has no record matching this device");
-	} else if (err) {
-		/* Do not set a generic extended ACK message here. A more
-		 * specific message may already have been set by one of our
-		 * ops.
-		 */
-		dev_err(dev, "Failed to flash PLDM image, err %d", err);
-	}
-
-	ice_release_nvm(hw);
-
-	return err;
-}
-
-/**
- * ice_check_for_pending_update - Check for a pending flash update
+ * ice_cancel_pending_update - Cancel any pending update for a component
  * @pf: the PF driver structure
  * @component: if not NULL, the name of the component being updated
  * @extack: Netlink extended ACK structure
  *
- * Check whether the device already has a pending flash update. If such an
- * update is found, cancel it so that the requested update may proceed.
+ * Cancel any pending update for the specified component. If component is
+ * NULL, all device updates will be canceled.
  *
  * Returns: zero on success, or a negative error code on failure.
  */
-int ice_check_for_pending_update(struct ice_pf *pf, const char *component,
-				 struct netlink_ext_ack *extack)
+int ice_cancel_pending_update(struct ice_pf *pf, const char *component,
+			      struct netlink_ext_ack *extack)
 {
 	struct devlink *devlink = priv_to_devlink(pf);
 	struct device *dev = ice_pf_to_dev(pf);
@@ -814,3 +743,74 @@ int ice_check_for_pending_update(struct ice_pf *pf, const char *component,
 
 	return err;
 }
+
+/**
+ * ice_flash_pldm_image - Write a PLDM-formatted firmware image to the device
+ * @pf: private device driver structure
+ * @fw: firmware object pointing to the relevant firmware file
+ * @preservation: preservation level to request from firmware
+ * @extack: netlink extended ACK structure
+ *
+ * Parse the data for a given firmware file, verifying that it is a valid PLDM
+ * formatted image that matches this device.
+ *
+ * Extract the device record Package Data and Component Tables and send them
+ * to the firmware. Extract and write the flash data for each of the three
+ * main flash components, "fw.mgmt", "fw.undi", and "fw.netlist". Notify
+ * firmware once the data is written to the inactive banks.
+ *
+ * Returns: zero on success or a negative error code on failure.
+ */
+int ice_flash_pldm_image(struct ice_pf *pf, const struct firmware *fw,
+			 u8 preservation, struct netlink_ext_ack *extack)
+{
+	struct device *dev = ice_pf_to_dev(pf);
+	struct ice_hw *hw = &pf->hw;
+	struct ice_fwu_priv priv;
+	enum ice_status status;
+	int err;
+
+	switch (preservation) {
+	case ICE_AQC_NVM_PRESERVE_ALL:
+	case ICE_AQC_NVM_PRESERVE_SELECTED:
+	case ICE_AQC_NVM_NO_PRESERVATION:
+	case ICE_AQC_NVM_FACTORY_DEFAULT:
+		break;
+	default:
+		WARN(1, "Unexpected preservation level request %u", preservation);
+		return -EINVAL;
+	}
+
+	memset(&priv, 0, sizeof(priv));
+
+	priv.context.ops = &ice_fwu_ops;
+	priv.context.dev = dev;
+	priv.extack = extack;
+	priv.pf = pf;
+	priv.activate_flags = preservation;
+
+	status = ice_acquire_nvm(hw, ICE_RES_WRITE);
+	if (status) {
+		dev_err(dev, "Failed to acquire device flash lock, err %s aq_err %s\n",
+			ice_stat_str(status),
+			ice_aq_str(hw->adminq.sq_last_status));
+		NL_SET_ERR_MSG_MOD(extack, "Failed to acquire device flash lock");
+		return -EIO;
+	}
+
+	err = pldmfw_flash_image(&priv.context, fw);
+	if (err == -ENOENT) {
+		dev_err(dev, "Firmware image has no record matching this device\n");
+		NL_SET_ERR_MSG_MOD(extack, "Firmware image has no record matching this device");
+	} else if (err) {
+		/* Do not set a generic extended ACK message here. A more
+		 * specific message may already have been set by one of our
+		 * ops.
+		 */
+		dev_err(dev, "Failed to flash PLDM image, err %d", err);
+	}
+
+	ice_release_nvm(hw);
+
+	return err;
+}
diff --git a/drivers/net/ethernet/intel/ice/ice_fw_update.h b/drivers/net/ethernet/intel/ice/ice_fw_update.h
index c6390f6851ff..1f84ef18bfd1 100644
--- a/drivers/net/ethernet/intel/ice/ice_fw_update.h
+++ b/drivers/net/ethernet/intel/ice/ice_fw_update.h
@@ -6,7 +6,7 @@
 
 int ice_flash_pldm_image(struct ice_pf *pf, const struct firmware *fw,
 			 u8 preservation, struct netlink_ext_ack *extack);
-int ice_check_for_pending_update(struct ice_pf *pf, const char *component,
-				 struct netlink_ext_ack *extack);
+int ice_cancel_pending_update(struct ice_pf *pf, const char *component,
+			      struct netlink_ext_ack *extack);
 
 #endif
-- 
2.31.1.331.gb0c09ab8796f


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [net-next 2/4] ice: move ice_devlink_flash_update and merge with ice_flash_pldm_image
  2021-10-08 10:41 [net-next 0/4] devlink: add dry run support for flash update Jacob Keller
  2021-10-08 10:41 ` [net-next 1/4] ice: move and rename ice_check_for_pending_update Jacob Keller
@ 2021-10-08 10:41 ` Jacob Keller
  2021-10-08 10:41 ` [net-next 3/4] devlink: add dry run attribute to flash update Jacob Keller
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Jacob Keller @ 2021-10-08 10:41 UTC (permalink / raw)
  To: netdev; +Cc: Jiri Pirko, Jakub Kicinski, Jacob Keller

The ice_devlink_flash_update function performs a few upfront checks and
then calls ice_flash_pldm_image.

Most if these checks make more sense in the context of code within
ice_flash_pldm_image. Merge ice_devlink_flash_update and
ice_flash_pldm_image into one function, placing it in ice_fw_update.c

Since this is still the entry point for devlink, call the function
ice_devlink_flash_update instead of ice_flash_pldm_image. This leaves a
single function which handles the devlink parameters and then initiates
a PLDM update.

When merging the calls to ice_cancel_pending_update function, notice
that both it and the main flash update process take the NVM hardware
semaphore. We can eliminate the call to get the semaphore from
ice_cancel_pending_update by placing the check after we acquire the
semaphore during ice_flash_pldm_image.

With this change, the ice_devlink_flash_update function in
ice_fw_update.c becomes the main entry point for flash update. It
elimintes some unnecessary boiler plate code between the two previous
functions. The ultimate motivation for this is that it eases supporting
a dry run with the PLDM library in a future change.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_devlink.c  | 50 -------------
 .../net/ethernet/intel/ice/ice_fw_update.c    | 71 ++++++++++---------
 .../net/ethernet/intel/ice/ice_fw_update.h    |  7 +-
 3 files changed, 41 insertions(+), 87 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_devlink.c b/drivers/net/ethernet/intel/ice/ice_devlink.c
index a11a1563b653..f023e862dbf1 100644
--- a/drivers/net/ethernet/intel/ice/ice_devlink.c
+++ b/drivers/net/ethernet/intel/ice/ice_devlink.c
@@ -371,56 +371,6 @@ static int ice_devlink_info_get(struct devlink *devlink,
 	return err;
 }
 
-/**
- * ice_devlink_flash_update - Update firmware stored in flash on the device
- * @devlink: pointer to devlink associated with device to update
- * @params: flash update parameters
- * @extack: netlink extended ACK structure
- *
- * Perform a device flash update. The bulk of the update logic is contained
- * within the ice_flash_pldm_image function.
- *
- * Returns: zero on success, or an error code on failure.
- */
-static int
-ice_devlink_flash_update(struct devlink *devlink,
-			 struct devlink_flash_update_params *params,
-			 struct netlink_ext_ack *extack)
-{
-	struct ice_pf *pf = devlink_priv(devlink);
-	struct ice_hw *hw = &pf->hw;
-	u8 preservation;
-	int err;
-
-	if (!params->overwrite_mask) {
-		/* preserve all settings and identifiers */
-		preservation = ICE_AQC_NVM_PRESERVE_ALL;
-	} else if (params->overwrite_mask == DEVLINK_FLASH_OVERWRITE_SETTINGS) {
-		/* overwrite settings, but preserve the vital device identifiers */
-		preservation = ICE_AQC_NVM_PRESERVE_SELECTED;
-	} else if (params->overwrite_mask == (DEVLINK_FLASH_OVERWRITE_SETTINGS |
-					      DEVLINK_FLASH_OVERWRITE_IDENTIFIERS)) {
-		/* overwrite both settings and identifiers, preserve nothing */
-		preservation = ICE_AQC_NVM_NO_PRESERVATION;
-	} else {
-		NL_SET_ERR_MSG_MOD(extack, "Requested overwrite mask is not supported");
-		return -EOPNOTSUPP;
-	}
-
-	if (!hw->dev_caps.common_cap.nvm_unified_update) {
-		NL_SET_ERR_MSG_MOD(extack, "Current firmware does not support unified update");
-		return -EOPNOTSUPP;
-	}
-
-	err = ice_cancel_pending_update(pf, NULL, extack);
-	if (err)
-		return err;
-
-	devlink_flash_update_status_notify(devlink, "Preparing to flash", NULL, 0, 0);
-
-	return ice_flash_pldm_image(pf, params->fw, preservation, extack);
-}
-
 static const struct devlink_ops ice_devlink_ops = {
 	.supported_flash_update_params = DEVLINK_SUPPORT_FLASH_UPDATE_OVERWRITE_MASK,
 	.info_get = ice_devlink_info_get,
diff --git a/drivers/net/ethernet/intel/ice/ice_fw_update.c b/drivers/net/ethernet/intel/ice/ice_fw_update.c
index ae1360d8554e..436f71a8e8aa 100644
--- a/drivers/net/ethernet/intel/ice/ice_fw_update.c
+++ b/drivers/net/ethernet/intel/ice/ice_fw_update.c
@@ -658,8 +658,9 @@ static const struct pldmfw_ops ice_fwu_ops = {
  *
  * Returns: zero on success, or a negative error code on failure.
  */
-int ice_cancel_pending_update(struct ice_pf *pf, const char *component,
-			      struct netlink_ext_ack *extack)
+static int
+ice_cancel_pending_update(struct ice_pf *pf, const char *component,
+			  struct netlink_ext_ack *extack)
 {
 	struct devlink *devlink = priv_to_devlink(pf);
 	struct device *dev = ice_pf_to_dev(pf);
@@ -667,7 +668,6 @@ int ice_cancel_pending_update(struct ice_pf *pf, const char *component,
 	struct ice_hw *hw = &pf->hw;
 	enum ice_status status;
 	u8 pending = 0;
-	int err;
 
 	dev_caps = kzalloc(sizeof(*dev_caps), GFP_KERNEL);
 	if (!dev_caps)
@@ -727,28 +727,14 @@ int ice_cancel_pending_update(struct ice_pf *pf, const char *component,
 					   "Canceling previous pending update",
 					   component, 0, 0);
 
-	status = ice_acquire_nvm(hw, ICE_RES_WRITE);
-	if (status) {
-		dev_err(dev, "Failed to acquire device flash lock, err %s aq_err %s\n",
-			ice_stat_str(status),
-			ice_aq_str(hw->adminq.sq_last_status));
-		NL_SET_ERR_MSG_MOD(extack, "Failed to acquire device flash lock");
-		return -EIO;
-	}
-
 	pending |= ICE_AQC_NVM_REVERT_LAST_ACTIV;
-	err = ice_switch_flash_banks(pf, pending, extack);
-
-	ice_release_nvm(hw);
-
-	return err;
+	return ice_switch_flash_banks(pf, pending, extack);
 }
 
 /**
- * ice_flash_pldm_image - Write a PLDM-formatted firmware image to the device
- * @pf: private device driver structure
- * @fw: firmware object pointing to the relevant firmware file
- * @preservation: preservation level to request from firmware
+ * ice_devlink_flash_update - Write a firmware image to the device
+ * @devlink: pointer to devlink associated with the device to update
+ * @params: devlink flash update parameters
  * @extack: netlink extended ACK structure
  *
  * Parse the data for a given firmware file, verifying that it is a valid PLDM
@@ -761,24 +747,36 @@ int ice_cancel_pending_update(struct ice_pf *pf, const char *component,
  *
  * Returns: zero on success or a negative error code on failure.
  */
-int ice_flash_pldm_image(struct ice_pf *pf, const struct firmware *fw,
-			 u8 preservation, struct netlink_ext_ack *extack)
+int ice_devlink_flash_update(struct devlink *devlink,
+			     struct devlink_flash_update_params *params,
+			     struct netlink_ext_ack *extack)
 {
+	struct ice_pf *pf = devlink_priv(devlink);
 	struct device *dev = ice_pf_to_dev(pf);
 	struct ice_hw *hw = &pf->hw;
 	struct ice_fwu_priv priv;
 	enum ice_status status;
+	u8 preservation;
 	int err;
 
-	switch (preservation) {
-	case ICE_AQC_NVM_PRESERVE_ALL:
-	case ICE_AQC_NVM_PRESERVE_SELECTED:
-	case ICE_AQC_NVM_NO_PRESERVATION:
-	case ICE_AQC_NVM_FACTORY_DEFAULT:
-		break;
-	default:
-		WARN(1, "Unexpected preservation level request %u", preservation);
-		return -EINVAL;
+	if (!params->overwrite_mask) {
+		/* preserve all settings and identifiers */
+		preservation = ICE_AQC_NVM_PRESERVE_ALL;
+	} else if (params->overwrite_mask == DEVLINK_FLASH_OVERWRITE_SETTINGS) {
+		/* overwrite settings, but preserve the vital device identifiers */
+		preservation = ICE_AQC_NVM_PRESERVE_SELECTED;
+	} else if (params->overwrite_mask == (DEVLINK_FLASH_OVERWRITE_SETTINGS |
+					      DEVLINK_FLASH_OVERWRITE_IDENTIFIERS)) {
+		/* overwrite both settings and identifiers, preserve nothing */
+		preservation = ICE_AQC_NVM_NO_PRESERVATION;
+	} else {
+		NL_SET_ERR_MSG_MOD(extack, "Requested overwrite mask is not supported");
+		return -EOPNOTSUPP;
+	}
+
+	if (!hw->dev_caps.common_cap.nvm_unified_update) {
+		NL_SET_ERR_MSG_MOD(extack, "Current firmware does not support unified update");
+		return -EOPNOTSUPP;
 	}
 
 	memset(&priv, 0, sizeof(priv));
@@ -789,6 +787,8 @@ int ice_flash_pldm_image(struct ice_pf *pf, const struct firmware *fw,
 	priv.pf = pf;
 	priv.activate_flags = preservation;
 
+	devlink_flash_update_status_notify(devlink, "Preparing to flash", NULL, 0, 0);
+
 	status = ice_acquire_nvm(hw, ICE_RES_WRITE);
 	if (status) {
 		dev_err(dev, "Failed to acquire device flash lock, err %s aq_err %s\n",
@@ -798,7 +798,11 @@ int ice_flash_pldm_image(struct ice_pf *pf, const struct firmware *fw,
 		return -EIO;
 	}
 
-	err = pldmfw_flash_image(&priv.context, fw);
+	err = ice_cancel_pending_update(pf, NULL, extack);
+	if (err)
+		goto out_release_nvm;
+
+	err = pldmfw_flash_image(&priv.context, params->fw);
 	if (err == -ENOENT) {
 		dev_err(dev, "Firmware image has no record matching this device\n");
 		NL_SET_ERR_MSG_MOD(extack, "Firmware image has no record matching this device");
@@ -810,6 +814,7 @@ int ice_flash_pldm_image(struct ice_pf *pf, const struct firmware *fw,
 		dev_err(dev, "Failed to flash PLDM image, err %d", err);
 	}
 
+out_release_nvm:
 	ice_release_nvm(hw);
 
 	return err;
diff --git a/drivers/net/ethernet/intel/ice/ice_fw_update.h b/drivers/net/ethernet/intel/ice/ice_fw_update.h
index 1f84ef18bfd1..be6d222124f2 100644
--- a/drivers/net/ethernet/intel/ice/ice_fw_update.h
+++ b/drivers/net/ethernet/intel/ice/ice_fw_update.h
@@ -4,9 +4,8 @@
 #ifndef _ICE_FW_UPDATE_H_
 #define _ICE_FW_UPDATE_H_
 
-int ice_flash_pldm_image(struct ice_pf *pf, const struct firmware *fw,
-			 u8 preservation, struct netlink_ext_ack *extack);
-int ice_cancel_pending_update(struct ice_pf *pf, const char *component,
-			      struct netlink_ext_ack *extack);
+int ice_devlink_flash_update(struct devlink *devlink,
+			     struct devlink_flash_update_params *params,
+			     struct netlink_ext_ack *extack);
 
 #endif
-- 
2.31.1.331.gb0c09ab8796f


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [net-next 3/4] devlink: add dry run attribute to flash update
  2021-10-08 10:41 [net-next 0/4] devlink: add dry run support for flash update Jacob Keller
  2021-10-08 10:41 ` [net-next 1/4] ice: move and rename ice_check_for_pending_update Jacob Keller
  2021-10-08 10:41 ` [net-next 2/4] ice: move ice_devlink_flash_update and merge with ice_flash_pldm_image Jacob Keller
@ 2021-10-08 10:41 ` Jacob Keller
  2021-10-08 10:41 ` [net-next 4/4] ice: support dry run of a flash update to validate firmware file Jacob Keller
  2021-10-08 12:37 ` [net-next 0/4] devlink: add dry run support for flash update Jiri Pirko
  4 siblings, 0 replies; 17+ messages in thread
From: Jacob Keller @ 2021-10-08 10:41 UTC (permalink / raw)
  To: netdev; +Cc: Jiri Pirko, Jakub Kicinski, Jacob Keller

The devlink flash interface is used to request programming of a device
flash chip. In some cases, a user (or script) might want to verify
whether or not a device update is supported without actually committing
to update the device. For example, a system administrator might want to
validate that a given file will be accepted by a device driver, or may
want to validate a command before finally committing to it.

There currently is no good method to support this use. To do this, add
a new DEVLINK_ATTR_DRY_RUN attribute. This attribute shall be used by
a command to indicate that the request is just a "dry run" to verify
that things will work.

Ultimately, a proper dry run must be handled by device drivers, as we
want to also validate things such as the flash file. Add
a dry_run parameter to the devlink_flash_update_params, and an
associated bit to indicate if a driver supports verifying a dry_run.

If a driver does not support dry run verification, we will return
-EOPNOTSUPP, but with an appropriate extended ACK message that indicates
to the user that a flash update is supported.

We check for the dry run attribute last in order to allow as much
verification of parameters as possible. For example, even if a driver
does not support dry_run, we can still validate that all of the other
optional parameters such as overwrite_mask and per-component update are
valid.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 include/net/devlink.h        |  2 ++
 include/uapi/linux/devlink.h |  2 ++
 net/core/devlink.c           | 19 ++++++++++++++++++-
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index a7852a257bf6..d67183a739cf 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -640,10 +640,12 @@ struct devlink_flash_update_params {
 	const struct firmware *fw;
 	const char *component;
 	u32 overwrite_mask;
+	bool dry_run;
 };
 
 #define DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT		BIT(0)
 #define DEVLINK_SUPPORT_FLASH_UPDATE_OVERWRITE_MASK	BIT(1)
+#define DEVLINK_SUPPORT_FLASH_UPDATE_DRY_RUN		BIT(2)
 
 struct devlink_region;
 struct devlink_info_req;
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index b897b80770f6..d5faaa942c1b 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -553,6 +553,8 @@ enum devlink_attr {
 
 	DEVLINK_ATTR_REGION_MAX_SNAPSHOTS,	/* u32 */
 
+	DEVLINK_ATTR_DRY_RUN,			/* flag */
+
 	/* add new attributes above here, update the policy in devlink.c */
 
 	__DEVLINK_ATTR_MAX,
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 4917112406a0..8d11b0838a0a 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4232,7 +4232,8 @@ EXPORT_SYMBOL_GPL(devlink_flash_update_timeout_notify);
 static int devlink_nl_cmd_flash_update(struct sk_buff *skb,
 				       struct genl_info *info)
 {
-	struct nlattr *nla_component, *nla_overwrite_mask, *nla_file_name;
+	struct nlattr *nla_component, *nla_overwrite_mask, *nla_file_name,
+		      *nla_dry_run;
 	struct devlink_flash_update_params params = {};
 	struct devlink *devlink = info->user_ptr[0];
 	const char *file_name;
@@ -4278,6 +4279,21 @@ static int devlink_nl_cmd_flash_update(struct sk_buff *skb,
 		return ret;
 	}
 
+	/* Always check dry run last, in order to allow verification of other
+	 * parameter support even if the particular driver does not yet
+	 * support a full dry-run
+	 */
+	nla_dry_run = info->attrs[DEVLINK_ATTR_DRY_RUN];
+	if (nla_dry_run) {
+		if (!(supported_params & DEVLINK_SUPPORT_FLASH_UPDATE_DRY_RUN)) {
+			NL_SET_ERR_MSG_ATTR(info->extack, nla_dry_run,
+					    "flash update is supported, but dry run is not supported for this device");
+			release_firmware(params.fw);
+			return -EOPNOTSUPP;
+		}
+		params.dry_run = true;
+	}
+
 	devlink_flash_update_begin_notify(devlink);
 	ret = devlink->ops->flash_update(devlink, &params, info->extack);
 	devlink_flash_update_end_notify(devlink);
@@ -8514,6 +8530,7 @@ static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = {
 	[DEVLINK_ATTR_RATE_TX_MAX] = { .type = NLA_U64 },
 	[DEVLINK_ATTR_RATE_NODE_NAME] = { .type = NLA_NUL_STRING },
 	[DEVLINK_ATTR_RATE_PARENT_NODE_NAME] = { .type = NLA_NUL_STRING },
+	[DEVLINK_ATTR_DRY_RUN] = { .type = NLA_FLAG },
 };
 
 static const struct genl_small_ops devlink_nl_ops[] = {
-- 
2.31.1.331.gb0c09ab8796f


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [net-next 4/4] ice: support dry run of a flash update to validate firmware file
  2021-10-08 10:41 [net-next 0/4] devlink: add dry run support for flash update Jacob Keller
                   ` (2 preceding siblings ...)
  2021-10-08 10:41 ` [net-next 3/4] devlink: add dry run attribute to flash update Jacob Keller
@ 2021-10-08 10:41 ` Jacob Keller
  2021-10-08 12:37 ` [net-next 0/4] devlink: add dry run support for flash update Jiri Pirko
  4 siblings, 0 replies; 17+ messages in thread
From: Jacob Keller @ 2021-10-08 10:41 UTC (permalink / raw)
  To: netdev; +Cc: Jiri Pirko, Jakub Kicinski, Jacob Keller

Now that devlink core flash update can handle dry run requests, update
the ice driver to allow validating a PLDM file in dry_run mode.

First, add a new dry_run field to the pldmfw context structure. This
indicates that the PLDM firmware file library should only validate the
file and verify that it has a matching record. Update the pldmfw
documentation to indicate this "dry run" mode.

In the ice driver, let the stack know that we support the dry run
attribute for flash update by setting the appropriate bit in the
.supported_flash_update_params field.

If the dry run is requested, notify the PLDM firmware library by setting
the context bit appropriately. Don't cancel a pending update during
a dry run.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 Documentation/driver-api/pldmfw/index.rst      | 10 ++++++++++
 drivers/net/ethernet/intel/ice/ice_devlink.c   |  3 ++-
 drivers/net/ethernet/intel/ice/ice_fw_update.c | 15 +++++++++++----
 include/linux/pldmfw.h                         |  5 +++++
 lib/pldmfw/pldmfw.c                            | 12 ++++++++++++
 5 files changed, 40 insertions(+), 5 deletions(-)

diff --git a/Documentation/driver-api/pldmfw/index.rst b/Documentation/driver-api/pldmfw/index.rst
index ad2c33ece30f..454b3ed6576a 100644
--- a/Documentation/driver-api/pldmfw/index.rst
+++ b/Documentation/driver-api/pldmfw/index.rst
@@ -51,6 +51,16 @@ unaligned access of multi-byte fields, and to properly convert from Little
 Endian to CPU host format. Additionally the records, descriptors, and
 components are stored in linked lists.
 
+Validating a PLDM firmware file
+===============================
+
+To simply validate a PLDM firmware file, and verify whether it applies to
+the device, set the ``dry_run`` flag in the ``pldmfw`` context structure.
+If this flag is set, the library will parse the file, validating its UUID
+and checking if any record matches the device. Note that in a dry run, the
+library will *not* issue any ops besides ``match_record``. It will not
+attempt to send the component table or package data to the device firmware.
+
 Performing a flash update
 =========================
 
diff --git a/drivers/net/ethernet/intel/ice/ice_devlink.c b/drivers/net/ethernet/intel/ice/ice_devlink.c
index f023e862dbf1..cdf4ad4aa437 100644
--- a/drivers/net/ethernet/intel/ice/ice_devlink.c
+++ b/drivers/net/ethernet/intel/ice/ice_devlink.c
@@ -372,7 +372,8 @@ static int ice_devlink_info_get(struct devlink *devlink,
 }
 
 static const struct devlink_ops ice_devlink_ops = {
-	.supported_flash_update_params = DEVLINK_SUPPORT_FLASH_UPDATE_OVERWRITE_MASK,
+	.supported_flash_update_params = DEVLINK_SUPPORT_FLASH_UPDATE_OVERWRITE_MASK |
+					 DEVLINK_SUPPORT_FLASH_UPDATE_DRY_RUN,
 	.info_get = ice_devlink_info_get,
 	.flash_update = ice_devlink_flash_update,
 };
diff --git a/drivers/net/ethernet/intel/ice/ice_fw_update.c b/drivers/net/ethernet/intel/ice/ice_fw_update.c
index 436f71a8e8aa..d1ca4dc0dd10 100644
--- a/drivers/net/ethernet/intel/ice/ice_fw_update.c
+++ b/drivers/net/ethernet/intel/ice/ice_fw_update.c
@@ -783,11 +783,15 @@ int ice_devlink_flash_update(struct devlink *devlink,
 
 	priv.context.ops = &ice_fwu_ops;
 	priv.context.dev = dev;
+	priv.context.dry_run = params->dry_run;
 	priv.extack = extack;
 	priv.pf = pf;
 	priv.activate_flags = preservation;
 
-	devlink_flash_update_status_notify(devlink, "Preparing to flash", NULL, 0, 0);
+	if (params->dry_run)
+		devlink_flash_update_status_notify(devlink, "Validating flash binary", NULL, 0, 0);
+	else
+		devlink_flash_update_status_notify(devlink, "Preparing to flash", NULL, 0, 0);
 
 	status = ice_acquire_nvm(hw, ICE_RES_WRITE);
 	if (status) {
@@ -798,9 +802,12 @@ int ice_devlink_flash_update(struct devlink *devlink,
 		return -EIO;
 	}
 
-	err = ice_cancel_pending_update(pf, NULL, extack);
-	if (err)
-		goto out_release_nvm;
+	/* Do not cancel a previous flash update if this is a dry run */
+	if (!params->dry_run) {
+		err = ice_cancel_pending_update(pf, NULL, extack);
+		if (err)
+			goto out_release_nvm;
+	}
 
 	err = pldmfw_flash_image(&priv.context, params->fw);
 	if (err == -ENOENT) {
diff --git a/include/linux/pldmfw.h b/include/linux/pldmfw.h
index 0fc831338226..d9add301582b 100644
--- a/include/linux/pldmfw.h
+++ b/include/linux/pldmfw.h
@@ -124,10 +124,15 @@ struct pldmfw_ops;
  * should embed this in a private structure and use container_of to obtain
  * a pointer to their own data, used to implement the device specific
  * operations.
+ *
+ * @ops: function pointers used as callbacks from the PLDMFW library
+ * @dev: pointer to the device being updated
+ * @dry_run: if true, only validate the file, do not perform an update.
  */
 struct pldmfw {
 	const struct pldmfw_ops *ops;
 	struct device *dev;
+	bool dry_run;
 };
 
 bool pldmfw_op_pci_match_record(struct pldmfw *context, struct pldmfw_record *record);
diff --git a/lib/pldmfw/pldmfw.c b/lib/pldmfw/pldmfw.c
index 6e77eb6d8e72..29a132a39876 100644
--- a/lib/pldmfw/pldmfw.c
+++ b/lib/pldmfw/pldmfw.c
@@ -827,6 +827,10 @@ static int pldm_finalize_update(struct pldmfw_priv *data)
  * to the device firmware. Extract and write the flash data for each of the
  * components indicated in the firmware file.
  *
+ * If the context->dry_run is set, this is a request for a dry run, i.e. to
+ * only validate the PLDM firmware file. In this case, stop and exit after we
+ * find a valid matching record.
+ *
  * Returns: zero on success, or a negative error code on failure.
  */
 int pldmfw_flash_image(struct pldmfw *context, const struct firmware *fw)
@@ -844,14 +848,22 @@ int pldmfw_flash_image(struct pldmfw *context, const struct firmware *fw)
 	data->fw = fw;
 	data->context = context;
 
+	/* Parse the image and make sure it is a valid PLDM firmware binary */
 	err = pldm_parse_image(data);
 	if (err)
 		goto out_release_data;
 
+	/* Search for a record matching the device */
 	err = pldm_find_matching_record(data);
 	if (err)
 		goto out_release_data;
 
+	/* If this is a dry run, do not perform an update */
+	if (context->dry_run)
+		goto out_release_data;
+
+	/* Perform the device update */
+
 	err = pldm_send_package_data(data);
 	if (err)
 		goto out_release_data;
-- 
2.31.1.331.gb0c09ab8796f


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [net-next 0/4] devlink: add dry run support for flash update
  2021-10-08 10:41 [net-next 0/4] devlink: add dry run support for flash update Jacob Keller
                   ` (3 preceding siblings ...)
  2021-10-08 10:41 ` [net-next 4/4] ice: support dry run of a flash update to validate firmware file Jacob Keller
@ 2021-10-08 12:37 ` Jiri Pirko
  2021-10-08 18:21   ` Jakub Kicinski
  2021-10-08 21:42   ` Keller, Jacob E
  4 siblings, 2 replies; 17+ messages in thread
From: Jiri Pirko @ 2021-10-08 12:37 UTC (permalink / raw)
  To: Jacob Keller; +Cc: netdev, Jakub Kicinski

Fri, Oct 08, 2021 at 12:41:11PM CEST, jacob.e.keller@intel.com wrote:
>This is an implementation of a previous idea I had discussed on the list at
>https://lore.kernel.org/netdev/51a6e7a33c7d40889c80bf37159f210e@intel.com/
>
>The idea is to allow user space to query whether a given destructive devlink
>command would work without actually performing any actions. This is commonly
>referred to as a "dry run", and is intended to give tools and system
>administrators the ability to test things like flash image validity, or
>whether a given option is valid without having to risk performing the update
>when not sufficiently ready.
>
>The intention is that all "destructive" commands can be updated to support
>the new DEVLINK_ATTR_DRY_RUN, although this series only implements it for
>flash update.
>
>I expect we would want to support this for commands such as reload as well
>as other commands which perform some action with no interface to check state
>before hand.
>
>I tried to implement the DRY_RUN checks along with useful extended ACK
>messages so that even if a driver does not support DRY_RUN, some useful
>information can be retrieved. (i.e. the stack indicates that flash update is
>supported and will validate the other attributes first before rejecting the
>command due to inability to fully validate the run within the driver).

Hmm, old kernel vs. new-userspace, the requested dry-run, won't be
really dry run. I guess that user might be surprised in that case...


>
>Jacob Keller (4):
>  ice: move and rename ice_check_for_pending_update
>  ice: move ice_devlink_flash_update and merge with ice_flash_pldm_image
>  devlink: add dry run attribute to flash update
>  ice: support dry run of a flash update to validate firmware file
>
> Documentation/driver-api/pldmfw/index.rst     |  10 ++
> drivers/net/ethernet/intel/ice/ice_devlink.c  |  53 +-----
> .../net/ethernet/intel/ice/ice_fw_update.c    | 170 ++++++++++--------
> .../net/ethernet/intel/ice/ice_fw_update.h    |   7 +-
> include/linux/pldmfw.h                        |   5 +
> include/net/devlink.h                         |   2 +
> include/uapi/linux/devlink.h                  |   2 +
> lib/pldmfw/pldmfw.c                           |  12 ++
> net/core/devlink.c                            |  19 +-
> 9 files changed, 145 insertions(+), 135 deletions(-)
>
>
>base-commit: c514fbb6231483b05c97eb22587188d4c453b28e
>-- 
>2.31.1.331.gb0c09ab8796f
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [net-next 0/4] devlink: add dry run support for flash update
  2021-10-08 12:37 ` [net-next 0/4] devlink: add dry run support for flash update Jiri Pirko
@ 2021-10-08 18:21   ` Jakub Kicinski
  2021-10-08 21:43     ` Keller, Jacob E
  2021-10-11  8:21     ` Keller, Jacob E
  2021-10-08 21:42   ` Keller, Jacob E
  1 sibling, 2 replies; 17+ messages in thread
From: Jakub Kicinski @ 2021-10-08 18:21 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Jacob Keller, netdev

On Fri, 8 Oct 2021 14:37:37 +0200 Jiri Pirko wrote:
> Fri, Oct 08, 2021 at 12:41:11PM CEST, jacob.e.keller@intel.com wrote:
> >This is an implementation of a previous idea I had discussed on the list at
> >https://lore.kernel.org/netdev/51a6e7a33c7d40889c80bf37159f210e@intel.com/
> >
> >The idea is to allow user space to query whether a given destructive devlink
> >command would work without actually performing any actions. This is commonly
> >referred to as a "dry run", and is intended to give tools and system
> >administrators the ability to test things like flash image validity, or
> >whether a given option is valid without having to risk performing the update
> >when not sufficiently ready.
> >
> >The intention is that all "destructive" commands can be updated to support
> >the new DEVLINK_ATTR_DRY_RUN, although this series only implements it for
> >flash update.
> >
> >I expect we would want to support this for commands such as reload as well
> >as other commands which perform some action with no interface to check state
> >before hand.
> >
> >I tried to implement the DRY_RUN checks along with useful extended ACK
> >messages so that even if a driver does not support DRY_RUN, some useful
> >information can be retrieved. (i.e. the stack indicates that flash update is
> >supported and will validate the other attributes first before rejecting the
> >command due to inability to fully validate the run within the driver).  
> 
> Hmm, old kernel vs. new-userspace, the requested dry-run, won't be
> really dry run. I guess that user might be surprised in that case...

Would it be enough to do a policy dump in user space to check attr is
recognized and add a warning that this is required next to the attr
in the uAPI header?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [net-next 0/4] devlink: add dry run support for flash update
  2021-10-08 12:37 ` [net-next 0/4] devlink: add dry run support for flash update Jiri Pirko
  2021-10-08 18:21   ` Jakub Kicinski
@ 2021-10-08 21:42   ` Keller, Jacob E
  1 sibling, 0 replies; 17+ messages in thread
From: Keller, Jacob E @ 2021-10-08 21:42 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, Jakub Kicinski



> -----Original Message-----
> From: Jiri Pirko <jiri@resnulli.us>
> Sent: Friday, October 08, 2021 5:38 AM
> To: Keller, Jacob E <jacob.e.keller@intel.com>
> Cc: netdev@vger.kernel.org; Jakub Kicinski <kubakici@wp.pl>
> Subject: Re: [net-next 0/4] devlink: add dry run support for flash update
> 
> Fri, Oct 08, 2021 at 12:41:11PM CEST, jacob.e.keller@intel.com wrote:
> >This is an implementation of a previous idea I had discussed on the list at
> >https://lore.kernel.org/netdev/51a6e7a33c7d40889c80bf37159f210e@intel.co
> m/
> >
> >The idea is to allow user space to query whether a given destructive devlink
> >command would work without actually performing any actions. This is
> commonly
> >referred to as a "dry run", and is intended to give tools and system
> >administrators the ability to test things like flash image validity, or
> >whether a given option is valid without having to risk performing the update
> >when not sufficiently ready.
> >
> >The intention is that all "destructive" commands can be updated to support
> >the new DEVLINK_ATTR_DRY_RUN, although this series only implements it for
> >flash update.
> >
> >I expect we would want to support this for commands such as reload as well
> >as other commands which perform some action with no interface to check state
> >before hand.
> >
> >I tried to implement the DRY_RUN checks along with useful extended ACK
> >messages so that even if a driver does not support DRY_RUN, some useful
> >information can be retrieved. (i.e. the stack indicates that flash update is
> >supported and will validate the other attributes first before rejecting the
> >command due to inability to fully validate the run within the driver).
> 
> Hmm, old kernel vs. new-userspace, the requested dry-run, won't be
> really dry run. I guess that user might be surprised in that case...
> 

old kernel should reject the command with an invalid attribute entirely, no?

Thanks,
Jake


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [net-next 0/4] devlink: add dry run support for flash update
  2021-10-08 18:21   ` Jakub Kicinski
@ 2021-10-08 21:43     ` Keller, Jacob E
  2021-10-08 22:35       ` Jakub Kicinski
  2021-10-11  8:21     ` Keller, Jacob E
  1 sibling, 1 reply; 17+ messages in thread
From: Keller, Jacob E @ 2021-10-08 21:43 UTC (permalink / raw)
  To: Jakub Kicinski, Jiri Pirko; +Cc: netdev



> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Friday, October 08, 2021 11:22 AM
> To: Jiri Pirko <jiri@resnulli.us>
> Cc: Keller, Jacob E <jacob.e.keller@intel.com>; netdev@vger.kernel.org
> Subject: Re: [net-next 0/4] devlink: add dry run support for flash update
> 
> On Fri, 8 Oct 2021 14:37:37 +0200 Jiri Pirko wrote:
> > Fri, Oct 08, 2021 at 12:41:11PM CEST, jacob.e.keller@intel.com wrote:
> > >This is an implementation of a previous idea I had discussed on the list at
> >
> >https://lore.kernel.org/netdev/51a6e7a33c7d40889c80bf37159f210e@intel.co
> m/
> > >
> > >The idea is to allow user space to query whether a given destructive devlink
> > >command would work without actually performing any actions. This is
> commonly
> > >referred to as a "dry run", and is intended to give tools and system
> > >administrators the ability to test things like flash image validity, or
> > >whether a given option is valid without having to risk performing the update
> > >when not sufficiently ready.
> > >
> > >The intention is that all "destructive" commands can be updated to support
> > >the new DEVLINK_ATTR_DRY_RUN, although this series only implements it for
> > >flash update.
> > >
> > >I expect we would want to support this for commands such as reload as well
> > >as other commands which perform some action with no interface to check
> state
> > >before hand.
> > >
> > >I tried to implement the DRY_RUN checks along with useful extended ACK
> > >messages so that even if a driver does not support DRY_RUN, some useful
> > >information can be retrieved. (i.e. the stack indicates that flash update is
> > >supported and will validate the other attributes first before rejecting the
> > >command due to inability to fully validate the run within the driver).
> >
> > Hmm, old kernel vs. new-userspace, the requested dry-run, won't be
> > really dry run. I guess that user might be surprised in that case...
> 
> Would it be enough to do a policy dump in user space to check attr is
> recognized and add a warning that this is required next to the attr
> in the uAPI header?

Doesn't the policy checks prevent any unknown attributes? Or are unknown attributes silently ignored?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [net-next 0/4] devlink: add dry run support for flash update
  2021-10-08 21:43     ` Keller, Jacob E
@ 2021-10-08 22:35       ` Jakub Kicinski
  2021-10-08 23:58         ` Keller, Jacob E
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2021-10-08 22:35 UTC (permalink / raw)
  To: Keller, Jacob E; +Cc: Jiri Pirko, netdev

On Fri, 8 Oct 2021 21:43:32 +0000 Keller, Jacob E wrote:
> > > Hmm, old kernel vs. new-userspace, the requested dry-run, won't be
> > > really dry run. I guess that user might be surprised in that case...  
> > 
> > Would it be enough to do a policy dump in user space to check attr is
> > recognized and add a warning that this is required next to the attr
> > in the uAPI header?  
> 
> Doesn't the policy checks prevent any unknown attributes? 
> Or are unknown attributes silently ignored?

Did you test it?

DEVLINK_CMD_FLASH_UPDATE has GENL_DONT_VALIDATE_STRICT set.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [net-next 0/4] devlink: add dry run support for flash update
  2021-10-08 22:35       ` Jakub Kicinski
@ 2021-10-08 23:58         ` Keller, Jacob E
  2021-10-09  0:17           ` Jakub Kicinski
  0 siblings, 1 reply; 17+ messages in thread
From: Keller, Jacob E @ 2021-10-08 23:58 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jiri Pirko, netdev



> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Friday, October 08, 2021 3:36 PM
> To: Keller, Jacob E <jacob.e.keller@intel.com>
> Cc: Jiri Pirko <jiri@resnulli.us>; netdev@vger.kernel.org
> Subject: Re: [net-next 0/4] devlink: add dry run support for flash update
> 
> On Fri, 8 Oct 2021 21:43:32 +0000 Keller, Jacob E wrote:
> > > > Hmm, old kernel vs. new-userspace, the requested dry-run, won't be
> > > > really dry run. I guess that user might be surprised in that case...
> > >
> > > Would it be enough to do a policy dump in user space to check attr is
> > > recognized and add a warning that this is required next to the attr
> > > in the uAPI header?
> >
> > Doesn't the policy checks prevent any unknown attributes?
> > Or are unknown attributes silently ignored?
> 
> Did you test it?
> 
> DEVLINK_CMD_FLASH_UPDATE has GENL_DONT_VALIDATE_STRICT set.

Hmm. I did run into an issue while initially testing where DEVLINK_ATTR_DRY_RUN wasn't in the devlink_nla_policy table and it rejected the command with an unknown attribute. I had to add the attribute to the policy table to fix this.

I'm double checking on a different kernel now with the new userspace to see if I get the same behavior.

I'm not super familiar with the validation code or what GENL_DONT_VALIDATE_STRICT means...

Indeed.. I just did a search for GENL_DONT_VALIDATE_STRICT and the only uses I can find are ones which *set* the flag. Nothing ever checks it!!

I suspect it got removed at some point.. still digging into exact history though...

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [net-next 0/4] devlink: add dry run support for flash update
  2021-10-08 23:58         ` Keller, Jacob E
@ 2021-10-09  0:17           ` Jakub Kicinski
  2021-10-09  0:32             ` Keller, Jacob E
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2021-10-09  0:17 UTC (permalink / raw)
  To: Keller, Jacob E; +Cc: Jiri Pirko, netdev

On Fri, 8 Oct 2021 23:58:45 +0000 Keller, Jacob E wrote:
> > > Doesn't the policy checks prevent any unknown attributes?
> > > Or are unknown attributes silently ignored?  
> > 
> > Did you test it?
> > 
> > DEVLINK_CMD_FLASH_UPDATE has GENL_DONT_VALIDATE_STRICT set.  
> 
> Hmm. I did run into an issue while initially testing where
> DEVLINK_ATTR_DRY_RUN wasn't in the devlink_nla_policy table and it
> rejected the command with an unknown attribute. I had to add the
> attribute to the policy table to fix this.
> 
> I'm double checking on a different kernel now with the new userspace
> to see if I get the same behavior.

Weird.
 
> I'm not super familiar with the validation code or what
> GENL_DONT_VALIDATE_STRICT means...
> 
> Indeed.. I just did a search for GENL_DONT_VALIDATE_STRICT and the
> only uses I can find are ones which *set* the flag. Nothing ever
> checks it!!
> 
> I suspect it got removed at some point.. still digging into exact
> history though...


 It's passed by genl_family_rcv_msg_doit() to
genl_family_rcv_msg_attrs_parse() where it chooses the netlink policy.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [net-next 0/4] devlink: add dry run support for flash update
  2021-10-09  0:17           ` Jakub Kicinski
@ 2021-10-09  0:32             ` Keller, Jacob E
  2021-10-09  1:29               ` Jakub Kicinski
  0 siblings, 1 reply; 17+ messages in thread
From: Keller, Jacob E @ 2021-10-09  0:32 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jiri Pirko, netdev



> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Friday, October 08, 2021 5:18 PM
> To: Keller, Jacob E <jacob.e.keller@intel.com>
> Cc: Jiri Pirko <jiri@resnulli.us>; netdev@vger.kernel.org
> Subject: Re: [net-next 0/4] devlink: add dry run support for flash update
> 
> On Fri, 8 Oct 2021 23:58:45 +0000 Keller, Jacob E wrote:
> > > > Doesn't the policy checks prevent any unknown attributes?
> > > > Or are unknown attributes silently ignored?
> > >
> > > Did you test it?
> > >
> > > DEVLINK_CMD_FLASH_UPDATE has GENL_DONT_VALIDATE_STRICT set.
> >
> > Hmm. I did run into an issue while initially testing where
> > DEVLINK_ATTR_DRY_RUN wasn't in the devlink_nla_policy table and it
> > rejected the command with an unknown attribute. I had to add the
> > attribute to the policy table to fix this.
> >
> > I'm double checking on a different kernel now with the new userspace
> > to see if I get the same behavior.
> 
> Weird.
> 
> > I'm not super familiar with the validation code or what
> > GENL_DONT_VALIDATE_STRICT means...
> >
> > Indeed.. I just did a search for GENL_DONT_VALIDATE_STRICT and the
> > only uses I can find are ones which *set* the flag. Nothing ever
> > checks it!!
> >
> > I suspect it got removed at some point.. still digging into exact
> > history though...
> 
> 
>  It's passed by genl_family_rcv_msg_doit() to
> genl_family_rcv_msg_attrs_parse() where it chooses the netlink policy.

Ah.. I see how its done. It's passed as the argument so you  don't see a direct comparison which makes it look like there isn't one... Feels like there could probably be a better abstraction that was more readable here...

Anyways. I'll confirm what happens on the kernel that doesn't have the attribute defined at all.

I wonder if the thing I saw differently was because the attribute *was* known but wasn't in policy. I.e. because it was defined it was validated....

Yep, I confirm that on a kernel without the DRY_RUN flag that it would allow the run because we aren't being strict.

I am guessing that we can't convert devlink over to strict validation?

Thanks,
Jake

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [net-next 0/4] devlink: add dry run support for flash update
  2021-10-09  0:32             ` Keller, Jacob E
@ 2021-10-09  1:29               ` Jakub Kicinski
  2022-04-25 23:05                 ` Jacob Keller
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2021-10-09  1:29 UTC (permalink / raw)
  To: Keller, Jacob E; +Cc: Jiri Pirko, netdev

On Sat, 9 Oct 2021 00:32:49 +0000 Keller, Jacob E wrote:
> Ah.. I see how its done. It's passed as the argument so you  don't
> see a direct comparison which makes it look like there isn't one...
> Feels like there could probably be a better abstraction that was more
> readable here...
> 
> Anyways. I'll confirm what happens on the kernel that doesn't have
> the attribute defined at all.
> 
> I wonder if the thing I saw differently was because the attribute
> *was* known but wasn't in policy. I.e. because it was defined it was
> validated....
> 
> Yep, I confirm that on a kernel without the DRY_RUN flag that it
> would allow the run because we aren't being strict.
> 
> I am guessing that we can't convert devlink over to strict validation?

I think the current best practice is not to opt-in commands which
started out as non-strict into strict validation. That said opting 
it in for MAXTYPE validation seems reasonable to me.

Alternatively, as I said, you can just check the max attr for the
family in user space. CTRL_CMD_GETFAMILY returns it as part of family
info (CTRL_ATTR_MAXATTR). We can make user space do the rejecting.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [net-next 0/4] devlink: add dry run support for flash update
  2021-10-08 18:21   ` Jakub Kicinski
  2021-10-08 21:43     ` Keller, Jacob E
@ 2021-10-11  8:21     ` Keller, Jacob E
  2021-10-11 23:21       ` Keller, Jacob E
  1 sibling, 1 reply; 17+ messages in thread
From: Keller, Jacob E @ 2021-10-11  8:21 UTC (permalink / raw)
  To: Jakub Kicinski, Jiri Pirko; +Cc: netdev

> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Friday, October 08, 2021 11:22 AM
> To: Jiri Pirko <jiri@resnulli.us>
> Cc: Keller, Jacob E <jacob.e.keller@intel.com>; netdev@vger.kernel.org
> Subject: Re: [net-next 0/4] devlink: add dry run support for flash update
> 
> On Fri, 8 Oct 2021 14:37:37 +0200 Jiri Pirko wrote:
> > Fri, Oct 08, 2021 at 12:41:11PM CEST, jacob.e.keller@intel.com wrote:
> > >This is an implementation of a previous idea I had discussed on the list at
> >
> >https://lore.kernel.org/netdev/51a6e7a33c7d40889c80bf37159f210e@intel.co
> m/
> > >
> > >The idea is to allow user space to query whether a given destructive devlink
> > >command would work without actually performing any actions. This is
> commonly
> > >referred to as a "dry run", and is intended to give tools and system
> > >administrators the ability to test things like flash image validity, or
> > >whether a given option is valid without having to risk performing the update
> > >when not sufficiently ready.
> > >
> > >The intention is that all "destructive" commands can be updated to support
> > >the new DEVLINK_ATTR_DRY_RUN, although this series only implements it for
> > >flash update.
> > >
> > >I expect we would want to support this for commands such as reload as well
> > >as other commands which perform some action with no interface to check
> state
> > >before hand.
> > >
> > >I tried to implement the DRY_RUN checks along with useful extended ACK
> > >messages so that even if a driver does not support DRY_RUN, some useful
> > >information can be retrieved. (i.e. the stack indicates that flash update is
> > >supported and will validate the other attributes first before rejecting the
> > >command due to inability to fully validate the run within the driver).
> >
> > Hmm, old kernel vs. new-userspace, the requested dry-run, won't be
> > really dry run. I guess that user might be surprised in that case...
> 
> Would it be enough to do a policy dump in user space to check attr is
> recognized and add a warning that this is required next to the attr
> in the uAPI header?

I'd be more in favor of converting either this specific op (or devlink as a whole) to strict checking. I think that most of the devlink commands and ops would function better if unknown behaviors were rejected rather than ignored.

If we prefer to avoid that due to historically not being strict, I'm ok with a userspace check. It does complicate the userspace a bit more, but I agree that especially for dry_run we don't want it accidentally updating when a dry run was expected.

There is some maintenance cost to switching to strict checking but I think it's pretty minimal because the strict checking simply prevents the unknown attributes from being ignored.

I'm happy to go either direction if we get consensus on this thread though.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [net-next 0/4] devlink: add dry run support for flash update
  2021-10-11  8:21     ` Keller, Jacob E
@ 2021-10-11 23:21       ` Keller, Jacob E
  0 siblings, 0 replies; 17+ messages in thread
From: Keller, Jacob E @ 2021-10-11 23:21 UTC (permalink / raw)
  To: Keller, Jacob E, Jakub Kicinski, Jiri Pirko; +Cc: netdev



> -----Original Message-----
> From: Keller, Jacob E <jacob.e.keller@intel.com>
> Sent: Monday, October 11, 2021 1:22 AM
> To: Jakub Kicinski <kuba@kernel.org>; Jiri Pirko <jiri@resnulli.us>
> Cc: netdev@vger.kernel.org
> Subject: RE: [net-next 0/4] devlink: add dry run support for flash update
> 
> > -----Original Message-----
> > From: Jakub Kicinski <kuba@kernel.org>
> > Sent: Friday, October 08, 2021 11:22 AM
> > To: Jiri Pirko <jiri@resnulli.us>
> > Cc: Keller, Jacob E <jacob.e.keller@intel.com>; netdev@vger.kernel.org
> > Subject: Re: [net-next 0/4] devlink: add dry run support for flash update
> >
> > On Fri, 8 Oct 2021 14:37:37 +0200 Jiri Pirko wrote:
> > > Fri, Oct 08, 2021 at 12:41:11PM CEST, jacob.e.keller@intel.com wrote:
> > > >This is an implementation of a previous idea I had discussed on the list at
> > >
> > >https://lore.kernel.org/netdev/51a6e7a33c7d40889c80bf37159f210e@intel.co
> > m/
> > > >
> > > >The idea is to allow user space to query whether a given destructive devlink
> > > >command would work without actually performing any actions. This is
> > commonly
> > > >referred to as a "dry run", and is intended to give tools and system
> > > >administrators the ability to test things like flash image validity, or
> > > >whether a given option is valid without having to risk performing the update
> > > >when not sufficiently ready.
> > > >
> > > >The intention is that all "destructive" commands can be updated to support
> > > >the new DEVLINK_ATTR_DRY_RUN, although this series only implements it
> for
> > > >flash update.
> > > >
> > > >I expect we would want to support this for commands such as reload as well
> > > >as other commands which perform some action with no interface to check
> > state
> > > >before hand.
> > > >
> > > >I tried to implement the DRY_RUN checks along with useful extended ACK
> > > >messages so that even if a driver does not support DRY_RUN, some useful
> > > >information can be retrieved. (i.e. the stack indicates that flash update is
> > > >supported and will validate the other attributes first before rejecting the
> > > >command due to inability to fully validate the run within the driver).
> > >
> > > Hmm, old kernel vs. new-userspace, the requested dry-run, won't be
> > > really dry run. I guess that user might be surprised in that case...
> >
> > Would it be enough to do a policy dump in user space to check attr is
> > recognized and add a warning that this is required next to the attr
> > in the uAPI header?
> 
> I'd be more in favor of converting either this specific op (or devlink as a whole) to
> strict checking. I think that most of the devlink commands and ops would
> function better if unknown behaviors were rejected rather than ignored.
> 
> If we prefer to avoid that due to historically not being strict, I'm ok with a
> userspace check. It does complicate the userspace a bit more, but I agree that
> especially for dry_run we don't want it accidentally updating when a dry run was
> expected.
> 
> There is some maintenance cost to switching to strict checking but I think it's
> pretty minimal because the strict checking simply prevents the unknown
> attributes from being ignored.
> 
> I'm happy to go either direction if we get consensus on this thread though.
> 
> Thanks,
> Jake

The ice changes in this patch conflict with similar work that I posted on IWL to cleanup some things for devlink reload support, so I'll probably wait to re-submit this until those changes make it through IWL and onto the list proper.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [net-next 0/4] devlink: add dry run support for flash update
  2021-10-09  1:29               ` Jakub Kicinski
@ 2022-04-25 23:05                 ` Jacob Keller
  0 siblings, 0 replies; 17+ messages in thread
From: Jacob Keller @ 2022-04-25 23:05 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jiri Pirko, netdev



On 10/8/2021 6:29 PM, Jakub Kicinski wrote:
> On Sat, 9 Oct 2021 00:32:49 +0000 Keller, Jacob E wrote:

Heh. It is frustrating how easily things slip through the cracks. I
dropped the ball on this.

> I think the current best practice is not to opt-in commands which
> started out as non-strict into strict validation. That said opting 
> it in for MAXTYPE validation seems reasonable to me.
> 

Opting in would only help future kernels, and would obviously not help
existing kernels which currently do not have strict validation. I
personally would prefer to migrate to strict matching, but understand
the concerns with breaking existing usage.

> Alternatively, as I said, you can just check the max attr for the
> family in user space. CTRL_CMD_GETFAMILY returns it as part of family
> info (CTRL_ATTR_MAXATTR). We can make user space do the rejecting.
> 

Yea I think this is the right approach because its the only way for new
userspace to properly protect against using this on an old kernel.

I'll update the iproute2 patches to do that.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-04-25 23:05 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-08 10:41 [net-next 0/4] devlink: add dry run support for flash update Jacob Keller
2021-10-08 10:41 ` [net-next 1/4] ice: move and rename ice_check_for_pending_update Jacob Keller
2021-10-08 10:41 ` [net-next 2/4] ice: move ice_devlink_flash_update and merge with ice_flash_pldm_image Jacob Keller
2021-10-08 10:41 ` [net-next 3/4] devlink: add dry run attribute to flash update Jacob Keller
2021-10-08 10:41 ` [net-next 4/4] ice: support dry run of a flash update to validate firmware file Jacob Keller
2021-10-08 12:37 ` [net-next 0/4] devlink: add dry run support for flash update Jiri Pirko
2021-10-08 18:21   ` Jakub Kicinski
2021-10-08 21:43     ` Keller, Jacob E
2021-10-08 22:35       ` Jakub Kicinski
2021-10-08 23:58         ` Keller, Jacob E
2021-10-09  0:17           ` Jakub Kicinski
2021-10-09  0:32             ` Keller, Jacob E
2021-10-09  1:29               ` Jakub Kicinski
2022-04-25 23:05                 ` Jacob Keller
2021-10-11  8:21     ` Keller, Jacob E
2021-10-11 23:21       ` Keller, Jacob E
2021-10-08 21:42   ` Keller, Jacob E

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.