All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] add Intel FPGA image reload support
@ 2022-08-08  5:33 Tianfei Zhang
  2022-08-08  5:33 ` [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images Tianfei Zhang
  2022-08-08  5:33 ` [PATCH v3 2/2] fpga: m10bmc-sec: add m10bmc_sec_retimer_load callback Tianfei Zhang
  0 siblings, 2 replies; 11+ messages in thread
From: Tianfei Zhang @ 2022-08-08  5:33 UTC (permalink / raw)
  To: mdf, yilun.xu, linux-fpga, lee.jones, russell.h.weight
  Cc: hao.wu, trix, Tianfei Zhang

This patchset adds FPGA image reload support on Intel PAC N3000 Card.
Image reload means that a reload mechanism of an FPGA, BMC, or
firmware image from FLASH or EEPROM after flash or program the images
without power cycle the server.

This patchset introduces 2 new sysfs files for query the available
images and trigger a image reload.

patch 1: add the available_images and image_load sysfs files.
Write a key word into image_load sysfs file to trigger a reload of an
FPGA, BMC, or firmware image from FLASH or EEPROM.
patch 2: add a trigger to update a new Retimer firmware.

v3:
From Lee Jones's comment, uses regmap_update_bits() API instead of
the wrapper of m10bmc_sys_update_bits().
v2:
add more detail about how to use the image_load sysfs files in ABI
documentation.

Russ Weight (2):
  fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
  fpga: m10bmc-sec: add m10bmc_sec_retimer_load callback

 .../sysfs-driver-intel-m10-bmc-sec-update     |  34 +++
 drivers/fpga/intel-m10-bmc-sec-update.c       | 253 ++++++++++++++++++
 include/linux/mfd/intel-m10-bmc.h             |  31 +++
 3 files changed, 318 insertions(+)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
  2022-08-08  5:33 [PATCH v3 0/2] add Intel FPGA image reload support Tianfei Zhang
@ 2022-08-08  5:33 ` Tianfei Zhang
  2022-08-08  9:57   ` Xu Yilun
  2022-08-08  5:33 ` [PATCH v3 2/2] fpga: m10bmc-sec: add m10bmc_sec_retimer_load callback Tianfei Zhang
  1 sibling, 1 reply; 11+ messages in thread
From: Tianfei Zhang @ 2022-08-08  5:33 UTC (permalink / raw)
  To: mdf, yilun.xu, linux-fpga, lee.jones, russell.h.weight
  Cc: hao.wu, trix, Tianfei Zhang

From: Russ Weight <russell.h.weight@intel.com>

Add the available_images and image_load sysfs files. The available_images
file returns a space separated list of key words that may be written
into the image_load file. These keywords describe an FPGA, BMC, or
firmware image in FLASH or EEPROM storage that may be loaded.

The image_load sysfs file may be written with a key word to trigger a
reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.

Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
---
v3:
uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
v2:
adds the steps for how to use the "image_load" sysfs file.
---
 .../sysfs-driver-intel-m10-bmc-sec-update     |  34 ++++++
 drivers/fpga/intel-m10-bmc-sec-update.c       | 105 ++++++++++++++++++
 2 files changed, 139 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
index 0a41afe0ab4c..3d8f04ca6f1b 100644
--- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
+++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
@@ -59,3 +59,37 @@ Contact:	Russ Weight <russell.h.weight@intel.com>
 Description:	Read only. Returns number of times the secure update
 		staging area has been flashed.
 		Format: "%u".
+
+What:		/sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/available_images
+Date:		July 2022
+KernelVersion:  5.20
+Contact:	Russ Weight <russell.h.weight@intel.com>
+Description:	Read-only. This file returns a space separated list of
+		key words that may be written into the image_load file
+		described below. These keywords decribe an FPGA, BMC,
+		or firmware image in FLASH or EEPROM storage that may
+		be loaded.
+
+What:		/sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/image_load
+Date:		July 2022
+KernelVersion:  5.20
+Contact:	Russ Weight <russell.h.weight@intel.com>
+Description:	Write-only. A key word may be written to this file to
+		trigger a reload of an FPGA, BMC, or firmware image from
+		FLASH or EEPROM. Refer to the available_images file for a
+		list of supported key words for the underlying device.
+		Writing an unsupported string to this file will result in
+		EINVAL being returned.
+		It should remove all of resources related to the old FPGA/BMC
+		image before trigger the image reload otherwise the host system
+		may crash. We recommended that follow the below steps or directly
+		use the OPAE RSU script to perform the reload for FPGA/BMC image.
+		Here is the steps to trigger the reload for FPGA/BMC image:
+		1. disable the AER of the FPGA card.
+		2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
+		3. trigger image reload via "image_load" sysfs file.
+		4. remove all of PCI devices of the FPGA card via
+		"/sys/bus/pci/devices/xxxx/remove" sysfs file.
+		5. wait 10 seconds.
+		6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
+	        7. enable the AER of the FPGA card.
diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-bmc-sec-update.c
index 72c677c910de..3a082911cf67 100644
--- a/drivers/fpga/intel-m10-bmc-sec-update.c
+++ b/drivers/fpga/intel-m10-bmc-sec-update.c
@@ -14,6 +14,8 @@
 #include <linux/platform_device.h>
 #include <linux/slab.h>
 
+struct image_load;
+
 struct m10bmc_sec {
 	struct device *dev;
 	struct intel_m10bmc *m10bmc;
@@ -21,6 +23,12 @@ struct m10bmc_sec {
 	char *fw_name;
 	u32 fw_name_id;
 	bool cancel_request;
+	struct image_load *image_load;	/* terminated with { } member */
+};
+
+struct image_load {
+	const char *name;
+	int (*load_image)(struct m10bmc_sec *sec);
 };
 
 static DEFINE_XARRAY_ALLOC(fw_upload_xa);
@@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR + CSK_VEC_OFFSET);
 
 #define FLASH_COUNT_SIZE 4096	/* count stored as inverted bit vector */
 
+static ssize_t available_images_show(struct device *dev,
+				     struct device_attribute *attr, char *buf)
+{
+	struct m10bmc_sec *sec = dev_get_drvdata(dev);
+	const struct image_load *hndlr;
+	ssize_t count = 0;
+
+	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
+		count += scnprintf(buf + count, PAGE_SIZE - count,
+				   "%s ", hndlr->name);
+	}
+
+	buf[count - 1] = '\n';
+
+	return count;
+}
+static DEVICE_ATTR_RO(available_images);
+
+static ssize_t image_load_store(struct device *dev,
+				struct device_attribute *attr,
+				const char *buf, size_t count)
+{
+	struct m10bmc_sec *sec = dev_get_drvdata(dev);
+	const struct image_load *hndlr;
+	int ret = -EINVAL;
+
+	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
+		if (sysfs_streq(buf, hndlr->name)) {
+			ret = hndlr->load_image(sec);
+			break;
+		}
+	}
+
+	return ret ? : count;
+}
+static DEVICE_ATTR_WO(image_load);
+
+static struct attribute *m10bmc_control_attrs[] = {
+	&dev_attr_available_images.attr,
+	&dev_attr_image_load.attr,
+	NULL,
+};
+
+static struct attribute_group m10bmc_control_attr_group = {
+	.name = "control",
+	.attrs = m10bmc_control_attrs,
+};
+
 static ssize_t flash_count_show(struct device *dev,
 				struct device_attribute *attr, char *buf)
 {
@@ -195,6 +251,7 @@ static struct attribute_group m10bmc_security_attr_group = {
 
 static const struct attribute_group *m10bmc_sec_attr_groups[] = {
 	&m10bmc_security_attr_group,
+	&m10bmc_control_attr_group,
 	NULL,
 };
 
@@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec, u32 doorbell)
 		dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result);
 }
 
+static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
+				     unsigned int val)
+{
+	u32 doorbell;
+	int ret;
+
+	if (val > 1) {
+		dev_err(sec->dev, "invalid reload val = %d\n", val);
+		return -EINVAL;
+	}
+
+	ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
+	if (ret)
+		return ret;
+
+	if (doorbell & DRBL_REBOOT_DISABLED)
+		return -EBUSY;
+
+	return regmap_update_bits(sec->m10bmc->regmap,
+				  M10BMC_SYS_BASE + M10BMC_DOORBELL,
+				  DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
+				  FIELD_PREP(DRBL_CONFIG_SEL, val) |
+				  DRBL_REBOOT_REQ);
+}
+
+static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec)
+{
+	return m10bmc_sec_bmc_image_load(sec, 0);
+}
+
+static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
+{
+	return m10bmc_sec_bmc_image_load(sec, 1);
+}
+
+static struct image_load m10bmc_image_load_hndlrs[] = {
+	{
+		.name = "bmc_factory",
+		.load_image = m10bmc_sec_bmc_image_load_1,
+	},
+	{
+		.name = "bmc_user",
+		.load_image = m10bmc_sec_bmc_image_load_0,
+	},
+	{}
+};
+
 static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
 {
 	u32 doorbell;
@@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device *pdev)
 	sec->dev = &pdev->dev;
 	sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
 	dev_set_drvdata(&pdev->dev, sec);
+	sec->image_load = m10bmc_image_load_hndlrs;
 
 	ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
 		       xa_limit_32b, GFP_KERNEL);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 2/2] fpga: m10bmc-sec: add m10bmc_sec_retimer_load callback
  2022-08-08  5:33 [PATCH v3 0/2] add Intel FPGA image reload support Tianfei Zhang
  2022-08-08  5:33 ` [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images Tianfei Zhang
@ 2022-08-08  5:33 ` Tianfei Zhang
  2022-08-08 16:22   ` Lee Jones
  1 sibling, 1 reply; 11+ messages in thread
From: Tianfei Zhang @ 2022-08-08  5:33 UTC (permalink / raw)
  To: mdf, yilun.xu, linux-fpga, lee.jones, russell.h.weight
  Cc: hao.wu, trix, Tianfei Zhang

From: Russ Weight <russell.h.weight@intel.com>

Create m10bmc_sec_retimer_load() callback function
to provide a trigger to update a new retimer (Intel
C827 Ethernet transceiver) firmware on Intel PAC
N3000 Card.

Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
---
v3:
uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
---
 drivers/fpga/intel-m10-bmc-sec-update.c | 148 ++++++++++++++++++++++++
 include/linux/mfd/intel-m10-bmc.h       |  31 +++++
 2 files changed, 179 insertions(+)

diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-bmc-sec-update.c
index 3a082911cf67..bef07e97c107 100644
--- a/drivers/fpga/intel-m10-bmc-sec-update.c
+++ b/drivers/fpga/intel-m10-bmc-sec-update.c
@@ -300,6 +300,150 @@ static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
 	return m10bmc_sec_bmc_image_load(sec, 1);
 }
 
+static int trigger_retimer_eeprom_load(struct m10bmc_sec *sec)
+{
+	struct intel_m10bmc *m10bmc = sec->m10bmc;
+	unsigned int val;
+	int ret;
+
+	ret = regmap_update_bits(m10bmc->regmap,
+				 M10BMC_SYS_BASE + M10BMC_DOORBELL,
+				 DRBL_PKVL_EEPROM_LOAD_SEC,
+				 DRBL_PKVL_EEPROM_LOAD_SEC);
+	if (ret)
+		return ret;
+
+	/*
+	 * If the current NIOS FW supports this retimer update feature, then
+	 * it will clear the same PKVL_EEPROM_LOAD bit in 2 seconds. Otherwise
+	 * the driver needs to clear the PKVL_EEPROM_LOAD bit manually and
+	 * return an error code.
+	 */
+	ret = regmap_read_poll_timeout(m10bmc->regmap,
+				       M10BMC_SYS_BASE + M10BMC_DOORBELL, val,
+				       (!(val & DRBL_PKVL_EEPROM_LOAD_SEC)),
+				       M10BMC_PKVL_LOAD_INTERVAL_US,
+				       M10BMC_PKVL_LOAD_TIMEOUT_US);
+	if (ret == -ETIMEDOUT) {
+		dev_err(sec->dev, "PKVL_EEPROM_LOAD clear timedout\n");
+		regmap_update_bits(m10bmc->regmap,
+				   M10BMC_SYS_BASE + M10BMC_DOORBELL,
+				   DRBL_PKVL_EEPROM_LOAD_SEC, 0);
+		ret = -ENODEV;
+	} else if (ret) {
+		dev_err(sec->dev, "poll EEPROM_LOAD error %d\n", ret);
+	}
+
+	return ret;
+}
+
+static int poll_retimer_eeprom_load_done(struct m10bmc_sec *sec)
+{
+	struct intel_m10bmc *m10bmc = sec->m10bmc;
+	unsigned int doorbell;
+	int ret;
+
+	/*
+	 * RSU_STAT_PKVL_REJECT indicates that the current image is
+	 * already programmed. RSU_PROG_PKVL_PROM_DONE that the firmware
+	 * update process has finished, but does not necessarily indicate
+	 * a successful update.
+	 */
+	ret = regmap_read_poll_timeout(m10bmc->regmap,
+				       M10BMC_SYS_BASE + M10BMC_DOORBELL,
+				       doorbell,
+				       ((rsu_prog(doorbell) ==
+					 RSU_PROG_PKVL_PROM_DONE) ||
+					(rsu_stat(doorbell) ==
+					 RSU_STAT_PKVL_REJECT)),
+				       M10BMC_PKVL_PRELOAD_INTERVAL_US,
+				       M10BMC_PKVL_PRELOAD_TIMEOUT_US);
+	if (ret) {
+		if (ret == -ETIMEDOUT)
+			dev_err(sec->dev,
+				"Doorbell check timedout: 0x%08x\n", doorbell);
+		else
+			dev_err(sec->dev, "poll Doorbell error\n");
+		return ret;
+	}
+
+	if (rsu_stat(doorbell) == RSU_STAT_PKVL_REJECT) {
+		dev_err(sec->dev, "duplicate image rejected\n");
+		return -ECANCELED;
+	}
+
+	return 0;
+}
+
+static int poll_retimer_preload_done(struct m10bmc_sec *sec)
+{
+	struct intel_m10bmc *m10bmc = sec->m10bmc;
+	unsigned int val;
+	int ret;
+
+	/*
+	 * Wait for the updated firmware to be loaded by the PKVL device
+	 * and confirm that the updated firmware is operational
+	 */
+	ret = regmap_read_poll_timeout(m10bmc->regmap,
+				       M10BMC_SYS_BASE + M10BMC_PKVL_POLL_CTRL, val,
+				       ((val & M10BMC_PKVL_PRELOAD) == M10BMC_PKVL_PRELOAD),
+				       M10BMC_PKVL_PRELOAD_INTERVAL_US,
+				       M10BMC_PKVL_PRELOAD_TIMEOUT_US);
+	if (ret) {
+		dev_err(sec->dev, "poll M10BMC_PKVL_PRELOAD error %d\n", ret);
+		return ret;
+	}
+
+	if ((val & M10BMC_PKVL_UPG_STATUS_MASK) != M10BMC_PKVL_UPG_STATUS_GOOD) {
+		dev_err(sec->dev, "error detected during upgrade\n");
+		return -EIO;
+	}
+
+	return 0;
+}
+
+static int retimer_check_idle(struct m10bmc_sec *sec)
+{
+	u32 doorbell;
+	int ret;
+
+	ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
+	if (ret)
+		return -EIO;
+
+	if (rsu_prog(doorbell) != RSU_PROG_IDLE &&
+	    rsu_prog(doorbell) != RSU_PROG_RSU_DONE &&
+	    rsu_prog(doorbell) != RSU_PROG_PKVL_PROM_DONE) {
+		log_error_regs(sec, doorbell);
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
+static int m10bmc_sec_retimer_eeprom_load(struct m10bmc_sec *sec)
+{
+	int ret;
+
+	ret = retimer_check_idle(sec);
+	if (ret)
+		goto exit;
+
+	ret = trigger_retimer_eeprom_load(sec);
+	if (ret)
+		goto exit;
+
+	ret = poll_retimer_eeprom_load_done(sec);
+	if (ret)
+		goto exit;
+
+	ret = poll_retimer_preload_done(sec);
+
+exit:
+	return ret;
+}
+
 static struct image_load m10bmc_image_load_hndlrs[] = {
 	{
 		.name = "bmc_factory",
@@ -309,6 +453,10 @@ static struct image_load m10bmc_image_load_hndlrs[] = {
 		.name = "bmc_user",
 		.load_image = m10bmc_sec_bmc_image_load_0,
 	},
+	{
+		.name = "retimer_fw",
+		.load_image = m10bmc_sec_retimer_eeprom_load,
+	},
 	{}
 };
 
diff --git a/include/linux/mfd/intel-m10-bmc.h b/include/linux/mfd/intel-m10-bmc.h
index f0044b14136e..c06b9d3d1c5d 100644
--- a/include/linux/mfd/intel-m10-bmc.h
+++ b/include/linux/mfd/intel-m10-bmc.h
@@ -36,6 +36,37 @@
 #define M10BMC_VER_PCB_INFO_MSK		GENMASK(31, 24)
 #define M10BMC_VER_LEGACY_INVALID	0xffffffff
 
+/* Retimer related registers, in system register region */
+#define M10BMC_PKVL_POLL_CTRL		0x80
+#define M10BMC_PKVL_A_PRELOAD		BIT(16)
+#define M10BMC_PKVL_A_PRELOAD_TO	BIT(17)
+#define M10BMC_PKVL_A_DATA_TOO_BIG	BIT(18)
+#define M10BMC_PKVL_A_HDR_CKSUM	BIT(20)
+#define M10BMC_PKVL_B_PRELOAD		BIT(24)
+#define M10BMC_PKVL_B_PRELOAD_TO	BIT(25)
+#define M10BMC_PKVL_B_DATA_TOO_BIG	BIT(26)
+#define M10BMC_PKVL_B_HDR_CKSUM	BIT(28)
+
+#define M10BMC_PKVL_PRELOAD		(M10BMC_PKVL_A_PRELOAD | M10BMC_PKVL_B_PRELOAD)
+#define M10BMC_PKVL_PRELOAD_TIMEOUT	(M10BMC_PKVL_A_PRELOAD_TO | \
+					 M10BMC_PKVL_B_PRELOAD_TO)
+#define M10BMC_PKVL_DATA_TOO_BIG	(M10BMC_PKVL_A_DATA_TOO_BIG | \
+					 M10BMC_PKVL_B_DATA_TOO_BIG)
+#define M10BMC_PKVL_HDR_CHECKSUM	(M10BMC_PKVL_A_HDR_CKSUM | \
+					 M10BMC_PKVL_B_HDR_CKSUM)
+
+#define M10BMC_PKVL_UPG_STATUS_MASK	(M10BMC_PKVL_PRELOAD | M10BMC_PKVL_PRELOAD_TIMEOUT |\
+					 M10BMC_PKVL_DATA_TOO_BIG | M10BMC_PKVL_HDR_CHECKSUM)
+#define M10BMC_PKVL_UPG_STATUS_GOOD	(M10BMC_PKVL_PRELOAD | M10BMC_PKVL_HDR_CHECKSUM)
+
+/* interval 100ms and timeout 2s */
+#define M10BMC_PKVL_LOAD_INTERVAL_US	(100 * 1000)
+#define M10BMC_PKVL_LOAD_TIMEOUT_US	(2 * 1000 * 1000)
+
+/* interval 100ms and timeout 30s */
+#define M10BMC_PKVL_PRELOAD_INTERVAL_US	(100 * 1000)
+#define M10BMC_PKVL_PRELOAD_TIMEOUT_US	(30 * 1000 * 1000)
+
 /* Secure update doorbell register, in system register region */
 #define M10BMC_DOORBELL			0x400
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
  2022-08-08  5:33 ` [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images Tianfei Zhang
@ 2022-08-08  9:57   ` Xu Yilun
  2022-08-08 23:39     ` matthew.gerlach
  0 siblings, 1 reply; 11+ messages in thread
From: Xu Yilun @ 2022-08-08  9:57 UTC (permalink / raw)
  To: Tianfei Zhang; +Cc: mdf, linux-fpga, lee.jones, russell.h.weight, hao.wu, trix

On 2022-08-08 at 01:33:16 -0400, Tianfei Zhang wrote:
> From: Russ Weight <russell.h.weight@intel.com>
> 
> Add the available_images and image_load sysfs files. The available_images
> file returns a space separated list of key words that may be written
> into the image_load file. These keywords describe an FPGA, BMC, or
> firmware image in FLASH or EEPROM storage that may be loaded.
> 
> The image_load sysfs file may be written with a key word to trigger a
> reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.
> 
> Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
> ---
> v3:
> uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
> v2:
> adds the steps for how to use the "image_load" sysfs file.
> ---
>  .../sysfs-driver-intel-m10-bmc-sec-update     |  34 ++++++
>  drivers/fpga/intel-m10-bmc-sec-update.c       | 105 ++++++++++++++++++
>  2 files changed, 139 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> index 0a41afe0ab4c..3d8f04ca6f1b 100644
> --- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> @@ -59,3 +59,37 @@ Contact:	Russ Weight <russell.h.weight@intel.com>
>  Description:	Read only. Returns number of times the secure update
>  		staging area has been flashed.
>  		Format: "%u".
> +
> +What:		/sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/available_images
> +Date:		July 2022
> +KernelVersion:  5.20
> +Contact:	Russ Weight <russell.h.weight@intel.com>
> +Description:	Read-only. This file returns a space separated list of
> +		key words that may be written into the image_load file
> +		described below. These keywords decribe an FPGA, BMC,
> +		or firmware image in FLASH or EEPROM storage that may
> +		be loaded.
> +
> +What:		/sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/image_load
> +Date:		July 2022
> +KernelVersion:  5.20
> +Contact:	Russ Weight <russell.h.weight@intel.com>
> +Description:	Write-only. A key word may be written to this file to
> +		trigger a reload of an FPGA, BMC, or firmware image from
> +		FLASH or EEPROM. Refer to the available_images file for a
> +		list of supported key words for the underlying device.
> +		Writing an unsupported string to this file will result in
> +		EINVAL being returned.
> +		It should remove all of resources related to the old FPGA/BMC
> +		image before trigger the image reload otherwise the host system
> +		may crash. We recommended that follow the below steps or directly

I suggest we solve this concern first before other detailed refinements.

My opinion is, don't make the sysfs interface dependent of other user
interfaces, like the following:

> +		use the OPAE RSU script to perform the reload for FPGA/BMC image.
> +		Here is the steps to trigger the reload for FPGA/BMC image:
> +		1. disable the AER of the FPGA card.
> +		2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
> +		3. trigger image reload via "image_load" sysfs file.
> +		4. remove all of PCI devices of the FPGA card via
> +		"/sys/bus/pci/devices/xxxx/remove" sysfs file.
> +		5. wait 10 seconds.
> +		6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
> +	        7. enable the AER of the FPGA card.

It is not a good idea the writing of the sysfs node crashes the system,
if we don't follow the whole steps.

Thanks,
Yilun

> diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-bmc-sec-update.c
> index 72c677c910de..3a082911cf67 100644
> --- a/drivers/fpga/intel-m10-bmc-sec-update.c
> +++ b/drivers/fpga/intel-m10-bmc-sec-update.c
> @@ -14,6 +14,8 @@
>  #include <linux/platform_device.h>
>  #include <linux/slab.h>
>  
> +struct image_load;
> +
>  struct m10bmc_sec {
>  	struct device *dev;
>  	struct intel_m10bmc *m10bmc;
> @@ -21,6 +23,12 @@ struct m10bmc_sec {
>  	char *fw_name;
>  	u32 fw_name_id;
>  	bool cancel_request;
> +	struct image_load *image_load;	/* terminated with { } member */
> +};
> +
> +struct image_load {
> +	const char *name;
> +	int (*load_image)(struct m10bmc_sec *sec);
>  };
>  
>  static DEFINE_XARRAY_ALLOC(fw_upload_xa);
> @@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR + CSK_VEC_OFFSET);
>  
>  #define FLASH_COUNT_SIZE 4096	/* count stored as inverted bit vector */
>  
> +static ssize_t available_images_show(struct device *dev,
> +				     struct device_attribute *attr, char *buf)
> +{
> +	struct m10bmc_sec *sec = dev_get_drvdata(dev);
> +	const struct image_load *hndlr;
> +	ssize_t count = 0;
> +
> +	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> +		count += scnprintf(buf + count, PAGE_SIZE - count,
> +				   "%s ", hndlr->name);
> +	}
> +
> +	buf[count - 1] = '\n';
> +
> +	return count;
> +}
> +static DEVICE_ATTR_RO(available_images);
> +
> +static ssize_t image_load_store(struct device *dev,
> +				struct device_attribute *attr,
> +				const char *buf, size_t count)
> +{
> +	struct m10bmc_sec *sec = dev_get_drvdata(dev);
> +	const struct image_load *hndlr;
> +	int ret = -EINVAL;
> +
> +	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> +		if (sysfs_streq(buf, hndlr->name)) {
> +			ret = hndlr->load_image(sec);
> +			break;
> +		}
> +	}
> +
> +	return ret ? : count;
> +}
> +static DEVICE_ATTR_WO(image_load);
> +
> +static struct attribute *m10bmc_control_attrs[] = {
> +	&dev_attr_available_images.attr,
> +	&dev_attr_image_load.attr,
> +	NULL,
> +};
> +
> +static struct attribute_group m10bmc_control_attr_group = {
> +	.name = "control",
> +	.attrs = m10bmc_control_attrs,
> +};
> +
>  static ssize_t flash_count_show(struct device *dev,
>  				struct device_attribute *attr, char *buf)
>  {
> @@ -195,6 +251,7 @@ static struct attribute_group m10bmc_security_attr_group = {
>  
>  static const struct attribute_group *m10bmc_sec_attr_groups[] = {
>  	&m10bmc_security_attr_group,
> +	&m10bmc_control_attr_group,
>  	NULL,
>  };
>  
> @@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec, u32 doorbell)
>  		dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result);
>  }
>  
> +static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
> +				     unsigned int val)
> +{
> +	u32 doorbell;
> +	int ret;
> +
> +	if (val > 1) {
> +		dev_err(sec->dev, "invalid reload val = %d\n", val);
> +		return -EINVAL;
> +	}
> +
> +	ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
> +	if (ret)
> +		return ret;
> +
> +	if (doorbell & DRBL_REBOOT_DISABLED)
> +		return -EBUSY;
> +
> +	return regmap_update_bits(sec->m10bmc->regmap,
> +				  M10BMC_SYS_BASE + M10BMC_DOORBELL,
> +				  DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
> +				  FIELD_PREP(DRBL_CONFIG_SEL, val) |
> +				  DRBL_REBOOT_REQ);
> +}
> +
> +static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec)
> +{
> +	return m10bmc_sec_bmc_image_load(sec, 0);
> +}
> +
> +static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
> +{
> +	return m10bmc_sec_bmc_image_load(sec, 1);
> +}
> +
> +static struct image_load m10bmc_image_load_hndlrs[] = {
> +	{
> +		.name = "bmc_factory",
> +		.load_image = m10bmc_sec_bmc_image_load_1,
> +	},
> +	{
> +		.name = "bmc_user",
> +		.load_image = m10bmc_sec_bmc_image_load_0,
> +	},
> +	{}
> +};
> +
>  static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
>  {
>  	u32 doorbell;
> @@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device *pdev)
>  	sec->dev = &pdev->dev;
>  	sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
>  	dev_set_drvdata(&pdev->dev, sec);
> +	sec->image_load = m10bmc_image_load_hndlrs;
>  
>  	ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
>  		       xa_limit_32b, GFP_KERNEL);
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] fpga: m10bmc-sec: add m10bmc_sec_retimer_load callback
  2022-08-08  5:33 ` [PATCH v3 2/2] fpga: m10bmc-sec: add m10bmc_sec_retimer_load callback Tianfei Zhang
@ 2022-08-08 16:22   ` Lee Jones
  0 siblings, 0 replies; 11+ messages in thread
From: Lee Jones @ 2022-08-08 16:22 UTC (permalink / raw)
  To: Tianfei Zhang
  Cc: mdf, yilun.xu, linux-fpga, russell.h.weight, hao.wu, trix, Lee Jones

On Mon, 08 Aug 2022, Tianfei Zhang wrote:

> From: Russ Weight <russell.h.weight@intel.com>
> 
> Create m10bmc_sec_retimer_load() callback function
> to provide a trigger to update a new retimer (Intel
> C827 Ethernet transceiver) firmware on Intel PAC
> N3000 Card.
> 
> Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
> ---
> v3:
> uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
> ---
>  drivers/fpga/intel-m10-bmc-sec-update.c | 148 ++++++++++++++++++++++++

>  include/linux/mfd/intel-m10-bmc.h       |  31 +++++

Acked-by: Lee Jones <lee@kernel.org>

-- 
DEPRECATED: Please use lee@kernel.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
  2022-08-08  9:57   ` Xu Yilun
@ 2022-08-08 23:39     ` matthew.gerlach
  2022-08-09  7:04       ` Zhang, Tianfei
  2022-08-09 16:30       ` Xu Yilun
  0 siblings, 2 replies; 11+ messages in thread
From: matthew.gerlach @ 2022-08-08 23:39 UTC (permalink / raw)
  To: Xu Yilun
  Cc: Tianfei Zhang, mdf, linux-fpga, lee.jones, russell.h.weight,
	hao.wu, trix



On Mon, 8 Aug 2022, Xu Yilun wrote:

> On 2022-08-08 at 01:33:16 -0400, Tianfei Zhang wrote:
>> From: Russ Weight <russell.h.weight@intel.com>
>>
>> Add the available_images and image_load sysfs files. The available_images
>> file returns a space separated list of key words that may be written
>> into the image_load file. These keywords describe an FPGA, BMC, or
>> firmware image in FLASH or EEPROM storage that may be loaded.
>>
>> The image_load sysfs file may be written with a key word to trigger a
>> reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.
>>
>> Signed-off-by: Russ Weight <russell.h.weight@intel.com>
>> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
>> ---
>> v3:
>> uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
>> v2:
>> adds the steps for how to use the "image_load" sysfs file.
>> ---
>>  .../sysfs-driver-intel-m10-bmc-sec-update     |  34 ++++++
>>  drivers/fpga/intel-m10-bmc-sec-update.c       | 105 ++++++++++++++++++
>>  2 files changed, 139 insertions(+)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>> index 0a41afe0ab4c..3d8f04ca6f1b 100644
>> --- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>> +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>> @@ -59,3 +59,37 @@ Contact:	Russ Weight <russell.h.weight@intel.com>
>>  Description:	Read only. Returns number of times the secure update
>>  		staging area has been flashed.
>>  		Format: "%u".
>> +
>> +What:		/sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/available_images
>> +Date:		July 2022
>> +KernelVersion:  5.20
>> +Contact:	Russ Weight <russell.h.weight@intel.com>
>> +Description:	Read-only. This file returns a space separated list of
>> +		key words that may be written into the image_load file
>> +		described below. These keywords decribe an FPGA, BMC,
>> +		or firmware image in FLASH or EEPROM storage that may
>> +		be loaded.
>> +
>> +What:		/sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/image_load
>> +Date:		July 2022
>> +KernelVersion:  5.20
>> +Contact:	Russ Weight <russell.h.weight@intel.com>
>> +Description:	Write-only. A key word may be written to this file to
>> +		trigger a reload of an FPGA, BMC, or firmware image from
>> +		FLASH or EEPROM. Refer to the available_images file for a
>> +		list of supported key words for the underlying device.
>> +		Writing an unsupported string to this file will result in
>> +		EINVAL being returned.
>> +		It should remove all of resources related to the old FPGA/BMC
>> +		image before trigger the image reload otherwise the host system
>> +		may crash. We recommended that follow the below steps or directly
>
> I suggest we solve this concern first before other detailed refinements.
>
> My opinion is, don't make the sysfs interface dependent of other user
> interfaces, like the following:
>
>> +		use the OPAE RSU script to perform the reload for FPGA/BMC image.
>> +		Here is the steps to trigger the reload for FPGA/BMC image:
>> +		1. disable the AER of the FPGA card.
>> +		2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
>> +		3. trigger image reload via "image_load" sysfs file.
>> +		4. remove all of PCI devices of the FPGA card via
>> +		"/sys/bus/pci/devices/xxxx/remove" sysfs file.
>> +		5. wait 10 seconds.
>> +		6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
>> +	        7. enable the AER of the FPGA card.
>
> It is not a good idea the writing of the sysfs node crashes the system,
> if we don't follow the whole steps.
>
> Thanks,
> Yilun

Hi Yilun,

The use case being supported with this trigger is the ability 
to reconfigure a FPGA or other programmable componenet on a board without the requiring 
the HW platform be able to power cycle a PCI slot or power cycling the 
whole system.  Unfortunately, when a FPGA connected to a PCI bus is 
reconfigured, it can cause a PCI error.  The actual pci error, if any, 
and any mitigation steps to handle the error is platform specific and dependent on 
the FPGA image itself.  Therefore predicting and implementing all 
necessary error mitigation in the kernel as part of the trigger would be 
an impossible task.

Matthew

>
>> diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-bmc-sec-update.c
>> index 72c677c910de..3a082911cf67 100644
>> --- a/drivers/fpga/intel-m10-bmc-sec-update.c
>> +++ b/drivers/fpga/intel-m10-bmc-sec-update.c
>> @@ -14,6 +14,8 @@
>>  #include <linux/platform_device.h>
>>  #include <linux/slab.h>
>>
>> +struct image_load;
>> +
>>  struct m10bmc_sec {
>>  	struct device *dev;
>>  	struct intel_m10bmc *m10bmc;
>> @@ -21,6 +23,12 @@ struct m10bmc_sec {
>>  	char *fw_name;
>>  	u32 fw_name_id;
>>  	bool cancel_request;
>> +	struct image_load *image_load;	/* terminated with { } member */
>> +};
>> +
>> +struct image_load {
>> +	const char *name;
>> +	int (*load_image)(struct m10bmc_sec *sec);
>>  };
>>
>>  static DEFINE_XARRAY_ALLOC(fw_upload_xa);
>> @@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR + CSK_VEC_OFFSET);
>>
>>  #define FLASH_COUNT_SIZE 4096	/* count stored as inverted bit vector */
>>
>> +static ssize_t available_images_show(struct device *dev,
>> +				     struct device_attribute *attr, char *buf)
>> +{
>> +	struct m10bmc_sec *sec = dev_get_drvdata(dev);
>> +	const struct image_load *hndlr;
>> +	ssize_t count = 0;
>> +
>> +	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
>> +		count += scnprintf(buf + count, PAGE_SIZE - count,
>> +				   "%s ", hndlr->name);
>> +	}
>> +
>> +	buf[count - 1] = '\n';
>> +
>> +	return count;
>> +}
>> +static DEVICE_ATTR_RO(available_images);
>> +
>> +static ssize_t image_load_store(struct device *dev,
>> +				struct device_attribute *attr,
>> +				const char *buf, size_t count)
>> +{
>> +	struct m10bmc_sec *sec = dev_get_drvdata(dev);
>> +	const struct image_load *hndlr;
>> +	int ret = -EINVAL;
>> +
>> +	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
>> +		if (sysfs_streq(buf, hndlr->name)) {
>> +			ret = hndlr->load_image(sec);
>> +			break;
>> +		}
>> +	}
>> +
>> +	return ret ? : count;
>> +}
>> +static DEVICE_ATTR_WO(image_load);
>> +
>> +static struct attribute *m10bmc_control_attrs[] = {
>> +	&dev_attr_available_images.attr,
>> +	&dev_attr_image_load.attr,
>> +	NULL,
>> +};
>> +
>> +static struct attribute_group m10bmc_control_attr_group = {
>> +	.name = "control",
>> +	.attrs = m10bmc_control_attrs,
>> +};
>> +
>>  static ssize_t flash_count_show(struct device *dev,
>>  				struct device_attribute *attr, char *buf)
>>  {
>> @@ -195,6 +251,7 @@ static struct attribute_group m10bmc_security_attr_group = {
>>
>>  static const struct attribute_group *m10bmc_sec_attr_groups[] = {
>>  	&m10bmc_security_attr_group,
>> +	&m10bmc_control_attr_group,
>>  	NULL,
>>  };
>>
>> @@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec, u32 doorbell)
>>  		dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result);
>>  }
>>
>> +static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
>> +				     unsigned int val)
>> +{
>> +	u32 doorbell;
>> +	int ret;
>> +
>> +	if (val > 1) {
>> +		dev_err(sec->dev, "invalid reload val = %d\n", val);
>> +		return -EINVAL;
>> +	}
>> +
>> +	ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if (doorbell & DRBL_REBOOT_DISABLED)
>> +		return -EBUSY;
>> +
>> +	return regmap_update_bits(sec->m10bmc->regmap,
>> +				  M10BMC_SYS_BASE + M10BMC_DOORBELL,
>> +				  DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
>> +				  FIELD_PREP(DRBL_CONFIG_SEL, val) |
>> +				  DRBL_REBOOT_REQ);
>> +}
>> +
>> +static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec)
>> +{
>> +	return m10bmc_sec_bmc_image_load(sec, 0);
>> +}
>> +
>> +static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
>> +{
>> +	return m10bmc_sec_bmc_image_load(sec, 1);
>> +}
>> +
>> +static struct image_load m10bmc_image_load_hndlrs[] = {
>> +	{
>> +		.name = "bmc_factory",
>> +		.load_image = m10bmc_sec_bmc_image_load_1,
>> +	},
>> +	{
>> +		.name = "bmc_user",
>> +		.load_image = m10bmc_sec_bmc_image_load_0,
>> +	},
>> +	{}
>> +};
>> +
>>  static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
>>  {
>>  	u32 doorbell;
>> @@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device *pdev)
>>  	sec->dev = &pdev->dev;
>>  	sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
>>  	dev_set_drvdata(&pdev->dev, sec);
>> +	sec->image_load = m10bmc_image_load_hndlrs;
>>
>>  	ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
>>  		       xa_limit_32b, GFP_KERNEL);
>> --
>> 2.26.2
>>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
  2022-08-08 23:39     ` matthew.gerlach
@ 2022-08-09  7:04       ` Zhang, Tianfei
  2022-08-11 20:41         ` matthew.gerlach
  2022-08-09 16:30       ` Xu Yilun
  1 sibling, 1 reply; 11+ messages in thread
From: Zhang, Tianfei @ 2022-08-09  7:04 UTC (permalink / raw)
  To: matthew.gerlach, Xu, Yilun
  Cc: mdf, linux-fpga, lee.jones, Weight, Russell H, Wu, Hao, trix



> -----Original Message-----
> From: matthew.gerlach@linux.intel.com <matthew.gerlach@linux.intel.com>
> Sent: Tuesday, August 9, 2022 7:39 AM
> To: Xu, Yilun <yilun.xu@intel.com>
> Cc: Zhang, Tianfei <tianfei.zhang@intel.com>; mdf@kernel.org; linux-
> fpga@vger.kernel.org; lee.jones@linaro.org; Weight, Russell H
> <russell.h.weight@intel.com>; Wu, Hao <hao.wu@intel.com>; trix@redhat.com
> Subject: Re: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC
> images
> 
> 
> 
> On Mon, 8 Aug 2022, Xu Yilun wrote:
> 
> > On 2022-08-08 at 01:33:16 -0400, Tianfei Zhang wrote:
> >> From: Russ Weight <russell.h.weight@intel.com>
> >>
> >> Add the available_images and image_load sysfs files. The
> >> available_images file returns a space separated list of key words
> >> that may be written into the image_load file. These keywords describe
> >> an FPGA, BMC, or firmware image in FLASH or EEPROM storage that may be
> loaded.
> >>
> >> The image_load sysfs file may be written with a key word to trigger a
> >> reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.
> >>
> >> Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> >> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
> >> ---
> >> v3:
> >> uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
> >> v2:
> >> adds the steps for how to use the "image_load" sysfs file.
> >> ---
> >>  .../sysfs-driver-intel-m10-bmc-sec-update     |  34 ++++++
> >>  drivers/fpga/intel-m10-bmc-sec-update.c       | 105 ++++++++++++++++++
> >>  2 files changed, 139 insertions(+)
> >>
> >> diff --git
> >> a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> >> b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> >> index 0a41afe0ab4c..3d8f04ca6f1b 100644
> >> --- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> >> +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> >> @@ -59,3 +59,37 @@ Contact:	Russ Weight <russell.h.weight@intel.com>
> >>  Description:	Read only. Returns number of times the secure update
> >>  		staging area has been flashed.
> >>  		Format: "%u".
> >> +
> >> +What:		/sys/bus/platform/drivers/intel-m10bmc-sec-
> update/.../control/available_images
> >> +Date:		July 2022
> >> +KernelVersion:  5.20
> >> +Contact:	Russ Weight <russell.h.weight@intel.com>
> >> +Description:	Read-only. This file returns a space separated list of
> >> +		key words that may be written into the image_load file
> >> +		described below. These keywords decribe an FPGA, BMC,
> >> +		or firmware image in FLASH or EEPROM storage that may
> >> +		be loaded.
> >> +
> >> +What:		/sys/bus/platform/drivers/intel-m10bmc-sec-
> update/.../control/image_load
> >> +Date:		July 2022
> >> +KernelVersion:  5.20
> >> +Contact:	Russ Weight <russell.h.weight@intel.com>
> >> +Description:	Write-only. A key word may be written to this file to
> >> +		trigger a reload of an FPGA, BMC, or firmware image from
> >> +		FLASH or EEPROM. Refer to the available_images file for a
> >> +		list of supported key words for the underlying device.
> >> +		Writing an unsupported string to this file will result in
> >> +		EINVAL being returned.
> >> +		It should remove all of resources related to the old FPGA/BMC
> >> +		image before trigger the image reload otherwise the host system
> >> +		may crash. We recommended that follow the below steps or
> directly
> >
> > I suggest we solve this concern first before other detailed refinements.

I have run the stress test on N3000 and N6000 card by only directly write string to " image_load"  sysfs file, 
and observed  that there are no Host System crash occurred, but the BMC functionality failed, I think this is 
the expected behavior  that is why we strongly recommended that end-user use our RSU script or follow our 
provided steps.

> >
> > My opinion is, don't make the sysfs interface dependent of other user
> > interfaces, like the following:
> >
> >> +		use the OPAE RSU script to perform the reload for FPGA/BMC
> image.
> >> +		Here is the steps to trigger the reload for FPGA/BMC image:
> >> +		1. disable the AER of the FPGA card.
> >> +		2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
> >> +		3. trigger image reload via "image_load" sysfs file.
> >> +		4. remove all of PCI devices of the FPGA card via
> >> +		"/sys/bus/pci/devices/xxxx/remove" sysfs file.
> >> +		5. wait 10 seconds.
> >> +		6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
> >> +	        7. enable the AER of the FPGA card.
> >
> > It is not a good idea the writing of the sysfs node crashes the
> > system, if we don't follow the whole steps.
> >
> > Thanks,
> > Yilun
> 
> Hi Yilun,
> 
> The use case being supported with this trigger is the ability to reconfigure a FPGA or
> other programmable componenet on a board without the requiring the HW platform
> be able to power cycle a PCI slot or power cycling the whole system.  Unfortunately,
> when a FPGA connected to a PCI bus is reconfigured, it can cause a PCI error.  The
> actual pci error, if any, and any mitigation steps to handle the error is platform
> specific and dependent on the FPGA image itself.  Therefore predicting and
> implementing all necessary error mitigation in the kernel as part of the trigger would
> be an impossible task.

I agree with Matthew's comment.
How about we change the documentation as below:

It should remove all of resources related to the old FPGA/BMC
image before trigger the image reload otherwise the functionalities 
of FPGA/BMC will fail. We recommended that follow the below steps 
or directly use the OPAE RSU script to perform the reload for FPGA/BMC
image.
Here is the steps to trigger the reload for FPGA/BMC image:
1. disable the AER of the FPGA card.
2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
3. trigger image reload via "image_load" sysfs file.
4. remove all of PCI devices of the FPGA card via
"/sys/bus/pci/devices/xxxx/remove" sysfs file.
5. wait 10 seconds.
6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
7. enable the AER of the FPGA card.



> 
> Matthew
> 
> >
> >> diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-
> bmc-sec-update.c
> >> index 72c677c910de..3a082911cf67 100644
> >> --- a/drivers/fpga/intel-m10-bmc-sec-update.c
> >> +++ b/drivers/fpga/intel-m10-bmc-sec-update.c
> >> @@ -14,6 +14,8 @@
> >>  #include <linux/platform_device.h>
> >>  #include <linux/slab.h>
> >>
> >> +struct image_load;
> >> +
> >>  struct m10bmc_sec {
> >>  	struct device *dev;
> >>  	struct intel_m10bmc *m10bmc;
> >> @@ -21,6 +23,12 @@ struct m10bmc_sec {
> >>  	char *fw_name;
> >>  	u32 fw_name_id;
> >>  	bool cancel_request;
> >> +	struct image_load *image_load;	/* terminated with { } member */
> >> +};
> >> +
> >> +struct image_load {
> >> +	const char *name;
> >> +	int (*load_image)(struct m10bmc_sec *sec);
> >>  };
> >>
> >>  static DEFINE_XARRAY_ALLOC(fw_upload_xa);
> >> @@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR +
> CSK_VEC_OFFSET);
> >>
> >>  #define FLASH_COUNT_SIZE 4096	/* count stored as inverted bit vector */
> >>
> >> +static ssize_t available_images_show(struct device *dev,
> >> +				     struct device_attribute *attr, char *buf)
> >> +{
> >> +	struct m10bmc_sec *sec = dev_get_drvdata(dev);
> >> +	const struct image_load *hndlr;
> >> +	ssize_t count = 0;
> >> +
> >> +	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> >> +		count += scnprintf(buf + count, PAGE_SIZE - count,
> >> +				   "%s ", hndlr->name);
> >> +	}
> >> +
> >> +	buf[count - 1] = '\n';
> >> +
> >> +	return count;
> >> +}
> >> +static DEVICE_ATTR_RO(available_images);
> >> +
> >> +static ssize_t image_load_store(struct device *dev,
> >> +				struct device_attribute *attr,
> >> +				const char *buf, size_t count)
> >> +{
> >> +	struct m10bmc_sec *sec = dev_get_drvdata(dev);
> >> +	const struct image_load *hndlr;
> >> +	int ret = -EINVAL;
> >> +
> >> +	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> >> +		if (sysfs_streq(buf, hndlr->name)) {
> >> +			ret = hndlr->load_image(sec);
> >> +			break;
> >> +		}
> >> +	}
> >> +
> >> +	return ret ? : count;
> >> +}
> >> +static DEVICE_ATTR_WO(image_load);
> >> +
> >> +static struct attribute *m10bmc_control_attrs[] = {
> >> +	&dev_attr_available_images.attr,
> >> +	&dev_attr_image_load.attr,
> >> +	NULL,
> >> +};
> >> +
> >> +static struct attribute_group m10bmc_control_attr_group = {
> >> +	.name = "control",
> >> +	.attrs = m10bmc_control_attrs,
> >> +};
> >> +
> >>  static ssize_t flash_count_show(struct device *dev,
> >>  				struct device_attribute *attr, char *buf)
> >>  {
> >> @@ -195,6 +251,7 @@ static struct attribute_group
> m10bmc_security_attr_group = {
> >>
> >>  static const struct attribute_group *m10bmc_sec_attr_groups[] = {
> >>  	&m10bmc_security_attr_group,
> >> +	&m10bmc_control_attr_group,
> >>  	NULL,
> >>  };
> >>
> >> @@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec, u32
> doorbell)
> >>  		dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result);
> >>  }
> >>
> >> +static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
> >> +				     unsigned int val)
> >> +{
> >> +	u32 doorbell;
> >> +	int ret;
> >> +
> >> +	if (val > 1) {
> >> +		dev_err(sec->dev, "invalid reload val = %d\n", val);
> >> +		return -EINVAL;
> >> +	}
> >> +
> >> +	ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
> >> +	if (ret)
> >> +		return ret;
> >> +
> >> +	if (doorbell & DRBL_REBOOT_DISABLED)
> >> +		return -EBUSY;
> >> +
> >> +	return regmap_update_bits(sec->m10bmc->regmap,
> >> +				  M10BMC_SYS_BASE + M10BMC_DOORBELL,
> >> +				  DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
> >> +				  FIELD_PREP(DRBL_CONFIG_SEL, val) |
> >> +				  DRBL_REBOOT_REQ);
> >> +}
> >> +
> >> +static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec)
> >> +{
> >> +	return m10bmc_sec_bmc_image_load(sec, 0);
> >> +}
> >> +
> >> +static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
> >> +{
> >> +	return m10bmc_sec_bmc_image_load(sec, 1);
> >> +}
> >> +
> >> +static struct image_load m10bmc_image_load_hndlrs[] = {
> >> +	{
> >> +		.name = "bmc_factory",
> >> +		.load_image = m10bmc_sec_bmc_image_load_1,
> >> +	},
> >> +	{
> >> +		.name = "bmc_user",
> >> +		.load_image = m10bmc_sec_bmc_image_load_0,
> >> +	},
> >> +	{}
> >> +};
> >> +
> >>  static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
> >>  {
> >>  	u32 doorbell;
> >> @@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device
> *pdev)
> >>  	sec->dev = &pdev->dev;
> >>  	sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
> >>  	dev_set_drvdata(&pdev->dev, sec);
> >> +	sec->image_load = m10bmc_image_load_hndlrs;
> >>
> >>  	ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
> >>  		       xa_limit_32b, GFP_KERNEL);
> >> --
> >> 2.26.2
> >>
> >

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
  2022-08-08 23:39     ` matthew.gerlach
  2022-08-09  7:04       ` Zhang, Tianfei
@ 2022-08-09 16:30       ` Xu Yilun
  2022-08-10 13:50         ` Zhang, Tianfei
  2022-08-11 21:03         ` matthew.gerlach
  1 sibling, 2 replies; 11+ messages in thread
From: Xu Yilun @ 2022-08-09 16:30 UTC (permalink / raw)
  To: matthew.gerlach
  Cc: Tianfei Zhang, mdf, linux-fpga, lee.jones, russell.h.weight,
	hao.wu, trix

On 2022-08-08 at 16:39:23 -0700, matthew.gerlach@linux.intel.com wrote:
> 
> 
> On Mon, 8 Aug 2022, Xu Yilun wrote:
> 
> > On 2022-08-08 at 01:33:16 -0400, Tianfei Zhang wrote:
> > > From: Russ Weight <russell.h.weight@intel.com>
> > > 
> > > Add the available_images and image_load sysfs files. The available_images
> > > file returns a space separated list of key words that may be written
> > > into the image_load file. These keywords describe an FPGA, BMC, or
> > > firmware image in FLASH or EEPROM storage that may be loaded.
> > > 
> > > The image_load sysfs file may be written with a key word to trigger a
> > > reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.
> > > 
> > > Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> > > Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
> > > ---
> > > v3:
> > > uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
> > > v2:
> > > adds the steps for how to use the "image_load" sysfs file.
> > > ---
> > >  .../sysfs-driver-intel-m10-bmc-sec-update     |  34 ++++++
> > >  drivers/fpga/intel-m10-bmc-sec-update.c       | 105 ++++++++++++++++++
> > >  2 files changed, 139 insertions(+)
> > > 
> > > diff --git a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> > > index 0a41afe0ab4c..3d8f04ca6f1b 100644
> > > --- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> > > +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> > > @@ -59,3 +59,37 @@ Contact:	Russ Weight <russell.h.weight@intel.com>
> > >  Description:	Read only. Returns number of times the secure update
> > >  		staging area has been flashed.
> > >  		Format: "%u".
> > > +
> > > +What:		/sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/available_images
> > > +Date:		July 2022
> > > +KernelVersion:  5.20
> > > +Contact:	Russ Weight <russell.h.weight@intel.com>
> > > +Description:	Read-only. This file returns a space separated list of
> > > +		key words that may be written into the image_load file
> > > +		described below. These keywords decribe an FPGA, BMC,
> > > +		or firmware image in FLASH or EEPROM storage that may
> > > +		be loaded.
> > > +
> > > +What:		/sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/image_load
> > > +Date:		July 2022
> > > +KernelVersion:  5.20
> > > +Contact:	Russ Weight <russell.h.weight@intel.com>
> > > +Description:	Write-only. A key word may be written to this file to
> > > +		trigger a reload of an FPGA, BMC, or firmware image from
> > > +		FLASH or EEPROM. Refer to the available_images file for a
> > > +		list of supported key words for the underlying device.
> > > +		Writing an unsupported string to this file will result in
> > > +		EINVAL being returned.
> > > +		It should remove all of resources related to the old FPGA/BMC
> > > +		image before trigger the image reload otherwise the host system
> > > +		may crash. We recommended that follow the below steps or directly
> > 
> > I suggest we solve this concern first before other detailed refinements.
> > 
> > My opinion is, don't make the sysfs interface dependent of other user
> > interfaces, like the following:
> > 
> > > +		use the OPAE RSU script to perform the reload for FPGA/BMC image.
> > > +		Here is the steps to trigger the reload for FPGA/BMC image:
> > > +		1. disable the AER of the FPGA card.
> > > +		2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
> > > +		3. trigger image reload via "image_load" sysfs file.
> > > +		4. remove all of PCI devices of the FPGA card via
> > > +		"/sys/bus/pci/devices/xxxx/remove" sysfs file.
> > > +		5. wait 10 seconds.
> > > +		6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
> > > +	        7. enable the AER of the FPGA card.
> > 
> > It is not a good idea the writing of the sysfs node crashes the system,
> > if we don't follow the whole steps.
> > 
> > Thanks,
> > Yilun
> 
> Hi Yilun,
> 
> The use case being supported with this trigger is the ability to reconfigure
> a FPGA or other programmable componenet on a board without the requiring the
> HW platform be able to power cycle a PCI slot or power cycling the whole
> system.  Unfortunately, when a FPGA connected to a PCI bus is reconfigured,
> it can cause a PCI error.  The actual pci error, if any, and any mitigation
> steps to handle the error is platform specific and dependent on the FPGA
> image itself.  Therefore predicting and implementing all necessary error

Why the error handling is unpredictable?

> mitigation in the kernel as part of the trigger would be an impossible task.

Or could we just gate the pcie link? Just like we should disable fpga
bridges before reprogramming any fpga region.

Actually I did find something for link disabing which may be useful.

https://patchwork.kernel.org/project/linux-pci/patch/20190529104942.74991-1-mika.westerberg@linux.intel.com/

Thanks,
Yilun

> 
> Matthew
> 
> > 
> > > diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-bmc-sec-update.c
> > > index 72c677c910de..3a082911cf67 100644
> > > --- a/drivers/fpga/intel-m10-bmc-sec-update.c
> > > +++ b/drivers/fpga/intel-m10-bmc-sec-update.c
> > > @@ -14,6 +14,8 @@
> > >  #include <linux/platform_device.h>
> > >  #include <linux/slab.h>
> > > 
> > > +struct image_load;
> > > +
> > >  struct m10bmc_sec {
> > >  	struct device *dev;
> > >  	struct intel_m10bmc *m10bmc;
> > > @@ -21,6 +23,12 @@ struct m10bmc_sec {
> > >  	char *fw_name;
> > >  	u32 fw_name_id;
> > >  	bool cancel_request;
> > > +	struct image_load *image_load;	/* terminated with { } member */
> > > +};
> > > +
> > > +struct image_load {
> > > +	const char *name;
> > > +	int (*load_image)(struct m10bmc_sec *sec);
> > >  };
> > > 
> > >  static DEFINE_XARRAY_ALLOC(fw_upload_xa);
> > > @@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR + CSK_VEC_OFFSET);
> > > 
> > >  #define FLASH_COUNT_SIZE 4096	/* count stored as inverted bit vector */
> > > 
> > > +static ssize_t available_images_show(struct device *dev,
> > > +				     struct device_attribute *attr, char *buf)
> > > +{
> > > +	struct m10bmc_sec *sec = dev_get_drvdata(dev);
> > > +	const struct image_load *hndlr;
> > > +	ssize_t count = 0;
> > > +
> > > +	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> > > +		count += scnprintf(buf + count, PAGE_SIZE - count,
> > > +				   "%s ", hndlr->name);
> > > +	}
> > > +
> > > +	buf[count - 1] = '\n';
> > > +
> > > +	return count;
> > > +}
> > > +static DEVICE_ATTR_RO(available_images);
> > > +
> > > +static ssize_t image_load_store(struct device *dev,
> > > +				struct device_attribute *attr,
> > > +				const char *buf, size_t count)
> > > +{
> > > +	struct m10bmc_sec *sec = dev_get_drvdata(dev);
> > > +	const struct image_load *hndlr;
> > > +	int ret = -EINVAL;
> > > +
> > > +	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> > > +		if (sysfs_streq(buf, hndlr->name)) {
> > > +			ret = hndlr->load_image(sec);
> > > +			break;
> > > +		}
> > > +	}
> > > +
> > > +	return ret ? : count;
> > > +}
> > > +static DEVICE_ATTR_WO(image_load);
> > > +
> > > +static struct attribute *m10bmc_control_attrs[] = {
> > > +	&dev_attr_available_images.attr,
> > > +	&dev_attr_image_load.attr,
> > > +	NULL,
> > > +};
> > > +
> > > +static struct attribute_group m10bmc_control_attr_group = {
> > > +	.name = "control",
> > > +	.attrs = m10bmc_control_attrs,
> > > +};
> > > +
> > >  static ssize_t flash_count_show(struct device *dev,
> > >  				struct device_attribute *attr, char *buf)
> > >  {
> > > @@ -195,6 +251,7 @@ static struct attribute_group m10bmc_security_attr_group = {
> > > 
> > >  static const struct attribute_group *m10bmc_sec_attr_groups[] = {
> > >  	&m10bmc_security_attr_group,
> > > +	&m10bmc_control_attr_group,
> > >  	NULL,
> > >  };
> > > 
> > > @@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec, u32 doorbell)
> > >  		dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result);
> > >  }
> > > 
> > > +static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
> > > +				     unsigned int val)
> > > +{
> > > +	u32 doorbell;
> > > +	int ret;
> > > +
> > > +	if (val > 1) {
> > > +		dev_err(sec->dev, "invalid reload val = %d\n", val);
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	if (doorbell & DRBL_REBOOT_DISABLED)
> > > +		return -EBUSY;
> > > +
> > > +	return regmap_update_bits(sec->m10bmc->regmap,
> > > +				  M10BMC_SYS_BASE + M10BMC_DOORBELL,
> > > +				  DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
> > > +				  FIELD_PREP(DRBL_CONFIG_SEL, val) |
> > > +				  DRBL_REBOOT_REQ);
> > > +}
> > > +
> > > +static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec)
> > > +{
> > > +	return m10bmc_sec_bmc_image_load(sec, 0);
> > > +}
> > > +
> > > +static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
> > > +{
> > > +	return m10bmc_sec_bmc_image_load(sec, 1);
> > > +}
> > > +
> > > +static struct image_load m10bmc_image_load_hndlrs[] = {
> > > +	{
> > > +		.name = "bmc_factory",
> > > +		.load_image = m10bmc_sec_bmc_image_load_1,
> > > +	},
> > > +	{
> > > +		.name = "bmc_user",
> > > +		.load_image = m10bmc_sec_bmc_image_load_0,
> > > +	},
> > > +	{}
> > > +};
> > > +
> > >  static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
> > >  {
> > >  	u32 doorbell;
> > > @@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device *pdev)
> > >  	sec->dev = &pdev->dev;
> > >  	sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
> > >  	dev_set_drvdata(&pdev->dev, sec);
> > > +	sec->image_load = m10bmc_image_load_hndlrs;
> > > 
> > >  	ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
> > >  		       xa_limit_32b, GFP_KERNEL);
> > > --
> > > 2.26.2
> > > 
> > 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
  2022-08-09 16:30       ` Xu Yilun
@ 2022-08-10 13:50         ` Zhang, Tianfei
  2022-08-11 21:03         ` matthew.gerlach
  1 sibling, 0 replies; 11+ messages in thread
From: Zhang, Tianfei @ 2022-08-10 13:50 UTC (permalink / raw)
  To: Xu, Yilun, matthew.gerlach
  Cc: mdf, linux-fpga, lee.jones, Weight, Russell H, Wu, Hao, trix



> -----Original Message-----
> From: Xu, Yilun <yilun.xu@intel.com>
> Sent: Wednesday, August 10, 2022 12:31 AM
> To: matthew.gerlach@linux.intel.com
> Cc: Zhang, Tianfei <tianfei.zhang@intel.com>; mdf@kernel.org; linux-
> fpga@vger.kernel.org; lee.jones@linaro.org; Weight, Russell H
> <russell.h.weight@intel.com>; Wu, Hao <hao.wu@intel.com>; trix@redhat.com
> Subject: Re: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC
> images
> 
> On 2022-08-08 at 16:39:23 -0700, matthew.gerlach@linux.intel.com wrote:
> >
> >
> > On Mon, 8 Aug 2022, Xu Yilun wrote:
> >
> > > On 2022-08-08 at 01:33:16 -0400, Tianfei Zhang wrote:
> > > > From: Russ Weight <russell.h.weight@intel.com>
> > > >
> > > > Add the available_images and image_load sysfs files. The
> > > > available_images file returns a space separated list of key words
> > > > that may be written into the image_load file. These keywords
> > > > describe an FPGA, BMC, or firmware image in FLASH or EEPROM storage that
> may be loaded.
> > > >
> > > > The image_load sysfs file may be written with a key word to
> > > > trigger a reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.
> > > >
> > > > Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> > > > Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
> > > > ---
> > > > v3:
> > > > uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
> > > > v2:
> > > > adds the steps for how to use the "image_load" sysfs file.
> > > > ---
> > > >  .../sysfs-driver-intel-m10-bmc-sec-update     |  34 ++++++
> > > >  drivers/fpga/intel-m10-bmc-sec-update.c       | 105 ++++++++++++++++++
> > > >  2 files changed, 139 insertions(+)
> > > >
> > > > diff --git
> > > > a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> > > > b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> > > > index 0a41afe0ab4c..3d8f04ca6f1b 100644
> > > > ---
> > > > a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> > > > +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-upd
> > > > +++ ate
> > > > @@ -59,3 +59,37 @@ Contact:	Russ Weight <russell.h.weight@intel.com>
> > > >  Description:	Read only. Returns number of times the secure update
> > > >  		staging area has been flashed.
> > > >  		Format: "%u".
> > > > +
> > > > +What:		/sys/bus/platform/drivers/intel-m10bmc-sec-
> update/.../control/available_images
> > > > +Date:		July 2022
> > > > +KernelVersion:  5.20
> > > > +Contact:	Russ Weight <russell.h.weight@intel.com>
> > > > +Description:	Read-only. This file returns a space separated list of
> > > > +		key words that may be written into the image_load file
> > > > +		described below. These keywords decribe an FPGA, BMC,
> > > > +		or firmware image in FLASH or EEPROM storage that may
> > > > +		be loaded.
> > > > +
> > > > +What:		/sys/bus/platform/drivers/intel-m10bmc-sec-
> update/.../control/image_load
> > > > +Date:		July 2022
> > > > +KernelVersion:  5.20
> > > > +Contact:	Russ Weight <russell.h.weight@intel.com>
> > > > +Description:	Write-only. A key word may be written to this file to
> > > > +		trigger a reload of an FPGA, BMC, or firmware image from
> > > > +		FLASH or EEPROM. Refer to the available_images file for a
> > > > +		list of supported key words for the underlying device.
> > > > +		Writing an unsupported string to this file will result in
> > > > +		EINVAL being returned.
> > > > +		It should remove all of resources related to the old FPGA/BMC
> > > > +		image before trigger the image reload otherwise the host system
> > > > +		may crash. We recommended that follow the below steps or
> > > > +directly
> > >
> > > I suggest we solve this concern first before other detailed refinements.
> > >
> > > My opinion is, don't make the sysfs interface dependent of other
> > > user interfaces, like the following:
> > >
> > > > +		use the OPAE RSU script to perform the reload for FPGA/BMC
> image.
> > > > +		Here is the steps to trigger the reload for FPGA/BMC image:
> > > > +		1. disable the AER of the FPGA card.
> > > > +		2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
> > > > +		3. trigger image reload via "image_load" sysfs file.
> > > > +		4. remove all of PCI devices of the FPGA card via
> > > > +		"/sys/bus/pci/devices/xxxx/remove" sysfs file.
> > > > +		5. wait 10 seconds.
> > > > +		6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
> > > > +	        7. enable the AER of the FPGA card.
> > >
> > > It is not a good idea the writing of the sysfs node crashes the
> > > system, if we don't follow the whole steps.
> > >
> > > Thanks,
> > > Yilun
> >
> > Hi Yilun,
> >
> > The use case being supported with this trigger is the ability to
> > reconfigure a FPGA or other programmable componenet on a board without
> > the requiring the HW platform be able to power cycle a PCI slot or
> > power cycling the whole system.  Unfortunately, when a FPGA connected
> > to a PCI bus is reconfigured, it can cause a PCI error.  The actual
> > pci error, if any, and any mitigation steps to handle the error is
> > platform specific and dependent on the FPGA image itself.  Therefore
> > predicting and implementing all necessary error
> 
> Why the error handling is unpredictable?

This error came from BMC/FPGA while burn the image if we don't remove this PCI device.
For example, the other kernel thread accessing the FPGA/BMC while we burn the new FPGA image.

> 
> > mitigation in the kernel as part of the trigger would be an impossible task.
> 
> Or could we just gate the pcie link? Just like we should disable fpga bridges before
> reprogramming any fpga region.

I think PCI "remove" sysfs file will remove all of subdevices like fpga bridges, and process of  disable AER
and trigger PCI remove will mitigate  the PCI errors.

> 
> Actually I did find something for link disabing which may be useful.
> 
> https://patchwork.kernel.org/project/linux-pci/patch/20190529104942.74991-1-
> mika.westerberg@linux.intel.com/

It looks like this link_disable patch has not accepted by PCI maintainer.
This link_disable sysfs file just want to protect against another user doing rescan.

> 
> Thanks,
> Yilun
> 
> >
> > Matthew
> >
> > >
> > > > diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c
> > > > b/drivers/fpga/intel-m10-bmc-sec-update.c
> > > > index 72c677c910de..3a082911cf67 100644
> > > > --- a/drivers/fpga/intel-m10-bmc-sec-update.c
> > > > +++ b/drivers/fpga/intel-m10-bmc-sec-update.c
> > > > @@ -14,6 +14,8 @@
> > > >  #include <linux/platform_device.h>  #include <linux/slab.h>
> > > >
> > > > +struct image_load;
> > > > +
> > > >  struct m10bmc_sec {
> > > >  	struct device *dev;
> > > >  	struct intel_m10bmc *m10bmc;
> > > > @@ -21,6 +23,12 @@ struct m10bmc_sec {
> > > >  	char *fw_name;
> > > >  	u32 fw_name_id;
> > > >  	bool cancel_request;
> > > > +	struct image_load *image_load;	/* terminated with { } member */
> > > > +};
> > > > +
> > > > +struct image_load {
> > > > +	const char *name;
> > > > +	int (*load_image)(struct m10bmc_sec *sec);
> > > >  };
> > > >
> > > >  static DEFINE_XARRAY_ALLOC(fw_upload_xa);
> > > > @@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR +
> > > > CSK_VEC_OFFSET);
> > > >
> > > >  #define FLASH_COUNT_SIZE 4096	/* count stored as inverted bit vector */
> > > >
> > > > +static ssize_t available_images_show(struct device *dev,
> > > > +				     struct device_attribute *attr, char *buf) {
> > > > +	struct m10bmc_sec *sec = dev_get_drvdata(dev);
> > > > +	const struct image_load *hndlr;
> > > > +	ssize_t count = 0;
> > > > +
> > > > +	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> > > > +		count += scnprintf(buf + count, PAGE_SIZE - count,
> > > > +				   "%s ", hndlr->name);
> > > > +	}
> > > > +
> > > > +	buf[count - 1] = '\n';
> > > > +
> > > > +	return count;
> > > > +}
> > > > +static DEVICE_ATTR_RO(available_images);
> > > > +
> > > > +static ssize_t image_load_store(struct device *dev,
> > > > +				struct device_attribute *attr,
> > > > +				const char *buf, size_t count) {
> > > > +	struct m10bmc_sec *sec = dev_get_drvdata(dev);
> > > > +	const struct image_load *hndlr;
> > > > +	int ret = -EINVAL;
> > > > +
> > > > +	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> > > > +		if (sysfs_streq(buf, hndlr->name)) {
> > > > +			ret = hndlr->load_image(sec);
> > > > +			break;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	return ret ? : count;
> > > > +}
> > > > +static DEVICE_ATTR_WO(image_load);
> > > > +
> > > > +static struct attribute *m10bmc_control_attrs[] = {
> > > > +	&dev_attr_available_images.attr,
> > > > +	&dev_attr_image_load.attr,
> > > > +	NULL,
> > > > +};
> > > > +
> > > > +static struct attribute_group m10bmc_control_attr_group = {
> > > > +	.name = "control",
> > > > +	.attrs = m10bmc_control_attrs,
> > > > +};
> > > > +
> > > >  static ssize_t flash_count_show(struct device *dev,
> > > >  				struct device_attribute *attr, char *buf)  { @@ -
> 195,6 +251,7
> > > > @@ static struct attribute_group m10bmc_security_attr_group = {
> > > >
> > > >  static const struct attribute_group *m10bmc_sec_attr_groups[] = {
> > > >  	&m10bmc_security_attr_group,
> > > > +	&m10bmc_control_attr_group,
> > > >  	NULL,
> > > >  };
> > > >
> > > > @@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec,
> u32 doorbell)
> > > >  		dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result);  }
> > > >
> > > > +static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
> > > > +				     unsigned int val)
> > > > +{
> > > > +	u32 doorbell;
> > > > +	int ret;
> > > > +
> > > > +	if (val > 1) {
> > > > +		dev_err(sec->dev, "invalid reload val = %d\n", val);
> > > > +		return -EINVAL;
> > > > +	}
> > > > +
> > > > +	ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	if (doorbell & DRBL_REBOOT_DISABLED)
> > > > +		return -EBUSY;
> > > > +
> > > > +	return regmap_update_bits(sec->m10bmc->regmap,
> > > > +				  M10BMC_SYS_BASE + M10BMC_DOORBELL,
> > > > +				  DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
> > > > +				  FIELD_PREP(DRBL_CONFIG_SEL, val) |
> > > > +				  DRBL_REBOOT_REQ);
> > > > +}
> > > > +
> > > > +static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec) {
> > > > +	return m10bmc_sec_bmc_image_load(sec, 0); }
> > > > +
> > > > +static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec) {
> > > > +	return m10bmc_sec_bmc_image_load(sec, 1); }
> > > > +
> > > > +static struct image_load m10bmc_image_load_hndlrs[] = {
> > > > +	{
> > > > +		.name = "bmc_factory",
> > > > +		.load_image = m10bmc_sec_bmc_image_load_1,
> > > > +	},
> > > > +	{
> > > > +		.name = "bmc_user",
> > > > +		.load_image = m10bmc_sec_bmc_image_load_0,
> > > > +	},
> > > > +	{}
> > > > +};
> > > > +
> > > >  static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
> > > > {
> > > >  	u32 doorbell;
> > > > @@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device
> *pdev)
> > > >  	sec->dev = &pdev->dev;
> > > >  	sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
> > > >  	dev_set_drvdata(&pdev->dev, sec);
> > > > +	sec->image_load = m10bmc_image_load_hndlrs;
> > > >
> > > >  	ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
> > > >  		       xa_limit_32b, GFP_KERNEL);
> > > > --
> > > > 2.26.2
> > > >
> > >

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
  2022-08-09  7:04       ` Zhang, Tianfei
@ 2022-08-11 20:41         ` matthew.gerlach
  0 siblings, 0 replies; 11+ messages in thread
From: matthew.gerlach @ 2022-08-11 20:41 UTC (permalink / raw)
  To: Zhang, Tianfei
  Cc: Xu, Yilun, mdf, linux-fpga, lee.jones, Weight, Russell H, Wu, Hao, trix



On Tue, 9 Aug 2022, Zhang, Tianfei wrote:

>
>
>> -----Original Message-----
>> From: matthew.gerlach@linux.intel.com <matthew.gerlach@linux.intel.com>
>> Sent: Tuesday, August 9, 2022 7:39 AM
>> To: Xu, Yilun <yilun.xu@intel.com>
>> Cc: Zhang, Tianfei <tianfei.zhang@intel.com>; mdf@kernel.org; linux-
>> fpga@vger.kernel.org; lee.jones@linaro.org; Weight, Russell H
>> <russell.h.weight@intel.com>; Wu, Hao <hao.wu@intel.com>; trix@redhat.com
>> Subject: Re: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC
>> images
>>
>>
>>
>> On Mon, 8 Aug 2022, Xu Yilun wrote:
>>
>>> On 2022-08-08 at 01:33:16 -0400, Tianfei Zhang wrote:
>>>> From: Russ Weight <russell.h.weight@intel.com>
>>>>
>>>> Add the available_images and image_load sysfs files. The
>>>> available_images file returns a space separated list of key words
>>>> that may be written into the image_load file. These keywords describe
>>>> an FPGA, BMC, or firmware image in FLASH or EEPROM storage that may be
>> loaded.
>>>>
>>>> The image_load sysfs file may be written with a key word to trigger a
>>>> reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.
>>>>
>>>> Signed-off-by: Russ Weight <russell.h.weight@intel.com>
>>>> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
>>>> ---
>>>> v3:
>>>> uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
>>>> v2:
>>>> adds the steps for how to use the "image_load" sysfs file.
>>>> ---
>>>>  .../sysfs-driver-intel-m10-bmc-sec-update     |  34 ++++++
>>>>  drivers/fpga/intel-m10-bmc-sec-update.c       | 105 ++++++++++++++++++
>>>>  2 files changed, 139 insertions(+)
>>>>
>>>> diff --git
>>>> a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>>>> b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>>>> index 0a41afe0ab4c..3d8f04ca6f1b 100644
>>>> --- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>>>> +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>>>> @@ -59,3 +59,37 @@ Contact:	Russ Weight <russell.h.weight@intel.com>
>>>>  Description:	Read only. Returns number of times the secure update
>>>>  		staging area has been flashed.
>>>>  		Format: "%u".
>>>> +
>>>> +What:		/sys/bus/platform/drivers/intel-m10bmc-sec-
>> update/.../control/available_images
>>>> +Date:		July 2022
>>>> +KernelVersion:  5.20
>>>> +Contact:	Russ Weight <russell.h.weight@intel.com>
>>>> +Description:	Read-only. This file returns a space separated list of
>>>> +		key words that may be written into the image_load file
>>>> +		described below. These keywords decribe an FPGA, BMC,
>>>> +		or firmware image in FLASH or EEPROM storage that may
>>>> +		be loaded.
>>>> +
>>>> +What:		/sys/bus/platform/drivers/intel-m10bmc-sec-
>> update/.../control/image_load
>>>> +Date:		July 2022
>>>> +KernelVersion:  5.20
>>>> +Contact:	Russ Weight <russell.h.weight@intel.com>
>>>> +Description:	Write-only. A key word may be written to this file to
>>>> +		trigger a reload of an FPGA, BMC, or firmware image from
>>>> +		FLASH or EEPROM. Refer to the available_images file for a
>>>> +		list of supported key words for the underlying device.
>>>> +		Writing an unsupported string to this file will result in
>>>> +		EINVAL being returned.
>>>> +		It should remove all of resources related to the old FPGA/BMC
>>>> +		image before trigger the image reload otherwise the host system
>>>> +		may crash. We recommended that follow the below steps or
>> directly
>>>
>>> I suggest we solve this concern first before other detailed refinements.
>
> I have run the stress test on N3000 and N6000 card by only directly write string to " image_load"  sysfs file,
> and observed  that there are no Host System crash occurred, but the BMC functionality failed, I think this is
> the expected behavior  that is why we strongly recommended that end-user use our RSU script or follow our
> provided steps.
>
>>>
>>> My opinion is, don't make the sysfs interface dependent of other user
>>> interfaces, like the following:
>>>
>>>> +		use the OPAE RSU script to perform the reload for FPGA/BMC
>> image.
>>>> +		Here is the steps to trigger the reload for FPGA/BMC image:
>>>> +		1. disable the AER of the FPGA card.
>>>> +		2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
>>>> +		3. trigger image reload via "image_load" sysfs file.
>>>> +		4. remove all of PCI devices of the FPGA card via
>>>> +		"/sys/bus/pci/devices/xxxx/remove" sysfs file.
>>>> +		5. wait 10 seconds.
>>>> +		6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
>>>> +	        7. enable the AER of the FPGA card.
>>>
>>> It is not a good idea the writing of the sysfs node crashes the
>>> system, if we don't follow the whole steps.
>>>
>>> Thanks,
>>> Yilun
>>
>> Hi Yilun,
>>
>> The use case being supported with this trigger is the ability to reconfigure a FPGA or
>> other programmable componenet on a board without the requiring the HW platform
>> be able to power cycle a PCI slot or power cycling the whole system.  Unfortunately,
>> when a FPGA connected to a PCI bus is reconfigured, it can cause a PCI error.  The
>> actual pci error, if any, and any mitigation steps to handle the error is platform
>> specific and dependent on the FPGA image itself.  Therefore predicting and
>> implementing all necessary error mitigation in the kernel as part of the trigger would
>> be an impossible task.
>
> I agree with Matthew's comment.
> How about we change the documentation as below:
>
> It should remove all of resources related to the old FPGA/BMC
> image before trigger the image reload otherwise the functionalities
> of FPGA/BMC will fail. We recommended that follow the below steps
> or directly use the OPAE RSU script to perform the reload for FPGA/BMC
> image.
> Here is the steps to trigger the reload for FPGA/BMC image:
> 1. disable the AER of the FPGA card.
> 2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
> 3. trigger image reload via "image_load" sysfs file.
> 4. remove all of PCI devices of the FPGA card via
> "/sys/bus/pci/devices/xxxx/remove" sysfs file.
> 5. wait 10 seconds.
> 6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
> 7. enable the AER of the FPGA card.
>

Hi Tianfei,

I think the safe sequence you describe can be made more robust as follows:

1. Unbind driver from all VFs to be affected by the trigger.
2. Shutdown all VFs from step 1.
3. Unbind driver from all PFs affectected by the trigger except the 
management PF.
4. Disable AER on all PFs and upstream pci root port.
5. trigger image reload of FPGA and/or other programmable devices
6. Unbind driver from management PF
7. Wait for reconfiguration of FPGA an/or other programmable parts (e.g. 
10 seconds)
8. re-scan PCI bus.
9. renable the AER on all PFs and upstream root port
10. bring up VFs as desired.

All of the steps above, except step 5, is easily performed in user space. 
The exact offset of the trigger register and the method to access it are
board specific.  These board differences are already handled in the 
intel-m10-bmc framework, which makes it possible to provide a consistent 
user interface accross the different boards for triggering the 
reconfiguration.

Matthew

>
>
>>
>> Matthew
>>
>>>
>>>> diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-
>> bmc-sec-update.c
>>>> index 72c677c910de..3a082911cf67 100644
>>>> --- a/drivers/fpga/intel-m10-bmc-sec-update.c
>>>> +++ b/drivers/fpga/intel-m10-bmc-sec-update.c
>>>> @@ -14,6 +14,8 @@
>>>>  #include <linux/platform_device.h>
>>>>  #include <linux/slab.h>
>>>>
>>>> +struct image_load;
>>>> +
>>>>  struct m10bmc_sec {
>>>>  	struct device *dev;
>>>>  	struct intel_m10bmc *m10bmc;
>>>> @@ -21,6 +23,12 @@ struct m10bmc_sec {
>>>>  	char *fw_name;
>>>>  	u32 fw_name_id;
>>>>  	bool cancel_request;
>>>> +	struct image_load *image_load;	/* terminated with { } member */
>>>> +};
>>>> +
>>>> +struct image_load {
>>>> +	const char *name;
>>>> +	int (*load_image)(struct m10bmc_sec *sec);
>>>>  };
>>>>
>>>>  static DEFINE_XARRAY_ALLOC(fw_upload_xa);
>>>> @@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR +
>> CSK_VEC_OFFSET);
>>>>
>>>>  #define FLASH_COUNT_SIZE 4096	/* count stored as inverted bit vector */
>>>>
>>>> +static ssize_t available_images_show(struct device *dev,
>>>> +				     struct device_attribute *attr, char *buf)
>>>> +{
>>>> +	struct m10bmc_sec *sec = dev_get_drvdata(dev);
>>>> +	const struct image_load *hndlr;
>>>> +	ssize_t count = 0;
>>>> +
>>>> +	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
>>>> +		count += scnprintf(buf + count, PAGE_SIZE - count,
>>>> +				   "%s ", hndlr->name);
>>>> +	}
>>>> +
>>>> +	buf[count - 1] = '\n';
>>>> +
>>>> +	return count;
>>>> +}
>>>> +static DEVICE_ATTR_RO(available_images);
>>>> +
>>>> +static ssize_t image_load_store(struct device *dev,
>>>> +				struct device_attribute *attr,
>>>> +				const char *buf, size_t count)
>>>> +{
>>>> +	struct m10bmc_sec *sec = dev_get_drvdata(dev);
>>>> +	const struct image_load *hndlr;
>>>> +	int ret = -EINVAL;
>>>> +
>>>> +	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
>>>> +		if (sysfs_streq(buf, hndlr->name)) {
>>>> +			ret = hndlr->load_image(sec);
>>>> +			break;
>>>> +		}
>>>> +	}
>>>> +
>>>> +	return ret ? : count;
>>>> +}
>>>> +static DEVICE_ATTR_WO(image_load);
>>>> +
>>>> +static struct attribute *m10bmc_control_attrs[] = {
>>>> +	&dev_attr_available_images.attr,
>>>> +	&dev_attr_image_load.attr,
>>>> +	NULL,
>>>> +};
>>>> +
>>>> +static struct attribute_group m10bmc_control_attr_group = {
>>>> +	.name = "control",
>>>> +	.attrs = m10bmc_control_attrs,
>>>> +};
>>>> +
>>>>  static ssize_t flash_count_show(struct device *dev,
>>>>  				struct device_attribute *attr, char *buf)
>>>>  {
>>>> @@ -195,6 +251,7 @@ static struct attribute_group
>> m10bmc_security_attr_group = {
>>>>
>>>>  static const struct attribute_group *m10bmc_sec_attr_groups[] = {
>>>>  	&m10bmc_security_attr_group,
>>>> +	&m10bmc_control_attr_group,
>>>>  	NULL,
>>>>  };
>>>>
>>>> @@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec, u32
>> doorbell)
>>>>  		dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result);
>>>>  }
>>>>
>>>> +static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
>>>> +				     unsigned int val)
>>>> +{
>>>> +	u32 doorbell;
>>>> +	int ret;
>>>> +
>>>> +	if (val > 1) {
>>>> +		dev_err(sec->dev, "invalid reload val = %d\n", val);
>>>> +		return -EINVAL;
>>>> +	}
>>>> +
>>>> +	ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
>>>> +	if (ret)
>>>> +		return ret;
>>>> +
>>>> +	if (doorbell & DRBL_REBOOT_DISABLED)
>>>> +		return -EBUSY;
>>>> +
>>>> +	return regmap_update_bits(sec->m10bmc->regmap,
>>>> +				  M10BMC_SYS_BASE + M10BMC_DOORBELL,
>>>> +				  DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
>>>> +				  FIELD_PREP(DRBL_CONFIG_SEL, val) |
>>>> +				  DRBL_REBOOT_REQ);
>>>> +}
>>>> +
>>>> +static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec)
>>>> +{
>>>> +	return m10bmc_sec_bmc_image_load(sec, 0);
>>>> +}
>>>> +
>>>> +static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
>>>> +{
>>>> +	return m10bmc_sec_bmc_image_load(sec, 1);
>>>> +}
>>>> +
>>>> +static struct image_load m10bmc_image_load_hndlrs[] = {
>>>> +	{
>>>> +		.name = "bmc_factory",
>>>> +		.load_image = m10bmc_sec_bmc_image_load_1,
>>>> +	},
>>>> +	{
>>>> +		.name = "bmc_user",
>>>> +		.load_image = m10bmc_sec_bmc_image_load_0,
>>>> +	},
>>>> +	{}
>>>> +};
>>>> +
>>>>  static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
>>>>  {
>>>>  	u32 doorbell;
>>>> @@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device
>> *pdev)
>>>>  	sec->dev = &pdev->dev;
>>>>  	sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
>>>>  	dev_set_drvdata(&pdev->dev, sec);
>>>> +	sec->image_load = m10bmc_image_load_hndlrs;
>>>>
>>>>  	ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
>>>>  		       xa_limit_32b, GFP_KERNEL);
>>>> --
>>>> 2.26.2
>>>>
>>>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
  2022-08-09 16:30       ` Xu Yilun
  2022-08-10 13:50         ` Zhang, Tianfei
@ 2022-08-11 21:03         ` matthew.gerlach
  1 sibling, 0 replies; 11+ messages in thread
From: matthew.gerlach @ 2022-08-11 21:03 UTC (permalink / raw)
  To: Xu Yilun
  Cc: Tianfei Zhang, mdf, linux-fpga, lee.jones, russell.h.weight,
	hao.wu, trix



On Wed, 10 Aug 2022, Xu Yilun wrote:

> On 2022-08-08 at 16:39:23 -0700, matthew.gerlach@linux.intel.com wrote:
>>
>>
>> On Mon, 8 Aug 2022, Xu Yilun wrote:
>>
>>> On 2022-08-08 at 01:33:16 -0400, Tianfei Zhang wrote:
>>>> From: Russ Weight <russell.h.weight@intel.com>
>>>>
>>>> Add the available_images and image_load sysfs files. The available_images
>>>> file returns a space separated list of key words that may be written
>>>> into the image_load file. These keywords describe an FPGA, BMC, or
>>>> firmware image in FLASH or EEPROM storage that may be loaded.
>>>>
>>>> The image_load sysfs file may be written with a key word to trigger a
>>>> reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.
>>>>
>>>> Signed-off-by: Russ Weight <russell.h.weight@intel.com>
>>>> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
>>>> ---
>>>> v3:
>>>> uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
>>>> v2:
>>>> adds the steps for how to use the "image_load" sysfs file.
>>>> ---
>>>>  .../sysfs-driver-intel-m10-bmc-sec-update     |  34 ++++++
>>>>  drivers/fpga/intel-m10-bmc-sec-update.c       | 105 ++++++++++++++++++
>>>>  2 files changed, 139 insertions(+)
>>>>
>>>> diff --git a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>>>> index 0a41afe0ab4c..3d8f04ca6f1b 100644
>>>> --- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>>>> +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>>>> @@ -59,3 +59,37 @@ Contact:	Russ Weight <russell.h.weight@intel.com>
>>>>  Description:	Read only. Returns number of times the secure update
>>>>  		staging area has been flashed.
>>>>  		Format: "%u".
>>>> +
>>>> +What:		/sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/available_images
>>>> +Date:		July 2022
>>>> +KernelVersion:  5.20
>>>> +Contact:	Russ Weight <russell.h.weight@intel.com>
>>>> +Description:	Read-only. This file returns a space separated list of
>>>> +		key words that may be written into the image_load file
>>>> +		described below. These keywords decribe an FPGA, BMC,
>>>> +		or firmware image in FLASH or EEPROM storage that may
>>>> +		be loaded.
>>>> +
>>>> +What:		/sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/image_load
>>>> +Date:		July 2022
>>>> +KernelVersion:  5.20
>>>> +Contact:	Russ Weight <russell.h.weight@intel.com>
>>>> +Description:	Write-only. A key word may be written to this file to
>>>> +		trigger a reload of an FPGA, BMC, or firmware image from
>>>> +		FLASH or EEPROM. Refer to the available_images file for a
>>>> +		list of supported key words for the underlying device.
>>>> +		Writing an unsupported string to this file will result in
>>>> +		EINVAL being returned.
>>>> +		It should remove all of resources related to the old FPGA/BMC
>>>> +		image before trigger the image reload otherwise the host system
>>>> +		may crash. We recommended that follow the below steps or directly
>>>
>>> I suggest we solve this concern first before other detailed refinements.
>>>
>>> My opinion is, don't make the sysfs interface dependent of other user
>>> interfaces, like the following:
>>>
>>>> +		use the OPAE RSU script to perform the reload for FPGA/BMC image.
>>>> +		Here is the steps to trigger the reload for FPGA/BMC image:
>>>> +		1. disable the AER of the FPGA card.
>>>> +		2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
>>>> +		3. trigger image reload via "image_load" sysfs file.
>>>> +		4. remove all of PCI devices of the FPGA card via
>>>> +		"/sys/bus/pci/devices/xxxx/remove" sysfs file.
>>>> +		5. wait 10 seconds.
>>>> +		6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
>>>> +	        7. enable the AER of the FPGA card.
>>>
>>> It is not a good idea the writing of the sysfs node crashes the system,
>>> if we don't follow the whole steps.
>>>
>>> Thanks,
>>> Yilun
>>
>> Hi Yilun,
>>
>> The use case being supported with this trigger is the ability to reconfigure
>> a FPGA or other programmable componenet on a board without the requiring the
>> HW platform be able to power cycle a PCI slot or power cycling the whole
>> system.  Unfortunately, when a FPGA connected to a PCI bus is reconfigured,
>> it can cause a PCI error.  The actual pci error, if any, and any mitigation
>> steps to handle the error is platform specific and dependent on the FPGA
>> image itself.  Therefore predicting and implementing all necessary error
>
> Why the error handling is unpredictable?
>
Error handling is unpredictable and platform specific for two reasons. 
First, the overall implementation of PCI bus can have differences.
For example, Advanced Error Reporting and/or Downstream Port Containment are
not implemented on every platform.  Secondly, the actual error caused the
triggering can be specific to a board and/or a FPGA design on the board.

>> mitigation in the kernel as part of the trigger would be an impossible task.
>
> Or could we just gate the pcie link? Just like we should disable fpga
> bridges before reprogramming any fpga region.

The ability to safely gate a pcie link is also a platform specific 
feature.  While platforms exists that provide the ability for safe PCIe hot 
swap, many do not.  Keeping as little of the overall safe sequence of FPGA 
reconfiguration in the kernel allows for the greatest flexibility to 
implement the platform specific part of the safe sequence in user space.

  >
> Actually I did find something for link disabing which may be useful.
>
> https://patchwork.kernel.org/project/linux-pci/patch/20190529104942.74991-1-mika.westerberg@linux.intel.com/
>
> Thanks,
> Yilun
>
>>
>> Matthew
>>
>>>
>>>> diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-bmc-sec-update.c
>>>> index 72c677c910de..3a082911cf67 100644
>>>> --- a/drivers/fpga/intel-m10-bmc-sec-update.c
>>>> +++ b/drivers/fpga/intel-m10-bmc-sec-update.c
>>>> @@ -14,6 +14,8 @@
>>>>  #include <linux/platform_device.h>
>>>>  #include <linux/slab.h>
>>>>
>>>> +struct image_load;
>>>> +
>>>>  struct m10bmc_sec {
>>>>  	struct device *dev;
>>>>  	struct intel_m10bmc *m10bmc;
>>>> @@ -21,6 +23,12 @@ struct m10bmc_sec {
>>>>  	char *fw_name;
>>>>  	u32 fw_name_id;
>>>>  	bool cancel_request;
>>>> +	struct image_load *image_load;	/* terminated with { } member */
>>>> +};
>>>> +
>>>> +struct image_load {
>>>> +	const char *name;
>>>> +	int (*load_image)(struct m10bmc_sec *sec);
>>>>  };
>>>>
>>>>  static DEFINE_XARRAY_ALLOC(fw_upload_xa);
>>>> @@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR + CSK_VEC_OFFSET);
>>>>
>>>>  #define FLASH_COUNT_SIZE 4096	/* count stored as inverted bit vector */
>>>>
>>>> +static ssize_t available_images_show(struct device *dev,
>>>> +				     struct device_attribute *attr, char *buf)
>>>> +{
>>>> +	struct m10bmc_sec *sec = dev_get_drvdata(dev);
>>>> +	const struct image_load *hndlr;
>>>> +	ssize_t count = 0;
>>>> +
>>>> +	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
>>>> +		count += scnprintf(buf + count, PAGE_SIZE - count,
>>>> +				   "%s ", hndlr->name);
>>>> +	}
>>>> +
>>>> +	buf[count - 1] = '\n';
>>>> +
>>>> +	return count;
>>>> +}
>>>> +static DEVICE_ATTR_RO(available_images);
>>>> +
>>>> +static ssize_t image_load_store(struct device *dev,
>>>> +				struct device_attribute *attr,
>>>> +				const char *buf, size_t count)
>>>> +{
>>>> +	struct m10bmc_sec *sec = dev_get_drvdata(dev);
>>>> +	const struct image_load *hndlr;
>>>> +	int ret = -EINVAL;
>>>> +
>>>> +	for (hndlr = sec->image_load; hndlr->name; hndlr++) {
>>>> +		if (sysfs_streq(buf, hndlr->name)) {
>>>> +			ret = hndlr->load_image(sec);
>>>> +			break;
>>>> +		}
>>>> +	}
>>>> +
>>>> +	return ret ? : count;
>>>> +}
>>>> +static DEVICE_ATTR_WO(image_load);
>>>> +
>>>> +static struct attribute *m10bmc_control_attrs[] = {
>>>> +	&dev_attr_available_images.attr,
>>>> +	&dev_attr_image_load.attr,
>>>> +	NULL,
>>>> +};
>>>> +
>>>> +static struct attribute_group m10bmc_control_attr_group = {
>>>> +	.name = "control",
>>>> +	.attrs = m10bmc_control_attrs,
>>>> +};
>>>> +
>>>>  static ssize_t flash_count_show(struct device *dev,
>>>>  				struct device_attribute *attr, char *buf)
>>>>  {
>>>> @@ -195,6 +251,7 @@ static struct attribute_group m10bmc_security_attr_group = {
>>>>
>>>>  static const struct attribute_group *m10bmc_sec_attr_groups[] = {
>>>>  	&m10bmc_security_attr_group,
>>>> +	&m10bmc_control_attr_group,
>>>>  	NULL,
>>>>  };
>>>>
>>>> @@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec, u32 doorbell)
>>>>  		dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result);
>>>>  }
>>>>
>>>> +static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
>>>> +				     unsigned int val)
>>>> +{
>>>> +	u32 doorbell;
>>>> +	int ret;
>>>> +
>>>> +	if (val > 1) {
>>>> +		dev_err(sec->dev, "invalid reload val = %d\n", val);
>>>> +		return -EINVAL;
>>>> +	}
>>>> +
>>>> +	ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
>>>> +	if (ret)
>>>> +		return ret;
>>>> +
>>>> +	if (doorbell & DRBL_REBOOT_DISABLED)
>>>> +		return -EBUSY;
>>>> +
>>>> +	return regmap_update_bits(sec->m10bmc->regmap,
>>>> +				  M10BMC_SYS_BASE + M10BMC_DOORBELL,
>>>> +				  DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
>>>> +				  FIELD_PREP(DRBL_CONFIG_SEL, val) |
>>>> +				  DRBL_REBOOT_REQ);
>>>> +}
>>>> +
>>>> +static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec)
>>>> +{
>>>> +	return m10bmc_sec_bmc_image_load(sec, 0);
>>>> +}
>>>> +
>>>> +static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
>>>> +{
>>>> +	return m10bmc_sec_bmc_image_load(sec, 1);
>>>> +}
>>>> +
>>>> +static struct image_load m10bmc_image_load_hndlrs[] = {
>>>> +	{
>>>> +		.name = "bmc_factory",
>>>> +		.load_image = m10bmc_sec_bmc_image_load_1,
>>>> +	},
>>>> +	{
>>>> +		.name = "bmc_user",
>>>> +		.load_image = m10bmc_sec_bmc_image_load_0,
>>>> +	},
>>>> +	{}
>>>> +};
>>>> +
>>>>  static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
>>>>  {
>>>>  	u32 doorbell;
>>>> @@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device *pdev)
>>>>  	sec->dev = &pdev->dev;
>>>>  	sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
>>>>  	dev_set_drvdata(&pdev->dev, sec);
>>>> +	sec->image_load = m10bmc_image_load_hndlrs;
>>>>
>>>>  	ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
>>>>  		       xa_limit_32b, GFP_KERNEL);
>>>> --
>>>> 2.26.2
>>>>
>>>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-08-11 21:04 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-08  5:33 [PATCH v3 0/2] add Intel FPGA image reload support Tianfei Zhang
2022-08-08  5:33 ` [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images Tianfei Zhang
2022-08-08  9:57   ` Xu Yilun
2022-08-08 23:39     ` matthew.gerlach
2022-08-09  7:04       ` Zhang, Tianfei
2022-08-11 20:41         ` matthew.gerlach
2022-08-09 16:30       ` Xu Yilun
2022-08-10 13:50         ` Zhang, Tianfei
2022-08-11 21:03         ` matthew.gerlach
2022-08-08  5:33 ` [PATCH v3 2/2] fpga: m10bmc-sec: add m10bmc_sec_retimer_load callback Tianfei Zhang
2022-08-08 16:22   ` Lee Jones

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.