* [PATCH v3 0/2] add Intel FPGA image reload support
@ 2022-08-08 5:33 Tianfei Zhang
2022-08-08 5:33 ` [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images Tianfei Zhang
2022-08-08 5:33 ` [PATCH v3 2/2] fpga: m10bmc-sec: add m10bmc_sec_retimer_load callback Tianfei Zhang
0 siblings, 2 replies; 11+ messages in thread
From: Tianfei Zhang @ 2022-08-08 5:33 UTC (permalink / raw)
To: mdf, yilun.xu, linux-fpga, lee.jones, russell.h.weight
Cc: hao.wu, trix, Tianfei Zhang
This patchset adds FPGA image reload support on Intel PAC N3000 Card.
Image reload means that a reload mechanism of an FPGA, BMC, or
firmware image from FLASH or EEPROM after flash or program the images
without power cycle the server.
This patchset introduces 2 new sysfs files for query the available
images and trigger a image reload.
patch 1: add the available_images and image_load sysfs files.
Write a key word into image_load sysfs file to trigger a reload of an
FPGA, BMC, or firmware image from FLASH or EEPROM.
patch 2: add a trigger to update a new Retimer firmware.
v3:
From Lee Jones's comment, uses regmap_update_bits() API instead of
the wrapper of m10bmc_sys_update_bits().
v2:
add more detail about how to use the image_load sysfs files in ABI
documentation.
Russ Weight (2):
fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
fpga: m10bmc-sec: add m10bmc_sec_retimer_load callback
.../sysfs-driver-intel-m10-bmc-sec-update | 34 +++
drivers/fpga/intel-m10-bmc-sec-update.c | 253 ++++++++++++++++++
include/linux/mfd/intel-m10-bmc.h | 31 +++
3 files changed, 318 insertions(+)
--
2.26.2
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
2022-08-08 5:33 [PATCH v3 0/2] add Intel FPGA image reload support Tianfei Zhang
@ 2022-08-08 5:33 ` Tianfei Zhang
2022-08-08 9:57 ` Xu Yilun
2022-08-08 5:33 ` [PATCH v3 2/2] fpga: m10bmc-sec: add m10bmc_sec_retimer_load callback Tianfei Zhang
1 sibling, 1 reply; 11+ messages in thread
From: Tianfei Zhang @ 2022-08-08 5:33 UTC (permalink / raw)
To: mdf, yilun.xu, linux-fpga, lee.jones, russell.h.weight
Cc: hao.wu, trix, Tianfei Zhang
From: Russ Weight <russell.h.weight@intel.com>
Add the available_images and image_load sysfs files. The available_images
file returns a space separated list of key words that may be written
into the image_load file. These keywords describe an FPGA, BMC, or
firmware image in FLASH or EEPROM storage that may be loaded.
The image_load sysfs file may be written with a key word to trigger a
reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
---
v3:
uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
v2:
adds the steps for how to use the "image_load" sysfs file.
---
.../sysfs-driver-intel-m10-bmc-sec-update | 34 ++++++
drivers/fpga/intel-m10-bmc-sec-update.c | 105 ++++++++++++++++++
2 files changed, 139 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
index 0a41afe0ab4c..3d8f04ca6f1b 100644
--- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
+++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
@@ -59,3 +59,37 @@ Contact: Russ Weight <russell.h.weight@intel.com>
Description: Read only. Returns number of times the secure update
staging area has been flashed.
Format: "%u".
+
+What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/available_images
+Date: July 2022
+KernelVersion: 5.20
+Contact: Russ Weight <russell.h.weight@intel.com>
+Description: Read-only. This file returns a space separated list of
+ key words that may be written into the image_load file
+ described below. These keywords decribe an FPGA, BMC,
+ or firmware image in FLASH or EEPROM storage that may
+ be loaded.
+
+What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/image_load
+Date: July 2022
+KernelVersion: 5.20
+Contact: Russ Weight <russell.h.weight@intel.com>
+Description: Write-only. A key word may be written to this file to
+ trigger a reload of an FPGA, BMC, or firmware image from
+ FLASH or EEPROM. Refer to the available_images file for a
+ list of supported key words for the underlying device.
+ Writing an unsupported string to this file will result in
+ EINVAL being returned.
+ It should remove all of resources related to the old FPGA/BMC
+ image before trigger the image reload otherwise the host system
+ may crash. We recommended that follow the below steps or directly
+ use the OPAE RSU script to perform the reload for FPGA/BMC image.
+ Here is the steps to trigger the reload for FPGA/BMC image:
+ 1. disable the AER of the FPGA card.
+ 2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
+ 3. trigger image reload via "image_load" sysfs file.
+ 4. remove all of PCI devices of the FPGA card via
+ "/sys/bus/pci/devices/xxxx/remove" sysfs file.
+ 5. wait 10 seconds.
+ 6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
+ 7. enable the AER of the FPGA card.
diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-bmc-sec-update.c
index 72c677c910de..3a082911cf67 100644
--- a/drivers/fpga/intel-m10-bmc-sec-update.c
+++ b/drivers/fpga/intel-m10-bmc-sec-update.c
@@ -14,6 +14,8 @@
#include <linux/platform_device.h>
#include <linux/slab.h>
+struct image_load;
+
struct m10bmc_sec {
struct device *dev;
struct intel_m10bmc *m10bmc;
@@ -21,6 +23,12 @@ struct m10bmc_sec {
char *fw_name;
u32 fw_name_id;
bool cancel_request;
+ struct image_load *image_load; /* terminated with { } member */
+};
+
+struct image_load {
+ const char *name;
+ int (*load_image)(struct m10bmc_sec *sec);
};
static DEFINE_XARRAY_ALLOC(fw_upload_xa);
@@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR + CSK_VEC_OFFSET);
#define FLASH_COUNT_SIZE 4096 /* count stored as inverted bit vector */
+static ssize_t available_images_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct m10bmc_sec *sec = dev_get_drvdata(dev);
+ const struct image_load *hndlr;
+ ssize_t count = 0;
+
+ for (hndlr = sec->image_load; hndlr->name; hndlr++) {
+ count += scnprintf(buf + count, PAGE_SIZE - count,
+ "%s ", hndlr->name);
+ }
+
+ buf[count - 1] = '\n';
+
+ return count;
+}
+static DEVICE_ATTR_RO(available_images);
+
+static ssize_t image_load_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct m10bmc_sec *sec = dev_get_drvdata(dev);
+ const struct image_load *hndlr;
+ int ret = -EINVAL;
+
+ for (hndlr = sec->image_load; hndlr->name; hndlr++) {
+ if (sysfs_streq(buf, hndlr->name)) {
+ ret = hndlr->load_image(sec);
+ break;
+ }
+ }
+
+ return ret ? : count;
+}
+static DEVICE_ATTR_WO(image_load);
+
+static struct attribute *m10bmc_control_attrs[] = {
+ &dev_attr_available_images.attr,
+ &dev_attr_image_load.attr,
+ NULL,
+};
+
+static struct attribute_group m10bmc_control_attr_group = {
+ .name = "control",
+ .attrs = m10bmc_control_attrs,
+};
+
static ssize_t flash_count_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
@@ -195,6 +251,7 @@ static struct attribute_group m10bmc_security_attr_group = {
static const struct attribute_group *m10bmc_sec_attr_groups[] = {
&m10bmc_security_attr_group,
+ &m10bmc_control_attr_group,
NULL,
};
@@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec, u32 doorbell)
dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result);
}
+static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
+ unsigned int val)
+{
+ u32 doorbell;
+ int ret;
+
+ if (val > 1) {
+ dev_err(sec->dev, "invalid reload val = %d\n", val);
+ return -EINVAL;
+ }
+
+ ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
+ if (ret)
+ return ret;
+
+ if (doorbell & DRBL_REBOOT_DISABLED)
+ return -EBUSY;
+
+ return regmap_update_bits(sec->m10bmc->regmap,
+ M10BMC_SYS_BASE + M10BMC_DOORBELL,
+ DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
+ FIELD_PREP(DRBL_CONFIG_SEL, val) |
+ DRBL_REBOOT_REQ);
+}
+
+static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec)
+{
+ return m10bmc_sec_bmc_image_load(sec, 0);
+}
+
+static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
+{
+ return m10bmc_sec_bmc_image_load(sec, 1);
+}
+
+static struct image_load m10bmc_image_load_hndlrs[] = {
+ {
+ .name = "bmc_factory",
+ .load_image = m10bmc_sec_bmc_image_load_1,
+ },
+ {
+ .name = "bmc_user",
+ .load_image = m10bmc_sec_bmc_image_load_0,
+ },
+ {}
+};
+
static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
{
u32 doorbell;
@@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device *pdev)
sec->dev = &pdev->dev;
sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
dev_set_drvdata(&pdev->dev, sec);
+ sec->image_load = m10bmc_image_load_hndlrs;
ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
xa_limit_32b, GFP_KERNEL);
--
2.26.2
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v3 2/2] fpga: m10bmc-sec: add m10bmc_sec_retimer_load callback
2022-08-08 5:33 [PATCH v3 0/2] add Intel FPGA image reload support Tianfei Zhang
2022-08-08 5:33 ` [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images Tianfei Zhang
@ 2022-08-08 5:33 ` Tianfei Zhang
2022-08-08 16:22 ` Lee Jones
1 sibling, 1 reply; 11+ messages in thread
From: Tianfei Zhang @ 2022-08-08 5:33 UTC (permalink / raw)
To: mdf, yilun.xu, linux-fpga, lee.jones, russell.h.weight
Cc: hao.wu, trix, Tianfei Zhang
From: Russ Weight <russell.h.weight@intel.com>
Create m10bmc_sec_retimer_load() callback function
to provide a trigger to update a new retimer (Intel
C827 Ethernet transceiver) firmware on Intel PAC
N3000 Card.
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
---
v3:
uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
---
drivers/fpga/intel-m10-bmc-sec-update.c | 148 ++++++++++++++++++++++++
include/linux/mfd/intel-m10-bmc.h | 31 +++++
2 files changed, 179 insertions(+)
diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-bmc-sec-update.c
index 3a082911cf67..bef07e97c107 100644
--- a/drivers/fpga/intel-m10-bmc-sec-update.c
+++ b/drivers/fpga/intel-m10-bmc-sec-update.c
@@ -300,6 +300,150 @@ static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
return m10bmc_sec_bmc_image_load(sec, 1);
}
+static int trigger_retimer_eeprom_load(struct m10bmc_sec *sec)
+{
+ struct intel_m10bmc *m10bmc = sec->m10bmc;
+ unsigned int val;
+ int ret;
+
+ ret = regmap_update_bits(m10bmc->regmap,
+ M10BMC_SYS_BASE + M10BMC_DOORBELL,
+ DRBL_PKVL_EEPROM_LOAD_SEC,
+ DRBL_PKVL_EEPROM_LOAD_SEC);
+ if (ret)
+ return ret;
+
+ /*
+ * If the current NIOS FW supports this retimer update feature, then
+ * it will clear the same PKVL_EEPROM_LOAD bit in 2 seconds. Otherwise
+ * the driver needs to clear the PKVL_EEPROM_LOAD bit manually and
+ * return an error code.
+ */
+ ret = regmap_read_poll_timeout(m10bmc->regmap,
+ M10BMC_SYS_BASE + M10BMC_DOORBELL, val,
+ (!(val & DRBL_PKVL_EEPROM_LOAD_SEC)),
+ M10BMC_PKVL_LOAD_INTERVAL_US,
+ M10BMC_PKVL_LOAD_TIMEOUT_US);
+ if (ret == -ETIMEDOUT) {
+ dev_err(sec->dev, "PKVL_EEPROM_LOAD clear timedout\n");
+ regmap_update_bits(m10bmc->regmap,
+ M10BMC_SYS_BASE + M10BMC_DOORBELL,
+ DRBL_PKVL_EEPROM_LOAD_SEC, 0);
+ ret = -ENODEV;
+ } else if (ret) {
+ dev_err(sec->dev, "poll EEPROM_LOAD error %d\n", ret);
+ }
+
+ return ret;
+}
+
+static int poll_retimer_eeprom_load_done(struct m10bmc_sec *sec)
+{
+ struct intel_m10bmc *m10bmc = sec->m10bmc;
+ unsigned int doorbell;
+ int ret;
+
+ /*
+ * RSU_STAT_PKVL_REJECT indicates that the current image is
+ * already programmed. RSU_PROG_PKVL_PROM_DONE that the firmware
+ * update process has finished, but does not necessarily indicate
+ * a successful update.
+ */
+ ret = regmap_read_poll_timeout(m10bmc->regmap,
+ M10BMC_SYS_BASE + M10BMC_DOORBELL,
+ doorbell,
+ ((rsu_prog(doorbell) ==
+ RSU_PROG_PKVL_PROM_DONE) ||
+ (rsu_stat(doorbell) ==
+ RSU_STAT_PKVL_REJECT)),
+ M10BMC_PKVL_PRELOAD_INTERVAL_US,
+ M10BMC_PKVL_PRELOAD_TIMEOUT_US);
+ if (ret) {
+ if (ret == -ETIMEDOUT)
+ dev_err(sec->dev,
+ "Doorbell check timedout: 0x%08x\n", doorbell);
+ else
+ dev_err(sec->dev, "poll Doorbell error\n");
+ return ret;
+ }
+
+ if (rsu_stat(doorbell) == RSU_STAT_PKVL_REJECT) {
+ dev_err(sec->dev, "duplicate image rejected\n");
+ return -ECANCELED;
+ }
+
+ return 0;
+}
+
+static int poll_retimer_preload_done(struct m10bmc_sec *sec)
+{
+ struct intel_m10bmc *m10bmc = sec->m10bmc;
+ unsigned int val;
+ int ret;
+
+ /*
+ * Wait for the updated firmware to be loaded by the PKVL device
+ * and confirm that the updated firmware is operational
+ */
+ ret = regmap_read_poll_timeout(m10bmc->regmap,
+ M10BMC_SYS_BASE + M10BMC_PKVL_POLL_CTRL, val,
+ ((val & M10BMC_PKVL_PRELOAD) == M10BMC_PKVL_PRELOAD),
+ M10BMC_PKVL_PRELOAD_INTERVAL_US,
+ M10BMC_PKVL_PRELOAD_TIMEOUT_US);
+ if (ret) {
+ dev_err(sec->dev, "poll M10BMC_PKVL_PRELOAD error %d\n", ret);
+ return ret;
+ }
+
+ if ((val & M10BMC_PKVL_UPG_STATUS_MASK) != M10BMC_PKVL_UPG_STATUS_GOOD) {
+ dev_err(sec->dev, "error detected during upgrade\n");
+ return -EIO;
+ }
+
+ return 0;
+}
+
+static int retimer_check_idle(struct m10bmc_sec *sec)
+{
+ u32 doorbell;
+ int ret;
+
+ ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
+ if (ret)
+ return -EIO;
+
+ if (rsu_prog(doorbell) != RSU_PROG_IDLE &&
+ rsu_prog(doorbell) != RSU_PROG_RSU_DONE &&
+ rsu_prog(doorbell) != RSU_PROG_PKVL_PROM_DONE) {
+ log_error_regs(sec, doorbell);
+ return -EBUSY;
+ }
+
+ return 0;
+}
+
+static int m10bmc_sec_retimer_eeprom_load(struct m10bmc_sec *sec)
+{
+ int ret;
+
+ ret = retimer_check_idle(sec);
+ if (ret)
+ goto exit;
+
+ ret = trigger_retimer_eeprom_load(sec);
+ if (ret)
+ goto exit;
+
+ ret = poll_retimer_eeprom_load_done(sec);
+ if (ret)
+ goto exit;
+
+ ret = poll_retimer_preload_done(sec);
+
+exit:
+ return ret;
+}
+
static struct image_load m10bmc_image_load_hndlrs[] = {
{
.name = "bmc_factory",
@@ -309,6 +453,10 @@ static struct image_load m10bmc_image_load_hndlrs[] = {
.name = "bmc_user",
.load_image = m10bmc_sec_bmc_image_load_0,
},
+ {
+ .name = "retimer_fw",
+ .load_image = m10bmc_sec_retimer_eeprom_load,
+ },
{}
};
diff --git a/include/linux/mfd/intel-m10-bmc.h b/include/linux/mfd/intel-m10-bmc.h
index f0044b14136e..c06b9d3d1c5d 100644
--- a/include/linux/mfd/intel-m10-bmc.h
+++ b/include/linux/mfd/intel-m10-bmc.h
@@ -36,6 +36,37 @@
#define M10BMC_VER_PCB_INFO_MSK GENMASK(31, 24)
#define M10BMC_VER_LEGACY_INVALID 0xffffffff
+/* Retimer related registers, in system register region */
+#define M10BMC_PKVL_POLL_CTRL 0x80
+#define M10BMC_PKVL_A_PRELOAD BIT(16)
+#define M10BMC_PKVL_A_PRELOAD_TO BIT(17)
+#define M10BMC_PKVL_A_DATA_TOO_BIG BIT(18)
+#define M10BMC_PKVL_A_HDR_CKSUM BIT(20)
+#define M10BMC_PKVL_B_PRELOAD BIT(24)
+#define M10BMC_PKVL_B_PRELOAD_TO BIT(25)
+#define M10BMC_PKVL_B_DATA_TOO_BIG BIT(26)
+#define M10BMC_PKVL_B_HDR_CKSUM BIT(28)
+
+#define M10BMC_PKVL_PRELOAD (M10BMC_PKVL_A_PRELOAD | M10BMC_PKVL_B_PRELOAD)
+#define M10BMC_PKVL_PRELOAD_TIMEOUT (M10BMC_PKVL_A_PRELOAD_TO | \
+ M10BMC_PKVL_B_PRELOAD_TO)
+#define M10BMC_PKVL_DATA_TOO_BIG (M10BMC_PKVL_A_DATA_TOO_BIG | \
+ M10BMC_PKVL_B_DATA_TOO_BIG)
+#define M10BMC_PKVL_HDR_CHECKSUM (M10BMC_PKVL_A_HDR_CKSUM | \
+ M10BMC_PKVL_B_HDR_CKSUM)
+
+#define M10BMC_PKVL_UPG_STATUS_MASK (M10BMC_PKVL_PRELOAD | M10BMC_PKVL_PRELOAD_TIMEOUT |\
+ M10BMC_PKVL_DATA_TOO_BIG | M10BMC_PKVL_HDR_CHECKSUM)
+#define M10BMC_PKVL_UPG_STATUS_GOOD (M10BMC_PKVL_PRELOAD | M10BMC_PKVL_HDR_CHECKSUM)
+
+/* interval 100ms and timeout 2s */
+#define M10BMC_PKVL_LOAD_INTERVAL_US (100 * 1000)
+#define M10BMC_PKVL_LOAD_TIMEOUT_US (2 * 1000 * 1000)
+
+/* interval 100ms and timeout 30s */
+#define M10BMC_PKVL_PRELOAD_INTERVAL_US (100 * 1000)
+#define M10BMC_PKVL_PRELOAD_TIMEOUT_US (30 * 1000 * 1000)
+
/* Secure update doorbell register, in system register region */
#define M10BMC_DOORBELL 0x400
--
2.26.2
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
2022-08-08 5:33 ` [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images Tianfei Zhang
@ 2022-08-08 9:57 ` Xu Yilun
2022-08-08 23:39 ` matthew.gerlach
0 siblings, 1 reply; 11+ messages in thread
From: Xu Yilun @ 2022-08-08 9:57 UTC (permalink / raw)
To: Tianfei Zhang; +Cc: mdf, linux-fpga, lee.jones, russell.h.weight, hao.wu, trix
On 2022-08-08 at 01:33:16 -0400, Tianfei Zhang wrote:
> From: Russ Weight <russell.h.weight@intel.com>
>
> Add the available_images and image_load sysfs files. The available_images
> file returns a space separated list of key words that may be written
> into the image_load file. These keywords describe an FPGA, BMC, or
> firmware image in FLASH or EEPROM storage that may be loaded.
>
> The image_load sysfs file may be written with a key word to trigger a
> reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.
>
> Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
> ---
> v3:
> uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
> v2:
> adds the steps for how to use the "image_load" sysfs file.
> ---
> .../sysfs-driver-intel-m10-bmc-sec-update | 34 ++++++
> drivers/fpga/intel-m10-bmc-sec-update.c | 105 ++++++++++++++++++
> 2 files changed, 139 insertions(+)
>
> diff --git a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> index 0a41afe0ab4c..3d8f04ca6f1b 100644
> --- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> @@ -59,3 +59,37 @@ Contact: Russ Weight <russell.h.weight@intel.com>
> Description: Read only. Returns number of times the secure update
> staging area has been flashed.
> Format: "%u".
> +
> +What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/available_images
> +Date: July 2022
> +KernelVersion: 5.20
> +Contact: Russ Weight <russell.h.weight@intel.com>
> +Description: Read-only. This file returns a space separated list of
> + key words that may be written into the image_load file
> + described below. These keywords decribe an FPGA, BMC,
> + or firmware image in FLASH or EEPROM storage that may
> + be loaded.
> +
> +What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/image_load
> +Date: July 2022
> +KernelVersion: 5.20
> +Contact: Russ Weight <russell.h.weight@intel.com>
> +Description: Write-only. A key word may be written to this file to
> + trigger a reload of an FPGA, BMC, or firmware image from
> + FLASH or EEPROM. Refer to the available_images file for a
> + list of supported key words for the underlying device.
> + Writing an unsupported string to this file will result in
> + EINVAL being returned.
> + It should remove all of resources related to the old FPGA/BMC
> + image before trigger the image reload otherwise the host system
> + may crash. We recommended that follow the below steps or directly
I suggest we solve this concern first before other detailed refinements.
My opinion is, don't make the sysfs interface dependent of other user
interfaces, like the following:
> + use the OPAE RSU script to perform the reload for FPGA/BMC image.
> + Here is the steps to trigger the reload for FPGA/BMC image:
> + 1. disable the AER of the FPGA card.
> + 2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
> + 3. trigger image reload via "image_load" sysfs file.
> + 4. remove all of PCI devices of the FPGA card via
> + "/sys/bus/pci/devices/xxxx/remove" sysfs file.
> + 5. wait 10 seconds.
> + 6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
> + 7. enable the AER of the FPGA card.
It is not a good idea the writing of the sysfs node crashes the system,
if we don't follow the whole steps.
Thanks,
Yilun
> diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-bmc-sec-update.c
> index 72c677c910de..3a082911cf67 100644
> --- a/drivers/fpga/intel-m10-bmc-sec-update.c
> +++ b/drivers/fpga/intel-m10-bmc-sec-update.c
> @@ -14,6 +14,8 @@
> #include <linux/platform_device.h>
> #include <linux/slab.h>
>
> +struct image_load;
> +
> struct m10bmc_sec {
> struct device *dev;
> struct intel_m10bmc *m10bmc;
> @@ -21,6 +23,12 @@ struct m10bmc_sec {
> char *fw_name;
> u32 fw_name_id;
> bool cancel_request;
> + struct image_load *image_load; /* terminated with { } member */
> +};
> +
> +struct image_load {
> + const char *name;
> + int (*load_image)(struct m10bmc_sec *sec);
> };
>
> static DEFINE_XARRAY_ALLOC(fw_upload_xa);
> @@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR + CSK_VEC_OFFSET);
>
> #define FLASH_COUNT_SIZE 4096 /* count stored as inverted bit vector */
>
> +static ssize_t available_images_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct m10bmc_sec *sec = dev_get_drvdata(dev);
> + const struct image_load *hndlr;
> + ssize_t count = 0;
> +
> + for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> + count += scnprintf(buf + count, PAGE_SIZE - count,
> + "%s ", hndlr->name);
> + }
> +
> + buf[count - 1] = '\n';
> +
> + return count;
> +}
> +static DEVICE_ATTR_RO(available_images);
> +
> +static ssize_t image_load_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + struct m10bmc_sec *sec = dev_get_drvdata(dev);
> + const struct image_load *hndlr;
> + int ret = -EINVAL;
> +
> + for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> + if (sysfs_streq(buf, hndlr->name)) {
> + ret = hndlr->load_image(sec);
> + break;
> + }
> + }
> +
> + return ret ? : count;
> +}
> +static DEVICE_ATTR_WO(image_load);
> +
> +static struct attribute *m10bmc_control_attrs[] = {
> + &dev_attr_available_images.attr,
> + &dev_attr_image_load.attr,
> + NULL,
> +};
> +
> +static struct attribute_group m10bmc_control_attr_group = {
> + .name = "control",
> + .attrs = m10bmc_control_attrs,
> +};
> +
> static ssize_t flash_count_show(struct device *dev,
> struct device_attribute *attr, char *buf)
> {
> @@ -195,6 +251,7 @@ static struct attribute_group m10bmc_security_attr_group = {
>
> static const struct attribute_group *m10bmc_sec_attr_groups[] = {
> &m10bmc_security_attr_group,
> + &m10bmc_control_attr_group,
> NULL,
> };
>
> @@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec, u32 doorbell)
> dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result);
> }
>
> +static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
> + unsigned int val)
> +{
> + u32 doorbell;
> + int ret;
> +
> + if (val > 1) {
> + dev_err(sec->dev, "invalid reload val = %d\n", val);
> + return -EINVAL;
> + }
> +
> + ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
> + if (ret)
> + return ret;
> +
> + if (doorbell & DRBL_REBOOT_DISABLED)
> + return -EBUSY;
> +
> + return regmap_update_bits(sec->m10bmc->regmap,
> + M10BMC_SYS_BASE + M10BMC_DOORBELL,
> + DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
> + FIELD_PREP(DRBL_CONFIG_SEL, val) |
> + DRBL_REBOOT_REQ);
> +}
> +
> +static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec)
> +{
> + return m10bmc_sec_bmc_image_load(sec, 0);
> +}
> +
> +static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
> +{
> + return m10bmc_sec_bmc_image_load(sec, 1);
> +}
> +
> +static struct image_load m10bmc_image_load_hndlrs[] = {
> + {
> + .name = "bmc_factory",
> + .load_image = m10bmc_sec_bmc_image_load_1,
> + },
> + {
> + .name = "bmc_user",
> + .load_image = m10bmc_sec_bmc_image_load_0,
> + },
> + {}
> +};
> +
> static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
> {
> u32 doorbell;
> @@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device *pdev)
> sec->dev = &pdev->dev;
> sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
> dev_set_drvdata(&pdev->dev, sec);
> + sec->image_load = m10bmc_image_load_hndlrs;
>
> ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
> xa_limit_32b, GFP_KERNEL);
> --
> 2.26.2
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3 2/2] fpga: m10bmc-sec: add m10bmc_sec_retimer_load callback
2022-08-08 5:33 ` [PATCH v3 2/2] fpga: m10bmc-sec: add m10bmc_sec_retimer_load callback Tianfei Zhang
@ 2022-08-08 16:22 ` Lee Jones
0 siblings, 0 replies; 11+ messages in thread
From: Lee Jones @ 2022-08-08 16:22 UTC (permalink / raw)
To: Tianfei Zhang
Cc: mdf, yilun.xu, linux-fpga, russell.h.weight, hao.wu, trix, Lee Jones
On Mon, 08 Aug 2022, Tianfei Zhang wrote:
> From: Russ Weight <russell.h.weight@intel.com>
>
> Create m10bmc_sec_retimer_load() callback function
> to provide a trigger to update a new retimer (Intel
> C827 Ethernet transceiver) firmware on Intel PAC
> N3000 Card.
>
> Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
> ---
> v3:
> uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
> ---
> drivers/fpga/intel-m10-bmc-sec-update.c | 148 ++++++++++++++++++++++++
> include/linux/mfd/intel-m10-bmc.h | 31 +++++
Acked-by: Lee Jones <lee@kernel.org>
--
DEPRECATED: Please use lee@kernel.org
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
2022-08-08 9:57 ` Xu Yilun
@ 2022-08-08 23:39 ` matthew.gerlach
2022-08-09 7:04 ` Zhang, Tianfei
2022-08-09 16:30 ` Xu Yilun
0 siblings, 2 replies; 11+ messages in thread
From: matthew.gerlach @ 2022-08-08 23:39 UTC (permalink / raw)
To: Xu Yilun
Cc: Tianfei Zhang, mdf, linux-fpga, lee.jones, russell.h.weight,
hao.wu, trix
On Mon, 8 Aug 2022, Xu Yilun wrote:
> On 2022-08-08 at 01:33:16 -0400, Tianfei Zhang wrote:
>> From: Russ Weight <russell.h.weight@intel.com>
>>
>> Add the available_images and image_load sysfs files. The available_images
>> file returns a space separated list of key words that may be written
>> into the image_load file. These keywords describe an FPGA, BMC, or
>> firmware image in FLASH or EEPROM storage that may be loaded.
>>
>> The image_load sysfs file may be written with a key word to trigger a
>> reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.
>>
>> Signed-off-by: Russ Weight <russell.h.weight@intel.com>
>> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
>> ---
>> v3:
>> uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
>> v2:
>> adds the steps for how to use the "image_load" sysfs file.
>> ---
>> .../sysfs-driver-intel-m10-bmc-sec-update | 34 ++++++
>> drivers/fpga/intel-m10-bmc-sec-update.c | 105 ++++++++++++++++++
>> 2 files changed, 139 insertions(+)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>> index 0a41afe0ab4c..3d8f04ca6f1b 100644
>> --- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>> +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>> @@ -59,3 +59,37 @@ Contact: Russ Weight <russell.h.weight@intel.com>
>> Description: Read only. Returns number of times the secure update
>> staging area has been flashed.
>> Format: "%u".
>> +
>> +What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/available_images
>> +Date: July 2022
>> +KernelVersion: 5.20
>> +Contact: Russ Weight <russell.h.weight@intel.com>
>> +Description: Read-only. This file returns a space separated list of
>> + key words that may be written into the image_load file
>> + described below. These keywords decribe an FPGA, BMC,
>> + or firmware image in FLASH or EEPROM storage that may
>> + be loaded.
>> +
>> +What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/image_load
>> +Date: July 2022
>> +KernelVersion: 5.20
>> +Contact: Russ Weight <russell.h.weight@intel.com>
>> +Description: Write-only. A key word may be written to this file to
>> + trigger a reload of an FPGA, BMC, or firmware image from
>> + FLASH or EEPROM. Refer to the available_images file for a
>> + list of supported key words for the underlying device.
>> + Writing an unsupported string to this file will result in
>> + EINVAL being returned.
>> + It should remove all of resources related to the old FPGA/BMC
>> + image before trigger the image reload otherwise the host system
>> + may crash. We recommended that follow the below steps or directly
>
> I suggest we solve this concern first before other detailed refinements.
>
> My opinion is, don't make the sysfs interface dependent of other user
> interfaces, like the following:
>
>> + use the OPAE RSU script to perform the reload for FPGA/BMC image.
>> + Here is the steps to trigger the reload for FPGA/BMC image:
>> + 1. disable the AER of the FPGA card.
>> + 2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
>> + 3. trigger image reload via "image_load" sysfs file.
>> + 4. remove all of PCI devices of the FPGA card via
>> + "/sys/bus/pci/devices/xxxx/remove" sysfs file.
>> + 5. wait 10 seconds.
>> + 6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
>> + 7. enable the AER of the FPGA card.
>
> It is not a good idea the writing of the sysfs node crashes the system,
> if we don't follow the whole steps.
>
> Thanks,
> Yilun
Hi Yilun,
The use case being supported with this trigger is the ability
to reconfigure a FPGA or other programmable componenet on a board without the requiring
the HW platform be able to power cycle a PCI slot or power cycling the
whole system. Unfortunately, when a FPGA connected to a PCI bus is
reconfigured, it can cause a PCI error. The actual pci error, if any,
and any mitigation steps to handle the error is platform specific and dependent on
the FPGA image itself. Therefore predicting and implementing all
necessary error mitigation in the kernel as part of the trigger would be
an impossible task.
Matthew
>
>> diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-bmc-sec-update.c
>> index 72c677c910de..3a082911cf67 100644
>> --- a/drivers/fpga/intel-m10-bmc-sec-update.c
>> +++ b/drivers/fpga/intel-m10-bmc-sec-update.c
>> @@ -14,6 +14,8 @@
>> #include <linux/platform_device.h>
>> #include <linux/slab.h>
>>
>> +struct image_load;
>> +
>> struct m10bmc_sec {
>> struct device *dev;
>> struct intel_m10bmc *m10bmc;
>> @@ -21,6 +23,12 @@ struct m10bmc_sec {
>> char *fw_name;
>> u32 fw_name_id;
>> bool cancel_request;
>> + struct image_load *image_load; /* terminated with { } member */
>> +};
>> +
>> +struct image_load {
>> + const char *name;
>> + int (*load_image)(struct m10bmc_sec *sec);
>> };
>>
>> static DEFINE_XARRAY_ALLOC(fw_upload_xa);
>> @@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR + CSK_VEC_OFFSET);
>>
>> #define FLASH_COUNT_SIZE 4096 /* count stored as inverted bit vector */
>>
>> +static ssize_t available_images_show(struct device *dev,
>> + struct device_attribute *attr, char *buf)
>> +{
>> + struct m10bmc_sec *sec = dev_get_drvdata(dev);
>> + const struct image_load *hndlr;
>> + ssize_t count = 0;
>> +
>> + for (hndlr = sec->image_load; hndlr->name; hndlr++) {
>> + count += scnprintf(buf + count, PAGE_SIZE - count,
>> + "%s ", hndlr->name);
>> + }
>> +
>> + buf[count - 1] = '\n';
>> +
>> + return count;
>> +}
>> +static DEVICE_ATTR_RO(available_images);
>> +
>> +static ssize_t image_load_store(struct device *dev,
>> + struct device_attribute *attr,
>> + const char *buf, size_t count)
>> +{
>> + struct m10bmc_sec *sec = dev_get_drvdata(dev);
>> + const struct image_load *hndlr;
>> + int ret = -EINVAL;
>> +
>> + for (hndlr = sec->image_load; hndlr->name; hndlr++) {
>> + if (sysfs_streq(buf, hndlr->name)) {
>> + ret = hndlr->load_image(sec);
>> + break;
>> + }
>> + }
>> +
>> + return ret ? : count;
>> +}
>> +static DEVICE_ATTR_WO(image_load);
>> +
>> +static struct attribute *m10bmc_control_attrs[] = {
>> + &dev_attr_available_images.attr,
>> + &dev_attr_image_load.attr,
>> + NULL,
>> +};
>> +
>> +static struct attribute_group m10bmc_control_attr_group = {
>> + .name = "control",
>> + .attrs = m10bmc_control_attrs,
>> +};
>> +
>> static ssize_t flash_count_show(struct device *dev,
>> struct device_attribute *attr, char *buf)
>> {
>> @@ -195,6 +251,7 @@ static struct attribute_group m10bmc_security_attr_group = {
>>
>> static const struct attribute_group *m10bmc_sec_attr_groups[] = {
>> &m10bmc_security_attr_group,
>> + &m10bmc_control_attr_group,
>> NULL,
>> };
>>
>> @@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec, u32 doorbell)
>> dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result);
>> }
>>
>> +static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
>> + unsigned int val)
>> +{
>> + u32 doorbell;
>> + int ret;
>> +
>> + if (val > 1) {
>> + dev_err(sec->dev, "invalid reload val = %d\n", val);
>> + return -EINVAL;
>> + }
>> +
>> + ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
>> + if (ret)
>> + return ret;
>> +
>> + if (doorbell & DRBL_REBOOT_DISABLED)
>> + return -EBUSY;
>> +
>> + return regmap_update_bits(sec->m10bmc->regmap,
>> + M10BMC_SYS_BASE + M10BMC_DOORBELL,
>> + DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
>> + FIELD_PREP(DRBL_CONFIG_SEL, val) |
>> + DRBL_REBOOT_REQ);
>> +}
>> +
>> +static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec)
>> +{
>> + return m10bmc_sec_bmc_image_load(sec, 0);
>> +}
>> +
>> +static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
>> +{
>> + return m10bmc_sec_bmc_image_load(sec, 1);
>> +}
>> +
>> +static struct image_load m10bmc_image_load_hndlrs[] = {
>> + {
>> + .name = "bmc_factory",
>> + .load_image = m10bmc_sec_bmc_image_load_1,
>> + },
>> + {
>> + .name = "bmc_user",
>> + .load_image = m10bmc_sec_bmc_image_load_0,
>> + },
>> + {}
>> +};
>> +
>> static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
>> {
>> u32 doorbell;
>> @@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device *pdev)
>> sec->dev = &pdev->dev;
>> sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
>> dev_set_drvdata(&pdev->dev, sec);
>> + sec->image_load = m10bmc_image_load_hndlrs;
>>
>> ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
>> xa_limit_32b, GFP_KERNEL);
>> --
>> 2.26.2
>>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
2022-08-08 23:39 ` matthew.gerlach
@ 2022-08-09 7:04 ` Zhang, Tianfei
2022-08-11 20:41 ` matthew.gerlach
2022-08-09 16:30 ` Xu Yilun
1 sibling, 1 reply; 11+ messages in thread
From: Zhang, Tianfei @ 2022-08-09 7:04 UTC (permalink / raw)
To: matthew.gerlach, Xu, Yilun
Cc: mdf, linux-fpga, lee.jones, Weight, Russell H, Wu, Hao, trix
> -----Original Message-----
> From: matthew.gerlach@linux.intel.com <matthew.gerlach@linux.intel.com>
> Sent: Tuesday, August 9, 2022 7:39 AM
> To: Xu, Yilun <yilun.xu@intel.com>
> Cc: Zhang, Tianfei <tianfei.zhang@intel.com>; mdf@kernel.org; linux-
> fpga@vger.kernel.org; lee.jones@linaro.org; Weight, Russell H
> <russell.h.weight@intel.com>; Wu, Hao <hao.wu@intel.com>; trix@redhat.com
> Subject: Re: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC
> images
>
>
>
> On Mon, 8 Aug 2022, Xu Yilun wrote:
>
> > On 2022-08-08 at 01:33:16 -0400, Tianfei Zhang wrote:
> >> From: Russ Weight <russell.h.weight@intel.com>
> >>
> >> Add the available_images and image_load sysfs files. The
> >> available_images file returns a space separated list of key words
> >> that may be written into the image_load file. These keywords describe
> >> an FPGA, BMC, or firmware image in FLASH or EEPROM storage that may be
> loaded.
> >>
> >> The image_load sysfs file may be written with a key word to trigger a
> >> reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.
> >>
> >> Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> >> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
> >> ---
> >> v3:
> >> uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
> >> v2:
> >> adds the steps for how to use the "image_load" sysfs file.
> >> ---
> >> .../sysfs-driver-intel-m10-bmc-sec-update | 34 ++++++
> >> drivers/fpga/intel-m10-bmc-sec-update.c | 105 ++++++++++++++++++
> >> 2 files changed, 139 insertions(+)
> >>
> >> diff --git
> >> a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> >> b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> >> index 0a41afe0ab4c..3d8f04ca6f1b 100644
> >> --- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> >> +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> >> @@ -59,3 +59,37 @@ Contact: Russ Weight <russell.h.weight@intel.com>
> >> Description: Read only. Returns number of times the secure update
> >> staging area has been flashed.
> >> Format: "%u".
> >> +
> >> +What: /sys/bus/platform/drivers/intel-m10bmc-sec-
> update/.../control/available_images
> >> +Date: July 2022
> >> +KernelVersion: 5.20
> >> +Contact: Russ Weight <russell.h.weight@intel.com>
> >> +Description: Read-only. This file returns a space separated list of
> >> + key words that may be written into the image_load file
> >> + described below. These keywords decribe an FPGA, BMC,
> >> + or firmware image in FLASH or EEPROM storage that may
> >> + be loaded.
> >> +
> >> +What: /sys/bus/platform/drivers/intel-m10bmc-sec-
> update/.../control/image_load
> >> +Date: July 2022
> >> +KernelVersion: 5.20
> >> +Contact: Russ Weight <russell.h.weight@intel.com>
> >> +Description: Write-only. A key word may be written to this file to
> >> + trigger a reload of an FPGA, BMC, or firmware image from
> >> + FLASH or EEPROM. Refer to the available_images file for a
> >> + list of supported key words for the underlying device.
> >> + Writing an unsupported string to this file will result in
> >> + EINVAL being returned.
> >> + It should remove all of resources related to the old FPGA/BMC
> >> + image before trigger the image reload otherwise the host system
> >> + may crash. We recommended that follow the below steps or
> directly
> >
> > I suggest we solve this concern first before other detailed refinements.
I have run the stress test on N3000 and N6000 card by only directly write string to " image_load" sysfs file,
and observed that there are no Host System crash occurred, but the BMC functionality failed, I think this is
the expected behavior that is why we strongly recommended that end-user use our RSU script or follow our
provided steps.
> >
> > My opinion is, don't make the sysfs interface dependent of other user
> > interfaces, like the following:
> >
> >> + use the OPAE RSU script to perform the reload for FPGA/BMC
> image.
> >> + Here is the steps to trigger the reload for FPGA/BMC image:
> >> + 1. disable the AER of the FPGA card.
> >> + 2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
> >> + 3. trigger image reload via "image_load" sysfs file.
> >> + 4. remove all of PCI devices of the FPGA card via
> >> + "/sys/bus/pci/devices/xxxx/remove" sysfs file.
> >> + 5. wait 10 seconds.
> >> + 6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
> >> + 7. enable the AER of the FPGA card.
> >
> > It is not a good idea the writing of the sysfs node crashes the
> > system, if we don't follow the whole steps.
> >
> > Thanks,
> > Yilun
>
> Hi Yilun,
>
> The use case being supported with this trigger is the ability to reconfigure a FPGA or
> other programmable componenet on a board without the requiring the HW platform
> be able to power cycle a PCI slot or power cycling the whole system. Unfortunately,
> when a FPGA connected to a PCI bus is reconfigured, it can cause a PCI error. The
> actual pci error, if any, and any mitigation steps to handle the error is platform
> specific and dependent on the FPGA image itself. Therefore predicting and
> implementing all necessary error mitigation in the kernel as part of the trigger would
> be an impossible task.
I agree with Matthew's comment.
How about we change the documentation as below:
It should remove all of resources related to the old FPGA/BMC
image before trigger the image reload otherwise the functionalities
of FPGA/BMC will fail. We recommended that follow the below steps
or directly use the OPAE RSU script to perform the reload for FPGA/BMC
image.
Here is the steps to trigger the reload for FPGA/BMC image:
1. disable the AER of the FPGA card.
2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
3. trigger image reload via "image_load" sysfs file.
4. remove all of PCI devices of the FPGA card via
"/sys/bus/pci/devices/xxxx/remove" sysfs file.
5. wait 10 seconds.
6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
7. enable the AER of the FPGA card.
>
> Matthew
>
> >
> >> diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-
> bmc-sec-update.c
> >> index 72c677c910de..3a082911cf67 100644
> >> --- a/drivers/fpga/intel-m10-bmc-sec-update.c
> >> +++ b/drivers/fpga/intel-m10-bmc-sec-update.c
> >> @@ -14,6 +14,8 @@
> >> #include <linux/platform_device.h>
> >> #include <linux/slab.h>
> >>
> >> +struct image_load;
> >> +
> >> struct m10bmc_sec {
> >> struct device *dev;
> >> struct intel_m10bmc *m10bmc;
> >> @@ -21,6 +23,12 @@ struct m10bmc_sec {
> >> char *fw_name;
> >> u32 fw_name_id;
> >> bool cancel_request;
> >> + struct image_load *image_load; /* terminated with { } member */
> >> +};
> >> +
> >> +struct image_load {
> >> + const char *name;
> >> + int (*load_image)(struct m10bmc_sec *sec);
> >> };
> >>
> >> static DEFINE_XARRAY_ALLOC(fw_upload_xa);
> >> @@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR +
> CSK_VEC_OFFSET);
> >>
> >> #define FLASH_COUNT_SIZE 4096 /* count stored as inverted bit vector */
> >>
> >> +static ssize_t available_images_show(struct device *dev,
> >> + struct device_attribute *attr, char *buf)
> >> +{
> >> + struct m10bmc_sec *sec = dev_get_drvdata(dev);
> >> + const struct image_load *hndlr;
> >> + ssize_t count = 0;
> >> +
> >> + for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> >> + count += scnprintf(buf + count, PAGE_SIZE - count,
> >> + "%s ", hndlr->name);
> >> + }
> >> +
> >> + buf[count - 1] = '\n';
> >> +
> >> + return count;
> >> +}
> >> +static DEVICE_ATTR_RO(available_images);
> >> +
> >> +static ssize_t image_load_store(struct device *dev,
> >> + struct device_attribute *attr,
> >> + const char *buf, size_t count)
> >> +{
> >> + struct m10bmc_sec *sec = dev_get_drvdata(dev);
> >> + const struct image_load *hndlr;
> >> + int ret = -EINVAL;
> >> +
> >> + for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> >> + if (sysfs_streq(buf, hndlr->name)) {
> >> + ret = hndlr->load_image(sec);
> >> + break;
> >> + }
> >> + }
> >> +
> >> + return ret ? : count;
> >> +}
> >> +static DEVICE_ATTR_WO(image_load);
> >> +
> >> +static struct attribute *m10bmc_control_attrs[] = {
> >> + &dev_attr_available_images.attr,
> >> + &dev_attr_image_load.attr,
> >> + NULL,
> >> +};
> >> +
> >> +static struct attribute_group m10bmc_control_attr_group = {
> >> + .name = "control",
> >> + .attrs = m10bmc_control_attrs,
> >> +};
> >> +
> >> static ssize_t flash_count_show(struct device *dev,
> >> struct device_attribute *attr, char *buf)
> >> {
> >> @@ -195,6 +251,7 @@ static struct attribute_group
> m10bmc_security_attr_group = {
> >>
> >> static const struct attribute_group *m10bmc_sec_attr_groups[] = {
> >> &m10bmc_security_attr_group,
> >> + &m10bmc_control_attr_group,
> >> NULL,
> >> };
> >>
> >> @@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec, u32
> doorbell)
> >> dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result);
> >> }
> >>
> >> +static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
> >> + unsigned int val)
> >> +{
> >> + u32 doorbell;
> >> + int ret;
> >> +
> >> + if (val > 1) {
> >> + dev_err(sec->dev, "invalid reload val = %d\n", val);
> >> + return -EINVAL;
> >> + }
> >> +
> >> + ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + if (doorbell & DRBL_REBOOT_DISABLED)
> >> + return -EBUSY;
> >> +
> >> + return regmap_update_bits(sec->m10bmc->regmap,
> >> + M10BMC_SYS_BASE + M10BMC_DOORBELL,
> >> + DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
> >> + FIELD_PREP(DRBL_CONFIG_SEL, val) |
> >> + DRBL_REBOOT_REQ);
> >> +}
> >> +
> >> +static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec)
> >> +{
> >> + return m10bmc_sec_bmc_image_load(sec, 0);
> >> +}
> >> +
> >> +static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
> >> +{
> >> + return m10bmc_sec_bmc_image_load(sec, 1);
> >> +}
> >> +
> >> +static struct image_load m10bmc_image_load_hndlrs[] = {
> >> + {
> >> + .name = "bmc_factory",
> >> + .load_image = m10bmc_sec_bmc_image_load_1,
> >> + },
> >> + {
> >> + .name = "bmc_user",
> >> + .load_image = m10bmc_sec_bmc_image_load_0,
> >> + },
> >> + {}
> >> +};
> >> +
> >> static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
> >> {
> >> u32 doorbell;
> >> @@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device
> *pdev)
> >> sec->dev = &pdev->dev;
> >> sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
> >> dev_set_drvdata(&pdev->dev, sec);
> >> + sec->image_load = m10bmc_image_load_hndlrs;
> >>
> >> ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
> >> xa_limit_32b, GFP_KERNEL);
> >> --
> >> 2.26.2
> >>
> >
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
2022-08-08 23:39 ` matthew.gerlach
2022-08-09 7:04 ` Zhang, Tianfei
@ 2022-08-09 16:30 ` Xu Yilun
2022-08-10 13:50 ` Zhang, Tianfei
2022-08-11 21:03 ` matthew.gerlach
1 sibling, 2 replies; 11+ messages in thread
From: Xu Yilun @ 2022-08-09 16:30 UTC (permalink / raw)
To: matthew.gerlach
Cc: Tianfei Zhang, mdf, linux-fpga, lee.jones, russell.h.weight,
hao.wu, trix
On 2022-08-08 at 16:39:23 -0700, matthew.gerlach@linux.intel.com wrote:
>
>
> On Mon, 8 Aug 2022, Xu Yilun wrote:
>
> > On 2022-08-08 at 01:33:16 -0400, Tianfei Zhang wrote:
> > > From: Russ Weight <russell.h.weight@intel.com>
> > >
> > > Add the available_images and image_load sysfs files. The available_images
> > > file returns a space separated list of key words that may be written
> > > into the image_load file. These keywords describe an FPGA, BMC, or
> > > firmware image in FLASH or EEPROM storage that may be loaded.
> > >
> > > The image_load sysfs file may be written with a key word to trigger a
> > > reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.
> > >
> > > Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> > > Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
> > > ---
> > > v3:
> > > uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
> > > v2:
> > > adds the steps for how to use the "image_load" sysfs file.
> > > ---
> > > .../sysfs-driver-intel-m10-bmc-sec-update | 34 ++++++
> > > drivers/fpga/intel-m10-bmc-sec-update.c | 105 ++++++++++++++++++
> > > 2 files changed, 139 insertions(+)
> > >
> > > diff --git a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> > > index 0a41afe0ab4c..3d8f04ca6f1b 100644
> > > --- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> > > +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> > > @@ -59,3 +59,37 @@ Contact: Russ Weight <russell.h.weight@intel.com>
> > > Description: Read only. Returns number of times the secure update
> > > staging area has been flashed.
> > > Format: "%u".
> > > +
> > > +What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/available_images
> > > +Date: July 2022
> > > +KernelVersion: 5.20
> > > +Contact: Russ Weight <russell.h.weight@intel.com>
> > > +Description: Read-only. This file returns a space separated list of
> > > + key words that may be written into the image_load file
> > > + described below. These keywords decribe an FPGA, BMC,
> > > + or firmware image in FLASH or EEPROM storage that may
> > > + be loaded.
> > > +
> > > +What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/image_load
> > > +Date: July 2022
> > > +KernelVersion: 5.20
> > > +Contact: Russ Weight <russell.h.weight@intel.com>
> > > +Description: Write-only. A key word may be written to this file to
> > > + trigger a reload of an FPGA, BMC, or firmware image from
> > > + FLASH or EEPROM. Refer to the available_images file for a
> > > + list of supported key words for the underlying device.
> > > + Writing an unsupported string to this file will result in
> > > + EINVAL being returned.
> > > + It should remove all of resources related to the old FPGA/BMC
> > > + image before trigger the image reload otherwise the host system
> > > + may crash. We recommended that follow the below steps or directly
> >
> > I suggest we solve this concern first before other detailed refinements.
> >
> > My opinion is, don't make the sysfs interface dependent of other user
> > interfaces, like the following:
> >
> > > + use the OPAE RSU script to perform the reload for FPGA/BMC image.
> > > + Here is the steps to trigger the reload for FPGA/BMC image:
> > > + 1. disable the AER of the FPGA card.
> > > + 2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
> > > + 3. trigger image reload via "image_load" sysfs file.
> > > + 4. remove all of PCI devices of the FPGA card via
> > > + "/sys/bus/pci/devices/xxxx/remove" sysfs file.
> > > + 5. wait 10 seconds.
> > > + 6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
> > > + 7. enable the AER of the FPGA card.
> >
> > It is not a good idea the writing of the sysfs node crashes the system,
> > if we don't follow the whole steps.
> >
> > Thanks,
> > Yilun
>
> Hi Yilun,
>
> The use case being supported with this trigger is the ability to reconfigure
> a FPGA or other programmable componenet on a board without the requiring the
> HW platform be able to power cycle a PCI slot or power cycling the whole
> system. Unfortunately, when a FPGA connected to a PCI bus is reconfigured,
> it can cause a PCI error. The actual pci error, if any, and any mitigation
> steps to handle the error is platform specific and dependent on the FPGA
> image itself. Therefore predicting and implementing all necessary error
Why the error handling is unpredictable?
> mitigation in the kernel as part of the trigger would be an impossible task.
Or could we just gate the pcie link? Just like we should disable fpga
bridges before reprogramming any fpga region.
Actually I did find something for link disabing which may be useful.
https://patchwork.kernel.org/project/linux-pci/patch/20190529104942.74991-1-mika.westerberg@linux.intel.com/
Thanks,
Yilun
>
> Matthew
>
> >
> > > diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-bmc-sec-update.c
> > > index 72c677c910de..3a082911cf67 100644
> > > --- a/drivers/fpga/intel-m10-bmc-sec-update.c
> > > +++ b/drivers/fpga/intel-m10-bmc-sec-update.c
> > > @@ -14,6 +14,8 @@
> > > #include <linux/platform_device.h>
> > > #include <linux/slab.h>
> > >
> > > +struct image_load;
> > > +
> > > struct m10bmc_sec {
> > > struct device *dev;
> > > struct intel_m10bmc *m10bmc;
> > > @@ -21,6 +23,12 @@ struct m10bmc_sec {
> > > char *fw_name;
> > > u32 fw_name_id;
> > > bool cancel_request;
> > > + struct image_load *image_load; /* terminated with { } member */
> > > +};
> > > +
> > > +struct image_load {
> > > + const char *name;
> > > + int (*load_image)(struct m10bmc_sec *sec);
> > > };
> > >
> > > static DEFINE_XARRAY_ALLOC(fw_upload_xa);
> > > @@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR + CSK_VEC_OFFSET);
> > >
> > > #define FLASH_COUNT_SIZE 4096 /* count stored as inverted bit vector */
> > >
> > > +static ssize_t available_images_show(struct device *dev,
> > > + struct device_attribute *attr, char *buf)
> > > +{
> > > + struct m10bmc_sec *sec = dev_get_drvdata(dev);
> > > + const struct image_load *hndlr;
> > > + ssize_t count = 0;
> > > +
> > > + for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> > > + count += scnprintf(buf + count, PAGE_SIZE - count,
> > > + "%s ", hndlr->name);
> > > + }
> > > +
> > > + buf[count - 1] = '\n';
> > > +
> > > + return count;
> > > +}
> > > +static DEVICE_ATTR_RO(available_images);
> > > +
> > > +static ssize_t image_load_store(struct device *dev,
> > > + struct device_attribute *attr,
> > > + const char *buf, size_t count)
> > > +{
> > > + struct m10bmc_sec *sec = dev_get_drvdata(dev);
> > > + const struct image_load *hndlr;
> > > + int ret = -EINVAL;
> > > +
> > > + for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> > > + if (sysfs_streq(buf, hndlr->name)) {
> > > + ret = hndlr->load_image(sec);
> > > + break;
> > > + }
> > > + }
> > > +
> > > + return ret ? : count;
> > > +}
> > > +static DEVICE_ATTR_WO(image_load);
> > > +
> > > +static struct attribute *m10bmc_control_attrs[] = {
> > > + &dev_attr_available_images.attr,
> > > + &dev_attr_image_load.attr,
> > > + NULL,
> > > +};
> > > +
> > > +static struct attribute_group m10bmc_control_attr_group = {
> > > + .name = "control",
> > > + .attrs = m10bmc_control_attrs,
> > > +};
> > > +
> > > static ssize_t flash_count_show(struct device *dev,
> > > struct device_attribute *attr, char *buf)
> > > {
> > > @@ -195,6 +251,7 @@ static struct attribute_group m10bmc_security_attr_group = {
> > >
> > > static const struct attribute_group *m10bmc_sec_attr_groups[] = {
> > > &m10bmc_security_attr_group,
> > > + &m10bmc_control_attr_group,
> > > NULL,
> > > };
> > >
> > > @@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec, u32 doorbell)
> > > dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result);
> > > }
> > >
> > > +static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
> > > + unsigned int val)
> > > +{
> > > + u32 doorbell;
> > > + int ret;
> > > +
> > > + if (val > 1) {
> > > + dev_err(sec->dev, "invalid reload val = %d\n", val);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + if (doorbell & DRBL_REBOOT_DISABLED)
> > > + return -EBUSY;
> > > +
> > > + return regmap_update_bits(sec->m10bmc->regmap,
> > > + M10BMC_SYS_BASE + M10BMC_DOORBELL,
> > > + DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
> > > + FIELD_PREP(DRBL_CONFIG_SEL, val) |
> > > + DRBL_REBOOT_REQ);
> > > +}
> > > +
> > > +static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec)
> > > +{
> > > + return m10bmc_sec_bmc_image_load(sec, 0);
> > > +}
> > > +
> > > +static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
> > > +{
> > > + return m10bmc_sec_bmc_image_load(sec, 1);
> > > +}
> > > +
> > > +static struct image_load m10bmc_image_load_hndlrs[] = {
> > > + {
> > > + .name = "bmc_factory",
> > > + .load_image = m10bmc_sec_bmc_image_load_1,
> > > + },
> > > + {
> > > + .name = "bmc_user",
> > > + .load_image = m10bmc_sec_bmc_image_load_0,
> > > + },
> > > + {}
> > > +};
> > > +
> > > static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
> > > {
> > > u32 doorbell;
> > > @@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device *pdev)
> > > sec->dev = &pdev->dev;
> > > sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
> > > dev_set_drvdata(&pdev->dev, sec);
> > > + sec->image_load = m10bmc_image_load_hndlrs;
> > >
> > > ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
> > > xa_limit_32b, GFP_KERNEL);
> > > --
> > > 2.26.2
> > >
> >
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
2022-08-09 16:30 ` Xu Yilun
@ 2022-08-10 13:50 ` Zhang, Tianfei
2022-08-11 21:03 ` matthew.gerlach
1 sibling, 0 replies; 11+ messages in thread
From: Zhang, Tianfei @ 2022-08-10 13:50 UTC (permalink / raw)
To: Xu, Yilun, matthew.gerlach
Cc: mdf, linux-fpga, lee.jones, Weight, Russell H, Wu, Hao, trix
> -----Original Message-----
> From: Xu, Yilun <yilun.xu@intel.com>
> Sent: Wednesday, August 10, 2022 12:31 AM
> To: matthew.gerlach@linux.intel.com
> Cc: Zhang, Tianfei <tianfei.zhang@intel.com>; mdf@kernel.org; linux-
> fpga@vger.kernel.org; lee.jones@linaro.org; Weight, Russell H
> <russell.h.weight@intel.com>; Wu, Hao <hao.wu@intel.com>; trix@redhat.com
> Subject: Re: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC
> images
>
> On 2022-08-08 at 16:39:23 -0700, matthew.gerlach@linux.intel.com wrote:
> >
> >
> > On Mon, 8 Aug 2022, Xu Yilun wrote:
> >
> > > On 2022-08-08 at 01:33:16 -0400, Tianfei Zhang wrote:
> > > > From: Russ Weight <russell.h.weight@intel.com>
> > > >
> > > > Add the available_images and image_load sysfs files. The
> > > > available_images file returns a space separated list of key words
> > > > that may be written into the image_load file. These keywords
> > > > describe an FPGA, BMC, or firmware image in FLASH or EEPROM storage that
> may be loaded.
> > > >
> > > > The image_load sysfs file may be written with a key word to
> > > > trigger a reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.
> > > >
> > > > Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> > > > Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
> > > > ---
> > > > v3:
> > > > uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
> > > > v2:
> > > > adds the steps for how to use the "image_load" sysfs file.
> > > > ---
> > > > .../sysfs-driver-intel-m10-bmc-sec-update | 34 ++++++
> > > > drivers/fpga/intel-m10-bmc-sec-update.c | 105 ++++++++++++++++++
> > > > 2 files changed, 139 insertions(+)
> > > >
> > > > diff --git
> > > > a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> > > > b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> > > > index 0a41afe0ab4c..3d8f04ca6f1b 100644
> > > > ---
> > > > a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
> > > > +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-upd
> > > > +++ ate
> > > > @@ -59,3 +59,37 @@ Contact: Russ Weight <russell.h.weight@intel.com>
> > > > Description: Read only. Returns number of times the secure update
> > > > staging area has been flashed.
> > > > Format: "%u".
> > > > +
> > > > +What: /sys/bus/platform/drivers/intel-m10bmc-sec-
> update/.../control/available_images
> > > > +Date: July 2022
> > > > +KernelVersion: 5.20
> > > > +Contact: Russ Weight <russell.h.weight@intel.com>
> > > > +Description: Read-only. This file returns a space separated list of
> > > > + key words that may be written into the image_load file
> > > > + described below. These keywords decribe an FPGA, BMC,
> > > > + or firmware image in FLASH or EEPROM storage that may
> > > > + be loaded.
> > > > +
> > > > +What: /sys/bus/platform/drivers/intel-m10bmc-sec-
> update/.../control/image_load
> > > > +Date: July 2022
> > > > +KernelVersion: 5.20
> > > > +Contact: Russ Weight <russell.h.weight@intel.com>
> > > > +Description: Write-only. A key word may be written to this file to
> > > > + trigger a reload of an FPGA, BMC, or firmware image from
> > > > + FLASH or EEPROM. Refer to the available_images file for a
> > > > + list of supported key words for the underlying device.
> > > > + Writing an unsupported string to this file will result in
> > > > + EINVAL being returned.
> > > > + It should remove all of resources related to the old FPGA/BMC
> > > > + image before trigger the image reload otherwise the host system
> > > > + may crash. We recommended that follow the below steps or
> > > > +directly
> > >
> > > I suggest we solve this concern first before other detailed refinements.
> > >
> > > My opinion is, don't make the sysfs interface dependent of other
> > > user interfaces, like the following:
> > >
> > > > + use the OPAE RSU script to perform the reload for FPGA/BMC
> image.
> > > > + Here is the steps to trigger the reload for FPGA/BMC image:
> > > > + 1. disable the AER of the FPGA card.
> > > > + 2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
> > > > + 3. trigger image reload via "image_load" sysfs file.
> > > > + 4. remove all of PCI devices of the FPGA card via
> > > > + "/sys/bus/pci/devices/xxxx/remove" sysfs file.
> > > > + 5. wait 10 seconds.
> > > > + 6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
> > > > + 7. enable the AER of the FPGA card.
> > >
> > > It is not a good idea the writing of the sysfs node crashes the
> > > system, if we don't follow the whole steps.
> > >
> > > Thanks,
> > > Yilun
> >
> > Hi Yilun,
> >
> > The use case being supported with this trigger is the ability to
> > reconfigure a FPGA or other programmable componenet on a board without
> > the requiring the HW platform be able to power cycle a PCI slot or
> > power cycling the whole system. Unfortunately, when a FPGA connected
> > to a PCI bus is reconfigured, it can cause a PCI error. The actual
> > pci error, if any, and any mitigation steps to handle the error is
> > platform specific and dependent on the FPGA image itself. Therefore
> > predicting and implementing all necessary error
>
> Why the error handling is unpredictable?
This error came from BMC/FPGA while burn the image if we don't remove this PCI device.
For example, the other kernel thread accessing the FPGA/BMC while we burn the new FPGA image.
>
> > mitigation in the kernel as part of the trigger would be an impossible task.
>
> Or could we just gate the pcie link? Just like we should disable fpga bridges before
> reprogramming any fpga region.
I think PCI "remove" sysfs file will remove all of subdevices like fpga bridges, and process of disable AER
and trigger PCI remove will mitigate the PCI errors.
>
> Actually I did find something for link disabing which may be useful.
>
> https://patchwork.kernel.org/project/linux-pci/patch/20190529104942.74991-1-
> mika.westerberg@linux.intel.com/
It looks like this link_disable patch has not accepted by PCI maintainer.
This link_disable sysfs file just want to protect against another user doing rescan.
>
> Thanks,
> Yilun
>
> >
> > Matthew
> >
> > >
> > > > diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c
> > > > b/drivers/fpga/intel-m10-bmc-sec-update.c
> > > > index 72c677c910de..3a082911cf67 100644
> > > > --- a/drivers/fpga/intel-m10-bmc-sec-update.c
> > > > +++ b/drivers/fpga/intel-m10-bmc-sec-update.c
> > > > @@ -14,6 +14,8 @@
> > > > #include <linux/platform_device.h> #include <linux/slab.h>
> > > >
> > > > +struct image_load;
> > > > +
> > > > struct m10bmc_sec {
> > > > struct device *dev;
> > > > struct intel_m10bmc *m10bmc;
> > > > @@ -21,6 +23,12 @@ struct m10bmc_sec {
> > > > char *fw_name;
> > > > u32 fw_name_id;
> > > > bool cancel_request;
> > > > + struct image_load *image_load; /* terminated with { } member */
> > > > +};
> > > > +
> > > > +struct image_load {
> > > > + const char *name;
> > > > + int (*load_image)(struct m10bmc_sec *sec);
> > > > };
> > > >
> > > > static DEFINE_XARRAY_ALLOC(fw_upload_xa);
> > > > @@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR +
> > > > CSK_VEC_OFFSET);
> > > >
> > > > #define FLASH_COUNT_SIZE 4096 /* count stored as inverted bit vector */
> > > >
> > > > +static ssize_t available_images_show(struct device *dev,
> > > > + struct device_attribute *attr, char *buf) {
> > > > + struct m10bmc_sec *sec = dev_get_drvdata(dev);
> > > > + const struct image_load *hndlr;
> > > > + ssize_t count = 0;
> > > > +
> > > > + for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> > > > + count += scnprintf(buf + count, PAGE_SIZE - count,
> > > > + "%s ", hndlr->name);
> > > > + }
> > > > +
> > > > + buf[count - 1] = '\n';
> > > > +
> > > > + return count;
> > > > +}
> > > > +static DEVICE_ATTR_RO(available_images);
> > > > +
> > > > +static ssize_t image_load_store(struct device *dev,
> > > > + struct device_attribute *attr,
> > > > + const char *buf, size_t count) {
> > > > + struct m10bmc_sec *sec = dev_get_drvdata(dev);
> > > > + const struct image_load *hndlr;
> > > > + int ret = -EINVAL;
> > > > +
> > > > + for (hndlr = sec->image_load; hndlr->name; hndlr++) {
> > > > + if (sysfs_streq(buf, hndlr->name)) {
> > > > + ret = hndlr->load_image(sec);
> > > > + break;
> > > > + }
> > > > + }
> > > > +
> > > > + return ret ? : count;
> > > > +}
> > > > +static DEVICE_ATTR_WO(image_load);
> > > > +
> > > > +static struct attribute *m10bmc_control_attrs[] = {
> > > > + &dev_attr_available_images.attr,
> > > > + &dev_attr_image_load.attr,
> > > > + NULL,
> > > > +};
> > > > +
> > > > +static struct attribute_group m10bmc_control_attr_group = {
> > > > + .name = "control",
> > > > + .attrs = m10bmc_control_attrs,
> > > > +};
> > > > +
> > > > static ssize_t flash_count_show(struct device *dev,
> > > > struct device_attribute *attr, char *buf) { @@ -
> 195,6 +251,7
> > > > @@ static struct attribute_group m10bmc_security_attr_group = {
> > > >
> > > > static const struct attribute_group *m10bmc_sec_attr_groups[] = {
> > > > &m10bmc_security_attr_group,
> > > > + &m10bmc_control_attr_group,
> > > > NULL,
> > > > };
> > > >
> > > > @@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec,
> u32 doorbell)
> > > > dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result); }
> > > >
> > > > +static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
> > > > + unsigned int val)
> > > > +{
> > > > + u32 doorbell;
> > > > + int ret;
> > > > +
> > > > + if (val > 1) {
> > > > + dev_err(sec->dev, "invalid reload val = %d\n", val);
> > > > + return -EINVAL;
> > > > + }
> > > > +
> > > > + ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
> > > > + if (ret)
> > > > + return ret;
> > > > +
> > > > + if (doorbell & DRBL_REBOOT_DISABLED)
> > > > + return -EBUSY;
> > > > +
> > > > + return regmap_update_bits(sec->m10bmc->regmap,
> > > > + M10BMC_SYS_BASE + M10BMC_DOORBELL,
> > > > + DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
> > > > + FIELD_PREP(DRBL_CONFIG_SEL, val) |
> > > > + DRBL_REBOOT_REQ);
> > > > +}
> > > > +
> > > > +static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec) {
> > > > + return m10bmc_sec_bmc_image_load(sec, 0); }
> > > > +
> > > > +static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec) {
> > > > + return m10bmc_sec_bmc_image_load(sec, 1); }
> > > > +
> > > > +static struct image_load m10bmc_image_load_hndlrs[] = {
> > > > + {
> > > > + .name = "bmc_factory",
> > > > + .load_image = m10bmc_sec_bmc_image_load_1,
> > > > + },
> > > > + {
> > > > + .name = "bmc_user",
> > > > + .load_image = m10bmc_sec_bmc_image_load_0,
> > > > + },
> > > > + {}
> > > > +};
> > > > +
> > > > static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
> > > > {
> > > > u32 doorbell;
> > > > @@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device
> *pdev)
> > > > sec->dev = &pdev->dev;
> > > > sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
> > > > dev_set_drvdata(&pdev->dev, sec);
> > > > + sec->image_load = m10bmc_image_load_hndlrs;
> > > >
> > > > ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
> > > > xa_limit_32b, GFP_KERNEL);
> > > > --
> > > > 2.26.2
> > > >
> > >
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
2022-08-09 7:04 ` Zhang, Tianfei
@ 2022-08-11 20:41 ` matthew.gerlach
0 siblings, 0 replies; 11+ messages in thread
From: matthew.gerlach @ 2022-08-11 20:41 UTC (permalink / raw)
To: Zhang, Tianfei
Cc: Xu, Yilun, mdf, linux-fpga, lee.jones, Weight, Russell H, Wu, Hao, trix
On Tue, 9 Aug 2022, Zhang, Tianfei wrote:
>
>
>> -----Original Message-----
>> From: matthew.gerlach@linux.intel.com <matthew.gerlach@linux.intel.com>
>> Sent: Tuesday, August 9, 2022 7:39 AM
>> To: Xu, Yilun <yilun.xu@intel.com>
>> Cc: Zhang, Tianfei <tianfei.zhang@intel.com>; mdf@kernel.org; linux-
>> fpga@vger.kernel.org; lee.jones@linaro.org; Weight, Russell H
>> <russell.h.weight@intel.com>; Wu, Hao <hao.wu@intel.com>; trix@redhat.com
>> Subject: Re: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC
>> images
>>
>>
>>
>> On Mon, 8 Aug 2022, Xu Yilun wrote:
>>
>>> On 2022-08-08 at 01:33:16 -0400, Tianfei Zhang wrote:
>>>> From: Russ Weight <russell.h.weight@intel.com>
>>>>
>>>> Add the available_images and image_load sysfs files. The
>>>> available_images file returns a space separated list of key words
>>>> that may be written into the image_load file. These keywords describe
>>>> an FPGA, BMC, or firmware image in FLASH or EEPROM storage that may be
>> loaded.
>>>>
>>>> The image_load sysfs file may be written with a key word to trigger a
>>>> reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.
>>>>
>>>> Signed-off-by: Russ Weight <russell.h.weight@intel.com>
>>>> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
>>>> ---
>>>> v3:
>>>> uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
>>>> v2:
>>>> adds the steps for how to use the "image_load" sysfs file.
>>>> ---
>>>> .../sysfs-driver-intel-m10-bmc-sec-update | 34 ++++++
>>>> drivers/fpga/intel-m10-bmc-sec-update.c | 105 ++++++++++++++++++
>>>> 2 files changed, 139 insertions(+)
>>>>
>>>> diff --git
>>>> a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>>>> b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>>>> index 0a41afe0ab4c..3d8f04ca6f1b 100644
>>>> --- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>>>> +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>>>> @@ -59,3 +59,37 @@ Contact: Russ Weight <russell.h.weight@intel.com>
>>>> Description: Read only. Returns number of times the secure update
>>>> staging area has been flashed.
>>>> Format: "%u".
>>>> +
>>>> +What: /sys/bus/platform/drivers/intel-m10bmc-sec-
>> update/.../control/available_images
>>>> +Date: July 2022
>>>> +KernelVersion: 5.20
>>>> +Contact: Russ Weight <russell.h.weight@intel.com>
>>>> +Description: Read-only. This file returns a space separated list of
>>>> + key words that may be written into the image_load file
>>>> + described below. These keywords decribe an FPGA, BMC,
>>>> + or firmware image in FLASH or EEPROM storage that may
>>>> + be loaded.
>>>> +
>>>> +What: /sys/bus/platform/drivers/intel-m10bmc-sec-
>> update/.../control/image_load
>>>> +Date: July 2022
>>>> +KernelVersion: 5.20
>>>> +Contact: Russ Weight <russell.h.weight@intel.com>
>>>> +Description: Write-only. A key word may be written to this file to
>>>> + trigger a reload of an FPGA, BMC, or firmware image from
>>>> + FLASH or EEPROM. Refer to the available_images file for a
>>>> + list of supported key words for the underlying device.
>>>> + Writing an unsupported string to this file will result in
>>>> + EINVAL being returned.
>>>> + It should remove all of resources related to the old FPGA/BMC
>>>> + image before trigger the image reload otherwise the host system
>>>> + may crash. We recommended that follow the below steps or
>> directly
>>>
>>> I suggest we solve this concern first before other detailed refinements.
>
> I have run the stress test on N3000 and N6000 card by only directly write string to " image_load" sysfs file,
> and observed that there are no Host System crash occurred, but the BMC functionality failed, I think this is
> the expected behavior that is why we strongly recommended that end-user use our RSU script or follow our
> provided steps.
>
>>>
>>> My opinion is, don't make the sysfs interface dependent of other user
>>> interfaces, like the following:
>>>
>>>> + use the OPAE RSU script to perform the reload for FPGA/BMC
>> image.
>>>> + Here is the steps to trigger the reload for FPGA/BMC image:
>>>> + 1. disable the AER of the FPGA card.
>>>> + 2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
>>>> + 3. trigger image reload via "image_load" sysfs file.
>>>> + 4. remove all of PCI devices of the FPGA card via
>>>> + "/sys/bus/pci/devices/xxxx/remove" sysfs file.
>>>> + 5. wait 10 seconds.
>>>> + 6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
>>>> + 7. enable the AER of the FPGA card.
>>>
>>> It is not a good idea the writing of the sysfs node crashes the
>>> system, if we don't follow the whole steps.
>>>
>>> Thanks,
>>> Yilun
>>
>> Hi Yilun,
>>
>> The use case being supported with this trigger is the ability to reconfigure a FPGA or
>> other programmable componenet on a board without the requiring the HW platform
>> be able to power cycle a PCI slot or power cycling the whole system. Unfortunately,
>> when a FPGA connected to a PCI bus is reconfigured, it can cause a PCI error. The
>> actual pci error, if any, and any mitigation steps to handle the error is platform
>> specific and dependent on the FPGA image itself. Therefore predicting and
>> implementing all necessary error mitigation in the kernel as part of the trigger would
>> be an impossible task.
>
> I agree with Matthew's comment.
> How about we change the documentation as below:
>
> It should remove all of resources related to the old FPGA/BMC
> image before trigger the image reload otherwise the functionalities
> of FPGA/BMC will fail. We recommended that follow the below steps
> or directly use the OPAE RSU script to perform the reload for FPGA/BMC
> image.
> Here is the steps to trigger the reload for FPGA/BMC image:
> 1. disable the AER of the FPGA card.
> 2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
> 3. trigger image reload via "image_load" sysfs file.
> 4. remove all of PCI devices of the FPGA card via
> "/sys/bus/pci/devices/xxxx/remove" sysfs file.
> 5. wait 10 seconds.
> 6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
> 7. enable the AER of the FPGA card.
>
Hi Tianfei,
I think the safe sequence you describe can be made more robust as follows:
1. Unbind driver from all VFs to be affected by the trigger.
2. Shutdown all VFs from step 1.
3. Unbind driver from all PFs affectected by the trigger except the
management PF.
4. Disable AER on all PFs and upstream pci root port.
5. trigger image reload of FPGA and/or other programmable devices
6. Unbind driver from management PF
7. Wait for reconfiguration of FPGA an/or other programmable parts (e.g.
10 seconds)
8. re-scan PCI bus.
9. renable the AER on all PFs and upstream root port
10. bring up VFs as desired.
All of the steps above, except step 5, is easily performed in user space.
The exact offset of the trigger register and the method to access it are
board specific. These board differences are already handled in the
intel-m10-bmc framework, which makes it possible to provide a consistent
user interface accross the different boards for triggering the
reconfiguration.
Matthew
>
>
>>
>> Matthew
>>
>>>
>>>> diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-
>> bmc-sec-update.c
>>>> index 72c677c910de..3a082911cf67 100644
>>>> --- a/drivers/fpga/intel-m10-bmc-sec-update.c
>>>> +++ b/drivers/fpga/intel-m10-bmc-sec-update.c
>>>> @@ -14,6 +14,8 @@
>>>> #include <linux/platform_device.h>
>>>> #include <linux/slab.h>
>>>>
>>>> +struct image_load;
>>>> +
>>>> struct m10bmc_sec {
>>>> struct device *dev;
>>>> struct intel_m10bmc *m10bmc;
>>>> @@ -21,6 +23,12 @@ struct m10bmc_sec {
>>>> char *fw_name;
>>>> u32 fw_name_id;
>>>> bool cancel_request;
>>>> + struct image_load *image_load; /* terminated with { } member */
>>>> +};
>>>> +
>>>> +struct image_load {
>>>> + const char *name;
>>>> + int (*load_image)(struct m10bmc_sec *sec);
>>>> };
>>>>
>>>> static DEFINE_XARRAY_ALLOC(fw_upload_xa);
>>>> @@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR +
>> CSK_VEC_OFFSET);
>>>>
>>>> #define FLASH_COUNT_SIZE 4096 /* count stored as inverted bit vector */
>>>>
>>>> +static ssize_t available_images_show(struct device *dev,
>>>> + struct device_attribute *attr, char *buf)
>>>> +{
>>>> + struct m10bmc_sec *sec = dev_get_drvdata(dev);
>>>> + const struct image_load *hndlr;
>>>> + ssize_t count = 0;
>>>> +
>>>> + for (hndlr = sec->image_load; hndlr->name; hndlr++) {
>>>> + count += scnprintf(buf + count, PAGE_SIZE - count,
>>>> + "%s ", hndlr->name);
>>>> + }
>>>> +
>>>> + buf[count - 1] = '\n';
>>>> +
>>>> + return count;
>>>> +}
>>>> +static DEVICE_ATTR_RO(available_images);
>>>> +
>>>> +static ssize_t image_load_store(struct device *dev,
>>>> + struct device_attribute *attr,
>>>> + const char *buf, size_t count)
>>>> +{
>>>> + struct m10bmc_sec *sec = dev_get_drvdata(dev);
>>>> + const struct image_load *hndlr;
>>>> + int ret = -EINVAL;
>>>> +
>>>> + for (hndlr = sec->image_load; hndlr->name; hndlr++) {
>>>> + if (sysfs_streq(buf, hndlr->name)) {
>>>> + ret = hndlr->load_image(sec);
>>>> + break;
>>>> + }
>>>> + }
>>>> +
>>>> + return ret ? : count;
>>>> +}
>>>> +static DEVICE_ATTR_WO(image_load);
>>>> +
>>>> +static struct attribute *m10bmc_control_attrs[] = {
>>>> + &dev_attr_available_images.attr,
>>>> + &dev_attr_image_load.attr,
>>>> + NULL,
>>>> +};
>>>> +
>>>> +static struct attribute_group m10bmc_control_attr_group = {
>>>> + .name = "control",
>>>> + .attrs = m10bmc_control_attrs,
>>>> +};
>>>> +
>>>> static ssize_t flash_count_show(struct device *dev,
>>>> struct device_attribute *attr, char *buf)
>>>> {
>>>> @@ -195,6 +251,7 @@ static struct attribute_group
>> m10bmc_security_attr_group = {
>>>>
>>>> static const struct attribute_group *m10bmc_sec_attr_groups[] = {
>>>> &m10bmc_security_attr_group,
>>>> + &m10bmc_control_attr_group,
>>>> NULL,
>>>> };
>>>>
>>>> @@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec, u32
>> doorbell)
>>>> dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result);
>>>> }
>>>>
>>>> +static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
>>>> + unsigned int val)
>>>> +{
>>>> + u32 doorbell;
>>>> + int ret;
>>>> +
>>>> + if (val > 1) {
>>>> + dev_err(sec->dev, "invalid reload val = %d\n", val);
>>>> + return -EINVAL;
>>>> + }
>>>> +
>>>> + ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> + if (doorbell & DRBL_REBOOT_DISABLED)
>>>> + return -EBUSY;
>>>> +
>>>> + return regmap_update_bits(sec->m10bmc->regmap,
>>>> + M10BMC_SYS_BASE + M10BMC_DOORBELL,
>>>> + DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
>>>> + FIELD_PREP(DRBL_CONFIG_SEL, val) |
>>>> + DRBL_REBOOT_REQ);
>>>> +}
>>>> +
>>>> +static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec)
>>>> +{
>>>> + return m10bmc_sec_bmc_image_load(sec, 0);
>>>> +}
>>>> +
>>>> +static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
>>>> +{
>>>> + return m10bmc_sec_bmc_image_load(sec, 1);
>>>> +}
>>>> +
>>>> +static struct image_load m10bmc_image_load_hndlrs[] = {
>>>> + {
>>>> + .name = "bmc_factory",
>>>> + .load_image = m10bmc_sec_bmc_image_load_1,
>>>> + },
>>>> + {
>>>> + .name = "bmc_user",
>>>> + .load_image = m10bmc_sec_bmc_image_load_0,
>>>> + },
>>>> + {}
>>>> +};
>>>> +
>>>> static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
>>>> {
>>>> u32 doorbell;
>>>> @@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device
>> *pdev)
>>>> sec->dev = &pdev->dev;
>>>> sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
>>>> dev_set_drvdata(&pdev->dev, sec);
>>>> + sec->image_load = m10bmc_image_load_hndlrs;
>>>>
>>>> ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
>>>> xa_limit_32b, GFP_KERNEL);
>>>> --
>>>> 2.26.2
>>>>
>>>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images
2022-08-09 16:30 ` Xu Yilun
2022-08-10 13:50 ` Zhang, Tianfei
@ 2022-08-11 21:03 ` matthew.gerlach
1 sibling, 0 replies; 11+ messages in thread
From: matthew.gerlach @ 2022-08-11 21:03 UTC (permalink / raw)
To: Xu Yilun
Cc: Tianfei Zhang, mdf, linux-fpga, lee.jones, russell.h.weight,
hao.wu, trix
On Wed, 10 Aug 2022, Xu Yilun wrote:
> On 2022-08-08 at 16:39:23 -0700, matthew.gerlach@linux.intel.com wrote:
>>
>>
>> On Mon, 8 Aug 2022, Xu Yilun wrote:
>>
>>> On 2022-08-08 at 01:33:16 -0400, Tianfei Zhang wrote:
>>>> From: Russ Weight <russell.h.weight@intel.com>
>>>>
>>>> Add the available_images and image_load sysfs files. The available_images
>>>> file returns a space separated list of key words that may be written
>>>> into the image_load file. These keywords describe an FPGA, BMC, or
>>>> firmware image in FLASH or EEPROM storage that may be loaded.
>>>>
>>>> The image_load sysfs file may be written with a key word to trigger a
>>>> reload of an FPGA, BMC, or firmware image from FLASH or EEPROM.
>>>>
>>>> Signed-off-by: Russ Weight <russell.h.weight@intel.com>
>>>> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
>>>> ---
>>>> v3:
>>>> uses regmap_update_bits() API instead of m10bmc_sys_update_bits().
>>>> v2:
>>>> adds the steps for how to use the "image_load" sysfs file.
>>>> ---
>>>> .../sysfs-driver-intel-m10-bmc-sec-update | 34 ++++++
>>>> drivers/fpga/intel-m10-bmc-sec-update.c | 105 ++++++++++++++++++
>>>> 2 files changed, 139 insertions(+)
>>>>
>>>> diff --git a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>>>> index 0a41afe0ab4c..3d8f04ca6f1b 100644
>>>> --- a/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>>>> +++ b/Documentation/ABI/testing/sysfs-driver-intel-m10-bmc-sec-update
>>>> @@ -59,3 +59,37 @@ Contact: Russ Weight <russell.h.weight@intel.com>
>>>> Description: Read only. Returns number of times the secure update
>>>> staging area has been flashed.
>>>> Format: "%u".
>>>> +
>>>> +What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/available_images
>>>> +Date: July 2022
>>>> +KernelVersion: 5.20
>>>> +Contact: Russ Weight <russell.h.weight@intel.com>
>>>> +Description: Read-only. This file returns a space separated list of
>>>> + key words that may be written into the image_load file
>>>> + described below. These keywords decribe an FPGA, BMC,
>>>> + or firmware image in FLASH or EEPROM storage that may
>>>> + be loaded.
>>>> +
>>>> +What: /sys/bus/platform/drivers/intel-m10bmc-sec-update/.../control/image_load
>>>> +Date: July 2022
>>>> +KernelVersion: 5.20
>>>> +Contact: Russ Weight <russell.h.weight@intel.com>
>>>> +Description: Write-only. A key word may be written to this file to
>>>> + trigger a reload of an FPGA, BMC, or firmware image from
>>>> + FLASH or EEPROM. Refer to the available_images file for a
>>>> + list of supported key words for the underlying device.
>>>> + Writing an unsupported string to this file will result in
>>>> + EINVAL being returned.
>>>> + It should remove all of resources related to the old FPGA/BMC
>>>> + image before trigger the image reload otherwise the host system
>>>> + may crash. We recommended that follow the below steps or directly
>>>
>>> I suggest we solve this concern first before other detailed refinements.
>>>
>>> My opinion is, don't make the sysfs interface dependent of other user
>>> interfaces, like the following:
>>>
>>>> + use the OPAE RSU script to perform the reload for FPGA/BMC image.
>>>> + Here is the steps to trigger the reload for FPGA/BMC image:
>>>> + 1. disable the AER of the FPGA card.
>>>> + 2. unbind the PFs/VFs which have bound with VFIO/UIO driver.
>>>> + 3. trigger image reload via "image_load" sysfs file.
>>>> + 4. remove all of PCI devices of the FPGA card via
>>>> + "/sys/bus/pci/devices/xxxx/remove" sysfs file.
>>>> + 5. wait 10 seconds.
>>>> + 6. re-scan the PCI bus via "/sys/bus/pci/rescan" sysfs file.
>>>> + 7. enable the AER of the FPGA card.
>>>
>>> It is not a good idea the writing of the sysfs node crashes the system,
>>> if we don't follow the whole steps.
>>>
>>> Thanks,
>>> Yilun
>>
>> Hi Yilun,
>>
>> The use case being supported with this trigger is the ability to reconfigure
>> a FPGA or other programmable componenet on a board without the requiring the
>> HW platform be able to power cycle a PCI slot or power cycling the whole
>> system. Unfortunately, when a FPGA connected to a PCI bus is reconfigured,
>> it can cause a PCI error. The actual pci error, if any, and any mitigation
>> steps to handle the error is platform specific and dependent on the FPGA
>> image itself. Therefore predicting and implementing all necessary error
>
> Why the error handling is unpredictable?
>
Error handling is unpredictable and platform specific for two reasons.
First, the overall implementation of PCI bus can have differences.
For example, Advanced Error Reporting and/or Downstream Port Containment are
not implemented on every platform. Secondly, the actual error caused the
triggering can be specific to a board and/or a FPGA design on the board.
>> mitigation in the kernel as part of the trigger would be an impossible task.
>
> Or could we just gate the pcie link? Just like we should disable fpga
> bridges before reprogramming any fpga region.
The ability to safely gate a pcie link is also a platform specific
feature. While platforms exists that provide the ability for safe PCIe hot
swap, many do not. Keeping as little of the overall safe sequence of FPGA
reconfiguration in the kernel allows for the greatest flexibility to
implement the platform specific part of the safe sequence in user space.
>
> Actually I did find something for link disabing which may be useful.
>
> https://patchwork.kernel.org/project/linux-pci/patch/20190529104942.74991-1-mika.westerberg@linux.intel.com/
>
> Thanks,
> Yilun
>
>>
>> Matthew
>>
>>>
>>>> diff --git a/drivers/fpga/intel-m10-bmc-sec-update.c b/drivers/fpga/intel-m10-bmc-sec-update.c
>>>> index 72c677c910de..3a082911cf67 100644
>>>> --- a/drivers/fpga/intel-m10-bmc-sec-update.c
>>>> +++ b/drivers/fpga/intel-m10-bmc-sec-update.c
>>>> @@ -14,6 +14,8 @@
>>>> #include <linux/platform_device.h>
>>>> #include <linux/slab.h>
>>>>
>>>> +struct image_load;
>>>> +
>>>> struct m10bmc_sec {
>>>> struct device *dev;
>>>> struct intel_m10bmc *m10bmc;
>>>> @@ -21,6 +23,12 @@ struct m10bmc_sec {
>>>> char *fw_name;
>>>> u32 fw_name_id;
>>>> bool cancel_request;
>>>> + struct image_load *image_load; /* terminated with { } member */
>>>> +};
>>>> +
>>>> +struct image_load {
>>>> + const char *name;
>>>> + int (*load_image)(struct m10bmc_sec *sec);
>>>> };
>>>>
>>>> static DEFINE_XARRAY_ALLOC(fw_upload_xa);
>>>> @@ -137,6 +145,54 @@ DEVICE_ATTR_SEC_CSK_RO(pr, PR_PROG_ADDR + CSK_VEC_OFFSET);
>>>>
>>>> #define FLASH_COUNT_SIZE 4096 /* count stored as inverted bit vector */
>>>>
>>>> +static ssize_t available_images_show(struct device *dev,
>>>> + struct device_attribute *attr, char *buf)
>>>> +{
>>>> + struct m10bmc_sec *sec = dev_get_drvdata(dev);
>>>> + const struct image_load *hndlr;
>>>> + ssize_t count = 0;
>>>> +
>>>> + for (hndlr = sec->image_load; hndlr->name; hndlr++) {
>>>> + count += scnprintf(buf + count, PAGE_SIZE - count,
>>>> + "%s ", hndlr->name);
>>>> + }
>>>> +
>>>> + buf[count - 1] = '\n';
>>>> +
>>>> + return count;
>>>> +}
>>>> +static DEVICE_ATTR_RO(available_images);
>>>> +
>>>> +static ssize_t image_load_store(struct device *dev,
>>>> + struct device_attribute *attr,
>>>> + const char *buf, size_t count)
>>>> +{
>>>> + struct m10bmc_sec *sec = dev_get_drvdata(dev);
>>>> + const struct image_load *hndlr;
>>>> + int ret = -EINVAL;
>>>> +
>>>> + for (hndlr = sec->image_load; hndlr->name; hndlr++) {
>>>> + if (sysfs_streq(buf, hndlr->name)) {
>>>> + ret = hndlr->load_image(sec);
>>>> + break;
>>>> + }
>>>> + }
>>>> +
>>>> + return ret ? : count;
>>>> +}
>>>> +static DEVICE_ATTR_WO(image_load);
>>>> +
>>>> +static struct attribute *m10bmc_control_attrs[] = {
>>>> + &dev_attr_available_images.attr,
>>>> + &dev_attr_image_load.attr,
>>>> + NULL,
>>>> +};
>>>> +
>>>> +static struct attribute_group m10bmc_control_attr_group = {
>>>> + .name = "control",
>>>> + .attrs = m10bmc_control_attrs,
>>>> +};
>>>> +
>>>> static ssize_t flash_count_show(struct device *dev,
>>>> struct device_attribute *attr, char *buf)
>>>> {
>>>> @@ -195,6 +251,7 @@ static struct attribute_group m10bmc_security_attr_group = {
>>>>
>>>> static const struct attribute_group *m10bmc_sec_attr_groups[] = {
>>>> &m10bmc_security_attr_group,
>>>> + &m10bmc_control_attr_group,
>>>> NULL,
>>>> };
>>>>
>>>> @@ -208,6 +265,53 @@ static void log_error_regs(struct m10bmc_sec *sec, u32 doorbell)
>>>> dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result);
>>>> }
>>>>
>>>> +static int m10bmc_sec_bmc_image_load(struct m10bmc_sec *sec,
>>>> + unsigned int val)
>>>> +{
>>>> + u32 doorbell;
>>>> + int ret;
>>>> +
>>>> + if (val > 1) {
>>>> + dev_err(sec->dev, "invalid reload val = %d\n", val);
>>>> + return -EINVAL;
>>>> + }
>>>> +
>>>> + ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell);
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> + if (doorbell & DRBL_REBOOT_DISABLED)
>>>> + return -EBUSY;
>>>> +
>>>> + return regmap_update_bits(sec->m10bmc->regmap,
>>>> + M10BMC_SYS_BASE + M10BMC_DOORBELL,
>>>> + DRBL_CONFIG_SEL | DRBL_REBOOT_REQ,
>>>> + FIELD_PREP(DRBL_CONFIG_SEL, val) |
>>>> + DRBL_REBOOT_REQ);
>>>> +}
>>>> +
>>>> +static int m10bmc_sec_bmc_image_load_0(struct m10bmc_sec *sec)
>>>> +{
>>>> + return m10bmc_sec_bmc_image_load(sec, 0);
>>>> +}
>>>> +
>>>> +static int m10bmc_sec_bmc_image_load_1(struct m10bmc_sec *sec)
>>>> +{
>>>> + return m10bmc_sec_bmc_image_load(sec, 1);
>>>> +}
>>>> +
>>>> +static struct image_load m10bmc_image_load_hndlrs[] = {
>>>> + {
>>>> + .name = "bmc_factory",
>>>> + .load_image = m10bmc_sec_bmc_image_load_1,
>>>> + },
>>>> + {
>>>> + .name = "bmc_user",
>>>> + .load_image = m10bmc_sec_bmc_image_load_0,
>>>> + },
>>>> + {}
>>>> +};
>>>> +
>>>> static enum fw_upload_err rsu_check_idle(struct m10bmc_sec *sec)
>>>> {
>>>> u32 doorbell;
>>>> @@ -565,6 +669,7 @@ static int m10bmc_sec_probe(struct platform_device *pdev)
>>>> sec->dev = &pdev->dev;
>>>> sec->m10bmc = dev_get_drvdata(pdev->dev.parent);
>>>> dev_set_drvdata(&pdev->dev, sec);
>>>> + sec->image_load = m10bmc_image_load_hndlrs;
>>>>
>>>> ret = xa_alloc(&fw_upload_xa, &sec->fw_name_id, sec,
>>>> xa_limit_32b, GFP_KERNEL);
>>>> --
>>>> 2.26.2
>>>>
>>>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2022-08-11 21:04 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-08 5:33 [PATCH v3 0/2] add Intel FPGA image reload support Tianfei Zhang
2022-08-08 5:33 ` [PATCH v3 1/2] fpga: m10bmc-sec: add sysfs to reload FPGA/BMC images Tianfei Zhang
2022-08-08 9:57 ` Xu Yilun
2022-08-08 23:39 ` matthew.gerlach
2022-08-09 7:04 ` Zhang, Tianfei
2022-08-11 20:41 ` matthew.gerlach
2022-08-09 16:30 ` Xu Yilun
2022-08-10 13:50 ` Zhang, Tianfei
2022-08-11 21:03 ` matthew.gerlach
2022-08-08 5:33 ` [PATCH v3 2/2] fpga: m10bmc-sec: add m10bmc_sec_retimer_load callback Tianfei Zhang
2022-08-08 16:22 ` Lee Jones
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.